[FRPythoneers] OK, next question - Regular expressions

jafo at tummy.com
Tue May 1 14:09:52 MDT 2001

On Tue, May 01, 2001 at 02:02:06PM -0600, J. Wayde Allen wrote:
>Yes, well ... as you've probably figured out I'm kind of new to Python,
>and I'm not too sure I understand the way that Python handles regular
>expressions.  What actually happens when one "compiles" a regular

It turns it into a representation that can be used to match lines.  All
regular expressions need to be compiled.  When you say:

   re.match(r'some regex', line)

it actually compiles it internally (and caches a few of them I believe),
then does a compiledRegex.match(line) sort of thing.
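
Here's a rough sketch of the two spellings side by side, in case it helps.
The sample line and the variable names are made up for illustration, and
I've written it for a current Python 3 interpreter.

```python
import re

line = "42 is the answer"   # made-up sample input

# Compile once, then reuse the pattern object for every line.
pattern = re.compile(r'^[0-9]')
compiled_result = pattern.match(line)

# The module-level function takes the pattern text directly;
# re compiles it behind the scenes before matching.
direct_result = re.match(r'^[0-9]', line)

# Both approaches find the same text.
same_text = compiled_result.group(0) == direct_result.group(0)
```

Compiling by hand mostly pays off when the same pattern is applied to many
lines in a loop, since the pattern object is built only once.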

>If I want to extract lines from a file that begin with numerical data for
>instance would I do something like:
>    pattern = regex.compile('^[0-9]')
>    for line in open(filename, 'r').readlines():
>       if pattern.match(line):
>          print line

Sure.  Usually you'd use regular expressions for more complex things than
just checking to see if the first character of a line is a digit.  You can
do that more quickly with:

   if len(line) > 0 and line[0] in string.digits: print line

You'd usually use it for something like:

   m = re.search(r'^\s*(\d+)', line)
   if m:
      print 'Found number:', m.group(1)

\s* means that the regex will be tolerant of leading spaces and just ignore
them.  \d is the same as [0-9].  \d+ means one or more digits.  The
parentheses capture those digits, and m.group(1) pulls them out of the line.
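
To make that concrete, here's a small sketch that wraps the same pattern in
a helper and runs it over a few lines.  The helper name leading_number and
the sample strings are my own inventions for illustration, and it's written
for a current Python 3 interpreter.

```python
import re

def leading_number(line):
    """Return the digits at the start of line (after optional spaces), or None."""
    m = re.search(r'^\s*(\d+)', line)
    if m:
        # group(1) is whatever the parentheses captured: just the digits.
        return m.group(1)
    return None

# Made-up sample lines: leading spaces, bare digit, digit not at start, empty.
samples = ['  123 widgets', '7', 'no digits first 9', '']
results = [leading_number(s) for s in samples]
```

Only the first two samples give a result; the third has a digit, but not at
the start, and the ^ keeps search from finding it further along.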

Note that search looks for the text anywhere in the line (in this case it's
anchored to start at the beginning of the line).  match() *ONLY* matches at
the start of the string, as if the pattern began with ^.  It doesn't require
the whole line to match, so your example would match any line that begins
with a digit, not just a one-digit line.  Most people want to use
search, but match is earlier in the man page.  I need to submit a patch for
that.  This'll be the third time this week I've answered this question from
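
A quick sketch of how match and search behave in the re module, using an
unanchored pattern so the difference shows up.  match tries the pattern only
at the start of the string (text after the matched part is fine), while
search scans forward for the first place it fits.  The sample strings are
made up, and this is written for a current Python 3 interpreter.

```python
import re

pattern = re.compile(r'[0-9]')

# match() only tries the pattern at position 0 of the string,
# but it does not care what follows the matched part.
match_start = pattern.match('5 apples')    # match object for '5'
match_later = pattern.match('apples 5')    # None: no digit at position 0

# search() scans forward for the first place the pattern fits.
search_later = pattern.search('apples 5')  # match object for '5'
```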

 "The question of whether a computer can think is no more interesting than the
 question of whether a submarine can swim."  -- Edsger W. Dijkstra
Sean Reifschneider, Inimitably Superfluous <jafo at tummy.com>
tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python
