4.2.4 Match Objects

MatchObject instances support the following methods and attributes:

group ([group1, group2, ...])
Returns one or more subgroups of the match. If there is a single argument, the result is a single string; if there are multiple arguments, the result is a tuple with one item per argument. Without arguments, group1 defaults to zero (i.e. the whole match is returned). If a groupN argument is zero, the corresponding return value is the entire matching string; if it is in the inclusive range [1..99], it is the string matching the the corresponding parenthesized group. If a group number is negative or larger than the number of groups defined in the pattern, an IndexError exception is raised. If a group is contained in a part of the pattern that did not match, the corresponding result is None. If a group is contained in a part of the pattern that matched multiple times, the last match is returned.

If the regular expression uses the (?P<name>...) syntax, the groupN arguments may also be strings identifying groups by their group name. If a string argument is not used as a group name in the pattern, an IndexError exception is raised.

A moderately complicated example:

m = re.match(r"(?P<int>\d+)\.(\d*)", '3.14')

After performing this match, m.group(1) is '3', as is m.group('int'), and m.group(2) is '14'.

groups ([default])
Return a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern. The default argument is used for groups that did not participate in the match; it defaults to None. (Incompatibility note: in the original Python 1.5 release, if the tuple was one element long, a string would be returned instead. In later versions (from 1.5.1 on), a singleton tuple is returned in such cases.)

groupdict ([default])
Return a dictionary containing all the named subgroups of the match, keyed by the subgroup name. The default argument is used for groups that did not participate in the match; it defaults to None.

start ([group])
end ([group])
Return the indices of the start and end of the substring matched by group; group defaults to zero (meaning the whole matched substring). Return None if group exists but did not contribute to the match. For a match object m, and a group g that did contribute to the match, the substring matched by group g (equivalent to m.group(g)) is

m.string[m.start(g):m.end(g)]

Note that m.start(group) will equal m.end(group) if group matched a null string. For example, after m = re.search('b(c?)', 'cba'), m.start(0) is 1, m.end(0) is 2, m.start(1) and m.end(1) are both 2, and m.start(2) raises an IndexError exception.

span ([group])
For MatchObject m, return the 2-tuple (m.start(group), m.end(group)). Note that if group did not contribute to the match, this is (None, None). Again, group defaults to zero.

pos
The value of pos which was passed to the search() or match() function. This is the index into the string at which the regex engine started looking for a match.

endpos
The value of endpos which was passed to the search() or match() function. This is the index into the string beyond which the regex engine will not go.

re
The regular expression object whose match() or search() method produced this MatchObject instance.

string
The string passed to match() or search().

See Also:

Jeffrey Friedl, Mastering Regular Expressions, O'Reilly. The Python material in this book dates from before the re module, but it covers writing good regular expression patterns in great detail.