The module defines the following functions and constants, and an exception:
The expression's behaviour can be modified by specifying a flags value. Values can be any of the following variables, combined using bitwise OR (the | operator).
The sequence
prog = re.compile(pat) result = prog.match(str)
is equivalent to
result = re.match(pat, str)
but the version using compile() is more efficient when the expression will be used several times in a single program.
>>> re.split('\W+', 'Words, words, words.') ['Words', 'words', 'words', ''] >>> re.split('(\W+)', 'Words, words, words.') ['Words', ', ', 'words', ', ', 'words', '.', ''] >>> re.split('\W+', 'Words, words, words.', 1) ['Words', 'words, words.']
Return a list of all non-overlapping matches of pattern in string. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.
>>> def dashrepl(matchobj): .... if matchobj.group(0) == '-': return ' ' .... else: return '-' >>> re.sub('-{1,2}', dashrepl, 'pro----gram-files') 'pro--gram files'
The optional argument count is the maximum number of pattern occurrences to be replaced; count must be a non-negative integer, and the default value of 0 means to replace all occurrences.
Empty matches for the pattern are replaced only when not adjacent to a previous match, so "sub('x*', '-', 'abc')" returns '-a-b-c-'.
If repl is a string, any backslash escapes in it are processed. That is, "\n" is converted to a single newline character, "\r" is converted to a linefeed, and so forth. Unknown escapes such as "\j" are left alone. Backreferences, such as "\6", are replaced with the substring matched by group 6 in the pattern.
In addition to character escapes and backreferences as described above, "\g<name>" will use the substring matched by the group named "name", as defined by the (?P<name>...) syntax. "\g<number>" uses the corresponding group number; "\ g<2>" is therefore equivalent to "\2", but isn't ambiguous in a replacement such as "\g<2>0". "\20" would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character "0".