Python Regular Expression for pattern containing multiple lines

Problem description

I want to extract all the text printed after "AAAAAAAAAAAAAAAAAA":

Give me some text!
AAAAAAAAAAAAAAAAAA




        S
       p
      p
     p
Epppp

The following does not work:

import re

m = re.findall(r'AAAAAAAAAAAAAAAAAA(.*)', result)

print m[0]

Also, can I use a variable in the regular expression instead of a hard-coded string such as "AAAAAAAAAAAAAAAAAA"?

The reason is that the text "AAAAAAAAAAAAAAAAAA" comes from a variable and changes, so I would like to search for that variable's value in the pattern and then extract all the text after it.

Tags: python, regex

Solution


Pass re.S (or re.DOTALL; they are synonyms) so that . also matches newlines and findall can match across lines. In your case, re.search is probably more appropriate, since you only want one match. To make it work with a non-hard-coded string, build the pattern with string formatting or string concatenation, and run the variable through re.escape first so that any regex metacharacters it contains are matched literally.

import re

result = """Give me some text!
AAAAAAAAAAAAAAAAAA




        S
       p
      p
     p
Epppp"""

s = 'AAAAAAAAAAAAAAAAAA'
# With formatting
m = re.search(r'{}(.*)'.format(re.escape(s)), result, re.S)
# With concatenation
m = re.search(re.escape(s) + r'(.*)', result, re.S)

# Parenthesized form works on both Python 2 and Python 3
print(m.group(1))
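
As a minimal sketch of the same idea, the search can be wrapped in a small helper that also handles the case where the marker is absent. The function name text_after and the sample string are made up for illustration and are not part of the original answer. The first print shows why the pattern in the question returns nothing useful: without re.DOTALL, . stops at the first newline.

```python
import re

def text_after(marker, text):
    """Return everything in text after the first occurrence of marker, or None."""
    # re.escape keeps characters such as '.' or '*' in the marker literal;
    # re.DOTALL lets '.' match newlines as well.
    m = re.search(re.escape(marker) + r'(.*)', text, re.DOTALL)
    return m.group(1) if m else None

# Hypothetical sample input; the marker deliberately contains a metacharacter.
sample = "header\nA.B\nline one\nline two"

# Without DOTALL the group ends at the newline right after the marker.
print(repr(re.search(r'A\.B(.*)', sample).group(1)))  # ''
print(repr(text_after("A.B", sample)))                # '\nline one\nline two'
print(text_after("missing", sample))                  # None
```

Note that the captured text starts with the newline that follows the marker; call .strip() or .lstrip('\n') on the result if that is not wanted.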
