Regular expression in Python to get the text after ":"

Question

I have been trying different combinations to extract the text after ":".

    materials[3] = 'PE HD Monofilament Yarn CFR India Assessment Main Ports Spot 2-4 Weeks Full Market Range Weekly (Low) : USD/tonne'

    re.match(r'(?<=:+.)(.*)', materials[3])

However, I get errors when I run this in PyCharm, even though the pattern above works fine when I test it at https://regexr.com/.

The error returned by Python is as follows:

  re.match(r'(?<=:+.)(.*)', materials[3])
Traceback (most recent call last):
  File "C:\Users\p119124\AppData\Local\Programs\Python\Python37\lib\site-packages\IPython\core\interactiveshell.py", line 3343, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-210-556fd124536f>", line 1, in <module>
    re.match(r'(?<=:+.)(.*)', materials[3])
  File "C:\Users\p119124\AppData\Local\Programs\Python\Python37\lib\re.py", line 173, in match
    return _compile(pattern, flags).match(string)
  File "C:\Users\p119124\AppData\Local\Programs\Python\Python37\lib\re.py", line 286, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Users\p119124\AppData\Local\Programs\Python\Python37\lib\sre_compile.py", line 768, in compile
    code = _code(p, flags)
  File "C:\Users\p119124\AppData\Local\Programs\Python\Python37\lib\sre_compile.py", line 607, in _code
    _compile(code, p.data, flags)
  File "C:\Users\p119124\AppData\Local\Programs\Python\Python37\lib\sre_compile.py", line 182, in _compile
    raise error("look-behind requires fixed-width pattern")
re.error: look-behind requires fixed-width pattern 

Can you help me?

The idea is just to extract "USD/tonne".

Tags: python, python-3.x, regex, re

Answer


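Lookbehind patterns in Python's re module must match strings of a fixed length. In (?<=:+.), the :+ can match one or more colons, so the lookbehind has no fixed width and the pattern fails to compile.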
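To illustrate the constraint, here is a minimal sketch that compiles the pattern from the question and then a fixed-width variant. The shortened text string and the variable names are assumptions made for illustration, and the fixed-width version only works if the colon is always followed by exactly one space.

```python
import re

# Shortened, hypothetical version of the string from the question.
text = 'Full Market Range Weekly (Low) : USD/tonne'

# Variable-width lookbehind: ':+' can match one or more colons,
# so the re module refuses to compile the pattern.
try:
    re.compile(r'(?<=:+.)(.*)')
    compiled = True
except re.error as exc:
    compiled = False
    message = str(exc)

# Fixed-width lookbehind: exactly one colon followed by one space.
# re.search scans the whole string; re.match would only try position 0.
fixed = re.search(r'(?<=: ).*', text)
result = fixed.group(0)

print(compiled, message, result)
```

If the spacing around the colon can vary, the fixed-width lookbehind is too rigid, which is why a capturing group is usually the more robust choice.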

Use a capturing group with re.search instead (re.match only matches at the start of the string):

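import re

materials = 'PE HD Monofilament Yarn CFR India Assessment Main Ports Spot 2-4 Weeks Full Market Range Weekly (Low) : USD/tonne'

# Capture everything after the last colon, skipping any whitespace that follows it.
match = re.search(r'.*:\s*(.+)', materials)
if match:
    print(match.group(1))  # group(1) is the text captured by (.+)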


Explanation of the expression

--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  :                        ':'
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    .+                       any character except \n (1 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
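
Because the leading .* is greedy, the pattern captures the text after the last colon in the string. The sketch below shows this on a made-up string containing two colons (the sample value is an assumption, not data from the question), alongside a lazy variant and a non-regex alternative using str.rpartition.

```python
import re

# Hypothetical string with two colons, used only to show the difference.
sample = 'Grade: Low : USD/tonne'

# Greedy '.*' runs to the last colon, so only the final field is captured.
greedy = re.search(r'.*:\s*(.+)', sample).group(1)

# Lazy '.*?' stops at the first colon, so everything after it is captured.
lazy = re.search(r'.*?:\s*(.+)', sample).group(1)

# str.rpartition splits on the last colon without using a regex at all.
plain = sample.rpartition(':')[2].strip()

print(greedy)
print(lazy)
print(plain)
```

For a string with a single colon, as in the question, all three approaches should give the same result; the choice only matters when more than one colon can appear.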
