Regex to match a string made of words followed by a space, then digits with dots or hyphens, then a space, then (some info)

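Problem description

I have a string in the following format: name category (more info)

For example: Foo Bar-8.io 5.61.0-rc-1 (data)

I need a regular expression that filters for strings matching the format above.

The name can contain alphanumeric characters, spaces, "-" and ".".

The category can start with a digit, followed by word characters, including dots or hyphens.

The data can be anything (.*) enclosed in ().

I tried this: ^[\w\s]+.*\s.*\(.*\)$ but it does not seem to cover the pattern above.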

Tags: python, regex

Solution

Use

^(.*)\s+(\S+)\s+\((.*)\)$

See the regex proof.
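Since the question is tagged python, here is a minimal sketch of how the pattern might be applied with the standard re module. The names LINE_RE and parse_line are hypothetical and chosen only for illustration.

```python
import re

# The pattern from the answer, compiled once for reuse.
LINE_RE = re.compile(r"^(.*)\s+(\S+)\s+\((.*)\)$")

def parse_line(line):
    """Return (name, category, data) if the line matches, else None.

    parse_line is a hypothetical helper, not part of the original answer.
    """
    m = LINE_RE.match(line)
    return m.groups() if m else None

if __name__ == "__main__":
    # The sample string from the question.
    print(parse_line("Foo Bar-8.io 5.61.0-rc-1 (data)"))
    # -> ('Foo Bar-8.io', '5.61.0-rc-1', 'data')
```

Because the first group is greedy, it absorbs every word except the last whitespace-free token before the parenthesis, which is why a name containing spaces still ends up in a single group.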

Explanation

--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \(                       '('
--------------------------------------------------------------------------------
  (                        group and capture to \3:
--------------------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \3
--------------------------------------------------------------------------------
  \)                       ')'
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
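The pattern explained above is deliberately permissive: any text is accepted as the name, and any run of non-whitespace as the category. If the constraints stated in the question should be enforced, a stricter variant could look like the sketch below. This is an assumption built from the question's wording (name limited to letters, digits, spaces, "-" and "."; category starting with a digit), not something the answer itself proposes, and STRICT_RE and parse_strict are hypothetical names.

```python
import re

# Hypothetical stricter pattern; the character classes are assumptions
# derived from the question's description, not from the answer.
STRICT_RE = re.compile(
    r"^([A-Za-z0-9 .-]+)"   # name: letters, digits, spaces, '.', '-'
    r"\s+(\d[\w.-]*)"       # category: starts with a digit
    r"\s+\((.*)\)$"         # data: anything inside the parentheses
)

def parse_strict(line):
    """Return (name, category, data) for a matching line, else None."""
    m = STRICT_RE.match(line)
    return m.groups() if m else None

if __name__ == "__main__":
    print(parse_strict("Foo Bar-8.io 5.61.0-rc-1 (data)"))
    print(parse_strict("Foo Bar beta (data)"))  # category lacks a leading digit
```

The second call returns None because no whitespace-separated token before the parenthesis starts with a digit, whereas the permissive pattern would accept that line.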
