python - Regex: match a pattern repeated any number of times, with each repetition captured in separate groups

Problem description

I am trying to match (if possible, only) the coordinate values contained in the following line:

function f is described by the (x,y) couples: 0.000000E+00 0.000000E+00  5.00000     0.500000E-01  1.0000     0.290000      2.0000      1.56000      3.0000      5.47000      4.0000      17.3000      4.50000      31.2000      5.0000      52.6000

The first pair is matched as desired, that is, in two separate groups, by

(?<=\bcouples:\s)(\S+)\s+(\S+)\s+

Then,

    (?<=\bcouples:\s)((\S+)\s+(\S+)\s+)+

matches the whole line, but only the last pair of coordinates ends up in separate groups.
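The behaviour described above can be reproduced with a minimal sketch. It assumes a shortened version of the sample line with a trailing space added (so that the final `\s+` has something to match). When a capturing group sits inside a repeated group, Python's `re` module keeps only the text from the last repetition.

```python
import re

# Shortened sample line; the trailing space is an assumption made so that
# the last "\s+" of the pattern can match.
line = ("function f is described by the (x,y) couples: "
        "0.000000E+00 0.000000E+00  5.00000     0.500000E-01  "
        "1.0000     0.290000 ")

m = re.search(r"(?<=\bcouples:\s)((\S+)\s+(\S+)\s+)+", line)

# Groups 2 and 3 hold only the values from the last repetition.
print(m.group(2), m.group(3))  # 1.0000 0.290000
```

The earlier pairs are part of the overall match (`m.group(0)`) but cannot be recovered from the groups, which is why a different approach is needed.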

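Clarification: the number of coordinate pairs is not known in advance, so simply appending

(\S+)\s+(\S+)\s+

a fixed number of times to the end of the regex is not an option.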

Thanks for your input!

Tags: python, regex, regex-group

Use findall():

re.findall(r"(?:\s+([\d\.Ee+-]+)\s+([\d\.Ee+-]+))+?",s)

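([\d\.Ee+-]+)\s+([\d\.Ee+-]+) --> two floating-point numbers,
                                  each captured in its own group;
(?:\s+ ... )+? --> (?: ... ) makes the outer group non-capturing, since the pair as a whole is not needed;
                   +? allows repetition but is non-greedy, so each match covers a single pair
                   and findall() returns one tuple per pair.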
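As a rough end-to-end sketch, the pattern can be run against the full line from the question and the captured strings converted to floats. The variable names `line`, `pairs` and `coords` are illustrative only and do not come from the original answer.

```python
import re

line = ("function f is described by the (x,y) couples: "
        "0.000000E+00 0.000000E+00  5.00000     0.500000E-01  "
        "1.0000     0.290000      2.0000      1.56000      "
        "3.0000      5.47000      4.0000      17.3000      "
        "4.50000      31.2000      5.0000      52.6000")

# Same pattern as in the answer: one (x, y) tuple of strings per match.
pairs = re.findall(r"(?:\s+([\d\.Ee+-]+)\s+([\d\.Ee+-]+))+?", line)

# Convert the captured strings to numbers.
coords = [(float(x), float(y)) for x, y in pairs]

print(len(coords))   # 8
print(coords[0])     # (0.0, 0.0)
print(coords[-1])    # (5.0, 52.6)
```

Note that the character class also accepts lone `E`, `e`, `+`, `-` and `.` characters, so a word that starts with `e` after whitespace would be picked up as a "number". This does not happen in the sample line, but restricting the search to the text after `couples:` is safer for other inputs.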

Edit: you can select the appropriate lines first:

if "couples:" in s:
    coords = re.findall(...)
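One possible way to flesh out this line selection is sketched below. The helper name `extract_couples` and the sample text are hypothetical. The sketch uses `str.partition` so that the regex only sees the text after `couples:`, and it drops the outer `(?: ... )+?` wrapper, which should not change what `findall()` returns here.

```python
import re

# One (x, y) pair: whitespace, a number, whitespace, a number.
PAIR = re.compile(r"\s+([\d.Ee+-]+)\s+([\d.Ee+-]+)")

def extract_couples(text):
    """Return one list of (x, y) string pairs per line containing 'couples:'."""
    result = []
    for line in text.splitlines():
        # partition() returns an empty separator when the marker is absent.
        _head, sep, tail = line.partition("couples:")
        if sep:
            result.append(PAIR.findall(tail))
    return result

sample = (
    "header 1 2 3\n"
    "function f is described by the (x,y) couples: 0.0 1.0  2.0 3.5\n"
    "function g is described by the (x,y) couples: 0.1E+00 0.2E+00\n"
)
print(extract_couples(sample))
```

Lines without the marker, such as the `header` line, are skipped even though they contain numbers.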

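If your text contains more than one "couples" marker, you can split it on that word. In the following example, the regex can be applied to the second or the third piece of the split string, or to both: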

s="function f is described by the (x,y) couples: 0.000000E+00 0.000000E+00  5.00000     0.500000E-01 function g is described by the (x,y) couples: 0.1E+00 0.2E+00  9.00000     0.900000E-01"

ls = s.split("couples")
print(ls)
# ['function f is described by the (x,y) ',
#  ': 0.000000E+00 0.000000E+00  5.00000     0.500000E-01 function g is described by the (x,y) ',
#  ': 0.1E+00 0.2E+00  9.00000     0.900000E-01']

re.findall(r"(?:\s+([\d\.Ee+-]+)\s+([\d\.Ee+-]+))+?", ls[1])
# [('0.000000E+00', '0.000000E+00'), ('5.00000', '0.500000E-01')]
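To apply the regex to every piece rather than a single one, a list comprehension over the split result may be enough. This is only a sketch: the name `all_pairs` is illustrative, and splitting on `"couples:"` (with the colon) instead of `"couples"` is a small variation on the answer.

```python
import re

s = ("function f is described by the (x,y) couples: 0.000000E+00 0.000000E+00"
     "  5.00000     0.500000E-01 function g is described by the (x,y) couples:"
     " 0.1E+00 0.2E+00  9.00000     0.900000E-01")

# Skip the first piece: it is the text before the first marker.
pieces = s.split("couples:")[1:]

# One list of (x, y) string pairs per function.
all_pairs = [re.findall(r"\s+([\d.Ee+-]+)\s+([\d.Ee+-]+)", p) for p in pieces]

for pairs in all_pairs:
    print(pairs)
```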
