Python multiline regex groups with finditer return only the last match

Problem description

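I have a repeating text output, and I want to capture five groups from each repetition. The pattern spans multiple newlines. I would like to get an iterator of tuples. I tried the following, but it seems to capture only the last match; findall likewise returns a list containing only the last tuple: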

import re
string = '''-----------------------------------------------------------------------
Selecting top 2 features.
Top features (not sorted): CXVol,CCVol
Total prediction score (mean accuracy): 0.611111
              precision    recall  f1-score   support

           1       0.62      0.83      0.71         6
           2       1.00      0.50      0.67         6
           3       0.43      0.50      0.46         6

    accuracy                           0.61        18
   macro avg       0.68      0.61      0.61        18
weighted avg       0.68      0.61      0.61        18

Ranking of other features (sorted): IL10,IL5,R2GP,R2Thal,FACC,IL6,FASTR,R2CC,p75,TNF,GPVol,R2STR,ODISTR,STRVol,R2CC,FACX,ILB,ODIGP,FAHIPP,MDThal,FAThal,IL2,MDCC,MDSTR,MDGP
-----------------------------------------------------------------------

-----------------------------------------------------------------------
Selecting top 3 features.
Top features (not sorted): CXVol,CCVol,IL10
Total prediction score (mean accuracy): 0.666667
              precision    recall  f1-score   support

           1       0.60      1.00      0.75         6
           2       0.75      0.50      0.60         6
           3       0.75      0.50      0.60         6

    accuracy                           0.67        18
   macro avg       0.70      0.67      0.65        18
weighted avg       0.70      0.67      0.65        18

Ranking of other features (sorted): IL5,R2GP,R2Thal,FACC,IL6,FASTR,R2CC,p75,TNF,GPVol,R2STR,ODISTR,STRVol,R2CC,FACX,ILB,ODIGP,FAHIPP,MDThal,FAThal,IL2,MDCC,MDSTR,MDGP
-----------------------------------------------------------------------

-----------------------------------------------------------------------
Selecting top 4 features.
Top features (not sorted): CXVol,CCVol,IL5,IL10
Total prediction score (mean accuracy): 0.611111
              precision    recall  f1-score   support

           1       0.60      1.00      0.75         6
           2       0.75      0.50      0.60         6
           3       0.50      0.33      0.40         6

    accuracy                           0.61        18
   macro avg       0.62      0.61      0.58        18
weighted avg       0.62      0.61      0.58        18

Ranking of other features (sorted): R2GP,R2Thal,FACC,IL6,FASTR,R2CC,p75,TNF,GPVol,R2STR,ODISTR,STRVol,R2CC,FACX,ILB,ODIGP,FAHIPP,MDThal,FAThal,IL2,MDCC,MDSTR,MDGP
-----------------------------------------------------------------------
'''

# Raw strings keep backslash sequences such as \s and \d intact for the regex engine.
p = re.compile(r".*top\s(\d+)\sf"
               r".*Top.*ed\):\s(\S+)\n"
               r".*curacy\):\s(\S+)\n"
               r".*hted\savg\s+(\S+)\s+(\S+)", re.S)
m = p.finditer(string)
for x in m:
    print(x.groups())

#Out ('4', 'CXVol,CCVol,IL5,IL10', '0.611111', '0.62', '0.61')

Tags: python, regex

Solution
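
A likely cause, offered as a sketch rather than a confirmed answer: with re.S, every greedy `.*` can cross newlines, so the leading `.*` swallows the whole string and backtracks only as far as the last "top N f". The result is a single match that starts at the beginning of the text and ends in the final block. Dropping the leading `.*` and making the remaining ones lazy (`.*?`) lets each match stop at the nearest block instead. The `text` below is a shortened two-block stand-in for the original output, and the names `greedy`, `lazy`, `greedy_rows` and `lazy_rows` are illustrative only.

```python
import re

# Shortened two-block sample in the same layout as the output in the question.
text = """-----
Selecting top 2 features.
Top features (not sorted): CXVol,CCVol
Total prediction score (mean accuracy): 0.611111
   macro avg       0.68      0.61      0.61        18
weighted avg       0.68      0.61      0.61        18
-----

-----
Selecting top 3 features.
Top features (not sorted): CXVol,CCVol,IL10
Total prediction score (mean accuracy): 0.666667
   macro avg       0.70      0.67      0.65        18
weighted avg       0.70      0.67      0.65        18
-----
"""

# Pattern from the question: greedy .* combined with re.S spans all blocks.
greedy = re.compile(r".*top\s(\d+)\sf"
                    r".*Top.*ed\):\s(\S+)\n"
                    r".*curacy\):\s(\S+)\n"
                    r".*hted\savg\s+(\S+)\s+(\S+)", re.S)

# Variant: no leading .*, and lazy .*? so each match ends at the nearest block.
lazy = re.compile(r"top\s(\d+)\sf"
                  r".*?Top.*?ed\):\s(\S+)\n"
                  r".*?curacy\):\s(\S+)\n"
                  r".*?hted\savg\s+(\S+)\s+(\S+)", re.S)

greedy_rows = [m.groups() for m in greedy.finditer(text)]
lazy_rows = [m.groups() for m in lazy.finditer(text)]

print(len(greedy_rows))  # 1: only the last block is captured
for row in lazy_rows:    # one tuple per block
    print(row)
```

The lazy variant assumes every block contains all four lines it looks for. If a block is missing one, `.*?` would stretch into the next block, so splitting the text on the dashed separator lines first and matching each piece separately may be more robust.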

