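python - Python multiline regex groups with finditer return only the last match
Problem description
I have repetitive text output, and I want to capture five groups from each repetition. The pattern spans multiple newlines. I would like to get an iterator of tuples. I tried the following, but it seems to capture only the last match; trying findall likewise returns a list containing only the last tuple: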
import re
string = '''-----------------------------------------------------------------------
Selecting top 2 features.
Top features (not sorted): CXVol,CCVol
Total prediction score (mean accuracy): 0.611111
precision recall f1-score support
1 0.62 0.83 0.71 6
2 1.00 0.50 0.67 6
3 0.43 0.50 0.46 6
accuracy 0.61 18
macro avg 0.68 0.61 0.61 18
weighted avg 0.68 0.61 0.61 18
Ranking of other features (sorted): IL10,IL5,R2GP,R2Thal,FACC,IL6,FASTR,R2CC,p75,TNF,GPVol,R2STR,ODISTR,STRVol,R2CC,FACX,ILB,ODIGP,FAHIPP,MDThal,FAThal,IL2,MDCC,MDSTR,MDGP
-----------------------------------------------------------------------
-----------------------------------------------------------------------
Selecting top 3 features.
Top features (not sorted): CXVol,CCVol,IL10
Total prediction score (mean accuracy): 0.666667
precision recall f1-score support
1 0.60 1.00 0.75 6
2 0.75 0.50 0.60 6
3 0.75 0.50 0.60 6
accuracy 0.67 18
macro avg 0.70 0.67 0.65 18
weighted avg 0.70 0.67 0.65 18
Ranking of other features (sorted): IL5,R2GP,R2Thal,FACC,IL6,FASTR,R2CC,p75,TNF,GPVol,R2STR,ODISTR,STRVol,R2CC,FACX,ILB,ODIGP,FAHIPP,MDThal,FAThal,IL2,MDCC,MDSTR,MDGP
-----------------------------------------------------------------------
-----------------------------------------------------------------------
Selecting top 4 features.
Top features (not sorted): CXVol,CCVol,IL5,IL10
Total prediction score (mean accuracy): 0.611111
precision recall f1-score support
1 0.60 1.00 0.75 6
2 0.75 0.50 0.60 6
3 0.50 0.33 0.40 6
accuracy 0.61 18
macro avg 0.62 0.61 0.58 18
weighted avg 0.62 0.61 0.58 18
Ranking of other features (sorted): R2GP,R2Thal,FACC,IL6,FASTR,R2CC,p75,TNF,GPVol,R2STR,ODISTR,STRVol,R2CC,FACX,ILB,ODIGP,FAHIPP,MDThal,FAThal,IL2,MDCC,MDSTR,MDGP
-----------------------------------------------------------------------
'''
# Raw strings keep \s, \d and \S from being treated as string escape sequences.
p = re.compile(r".*top\s(\d+)\sf"
               r".*Top.*ed\):\s(\S+)\n"
               r".*curacy\):\s(\S+)\n"
               r".*hted\savg\s+(\S+)\s+(\S+)", re.S)
m = p.finditer(string)
for x in m:
    print(x.groups())
# Output: ('4', 'CXVol,CCVol,IL5,IL10', '0.611111', '0.62', '0.61')
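The behaviour above follows from greedy matching: with re.S, `.` also matches newlines, so the leading `.*` first runs to the end of the whole string and then backtracks only as far as it must. The single match therefore swallows every earlier block and reports the groups of the last one. Below is a minimal sketch of one possible fix, assuming every block is complete: drop the leading `.*` and make the remaining gaps lazy (`.*?`). The names `sample`, `pattern` and `rows` are illustrative, and `sample` is a shortened two-block excerpt of the data in the question rather than the full output.

```python
import re

# Shortened two-block excerpt in the same shape as the output in the question.
sample = """\
Selecting top 2 features.
Top features (not sorted): CXVol,CCVol
Total prediction score (mean accuracy): 0.611111
weighted avg 0.68 0.61 0.61 18
Selecting top 3 features.
Top features (not sorted): CXVol,CCVol,IL10
Total prediction score (mean accuracy): 0.666667
weighted avg 0.70 0.67 0.65 18
"""

# No leading ".*", and every gap is lazy (".*?"), so each match ends at the
# first text that satisfies the pattern instead of stretching to the last block.
pattern = re.compile(
    r"top\s(\d+)\sf"
    r".*?Top.*?ed\):\s(\S+)\n"
    r".*?curacy\):\s(\S+)\n"
    r".*?hted\savg\s+(\S+)\s+(\S+)",
    re.S,
)

# One tuple of five groups per block.
rows = [match.groups() for match in pattern.finditer(sample)]
for row in rows:
    print(row)
```

On this excerpt the loop prints one tuple per block, starting with ('2', 'CXVol,CCVol', '0.611111', '0.68', '0.61'). One caveat of the lazy approach: if a block were missing one of the expected lines, the match could run on into the following block, so anchoring on the dashed separator lines may be safer for less regular output.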