How to change multiple lines in a file using re.match()

Problem description

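I have four words in an input file (input.txt), and I want to search for these words in another input file (file.txt). If a word matches the condition, the fourth column of file.txt should be multiplied by 2. My code writes the data to the output file (file_edited.txt) 4 times. That is not what I want. How can I fix my code?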

My code:

# !/usr/bin/env python3
import os
import re
import numpy as np

out = open("file_edited.txt", "w")
with open("input.txt", mode='r') as f:
    for lines in f:
        line = lines.strip()

        search_str = line    
        with open('file.txt', mode='r') as infile:
            for line in infile:
                data = line.rstrip().split()
                if (len(data) == 4) and re.match(search_str, line):
                  data[3] = 2 * float(data[3])
                  out.write("%s %5s %12s %7.2f\n" % (data[0], data[1], data[2], data[3]))
                else:
                    out.write(line)
out.close()

input.txt:

C48D42
CC3752
A52C35
A4814C

file.txt:

C1522      1    07.123222    1.98  1.222222 
C48D42     9    08.222222    2.13
C48D42     4    07.288822    5.58  5.356359
CC3D51     2    09.227822    2.58  3.568523
CC3752     3    07.333333    4.45
ABCD15     3    07.266222    2.50  5.084582 
CC3752     6    07.222222    3.25  4.084582  
CC3552     3    07.223222    8.42  8.356359
A52C35     3    09.222222    2.15
A4814C     3    07.222222    2.55  5.256254
A4814C     3    07.222222    3.45
CCD152     3    07.222222    0.00  2.451678

Desired output (file_edited.txt):

C1522      1    07.123222    1.98  1.222222 
C48D42     9    08.222222    4.26
C48D42     4    07.288822    5.58  5.356359
CC3D51     2    09.227822    2.58  3.568523
CC3752     3    07.333333    8.90
ABCD15     3    07.266222    2.50  5.084582 
CC3752     6    07.222222    3.25  4.084582  
CC3552     3    07.223222    8.42  8.356359
A52C35     3    09.222222    4.30
A4814C     3    07.222222    2.55  5.256254
A4814C     3    07.222222    6.90
CCD152     3    07.222222    0.00  2.451678

Tags: python

Solution


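For every word you read from input.txt, the second loop writes all the lines of file.txt again, so the output contains the whole file once per word. Try storing the lines in a data structure such as a default dictionary and writing them out only once at the end.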
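
The repetition described above can be reproduced in a few lines. This is only an illustrative sketch: `words` and `rows` are hypothetical in-memory stand-ins for input.txt and file.txt, not the real files.

```python
# Hypothetical in-memory stand-ins for input.txt and file.txt
words = ["C48D42", "CC3752"]
rows = [
    "C48D42     9    08.222222    2.13\n",
    "ABCD15     3    07.266222    2.50  5.084582\n",
]

written = []
for word in words:      # outer loop: one pass per search word
    for row in rows:    # inner loop: every row is written on every pass
        written.append(row)

print(len(written))  # 4: each of the 2 rows is written once per word
```

With four words in input.txt, the same pattern writes every row of file.txt four times.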

#!/usr/bin/env python3
import re
from collections import defaultdict

# Map each line index of file.txt to the text that will be written out
d = defaultdict(str)

# Initialize every entry with the original line;
# only the matching lines are replaced below
with open('file.txt', mode='r') as infile:
    for index, line in enumerate(infile):
        d[index] = line

with open("input.txt", mode='r') as f:
    for lines in f:
        search_str = lines.strip()
        if not search_str:
            continue  # an empty pattern would match every line
        with open('file.txt', mode='r') as infile:
            for index, line in enumerate(infile):
                data = line.rstrip().split()
                if (len(data) == 4) and re.match(search_str, line):
                    data[3] = 2 * float(data[3])
                    # Replace the stored line only when it matches;
                    # otherwise the original line stays in place
                    d[index] = "%s %5s %12s %7.2f\n" % (data[0], data[1], data[2], data[3])

# Write each line exactly once, in the original order
with open("file_edited.txt", "w") as out:
    for k, v in d.items():
        out.write(v)
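
As a possible alternative, the same result can be reached in a single pass over file.txt by loading the search words into a set first. The sketch below is not part of the original answer. The function name `double_fourth_column` is hypothetical, and it assumes the search word must equal the first column exactly, which is stricter than the prefix match done by `re.match()`. For the sample data shown in the question, both approaches select the same lines.

```python
def double_fourth_column(words, lines):
    """Return the lines with the 4th column doubled wherever the line has
    exactly four columns and its first column is one of the words.

    Assumption: exact equality on the first column, not a regex prefix match.
    """
    wanted = {w.strip() for w in words if w.strip()}  # skip blank entries
    result = []
    for line in lines:
        data = line.split()
        if len(data) == 4 and data[0] in wanted:
            doubled = 2 * float(data[3])
            # Same output format as the code in the question
            result.append("%s %5s %12s %7.2f\n" % (data[0], data[1], data[2], doubled))
        else:
            result.append(line)  # non-matching lines are kept unchanged
    return result


# Small in-memory demonstration using two lines from the sample file.txt
sample_words = ["C48D42\n", "CC3752\n"]
sample_lines = [
    "C48D42     9    08.222222    2.13\n",
    "C48D42     4    07.288822    5.58  5.356359\n",
]
for out_line in double_fourth_column(sample_words, sample_lines):
    print(out_line, end="")
```

To apply it to the real files, pass the open input.txt and file.txt handles as `words` and `lines` and write the returned list to file_edited.txt with `writelines()`. Because each line of file.txt is visited once, nothing is written more than once.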
