How do I remove duplicate lines from a file?

Problem description

I have a data file, one.txt:

 822.25 111.48 883.59 256.68
 822.25 111.48 883.59 256.68
 8.6 123.68 467.27 276.69
 0.0 186.77 165.62 375.0
 0.0 186.77 165.62 375.0
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 438.03 148.5 540.88 198.54
 511.99 170.97 571.74 215.81
 511.99 170.97 571.74 215.81

For lines that are repeated, I want to write only a single copy. For example:

724.76 177.83 923.52 316.78

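is repeated 5 times, but I want to write it only once. I want to do the same for the other lines and then write the new data to a file.

My code: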

with open('one.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        for line in infile:
            # How to do this? If a line is repeated,
            # remove the repeats and keep only one copy of it.
            outfile.write(line)

Tags: python

Solution


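This can be done with the Linux utility uniq: just type uniq <infile.txt >outfile.txt in a terminal. The < and > symbols tell the shell to use the given files in place of standard input and standard output. Note that uniq only collapses duplicate lines that are adjacent to each other, which is the case for the sample data above.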

To reinvent this utility in Python, you can write:

with open('one.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        prev_line = infile.readline()  # read first line
        outfile.write(prev_line)
        for line in infile:
            if line != prev_line:  # if the line differs from the previous one, write it
                prev_line = line
                outfile.write(line)
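
The loop above only skips a line when it equals the line immediately before it, so it relies on duplicates being adjacent, as they are in one.txt. If the repeated lines might be scattered through the file, one possible variation is to remember every line already written in a set. The following is a minimal sketch, not part of the original answer: the helper names unique_lines and dedupe_file are hypothetical, and it assumes the set of distinct lines fits in memory.

```python
def unique_lines(lines):
    """Yield each line the first time it appears, preserving input order.

    Hypothetical helper: unlike the prev_line loop, it also drops
    duplicates that are not adjacent.
    """
    seen = set()  # lines that have already been yielded
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


def dedupe_file(src_path, dst_path):
    """Copy src_path to dst_path, keeping only the first copy of each line."""
    with open(src_path, 'r') as infile, open(dst_path, 'w') as outfile:
        outfile.writelines(unique_lines(infile))
```

With the files from the question, this would be called as dedupe_file('one.txt', 'output.txt'). Lines are compared as exact strings, so two lines that differ only in whitespace count as different.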
