Sorting a text file in ascending order

Problem description

I have a text file that looks like this:

07,12,9201
07,12,9201
06,18,9209
06,18,9209
06,19,9209
06,19,9209
07,11,9201

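I first want to remove all duplicate lines, then sort by column 1 in ascending order, and then sort by column 2 in ascending order while keeping column 1 in ascending order. Expected output: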

06,18,9209
06,19,9209
07,11,9201
07,12,9201

So far, I have tried:

with open('abc.txt') as f:
    lines = [line.split(' ') for line in f]

Consider another example:

00,0,6098
00,1,6098
00,3,6098
00,4,6094
00,5,6094
00,6,6094
00,7,6094
00,8,6094
00,9,6498
00,2,6098
00,20,6102
00,21,6087
00,22,6087
00,23,6087
00,3,6098
00,4,6094
00,5,6094
00,6,6094
00,7,6094
00,8,6094
00,9,6498

The output for this file should be:

00,0,6098
00,1,6098
00,2,6098
00,3,6098
00,4,6094
00,5,6094
00,6,6094
00,7,6094
00,8,6094
00,9,6498
00,20,6102
00,21,6087
00,22,6087
00,23,6087

Tags: python-3.x

Solution


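You can do the following: remove duplicates with a set, sort by the first column, then group by the first column and sort each group by the second column. Both columns are compared as integers.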

from itertools import groupby, chain
from collections import OrderedDict

input_file = 'input_file.txt'

# Collect each non-empty line as a tuple of its comma-separated fields
with open(input_file) as f:
    lines = [tuple(line.strip().split(',')) for line in f if line.strip()]

# Remove duplicates, then sort by the first column (compared as an integer)
sorted_lines = sorted(set(lines), key=lambda x: int(x[0]))

# Group by the first column (same integer key as the sort above)
# and sort each group by the second column
result = OrderedDict()
for k, g in groupby(sorted_lines, key=lambda x: int(x[0])):
    result[k] = sorted(g, key=lambda x: int(x[1]))

# Print the rows group by group
for v in chain(*result.values()):
    print(','.join(v))
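
As a possible simplification, the grouping step can be replaced by a single sort on a tuple key. This is a minimal sketch, not part of the original answer: the function name `dedupe_and_sort` and the in-memory `sample` list are hypothetical, chosen so the example runs without a file. It should give the same result for the first example in the question.

```python
def dedupe_and_sort(lines):
    """Return unique rows sorted by column 1, then column 2, both as integers."""
    # A set of tuples drops exact duplicate lines; blank lines are skipped.
    rows = {tuple(line.strip().split(',')) for line in lines if line.strip()}
    # Tuples compare element by element, so column 2 only breaks ties in column 1.
    return sorted(rows, key=lambda row: (int(row[0]), int(row[1])))


# Hypothetical in-memory copy of the first example from the question
sample = ["07,12,9201", "07,12,9201", "06,18,9209", "06,18,9209",
          "06,19,9209", "06,19,9209", "07,11,9201"]

for row in dedupe_and_sort(sample):
    print(','.join(row))
```

Rows that differ only in the third column are kept as distinct lines, and their relative order is not defined by this key. Adding the third column to the tuple would make it deterministic, if that matters for your data.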


Output 1:

06,18,9209
06,19,9209
07,11,9201
07,12,9201

Output 2:

00,0,6098
00,1,6098
00,2,6098
00,3,6098
00,4,6094
00,5,6094
00,6,6094
00,7,6094
00,8,6094
00,9,6498
00,20,6102
00,21,6087
00,22,6087
00,23,6087
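
The second output depends on comparing the second column as integers rather than as text. The short sketch below illustrates the difference. The list `second_column` is a made-up sample of values in the style of the question's data.

```python
# Hypothetical sample of second-column values, still as strings
second_column = ["9", "2", "20", "3"]

as_text = sorted(second_column)               # compares character by character
as_numbers = sorted(second_column, key=int)   # compares numeric values

print(as_text)
print(as_numbers)
```

Sorted as text, "20" lands before "3" because the comparison stops at the first differing character. With `key=int` it comes after "9", as the expected output requires.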
