How do I check for a list that contains tab characters in Python?

Question

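I have a data.csv file with the content below; the file also has some blank lines at the end. I want to read this file and get the values of specific columns from its last row.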

Connecting to the ControlService endpoint

Found 3 rows.
Requests List:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Client ID                                                                   | Client Type                  | Service Type | Status               | Trust Domain              | Data Instance Name | Data Version | Creation Time              | Last Update                | Scheduled Time | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 REFRESH_ROUTINGTIER_ARTIFACTS_1465901168866                              | ROUTINGTIER_ARTIFACTS | SYSTEM       | COMPLETED            | RRA Bulk Client    | soa_server1       | 18.2.2.0.0  | 2016-06-14 03:49:55 -07:00 | 2016-06-14 03:49:57 -07:00 | ---            | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 500333443                                                          | CREATE                        | [FA_GSI]     | COMPLETED            | holder       | soa_server1       | 18.3.2.0.0  | 2018-08-07 11:59:57 -07:00 | 2018-08-07 12:04:37 -07:00 | ---            | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 500333446                                                          | CREATE                        | [FA_GSI]     | COMPLETED            | holder-test  | soa_server1       | 18.3.2.0.0  | 2018-08-07 12:04:48 -07:00 | 2018-08-07 12:08:52 -07:00 | ---            | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

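Now I want to parse the file above and extract values from its last row. Specifically, I want the values of the "Client ID" and "Trust Domain" columns from the last row, which are: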

Client ID: 500333446
Trust Domain: holder-test

I wrote the Python script below, but it fails because of the blank lines at the end of the csv file. If the csv file has no trailing blank lines, it works fine.

import csv

lines_to_skip = 4
with open('data.csv', 'r') as f:
    reader = csv.reader(f, delimiter='|')
    for i in range(lines_to_skip):
        next(reader)

    data = []
    for line in reader:
        if line[0].find("---") != 0:
            print line
            data.append(line)

print("{}={}".format(data[-1][0].replace(" ",""),data[-1][4].replace(" ","")))

If my csv file has some blank lines at the end, I get this error at the if statement:

Traceback (most recent call last):
  File "test.py", line 11, in <module>
    if line[0].find("---") != 0:
IndexError: list index out of range

This is the last line that gets printed:

[' \t\t']


Answer


You can split each line on | to build a list of dictionaries, then print only the Client ID and Trust Domain of the last row:

with open('data.csv') as f:

    # collect rows of interest
    rows = []
    for line in f:
        if '|' in line:
            items = [item.strip() for item in line.split('|')]
            rows.append(items)

    # first item will be headers
    headers = rows[0]

    # put each row into dictionary
    data = [dict(zip(headers, row)) for row in rows[1:]]

    # print out last row information of interest
    print('Client ID:', data[-1]['Client ID'])
    print('Trust Domain:', data[-1]['Trust Domain'])

This outputs:

Client ID: 500333446
Trust Domain: holder-test
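
The `'|' in line` check is what makes this approach robust: separator rows and trailing blank lines contain no `|`, so they never reach the indexing step. As a small illustration of why the script in the question raised IndexError, the sketch below feeds a few sample lines (made up for this example, not taken from the real file) to csv.reader. A whitespace-only line comes back as a one-element list, while a truly empty line comes back as an empty list, so `line[0]` has nothing to index.

```python
import csv

# Hypothetical sample lines: a data row, a whitespace-only line, an empty line.
sample = [" 500333446 | CREATE | holder-test | ", " \t\t", ""]

# csv.reader accepts any iterable of strings, not only file objects.
rows = list(csv.reader(sample, delimiter="|"))

for row in rows:
    # Print the field count next to each parsed row.
    print(len(row), row)
```

The last row printed has zero fields, which is most likely the row that triggered the error right after `[' \t\t']` was printed.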

As requested in the comments, if you want to print 500333446=holder-test instead, you can change the final print calls to:

print('%s=%s' % (data[-1]['Client ID'], data[-1]['Trust Domain']))
# 500333446=holder-test
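
If you would rather keep csv.reader, as in the question, a similar guard can be applied to the parsed rows instead of the raw lines. The following is only a sketch under assumptions: `SAMPLE` is a shortened, made-up excerpt of the report, and `last_row` is a hypothetical helper name. It keeps only rows that split into more than one field, which drops empty rows, whitespace-only rows, and the dashed separators in one step.

```python
import csv
import io

# Made-up, shortened excerpt of the report, ending with a
# whitespace-only line and an empty line like the real file.
SAMPLE = (
    "Found 1 rows.\n"
    "Requests List:\n"
    "-----\n"
    " Client ID | Client Type | Service Type | Status | Trust Domain | \n"
    "-----\n"
    " 500333446 | CREATE | [FA_GSI] | COMPLETED | holder-test | \n"
    "-----\n"
    " \t\t\n"
    "\n"
)


def last_row(stream):
    """Return the last table row as a dict keyed by the header names."""
    reader = csv.reader(stream, delimiter="|")
    # Rows without a '|' parse to zero or one field, so len(row) > 1
    # filters out everything that is not part of the table.
    rows = [[cell.strip() for cell in row] for row in reader if len(row) > 1]
    headers, records = rows[0], rows[1:]
    return dict(zip(headers, records[-1]))


row = last_row(io.StringIO(SAMPLE))
print("{}={}".format(row["Client ID"], row["Trust Domain"]))
```

To run this against the real file, pass an open file object instead of the StringIO, for example `open('data.csv', newline='')` on Python 3.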
