How do I create a dictionary of lists from each line of a text file?

Problem description

I have a text file containing this data:

1, Jane Doe, 1991
2, Sam Smith, 1982
3, John Sung, 1965
4, Tony Tembo, 1977

I have something like the following, but it only works when each line has just an ID and a name:

names = {}
with open("dict.txt") as f:
  for line in f:
      (key, val) = line.strip().split(',')
      names[int(key)] = val

print(names)

How can I create a dictionary from this file that looks like the following?

{1: [Jane Doe, 1991], 2: [Sam Smith, 1982]...}

Tags: python, dictionary

Solution

When working with comma-separated data, it is best to use the csv module.

import csv

data_file = 'test.csv'
parsed = {}

with open(data_file) as data:
    reader = csv.reader(data)
    for row in reader:
        print(row)
        # row is ['1', ' Jane Doe', ' 1991']

        parsed[row[0]] = row[1:]

print(parsed)

Output:
['1', ' Jane Doe', ' 1991']
['2', ' Sam Smith', ' 1982']
['3', ' John Sung', ' 1965']
['4', ' Tony Tembo', ' 1977']
{'1': [' Jane Doe', ' 1991'], '2': [' Sam Smith', ' 1982'], '3': [' John Sung', ' 1965'], '4': [' Tony Tembo', ' 1977']}

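The reader returns the contents of each row as a list. From there, simply store the first list element as the dict key and the remaining list elements ([1:]) as the dict value.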
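
To make the slicing step concrete, here is a minimal sketch that runs without a file on disk. The io.StringIO object is only a stand-in for the opened file (an assumption for illustration, not part of the original answer), and it holds the first two lines of the sample data.

```python
import csv
import io

# Hypothetical in-memory stand-in for the opened file.
sample = io.StringIO(
    "1, Jane Doe, 1991\n"
    "2, Sam Smith, 1982\n"
)

parsed = {}
for row in csv.reader(sample):
    # row[0] becomes the key; row[1:] is every remaining column,
    # however many columns the line happens to have.
    parsed[row[0]] = row[1:]

print(parsed)
```

Because row[1:] takes everything after the first column, the same loop keeps working if more columns are added later.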

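Note that the code above reads everything as str, and you can see that the values carry the extra spaces from the file. If you need the keys and values to be specific types (and you need the list to hold elements of different types), you have to parse each of them separately: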

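# Continues from the snippet above (reuses csv and data_file).
# Start from an empty dict so the str keys from the first run are not kept.
parsed = {}

with open(data_file) as data: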
    reader = csv.reader(data)
    for row in reader:
        print(row)
        # row is ['1', ' Jane Doe', ' 1991']

        key = int(row[0])
        name = row[1].strip()
        year = int(row[2])

        parsed[key] = [name, year]

print(parsed)

Output:
['1', ' Jane Doe', ' 1991']
['2', ' Sam Smith', ' 1982']
['3', ' John Sung', ' 1965']
['4', ' Tony Tembo', ' 1977']
{1: ['Jane Doe', 1991], 2: ['Sam Smith', 1982], 3: ['John Sung', 1965], 4: ['Tony Tembo', 1977]}
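
As a possible alternative to calling strip() on the name, csv.reader accepts the dialect option skipinitialspace=True, which ignores whitespace immediately following each delimiter. The sketch below assumes the same sample data, fed from an io.StringIO stand-in rather than a real file. Note that this option only handles leading spaces after the comma; trailing spaces would still need strip().

```python
import csv
import io

# Hypothetical in-memory stand-in for the opened file.
sample = io.StringIO("1, Jane Doe, 1991\n2, Sam Smith, 1982\n")

parsed = {}
# skipinitialspace=True drops the space that follows each comma,
# so the name arrives as 'Jane Doe' instead of ' Jane Doe'.
for row in csv.reader(sample, skipinitialspace=True):
    key, name, year = row
    parsed[int(key)] = [name, int(year)]

print(parsed)
```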

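Of course, if you have more columns containing different types of data, you will need to adjust the indexing and the type coercion accordingly.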
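
One way to keep that adjustment in a single place is to list one converter per column and apply them in order. This is only a sketch under assumptions: CONVERTERS and parse_row are hypothetical names introduced here, and the io.StringIO object stands in for the real file.

```python
import csv
import io

# Hypothetical: one converter per column, in column order.
# Add or reorder entries here when the file layout changes.
CONVERTERS = [int, str.strip, int]


def parse_row(row):
    """Apply the matching converter to each column of a row."""
    if len(row) != len(CONVERTERS):
        raise ValueError(f"expected {len(CONVERTERS)} columns, got {len(row)}")
    return [convert(value) for convert, value in zip(CONVERTERS, row)]


# Hypothetical in-memory stand-in for the opened file.
sample = io.StringIO("1, Jane Doe, 1991\n2, Sam Smith, 1982\n")

parsed = {}
for row in csv.reader(sample):
    # First converted value is the key, the rest form the list.
    key, *values = parse_row(row)
    parsed[key] = values

print(parsed)
```

The explicit length check is there because zip() would otherwise silently ignore extra or missing columns.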

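Going a step further, I would put the code that parses the actual data and coerces the types into a class (a dataclass, to be precise). That way, reading the file is kept separate from parsing the actual contents: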

import csv
from dataclasses import dataclass

@dataclass
class Person:
    index: int
    name: str
    year: int

    # This is just an example to match the sample data.
    # Add more type-checking and error-handling as necessary.
    def __post_init__(self):
        # If we didn't get an int, force it to an int
        # Will raise ValueError if int(...) fails
        if not isinstance(self.index, int):
            self.index = int(self.index)
        if not isinstance(self.year, int):
            self.year = int(self.year)

        # Clean extra spaces
        self.name = self.name.strip()


data_file = 'test.csv'
parsed = {}

with open(data_file) as data:
    reader = csv.reader(data)
    for row in reader:
        print(row)
        # row is ['1', ' Jane Doe', ' 1991']

        # Unpack the row contents and pass to the Person's __init__
        # Make sure it matches the order of the dataclass fields
        person = Person(*row)

        parsed[person.index] = [person.name, person.year]

print(parsed)

Output:
['1', ' Jane Doe', ' 1991']
['2', ' Sam Smith', ' 1982']
['3', ' John Sung', ' 1965']
['4', ' Tony Tembo', ' 1977']
{1: ['Jane Doe', 1991], 2: ['Sam Smith', 1982], 3: ['John Sung', 1965], 4: ['Tony Tembo', 1977]}
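
The comment in the dataclass suggests adding error handling as necessary. One possible shape for that, sketched under assumptions: load_people is a hypothetical helper, the sample contains two deliberately malformed lines that are not in the original data, and Person is repeated in condensed form (int() is applied unconditionally) so the block runs on its own. Bad rows are collected with their line number instead of stopping the whole read.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Person:
    index: int
    name: str
    year: int

    def __post_init__(self):
        # Raises ValueError if a value cannot be converted to int
        self.index = int(self.index)
        self.year = int(self.year)
        self.name = self.name.strip()


def load_people(lines):
    """Return (parsed, skipped): good rows keyed by index, bad rows with line numbers."""
    parsed, skipped = {}, []
    reader = csv.reader(lines)
    for row in reader:
        try:
            person = Person(*row)
        except (TypeError, ValueError):
            # TypeError: wrong number of columns for Person(...)
            # ValueError: int(...) failed on the index or year
            skipped.append((reader.line_num, row))
            continue
        parsed[person.index] = [person.name, person.year]
    return parsed, skipped


# Hypothetical sample: line 2 lacks a year, line 3 has a non-numeric year.
sample = io.StringIO(
    "1, Jane Doe, 1991\n"
    "2, Sam Smith\n"
    "3, John Sung, abc\n"
    "4, Tony Tembo, 1977\n"
)
parsed, skipped = load_people(sample)
print(parsed)
print(skipped)
```

Whether to skip, log, or re-raise on a bad row depends on how much you trust the input file; skipping is just one option.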
