Dictionary in a Pandas DataFrame column from a CSV

Question

I have a CSV with a column named "location" that holds dictionaries, as shown below.

{"city": "Bellevue", "country": "United States", "address2": "Ste 2A - 178", "state": "WA", "postal_code": "98005", "address1": "677 120th Ave NE"}

{"city": "Atlanto", "country": "United States", "address2": "Ste A-200", "state": "GA", "postal_code": "30319", "address1": "4062 Peachtree Rd NE"}

{"city": "Suffield", "state": "CT", "postal_code": "06078", "country": "United States"}

{"city": "Nashville", "state": "TN", "country": "United States", "postal_code": "37219", "address1": "424 Church St"}

I want to read the CSV into a DataFrame and then parse the city, state, street, postal code, and country into their own columns.

Does anyone have any ideas?

Tags: python, pandas, dictionary

Answer


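Load the column into a list of dictionaries and create a new DataFrame from it. Each key becomes a column, and keys missing from a row are filled with NaN.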

For example:

import pandas as pd

column = [{"city": "Bellevue", "country": "United States", "address2": "Ste 2A - 178", "state": "WA", "postal_code": "98005", "address1": "677 120th Ave NE"},
          {"city": "Atlanto", "country": "United States", "address2": "Ste A-200", "state": "GA", "postal_code": "30319", "address1": "4062 Peachtree Rd NE"},
          {"city": "Suffield", "state": "CT", "postal_code": "06078", "country": "United States"},
          {"city": "Nashville", "state": "TN", "country": "United States", "postal_code": "37219", "address1": "424 Church St"}]


df = pd.DataFrame(column)
print(df)

Prints:

        city        country      address2 state postal_code              address1
0   Bellevue  United States  Ste 2A - 178    WA       98005      677 120th Ave NE
1    Atlanto  United States     Ste A-200    GA       30319  4062 Peachtree Rd NE
2   Suffield  United States           NaN    CT       06078                   NaN
3  Nashville  United States           NaN    TN       37219         424 Church St
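
The question asks for city, state, street, postal code, and country only. A minimal sketch of narrowing the expanded frame to those columns follows. It assumes that "street" corresponds to the `address1` key, which the question does not confirm, and the name `wanted` is purely illustrative.

```python
import pandas as pd

# Two of the sample rows from the question, shortened for illustration.
records = [
    {"city": "Bellevue", "country": "United States", "state": "WA",
     "postal_code": "98005", "address1": "677 120th Ave NE"},
    {"city": "Suffield", "state": "CT", "postal_code": "06078",
     "country": "United States"},
]
df = pd.DataFrame(records)

# Assumption: "street" means the address1 key.
wanted = ["city", "state", "address1", "postal_code", "country"]

# reindex keeps the listed columns in this order and would create an
# all-NaN column for any listed key that never appears in the data.
subset = df.reindex(columns=wanted).rename(columns={"address1": "street"})
print(subset)
```

Using `reindex` rather than plain column selection means the code does not raise a `KeyError` if one of the expected keys is absent from every row.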

Edit: To read from a CSV file, you can use the following example:

import csv
import pandas as pd
from ast import literal_eval

data = []
with open('data.csv', 'r', newline='') as f_in:
    csv_reader = csv.reader(f_in, delimiter=',', quotechar='"')
    for row in csv_reader:
        data.append(literal_eval(row[0]))  # <-- row[0] is used here because the dictionary is in column 0

df = pd.DataFrame(data)
print(df)
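
If the file has a header row and other columns that should be kept, a pandas-only variant may be more convenient. The sketch below is one possible approach, not the answer's method. It assumes the column is named `location`, that each cell is valid JSON (double-quoted keys and values, as in the question's sample), and it builds a small in-memory CSV with a hypothetical `id` column in place of the real file.

```python
import io
import json

import pandas as pd

# Hypothetical stand-in for the real file: an "id" column plus a
# "location" column whose cells are JSON text.
raw = pd.DataFrame({
    "id": [1, 2],
    "location": [
        json.dumps({"city": "Bellevue", "state": "WA", "postal_code": "98005",
                    "country": "United States", "address1": "677 120th Ave NE"}),
        json.dumps({"city": "Suffield", "state": "CT", "postal_code": "06078",
                    "country": "United States"}),
    ],
})
csv_text = raw.to_csv(index=False)  # to_csv handles the CSV quoting

# Read the CSV, then parse each cell of the column into a dict.
df = pd.read_csv(io.StringIO(csv_text))
parsed = df["location"].apply(json.loads)

# Expand the dicts into columns and attach them to the remaining columns.
expanded = pd.json_normalize(parsed.tolist())
result = pd.concat([df.drop(columns="location"), expanded], axis=1)
print(result)
```

If the cells use Python literal syntax instead of JSON (for example single-quoted strings), `ast.literal_eval` can be passed to `apply` in place of `json.loads`. Because the postal code is parsed from a quoted string, leading zeros such as in "06078" are preserved.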
