How to split text in a pandas DataFrame into new DataFrame columns

Problem description

I have a list

list1= ['{"bank_name": null, "country": null, "url": null, "type": "Debit", "scheme": "Visa", "bin": "789452"}\n',
 '{"prepaid": "", "bin": "123457", "scheme": "Visa", "type": "Debit", "bank_name": "Ohio", "url": "www.u.org", "country": "UKs"}\n']

I passed it to a DataFrame

df = pd.DataFrame({'bincol':list1})
print(df)
                                               bincol
0  {"bank_name": null, "country": null, "url": nu...
1  {"prepaid": "", "bin": "123457", "scheme": "Vi...

I am trying to split the bincol column into new columns using this function

def explode_col(df, column_value):
    df = df.dropna(subset=[column_value])
    if isinstance(df[str(column_value)].iloc[0], str):
        df[column_value] = df[str(column_value)].apply(ast.literal_eval)
    expanded_child_df = (pd.concat({i: json_normalize(x) for i, x in df.pop(str(column_value)).items()}).reset_index(level=1,drop=True).join(df, how='right', lsuffix='_left', rsuffix='_right').reset_index(drop=True))
    expanded_child_df.columns = map(str.lower, expanded_child_df.columns)

    return expanded_child_df

df2 = explode_col(df,'bincol')

But I get this error. Am I missing something here?

raise ValueError(f'malformed node or string: {node!r}')
ValueError: malformed node or string: <_ast.Name object at 0x7fd3aa05c400>

Tags: python, pandas, dataframe

Solution


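With your sample data, what works for me is to use json.loads to convert each string to a dictionary and then json_normalize to build the DataFrame. The strings are JSON, not Python literals: ast.literal_eval reads the JSON value null as a bare name, which it refuses to evaluate, and that is what raises the ValueError.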

import json
import pandas as pd

df = pd.json_normalize(df['bincol'].apply(json.loads))
print(df)

  bank_name country        url   type scheme     bin prepaid
0      None    None       None  Debit   Visa  789452     NaN
1      Ohio     UKs  www.u.org  Debit   Visa  123457        
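
To see why the original function fails, here is a minimal sketch that feeds the same text to both parsers. The string raw is a shortened version of the first item in list1, and the names raw, literal_eval_ok and parsed are only for illustration.

```python
import ast
import json

# Shortened version of the first string in the question:
# valid JSON, but not a valid Python literal because of `null`.
raw = '{"bank_name": null, "type": "Debit", "bin": "789452"}\n'

try:
    ast.literal_eval(raw)
    literal_eval_ok = True
except ValueError:
    # `null` is parsed as a bare name, which literal_eval will not evaluate.
    literal_eval_ok = False

# json.loads understands JSON syntax and maps null to Python's None.
parsed = json.loads(raw)

print(literal_eval_ok)
print(parsed["bank_name"])
```

The same applies to the JSON values true and false, which are also not Python literals.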

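If the DataFrame has other columns that should be kept, as the original explode_col tries to do, the same idea can be wrapped in a small helper. This is only a sketch: expand_json_column and the id column are hypothetical names, and it assumes pandas 1.0 or later (for pd.json_normalize), that every non-null value in the column is a valid JSON object, and that no JSON key clashes with an existing column name.

```python
import json

import pandas as pd


def expand_json_column(df, column):
    """Hypothetical helper: expand a column of JSON strings into new columns."""
    # Drop rows with no JSON text and use a fresh 0..n-1 index so the
    # normalized frame lines up row by row with the remaining data.
    df = df.dropna(subset=[column]).reset_index(drop=True)
    # Parse each JSON string into a dict (JSON null becomes None).
    parsed = df[column].apply(json.loads)
    # One column per JSON key; keys missing from a row become NaN.
    expanded = pd.json_normalize(parsed.tolist())
    expanded.columns = [str(c).lower() for c in expanded.columns]
    # Keep the other original columns next to the expanded ones.
    return df.drop(columns=[column]).join(expanded)


list1 = [
    '{"bank_name": null, "country": null, "url": null, "type": "Debit", "scheme": "Visa", "bin": "789452"}\n',
    '{"prepaid": "", "bin": "123457", "scheme": "Visa", "type": "Debit", "bank_name": "Ohio", "url": "www.u.org", "country": "UKs"}\n',
]

# "id" is an illustrative extra column, not part of the question's data.
df = pd.DataFrame({"id": [10, 11], "bincol": list1})
df2 = expand_json_column(df, "bincol")
print(df2)
```

Resetting the index before joining is a deliberate choice: json_normalize always returns a 0-based index, so without the reset, rows removed by dropna would cause the join to misalign.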