Split pandas list to different column and calculate the counts

Problem description

I have a pandas DataFrame with a column named ids that contains list elements. I want to split that list column into separate columns.

id          partner_id                       ids
1                12             ["1","4","187275","187358","946475"]
2                12             ["1","191","28925","31441"]
3                16             ["1","2","293915","1573130","293918"]
4                11             ["1","13","294064","1238496"]
5                16             ["1","153339","155025","155029"]

Desired output:

id          partner_id          id1   id2      id3       id4        id5
1                12             1     4        187275    187358     946475
2                12             1     191      28925     31441      NaN
3                16             1     2        293915    1573130    293918
4                11             1     13       294064    1238496    NaN
5                16             1     153339   155025    155029     NaN

What I've tried:

df2 = pd.DataFrame(df.ids.values.tolist(), index=df.index)

Full Code:

import pandas as pd
import numpy as np
pd.set_option('display.max_columns', 85)
pd.set_option('display.max_rows', 85)
df = pd.read_csv('../dataset/property_location_count.csv',low_memory=False)
df2 = pd.DataFrame(df.ids.values.tolist(), index=df.index)

But it doesn't split the column the way it does here: https://stackoverflow.com/a/35491399/1138192

Tags: python, pandas

Solution


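I think you are close. Use DataFrame.pop to extract the column, DataFrame.join to append the new columns to the original ones, convert the strings to numbers if necessary, and finally rename the columns.

Because the data comes from a CSV file, each cell of ids holds the string representation of a list rather than a real list, so json.loads is also needed to parse it: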
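A quick check makes the cause visible: when every cell is a string, tolist() produces one string per row, so only a single column comes out. The sketch below assumes a small hand-built frame standing in for the CSV contents (the sample values are copied from the question; the variable names unsplit and split are only illustrative).

```python
import json

import pandas as pd

# Hypothetical stand-in for the CSV: each cell of "ids" is a string, as read_csv would return it.
df = pd.DataFrame({
    "id": [1, 2],
    "partner_id": [12, 12],
    "ids": ['["1","4","187275","187358","946475"]', '["1","191","28925","31441"]'],
})

# Each cell is a str, not a list.
print(type(df.loc[0, "ids"]).__name__)

# tolist() on strings yields one string per row, so only one column is produced.
unsplit = pd.DataFrame(df["ids"].values.tolist(), index=df.index)
print(unsplit.shape)

# Parsing each cell with json.loads gives real lists, which expand into one column per element.
parsed = [json.loads(cell) for cell in df["ids"]]
split = pd.DataFrame(parsed, index=df.index)
print(split.shape)
```

The shorter second row is padded with a missing value in its last column, which is why a numeric conversion that tolerates missing values is needed afterwards.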

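import json

# Parse each string cell into a real list, expand it into columns, then join back.
df = (df.join(pd.DataFrame([json.loads(x) for x in df.pop('ids')], index=df.index)
        .astype(float)    # strings to floats; missing slots become NaN
        .astype('Int64')  # nullable integers, so missing values are allowed
        .rename(columns=lambda x: f'id{x+1}')))
print(df)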
   id  partner_id  id1     id2     id3      id4     id5
0   1          12    1       4  187275   187358  946475
1   2          12    1     191   28925    31441     NaN
2   3          16    1       2  293915  1573130  293918
3   4          11    1      13  294064  1238496     NaN
4   5          16    1  153339  155025   155029     NaN
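For reuse, the same steps can be wrapped in a small helper that leaves the input frame untouched. This is a minimal sketch, not the answer's own code: the function name split_list_column and its prefix parameter are hypothetical, and the sample frame is built in memory with string cells to mimic what read_csv would return.

```python
import json

import pandas as pd


def split_list_column(df, column, prefix="id"):
    """Return a copy of df with `column` (JSON list strings) expanded into numbered columns."""
    out = df.copy()
    # pop removes the column from the copy and returns it for parsing.
    parsed = [json.loads(cell) for cell in out.pop(column)]
    expanded = (
        pd.DataFrame(parsed, index=out.index)
        .astype(float)      # strings to floats; missing slots become NaN
        .astype("Int64")    # nullable integers keep the missing slots
        .rename(columns=lambda i: f"{prefix}{i + 1}")
    )
    return out.join(expanded)


# Sample data copied from the question, stored as strings.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "partner_id": [12, 12, 16],
    "ids": [
        '["1","4","187275","187358","946475"]',
        '["1","191","28925","31441"]',
        '["1","2","293915","1573130","293918"]',
    ],
})

result = split_list_column(df, "ids")
print(result)
```

Recent pandas versions display missing values in an Int64 column as <NA> rather than NaN, so the printed frame may differ slightly from the output shown above.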
