Making a dictionary with a specific format from a pandas DataFrame

Problem description

I am trying to convert a DataFrame (shown in an image in the original post) into a dictionary with this specific structure:

    sales = {
        "clients": [
            {"ID_client": "241341",
             "purchases": [
                 "Item 101",
                 "Item 202",
                 "Item 324",
             ],
             "payment": [
                 "visa", "master", "visa"
             ]
            },
            {"ID_client": "24356",
             "purchases": [
                 "Item 2320",
                 "Item 2342",
                 "Item 5604",
             ],
             "payment": [
                 "diners", "cash", "diners"
             ]
            },
            {"ID_client": "5534",
             "purchases": [
                 "Item 50563",
                 "Item 52878",
                 "Item 54233",
             ],
             "payment": [
                 "diners", "master", "visa"
             ]
            }
        ]
    }

I have been trying some for loops, such as:

    d = {"sales": []}
    for i in df1['ID_Client'].unique():
        clients = {"ID_client": df1['ID_client'][i]}
        d[i] = [{df1['purchases'][j]: df1['payment'][j]}
                for j in df1[df1['ID_Client'] == i].index]

Any help would be greatly appreciated. Thanks in advance.

Tags: python, pandas, dictionary

Solution


Here is one way using np.repeat and itertools.chain, which flattens the nested dictionary into a DataFrame with one row per purchase:

import numpy as np
import pandas as pd
from itertools import chain

# One row per client; 'purchases' and 'payment' each hold a list
df = pd.DataFrame(sales['clients'])

# Repeat each ID once per list element and flatten both list columns
res = pd.DataFrame({'ID_client': np.repeat(df['ID_client'], df['payment'].map(len)),
                    'payment': list(chain.from_iterable(df['payment'])),
                    'purchases': list(chain.from_iterable(df['purchases']))})

print(res)

  ID_client payment   purchases
0    241341    visa    Item 101
0    241341  master    Item 202
0    241341    visa    Item 324
1     24356  diners   Item 2320
1     24356    cash   Item 2342
1     24356  diners   Item 5604
2      5534  diners  Item 50563
2      5534  master  Item 52878
2      5534    visa  Item 54233

Note that with this method, each unique index value corresponds to one ID_client, in the order of your input.
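
The code above goes from the nested dictionary to a flat table. For the direction the question asks about (flat DataFrame to nested dictionary), a minimal sketch might group the rows by ID_client and collect each column into a list. It assumes the DataFrame has one row per purchase with columns ID_client, purchases and payment. The sample `df1` below is hypothetical and simply mirrors the printed table, since the asker's real DataFrame was only available as an image.

```python
import pandas as pd

# Hypothetical flat input shaped like the printed table above
# (one row per purchase); the real column names may differ.
df1 = pd.DataFrame({
    "ID_client": ["241341"] * 3 + ["24356"] * 3 + ["5534"] * 3,
    "purchases": ["Item 101", "Item 202", "Item 324",
                  "Item 2320", "Item 2342", "Item 5604",
                  "Item 50563", "Item 52878", "Item 54233"],
    "payment": ["visa", "master", "visa",
                "diners", "cash", "diners",
                "diners", "master", "visa"],
})

# Collect each client's rows into lists; sort=False keeps the
# clients in order of first appearance instead of sorting the IDs.
grouped = (
    df1.groupby("ID_client", sort=False)[["purchases", "payment"]]
       .agg(list)
       .reset_index()
)

# One dict per client, wrapped under the "clients" key.
sales = {"clients": grouped.to_dict(orient="records")}
print(sales)
```

If the IDs are stored as integers and the target needs strings, converting first with `df1["ID_client"].astype(str)` should be enough.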

