Create a dict using two columns from dataframe with duplicates in one column

Problem description

I want to create a dict from 2 columns of a dataframe.

Let's say they look like this:

A         B
car1     brand1
car2     brand2
car3     brand1
car4     brand3
car5     brand2

Desired output:

{'brand1': ['car1', 'car3'], 'brand2': ['car2', 'car5'], 'brand3': 'car4'}

There is a to_dict method, but when I try to use it I can't get it to collect several values under one key; it only maps one value to each key.

I know I could loop over column A, look up the matching value in column B with iloc, and then use an if/else to either create a new key or append to an existing one, but I am looking for a more elegant solution.

Tags: python, pandas, dataframe

Solution


Borrowing from the question "grouping rows in list in pandas groupby": group by column B, aggregate the values of column A into a list for each group, then call to_dict() on the result.

df.groupby('B')['A'].apply(list).to_dict()
{'brand1': ['car1', 'car3'], 'brand2': ['car2', 'car5'], 'brand3': ['car4']}
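
For reference, here is a self-contained sketch of the same approach, with the sample dataframe from the question rebuilt by hand. The variable names (cars_by_brand, unwrapped) are illustrative and not from the original post. Note that the groupby result wraps every value in a list, including the single car for brand3, whereas the question's desired output shows a bare string there; the optional last step unwraps single-element lists, assuming that mixed shape is really what is wanted.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "A": ["car1", "car2", "car3", "car4", "car5"],
    "B": ["brand1", "brand2", "brand1", "brand3", "brand2"],
})

# Group rows by brand, collect each group's cars into a list,
# then convert the resulting Series (index = brand) to a dict.
cars_by_brand = df.groupby("B")["A"].apply(list).to_dict()
print(cars_by_brand)
# {'brand1': ['car1', 'car3'], 'brand2': ['car2', 'car5'], 'brand3': ['car4']}

# Optional (an assumption about the desired shape): unwrap lists that
# hold a single item so that 'brand3' maps to 'car4' instead of ['car4'].
unwrapped = {
    brand: cars[0] if len(cars) == 1 else cars
    for brand, cars in cars_by_brand.items()
}
print(unwrapped)
```

Keeping every value as a list is usually easier for downstream code, since callers do not need to check whether a value is a string or a list.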
