Best Python data structure for replacing values in a column?

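Problem description

I am working with a dataframe in which I need to replace the values in one column. My instinct was to reach for a Python dictionary. However, here is an example of what my data looks like (original_col):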

original_col  desired_col
cat           animal
dog           animal
bunny         animal
cat           animal
chair         furniture
couch         furniture
Bob           person
Lisa          person

The dictionary looks like:

my_dict = {'animal': ['cat', 'dog', 'bunny'], 'furniture': ['chair', 'couch'], 'person': ['Bob', 'Lisa']}

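I can't use the typical my_dict.get() because I am looking for the corresponding KEY rather than the value. Is a dictionary the best data structure here? Any suggestions?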

Tags: python-3.x, pandas, dataframe, dictionary, data-structures

Solution


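The pandas Series.map() method accepts a dictionary or another pandas Series to perform this kind of lookup, IIUC. Since map() looks up each value in the column as a key, the dictionary has to be inverted first so that every list member points back to its category: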

import pandas as pd

# original column / data
data = ['cat', 'dog', 'bunny', 'cat', 'chair', 'couch', 'Bob', 'Lisa']

# original dict (category -> list of members)
my_dict = {'animal': ['cat', 'dog', 'bunny'],
           'furniture': ['chair', 'couch'],
           'person': ['Bob', 'Lisa']}

# invert the dictionary so each member maps back to its category
new_dict = {v: k
            for k, vs in my_dict.items()
            for v in vs}

# create series and use `map()` to perform dictionary lookup
df = pd.concat([
    pd.Series(data).rename('original_col'),
    pd.Series(data).map(new_dict).rename('desired_col')], axis=1)

print(df)

  original_col desired_col
0          cat      animal
1          dog      animal
2        bunny      animal
3          cat      animal
4        chair   furniture
5        couch   furniture
6          Bob      person
7         Lisa      person
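
One thing the sample data does not exercise: map() returns NaN for any value that is missing from the lookup dictionary. Below is a minimal sketch of one way to handle that, assuming a placeholder label is acceptable. The value 'table', the label 'unknown', and the name inverted are made up for illustration and are not part of the original question.

```python
import pandas as pd

my_dict = {'animal': ['cat', 'dog', 'bunny'],
           'furniture': ['chair', 'couch'],
           'person': ['Bob', 'Lisa']}

# invert: each list member points back to its category key
inverted = {member: category
            for category, members in my_dict.items()
            for member in members}

# 'table' is a hypothetical value that does not appear in my_dict
df = pd.DataFrame({'original_col': ['cat', 'chair', 'table', 'Lisa']})

# map() leaves NaN for unmapped values; fillna() swaps in a placeholder label
df['desired_col'] = df['original_col'].map(inverted).fillna('unknown')

print(df)
```

If unmapped values should instead be kept unchanged, passing the original column to fillna() (that is, .fillna(df['original_col'])) would be one alternative.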
