Replace ones in binary columns with values from another column

Problem description

I have a data frame that looks like this:

import pandas as pd

df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 1, 0], "item2": [1, 0, 0], "item3": [0, 0, 1]})
df

   value  item1  item2  item3
0      4      0      1      0
1      5      1      0      0
2      3      0      0      1

I want to replace each 1 in the one-hot encoded columns with the value from the "value" column in the same row, and then delete the "value" column. The resulting data frame should look like this:

df_out = pd.DataFrame({"item1": [0, 5, 0], "item2": [4, 0, 0], "item3": [0, 0, 3]})

   item1  item2  item3
0      0      4      0
1      5      0      0
2      0      0      3

Tags: python, pandas, dataframe

Solution


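Why not just multiply? Use axis=0 so that "value" is aligned with the rows; a bare 1-D array would be matched against the columns instead:

df.mul(df.pop('value'), axis=0)

   item1  item2  item3
0      0      4      0
1      5      0      0
2      0      0      3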

DataFrame.pop removes a column in place and returns it, so you can do this in a single step.
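
A minimal, self-contained sketch of the row-wise multiplication, checked against the df_out from the question. It assumes the goal is to scale each row by its own "value"; the names value, result, and expected are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 1, 0], "item2": [1, 0, 0], "item3": [0, 0, 1]})

# pop removes "value" from df in place and returns it as a Series
value = df.pop("value")

# axis=0 aligns the Series with the row index, so row i is scaled by value[i]
result = df.mul(value, axis=0)

# Expected frame copied from the question
expected = pd.DataFrame({"item1": [0, 5, 0], "item2": [4, 0, 0], "item3": [0, 0, 3]})
print(result.equals(expected))
```

Because this example frame happens to be 3x3 after the pop, a column-wise broadcast would also run without error, so comparing against df_out is a cheap way to confirm the alignment.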


If the "item*" columns contain nonzero values other than 1, convert them to bools before multiplying:

value = df.pop('value')
df.astype(bool).mul(value, axis=0)

   item1  item2  item3
0      0      4      0
1      5      0      0
2      0      0      3
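
To make the bool conversion concrete, here is a small sketch with made-up data in which the item columns hold counts instead of strict 0/1 flags. The data and the names value and result are assumptions for illustration, not part of the original question.

```python
import pandas as pd

# Hypothetical data: item columns hold arbitrary nonzero counts (2, 7) rather than 1
df = pd.DataFrame({"value": [4, 5, 3], "item1": [0, 2, 0], "item2": [1, 0, 0], "item3": [0, 0, 7]})

# Pop first, so that "value" is not part of the frame converted to bool
value = df.pop("value")

# astype(bool) maps every nonzero entry to True, which acts as 1 in the product
result = df.astype(bool).mul(value, axis=0)
print(result)
```

Without the astype(bool) step, the 2 and the 7 would scale the values (giving 10 and 21) instead of acting as plain flags.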

If your DataFrame has other columns, then do this:

df
   value  name  item1  item2  item3
0      4  John      0      1      0
1      5  Mike      1      0      0
2      3  Stan      0      0      1

# cols = df.columns[df.columns.str.startswith('item')]
cols = df.filter(like='item').columns
df[cols] = df[cols].mul(df.pop('value'), axis=0)

df
   name  item1  item2  item3
0  John      0      4      0
1  Mike      5      0      0
2  Stan      0      0      3
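
If the caller's frame should not be modified, the same idea can be wrapped in a small function. This is only a sketch: fill_flags and its value_col and prefix parameters are hypothetical names, not part of pandas.

```python
import pandas as pd

def fill_flags(df, value_col="value", prefix="item"):
    """Hypothetical helper: scale the prefix-matched flag columns by value_col, then drop it."""
    out = df.copy()  # work on a copy so the caller's frame is left untouched
    cols = out.columns[out.columns.str.startswith(prefix)]
    value = out.pop(value_col)
    # axis=0 aligns the Series with the row index
    out[cols] = out[cols].mul(value, axis=0)
    return out

df = pd.DataFrame({
    "value": [4, 5, 3],
    "name": ["John", "Mike", "Stan"],
    "item1": [0, 1, 0],
    "item2": [1, 0, 0],
    "item3": [0, 0, 1],
})
result = fill_flags(df)
print(result)
```

The sketch selects columns with str.startswith rather than filter(like='item'), since like matches the substring anywhere in the label and could pick up unrelated columns.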
