python - pandas: converting a document-word list into a document-word matrix

Problem description

I have a pandas dataset like this:

  Brand AssociatedWord  Weight
0  pepsi           red      10
1  pepsi        yellow       3
2  coke            red       5
3  coke           grey       5
4  coke           pink       2
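
For anyone who wants to reproduce the answers, the dataset above can be rebuilt roughly as follows. This is only a minimal sketch: the variable name `df` matches what the answer code uses, and the integer dtype of `Weight` is an assumption based on the printed values.

```python
import pandas as pd

# Rebuild the sample data from the question as a long-format DataFrame:
# one row per (Brand, AssociatedWord) pair with its Weight.
df = pd.DataFrame({
    "Brand": ["pepsi", "pepsi", "coke", "coke", "coke"],
    "AssociatedWord": ["red", "yellow", "red", "grey", "pink"],
    "Weight": [10, 3, 5, 5, 2],
})

print(df)
```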

I need to convert it into the following matrix:

  Brand   red   yellow   grey   pink
0  pepsi   10        3      0      0
1  coke     5        0      5      2

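Now each row is a brand, and there is one column per associated word holding the weight of that association. A zero means the association is absent. The order of the columns does not matter. Can you help me?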

Tags: python, python-3.x, pandas

Use DataFrame.pivot_table:

new_df = df.pivot_table(index='Brand', columns='AssociatedWord', values='Weight', fill_value=0).reset_index()
print(new_df)

AssociatedWord  Brand  grey  pink  red  yellow
0                coke     5     2    5       0
1               pepsi     0     0   10       3

Note: AssociatedWord is the name of the columns index (columns.name). You can change it, or remove it as here, with:

new_df.columns.name = None
print(new_df)

   Brand  grey  pink  red  yellow
0   coke     5     2    5       0
1  pepsi     0     0   10       3
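
One detail worth keeping in mind: pivot_table aggregates with the mean by default, so if the same (Brand, AssociatedWord) pair appeared more than once, its weights would be averaged. The sketch below assumes that summing is the desired behaviour in that case and passes aggfunc explicitly. The name `matrix` is just illustrative, and the sample data is repeated so the snippet runs on its own.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Brand": ["pepsi", "pepsi", "coke", "coke", "coke"],
    "AssociatedWord": ["red", "yellow", "red", "grey", "pink"],
    "Weight": [10, 3, 5, 5, 2],
})

matrix = (
    df.pivot_table(
        index="Brand",
        columns="AssociatedWord",
        values="Weight",
        aggfunc="sum",   # assumption: duplicate pairs should be summed, not averaged
        fill_value=0,    # missing associations become 0
    )
    .rename_axis(None, axis="columns")  # drop the "AssociatedWord" columns name
    .reset_index()                      # turn Brand back into a regular column
)

print(matrix)
```

With the sample data every pair is unique, so the result holds the same values as the default mean aggregation.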

You can also use set_index + unstack:

new_df = df.set_index(['Brand', 'AssociatedWord']).unstack(fill_value=0).reset_index()
print(new_df)

                Brand Weight                
AssociatedWord          grey pink red yellow
0                coke      5    2   5      0
1               pepsi      0    0  10      3

Note that here the columns form a two-level MultiIndex, with Weight as the top level.
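
If a flat header is preferred with the set_index + unstack approach, one option is to select the Weight column before unstacking, so that a Series rather than a DataFrame is reshaped. This is a sketch of that variation, not part of the original answer, and `wide` is an illustrative name.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Brand": ["pepsi", "pepsi", "coke", "coke", "coke"],
    "AssociatedWord": ["red", "yellow", "red", "grey", "pink"],
    "Weight": [10, 3, 5, 5, 2],
})

wide = (
    df.set_index(["Brand", "AssociatedWord"])["Weight"]  # Series with a 2-level index
      .unstack(fill_value=0)                             # words become columns, gaps become 0
      .reset_index()                                     # Brand back to a regular column
)
wide.columns.name = None  # drop the leftover "AssociatedWord" label

print(wide)
```

Unlike pivot_table, unstack does not aggregate: it raises a ValueError if the same (Brand, AssociatedWord) pair occurs more than once, which can be a useful check when duplicates are not expected.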
