How to convert categorical rows to columns in Python

Problem description

I have a DataFrame with two columns ('X', 'Y'). The DataFrame looks like this:

           X                                 Y
0               id:                                35252916702903
1         userName:                                           IAMAdmin
2         eventTime                               2020-02-04T05:42:16Z
3         awsRegion                                          us-east-1
4   sourceIPAddress                                     203.99.xx.xx
5               id:                                3525291679
6         userName:                                           IAMAdmin
7         eventTime                               2020-02-04T05:41:58Z
8         awsRegion                                          us-east-1
9   sourceIPAddress                                     203.99.xx.xx
10              id:                               3525288310411
11        userName:                                      EC2FullAccess
12        eventTime                               2020-02-04T05:18:39Z
13        awsRegion                                          us-east-1
14  sourceIPAddress                                       34.229.xx.xx

Now I want the DataFrame above to look like this:

       id           userName               eventTime                    awsRegion     sourceIPAddress

35252916702         IAMAdmin              2020-02-04T05:42:16Z          us-east-1      203.99.xx.xx
352529167           IAMAdmin              2020-02-04T05:41:58Z          us-east-1       34.229.xx.xx
 ....
 ...

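Each distinct categorical value in column 'X' should become its own column, and the corresponding 'Y' values should become the observations (one row per record).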

How can I do this with pandas?

Tags: python, pandas

Solution


Use groupby.cumcount together with DataFrame.pivot_table:

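# Number each occurrence of a key in 'X' (0 for the first 'id:', 1 for the second, ...)
# so that rows belonging to the same record share one index value.
new_df = (df.pivot_table(index=df.groupby('X').cumcount(),
                         columns='X',
                         values='Y',
                         aggfunc=''.join)        # 'Y' holds strings; join returns each one unchanged
            .rename_axis(columns=None)           # drop the 'X' label from the column axis
            .reindex(columns=df['X'].unique()))  # restore the original key order
print(new_df)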

Output:

              id:      userName:             eventTime  awsRegion  \
0  35252916702903       IAMAdmin  2020-02-04T05:42:16Z  us-east-1   
1      3525291679       IAMAdmin  2020-02-04T05:41:58Z  us-east-1   
2   3525288310411  EC2FullAccess  2020-02-04T05:18:39Z  us-east-1   

  sourceIPAddress  
0    203.99.xx.xx  
1    203.99.xx.xx  
2    34.229.xx.xx  
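
To try the idea end to end, here is a minimal self-contained sketch. The sample frame is a hypothetical reconstruction of the first two records from the question. Stripping the trailing colon from the keys is my own assumption, made so that the column names match the desired output. It also uses set_index plus unstack in place of pivot_table, which should give the same result here because each (record number, key) pair occurs only once.

```python
import pandas as pd

# Hypothetical sample data mirroring the first two records in the question.
df = pd.DataFrame({
    "X": ["id:", "userName:", "eventTime", "awsRegion", "sourceIPAddress"] * 2,
    "Y": ["35252916702903", "IAMAdmin", "2020-02-04T05:42:16Z", "us-east-1", "203.99.xx.xx",
          "3525291679", "IAMAdmin", "2020-02-04T05:41:58Z", "us-east-1", "203.99.xx.xx"],
})

# Optional cleanup (an assumption, not part of the original answer):
# drop the trailing colon so labels read 'id' and 'userName'.
df["X"] = df["X"].str.rstrip(":")

# cumcount numbers each occurrence of a key: 0 for the first 'id', 1 for the second, ...
record = df.groupby("X").cumcount()

# Index by (record number, key), then move the key level into the columns.
wide = (df.set_index([record, "X"])["Y"]
          .unstack()
          .rename_axis(columns=None)            # remove the 'X' label from the column axis
          .reindex(columns=df["X"].unique()))   # keep keys in their original order
print(wide)
```

Note that numbering records with cumcount assumes every record contains every key. If a key can be missing from some records, the occurrence counts drift out of step and values from different records end up in the same row.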
