Help transposing a DataFrame with an additional column, reusing existing Python code

Problem description

@Kevin originally helped me transpose the results of the sample df below, in this question: Web scraping in python using BeautifulSoup - how to transpose results?

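I have since added another column, "CEO", and would like to reuse the code @Kevin provided so that the new column is merged in at the end. Here is the code without the additional column: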

    from collections import defaultdict
    aggregated_data = defaultdict(dict)
    for idx, row in df.iterrows():
        aggregated_data[row.Name][row.Category] = row.Rating
    aggregated_rows = [{"Company": name, **ratings} for name, ratings in aggregated_data.items()]
    result = pd.DataFrame(aggregated_rows)
    result.to_csv('text.csv')

I have been trying to incorporate the new column without success. This is what I have so far, and it produces an error:

    from collections import defaultdict
    aggregated_data = defaultdict(dict)
    for idx, row in df.iterrows():
        aggregated_data[row.Name][row.Category] = [row.Rating][row.CEO]
    aggregated_rows = [{"Company": name, **ratings, ceo} for name, ratings, ceo_rating in aggregated_data.items()]
    result = pd.DataFrame(aggregated_rows)
    print(result)

Sample df:

import pandas as pd
name = ['3M','3M','3M','3M','3M','Google','Google','Google','Google','Google','Apple','Apple','Apple','Apple','Apple']
number = ['3.8','3.9','3.5','3.6','3.8','4.2','4.0','3.6','3.9','4.2','3.8','4.1','3.7','3.7','4.1']
category = ['Work/Life Balance',' Salary/Benefits','Job Security/Advancement','Management','Culture','Work/Life Balance',' Salary/Benefits','Job Security/Advancement','Management','Culture','Work/Life Balance',' Salary/Benefits','Job Security/Advancement','Management','Culture']
ceo_rating = ['85%','85%','85%','85%','85%','86%','86%','86%','86%','86%','84%','84%','84%','84%','84%']
cols = {'Name':name,'Rating':number,'Category':category, 'CEO':ceo_rating}
df = pd.DataFrame(cols)
print(df)

Result:

      Name Rating                  Category  CEO
0       3M    3.8         Work/Life Balance  85%
1       3M    3.9           Salary/Benefits  85%
2       3M    3.5  Job Security/Advancement  85%
3       3M    3.6                Management  85%
4       3M    3.8                   Culture  85%
5   Google    4.2         Work/Life Balance  86%
6   Google    4.0           Salary/Benefits  86%
7   Google    3.6  Job Security/Advancement  86%
8   Google    3.9                Management  86%
9   Google    4.2                   Culture  86%
10   Apple    3.8         Work/Life Balance  84%
11   Apple    4.1           Salary/Benefits  84%
12   Apple    3.7  Job Security/Advancement  84%
13   Apple    3.7                Management  84%
14   Apple    4.1                   Culture  84%

I would like it to look like this:

 company name Work/Life Balance  Salary/Benefits Job Security/Advancement Management Culture  CEO
0           3M               3.8              3.9                      3.5        3.6     3.8    85%
1       Google               4.2              4.0                      3.6        3.9     4.2    86%
2        Apple               3.8              4.1                      3.7        3.7     4.1    84%

If anyone could help, that would be great. Thanks!

Tags: python, pandas

Solution


Here is one approach, using pivot_table:

s = df.pivot_table(index=['Name','CEO'],columns='Category',values='Rating',aggfunc='first').reset_index()
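As a rough illustration, the sketch below runs that pivot_table call on the sample data from the question and then restores the row and column order of the desired output, since pivot_table sorts both by default. The reordering step and the names `companies`, `categories`, and `result` are assumptions added for the example, not part of the original answer. `aggfunc='first'` is used because Rating holds strings, for which the default aggregation (mean) is not meaningful.

```python
import pandas as pd

# Sample data equivalent to the question's df
# (note the leading space in ' Salary/Benefits', kept as in the question).
companies = ['3M', 'Google', 'Apple']
categories = ['Work/Life Balance', ' Salary/Benefits',
              'Job Security/Advancement', 'Management', 'Culture']
number = ['3.8', '3.9', '3.5', '3.6', '3.8',
          '4.2', '4.0', '3.6', '3.9', '4.2',
          '3.8', '4.1', '3.7', '3.7', '4.1']
ceo_by_company = {'3M': '85%', 'Google': '86%', 'Apple': '84%'}

df = pd.DataFrame({
    'Name': [c for c in companies for _ in categories],
    'Rating': number,
    'Category': categories * len(companies),
    'CEO': [ceo_by_company[c] for c in companies for _ in categories],
})

# One row per (Name, CEO) pair, one column per Category.
s = df.pivot_table(index=['Name', 'CEO'], columns='Category',
                   values='Rating', aggfunc='first').reset_index()
s.columns.name = None  # drop the leftover 'Category' label on the column axis

# Assumed extra step: put rows and columns back in the question's order
# and move CEO to the end.
result = (s.set_index('Name')
           .loc[companies, categories + ['CEO']]
           .reset_index()
           .rename(columns={'Name': 'Company'}))
print(result)
```

If the sorted order that pivot_table produces is acceptable, the reordering step can be dropped and `s` used directly.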

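If you would rather keep the defaultdict loop from the question, a minimal sketch of one way to adapt it follows. The attempt in the question fails for several reasons: a bare `ceo` is not valid inside a dict literal, `.items()` yields two-element pairs rather than three values, and `[row.Rating][row.CEO]` indexes a one-element list with a string. The version below assumes each company has a single CEO value and stores it in a separate dict, `ceo_by_company`, a name introduced only for this example.

```python
from collections import defaultdict

import pandas as pd

# Sample data equivalent to the question's df.
companies = ['3M', 'Google', 'Apple']
categories = ['Work/Life Balance', ' Salary/Benefits',
              'Job Security/Advancement', 'Management', 'Culture']
number = ['3.8', '3.9', '3.5', '3.6', '3.8',
          '4.2', '4.0', '3.6', '3.9', '4.2',
          '3.8', '4.1', '3.7', '3.7', '4.1']
ceo_rating = ['85%'] * 5 + ['86%'] * 5 + ['84%'] * 5

df = pd.DataFrame({
    'Name': [c for c in companies for _ in categories],
    'Rating': number,
    'Category': categories * len(companies),
    'CEO': ceo_rating,
})

aggregated_data = defaultdict(dict)  # company -> {category: rating}
ceo_by_company = {}                  # company -> CEO approval (assumed one per company)

for idx, row in df.iterrows():
    aggregated_data[row['Name']][row['Category']] = row['Rating']
    # The CEO value repeats on every row of a company, so overwriting is harmless.
    ceo_by_company[row['Name']] = row['CEO']

# Adding the CEO key last keeps it as the final column.
aggregated_rows = [
    {"Company": name, **ratings, "CEO": ceo_by_company[name]}
    for name, ratings in aggregated_data.items()
]
result = pd.DataFrame(aggregated_rows)
print(result)
```

Because dicts preserve insertion order, the companies and categories come out in the order they first appear in df, which matches the desired layout without any re-sorting.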