Adding new columns to a Pandas DF, running a basic math equation for each row to determine the value

Problem description

I am using Python 3.7 with Jupyter Notebooks for data reporting. I have a DataFrame, floridaDtFinal, with the following columns:

State                                                object
County                                               object
Candidate                                            object
Total Votes                                           int64
County Vote Percentage                              float64
White Alone                                         float64
Black or African American Alone                     float64
American Indian and Alaska Native Alone             float64
Asian Alone                                         float64
Native Hawaiian and Other Pacific Islander Alone    float64
Some Other Race Alone                               float64
Two or More Races                                   float64
Hispanic or Latino                                  float64
Hispanic or Latino (Mexican)                        float64
Hispanic or Latino (Puerto Rican)                   float64
Hispanic or Latino (Cuban)                          float64
Hispanic or Latino (Other Hispanic or Latino)       float64
Total Population                                    float64
dtype: object

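I would like to add an extra column for each of the races listed. The value in each race column is the census population count. The columns I want to add would hold each race's percentage of the county's total population: "White Alone Percent", "Black or African American Alone Percent", and so on.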

Each row in my DataFrame is one county of the respective state.


I know I could do something like this for each column:

floridaDtFinal['White Alone Percent'] = 100 / floridaDtFinal['Total Population'] * floridaDtFinal['White Alone']

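But I am wondering whether there is a more efficient way to apply the calculation to every race in the list without typing each one out by hand.

I realize that coding it may take longer than typing it out manually, but now I am curious and would like to know how to do it: append "Percent" to each column name and perform the calculation for the whole column.

I have searched, but most results are about creating column values from a list of values, not from the column headers themselves.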

Edit: some sample rows of the DataFrame as text.

{'State': {0: 'Florida',
  1: 'Florida',
  2: 'Florida',
  3: 'Florida',
  4: 'Florida'},
 'County': {0: 'Holmes County',
  1: 'Lafayette County',
  2: 'Baker County',
  3: 'Dixie County',
  4: 'Union County'},
 'Candidate': {0: 'Donald Trump',
  1: 'Donald Trump',
  2: 'Donald Trump',
  3: 'Donald Trump',
  4: 'Donald Trump'},
 'Total Votes': {0: 8080, 1: 3128, 2: 11911, 3: 6759, 4: 5133},
 'County Vote Percentage': {0: 89.105,
  1: 85.511,
  2: 84.722,
  3: 82.76,
  4: 82.194},
 'White Alone': {0: 17237.0, 1: 6931.0, 2: 23279.0, 3: 14333.0, 4: 11268.0},
 'Black or African American Alone': {0: 1356.0,
  1: 1394.0,
  2: 3824.0,
  3: 1474.0,
  4: 3359.0},
 'American Indian and Alaska Native Alone': {0: 266.0,
  1: 13.0,
  2: 144.0,
  3: 2.0,
  4: 116.0},
 'Asian Alone': {0: 131.0, 1: 0.0, 2: 173.0, 3: 38.0, 4: 93.0},
 'Native Hawaiian and Other Pacific Islander Alone': {0: 60.0,
  1: 28.0,
  2: 10.0,
  3: 20.0,
  4: 10.0},
 'Some Other Race Alone': {0: 67.0, 1: 7.0, 2: 245.0, 3: 254.0, 4: 191.0},
 'Two or More Races': {0: 315.0, 1: 264.0, 2: 536.0, 3: 468.0, 4: 266.0},
 'Hispanic or Latino': {0: 546.0, 1: 1360.0, 2: 721.0, 3: 674.0, 4: 862.0},
 'Hispanic or Latino (Mexican)': {0: 248.0,
  1: 488.0,
  2: 133.0,
  3: 275.0,
  4: 217.0},
 'Hispanic or Latino (Puerto Rican)': {0: 120.0,
  1: 148.0,
  2: 149.0,
  3: 115.0,
  4: 249.0},
 'Hispanic or Latino (Cuban)': {0: 29.0,
  1: 568.0,
  2: 140.0,
  3: 147.0,
  4: 161.0},
 'Hispanic or Latino (Other Hispanic or Latino)': {0: 149.0,
  1: 156.0,
  2: 299.0,
  3: 137.0,
  4: 235.0},
 'Total Population': {0: 20524.0,
  1: 11357.0,
  2: 29653.0,
  3: 17937.0,
  4: 17027.0}}

Tags: python, pandas, numpy

Solution


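You can use the following to create the new columns. The slice columns[5:17] selects the twelve race columns, from 'White Alone' through 'Hispanic or Latino (Other Hispanic or Latino)', and stops just before 'Total Population':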

cols = floridaDtFinal.columns[5:17]
for col in cols:
    floridaDtFinal[f'{col} Percent'] = 100 / floridaDtFinal['Total Population'] * floridaDtFinal[col]
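
The loop above can also be written without iterating over columns. The following is a minimal sketch, not the answer's method: it uses a small stand-in frame named `df` (an assumed name) built from the first two sample rows in the question, and a hand-picked `race_cols` list (also assumed) instead of a positional slice. `DataFrame.div` with `axis=0` divides every selected column by the row's total population, and `add_suffix` produces the new column names.

```python
import pandas as pd

# Stand-in frame: values copied from the first two sample rows in the question.
df = pd.DataFrame({
    "County": ["Holmes County", "Lafayette County"],
    "White Alone": [17237.0, 6931.0],
    "Asian Alone": [131.0, 0.0],
    "Total Population": [20524.0, 11357.0],
})

# Assumed subset of the race columns, selected by name rather than by position.
race_cols = ["White Alone", "Asian Alone"]

# Divide each race column by the row's total population, scale to percent,
# and append " Percent" to every resulting column name.
pct = (
    df[race_cols]
    .div(df["Total Population"], axis=0)
    .mul(100)
    .add_suffix(" Percent")
)

# Attach the percentage columns to the original frame (indexes align).
df = df.join(pct)
print(df)
```

Selecting by name avoids the dependence on column positions that `columns[5:17]` has, at the cost of having to list or derive the names.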

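If you want each new column to appear next to its original ethnicity column, you can additionally sort the labels of the relevant columns (the ethnicity and percent columns) alphabetically, as follows: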

header_col = ['State', 'County', 'Candidate', 'Total Votes', 'County Vote Percentage', 'Total Population']
floridaDtFinal = floridaDtFinal[header_col].join(floridaDtFinal.drop(columns=header_col).sort_index(axis=1))
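
Sorting the labels places the columns in alphabetical order, which does not always put a percent column directly after its source (for example, 'Hispanic or Latino Percent' sorts after the 'Hispanic or Latino (...)' columns). If the original column order matters, one alternative is to build the order explicitly. This is a hedged sketch on a one-row stand-in frame; `df`, `race_cols`, and `header_cols` are assumed names, and the values come from the Holmes County row.

```python
import pandas as pd

# Stand-in frame that already contains the percent columns.
df = pd.DataFrame({
    "County": ["Holmes County"],
    "White Alone": [17237.0],
    "Asian Alone": [131.0],
    "Total Population": [20524.0],
    "White Alone Percent": [83.984603],
    "Asian Alone Percent": [0.638277],
})

header_cols = ["County", "Total Population"]  # columns kept at the front
race_cols = ["White Alone", "Asian Alone"]    # race columns in their original order

# Header columns first, then each race column immediately followed by its percent column.
ordered = header_cols + [
    name for col in race_cols for name in (col, f"{col} Percent")
]
df = df[ordered]
print(list(df.columns))
```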

Result:

     State            County     Candidate  Total Votes  County Vote Percentage  Total Population  American Indian and Alaska Native Alone  American Indian and Alaska Native Alone Percent  Asian Alone  Asian Alone Percent  Black or African American Alone  Black or African American Alone Percent  Hispanic or Latino  Hispanic or Latino (Cuban)  Hispanic or Latino (Cuban) Percent  Hispanic or Latino (Mexican)  Hispanic or Latino (Mexican) Percent  Hispanic or Latino (Other Hispanic or Latino)  Hispanic or Latino (Other Hispanic or Latino) Percent  Hispanic or Latino (Puerto Rican)  Hispanic or Latino (Puerto Rican) Percent  Hispanic or Latino Percent  Native Hawaiian and Other Pacific Islander Alone  Native Hawaiian and Other Pacific Islander Alone Percent  Some Other Race Alone  Some Other Race Alone Percent  Two or More Races  Two or More Races Percent  White Alone  White Alone Percent
1  Florida  Lafayette County  Donald Trump         3128                  85.511           11357.0                                     13.0                                         0.114467          0.0             0.000000                           1394.0                                12.274368              1360.0                       568.0                            5.001321                         488.0                              4.296909                                          156.0                                               1.373602                              148.0                                   1.303161                   11.974993                                              28.0                                                  0.246544                    7.0                       0.061636              264.0                   2.324558       6931.0            61.028441
2  Florida      Baker County  Donald Trump        11911                  84.722           29653.0                                    144.0                                         0.485617        173.0             0.583415                           3824.0                                12.895828               721.0                       140.0                            0.472128                         133.0                              0.448521                                          299.0                                               1.008330                              149.0                                   0.502479                    2.431457                                              10.0                                                  0.033723                  245.0                       0.826223              536.0                   1.807574      23279.0            78.504704
3  Florida      Dixie County  Donald Trump         6759                  82.760           17937.0                                      2.0                                         0.011150         38.0             0.211853                           1474.0                                 8.217651               674.0                       147.0                            0.819535                         275.0                              1.533144                                          137.0                                               0.763784                              115.0                                   0.641133                    3.757596                                              20.0                                                  0.111501                  254.0                       1.416067              468.0                   2.609132      14333.0            79.907454
4  Florida      Union County  Donald Trump         5133                  82.194           17027.0                                    116.0                                         0.681271         93.0             0.546191                           3359.0                                19.727492               862.0                       161.0                            0.945557                         217.0                              1.274446                                          235.0                                               1.380161                              249.0                                   1.462383                    5.062548                                              10.0                                                  0.058730                  191.0                       1.121748              266.0                   1.562225      11268.0            66.177248
0      NaN     Holmes County  Donald Trump         8080                  89.105           20524.0                                    266.0                                         1.296044        131.0             0.638277                           1356.0                                 6.606899               546.0                        29.0                            0.141298                         248.0                              1.208341                                          149.0                                               0.725979                              120.0                                   0.584681                    2.660300                                              60.0                                                  0.292341                   67.0                       0.326447              315.0                   1.534789      17237.0            83.984603
