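python - Adding new columns to a Pandas DataFrame, computing each row's value with a basic math equation
Problem description
I'm using Python 3.7 and Jupyter Notebooks for data reporting. I have a DataFrame, floridaDtFinal, with the following columns: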
State object
County object
Candidate object
Total Votes int64
County Vote Percentage float64
White Alone float64
Black or African American Alone float64
American Indian and Alaska Native Alone float64
Asian Alone float64
Native Hawaiian and Other Pacific Islander Alone float64
Some Other Race Alone float64
Two or More Races float64
Hispanic or Latino float64
Hispanic or Latino (Mexican) float64
Hispanic or Latino (Puerto Rican) float64
Hispanic or Latino (Cuban) float64
Hispanic or Latino (Other Hispanic or Latino) float64
Total Population float64
dtype: object
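I want to add an extra column for each race listed. The value in each race column is the census population count. Each additional column should hold that race's percentage of the county's total population: "White Alone Percent", "Black or African American Alone Percent", and so on.
Each row in my DataFrame is one county in its state.
I know I can do something like this for each column: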
floridaDtFinal['White Alone Percent'] = 100 / floridaDtFinal['Total Population'] * floridaDtFinal['White Alone']
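But I'd like to know whether there is a more efficient way to apply the calculation to every race in the list without typing each one out by hand.
I realize that working this out may take longer than typing the lines manually, but now I'm curious and would like to know how to do it: append "Percent" to each column name and perform the calculation for the whole column.
I have searched, but most results are about creating column values from the values in a list, not from the column headers themselves.
Edit: some sample rows of the DataFrame as text.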
{'State': {0: 'Florida',
1: 'Florida',
2: 'Florida',
3: 'Florida',
4: 'Florida'},
'County': {0: 'Holmes County',
1: 'Lafayette County',
2: 'Baker County',
3: 'Dixie County',
4: 'Union County'},
'Candidate': {0: 'Donald Trump',
1: 'Donald Trump',
2: 'Donald Trump',
3: 'Donald Trump',
4: 'Donald Trump'},
'Total Votes': {0: 8080, 1: 3128, 2: 11911, 3: 6759, 4: 5133},
'County Vote Percentage': {0: 89.105,
1: 85.511,
2: 84.722,
3: 82.76,
4: 82.194},
'White Alone': {0: 17237.0, 1: 6931.0, 2: 23279.0, 3: 14333.0, 4: 11268.0},
'Black or African American Alone': {0: 1356.0,
1: 1394.0,
2: 3824.0,
3: 1474.0,
4: 3359.0},
'American Indian and Alaska Native Alone': {0: 266.0,
1: 13.0,
2: 144.0,
3: 2.0,
4: 116.0},
'Asian Alone': {0: 131.0, 1: 0.0, 2: 173.0, 3: 38.0, 4: 93.0},
'Native Hawaiian and Other Pacific Islander Alone': {0: 60.0,
1: 28.0,
2: 10.0,
3: 20.0,
4: 10.0},
'Some Other Race Alone': {0: 67.0, 1: 7.0, 2: 245.0, 3: 254.0, 4: 191.0},
'Two or More Races': {0: 315.0, 1: 264.0, 2: 536.0, 3: 468.0, 4: 266.0},
'Hispanic or Latino': {0: 546.0, 1: 1360.0, 2: 721.0, 3: 674.0, 4: 862.0},
'Hispanic or Latino (Mexican)': {0: 248.0,
1: 488.0,
2: 133.0,
3: 275.0,
4: 217.0},
'Hispanic or Latino (Puerto Rican)': {0: 120.0,
1: 148.0,
2: 149.0,
3: 115.0,
4: 249.0},
'Hispanic or Latino (Cuban)': {0: 29.0,
1: 568.0,
2: 140.0,
3: 147.0,
4: 161.0},
'Hispanic or Latino (Other Hispanic or Latino)': {0: 149.0,
1: 156.0,
2: 299.0,
3: 137.0,
4: 235.0},
'Total Population': {0: 20524.0,
1: 11357.0,
2: 29653.0,
3: 17937.0,
4: 17027.0}}
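Solution
You can create the new columns as follows:
# Columns 5 through 16 are the race columns ('White Alone' up to, but not including, 'Total Population')
cols = floridaDtFinal.columns[5:17]
for col in cols:
    floridaDtFinal[f'{col} Percent'] = 100 / floridaDtFinal['Total Population'] * floridaDtFinal[col]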
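As an alternative to the loop, the same percentages can be computed in one vectorized step. The following is only a sketch: it uses a small stand-in frame `df` built from two of the counties in the question, and a hand-written `race_cols` list instead of the positional slice, both of which are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for floridaDtFinal, using values from the question.
df = pd.DataFrame({
    "County": ["Holmes County", "Lafayette County"],
    "White Alone": [17237.0, 6931.0],
    "Asian Alone": [131.0, 0.0],
    "Total Population": [20524.0, 11357.0],
})

# Naming the columns explicitly avoids depending on their positions.
race_cols = ["White Alone", "Asian Alone"]

# Divide each race column by the population row by row, scale to percent,
# and append " Percent" to every resulting column name.
percent = (
    df[race_cols]
    .div(df["Total Population"], axis=0)
    .mul(100)
    .add_suffix(" Percent")
)

# Attach the new columns to the original frame.
df = df.join(percent)
print(df.round(3))
```

Selecting by name rather than by `columns[5:17]` keeps the code working if columns are later added or reordered, at the cost of listing the names once.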
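If you want each new column to appear next to its original Ethnicity column, you can additionally sort the labels of the relevant columns (the Ethnicity and Percent columns) as follows: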
header_col = ['State', 'County', 'Candidate', 'Total Votes', 'County Vote Percentage', 'Total Population']
floridaDtFinal = floridaDtFinal[header_col].join(floridaDtFinal.drop(columns=header_col).sort_index(axis=1))
Result:
State County Candidate Total Votes County Vote Percentage Total Population American Indian and Alaska Native Alone American Indian and Alaska Native Alone Percent Asian Alone Asian Alone Percent Black or African American Alone Black or African American Alone Percent Hispanic or Latino Hispanic or Latino (Cuban) Hispanic or Latino (Cuban) Percent Hispanic or Latino (Mexican) Hispanic or Latino (Mexican) Percent Hispanic or Latino (Other Hispanic or Latino) Hispanic or Latino (Other Hispanic or Latino) Percent Hispanic or Latino (Puerto Rican) Hispanic or Latino (Puerto Rican) Percent Hispanic or Latino Percent Native Hawaiian and Other Pacific Islander Alone Native Hawaiian and Other Pacific Islander Alone Percent Some Other Race Alone Some Other Race Alone Percent Two or More Races Two or More Races Percent White Alone White Alone Percent
1 Florida Lafayette County Donald Trump 3128 85.511 11357.0 13.0 0.114467 0.0 0.000000 1394.0 12.274368 1360.0 568.0 5.001321 488.0 4.296909 156.0 1.373602 148.0 1.303161 11.974993 28.0 0.246544 7.0 0.061636 264.0 2.324558 6931.0 61.028441
2 Florida Baker County Donald Trump 11911 84.722 29653.0 144.0 0.485617 173.0 0.583415 3824.0 12.895828 721.0 140.0 0.472128 133.0 0.448521 299.0 1.008330 149.0 0.502479 2.431457 10.0 0.033723 245.0 0.826223 536.0 1.807574 23279.0 78.504704
3 Florida Dixie County Donald Trump 6759 82.760 17937.0 2.0 0.011150 38.0 0.211853 1474.0 8.217651 674.0 147.0 0.819535 275.0 1.533144 137.0 0.763784 115.0 0.641133 3.757596 20.0 0.111501 254.0 1.416067 468.0 2.609132 14333.0 79.907454
4 Florida Union County Donald Trump 5133 82.194 17027.0 116.0 0.681271 93.0 0.546191 3359.0 19.727492 862.0 161.0 0.945557 217.0 1.274446 235.0 1.380161 249.0 1.462383 5.062548 10.0 0.058730 191.0 1.121748 266.0 1.562225 11268.0 66.177248
0 NaN Holmes County Donald Trump 8080 89.105 20524.0 266.0 1.296044 131.0 0.638277 1356.0 6.606899 546.0 29.0 0.141298 248.0 1.208341 149.0 0.725979 120.0 0.584681 2.660300 60.0 0.292341 67.0 0.326447 315.0 1.534789 17237.0 83.984603
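Note that in the sorted result, "Hispanic or Latino" and "Hispanic or Latino Percent" are not adjacent, because the parenthesized sub-group names sort between them. If strict pairing matters, one option is to build the column order explicitly instead of sorting. This is a sketch on a one-row stand-in frame; `df`, `header_cols`, and `race_cols` are hypothetical names, and the values are taken from the Holmes County row.

```python
import pandas as pd

# Hypothetical one-row stand-in that already contains the Percent columns.
df = pd.DataFrame({
    "County": ["Holmes County"],
    "White Alone": [17237.0],
    "Asian Alone": [131.0],
    "Total Population": [20524.0],
    "White Alone Percent": [83.984603],
    "Asian Alone Percent": [0.638277],
})

header_cols = ["County", "Total Population"]
race_cols = ["White Alone", "Asian Alone"]

# Put each race column immediately before its matching Percent column,
# preserving the original order of race_cols.
ordered = header_cols + [
    name for col in race_cols for name in (col, f"{col} Percent")
]
df = df[ordered]
print(list(df.columns))
```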