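r - Calculating percentages within groups
Problem description
I have a set of population data, which I have split into age groups and areas.
Using the sample data below, how can I calculate the proportion within each group and area for every column?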
area sex agegrouping 2011 2012 2013
area1 F 0-4 637.4815661 626.6145185 596.7128164
area1 F 10-14 417.8041418 402.5041888 411.2180838
area1 F 15-19 360.6491372 361.5883403 364.5626384
area1 F 20-24 562.4887445 598.7190796 617.9790937
area1 M 0-4 581.08247 581.11732 556.4439468
area1 M 10-14 408.1015966 379.945334 377.7312704
area1 M 15-19 380.7336397 392.2732017 384.8757803
area1 M 20-24 1089.024655 983.1813181 874.3646633
area2 F 0-4 460.2959017 479.7512631 489.1076221
area2 F 10-14 357.2974721 378.9785589 410.7145251
area2 F 15-19 353.4763328 324.3975914 312.5421936
area2 F 20-24 674.8157905 627.0151556 568.8309423
area2 M 0-4 570.1424505 579.4558621 572.8858648
area2 M 10-14 366.9484728 365.0947588 370.726409
area2 M 15-19 382.3444468 365.0342791 343.5104
area2 M 20-24 645.3627281 624.4575313 577.5540519
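I know I could do this manually, column by column, but is there a way to do it all at once (the full dataset runs to 2050)?
The data should look like this (but including all the other year columns and areas):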
area sex agegrouping 2011.percent
area1 F 0-4 14.36621575
area1 F 10-14 9.415589032
area1 F 15-19 8.127550019
area1 F 20-24 12.67618562
area1 M 0-4 13.09521181
area1 M 10-14 9.196933521
area1 M 15-19 8.5801722
area1 M 20-24 24.54214205
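The first expected value can be checked by hand, which also shows which total is used as the denominator. The sketch below is only an illustration; the variable names `area1_2011` and `first_pct` are made up for it. It divides the first area1 value for 2011 by the sum of all eight area1 values (both sexes), taken from the sample data above.

```r
# 2011 values for area1, in the order they appear in the sample data
# (rows 1-4 are F, rows 5-8 are M)
area1_2011 <- c(637.4815661, 417.8041418, 360.6491372, 562.4887445,
                581.08247, 408.1015966, 380.7336397, 1089.024655)

# Share of the first row (F, 0-4) in the whole-area total, as a percentage
first_pct <- 100 * area1_2011[1] / sum(area1_2011)
print(first_pct)  # about 14.366, matching the first row of the desired output

# For comparison: share within area1 females only
print(100 * area1_2011[1] / sum(area1_2011[1:4]))  # about 32.2, which does not match
```

So the desired percentages are taken within each area, across both sexes and all age groups.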
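Solution
Here is a dplyr version. It returns proportions (between 0 and 1) within each area; multiply by 100 to get percentages as in the desired output. Note that read.table prefixes the year column names with X, which is why the rename step replaces X.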
library(dplyr)
dt = read.table(text = "
area sex agegrouping 2011 2012 2013
area1 F 0-4 637.4815661 626.6145185 596.7128164
area1 F 10-14 417.8041418 402.5041888 411.2180838
area1 F 15-19 360.6491372 361.5883403 364.5626384
area1 F 20-24 562.4887445 598.7190796 617.9790937
area1 M 0-4 581.08247 581.11732 556.4439468
area1 M 10-14 408.1015966 379.945334 377.7312704
area1 M 15-19 380.7336397 392.2732017 384.8757803
area1 M 20-24 1089.024655 983.1813181 874.3646633
area2 F 0-4 460.2959017 479.7512631 489.1076221
area2 F 10-14 357.2974721 378.9785589 410.7145251
area2 F 15-19 353.4763328 324.3975914 312.5421936
area2 F 20-24 674.8157905 627.0151556 568.8309423
area2 M 0-4 570.1424505 579.4558621 572.8858648
area2 M 10-14 366.9484728 365.0947588 370.726409
area2 M 15-19 382.3444468 365.0342791 343.5104
area2 M 20-24 645.3627281 624.4575313 577.5540519
", header = TRUE)
dt %>%
  group_by(area) %>% # for each area
  mutate_if(is.numeric, ~./sum(.)) %>% # divide each numeric column by its area total (proportions; multiply by 100 for percent)
  rename_if(is.numeric, ~gsub("X", "prc_", .)) %>% # update the names of those columns
  ungroup() # forget the grouping
# # A tibble: 16 x 6
# area sex agegrouping prc_2011 prc_2012 prc_2013
# <fct> <fct> <fct> <dbl> <dbl> <dbl>
# 1 area1 F 0-4 0.144 0.145 0.143
# 2 area1 F 10-14 0.0942 0.0930 0.0983
# 3 area1 F 15-19 0.0813 0.0836 0.0871
# 4 area1 F 20-24 0.127 0.138 0.148
# 5 area1 M 0-4 0.131 0.134 0.133
# 6 area1 M 10-14 0.0920 0.0878 0.0903
# 7 area1 M 15-19 0.0858 0.0907 0.0920
# 8 area1 M 20-24 0.245 0.227 0.209
# 9 area2 F 0-4 0.121 0.128 0.134
# 10 area2 F 10-14 0.0938 0.101 0.113
# 11 area2 F 15-19 0.0928 0.0866 0.0857
# 12 area2 F 20-24 0.177 0.167 0.156
# 13 area2 M 0-4 0.150 0.155 0.157
# 14 area2 M 10-14 0.0963 0.0975 0.102
# 15 area2 M 15-19 0.100 0.0975 0.0942
# 16 area2 M 20-24 0.169 0.167 0.158
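For reference, the same idea can be written with across(), which dplyr 1.0.0 and later offer in place of the _if verbs. This is only a sketch under a few assumptions that are not part of the answer above: it reads the data with check.names = FALSE so the year columns keep their original names, multiplies by 100 to get percentages, and appends a ".percent" suffix to match the naming in the question. The name `res` is just illustrative.

```r
library(dplyr)

# check.names = FALSE keeps the year columns named 2011, 2012, 2013
# instead of X2011, X2012, X2013
dt <- read.table(text = "
area sex agegrouping 2011 2012 2013
area1 F 0-4 637.4815661 626.6145185 596.7128164
area1 F 10-14 417.8041418 402.5041888 411.2180838
area1 F 15-19 360.6491372 361.5883403 364.5626384
area1 F 20-24 562.4887445 598.7190796 617.9790937
area1 M 0-4 581.08247 581.11732 556.4439468
area1 M 10-14 408.1015966 379.945334 377.7312704
area1 M 15-19 380.7336397 392.2732017 384.8757803
area1 M 20-24 1089.024655 983.1813181 874.3646633
area2 F 0-4 460.2959017 479.7512631 489.1076221
area2 F 10-14 357.2974721 378.9785589 410.7145251
area2 F 15-19 353.4763328 324.3975914 312.5421936
area2 F 20-24 674.8157905 627.0151556 568.8309423
area2 M 0-4 570.1424505 579.4558621 572.8858648
area2 M 10-14 366.9484728 365.0947588 370.726409
area2 M 15-19 382.3444468 365.0342791 343.5104
area2 M 20-24 645.3627281 624.4575313 577.5540519
", header = TRUE, check.names = FALSE)

res <- dt %>%
  group_by(area) %>%                                           # for each area
  mutate(across(where(is.numeric), ~ 100 * .x / sum(.x))) %>%  # percent of the area total, per year column
  ungroup() %>%                                                # forget the grouping
  rename_with(~ paste0(.x, ".percent"), where(is.numeric))     # 2011 -> 2011.percent, etc.

print(res)
```

Because the selection is where(is.numeric), any further year columns (up to 2050 in the full dataset) would be handled without listing them, provided they are the only numeric columns in the data.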