r - How to calculate a ratio within a variable by group
Question
I have a data frame consisting of:
HouseholdID <- c("1", "2","2", "3", "3", "4")
ItemNo <- c("23", "25", "23", "26", "23", "24")
ItemPrice <- c(200, 300, 200, 500, 200, 900)
TotalHouseholdExpenses <- c(200, 500, 500, 700, 700, 900)
df <- data.frame(HouseholdID, ItemNo, ItemPrice, TotalHouseholdExpenses)
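That is, household 1 bought item 23 for $200, household 2 bought items 25 and 23 for a total of $500, and so on. If items 23 and 24 are considered "temptation goods", I would like to know what percentage of each household's total expenses is made up of temptation goods. How would I calculate this and create a new variable for the ratio? Thank you.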
Answer
If the per-item ratio ItemPrice/TotalHouseholdExpenses suits you, dcast can produce it:
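library(data.table)
library(dplyr)  # provides %>%, mutate() and across()
dt <- data.table(HouseholdID, ItemNo, ItemPrice, TotalHouseholdExpenses)
# Reshape to one row per household and one column per item, then divide by the household total
dcast(dt, HouseholdID + TotalHouseholdExpenses ~ ItemNo, value.var = 'ItemPrice') %>%
  mutate(across(`23`:`26`, ~ .x / TotalHouseholdExpenses))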
Or, without having to specify the columns manually:
dcast(dt, HouseholdID + TotalHouseholdExpenses ~ ItemNo, value.var = 'ItemPrice') %>%
mutate(across(3:(2+length(unique(ItemNo))), ~ .x /TotalHouseholdExpenses))
output:
   HouseholdID TotalHouseholdExpenses        23 24  25        26
1:           1                    200 1.0000000 NA  NA        NA
2:           2                    500 0.4000000 NA 0.6        NA
3:           3                    700 0.2857143 NA  NA 0.7142857
4:           4                    900        NA  1  NA        NA
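The positional range 3:(2+length(unique(ItemNo))) assumes the item columns start at position 3. As a rough alternative sketch, the item columns can be picked by name using data.table alone. The names `wide`, `id_cols`, and `item_cols` are introduced here for illustration and are not part of the original answer.

```r
library(data.table)

# Same data as in the question
HouseholdID <- c("1", "2", "2", "3", "3", "4")
ItemNo <- c("23", "25", "23", "26", "23", "24")
ItemPrice <- c(200, 300, 200, 500, 200, 900)
TotalHouseholdExpenses <- c(200, 500, 500, 700, 700, 900)
dt <- data.table(HouseholdID, ItemNo, ItemPrice, TotalHouseholdExpenses)

# One row per household, one column per item
wide <- dcast(dt, HouseholdID + TotalHouseholdExpenses ~ ItemNo, value.var = "ItemPrice")

# Assumption: every column that is not an identifier column is an item column
id_cols <- c("HouseholdID", "TotalHouseholdExpenses")
item_cols <- setdiff(names(wide), id_cols)

# Divide each item column by the household total, updating wide in place
for (col in item_cols) {
  set(wide, j = col, value = wide[[col]] / wide$TotalHouseholdExpenses)
}

print(wide)
```

Selecting by name keeps working if another identifier column is added before the item columns, as long as `id_cols` is updated to match.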
Finally, with % formatting:
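library(data.table)
library(dplyr)   # provides %>%, mutate() and across()
library(scales)  # provides label_percent()
dt <- data.table(HouseholdID, ItemNo, ItemPrice, TotalHouseholdExpenses)
# Same reshape as before, with each ratio formatted as a percentage string
dcast(dt, HouseholdID + TotalHouseholdExpenses ~ ItemNo, value.var = 'ItemPrice') %>%
  mutate(across(3:(2+length(unique(ItemNo))), ~ label_percent()(.x / TotalHouseholdExpenses)))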
output:
   HouseholdID TotalHouseholdExpenses   23   24   25   26
1:           1                    200 100% <NA> <NA> <NA>
2:           2                    500  40% <NA>  60% <NA>
3:           3                    700  29% <NA> <NA>  71%
4:           4                    900 <NA> 100% <NA> <NA>
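The tables above give a ratio per item. The question asks for a single share per household covering items 23 and 24 together, which could be sketched in base R roughly as follows. The names `temptation_items`, `temptation_spend`, `TemptationRatio`, and `household_ratio` are introduced here for illustration, and the sketch assumes TotalHouseholdExpenses is constant within each household, as it is in the sample data.

```r
# Same data as in the question
df <- data.frame(
  HouseholdID = c("1", "2", "2", "3", "3", "4"),
  ItemNo = c("23", "25", "23", "26", "23", "24"),
  ItemPrice = c(200, 300, 200, 500, 200, 900),
  TotalHouseholdExpenses = c(200, 500, 500, 700, 700, 900)
)

# Items the question treats as temptation goods
temptation_items <- c("23", "24")

# Per-household sum of temptation spending, repeated on every row of that household
temptation_spend <- ave(
  df$ItemPrice * (df$ItemNo %in% temptation_items),
  df$HouseholdID,
  FUN = sum
)

# New variable: share of total household expenses spent on temptation goods
df$TemptationRatio <- temptation_spend / df$TotalHouseholdExpenses

# One row per household
household_ratio <- unique(df[, c("HouseholdID", "TemptationRatio")])
print(household_ratio)
```

Because each household in the sample bought at most one temptation item, these shares coincide with the non-missing values in the 23 and 24 columns above. With several temptation items per household they would be summed.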