r - Aggregating combinations of columns using data.table
Problem description
Suppose I have this data:
> dput(data)
structure(list(Country = c("USA", "USA", "USA", "USA", "USA",
"USA", "USA", "USA", "USA"), Location = c("West", "East", "East",
"North", "North", "East", "West", "North", "East"), Gender = c("M",
"M", "F", "F", "F", "F", "F", "F", "M"), Age = c("20 - 30", "30 - 40",
"20 - 30", "30 - 40", "20 - 30", "20 - 30", "30 - 40", "20 - 30",
"30 - 40"), Civil_Status = c("Single", "Single", "Married", "Married",
"Married", "Single", "Single", "Married", "Married"), Expenditure = c(320,
400, 800, 900, 750, 350, 620, 1200, 800)), row.names = c(NA,
-9L), class = c("tbl_df", "tbl", "data.frame"))
Country Location Gender Age Civil_Status Expenditure
<chr> <chr> <chr> <chr> <chr> <dbl>
1 USA West M 20 - 30 Single 320
2 USA East M 30 - 40 Single 400
3 USA East F 20 - 30 Married 800
4 USA North F 30 - 40 Married 900
5 USA North F 20 - 30 Married 750
6 USA East F 20 - 30 Single 350
7 USA West F 30 - 40 Single 620
8 USA North F 20 - 30 Married 1200
9 USA East M 30 - 40 Married 800
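What I want to do is sum Expenditure over every combination of the variables Gender, Age, and Civil_Status, first for the Country as a whole and then for each possible Location, and then combine all of these results into a single dataset.
Here is an example of the groupings:
USA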
USA, Gender
USA, Age
USA, Civil_Status
USA, Gender, Age
USA, Gender, Civil_Status
.....................
West, Gender
West, Age
.....................
In this case I would have 2^3 = 8 combinations for Country and 8 combinations for each Location.
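As a quick check of that count, here is a minimal base R sketch. The object names `include` and `locations` are illustrative and not part of the original post.

```r
# Each of the three variables is either in a grouping set (TRUE) or not (FALSE)
include <- expand.grid(
  Gender = c(FALSE, TRUE),
  Age = c(FALSE, TRUE),
  Civil_Status = c(FALSE, TRUE)
)
nrow(include)  # 8, i.e. 2^3

# One block of 8 for the country plus one block of 8 per location
locations <- c("West", "East", "North")
nrow(include) * (1 + length(locations))  # 32
```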
Solution
One option is rollup from data.table:
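library(data.table)
# Convert the tibble to a data.table by reference
setDT(data)
# rollup() aggregates hierarchically: by all four columns, then dropping the
# last column one at a time, down to the grand total
rollup(data, j = sum(Expenditure), by = c("Country","Gender","Age", "Civil_Status"))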
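Note that rollup only returns nested subtotals, so groupings such as Country with Age alone are not included. If every subset of Gender, Age, and Civil_Status is needed, one possible approach is data.table's groupingsets, which takes an explicit list of grouping sets. The sketch below rebuilds the sample data so that it runs on its own. The names `vars`, `subsets`, `by_country`, `by_location`, `result`, and the output column `Total` are illustrative choices, not part of the original answer.

```r
library(data.table)

# Sample data from the question, rebuilt as a data.table
data <- data.table(
  Country = "USA",
  Location = c("West", "East", "East", "North", "North", "East", "West", "North", "East"),
  Gender = c("M", "M", "F", "F", "F", "F", "F", "F", "M"),
  Age = c("20 - 30", "30 - 40", "20 - 30", "30 - 40", "20 - 30",
          "20 - 30", "30 - 40", "20 - 30", "30 - 40"),
  Civil_Status = c("Single", "Single", "Married", "Married", "Married",
                   "Single", "Single", "Married", "Married"),
  Expenditure = c(320, 400, 800, 900, 750, 350, 620, 1200, 800)
)

vars <- c("Gender", "Age", "Civil_Status")

# All 2^3 = 8 subsets of vars, including the empty subset
subsets <- c(
  list(character(0)),
  unlist(lapply(seq_along(vars), function(k) combn(vars, k, simplify = FALSE)),
         recursive = FALSE)
)

# Country always present, combined with every subset of vars
by_country <- groupingsets(
  data,
  j = list(Total = sum(Expenditure)),
  by = c("Country", vars),
  sets = lapply(subsets, function(s) c("Country", s))
)

# Same idea with Location as the fixed top-level column
by_location <- groupingsets(
  data,
  j = list(Total = sum(Expenditure)),
  by = c("Location", vars),
  sets = lapply(subsets, function(s) c("Location", s))
)

# Stack both results; columns absent from one table are filled with NA
result <- rbindlist(list(by_country, by_location), use.names = TRUE, fill = TRUE)
```

In `result`, an NA in a grouping column means that the row is aggregated over that column. For example, the row with Country "USA" and NA in Gender, Age, and Civil_Status holds the country-wide total.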