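r - How to aggregate a data frame with a chr column in R
Problem description
I am trying to build an aggregated data frame in R from three columns.
The first is a date, the second is a chr code, and the third is a num. For every promo_code, I need a new column that indicates whether that promo code was used (1 = yes, 0 = no).
There are thousands of promo_codes, so I need something that scales on its own. I cannot add the columns by hand.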
DATA:
+---------------+------------+--------------+
| Date | Promo_Code | Sales_Amount |
+---------------+------------+--------------+
| 10-29-20 | today20 | 50 |
+---------------+------------+--------------+
| 10-29-20 | vip20 | 50 |
+---------------+------------+--------------+
| 10-29-20 | today20 | 50 |
+---------------+------------+--------------+
| 10-28-20 | vip20 | 50 |
+---------------+------------+--------------+
| 10-28-20 | vip20 | 50 |
+---------------+------------+--------------+
| 10-27-20 | pc20 | 25 |
+---------------+------------+--------------+
| 10-28-20 | | 50 |
+---------------+------------+--------------+
| 10-28-20 | vip20 | 50 |
+---------------+------------+--------------+
| 10-27-20 | | 25 |
+---------------+------------+--------------+
| .... | .... | .... |
+---------------+------------+--------------+
| .... | .... | .... |
+---------------+------------+--------------+
NEW DATAFRAME
+---------------+------------+--------------+--------------+--------------+
| Date | today20 | vip20 | pc20 | Sales Total |
+---------------+------------+--------------+--------------+--------------+
| 10-29-20 | 1 | 1 | 0 | 150.00 |
+---------------+------------+--------------+--------------+--------------+
| 10-28-20 | 0 | 1 | 0 | 100.00 |
+---------------+------------+--------------+--------------+--------------+
| 10-27-20 | 0 | 0 | 1 | 25.00 |
+---------------+------------+--------------+--------------+--------------+
| .... | .... | .... | .... | .... |
+---------------+------------+--------------+--------------+--------------+
| .... | .... | .... | .... | .... |
+---------------+------------+--------------+--------------+--------------+
What I tried:
library(data.table)
out = setDT(data)[,lapply(.SD, sum), by=data$Date]
This gives the error: type 'character' not supported
library(dplyr)
out = data %>% group_by(date) %>% summarise_each(funs(sum))
This gives me the same kind of error: invalid 'type' (character) of argument
Solution
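Both attempts in the question apply sum() to every non-grouping column, including the character column Promo_Code, and sum() is not defined for character vectors. The sketch below reproduces that failure on a small, hypothetical three-row data frame (the values are illustrative, not the asker's real data) and shows that summing only the numeric column per date works in base R.

```r
# Hypothetical toy data with the same column types as in the question
data <- data.frame(
  Date = c("10-29-20", "10-29-20", "10-28-20"),
  Promo_Code = c("today20", "vip20", "vip20"),
  Sales_Amount = c(50, 50, 50),
  stringsAsFactors = FALSE
)

# sum() on a character vector raises an error; capture its message instead of stopping
res <- tryCatch(
  sum(data$Promo_Code),
  error = function(e) conditionMessage(e)
)
print(res)

# Summing only the numeric column, grouped by Date, works
totals <- aggregate(Sales_Amount ~ Date, data = data, FUN = sum)
print(totals)
```

This only gives the per-date totals; the promo-code indicator columns still need a reshape step.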
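Here is one approach: it counts each Promo_Code within a given date, then pivots wider so that the promo codes become their own columns. Aggregating by date then yields the total Sales_Amount plus the Promo_Code counts.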
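library(tidyverse)

df %>%
  group_by(Date, Promo_Code) %>%
  # count the rows and total the sales for each date/code pair
  summarise(n = n(),
            Sales_Amount = sum(Sales_Amount)) %>%
  # spread the promo codes into their own columns; absent codes get 0
  pivot_wider(id_cols = c(Date, Sales_Amount), names_from = Promo_Code,
              values_from = n, values_fill = 0) %>%
  # the data is still grouped by Date, so this collapses to one row per date
  summarise(across(everything(), sum))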
# A tibble: 5 x 5
Date Sales_Amount pc20 vip20 today20
<date> <dbl> <int> <int> <int>
1 2020-10-27 115 1 1 0
2 2020-10-28 75 1 1 0
3 2020-10-29 60 1 1 0
4 2020-10-30 115 1 1 0
5 2020-10-31 100 1 0 1
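The output above holds counts per code, whereas the question asks for a 1/0 indicator. A minimal sketch of that variant follows, assuming the nine visible rows from the question's table, with blank promo codes stored as empty strings and Date kept as plain text. The names sales, totals, flags, and used are illustrative. Rows without a promo code are counted in the total here, which is an assumption about the intended result.

```r
library(dplyr)
library(tidyr)

# The nine visible rows from the question (blank codes as "")
sales <- tibble(
  Date = c("10-29-20", "10-29-20", "10-29-20", "10-28-20", "10-28-20",
           "10-27-20", "10-28-20", "10-28-20", "10-27-20"),
  Promo_Code = c("today20", "vip20", "today20", "vip20", "vip20",
                 "pc20", "", "vip20", ""),
  Sales_Amount = c(50, 50, 50, 50, 50, 25, 50, 50, 25)
)

# Total sales per date, including rows without a promo code
totals <- sales %>%
  group_by(Date) %>%
  summarise(Sales_Total = sum(Sales_Amount), .groups = "drop")

# One 0/1 column per promo code that appears on each date
flags <- sales %>%
  filter(Promo_Code != "") %>%
  distinct(Date, Promo_Code) %>%
  mutate(used = 1L) %>%
  pivot_wider(names_from = Promo_Code, values_from = used, values_fill = 0L)

# Dates that never used a code would come back as NA, so fill those with 0
result <- totals %>%
  left_join(flags, by = "Date") %>%
  mutate(across(-c(Date, Sales_Total), ~ coalesce(.x, 0L)))

print(result)
```

Because the indicator columns come from the distinct codes in the data, no column has to be added by hand when new promo codes appear.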
Sample data:
library(lubridate)  # for ymd(); not attached by tidyverse versions before 2.0.0
set.seed(123)
df <- tibble(Date = rep(seq(ymd('2020-10-27'), ymd('2020-10-31'), by = '1 day'), 2),
             Promo_Code = sample(c("today20", "vip20", "pc20"), size = 10, replace = TRUE),
             Sales_Amount = sample(c(25, 35, 50, 65, 75), size = 10, replace = TRUE))
df
# A tibble: 10 x 3
Date Promo_Code Sales_Amount
<date> <chr> <dbl>
1 2020-10-27 pc20 65
2 2020-10-28 pc20 25
3 2020-10-29 pc20 35
4 2020-10-30 vip20 50
5 2020-10-31 pc20 75
6 2020-10-27 vip20 50
7 2020-10-28 vip20 50
8 2020-10-29 vip20 25
9 2020-10-30 pc20 65
10 2020-10-31 today20 25
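Since the question first tried data.table, the same idea can be sketched there with dcast(). This is an alternative, not part of the answer above, and it again assumes the nine visible rows from the question with blank codes stored as empty strings. The names dt, totals, flags, and result are illustrative.

```r
library(data.table)

# The nine visible rows from the question (blank codes as "")
dt <- data.table(
  Date = c("10-29-20", "10-29-20", "10-29-20", "10-28-20", "10-28-20",
           "10-27-20", "10-28-20", "10-28-20", "10-27-20"),
  Promo_Code = c("today20", "vip20", "today20", "vip20", "vip20",
                 "pc20", "", "vip20", ""),
  Sales_Amount = c(50, 50, 50, 50, 50, 25, 50, 50, 25)
)

# Sum only the numeric column per date
totals <- dt[, .(Sales_Total = sum(Sales_Amount)), by = Date]

# Cast codes to columns: 1 if the code occurs on that date, 0 otherwise
flags <- dcast(
  dt[Promo_Code != ""],
  Date ~ Promo_Code,
  value.var = "Sales_Amount",
  fun.aggregate = function(x) as.integer(length(x) > 0),
  fill = 0L
)

# Keep every date from totals; a date with no codes at all would show NA flags
result <- merge(totals, flags, by = "Date", all.x = TRUE)
print(result)
```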