r - Summarising duplicate rows in a data frame in R
Problem description
I have a data frame containing the following data:
#sample data
Date <- c( "2020-01-01", "2020-01-01", "2020-01-01", "2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02", "2020-01-02", "2020-01-02")
Salesperson <-c ( "Sales1", "Sales1", "Sales1", "Sales2", "Sales2", "Sales1", "Sales1", "Sales2", "Sales2" )
Clothing <-c ( "5", "2", "8", "3", "3", "4", "7", "3", "4" )
Electronics <-c ( "6", "9", "1", "2", "1", "2", "2", "1", "2" )
data<-data.frame(Date,Salesperson,Clothing,Electronics, stringsAsFactors = FALSE)
data$Date<-as.Date(data$Date,"%Y-%m-%d")
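In the data frame there are several rows where a salesperson recorded their sales more than once on the same date, instead of adding them up.
The result I want is shown by the data frame below: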
Date <- c ( "2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02" )
Salesperson <- c ( "Sales1", "Sales2", "Sales1", "Sales2")
Clothing <- c ( "15", "6", "11", "7" )
Electronics <- c ( "16", "3", "4", "3" )
data1<-data.frame(Date,Salesperson,Clothing,Electronics, stringsAsFactors = FALSE)
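Does anyone know how to achieve this result?
Solution
To summarise your data, the numbers need to be passed as numeric values rather than as strings. See the as.numeric() I added in front of your Clothing and Electronics variables: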
Clothing <-as.numeric(c ( "5", "2", "8", "3", "3", "4", "7", "3", "4" ))
Electronics <-as.numeric(c ( "6", "9", "1", "2", "1", "2", "2", "1", "2" ))
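If the data frame has already been built, the same conversion can be applied to its columns directly instead of to the source vectors. The following is a minimal sketch using the sample data from the question; the name `num_cols` is introduced here only for illustration.

```r
# Sample data from the question, with the sales figures still stored as strings
data <- data.frame(
  Date = c("2020-01-01", "2020-01-01", "2020-01-01", "2020-01-01", "2020-01-01",
           "2020-01-02", "2020-01-02", "2020-01-02", "2020-01-02"),
  Salesperson = c("Sales1", "Sales1", "Sales1", "Sales2", "Sales2",
                  "Sales1", "Sales1", "Sales2", "Sales2"),
  Clothing = c("5", "2", "8", "3", "3", "4", "7", "3", "4"),
  Electronics = c("6", "9", "1", "2", "1", "2", "2", "1", "2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper name: the columns that should hold numbers
num_cols <- c("Clothing", "Electronics")

# Convert each listed column from character to numeric, in place
data[num_cols] <- lapply(data[num_cols], as.numeric)

# Inspect the column types after conversion
str(data)
```

After this step `sum()` works on both columns, whereas calling it on the original character columns raises an error.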
Now rebuild data with data.frame() so that it picks up the numeric columns, then summarise with sum:
library(dplyr)
data %>%
  group_by(Date, Salesperson) %>%
  summarise(sum_cloth = sum(Clothing), sum_elec = sum(Electronics))
# Groups: Date [2]
Date Salesperson sum_cloth sum_elec
<chr> <chr> <dbl> <dbl>
1 2020-01-01 Sales1 15 16
2 2020-01-01 Sales2 6 3
3 2020-01-02 Sales1 11 4
4 2020-01-02 Sales2 7 3
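For comparison, the same grouping can probably be done without dplyr. The sketch below swaps in base R's `aggregate()` as an alternative to the answer's method. It assumes the sales columns are already numeric, and the name `totals` is introduced here only for illustration.

```r
# Sample data from the question, with numeric sales columns
data <- data.frame(
  Date = as.Date(c("2020-01-01", "2020-01-01", "2020-01-01", "2020-01-01",
                   "2020-01-01", "2020-01-02", "2020-01-02", "2020-01-02",
                   "2020-01-02")),
  Salesperson = c("Sales1", "Sales1", "Sales1", "Sales2", "Sales2",
                  "Sales1", "Sales1", "Sales2", "Sales2"),
  Clothing = c(5, 2, 8, 3, 3, 4, 7, 3, 4),
  Electronics = c(6, 9, 1, 2, 1, 2, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Sum both sales columns for every Date/Salesperson combination
totals <- aggregate(cbind(Clothing, Electronics) ~ Date + Salesperson,
                    data = data, FUN = sum)

# Sort by date, then salesperson, to match the layout asked for in the question
totals <- totals[order(totals$Date, totals$Salesperson), ]
rownames(totals) <- NULL

print(totals)
```

Unlike the dplyr version, this keeps the original column names (Clothing, Electronics) and returns a plain data frame with no grouping attached.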