r - Calculate an average based on a condition
Problem description
I have a table:
Country ClaimId ClaimItem ClaimAmt
IN C1 1 100
IN C1 2 200
US C2 1 100
US C2 2 100
US C2 3 100
US C3 1 100
US C3 2 100
UK C4 1 100
UK C4 2 200
UK C1 1 100
UK C1 2 200
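I want to calculate, for each country, the average amount per ClaimId (the country's total ClaimAmt divided by its number of distinct ClaimIds), so that my expected table looks like this: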
Country ClaimId ClaimItem ClaimAmt Avg
IN C1 1 100 300
IN C1 2 200 300
US C2 1 100 250
US C2 2 100 250
US C2 3 100 250
US C3 1 100 250
US C3 2 100 250
UK C4 1 100 300
UK C4 2 200 300
UK C1 1 100 300
UK C1 2 200 300
Any ideas on how to produce the expected table? Thanks.
Here is the sample data:
> dput(claims)
structure(list(Country = structure(c(1L, 1L, 3L, 3L, 3L, 3L,
3L, 2L, 2L, 2L, 2L), .Label = c("IN", "UK", "US"), class = "factor"),
ClaimId = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L,
1L, 1L), .Label = c("C1", "C2", "C3", "C4"), class = "factor"),
ClaimItem = c(1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L),
ClaimAmt = c(100L, 200L, 100L, 100L, 100L, 100L, 100L, 100L,
200L, 100L, 200L)), .Names = c("Country", "ClaimId", "ClaimItem",
"ClaimAmt"), class = "data.frame", row.names = c(NA, -11L))
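Solution
Here is a solution using data.table. Within each Country, it divides the total ClaimAmt by the number of distinct ClaimId values: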
claims <-
structure(list(Country = structure(c(1L, 1L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L),
.Label = c("IN", "UK", "US"), class = "factor"),
ClaimId = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 1L, 1L),
.Label = c("C1", "C2", "C3", "C4"), class = "factor"),
ClaimItem = c(1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L),
ClaimAmt = c(100L, 200L, 100L, 100L, 100L, 100L, 100L, 100L, 200L, 100L, 200L)),
.Names = c("Country", "ClaimId", "ClaimItem", "ClaimAmt"),
class = "data.frame", row.names = c(NA, -11L))
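library("data.table")
setDT(claims)  # convert the data.frame to a data.table by reference
# Per Country: total ClaimAmt divided by the number of distinct ClaimIds;
# the trailing [] prints the updated table
claims[, Avg := sum(ClaimAmt) / uniqueN(ClaimId), by = Country][]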
# > claims[, Avg:=sum(ClaimAmt)/uniqueN(ClaimId), Country][]
# Country ClaimId ClaimItem ClaimAmt Avg
# 1: IN C1 1 100 300
# 2: IN C1 2 200 300
# 3: US C2 1 100 250
# 4: US C2 2 100 250
# 5: US C2 3 100 250
# 6: US C3 1 100 250
# 7: US C3 2 100 250
# 8: UK C4 1 100 300
# 9: UK C4 2 200 300
# 10: UK C1 1 100 300
# 11: UK C1 2 200 300
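If data.table is not available, the same per-country calculation can probably be done in base R. The following is only a minimal sketch, not part of the original answer: it rebuilds the sample data with `data.frame()` instead of the `dput` output, and the helper names `total` and `n_claims` are made up for illustration.

```r
# Sample data from the question, rebuilt as a plain data.frame
claims <- data.frame(
  Country   = c("IN", "IN", "US", "US", "US", "US", "US", "UK", "UK", "UK", "UK"),
  ClaimId   = c("C1", "C1", "C2", "C2", "C2", "C3", "C3", "C4", "C4", "C1", "C1"),
  ClaimItem = c(1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L),
  ClaimAmt  = c(100L, 200L, 100L, 100L, 100L, 100L, 100L, 100L, 200L, 100L, 200L)
)

# Total claim amount per country (hypothetical helper name)
total <- tapply(claims$ClaimAmt, claims$Country, sum)

# Number of distinct claim IDs per country (hypothetical helper name)
n_claims <- tapply(claims$ClaimId, claims$Country,
                   function(x) length(unique(x)))

# Look up each row's country in the per-country result and attach it
claims$Avg <- as.numeric((total / n_claims)[as.character(claims$Country)])

print(claims)
```

Unlike the `:=` approach, this creates two intermediate per-country vectors and then maps them back to the rows by name, which may be easier to inspect step by step.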