R data.table: dynamic column name for the new table returned by a group by

Problem description

By default, a group by operation on a data.table returns a new data.table whose computed column is automatically named V1:

library(data.table)
dt <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10, 10))
dt[, mean(a), by = id]

#     id V1
# 1:  1 48.2
# 2:  2 47.9
# 3:  3 46.8
# 4:  4 54.7
# 5:  5 63.7
# 6:  6 50.6
# 7:  7 43.3
# 8:  8 52.7
# 9:  9 45.4
# 10: 10 51.7

Based on an earlier post, I can set the name of the resulting column myself, like this:

dt[, list(mean = mean(a)), by = id]

Can the column name come from a variable? For example, instead of writing mean explicitly, I would like to do something like this:

column_name <- "mean"
dt[, list(column_name = mean(a)), by = id]  # resulting column name is column_name (and not mean)

Tags: r, data.table

Solution


We can use setNames, which names the list returned in j using the value stored in column_name:

library(data.table)
dt[, setNames(list(mean(a)), column_name), by = id]

#    id mean
# 1:  1 56.8
# 2:  2 50.5
# 3:  3 50.5
# 4:  4 42.4
# 5:  5 49.9
# 6:  6 47.8
# 7:  7 60.6
# 8:  8 57.4
# 9:  9 54.6
#10: 10 34.5
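
The same idea can be extended to several computed columns. The following is only a sketch: column_names is a hypothetical character vector that is not part of the original question, and the second variant (renaming afterwards with data.table's setnames) is an alternative I am suggesting, not something the answer above proposes.

```r
library(data.table)

set.seed(123)
dt <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10, 10))

# Hypothetical variable holding the desired column names (an assumption for this example)
column_names <- c("mean_a", "mean_b")

# Variant 1: setNames() names the list built in j, so the names come from the variable
res1 <- dt[, setNames(list(mean(a), mean(b)), column_names), by = id]

# Variant 2: let data.table assign V1 and V2, then rename by reference afterwards
res2 <- dt[, list(mean(a), mean(b)), by = id]
setnames(res2, c("V1", "V2"), column_names)

names(res1)  # "id" "mean_a" "mean_b"
names(res2)  # "id" "mean_a" "mean_b"
```

Variant 1 keeps everything in a single expression, while variant 2 may be easier to read when the list in j is long. Both should produce the same table.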

Data

set.seed(123)
dt <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10,10))
column_name <- "mean"
