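r - Problems with tapply
Problem description
I am using tapply to combine a table by sample ID (SID). For the first sample in the list there are 3 measurements, but it shows up as only one.
I have 4 things that need to be passed to the new table. The first is the SID. The second is the average area of all measurements with that SID. The third is the average distance. The last is the number of measurements.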
cases_iTLS <- data.frame(unique(iTLS$SID))
colnames(cases_iTLS)[colnames(cases_iTLS)=="unique.iTLS.SID."] <- "SID"
cases_iTLS$SID <- factor(cases_iTLS$SID)
# Average of TLS on one slide for area
cases_iTLS$Area_iTLS <- tapply(iTLS$Area, iTLS$SID,FUN=mean)
# Average of TLS on one slide for distance
cases_iTLS$Distance_iTLS <- tapply(iTLS$Distance, iTLS$SID,FUN=mean)
# Number of measurements per SID
cases_iTLS$Count_iTLS <- tapply(iTLS$Region_Index, iTLS$SID,FUN=length)
SID     Region_index  Area      Distance  Type  Location
112906  1             53531.53  71.982    iTLS  intratumoral
112906  3             76809.61  97.384    iTLS  intratumoral
112906  5             40937.30   9.643    iTLS  intratumoral
112947  1             35071.66   2.067    iTLS  intratumoral
112947  3             17979.88  36.319    iTLS
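Solution
Because you need to run separate aggregate functions (mean and length) across multiple columns (Area, Distance, and Region_Index, grouped by SID), consider using aggregate for the grouped aggregation, since it returns a data frame.
Usually, tapply runs on a single numeric metric and returns a single named atomic vector, rather than working across columns or functions. The code below calls do.call + data.frame to bind the nested results of the multiple aggregates.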
aggregate
# AGGREGATE ACROSS COLS AND FUNCS
cases_iTLS <- aggregate(cbind(Area, Distance, Region_Index) ~ SID, iTLS,
                        function(x) c(mean = mean(x), count = length(x)))
# BIND NESTED, UNDERLYING RESULTS
cases_iTLS <- do.call(data.frame, cases_iTLS)
# KEEP NEEDED COLUMNS
cases_iTLS <- cases_iTLS[c("SID", "Area.mean", "Distance.mean", "Region_Index.count")]
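For anyone who wants to try the aggregate route without the full data set, here is a minimal, self-contained sketch built from the five rows posted in the question. It rests on a few assumptions: the Type and Location columns are left out because they are not aggregated, the index column is taken to be named Region_Index as in the question's code (the printed header shows Region_index), and the intermediate names agg and flat are only illustrative.

```r
# Sample rows copied from the question (Type and Location omitted).
# Assumption: the index column is named Region_Index, as in the question's code.
iTLS <- data.frame(
  SID          = c(112906, 112906, 112906, 112947, 112947),
  Region_Index = c(1, 3, 5, 1, 3),
  Area         = c(53531.53, 76809.61, 40937.30, 35071.66, 17979.88),
  Distance     = c(71.982, 97.384, 9.643, 2.067, 36.319)
)

# One row per SID; each aggregated column holds a nested (mean, count) matrix
agg <- aggregate(cbind(Area, Distance, Region_Index) ~ SID, data = iTLS,
                 FUN = function(x) c(mean = mean(x), count = length(x)))

# Flatten the nested matrices into ordinary columns such as Area.mean
flat <- do.call(data.frame, agg)

# Keep only the columns asked for in the question
cases_iTLS <- flat[c("SID", "Area.mean", "Distance.mean", "Region_Index.count")]
print(cases_iTLS)
```

With these rows the result should have one row per SID, with a count of 3 for 112906 and 2 for 112947. Because the SID column comes out of the same call as the summaries, there is no separate step in which the IDs and the aggregated values could fall out of order.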
tapply
If you want to go this route, consider building a matrix of the separate aggregations using tapply with rbind and the transpose function t:
cases_iTL_mat <- with(iTLS,
t(rbind(Area_mean = tapply(Area, SID, FUN=mean) ,
Distance_mean = tapply(Distance, SID, FUN=mean),
Region_count = tapply(Region_Index, SID, FUN=length)
))
)
by
And I would be remiss not to point out by (the object-oriented wrapper for tapply):
cases_iTL_mat <- do.call(rbind,
by(iTLS, iTLS$SID, function(sub) {
c(Area_mean = mean(sub$Area),
Distance_mean = mean(sub$Distance),
Region_count = length(sub$Region_Index))
})
)