r - group_by: apply different functions to different columns based on their names
Question
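I have a large df with many columns (100+). Some column names end in "_e" and others end in "_se". I want to summarize these variables, grouped by the label column sign. For columns ending in "_e", I want the sum of the values. For those ending in "_se", the formula to use is the square root of the sum of the squared values: sqrt(sum(x^2)).
Can I aggregate my data with these different functions (sum and sqrt(sum(x^2))) depending on how each column is named? An additional condition is that the aggregation should only apply to columns whose names end in _e or _se, leaving out every column not named that way.
The desired result would then be a df with 2 rows, one for "cold" and one for "hot", holding the new aggregated values for the columns ending in "_e" or "_se". See the dummy data below: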
df1 <- structure(list(POC_e = c(58521L, 16161L, 1194L, 2699L, 1271L,
1967L, 28248L, 12160L, 14397L, 7079L, 13667L, 4951L, 6604L, 13696L,
12551L, 2772L, 1848L, 34065L, 33594L, 67847L, 25307L, 70405L,
93996L, 65963L, 39789L, 20335L, 15903L, 108016L, 31371L, 24278L,
50822L, 20552L, 37172L, 41811L, 28593L, 16834L, 26969L), POC_se = c(1291.92647266486,
939.72404052784, 190.346374094663, 388.212893466502, 208.822332096987,
252.213709480391, 1160.08565499059, 838.438264515793, 878.262309647022,
535.666492080041, 705.069168328457, 416.437334300563, 598.234978711058,
937.860748119978, 626.051576985209, 329.912572655009, 297.104027430243,
1103.07866902155, 1068.06832578293, 1581.78604703798, 1060.61990776234,
1419.56938660059, 1844.54038903284, 1766.10256323752, 1983.39556445415,
882.694867765916, 805.685353170852, 2313.56161072355, 920.266765311779,
967.61140849172, 1634.66447221862, 504.867156218601, 824.885933362813,
975.720818549529, 1096.29928336111, 777.860187976763, 952.665438551264
), BelPov_e = c(20565L, 3812L, 66L, 840L, 164L, 145L, 13560L,
3952L, 5483L, 2591L, 3093L, 1928L, 1149L, 4197L, 4125L, 336L,
190L, 8558L, 7332L, 14872L, 5141L, 14086L, 28506L, 10753L, 4411L,
5961L, 5799L, 22387L, 5189L, 3335L, 7407L, 867L, 3077L, 3329L,
2817L, 1597L, 2406L), BelPov_se = c(1068.87660902013, 411.317970001003,
23.6457754878744, 172.906389645903, 54.6876004289108, 55.5557916537185,
892.989254922357, 401.344733821059, 604.616650266614, 336.438879762679,
404.301023537906, 282.386361349728, 147.670323118653, 496.534865097384,
483.679751234656, 103.488187597609, 80.8533491461191, 648.06537390533,
661.960303317718, 953.091653167516, 484.435550214379, 1032.29133967345,
1525.69708374773, 869.176346634638, 637.864956679027, 655.848453420799,
525.336305385565, 1478.17209676848, 420.427043758192, 416.538929000497,
658.333683680529, 121.413258770268, 278.074805599612, 361.857390496834,
298.858858908146, 234.619743804892, 359.830153667996), sign = c("cold",
"cold", "cold", "cold", "cold", "cold", "cold", "cold", "cold",
"cold", "cold", "cold", "cold", "cold", "cold", "cold", "cold",
"hot", "hot", "hot", "hot", "hot", "hot", "hot", "hot", "cold",
"cold", "hot", "hot", "hot", "hot", "hot", "hot", "hot", "hot",
"hot", "hot")), row.names = c(NA, 37L), class = "data.frame")
Answer
In base R:
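# Select columns by suffix; "$" anchors the match to the end of the name
e <- df1[, grep("_e$", colnames(df1)), drop = FALSE]
se <- df1[, grep("_se$", colnames(df1)), drop = FALSE]
# Per group: plain sums for the _e columns, root sum of squares for the _se columns
e <- lapply(split(e, df1$sign), colSums)
se <- lapply(split(se, df1$sign), function(x) apply(x, 2, function(y) sqrt(sum(y^2))))
# Bind the per-group results into one row per group, _e columns first
cbind(do.call(rbind, e), do.call(rbind, se))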
Output:
      POC_e BelPov_e   POC_se BelPov_se
cold 236024    77956 3143.043  2062.604
hot  817384   146070 5732.776  3165.828
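Since the question title mentions group_by, the same aggregation can probably also be written with dplyr. The following is a minimal sketch, not part of the original answer, and it assumes dplyr 1.0.0 or later (for across()). The data frame toy and its values are hypothetical and only stand in for df1; ends_with("_e") does not pick up the "_se" columns because their names end in "se" preceded by "_s", not "_e".

```r
library(dplyr)

# Toy data (hypothetical values, used here instead of the question's df1)
toy <- data.frame(
  A_e   = c(1, 2, 3, 4),
  A_se  = c(3, 4, 6, 8),
  other = c(10, 20, 30, 40),  # no _e/_se suffix, so it is not aggregated
  sign  = c("cold", "cold", "hot", "hot")
)

res <- toy %>%
  group_by(sign) %>%
  summarise(
    across(ends_with("_e"), sum),                 # plain sum for *_e
    across(ends_with("_se"), ~ sqrt(sum(.x^2))),  # root sum of squares for *_se
    .groups = "drop"
  )

print(res)
```

Columns that match neither suffix (other in the toy data) are simply left out of the result, which matches the extra condition stated in the question.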