r - Need help creating a column calculated from three other columns in R
Problem description
I have a data file like this:
structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("2020-07-26 00:00:00|Monitor1.txt|01",
"2020-07-26 00:00:00|Monitor1.txt|02", "2020-07-26 00:00:00|Monitor1.txt|03",
"2020-07-26 00:00:00|Monitor1.txt|04", "2020-07-26 00:00:00|Monitor1.txt|05",
"2020-07-26 00:00:00|Monitor1.txt|06", "2020-07-26 00:00:00|Monitor1.txt|07",
"2020-07-26 00:00:00|Monitor1.txt|08", "2020-07-26 00:00:00|Monitor1.txt|09",
"2020-07-26 00:00:00|Monitor1.txt|10", "2020-07-26 00:00:00|Monitor1.txt|11",
"2020-07-26 00:00:00|Monitor1.txt|12", "2020-07-26 00:00:00|Monitor1.txt|13",
"2020-07-26 00:00:00|Monitor1.txt|14", "2020-07-26 00:00:00|Monitor1.txt|15",
"2020-07-26 00:00:00|Monitor1.txt|16", "2020-07-26 00:00:00|Monitor1.txt|17",
"2020-07-26 00:00:00|Monitor1.txt|18", "2020-07-26 00:00:00|Monitor1.txt|19",
"2020-07-26 00:00:00|Monitor1.txt|20", "2020-07-26 00:00:00|Monitor1.txt|21",
"2020-07-26 00:00:00|Monitor1.txt|22", "2020-07-26 00:00:00|Monitor1.txt|23",
"2020-07-26 00:00:00|Monitor1.txt|24", "2020-07-26 00:00:00|Monitor1.txt|25",
"2020-07-26 00:00:00|Monitor1.txt|26", "2020-07-26 00:00:00|Monitor1.txt|27",
"2020-07-26 00:00:00|Monitor1.txt|28", "2020-07-26 00:00:00|Monitor1.txt|29",
"2020-07-26 00:00:00|Monitor1.txt|30", "2020-07-26 00:00:00|Monitor1.txt|31",
"2020-07-26 00:00:00|Monitor1.txt|32"), class = "factor"), t = c(60,
120, 180, 240, 300, 360), activity = c(0L, 0L, 0L, 0L, 0L, 0L
), moving = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE), asleep = c(TRUE,
TRUE, TRUE, TRUE, TRUE, TRUE), Day = c(1, 1, 1, 1, 1, 1)), row.names = c(NA,
-6L), class = c("behavr", "data.table", "data.frame"), sorted = "id", metadata = structure(list(
id = structure(1L, .Label = c("2020-07-26 00:00:00|Monitor1.txt|01",
"2020-07-26 00:00:00|Monitor1.txt|02", "2020-07-26 00:00:00|Monitor1.txt|03",
"2020-07-26 00:00:00|Monitor1.txt|04", "2020-07-26 00:00:00|Monitor1.txt|05",
"2020-07-26 00:00:00|Monitor1.txt|06", "2020-07-26 00:00:00|Monitor1.txt|07",
"2020-07-26 00:00:00|Monitor1.txt|08", "2020-07-26 00:00:00|Monitor1.txt|09",
"2020-07-26 00:00:00|Monitor1.txt|10", "2020-07-26 00:00:00|Monitor1.txt|11",
"2020-07-26 00:00:00|Monitor1.txt|12", "2020-07-26 00:00:00|Monitor1.txt|13",
"2020-07-26 00:00:00|Monitor1.txt|14", "2020-07-26 00:00:00|Monitor1.txt|15",
"2020-07-26 00:00:00|Monitor1.txt|16", "2020-07-26 00:00:00|Monitor1.txt|17",
"2020-07-26 00:00:00|Monitor1.txt|18", "2020-07-26 00:00:00|Monitor1.txt|19",
"2020-07-26 00:00:00|Monitor1.txt|20", "2020-07-26 00:00:00|Monitor1.txt|21",
"2020-07-26 00:00:00|Monitor1.txt|22", "2020-07-26 00:00:00|Monitor1.txt|23",
"2020-07-26 00:00:00|Monitor1.txt|24", "2020-07-26 00:00:00|Monitor1.txt|25",
"2020-07-26 00:00:00|Monitor1.txt|26", "2020-07-26 00:00:00|Monitor1.txt|27",
"2020-07-26 00:00:00|Monitor1.txt|28", "2020-07-26 00:00:00|Monitor1.txt|29",
"2020-07-26 00:00:00|Monitor1.txt|30", "2020-07-26 00:00:00|Monitor1.txt|31",
"2020-07-26 00:00:00|Monitor1.txt|32"), class = "factor"),
file_info = list(list(path = "C:/Users/ariji/Desktop/ShinyWrapperForCircadianAnalysis/Monitor1.txt",
file = "Monitor1.txt")), region_id = 1L, experiment_id = "2020-07-26 00:00:00|Monitor1.txt",
start_datetime = structure(1595721600, class = c("POSIXct",
"POSIXt"), tzone = "UTC"), stop_datetime = structure(1596326400, class = c("POSIXct",
"POSIXt"), tzone = "UTC"), genotype = "Early", replicate = 1L,
uid = 1L), sorted = "id", class = c("data.table", "data.frame"
), row.names = c(NA, -1L)))
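The full file is here - https://anonymousfiles.io/1vypEs9u/ (read it with fread). What I need to do is create another column in this data.table, called noramct. The values in noramct should be activity divided by the sum of all activity for that day (1, 2, 3 ... 7). This has to be done per id. So basically, for each id, I want a normalized activity, that is, the activity of a particular id divided by that id's total activity on that particular day. Keep in mind that the daily activity sum has two levels: the activity must come from both a particular id and a particular day. This can be confusing, because a single day's activity contains counts from multiple ids. Any help will be much appreciated! Thanks in anticipation. This question has also been asked on Reddit.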
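Solution
I'm not sure whether this is what you mean, but I think it is enough to first create the denominator for each observation (which I understand to be the total activity for its corresponding id and day), and then simply divide each value by its corresponding denominator. Fortunately, this is very simple in data.table: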
# Total activity per (id, Day); the column is named Day (capital D) in the sample data
data[, day_activity_for_id := sum(activity), by = .(id, Day)
  ][, noramct := activity/day_activity_for_id]
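To see what the grouping does, here is a minimal sketch on a small made-up table. The table name toy and its values are hypothetical, chosen only for illustration. It also shows one thing to watch for: when an id has zero total activity on a day (as in the six sample rows above), the division is 0/0 and yields NaN. Replacing those with 0 is just one possible convention, an assumption on my part rather than something the question specifies.

```r
library(data.table)

# Toy data (hypothetical values): two ids, two days, two readings each
toy <- data.table(
  id       = rep(c("a", "b"), each = 4),
  Day      = rep(c(1, 1, 2, 2), times = 2),
  activity = c(1L, 3L, 0L, 0L, 2L, 2L, 5L, 0L)
)

# Same two-step approach: per-(id, Day) total, then the ratio
toy[, day_activity_for_id := sum(activity), by = .(id, Day)
  ][, noramct := activity / day_activity_for_id]

# A day with zero total activity gives 0/0 = NaN; setting it to 0
# is an assumed convention, not part of the original question
toy[is.nan(noramct), noramct := 0]

print(toy)
```

Within each (id, Day) group, the noramct values sum to 1 unless the group had no activity at all. If you do not need to keep the denominator column, the two steps can be collapsed into a single grouped assignment, noramct := activity / sum(activity) with the same by.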
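Also, a friendly suggestion for next time: your question is easier to understand if you show us a printout of the head of the table rather than the cumbersome structure! data.table prints it very cleanly in your console: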
head(data)