r - R: Merge consecutive rows based on the previous row
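Problem description
I am using R to work with a speaking-time dataset (a time series). The data shows the In and Out times of different speakers. I would like to (1) combine consecutive rows that belong to the same speaker, in chronological order, and (2) for each merged row, keep the In time from the first row of the run and the Out time from the last row of the run. Could someone guide me on how to approach this? Many thanks!
Original data: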
structure(list(In = c("15:22.5", "15:31.2", "15:38.1", "15:45.4",
"15:52.2", "16:11.0", "16:23.4", "16:35.3", "16:36.9", "16:47.4",
"17:06.0", "17:11.2", "17:18.7", "17:26.7", "17:34.6", "18:05.0",
"18:17.5", "18:28.9", "18:32.4", "19:00.4", "19:41.3", "20:01.6"
), Out = c("15:27.7", "15:36.9", "15:45.2", "15:52.0", "16:10.8",
"16:22.0", "16:35.0", "16:36.8", "16:37.8", "17:04.8", "17:08.3",
"17:17.0", "17:23.8", "17:27.2", "18:04.3", "18:06.0", "18:24.3",
"18:31.8", "18:59.1", "19:40.2", "19:53.1", "20:19.1"), Speaker = c("Y",
"Y T", "Y", "T", "ATA", "Y", "T", "T", "Y", "Y T", "Y", "T",
"Y", "Y", "Y", "Y", "A T", "T", "T", "T", "T", "A TY")), class = "data.frame", row.names = c(NA,
-22L))
Expected output:
structure(list(In = c("15:22.5", "15:31.2", "15:38.1", "15:45.4",
"15:52.2", "16:11.0", "16:23.4", "16:36.9", "16:47.4", "17:06.0",
"17:11.2", "17:18.7", "18:17.5", "18:28.9", "20:01.6"), Out = c("15:27.7",
"15:36.9", "15:45.2", "15:52.0", "16:10.8", "16:22.0", "16:36.8",
"16:37.8", "17:04.8", "17:08.3", "17:17.0", "18:06.0", "18:24.3",
"19:53.1", "20:19.1"), Speaker = c("Y", "Y T", "Y", "T", "ATA",
"Y", "T", "Y", "Y T", "Y", "T", "Y", "A T", "T", "A TY")), class = "data.frame", row.names = c(NA,
-15L))
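Solution
We can use rleid from data.table, i.e. create a grouping variable from runs of identical adjacent values in "Speaker", then summarise to take the first value of the "In" column and the last value of the "Out" column. The code below assumes the original data is stored in a data frame named df1.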
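library(dplyr)
library(data.table)
df1 %>%
  # rleid() gives each run of identical adjacent Speaker values its own id
  group_by(grp = rleid(Speaker), Speaker) %>%
  # keep the first In and the last Out of each run
  summarise(In = first(In), Out = last(Out), .groups = 'drop') %>%
  # drop grp and restore the original column order
  select(names(df1))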
Output:
# A tibble: 15 x 3
#    In      Out     Speaker
#    <chr>   <chr>   <chr>
#  1 15:22.5 15:27.7 Y
#  2 15:31.2 15:36.9 Y T
#  3 15:38.1 15:45.2 Y
#  4 15:45.4 15:52.0 T
#  5 15:52.2 16:10.8 ATA
#  6 16:11.0 16:22.0 Y
#  7 16:23.4 16:36.8 T
#  8 16:36.9 16:37.8 Y
#  9 16:47.4 17:04.8 Y T
# 10 17:06.0 17:08.3 Y
# 11 17:11.2 17:17.0 T
# 12 17:18.7 18:06.0 Y
# 13 18:17.5 18:24.3 A T
# 14 18:28.9 19:53.1 T
# 15 20:01.6 20:19.1 A TY
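To make the grouping step more concrete, here is a minimal base R sketch of the same run-based idea, without data.table or dplyr. It is not part of the original answer: the name df_small and the cumsum-based run id are assumptions introduced for illustration, and the three rows are taken from rows 7 to 9 of the original data.

```r
# Hypothetical three-row excerpt (rows 7-9 of the original data)
df_small <- data.frame(
  In      = c("16:23.4", "16:35.3", "16:36.9"),
  Out     = c("16:35.0", "16:36.8", "16:37.8"),
  Speaker = c("T", "T", "Y")
)

n <- nrow(df_small)
# A new run starts at row 1 and wherever Speaker differs from the previous row;
# the cumulative sum of those flags is a run id, similar in spirit to rleid()
grp <- cumsum(c(TRUE, df_small$Speaker[-1] != df_small$Speaker[-n]))

# Index of the first and last row of each run
starts <- which(!duplicated(grp))
ends   <- which(!duplicated(grp, fromLast = TRUE))

# In from the first row of each run, Out from the last row
merged <- data.frame(
  In      = df_small$In[starts],
  Out     = df_small$Out[ends],
  Speaker = df_small$Speaker[starts]
)
merged
```

Here the two consecutive "T" rows collapse into a single row running from 16:23.4 to 16:36.8, which agrees with rows 7 and 8 of the output above. The same steps should apply unchanged to the full df1.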