r - Identify a given pattern in a vector and add the missing elements to obtain repetitions of the given pattern
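Problem description
This question is related to "Wide a dataframe and insert missing columns".
Suppose we have a given pattern of 5 elements in the following order: "A", "B", "C", "D", "E".
This pattern is repeated, say, 10 times, but some elements are sometimes missing (see the picture: my vector, in orange).
Is it possible in R to identify the repeated pattern and fill in the missing elements (see the picture: my desired output)?
My vector: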
my.vector <- c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "B", "C",
"D", "E", "B", "C", "D", "E", "B", "C", "D", "E", "B", "C", "D",
"E", "B", "C", "D", "E", "B", "C", "D", "E", "A", "B", "C", "D",
"E", "B")
my.vector
[1] "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "A" "B" "C" "D" "E" "B"
Graphical explanation: [Figure: my vector (orange) alongside my desired output]
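Solution
Using split (or any other grouping function, such as tapply), create a grouping index based on the diff of the matching positions in LETTERS[1:5]. Then take the union of LETTERS[1:5] with each group, unlist the resulting list, and unname it.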
unname( unlist(lapply(split(my.vector, cumsum(c(TRUE,
diff(match(my.vector, LETTERS[1:5])) != 1))),
function(x) union(LETTERS[1:5], x))))
Output:
[1] "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A"
[37] "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E"
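To make the one-liner above easier to follow, here is a minimal sketch that runs the same steps one at a time. The shortened vector x and the names pattern, idx, grp, pieces, and filled are hypothetical and used only for illustration.

```r
# Shortened, hypothetical input: 4 repetitions, some of them missing "A"
pattern <- LETTERS[1:5]
x <- c("A", "B", "C", "D", "E", "B", "C", "D", "E",
       "A", "B", "C", "D", "E", "B")

# 1. Position of each element within the pattern (A = 1, ..., E = 5)
idx <- match(x, pattern)

# 2. A new repetition starts wherever the position does not increase by exactly 1
grp <- cumsum(c(TRUE, diff(idx) != 1))

# 3. Split into repetitions, then fill each one up to the full pattern;
#    union() keeps the order of its first argument
pieces <- split(x, grp)
filled <- lapply(pieces, function(p) union(pattern, p))

# 4. Flatten the list and drop the names that unlist() generates
result <- unname(unlist(filled))
result
```

Note that the `!= 1` rule also starts a new group at a gap inside a repetition: an input such as "A", "B", "D", "E" would be treated as two repetitions. The approach therefore appears to assume that elements are missing only at the start or end of a repetition, as in the vector from the question.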
Alternatively, another option is complete from tidyr:
library(dplyr)
library(tidyr)
library(data.table)
tibble(col1 = my.vector) %>%
group_by(rn = rowid(col1)) %>%
complete(col1 = LETTERS[1:5]) %>%
ungroup %>%
pull(col1)
Output:
[1] "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A"
[37] "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E"
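The grouping column rn in the pipeline above comes from data.table's rowid, which numbers the occurrences of each distinct value. The sketch below shows what that index looks like, using ave from base R as a stand-in for rowid so that no packages are needed. The shortened vector x is hypothetical.

```r
# Shortened, hypothetical input: the second repetition is missing "A"
x <- c("A", "B", "C", "D", "E", "B", "C", "D", "E", "A", "B")

# Running count per distinct value: the first "A" gets 1, the second "A" gets 2, ...
# (base R stand-in for data.table::rowid(x))
rn <- ave(seq_along(x), x, FUN = seq_along)

# Show each element next to its occurrence number
data.frame(col1 = x, rn = rn)
```

Here the second "A" gets rn = 2 even though it belongs to the third repetition by position. The groups are therefore formed by occurrence count rather than by position in the vector. Since complete fills every rn group up to the full LETTERS[1:5] set, the final vector should still come out as whole repetitions of the pattern.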