Retain values based on event occurrences

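Problem description

I have the following data frame df1. How can I keep the row values that follow six consecutive occurrences of w? For example, for id 1 the sixth consecutive w is in t6, and I want to save the values from that last w onward in a new data frame. If a row does not meet the condition, I want to drop it, as with ids 3, 4, 5, and 6.

Input: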

  id t1 t2 t3 t4 t5 t6 t7 t6 t8 t9
  1  w  w  w  w  w  w  t  t  w  s
  2  w  w  w  w  w  w  t  t  o  s
  3  w  s  s  o  w  w  t  t  o  s
  4  w  s  s  o  o  w  t  t  o  s
  5  w  s  s  s  s  s  w  w  s  s
  6  s  s  s  w  t  t  w  w  w  s

Output:

  id t1 t2 t3 t4 t5 t6 t7 t6 t8 t9
  1                 w  t  t  w  s
  2                 w  t  t  o  s
  

Sample data:

df1<-structure(list(id=c(1,2,3,4,5,6), t1=c("w","w","w","w","w","s"), t2=c("w","w","s","s","s","s"),t3 = c("w","w","s","s","s","s"),
                    t4 = c("w","w","o","o","s","w"), t5 = c("w","w","w","o","s","t"), t6 = c("w","w","w","w","s","t"),
                    t7 = c("t","t","t","t","w","w"),t6 = c("t","t","t","t","w","w"), t8 = c("w","o","o","o","s","w"), t9=c("s","s","s","s","s","s")), row.names = c(NA, 6L), class = "data.frame")

Also, if w occurs more than 6 times in a row, how can I keep the values up to the first w?

Input data:

 id t1 t2 t3 t4 t5 t6 t7 t6 t8 t9
 1  w  w  w  w  w  w  t  t  w  s
 2  w  s  s  o  o  w  t  t  o  s
 3  w  s  s  o  w  w  t  t  o  s
 4  w  s  s  o  o  w  t  t  o  s
 5  w  s  s  s  s  s  w  w  s  s
 6  s  w  w  w  w  w  w  w  w  s

Output data:

  id t1 t2 t3 t4 t5 t6 t7 t6 t8 t9
  6  s  w  


Sample data:

df1<-structure(list(id=c(1,2,3,4,5,6), t1=c("w","w","w","w","w","s"), t2=c("w","s","s","s","s","w"),t3 = c("w","s","s","s","s","w"),
                    t4 = c("w","o","o","o","s","w"), t5 = c("w","o","w","o","s","w"), t6 = c("w","w","w","w","s","w"),
                    t7 = c("t","t","t","t","w","w"),t6 = c("t","t","t","t","w","w"), t8 = c("w","o","o","o","s","w"), t9=c("s","s","s","s","s","s")), row.names = c(NA, 6L), class = "data.frame")

Tags: r, dataframe

Solution


Using %in% and rowSums:

df1[rowSums(t(apply(df1[2:7], 1, `%in%`, "w"))) == 6, -(2:6)]
#   id t6 t7 t8 t9 t10
# 1  1  w  t  t  w   s
# 2  2  w  t  t  o   s
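
The same filter can also be written without apply, since comparing a data frame to "w" already yields a logical matrix. The following is a minimal sketch, not the answerer's code: the names cols, first_six, keep, and result are illustrative, and the data is the answer's df1 rebuilt compactly. Like the one-liner above, it only checks whether t1 through t6 are all "w"; it does not look for a run of six elsewhere in the row.

```r
# Same values as the answer's df1, built compactly (one string per column).
cols <- c(t1 = "wwwwws", t2 = "wwssss", t3 = "wwssss", t4 = "wwoosw",
          t5 = "wwwost", t6 = "wwwwst", t7 = "ttttww", t8 = "ttttww",
          t9 = "wooosw", t10 = "ssssss")
df1 <- data.frame(id = 1:6,
                  lapply(cols, function(s) strsplit(s, "")[[1]]),
                  stringsAsFactors = FALSE)

# A row qualifies when all six of t1..t6 equal "w".
first_six <- paste0("t", 1:6)
keep <- rowSums(df1[first_six] == "w") == 6

# Keep id plus t6 onward, i.e. drop t1..t5 by name rather than by position.
result <- df1[keep, setdiff(names(df1), paste0("t", 1:5))]
print(result)
```

Selecting columns by name instead of -(2:6) is a small design choice that keeps the filter correct if the column order ever changes.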

Edit

Or use rle to count the consecutive "w"s and if/else to handle cases like this one:

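res <- apply(df2, 1, function(x) {
  r <- rle(x)
  # index of the first run of at least six consecutive "w"s (NA if none)
  w <- which(r$lengths >= 6 & r$values == "w")[1]
  if (is.na(w)) return(NA)
  end <- sum(r$lengths[seq_len(w)])   # position in x where that run ends
  start <- end - r$lengths[w] + 1     # position in x where that run starts
  if (r$lengths[w] == 6)
    x[c(1, end:length(x))]            # exactly six: id plus the last "w" onward
  else
    x[1:start]                        # more than six: id up to the first "w"
})
res[!is.na(res)]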
# $`1`
# id  t6  t7  t8  t9 t10 
# "1" "w" "t" "t" "w" "s" 
# 
# $`6`
# id  t1  t2 
# "6" "s" "w" 
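
To make the rle step easier to follow, here is a small sketch on the id 6 row of df2 (without the id column). It is an illustration rather than part of the original answer, and the names x, starts, ends, long_w, and run_info are assumptions introduced here. The point it shows is that rle reports runs, so a run's index has to be converted to a position in the row with cumsum before it can be used for subsetting.

```r
# Values of t1..t10 for id 6 in df2.
x <- c("s", "w", "w", "w", "w", "w", "w", "w", "w", "s")

r <- rle(x)                        # runs: "s" x 1, "w" x 8, "s" x 1
ends   <- cumsum(r$lengths)        # last position of each run within x
starts <- ends - r$lengths + 1     # first position of each run within x

# Runs of at least six consecutive "w"s, identified by run index.
long_w <- which(r$values == "w" & r$lengths >= 6)

run_info <- data.frame(value = r$values, length = r$lengths,
                       start = starts, end = ends)
print(run_info[long_w, ])
```

Here the qualifying run is the second one, and it covers positions 2 through 9 of x, which corresponds to t2 through t9.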

Data:

df1 <- structure(list(id = c(1, 2, 3, 4, 5, 6), t1 = c("w", "w", "w", 
"w", "w", "s"), t2 = c("w", "w", "s", "s", "s", "s"), t3 = c("w", 
"w", "s", "s", "s", "s"), t4 = c("w", "w", "o", "o", "s", "w"
), t5 = c("w", "w", "w", "o", "s", "t"), t6 = c("w", "w", "w", 
"w", "s", "t"), t7 = c("t", "t", "t", "t", "w", "w"), t8 = c("t", 
"t", "t", "t", "w", "w"), t9 = c("w", "o", "o", "o", "s", "w"
), t10 = c("s", "s", "s", "s", "s", "s")), row.names = c(NA, 
6L), class = "data.frame")

df2 <- structure(list(id = c(1, 2, 3, 4, 5, 6), t1 = c("w", "w", "w", 
"w", "w", "s"), t2 = c("w", "s", "s", "s", "s", "w"), t3 = c("w", 
"s", "s", "s", "s", "w"), t4 = c("w", "o", "o", "o", "s", "w"
), t5 = c("w", "o", "w", "o", "s", "w"), t6 = c("w", "w", "w", 
"w", "s", "w"), t7 = c("t", "t", "t", "t", "w", "w"), t8 = c("t", 
"t", "t", "t", "w", "w"), t9 = c("w", "o", "o", "o", "s", "w"
), t10 = c("s", "s", "s", "s", "s", "s")), row.names = c(NA, 
6L), class = "data.frame")
