Replacing a value with the previous row's value based on a condition in R

Problem description

I am trying to clean up some data I have.

The current format has 4 variables (id, speaker, text, and dup):

id speaker text            dup
1  GHS     how are you     0
2  yea     yea             1
3  CHA     where is it     0
4  CHA     I cant find it  0
5  CHA     did you         0
6  what    what            1
7  CHA     did you find it 0

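dup is a variable I created to flag every row where speaker equals text. When that is the case, I want to replace speaker with the value from the row above it (see rows 2 and 6).

Desired format: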

id speaker text            dup
1  GHS     how are you     0
2  GHS     yea             1
3  CHA     where is it     0
4  CHA     I cant find it  0
5  CHA     did you         0
6  CHA     what            1
7  CHA     did you find it 0

Thanks in advance!

Tags: r

Solution


We can use replace to set the values in 'speaker' to NA wherever 'dup' is 1, and then use fill to carry the previous non-NA value forward:

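library(dplyr)
library(tidyr)
df %>%
  # blank out speaker on the flagged rows (dup == 1)
  mutate(speaker = replace(speaker, as.logical(dup), NA)) %>%
  # fill each NA with the last non-NA value above it (fill() defaults to "down")
  fill(speaker)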
#  id speaker            text dup
#1  1     GHS     how are you   0
#2  2     GHS             yea   1
#3  3     CHA     where is it   0
#4  4     CHA  I cant find it   0
#5  5     CHA         did you   0
#6  6     CHA            what   1
#7  7     CHA did you find it   0

Or in a single step, using na.locf0 from zoo:

library(zoo)
df$speaker <- with(df, na.locf0(replace(speaker, as.logical(dup), NA)))

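Or, if flagged rows never occur back to back, so that the row directly above always holds a valid speaker:

# dplyr::lag() shifts the values down by one row; stats::lag() only changes
# time-series attributes and would leave the values where they are
with(df, ifelse(dup == 1, dplyr::lag(speaker), speaker))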
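For comparison, the same carry-forward idea can be sketched in base R with no extra packages. This is only an illustration, not part of the original answer: carry_forward is a hypothetical helper name, and the sketch assumes dup contains only 0 and 1 with no missing values. Unlike a one-row lag, it also copes with several flagged rows in a row.

```r
# Same sample data as in the question.
df <- data.frame(
  id = 1:7,
  speaker = c("GHS", "yea", "CHA", "CHA", "CHA", "what", "CHA"),
  text = c("how are you", "yea", "where is it", "I cant find it",
           "did you", "what", "did you find it"),
  dup = c(0L, 1L, 0L, 0L, 0L, 1L, 0L),
  stringsAsFactors = FALSE
)

# carry_forward() is a hypothetical helper, not a function from any package.
# Assumes `flagged` holds only 0/1 (or FALSE/TRUE) with no NA values.
carry_forward <- function(x, flagged) {
  flagged <- as.logical(flagged)
  # Row position for unflagged rows, 0 for flagged rows.
  pos <- ifelse(flagged, 0L, seq_along(x))
  # Running maximum = index of the most recent unflagged row.
  idx <- cummax(pos)
  # Flagged rows with no unflagged row above them become NA.
  idx[idx == 0L] <- NA
  x[idx]
}

df$speaker <- carry_forward(df$speaker, df$dup)
df
```

Because the index is a running maximum rather than a one-row shift, a run such as dup = 0, 1, 1 takes the speaker from the first row for both flagged rows.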

Data

df <- structure(list(id = 1:7, speaker = c("GHS", "yea", "CHA", "CHA", 
"CHA", "what", "CHA"), text = c("how are you", "yea", "where is it", 
"I cant find it", "did you", "what", "did you find it"), dup = c(0L, 
1L, 0L, 0L, 0L, 1L, 0L)), class = "data.frame", row.names = c(NA, 
-7L))
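The question does not show how the dup flag was built. One plausible way, assuming the flag means exact string equality between speaker and text, is sketched below; the two-column data frame here is a trimmed copy of the sample data used only for illustration.

```r
# Trimmed copy of the sample data: only the two columns being compared.
df <- data.frame(
  speaker = c("GHS", "yea", "CHA", "CHA", "CHA", "what", "CHA"),
  text = c("how are you", "yea", "where is it", "I cant find it",
           "did you", "what", "did you find it"),
  stringsAsFactors = FALSE
)

# Assumption: dup is 1 exactly when speaker and text are the same string.
df$dup <- as.integer(df$speaker == df$text)
df
```

On this sample the comparison flags rows 2 and 6, which matches the dup column in the data above.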
