Replace values in a column based on another column in dplyr

Question

Hello, I have a data frame such as

COL1 COL2 COL3 
A    nan  NaN 
B    ET1  Carnivora
C    ET1  NaN 
D    ET2  Fish
E    OK   Aves 
F    ET3  NaN 

and I have a list

List_ET<-c("ET1","ET2","ET3","nan")

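If the corresponding df$COL2 value is present in that list, I want to replace the df$COL3 value with Unknown, but only when df$COL3 is NaN. If df$COL3 is not NaN, I do nothing.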

Then I should get:

COL1 COL2 COL3 
A    nan  Unknown 
B    ET1  Carnivora
C    ET1  Unknown 
D    ET2  Fish
E    OK   Aves 
F    ET3  Unknown 

Here is the data frame

structure(list(COL1 = structure(1:6, .Label = c("A", "B", "C", 
"D", "E", "F"), class = "factor"), COL2 = structure(c(4L, 1L, 
1L, 2L, 5L, 3L), .Label = c("ET1", "ET2", "ET3", "nan", "OK"), class = "factor"), 
    COL3 = structure(c(4L, 2L, 4L, 3L, 1L, 4L), .Label = c("Aves", 
    "Carnivora", "Fish", "NaN"), class = "factor")), class = "data.frame", row.names = c(NA, 
-6L))

So far I have tried

df$COL3[df$COL2 %in% List_ET]<- "Unknown" 

but this does not cover the "do nothing when df$COL3 is not NaN" part: it overwrites every row whose COL2 is in the list.

Tags: r, dataframe, dplyr, refactoring, factors

Answer


Edit: perhaps you are looking for this loop

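# Add "Unknown" as a level first; assigning a value that is not a level yields NA
levels(df$COL3) <- c(levels(df$COL3), "Unknown")

for(i in seq_along(df$COL3)){
  # Replace only when COL2 is in the list AND COL3 is currently "NaN"
  if(df$COL2[i] %in% List_ET && df$COL3[i] == "NaN"){
    df$COL3[i] <- "Unknown"
  }
}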

> df
  COL1 COL2      COL3
1    A  nan   Unknown
2    B  ET1 Carnivora
3    C  ET1   Unknown
4    D  ET2      Fish
5    E   OK      Aves
6    F  ET3   Unknown
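
The loop above can also be written as one vectorized assignment. This is a minimal sketch rather than part of the original answer. It assumes the data from the question, rebuilt here with data.frame() and stringsAsFactors = TRUE so that the columns are factors as in the dput output; idx is a helper name introduced only for illustration.

```r
# Rebuild the example data from the question (columns as factors)
df <- data.frame(
  COL1 = c("A", "B", "C", "D", "E", "F"),
  COL2 = c("nan", "ET1", "ET1", "ET2", "OK", "ET3"),
  COL3 = c("NaN", "Carnivora", "NaN", "Fish", "Aves", "NaN"),
  stringsAsFactors = TRUE
)
List_ET <- c("ET1", "ET2", "ET3", "nan")

# "Unknown" must exist as a level before it can be assigned to a factor
levels(df$COL3) <- c(levels(df$COL3), "Unknown")

# Rows where COL2 is in the list AND COL3 is the string "NaN"
idx <- df$COL2 %in% List_ET & df$COL3 == "NaN"
df$COL3[idx] <- "Unknown"

# Optional: drop levels that are no longer used
df$COL3 <- droplevels(df$COL3)
```

Here the element-wise & is the right operator, because both sides are vectors with one entry per row.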

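However, if you simply want to replace every NaN in COL3 with "Unknown" regardless of COL2, just rename the factor level, starting from the original data frame: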

levels(df$COL3) <- c("Aves", "Carnivora", "Fish", "Unknown")

> df$COL3
[1] Unknown   Carnivora Unknown   Fish      Aves      Unknown  
Levels: Aves Carnivora Fish Unknown

#OR
> df
  COL1 COL2      COL3
1    A  nan   Unknown
2    B  ET1 Carnivora
3    C  ET1   Unknown
4    D  ET2      Fish
5    E   OK      Aves
6    F  ET3   Unknown
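
Since the question mentions dplyr, the same condition could also be expressed with mutate() and if_else(). This is a sketch and not part of the original answer. It assumes the dplyr package is installed, rebuilds the question's data so the block runs on its own, and converts COL3 to character, so the result is a character column rather than a factor.

```r
library(dplyr)

# Rebuild the example data from the question (columns as factors)
df <- data.frame(
  COL1 = c("A", "B", "C", "D", "E", "F"),
  COL2 = c("nan", "ET1", "ET1", "ET2", "OK", "ET3"),
  COL3 = c("NaN", "Carnivora", "NaN", "Fish", "Aves", "NaN"),
  stringsAsFactors = TRUE
)
List_ET <- c("ET1", "ET2", "ET3", "nan")

df <- df %>%
  mutate(
    # Replace only when COL2 is in the list AND COL3 is "NaN";
    # otherwise keep the existing value (as character)
    COL3 = if_else(COL2 %in% List_ET & COL3 == "NaN",
                   "Unknown",
                   as.character(COL3))
  )
```

Wrap the result in factor() if a factor column is needed afterwards.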
