How to replace values with NA based on two conditions in R dplyr?

Problem description

I need to replace numeric values with NA based on conditions on two other columns.

Here is my reproducible example:

library(dplyr)

data1 <- read.csv(text = "
  site,day,biomass,aereal,root,ei.obs
  siteA,50,464.65,2020.3,307.3,0.84
  siteA,NA,NA,NA,NA,NA
  siteA,NA,NA,NA,NA,NA
  siteA,59,1222.565,2159.5,148.3,0.93
  siteA,NA,NA,NA,NA,NA
  siteA,NA,NA,NA,NA,NA
  siteA,66,1250.86,2046.8,159.1,0.92
  siteB,50,464.65,2020.3,307.3,0.84
  siteB,NA,NA,NA,NA,NA
  siteB,NA,NA,NA,NA,NA
  siteB,59,1222.565,2159.5,148.3,0.93
  siteB,NA,NA,NA,NA,NA
  siteB,NA,NA,NA,NA,NA
  siteB,66,1250.86,2046.8,159.1,0.92")


data1.1 <- data1 %>% 
  mutate(ei.obs =  if_else(site == "siteA" & day == 66, NA , ei.obs)) 

This is the error I get:

Error: Problem with `mutate()` input `ei.obs`.
x `false` must be a logical vector, not a double vector.
i Input `ei.obs` is `if_else(site == "siteA" & day == 66, NA, ei.obs)`.

Alternatively, I tried this:

data1.1 <- data1 %>% 
  mutate(ei.obs =  na_if(ei.obs, site == "siteA" & day == 66)) 

But nothing changes in the data frame.

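The expected result is the same data frame, with ei.obs replaced by NA only in the row where site is siteA and day is 66.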

Tags: r, dataframe, dplyr

Solution


The simple solution is to use ifelse instead of if_else:

library(dplyr)
data1.1 <- data1 %>% mutate(ei.obs =  ifelse(site == "siteA" & day == 66, NA , ei.obs))

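if_else requires its true and false arguments to be the same type. A bare NA is logical while ei.obs is double, which is why you get the error; use NA_real_ instead. (dplyr 1.1.0 and later relax this check, as far as I know, so a bare NA is accepted there.)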

library(dplyr)
data1.1 <- data1 %>% mutate(ei.obs =  if_else(site == "siteA" & day == 66, NA_real_, ei.obs))
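To see why the bare NA trips if_else but not ifelse, it can help to inspect the types directly. This is a minimal sketch in base R; the vector x is made up for illustration and is not the full data from the question.

```r
# A bare NA is logical; R also provides typed missing values.
class(NA)             # "logical"
class(NA_real_)       # "numeric"
class(NA_integer_)    # "integer"
class(NA_character_)  # "character"

# Base ifelse() does not check types: the logical NA is silently
# coerced to double when the result vector is assembled.
x <- c(0.84, 0.93, 0.92)  # hypothetical values for illustration
res <- ifelse(x > 0.9, NA, x)
res         # 0.84   NA   NA
class(res)  # "numeric"
```

The silent coercion is convenient here, but it is also the reason if_else is stricter: it catches cases where the two branches accidentally have different types.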

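By the way, pasting the data as written leaves leading whitespace in the site column, so site == "siteA" never matches and no row is changed. Remove it with trimws before running the code above: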

data1$site <- trimws(data1$site)
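Putting the pieces together, here is a rough end-to-end sketch. It uses a shortened, hypothetical version of the example data (fewer rows and columns), and it strips the whitespace at read time with the strip.white argument of read.csv instead of calling trimws afterwards; either approach should work.

```r
library(dplyr)

# Shortened, hypothetical version of the example data, keeping the
# leading spaces that pasting introduced.
data1 <- read.csv(text = "
  site,day,ei.obs
  siteA,59,0.93
  siteA,NA,NA
  siteA,66,0.92
  siteB,66,0.92", strip.white = TRUE)  # drop padding around character fields

# Typed NA keeps both branches of if_else() as double.
data1.1 <- data1 %>%
  mutate(ei.obs = if_else(site == "siteA" & day == 66, NA_real_, ei.obs))

# Only the siteA / day 66 row is newly set to NA; siteB / day 66 is kept.
data1.1
```

Note that in rows where day is NA the condition itself is NA, and if_else then returns its missing value (NA by default). That is harmless here because ei.obs is already NA in those rows, but it is worth keeping in mind for data where that is not the case.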
