r - str_detect also finding NA in filter
Problem description
I want to filter out rows where a column contains a string. I am using a tidyverse solution. The problem I'm having is that str_detect also seems to be matching NA values, so those rows are removed by my filter as well:
df1 = data.frame(x1 = c("PI", NA, "Yes", "text"),
x2 = as.character(c(NA, 1, NA,"text")),
x3 = c("foo", "bar","foo", "bar"))
> df1
x1 x2 x3
1 PI <NA> foo
2 <NA> 1 bar
3 Yes <NA> foo
4 text text bar
#remove rows which have "PI" in column `x1`:
df2 = df1%>%
filter(!str_detect(x1, "(?i)pi"))
> df2
x1 x2 x3
1 Yes <NA> foo
2 text text bar
How do I prevent str_detect from matching NA?
Solution
Add a condition with is.na and |. The NA issue arises because str_detect returns NA for NA elements, and filter drops any row where the condition evaluates to NA:
library(dplyr)
library(stringr)
df1 %>%
filter(is.na(x1) |
str_detect(x1, regex("pi", ignore_case = TRUE), negate = TRUE))
-output
x1 x2 x3
1 <NA> 1 bar
2 Yes <NA> foo
3 text text bar
To see why, check the output of str_detect on its own:
with(df1, str_detect(x1, regex("pi", ignore_case = TRUE), negate = TRUE))
[1] FALSE NA TRUE TRUE
The NA remains as such unless we make it TRUE:
with(df1, str_detect(x1, regex("pi", ignore_case = TRUE), negate = TRUE)|is.na(x1))
[1] FALSE TRUE TRUE TRUE
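The `| is.na(x1)` trick relies on R's three-valued logic: `NA | TRUE` evaluates to `TRUE`, while `NA | FALSE` stays `NA`. Here is a minimal base R sketch of that behaviour. The vectors `detected` and `missing` are hypothetical stand-ins, typed by hand to mirror the str_detect and is.na results for the four rows of df1:

```r
# Hand-written stand-in for str_detect(x1, ..., negate = TRUE) on df1
detected <- c(FALSE, NA, TRUE, TRUE)

# Hand-written stand-in for is.na(x1) on the same four rows
missing <- c(FALSE, TRUE, FALSE, FALSE)

# Element-wise OR: the NA in position 2 is paired with TRUE, so it becomes TRUE
keep <- detected | missing
print(keep)

# The underlying rule: OR with TRUE is always TRUE, OR with FALSE keeps the NA
print(NA | TRUE)
print(NA | FALSE)
```

Because only the rows where x1 is missing have `TRUE` on the right-hand side, the rows where str_detect genuinely returned FALSE are still dropped.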
Another option is to coalesce with TRUE, so that all the NA elements returned by str_detect change to TRUE:
df1 %>%
filter(coalesce(str_detect(x1, regex("pi", ignore_case = TRUE),
negate = TRUE), TRUE))
x1 x2 x3
1 <NA> 1 bar
2 Yes <NA> foo
3 text text bar
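If a tidyverse solution is not a hard requirement, a base R variant may also be worth considering. This swaps in a different technique from the answer above: `grepl()` returns `FALSE` rather than `NA` for missing input, so negating it keeps the NA rows without an extra is.na check. A sketch using the df1 from the question (the names `keep` and `df2` are illustrative):

```r
df1 <- data.frame(x1 = c("PI", NA, "Yes", "text"),
                  x2 = as.character(c(NA, 1, NA, "text")),
                  x3 = c("foo", "bar", "foo", "bar"))

# grepl() yields FALSE (not NA) for the NA element of x1,
# so !grepl() is TRUE there and the row is kept
keep <- !grepl("pi", df1$x1, ignore.case = TRUE)

# Subset rows with the logical vector; only the "PI" row is dropped
df2 <- df1[keep, ]
print(df2)
```

Note that base subsetting keeps the original row names (2, 3, 4), whereas filter renumbers them.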