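r - How to remove rows containing certain strings from this csv file
Problem description
The file I am reading is very large, and certain strings appear many times throughout it. I need to go through the file and remove every row that contains one of these specific strings or NA.
I have tried using the grep function to get rid of the strings, but it only removed the first occurrence and left the other identical strings in place.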
RAO <- readr::read_csv(file = "RateAddOnsExcel.csv")
RAO$...4 <- NULL
RAO$...5 <- NULL
RAO$Quarter. <- NULL
names(RAO)[1:13] = c("ProviderName","AIMNumber", "ChainName",
"RateEffectiveDate", "ComponentTotal",
"VentAddOn", "QualityAddOn",
"SpecialCareUnitAddOn", "AssessmentAddOn",
"SelectedExpenditureAddOn", "RateReduction",
"CaseMixRate", "CaseMixAssessment")
RAO$AIMNumber <- NULL
RAO$ChainName <- NULL
names(RAO)[1:13] = c("ProviderName","AIMNumber", "ChainName",
"RateEffectiveDate", "ComponentTotal",
"VentAddOn", "QualityAddOn",
"SpecialCareUnitAddOn", "AssessmentAddOn",
"SelectedExpenditureAddOn", "RateReduction",
"CaseMixRate", "CaseMixAssessment")
RAO <- RAO[-which(apply(RAO, 1, function(x)all(is.na(x)))),]
View(RAO)
remove.list <- paste(c("Myers", "Provider", "NA", "JJ"), collapse = '|')
RAO %>% filter(!grepl(remove.list, RAO$ProviderName))
RAO %>% filter(!str_detect(RAO$ProviderName, remove.list))
I want to get rid of the rows that contain the specific strings I listed there.
Solution
library(dplyr)
# simulate some data
set.seed(12345)
RAO <- data.frame(A = sample(c("Myers", "Provider", "NA", "JJ", "Stack", "Overflow"), 50, replace = TRUE),
                  B = rnorm(50))
head(RAO)
# A B
# 1 Stack 1.8050975
# 2 Overflow -0.4816474
# 3 Stack 0.6203798
# 4 Overflow 0.6121235
# 5 NA -0.1623110
# 6 Myers 0.8118732
# Remove rows where column A contains Myers, Provider or NA
RAO %>% filter(!grepl("Myers|Provider|NA", A))
# A B
# 1 Stack 1.80509752
# 2 Overflow -0.48164736
# 3 Stack 0.62037980
# 4 Overflow 0.61212349
# 5 JJ 2.04919034
# 6 Stack 1.63244564
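Two details may explain why rows seemed to survive in the question. Calling filter() only returns a new data frame, so the result has to be assigned back. Also, the pattern "NA" matches the literal text "NA", not a real missing value. The sketch below uses a small made-up data frame (the values are assumptions, not the asker's data) and base R subsetting so that it runs without extra packages.

```r
# Small hypothetical data frame; the second row holds a real missing value
RAO <- data.frame(
  A = c("Myers", NA, "Stack", "NA", "Overflow", "Provider"),
  B = 1:6,
  stringsAsFactors = FALSE
)

pattern <- "Myers|Provider|NA"

# grepl() returns FALSE for a missing value, so the pattern "NA" only
# matches the literal text "NA"; missing values need is.na() as well
keep <- !grepl(pattern, RAO$A) & !is.na(RAO$A)

# Assign the result back; otherwise RAO itself is left unchanged
RAO <- RAO[keep, ]
# dplyr equivalent: RAO <- RAO %>% filter(!grepl(pattern, A), !is.na(A))

print(RAO)
```

If the file is read with readr::read_csv, cells containing NA are probably parsed as missing values by default, which is why the is.na() check is likely to matter for the original data.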
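If the values in column A contain several words and you only want to remove the rows whose value starts with one of these three words, add the "^" anchor to the regular expression passed to grepl: grepl("^Myers|^Provider|^NA", A)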
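To make the difference between the anchored and unanchored patterns concrete, here is a minimal sketch on a handful of invented multi-word values (the provider names are hypothetical and chosen only for illustration).

```r
# Hypothetical multi-word values for illustration
providers <- c("Myers Care Center", "Saint Myers Home", "NATIONAL Health",
               "Provider Group", "Stack Overflow")

# Unanchored: matches the words anywhere in the value
unanchored <- grepl("Myers|Provider|NA", providers)

# Anchored: matches only values that start with one of the words
anchored <- grepl("^Myers|^Provider|^NA", providers)

# An equivalent, more compact spelling groups the alternatives once
grouped <- grepl("^(Myers|Provider|NA)", providers)

print(providers[!unanchored])  # values kept by the unanchored filter
print(providers[!anchored])    # values kept by the anchored filter
```

With the anchored pattern, "Saint Myers Home" is kept because "Myers" is not at the start of the value. Note that "^NA" still matches "NATIONAL Health", since the pattern only checks the leading characters.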