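r - Replace NA with blanks for certain column classes in RStudio
Problem description
I am trying to replace the NAs in specific columns of a large dataset with more than 100 columns. Replacing the NAs in each column individually is not feasible. However, I only want to do this when the column is not a date, because it throws an error when the columns are dates.
I tried the following, without success.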
library(Hmisc)
library(data.table)
applenames <- which(sapply(appleorigin,class) %nin% "Date")
names(appleorigin[,c(applenames)])
appleorigin[which(is.na(appleorigin)),applenames] <- ""
The code is applied to the following sample data.
GroupID Number_of_Apples Date Total_Apples Grocer Farm
1 NA 2000-03-01 NA Henry Applefarms
NA 5 2000-03-01 5 Henry NA
NA NA 2000-03-01 5 Henry Applefarms
1 8 2000-03-01 13 Jane Applefarms
2 2 2000-03-01 2 Henry Hillbasin
3 4 NA 4 Jane Overgrown
3 NA 2000-03-01 5 Julie RedLads
3 1 2000-03-01 6 John Yesteryear
4 2 2000-02-01 NA NA FujiFresh
4 NA 2000-02-01 2 Mai Appleseed
5 NA 2000-01-01 0 Joy Yesteryear
5 0 2000-01-01 0 Mai Applefarms
This is the goal:
GroupID Number_of_Apples       Date Total_Apples Grocer       Farm
      1                  2000-03-01               Henry Applefarms
                       5 2000-03-01            5  Henry
                         2000-03-01            5  Henry Applefarms
      1                8 2000-03-01           13   Jane Applefarms
      2                2 2000-03-01            2  Henry  Hillbasin
      3                4         NA            4   Jane  Overgrown
      3                  2000-03-01            5  Julie    RedLads
      3                1 2000-03-01            6   John Yesteryear
      4                2 2000-02-01                      FujiFresh
      4                  2000-02-01            2    Mai  Appleseed
      5                  2000-01-01            0    Joy Yesteryear
      5                0 2000-01-01            0    Mai Applefarms
I also tried to do this with a loop, but without success. Any help would be appreciated.
Solution
You can use:
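# TRUE for every column whose class does not include "Date"
cols <- !sapply(appleorigin, function(x) "Date" %in% class(x))
# Replace NA with '' in those columns only (they become character columns)
appleorigin[cols][is.na(appleorigin[cols])] <- ''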
#   GroupID Number_of_Apples       Date Total_Apples Grocer       Farm
#1        1                  2000-03-01               Henry Applefarms
#2                         5 2000-03-01            5  Henry
#3                           2000-03-01            5  Henry Applefarms
#4        1                8 2000-03-01           13   Jane Applefarms
#5        2                2 2000-03-01            2  Henry  Hillbasin
#6        3                4       <NA>            4   Jane  Overgrown
#7        3                  2000-03-01            5  Julie    RedLads
#8        3                1 2000-03-01            6   John Yesteryear
#9        4                2 2000-02-01                      FujiFresh
#10       4                  2000-02-01            2    Mai  Appleseed
#11       5                  2000-01-01            0    Joy Yesteryear
#12       5                0 2000-01-01            0    Mai Applefarms
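Note that an empty string can only be stored in a character vector, so the integer columns are converted to character by this replacement. The same idea could also be written column by column, which makes that conversion explicit. This is only a rough sketch, not part of the original answer: the helper name blank_non_date_na and the small demo data frame are made up for illustration.

```r
# Hypothetical helper (name is illustrative, not from the original answer):
# blanks out NA in every column that is not of class Date.
blank_non_date_na <- function(df) {
  # TRUE for columns that inherit from class "Date"
  is_date <- vapply(df, inherits, logical(1), what = "Date")
  # Convert each non-Date column to character, then replace NA with ""
  df[!is_date] <- lapply(df[!is_date], function(x) {
    x <- as.character(x)
    x[is.na(x)] <- ""
    x
  })
  df
}

# Small made-up data frame, used only to demonstrate the helper
demo <- data.frame(
  GroupID = c(1L, NA, 3L),
  Date    = as.Date(c("2000-03-01", NA, "2000-01-01")),
  Grocer  = c("Henry", NA, "Jane"),
  stringsAsFactors = FALSE
)

result <- blank_non_date_na(demo)
print(result)
print(sapply(result, class))
```

With the data from this question it would be called as appleorigin <- blank_non_date_na(appleorigin). The Date column is left untouched, so its missing values still print as <NA>.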
Data
appleorigin <- structure(list(GroupID = c(1L, NA, NA, 1L, 2L, 3L, 3L, 3L, 4L,
4L, 5L, 5L), Number_of_Apples = c(NA, 5L, NA, 8L, 2L, 4L, NA,
1L, 2L, NA, NA, 0L), Date = structure(c(11017, 11017, 11017,
11017, 11017, NA, 11017, 11017, 10988, 10988, 10957, 10957), class = "Date"),
Total_Apples = c(NA, 5L, 5L, 13L, 2L, 4L, 5L, 6L, NA, 2L,
0L, 0L), Grocer = c("Henry", "Henry", "Henry", "Jane", "Henry",
"Jane", "Julie", "John", NA, "Mai", "Joy", "Mai"), Farm = c("Applefarms",
NA, "Applefarms", "Applefarms", "Hillbasin", "Overgrown",
"RedLads", "Yesteryear", "FujiFresh", "Appleseed", "Yesteryear",
"Applefarms")), row.names = c(NA, -12L), class = "data.frame")