r - How can we show 0 for factor levels with no observations when using dcast
Problem description
I have a data frame like this:
df<-structure(list(AEOUT = c("RECOVERED/RESOLVED", "RECOVERED/RESOLVED",
"RECOVERED/RESOLVED", "NOT RECOVERED/NOT RESOLVED", "NOT RECOVERED/NOT RESOLVED",
"RECOVERED/RESOLVED", "NOT RECOVERED/NOT RESOLVED", "FATAL",
"RECOVERED/RESOLVED", "NOT RECOVERED/NOT RESOLVED", "NOT RECOVERED/NOT RESOLVED",
"RECOVERED/RESOLVED", "RECOVERED/RESOLVED", "RECOVERED/RESOLVED",
"NOT RECOVERED/NOT RESOLVED", "NOT RECOVERED/NOT RESOLVED", "NOT RECOVERED/NOT RESOLVED",
"NOT RECOVERED/NOT RESOLVED"), AEREL1S = c("UNRELATED", "UNRELATED",
"UNRELATED", "UNRELATED", "UNRELATED", "UNRELATED", "UNRELATED",
"UNRELATED", "UNRELATED", "UNRELATED", "UNRELATED", "RELATED",
"RELATED", "RELATED", "RELATED", "UNRELATED", "UNRELATED", "UNRELATED"
)), row.names = c(NA, -18L), class = c("tbl_df", "tbl", "data.frame"
))
> test<-df %>%dcast(.,AEOUT~AEREL1S)
Using 'AEREL1S' as value column. Use 'value.var' to override
Aggregation function missing: defaulting to length
Warning message:
In dcast(., AEOUT ~ AEREL1S) :
The dcast generic in data.table has been passed a tbl_df and will attempt to redirect to the reshape2::dcast; please note that reshape2 is deprecated, and this redirection is now deprecated as well. Please do this redirection yourself like reshape2::dcast(.). In the next version, this warning will become an error.
AEOUT has 6 levels. How can I show 0 for the levels that do not appear in the data?
AEOUT = factor(AEOUT, levels = c("RECOVERED/RESOLVED","RECOVERED/RESOLVED WITH SEQUELAE", "RECOVERING/RESOLVING", "NOT RECOVERED/NOT RESOLVED", "FATAL", "UNKNOWN"))
I tried to summarise the data. Is it possible to keep a level even when it has 0 observations?
My current code is:
test<-df %>%dcast(.,AEOUT~AEREL1S)
and the output only has rows for the AEOUT values that occur in the data.
Solution
You can use the dplyr::count function with .drop = FALSE to do just what you want:
library(dplyr)
df %>%
  mutate(AEOUT = factor(AEOUT, levels = c(
    "RECOVERED/RESOLVED", "RECOVERED/RESOLVED WITH SEQUELAE",
    "RECOVERING/RESOLVING", "NOT RECOVERED/NOT RESOLVED",
    "FATAL", "UNKNOWN"
  ))) %>%
  count(AEOUT, .drop = FALSE)
## A tibble: 6 x 2
#  AEOUT                                n
#  <fct>                            <int>
#1 RECOVERED/RESOLVED                   8
#2 RECOVERED/RESOLVED WITH SEQUELAE     0
#3 RECOVERING/RESOLVING                 0
#4 NOT RECOVERED/NOT RESOLVED           9
#5 FATAL                                1
#6 UNKNOWN                              0
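The count above summarises AEOUT alone. For the AEOUT by AEREL1S cross-tabulation that the question's dcast call produces, one option is base R's table(), which keeps every factor level, including those with no rows. This is a sketch of an alternative technique, not part of the original answer. It rebuilds the question's 18 rows compactly so that it runs on its own, and the names out_labels, aeout_levels, tab and wide are illustrative.

```r
# Rebuild the 18 rows from the question (same values, same order)
out_labels <- c("RECOVERED/RESOLVED", "NOT RECOVERED/NOT RESOLVED", "FATAL")
df <- data.frame(
  AEOUT   = out_labels[c(1, 1, 1, 2, 2, 1, 2, 3, 1, 2, 2, 1, 1, 1, 2, 2, 2, 2)],
  AEREL1S = rep(c("UNRELATED", "RELATED", "UNRELATED"), times = c(11, 4, 3)),
  stringsAsFactors = FALSE
)

# Declare all six levels so that unobserved ones are still known to R
aeout_levels <- c("RECOVERED/RESOLVED", "RECOVERED/RESOLVED WITH SEQUELAE",
                  "RECOVERING/RESOLVING", "NOT RECOVERED/NOT RESOLVED",
                  "FATAL", "UNKNOWN")
df$AEOUT <- factor(df$AEOUT, levels = aeout_levels)

# table() counts every level of a factor, so empty levels appear as 0
tab <- table(df$AEOUT, df$AEREL1S)

# Convert the contingency table to a wide data frame, one row per level
wide <- as.data.frame.matrix(tab)
wide <- data.frame(AEOUT = rownames(wide), wide, row.names = NULL)
print(wide)
```

The result has one row per declared level and one column per AEREL1S value, with zeros for the levels that never occur. If you prefer to stay with dplyr, converting both columns to factors and calling count(AEOUT, AEREL1S, .drop = FALSE) should give the same counts in long form.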