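r - Create a table of counts and percentages that includes missing data
Question
I am trying to create a table with preset dimensions and have R fill in the counts and percentages. This is for an R Markdown report.
Here is the code for my sample data.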
#This is the most realistic data I could produce.
Maj <- rep("Major A", times=50)
set.seed(24601)
Race <- sample(c("Asian","Black", "Am Indian","Hawiian" ,"Hispanic","White","Two or More Races","Not Reported"),
               prob=c(.01,.1,.01,.01,.02,.80,.05,.01),size=50, replace = T)
Sex <- sample(c("Female","Male"), prob=c(.98,.02),size=50,replace=T)
Enroll_MajorA <- cbind(Maj,Sex,Race)
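I need the table to show the count and percentage for every race and sex combination, whether or not that combination occurs in the dataset. This is what I need it to look like.
I tried computing each value of the table individually, and R Markdown gave me an "out of memory" error. I also tried creating a table of counts and a table of percentages and combining them, but the result does not contain all the race/sex combinations the report requires. I don't know where to go from here. Please help!
Answer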
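You can use `aggregate`. The matrix can stay as it is: wrap it in `as.data.frame`, and its `Sex` and `Race` columns can then serve as grouping variables. `NROW` (upper case), unlike `nrow`, does not distinguish between matrices and vectors, so it works on both.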
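# Count the rows in each Sex/Race group and their share of all rows.
# do.call(data.frame, ...) flattens the matrix column returned by aggregate()
# into the separate columns Maj.count and Maj.share.
m.agg <- do.call(data.frame,
                 aggregate(. ~ Sex + Race, as.data.frame(Enroll_MajorA), function(x)
                   c(count=as.integer(NROW(x)), share=NROW(x) / NROW(Enroll_MajorA))))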
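The two details this step relies on, `NROW` versus `nrow` and the flattening done by `do.call(data.frame, ...)`, can be seen in isolation. The following is a minimal sketch on hypothetical toy data (`toy`, `g` and `v` are made-up names, not part of the question):

```r
# Hypothetical toy data, only for illustration
toy <- data.frame(g = c("a", "a", "b"), v = c(10, 20, 30))

# nrow() returns NULL for a plain vector; NROW() treats it as one column
nrow_vec <- nrow(c(1, 2, 3))
NROW_vec <- NROW(c(1, 2, 3))

# A function that returns two values makes aggregate() store a matrix column
agg <- aggregate(v ~ g, toy,
                 function(x) c(count = NROW(x), share = NROW(x) / NROW(toy)))
n_cols_before <- ncol(agg)   # `g` plus a single matrix column `v`

# do.call(data.frame, ...) spreads that matrix into ordinary columns
flat <- do.call(data.frame, agg)
names(flat)
```

After flattening, the columns are named after the original column plus the names used inside `c(...)`, which is why the answer's result has `Maj.count` and `Maj.share`.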
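To get the full set of combinations, merge the result with an `expand.grid` of all sexes and races, then tidy it up a little. The labels in the grid must match the data exactly: the data spell one category `Hawiian`, so in the output below it appears as its own row, separate from `Hawaiian`.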
res <- merge(as.data.frame(m.agg), expand.grid(Sex=c("Female", "Male"),
                                               Race=relevant.races), all=TRUE)  # `relevant.races` below
res[, 3:4][is.na(res[, 3:4])] <- 0  # transform `NA` into 0 to be nice
res[order(res[, "Race"]), ]  # order output
# Sex Race Maj.count Maj.share
# 1 Female Black 2 0.04
# 10 Male Black 0 0.00
# 2 Female Hawiian 1 0.02
# 3 Female Hispanic 1 0.02
# 11 Male Hispanic 0 0.00
# 4 Female Two or More Races 2 0.04
# 12 Male Two or More Races 0 0.00
# 5 Female White 44 0.88
# 13 Male White 0 0.00
# 6 Female Asian 0 0.00
# 14 Male Asian 0 0.00
# 7 Female Am. Indian 0 0.00
# 15 Male Am. Indian 0 0.00
# 8 Female Hawaiian 0 0.00
# 16 Male Hawaiian 0 0.00
# 9 Female Not Reported 0 0.00
# 17 Male Not Reported 0 0.00
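As an alternative to the `aggregate` and `merge` approach above (a different technique, not the one used in this answer), the zero-count rows can also be obtained by declaring every category as a factor level before counting with `table`. This is a rough sketch on hypothetical toy data; `sex_levels`, `race_levels`, `toy` and `res2` are illustrative names:

```r
# Hypothetical toy data; the level vectors list every category the report must show
sex_levels  <- c("Female", "Male")
race_levels <- c("Black", "White", "Not Reported")
toy <- data.frame(Sex  = c("Female", "Female", "Female", "Male"),
                  Race = c("White", "White", "Black", "White"))

# factor(..., levels = ) keeps unobserved categories, so table() reports them as 0
tab <- table(Sex  = factor(toy$Sex,  levels = sex_levels),
             Race = factor(toy$Race, levels = race_levels))

# One row per Sex/Race combination, with the count and its share of all rows
res2 <- as.data.frame(tab, responseName = "count")
res2$share <- res2$count / nrow(toy)
res2
```

One caveat with this sketch: a label that is not listed in `levels` (for example a misspelled category) becomes `NA` and is silently left out of the counts, so the level vectors need to match the data exactly.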
Data
relevant.races <- c("Asian","Black", "Am. Indian", "Hawaiian" , "Hispanic", "White",
"Two or More Races", "Not Reported")
Enroll_MajorA <- structure(c("Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Major A", "Major A", "Major A",
"Major A", "Major A", "Major A", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "White", "White",
"White", "Hawiian", "White", "White", "White", "White", "White",
"White", "White", "White", "White", "Two or More Races", "White",
"White", "White", "White", "White", "White", "White", "Hispanic",
"White", "White", "White", "White", "White", "White", "Two or More Races",
"White", "White", "White", "White", "White", "White", "White",
"White", "Black", "White", "White", "Black", "White", "White",
"White", "White", "White", "White", "White", "White", "White"
), .Dim = c(50L, 3L), .Dimnames = list(NULL, c("Maj", "Sex",
"Race")))