Weighted tables for different groups

Question

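I want to compute weighted cross-tabulations for several groups. The summarytools news ( https://cran.csiro.au/web/packages/summarytools/news/news.html ) states that ctable() also supports weights when used together with stby(). However, I have not managed to get it working. I tried passing the weights both inside the list() call and as an argument to stby().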

Here is my data:

d.bmi <- structure(list(ext = structure(c(1L, 1L, 2L, 2L, 3L, 3L
), label = c(ext = "Jahrgaenge"), class = c("labelled", 
"integer")), col = structure(c(0L, 0L, 1L, 0L, 1L, 0L), label = c(col = "Testvariable"), class = c("labelled", 
"integer")), sex = structure(c(2L, 2L, 1L, 1L, 2L, 1L), label = c(sex = "Geschlecht"), class = c("labelled", 
"integer")), weight = structure(c(1.654133, 0.3196581, 
0.2779197, 1.875442, 1.875442, 0.3609791
), label = c(weight = "Gewichtungsvariable"), class = c("labelled", 
"numeric"))), row.names = c(NA, 6L), class = "data.frame")

Without grouping, the following cross-tabulation works fine:

ctable(d.bmi$sex, d.bmi$col, weights=d.bmi$weight, prop="r")

However, I want the cross-tabulation by group, so I tried the following:

with(d.bmi, stby(list(x=sex, y=col), INDICES=ext, FUN=ctable, WEIGHTS = weight))
with(d.bmi, stby(list(x=sex, y=col, weights=weight), INDICES=ext, FUN=ctable))

How else could I write this code? I would appreciate any hints.

Tags: r

Answer


The package has a problem when dealing with integer data. It will be fixed in version 0.9.9, but in the meantime you can install the dev-current branch from GitHub:

remotes::install_github("dcomtois/summarytools", ref = "dev-current")

As for the syntax, you were very close...
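One difference between the first attempt in the question and the call used further down is the spelling of the argument: WEIGHTS versus weights. R matches argument names case-sensitively, so an upper-case name does not reach a lower-case formal. The sketch below illustrates this with a hypothetical function, tab_mode(), which is not part of summarytools; whether ctable() silently ignores or rejects an unknown argument is an assumption that this example does not settle.

```r
# Hypothetical stand-in for a tabulating function that accepts weights and `...`.
tab_mode <- function(x, weights = NULL, ...) {
  if (is.null(weights)) "unweighted" else "weighted"
}

# The upper-case name is not matched to `weights`; it falls into `...`.
mode_upper <- tab_mode(1:3, WEIGHTS = c(0.5, 1, 2))

# The lower-case name is matched as intended.
mode_lower <- tab_mode(1:3, weights = c(0.5, 1, 2))

print(c(upper = mode_upper, lower = mode_lower))
```

If a grouped table looks unweighted, checking the exact argument spelling is a cheap first step.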

Defining the data

dd <- structure(
  list(
    ext = structure(
      c(1L, 1L, 2L, 2L, 3L, 3L),
      label = c(ext = "Jahrgaenge"), 
      class = c("labelled", "numeric")
    ),
    col = structure(
      c(0L, 0L, 1L, 0L, 1L, 0L),
      label = c(col = "Testvariable"), 
      class = c("labelled", "numeric")
    ),
    sex = structure(
      c(2L, 2L, 1L, 1L, 2L, 1L),
      label = c(sex = "Geschlecht"),
      class = c("labelled", "numeric")
    ),
    weight = structure(
      c(1.654133, 0.3196581, 0.2779197, 1.875442, 1.875442, 0.3609791),
      label = c(weight = "Gewichtungsvariable"), 
      class = c("labelled", "numeric"))
  ),
  row.names = c(NA, 6L),
  class = "data.frame")

Generating the ctable

Note the addition of round.digits = 2, so that the fractional weighted counts do not lose precision in the printed table.

library(summarytools)
with(dd, stby(list(sex, col), ext, ctable, weights = weight, round.digits = 2))

## Cross-Tabulation, Row Proportions  
## sex * col  
## Data Frame: dd  
## Group: ext = 1  
## 
## ------- ----- ---------------- ----------------
##           col                0            Total
##     sex                                        
##       2         1.97 (100.00%)   1.97 (100.00%)
##   Total         1.97 (100.00%)   1.97 (100.00%)
## ------- ----- ---------------- ----------------
## 
## Group: ext = 2  
## 
## ------- ----- --------------- --------------- ----------------
##           col               0               1            Total
##     sex                                                       
##       1         1.88 (87.09%)   0.28 (12.91%)   2.15 (100.00%)
##   Total         1.88 (87.09%)   0.28 (12.91%)   2.15 (100.00%)
## ------- ----- --------------- --------------- ----------------
## 
## Group: ext = 3  
## 
## ------- ----- ---------------- ---------------- ----------------
##           col                0                1            Total
##     sex                                                         
##       1         0.36 (100.00%)   0.00 (  0.00%)   0.36 (100.00%)
##       2         0.00 (  0.00%)   1.88 (100.00%)   1.88 (100.00%)
##   Total         0.36 ( 16.14%)   1.88 ( 83.86%)   2.24 (100.00%)
## ------- ----- ---------------- ---------------- ----------------

To display all combinations, use factors...

dd$sex <- as.factor(dd$sex)
dd$col <- as.factor(dd$col)

# (Optional) Using magrittr's %$% instead of with()
library(magrittr)
dd %$% stby(list(sex, col), ext, ctable, weights = weight, round.digits = 2)

## Cross-Tabulation, Row Proportions  
## sex * col  
## Data Frame: dd  
## Group: ext = 1  
## 
## ------- ----- ---------------- -------------- ----------------
##           col                0              1            Total
##     sex                                                       
##       1         0.00 (  0.00%)   0.00 (0.00%)   0.00 (  0.00%)
##       2         1.97 (100.00%)   0.00 (0.00%)   1.97 (100.00%)
##   Total         1.97 (100.00%)   0.00 (0.00%)   1.97 (100.00%)
## ------- ----- ---------------- -------------- ----------------
## 
## Group: ext = 2  
## 
## ------- ----- --------------- --------------- ----------------
##           col               0               1            Total
##     sex                                                       
##       1         1.88 (87.09%)   0.28 (12.91%)   2.15 (100.00%)
##       2         0.00 ( 0.00%)   0.00 ( 0.00%)   0.00 (  0.00%)
##   Total         1.88 (87.09%)   0.28 (12.91%)   2.15 (100.00%)
## ------- ----- --------------- --------------- ----------------
## 
## Group: ext = 3  
## 
## ------- ----- ---------------- ---------------- ----------------
##           col                0                1            Total
##     sex                                                         
##       1         0.36 (100.00%)   0.00 (  0.00%)   0.36 (100.00%)
##       2         0.00 (  0.00%)   1.88 (100.00%)   1.88 (100.00%)
##   Total         0.36 ( 16.14%)   1.88 ( 83.86%)   2.24 (100.00%)
## ------- ----- ---------------- ---------------- ----------------
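The weighted counts in the tables above can be cross-checked without summarytools. The following is a minimal base R sketch, not the package's own method: it rebuilds the example data as a plain data frame (the name dd_plain and the dropped variable labels are assumptions made for brevity) and sums the weights per sex-by-col cell within each ext group using xtabs().

```r
# Rebuild the example data as a plain data frame (labels dropped for brevity).
dd_plain <- data.frame(
  ext    = c(1, 1, 2, 2, 3, 3),
  col    = factor(c(0, 0, 1, 0, 1, 0)),
  sex    = factor(c(2, 2, 1, 1, 2, 1)),
  weight = c(1.654133, 0.3196581, 0.2779197, 1.875442, 1.875442, 0.3609791)
)

# One weighted contingency table per ext group: xtabs() sums the left-hand side
# of the formula (the weights) within each sex x col cell. Because sex and col
# are factors, empty combinations are kept as zero cells.
weighted_tabs <- lapply(
  split(dd_plain, dd_plain$ext),
  function(g) xtabs(weight ~ sex + col, data = g)
)

# Row proportions (in percent) for group ext = 3.
row_props_3 <- round(100 * prop.table(weighted_tabs[["3"]], margin = 1), 2)

print(round(weighted_tabs[["3"]], 2))
print(row_props_3)
```

Groups with an all-zero row (for example sex = 1 in ext = 1) would produce NaN row proportions with prop.table(), so the sketch computes proportions only for ext = 3, where every row has a positive total.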
