ANOVA p-value column in the R table1 package

Problem description

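I am trying to run an ANOVA on a dataset to compare the means of different groups in a table built with the table1 package. In this example (at the bottom of the page), the author runs a t-test to compare two means (male vs. female) using the function I have pasted in the code below.

I want to do the same thing, but with more than two means, as in the example dataset below. I would like a column for every age group plus an ANOVA p-value column.

I have not found a solution, so I would be very grateful if anyone could help!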


library(tidyverse)
library(table1)

# Function to compute t-test
pvalue <- function(x, ...) {
  # Construct vectors of data y, and groups (strata) g
  y <- unlist(x)
  g <- factor(rep(1:length(x), times=sapply(x, length)))
  if (is.numeric(y)) {
    # For numeric variables, perform a standard 2-sample t-test
    p <- t.test(y ~ g)$p.value
  } else {
    # For categorical variables, perform a chi-squared test of independence
    p <- chisq.test(table(y, g))$p.value
  }
  # Format the p-value, using an HTML entity for the less-than sign.
  # The initial empty string places the output on the line below the variable label.
  c("", sub("<", "&lt;", format.pval(p, digits=3, eps=0.001)))
}

# Fake dataset
age_group = factor(c("10-20", "20-30", "30-40", "40-50", "10-20", "40-50", "40-50", "30-40", "30-40", "30-40"), 
                   levels = c("10-20", "20-30", "30-40", "40-50"))
protein = c(25.3, 87.5, 35.1, 50.8, 50.4, 61.5, 76.7, 56.1, 59.2, 40.2)
fat = c(76, 45, 74, 34, 55, 100, 94, 81, 23, 45)
gender = c("female", "male", "male", "female", "female", "female", "male", "male", "female", "female")
mydata <- tibble(gender, age_group, protein, fat)
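For reference, the `pvalue` function above cannot be reused as-is with four age groups because the formula interface of `t.test` accepts only a two-level grouping factor. A minimal base-R sketch using the fake data from the question (the name `res` is illustrative and not part of the original post):

```r
# Fake data copied from the question (base R only, no tibble needed)
age_group <- factor(c("10-20", "20-30", "30-40", "40-50", "10-20",
                      "40-50", "40-50", "30-40", "30-40", "30-40"),
                    levels = c("10-20", "20-30", "30-40", "40-50"))
protein <- c(25.3, 87.5, 35.1, 50.8, 50.4, 61.5, 76.7, 56.1, 59.2, 40.2)

# t.test() needs exactly two groups; with four levels it signals an error.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(
  t.test(protein ~ age_group),
  error = function(e) conditionMessage(e)
)
print(res)
```

This is why the numeric branch needs a test that handles any number of strata, such as a one-way ANOVA.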

Tags: r, statistics, tidyverse, p-value, statistical-test

Solution


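Edit: I solved the problem, and it is actually quite easy. In case anyone is looking for the same functionality, here is the new version of the function: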

pvalueANOVA <- function(x, ...) {
  # Construct vectors of data y, and groups (strata) g
  y <- unlist(x)
  g <- factor(rep(1:length(x), times=sapply(x, length)))
  
  if (is.numeric(y)) {
    # For numeric variables, perform a one-way ANOVA across all strata
    ano <- aov(y ~ g)
    # p-value of the F-test: first row of column 5 ("Pr(>F)") of the ANOVA table
    p <- summary(ano)[[1]][[5]][1]
    
  } else {
    # For categorical variables, perform a chi-squared test of independence
    p <- chisq.test(table(y, g))$p.value
  }
  # Format the p-value, using an HTML entity for the less-than sign.
  # The initial empty string places the output on the line below the variable label.
  c("", sub("<", "&lt;", format.pval(p, digits=3, eps=0.001)))
}
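As a sanity check on the extraction used in the numeric branch, the sketch below recomputes the ANOVA p-value for the fake data in two ways and compares them. It also illustrates that with only two groups a one-way ANOVA agrees with the pooled-variance t-test (`var.equal = TRUE`), not with the Welch test that `t.test` runs by default. The variable names (`p_aov`, `p_lm`, `p_aov2`, `p_t_pooled`) are illustrative and not part of the original post.

```r
# Fake data copied from the question (base R only)
age_group <- factor(c("10-20", "20-30", "30-40", "40-50", "10-20",
                      "40-50", "40-50", "30-40", "30-40", "30-40"),
                    levels = c("10-20", "20-30", "30-40", "40-50"))
protein <- c(25.3, 87.5, 35.1, 50.8, 50.4, 61.5, 76.7, 56.1, 59.2, 40.2)
gender <- factor(c("female", "male", "male", "female", "female",
                   "female", "male", "male", "female", "female"))

# p-value extracted the same way as in pvalueANOVA(), but by column name
p_aov <- summary(aov(protein ~ age_group))[[1]][["Pr(>F)"]][1]

# The same model fitted with lm(); anova() reports the same F-test
p_lm <- anova(lm(protein ~ age_group))[["Pr(>F)"]][1]
stopifnot(isTRUE(all.equal(p_aov, p_lm)))

# Two groups: one-way ANOVA matches the pooled-variance t-test
p_aov2 <- summary(aov(protein ~ gender))[[1]][["Pr(>F)"]][1]
p_t_pooled <- t.test(protein ~ gender, var.equal = TRUE)$p.value
stopifnot(isTRUE(all.equal(p_aov2, p_t_pooled)))
```

To get the column into the table, the function would be passed through the `extra.col` argument of `table1`, as the package vignette does for the t-test version, for example `table1(~ protein + fat | age_group, data = mydata, overall = FALSE, extra.col = list("P-value" = pvalueANOVA))`. This assumes a table1 version that provides `extra.col`.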
