Is there an R function to extract a single group mean?

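Problem description

I know tapply can be used to split data into groups and compute the mean of each one. I would like to know whether there is a function that can isolate one of those means for separate analysis.

I am using a one-sample t-test to compare the data I collected with the class mean. Here is a sample of the data I am using: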

#Sample of my data
structure(list(
pKa = c(6.946, 7.1, 6.625, 7.528, 7.102, 6.743,6.936, 6.579, 6.672, 7.27), 

pH = c("pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7"), 

id = c("XAU", "XAU", "XAU", "XAU", "MyData", "MyData", "MyData","MyData", "PQ", "PQ")),
 row.names = c(NA, 10L), class = "data.frame")

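I am trying to extract the mean of "pKa" (the response variable) for each "pH" group (the explanatory variable), and to use each mean in a t-test that compares "MyData" with the data collected by the whole class. The data I want to compare are shown below (MyData vs. the class means at the different pH groups: 6.1, 6.7, 7.3, 8.1):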

# The data I collected
Exp2MyData

#pKa     pH     id
#5 7.102 pH_6.1 MyData
#6 6.743 pH_6.7 MyData
#7 6.936 pH_7.3 MyData
#8 6.579 pH_8.1 MyData 

#Means of the class data
E2 <- tapply(Exp2$pKa, Exp2$pH, mean)
E2 
#pH_6.1 pH_6.7 pH_7.3 pH_8.1 
# 7.102  6.743  6.936  6.579

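The t-test call I am trying to build looks like this (pseudocode):

t.test(whole-class mean pKa at pH 'X', the pKa value I collected)

This is the code I am currently trying to use: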

#Making pH 8.1 into a separate group
Buff_8.1 <- subset(Exp2,pH=="pH_8.1")

#Obtaining the means for pKa @ pH = 8.1
m8.1 <- mean(Buff_8.1$pKa)
m8.1

#t.test
t.test(m8.1, mu= 6.579)

But I get the error "Error in t.test.default(m8.1, mu = 6.579) : not enough 'x' observations".

Any help would be great.

Tags: r, t-test

Solution


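I assume you want to test whether the mean pKa differs significantly from the mean pKa of a given pH group. In the example, df1$class_mean[1] is the mean pKa of the first pH group, so the test is run against index [1]. For the second pH group change it to [2], and so on, or wrap the call in a function.

  • We use t_test from the rstatix package, which is pipe-friendly: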
library(dplyr)
library(rstatix)

# get the group mean by pH group
df1 <- df %>% 
  group_by(pH) %>% 
  summarize(class_mean = mean(pKa))

# one sample t-test against group-pH mean
# df1$class_mean[1] is the mean of the first pH group; use [2], [3], [4] for the others

stat.test <- df %>% t_test(pKa ~ 1, mu = df1$class_mean[1], detailed = TRUE)
# stat.test <- df %>% t_test(pKa ~ 1, mu = df1$class_mean[2], detailed = TRUE)
# stat.test <- df %>% t_test(pKa ~ 1, mu = df1$class_mean[3], detailed = TRUE)
# stat.test <- df %>% t_test(pKa ~ 1, mu = df1$class_mean[4], detailed = TRUE)
stat.test

# Output:
# A tibble: 1 x 12
  estimate .y.   group1 group2         n statistic     p    df conf.low conf.high method alternative
*    <dbl> <chr> <chr>  <chr>      <int>     <dbl> <dbl> <dbl>    <dbl>     <dbl> <chr>  <chr>      
1     6.95 pKa   1      null model    10     0.448 0.665     9     6.73      7.17 T-test two.sided  
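
The "wrap it in a function" suggestion could look roughly like the sketch below. It uses base R's t.test rather than rstatix so that it runs without extra packages, and the helper name test_against_group_mean is hypothetical, introduced only for illustration. Like the call above, it tests all pKa values against the mean of one pH group.

```r
# Sample data from the question
df <- data.frame(
  pKa = c(6.946, 7.1, 6.625, 7.528, 7.102, 6.743, 6.936, 6.579, 6.672, 7.27),
  pH  = c("pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7",
          "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7"),
  id  = c("XAU", "XAU", "XAU", "XAU", "MyData", "MyData", "MyData",
          "MyData", "PQ", "PQ")
)

# Hypothetical helper: one-sample t-test of all pKa values
# against the mean pKa of a single pH group
test_against_group_mean <- function(data, group) {
  group_mean <- mean(data$pKa[data$pH == group])
  t.test(data$pKa, mu = group_mean)
}

# Run the test once per pH group; the result is a named list of htest objects
groups <- unique(df$pH)
results <- lapply(setNames(groups, groups),
                  function(g) test_against_group_mean(df, g))

# Collect the p-values into a named numeric vector
p_values <- sapply(results, function(res) res$p.value)
p_values
```

Each element of results is an ordinary htest object, so results[["pH_8.1"]] prints the full test for that group.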

Data:

df <- structure(list(
  pKa = c(6.946, 7.1, 6.625, 7.528, 7.102, 6.743,6.936, 6.579, 6.672, 7.27), 
  
  pH = c("pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7"), 
  
  id = c("XAU", "XAU", "XAU", "XAU", "MyData", "MyData", "MyData","MyData", "PQ", "PQ")),
  row.names = c(NA, 10L), class = "data.frame")
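
On the original question of isolating one mean and the "not enough 'x' observations" error, here is a minimal base R sketch. It assumes the sample rows above stand in for the full class data. Whether your own row should be excluded from the class values before testing is a judgment call this sketch does not settle.

```r
# Sample data from the question
df <- data.frame(
  pKa = c(6.946, 7.1, 6.625, 7.528, 7.102, 6.743, 6.936, 6.579, 6.672, 7.27),
  pH  = c("pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7",
          "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7"),
  id  = c("XAU", "XAU", "XAU", "XAU", "MyData", "MyData", "MyData",
          "MyData", "PQ", "PQ")
)

# tapply() returns a named array, so a single group mean can be pulled out by name
class_means <- tapply(df$pKa, df$pH, mean)
m8.1 <- class_means[["pH_8.1"]]

# t.test(m8.1, mu = 6.579) fails because m8.1 is a single number:
# a one-sample t-test needs at least two observations in 'x' to estimate a variance.

# Instead, pass the raw pKa values of the group as 'x'
# and the single value to compare against as 'mu'
class_8.1 <- df$pKa[df$pH == "pH_8.1"]
my_value  <- df$pKa[df$pH == "pH_8.1" & df$id == "MyData"]
res <- t.test(class_8.1, mu = my_value)
res
```

With only two rows at pH_8.1 in this sample the test has a single degree of freedom, so the result is illustrative only. The same call applies unchanged to the full class data.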
