How to apply a function over data frames to get descriptive statistics

Question

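I have a list containing two data frames. In each one, the variable Year should be a factor and the other variable (value) is numeric, and I want descriptive statistics for value. Here is an example of my list: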

> D1
   Year    value
1  1386 7.544808
2  1387 7.552638
3  1387 7.572596
4  1387 7.790549
5  1388 7.607089
6  1388 7.635559
7  1389 7.469881
8  1389 7.622461
9  1389 7.622461
10 1390 7.596479
11 1390 7.645063
12 1391 7.654853
13 1391 7.605891
14 1392 7.612247
15 1381 7.747241
16 1383 7.808759
17 1383 7.834336
18 1384 7.482341
19 1384 7.433035

> D2
   Year    value
1  1386 7.544808
2  1387 7.552638
3  1387 7.572596
4  1387 7.790549
5  1388 7.607089
6  1388 7.635559
7  1389 7.469881
8  1389 7.622461
9  1389 7.622461
10 1390 7.596479
11 1390 7.645063
12 1391 7.654853
13 1391 7.605891
14 1392 7.612247
15 1381 7.747241
16 1383 7.808759
17 1383 7.834336
18 1384 7.482341
19 1384 7.433035

My_list <- list(Label1 = D1, Label2 = D2)
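
For anyone who wants to run the example, here is a minimal sketch that rebuilds the list from the printed values. It assumes `D2` is an exact copy of `D1` (as the two printouts suggest) and converts `Year` to a factor, which is one reading of "should be a factor".

```r
# Rebuild the example data from the values printed above.
D1 <- data.frame(
  Year = c(1386, 1387, 1387, 1387, 1388, 1388, 1389, 1389, 1389, 1390,
           1390, 1391, 1391, 1392, 1381, 1383, 1383, 1384, 1384),
  value = c(7.544808, 7.552638, 7.572596, 7.790549, 7.607089, 7.635559,
            7.469881, 7.622461, 7.622461, 7.596479, 7.645063, 7.654853,
            7.605891, 7.612247, 7.747241, 7.808759, 7.834336, 7.482341,
            7.433035)
)
D1$Year <- factor(D1$Year)  # treat Year as a grouping factor
D2 <- D1                    # assumption: D2 is identical to D1, as printed

My_list <- list(Label1 = D1, Label2 = D2)
str(My_list)                # inspect the structure of both data frames
```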

Now I want to apply the following functions to the list above to produce descriptive statistics of value for each Year category.

# Mean with a confidence interval for a numeric column
MeanFunc  <- function(x) round(mean(x, na.rm = TRUE), digits = 6)
SEFunc    <- function(x) round(qt(0.975, df = sum(!is.na(x)) - 1) * sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x))), digits = 5)
SDFunc    <- function(x) round(sd(x, na.rm = TRUE), digits = 5)
LeftFunc  <- function(x) round(mean(x, na.rm = TRUE) - SEFunc(x), digits = 5)
RightFunc <- function(x) round(mean(x, na.rm = TRUE) + SEFunc(x), digits = 5)
MaxFunc   <- function(x) round(max(x, na.rm = TRUE), digits = 5)
MinFunc   <- function(x) round(min(x, na.rm = TRUE), digits = 5)

# Collect all statistics for one numeric vector into a named vector
multi.fun <- function(x) {
  c(Mean = MeanFunc(x), SE = SEFunc(x), SD = SDFunc(x),
    Left = LeftFunc(x), Right = RightFunc(x), Max = MaxFunc(x), Min = MinFunc(x))
}
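
A quick way to see what `SEFunc` computes is to compare it with `t.test()`. The sketch below uses an unrounded copy of the formula (named `ci_half` here, a hypothetical name) and a made-up vector `x` that is not taken from the data above. If the formula is the half-width of a 95% t-based confidence interval rather than the bare standard error, mean ± half-width should reproduce the interval reported by `t.test()`.

```r
# Unrounded copy of the SEFunc formula; ci_half is a hypothetical name.
ci_half <- function(x) {
  n <- sum(!is.na(x))
  qt(0.975, df = n - 1) * sd(x, na.rm = TRUE) / sqrt(n)
}

x <- c(7.55, 7.57, 7.79, NA, 7.61)   # made-up values with one missing entry

m      <- mean(x, na.rm = TRUE)
manual <- c(m - ci_half(x), m + ci_half(x))
ref    <- as.numeric(t.test(x)$conf.int)  # t.test() drops the NA itself

# Stops with an error if the two intervals differ beyond numeric tolerance.
stopifnot(isTRUE(all.equal(manual, ref)))
```

So the `Left` and `Right` columns are the limits of a 95% confidence interval for the mean, and the column labelled `SE` is its half-width.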

How can I produce output similar to this list?

$Label1
          Mean      SE      SD    Left   Right     Max     Min
value 7.407750 0.02683 0.35525 7.38092 7.43458 8.54102 5.90301
1381  0.203978 0.09325 1.23486 0.11073 0.29723 8.08833 0.00000
1382  0.078627 0.05813 0.76970 0.02050 0.13676 7.99239 0.00000
1383  0.635951 0.16005 2.11930 0.47590 0.79600 8.54102 0.00000
1384  0.422948 0.13113 1.73636 0.29182 0.55408 8.20205 0.00000
1385  0.267271 0.10543 1.39602 0.16184 0.37270 8.30430 0.00000
1386  0.354070 0.12012 1.59055 0.23395 0.47419 7.85514 0.00000
1387  1.279604 0.21165 2.80268 1.06795 1.49125 8.23982 0.00000
$Label2
          Mean      SE      SD    Left   Right     Max     Min
value 7.407750 0.02683 0.35525 7.38092 7.43458 8.54102 5.90301
1381  0.203978 0.09325 1.23486 0.11073 0.29723 8.08833 0.00000
1382  0.078627 0.05813 0.76970 0.02050 0.13676 7.99239 0.00000
1383  0.635951 0.16005 2.11930 0.47590 0.79600 8.54102 0.00000
1384  0.422948 0.13113 1.73636 0.29182 0.55408 8.20205 0.00000
1385  0.267271 0.10543 1.39602 0.16184 0.37270 8.30430 0.00000
1386  0.354070 0.12012 1.59055 0.23395 0.47419 7.85514 0.00000
1387  1.279604 0.21165 2.80268 1.06795 1.49125 8.23982 0.00000

Thank you very much.

Tags: r, statistics

Solution

Try this solution:

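library(tidyverse)  # dplyr (group_by, summarise, across) and purrr (map)
library(plotrix)    # std.error()

# std.error() is sd / sqrt(n). Unlike SEFunc in the question it applies no
# t quantile, so Left and Right here are the mean minus/plus one standard error.
My_list %>%
  map(
    ~ group_by(.x, Year) %>%
      summarise(
        Mean = mean(value, na.rm = TRUE) %>% round(6),
        SE = std.error(value, na.rm = TRUE),
        SD = sd(value, na.rm = TRUE),
        Left = mean(value, na.rm = TRUE) - std.error(value, na.rm = TRUE),
        Right = mean(value, na.rm = TRUE) + std.error(value, na.rm = TRUE),
        Max = max(value, na.rm = TRUE),
        Min = min(value, na.rm = TRUE)
      ) %>%
      # round every statistic except Mean (already rounded to 6 digits)
      mutate(across(SE:Min, ~ round(.x, 5)))
  )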
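
An alternative sketch in base R stays closer to the layout the question asks for: one matrix per list element, with an overall `value` row followed by one row per year. It condenses the question's functions into a single `multi.fun` with the same formulas and rounding. The `ci_half` and `describe_by_year` helpers are hypothetical names, and the guard for single-observation groups is an addition that the question does not contain. `D2` is assumed to be a copy of `D1`.

```r
# Example data as printed in the question (D2 assumed identical to D1).
D1 <- data.frame(
  Year = factor(c(1386, 1387, 1387, 1387, 1388, 1388, 1389, 1389, 1389, 1390,
                  1390, 1391, 1391, 1392, 1381, 1383, 1383, 1384, 1384)),
  value = c(7.544808, 7.552638, 7.572596, 7.790549, 7.607089, 7.635559,
            7.469881, 7.622461, 7.622461, 7.596479, 7.645063, 7.654853,
            7.605891, 7.612247, 7.747241, 7.808759, 7.834336, 7.482341,
            7.433035)
)
My_list <- list(Label1 = D1, Label2 = D1)

# Half-width of the 95% t interval, same formula as SEFunc in the question.
# Added guard: with fewer than 2 observations the interval is undefined.
ci_half <- function(x) {
  n <- sum(!is.na(x))
  if (n < 2) return(NA_real_)
  qt(0.975, df = n - 1) * sd(x, na.rm = TRUE) / sqrt(n)
}

# Same statistics and rounding as multi.fun in the question.
multi.fun <- function(x) {
  m <- mean(x, na.rm = TRUE)
  h <- round(ci_half(x), 5)
  c(Mean = round(m, 6), SE = h, SD = round(sd(x, na.rm = TRUE), 5),
    Left = round(m - h, 5), Right = round(m + h, 5),
    Max = round(max(x, na.rm = TRUE), 5), Min = round(min(x, na.rm = TRUE), 5))
}

# One matrix per data frame: overall row first, then one row per Year level.
describe_by_year <- function(d) {
  per_year <- t(sapply(split(d$value, d$Year), multi.fun))
  rbind(value = multi.fun(d$value), per_year)
}

result <- lapply(My_list, describe_by_year)
result$Label1
```

Years with a single observation (1381, 1386 and 1392 in this sample) get `NA` for SE, SD, Left and Right, because a standard deviation cannot be estimated from one value.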
