Efficiently manipulating and extracting data from multiple matrices: means and dates

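Question

I have a series of large matrices, and I am only just getting used to navigating them in this format and applying functions to them.

I have minute data for many parameters, which I have been able to reduce to daily averages. I would like to align each averaged output with a date sequence and, from there, extract the daily averages for each year.

For a single matrix, I did this: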

A <- matrix(c(1:3285),nrow=3)
AA <- sapply(1:1095, function(x) mean(A [,x], na.rm = TRUE))
D <- seq(from = as.Date("2013-01-01"), to = as.Date("2015-12-31"), by= 1)
df <- cbind.data.frame(D,AA)

This gives me the mean of each column aligned with the dates for 2013-2015.

library(lubridate)
years <- year(df$D) # df$D is already a Date, so no format string is needed
day <- yday(df$D)

# average for each day of year (DOY) across the three years
avg <- as.data.frame(tapply(df$AA, day, mean, na.rm = TRUE))
# average for each DOY within each year
av <- as.data.frame(tapply(df$AA, list(day, years), mean, na.rm = TRUE))

# bind the per-year averages and the overall average into one data frame
DF <- cbind(av, avg)
head(DF)
colnames(DF)[4] <- "avg" # rename the overall-average column

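Now say I have multiple matrices (all with the same dimensions, just different parameters) and I want to do the same thing to each. Is there an efficient way to loop this so that I get a data frame (DF) output for each of A to C?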

# extra matrices to play with (each range holds exactly 3 * 1095 values)
B <- matrix(c(3285:6569),nrow=3)
C <- matrix(c(6570:9854),nrow=3)

So far I have had some initial help on Stack Overflow:

# column means for each matrix
vapply(list(A, B, C), colMeans, numeric(1095))
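
As a quick check of what that call returns, here is a small sketch. It assumes each matrix holds exactly 3 * 1095 values (the ranges for B and C below are chosen for that reason), and the list name `mats` is only illustrative. Naming the list elements carries the names through to the columns of the result.

```r
# Assumed ranges: each matrix holds exactly 3 * 1095 values
mats <- list(
  A = matrix(1:3285, nrow = 3),
  B = matrix(3285:6569, nrow = 3),
  C = matrix(6570:9854, nrow = 3)
)

# One column of daily means per matrix; column names come from the list names
daily_means <- vapply(mats, colMeans, numeric(1095))
dim(daily_means)
colnames(daily_means)
```

The result is a plain numeric matrix with one row per day, so it can be bound to a date sequence of the same length.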

Tags: r, function, matrix, apply

Answer

Here is a tidyverse solution. Let

library(tidyverse)
library(lubridate)
dates <- seq(from = as.Date("2013-01-01"), to = as.Date("2015-12-31"), by = 1)
A <- data.frame(matrix(c(1:3285), ncol = 3, byrow = TRUE))

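since, as I understand it, the dates are the same for all the matrices. I have also made A long rather than wide (one row per day, one column per sample), which is more natural when working with the tidyverse. You may then prefer the output in the following form: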

A %>% group_by(year = year(dates), day = yday(dates)) %>% 
  summarise(dayYearAvg = mean(c(X1, X2, X3))) %>%
  group_by(day) %>% mutate(dayAvg = mean(dayYearAvg))
# A tibble: 1,095 x 4
# Groups:   day [365]
#     year   day dayYearAvg dayAvg
#    <dbl> <dbl>      <dbl>  <dbl>
#  1  2013     1          2   1097
#  2  2013     2          5   1100
#  3  2013     3          8   1103
#  ...

If not, the following gives the same result as in your example:

A %>% group_by(year = year(dates), day = yday(dates)) %>% 
  summarise(dayYearAvg = mean(c(X1, X2, X3))) %>%
  group_by(day) %>% mutate(dayAvg = mean(dayYearAvg)) %>%
  spread(year, dayYearAvg) %>% ungroup %>% select(-day)
# A tibble: 365 x 4
#    dayAvg `2013` `2014` `2015`
#     <dbl>  <dbl>  <dbl>  <dbl>
#  1   1097      2   1097   2192
#  2   1100      5   1100   2195
#  3   1103      8   1103   2198
#  4   1106     11   1106   2201
#  ...
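
In current versions of tidyr, `spread()` is superseded by `pivot_wider()`. The following is a sketch of the same reshaping under that assumption (dplyr 1.0.0 or later for `.groups`, tidyr 1.0.0 or later for `pivot_wider()`). The result name `wide` is illustrative, and the `day` column is kept here rather than dropped.

```r
library(dplyr)
library(tidyr)
library(lubridate)

dates <- seq(as.Date("2013-01-01"), as.Date("2015-12-31"), by = 1)
A <- data.frame(matrix(1:3285, ncol = 3, byrow = TRUE))

wide <- A %>%
  mutate(year = year(dates), day = yday(dates)) %>%
  group_by(year, day) %>%
  # mean of the three samples for each (year, day) pair
  summarise(dayYearAvg = mean(c(X1, X2, X3)), .groups = "drop") %>%
  group_by(day) %>%
  # mean of the per-year values for each day of year
  mutate(dayAvg = mean(dayYearAvg)) %>%
  ungroup() %>%
  # one column per year, one row per day of year
  pivot_wider(names_from = year, values_from = dayYearAvg)
```

Appending `select(-day)` would reproduce the column layout shown above.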

Now also let

B <- data.frame(matrix(c(3285:6569), ncol = 3, byrow = TRUE))
C <- data.frame(matrix(c(6570:9854), ncol = 3, byrow = TRUE))
l <- list(A, B, C)

This gives

map(l, . %>% group_by(year = year(dates), day = yday(dates)) %>% 
      summarise(dayYearAvg = mean(c(X1, X2, X3))) %>%
      group_by(day) %>% mutate(dayAvg = mean(dayYearAvg)) %>%
      spread(year, dayYearAvg) %>% ungroup %>% select(-day))
# [[1]]
# A tibble: 365 x 4
#    dayAvg `2013` `2014` `2015`
#     <dbl>  <dbl>  <dbl>  <dbl>
#  1   1097      2   1097   2192
#  2   1100      5   1100   2195
#  ...
# [[2]]
# A tibble: 365 x 4
#    dayAvg `2013` `2014` `2015`
#     <dbl>  <dbl>  <dbl>  <dbl>
#  1   4381   3286   4381   5476
#  2   4384   3289   4384   5479
#  ...
# [[3]]
# A tibble: 365 x 4
#    dayAvg `2013` `2014` `2015`
#     <dbl>  <dbl>  <dbl>  <dbl>
#  1   7666   6571   7666   8761
#  2   7669   6574   7669   8764
#  ...
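
For comparison, the same idea can be sketched in base R, keeping the question's original layout (3 rows, one column per day). This is only one possible approach, not the method used above: the helper name `doy_summary` and the list name `mats` are hypothetical, and the matrix ranges are assumed to hold exactly 3 * 1095 values each.

```r
# Dates shared by all matrices (2013-2015 has no leap year: 3 * 365 = 1095 days)
dates <- seq(as.Date("2013-01-01"), as.Date("2015-12-31"), by = 1)

# Hypothetical helper: one matrix (rows = samples, columns = days) -> data frame
doy_summary <- function(m, dates) {
  stopifnot(ncol(m) == length(dates))
  daily <- colMeans(m, na.rm = TRUE)            # daily mean per column
  yr    <- as.integer(format(dates, "%Y"))      # calendar year
  doy   <- as.integer(format(dates, "%j"))      # day of year, 1..365
  # rows = day of year, columns = year
  out <- as.data.frame(tapply(daily, list(doy, yr), mean, na.rm = TRUE))
  # overall average for each day of year across all years
  out$avg <- as.vector(tapply(daily, doy, mean, na.rm = TRUE))
  out
}

mats <- list(
  A = matrix(1:3285, nrow = 3),
  B = matrix(3285:6569, nrow = 3),
  C = matrix(6570:9854, nrow = 3)
)

# One data frame per matrix, named after the list elements
results <- lapply(mats, doy_summary, dates = dates)
head(results$B)
```

Because the list is named, each output can be retrieved by parameter name, for example `results$A` or `results[["C"]]`.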
