Merge specific lists together in R

Problem description

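I have a list of lists of lists (I know, a lot of lists; about 6000 data frames in total). The first level specifies the start month (January to December), the second the year (2002 - 2018), the third the sector (e.g. Consumer Discretionary or Consumer Staples, 10 in total), and the last the quantile (1 to 5).
To be clearer: sector_prtf[["StartMonth"]][["Year"]][["Sector"]][["Quantile"]]
Here is an example of what the data frames look like: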

sector_prtf[[1]][[1]][[1]][[1]]
              Growth quantile                 Sector
2002-01-31 0.2278331        1 Consumer Discretionary  

sector_prtf[[1]][[1]][[2]][[1]]
               Growth quantile           Sector
2002-01-31 0.09700046        1 Consumer Staples  

sector_prtf[[1]][[2]][[1]][[1]]
               Growth quantile                 Sector
2003-01-31 -0.1081433        1 Consumer Discretionary  

sector_prtf[[2]][[1]][[1]][[1]]
              Growth quantile                 Sector
2002-02-28 0.3596547        1 Consumer Discretionary

The goal is to merge the lists so that each sector is kept together with its quantile and the corresponding start date.

              Growth quantile                 Sector
2002-01-31 0.2278331        1 Consumer Discretionary  
               Growth quantile                 Sector
2003-01-31 -0.1081433        1 Consumer Discretionary  
              Growth quantile                 Sector
2004-01-30 0.6446954        1 Consumer Discretionary  
.
.
.
              Growth quantile                 Sector
2017-01-31 0.1824898        1 Consumer Discretionary  

As mentioned, this should be done for every sector and every start date.

I tried to merge the lists with simple rbinds:

merged_sector <- lapply(sector_prtf, function(a) lapply(a, function(b) lapply(b, function(c) do.call("rbind", c))))
merged_sector <- lapply(merged_sector, function(a) lapply(a, function(b) do.call("rbind", b)))
merged_sector <- lapply(merged_sector, function(a) do.call("rbind", a))
merged_sector <- do.call("rbind", merged_sector)  

After that, the merged list looks like this:

.
.
.
2012-01-319   -1.030502e-02        1              Materials
2012-01-3117   3.039239e-02        2              Materials
2012-01-3127   6.278972e-02        3              Materials
2012-01-3110   1.150880e-01        1            Real Estate
2012-01-3118   9.337119e-02        2            Real Estate
2012-01-3128   3.242025e-02        3            Real Estate
2012-01-3119   6.044756e-02        1              Utilities
2012-01-31110  1.154916e-01        2              Utilities
2012-01-3129   1.156366e-01        3              Utilities
2013-01-31     2.797345e-01        1 Consumer Discretionary
2013-01-311    1.875079e-01        2 Consumer Discretionary
2013-01-312    3.652037e-01        3 Consumer Discretionary
.
.
.  

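My idea now is to filter the merged DF by sector and quantile, but the dates are a big problem (they are stored as row names, which rbind forces to be unique).
Is there an easier way to solve this? Thanks in advance.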
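
The mangled dates come from rbind itself: data frame row names must be unique, so a counter is appended to duplicates (which is where names such as 2012-01-3117 come from). A minimal sketch of one way around this, assuming every data frame has exactly one row whose row name is an ISO date; `add_date` is a hypothetical helper, not part of the original code, and the two frames reuse values from the example data:

```r
# Two one-row data frames that share the same row name, as in sector_prtf
a <- data.frame(Growth = 0.227833070205427, quantile = "1")
rownames(a) <- "2002-01-31"
b <- data.frame(Growth = 0.00580189434527657, quantile = "2")
rownames(b) <- "2002-01-31"

# Plain rbind keeps the row names unique by altering the duplicate
mangled <- rbind(a, b)
print(rownames(mangled))

# Hypothetical helper: move the date from the row name into a Date column
add_date <- function(df) {
  df$Date <- as.Date(rownames(df))
  rownames(df) <- NULL
  df
}

# Binding afterwards leaves the dates intact
merged <- rbind(add_date(a), add_date(b))
print(merged)
```

Applying such a helper to each innermost data frame before the nested rbind calls would keep the date available for filtering.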

*Update: here is the link to the requested dput file. It only includes the first start month (January): https://ufile.io/y80fb

**Edit 2: I apologize for the inconvenience caused by not providing a reproducible example.

list(list(list(structure(list(Growth = 0.227833070205427, quantile = structure(1L, .Label = "1", class = "factor"), 
    Sector = structure(1L, .Label = "Consumer Discretionary", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
    structure(list(Growth = 0.00580189434527657, quantile = structure(1L, .Label = "2", class = "factor"), 
        Sector = structure(1L, .Label = "Consumer Discretionary", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
    structure(list(Growth = 0.280654630370414, quantile = structure(1L, .Label = "3", class = "factor"), 
        Sector = structure(1L, .Label = "Consumer Discretionary", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
    list(structure(list(Growth = 0.0970004606893047, quantile = structure(1L, .Label = "1", class = "factor"), 
        Sector = structure(1L, .Label = "Consumer Staples", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = 0.054821203483339, quantile = structure(1L, .Label = "2", class = "factor"), 
            Sector = structure(1L, .Label = "Consumer Staples", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = 0.00837169953085215, quantile = structure(1L, .Label = "3", class = "factor"), 
            Sector = structure(1L, .Label = "Consumer Staples", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
    list(structure(list(Growth = -0.078767963284149, quantile = structure(1L, .Label = "1", class = "factor"), 
        Sector = structure(1L, .Label = "Energy", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = -0.069104950106169, quantile = structure(1L, .Label = "2", class = "factor"), 
            Sector = structure(1L, .Label = "Energy", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = -0.27207135756175, quantile = structure(1L, .Label = "3", class = "factor"), 
            Sector = structure(1L, .Label = "Energy", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
    list(structure(list(Growth = 0.009642535558954, quantile = structure(1L, .Label = "1", class = "factor"), 
        Sector = structure(1L, .Label = "Financials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = 0.0117244867054771, quantile = structure(1L, .Label = "2", class = "factor"), 
            Sector = structure(1L, .Label = "Financials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = -0.185284889832411, quantile = structure(1L, .Label = "3", class = "factor"), 
            Sector = structure(1L, .Label = "Financials", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
    list(structure(list(Growth = 0.239390715659085, quantile = structure(1L, .Label = "1", class = "factor"), 
        Sector = structure(1L, .Label = "Health Care", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = -0.0162271493055311, quantile = structure(1L, .Label = "2", class = "factor"), 
            Sector = structure(1L, .Label = "Health Care", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = -0.067303679327545, quantile = structure(1L, .Label = "3", class = "factor"), 
            Sector = structure(1L, .Label = "Health Care", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
    list(structure(list(Growth = 0.0620349870410483, quantile = structure(1L, .Label = "1", class = "factor"), 
        Sector = structure(1L, .Label = "Industrials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = 0.0821803720980501, quantile = structure(1L, .Label = "2", class = "factor"), 
            Sector = structure(1L, .Label = "Industrials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = 0.137729664907273, quantile = structure(1L, .Label = "3", class = "factor"), 
            Sector = structure(1L, .Label = "Industrials", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
    list(structure(list(Growth = -0.0843930112785794, quantile = structure(1L, .Label = "1", class = "factor"), 
        Sector = structure(1L, .Label = "Information Technology", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = -0.172018997118367, quantile = structure(1L, .Label = "2", class = "factor"), 
            Sector = structure(1L, .Label = "Information Technology", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = -0.298718947065689, quantile = structure(1L, .Label = "3", class = "factor"), 
            Sector = structure(1L, .Label = "Information Technology", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
    list(structure(list(Growth = 0.0170747596874905, quantile = structure(1L, .Label = "1", class = "factor"), 
        Sector = structure(1L, .Label = "Materials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = 0.190415482682349, quantile = structure(1L, .Label = "2", class = "factor"), 
            Sector = structure(1L, .Label = "Materials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = 0.221341415148432, quantile = structure(1L, .Label = "3", class = "factor"), 
            Sector = structure(1L, .Label = "Materials", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
    list(structure(list(Growth = 0.168638361539387, quantile = structure(1L, .Label = "1", class = "factor"), 
        Sector = structure(1L, .Label = "Real Estate", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = 0.0810611988754563, quantile = structure(1L, .Label = "2", class = "factor"), 
            Sector = structure(1L, .Label = "Real Estate", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = 0.0365040437639329, quantile = structure(1L, .Label = "3", class = "factor"), 
            Sector = structure(1L, .Label = "Real Estate", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
    list(structure(list(Growth = 0.111350872628164, quantile = structure(1L, .Label = "1", class = "factor"), 
        Sector = structure(1L, .Label = "Utilities", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = -0.0978660942657028, quantile = structure(1L, .Label = "2", class = "factor"), 
            Sector = structure(1L, .Label = "Utilities", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
        structure(list(Growth = 0.112770511307641, quantile = structure(1L, .Label = "3", class = "factor"), 
            Sector = structure(1L, .Label = "Utilities", class = "factor")), class = "data.frame", row.names = "2002-01-31"))))

Tags: r

Solution


I wrote a recursive Reduce that should do what you want. Please read ?Reduce carefully.

Assume your list of lists is named ll.

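rReduce <- function(x) {
  # rbind the elements of x: plain lists become a list matrix, data frames one data frame
  y <- Reduce("rbind", x)
  if (is.list(y)) {
    # y is still a list (a data frame counts as one), so keep reducing
    return(rReduce(y))
  } else {
    # rbind-ing the columns of a data frame gives a plain matrix, so x is the finished data frame
    return(x)
  }
}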

res <- rReduce(ll)
print(res)
#                   Growth quantile                 Sector
#2002-01-31    0.227833070        1 Consumer Discretionary
#2002-01-311   0.097000461        1       Consumer Staples
#2002-01-312  -0.078767963        1                 Energy
#2002-01-313   0.009642536        1             Financials
#2002-01-314   0.239390716        1            Health Care
#2002-01-315   0.062034987        1            Industrials
#2002-01-316  -0.084393011        1 Information Technology
#...

For a list nested three levels deep, such as ll, this is equivalent to

Reduce("rbind", Reduce("rbind", Reduce("rbind", ll)))

if I am not mistaken. Now you still have the problem with the dates in the row names, but that is easily solved with:

res$Date <- as.Date(substr(rownames(res), 1, 10))
rownames(res) <- NULL
print(res)
#         Growth quantile                 Sector       Date
#1   0.227833070        1 Consumer Discretionary 2002-01-31
#2   0.097000461        1       Consumer Staples 2002-01-31
#3  -0.078767963        1                 Energy 2002-01-31
#4   0.009642536        1             Financials 2002-01-31
#5   0.239390716        1            Health Care 2002-01-31
#6   0.062034987        1            Industrials 2002-01-31
#7  -0.084393011        1 Information Technology 2002-01-31
#8   0.017074760        1              Materials 2002-01-31
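
Once the date is an ordinary column, the grouping asked for in the question (one table per sector and quantile, ordered by date) can be sketched with base R's split. This is only an illustration: the small `res` below is a hand-built stand-in with a few values copied from the question, not the full result.

```r
# Stand-in for the merged result, using values from the question
res <- data.frame(
  Growth   = c(0.2278331, -0.1081433, 0.09700046, 0.00580189434527657),
  quantile = factor(c("1", "1", "1", "2")),
  Sector   = c("Consumer Discretionary", "Consumer Discretionary",
               "Consumer Staples", "Consumer Discretionary"),
  Date     = as.Date(c("2002-01-31", "2003-01-31", "2002-01-31", "2002-01-31"))
)

# Sort by date first so that every group comes out in chronological order
res <- res[order(res$Date), ]

# One data frame per Sector/quantile combination;
# drop = TRUE leaves out combinations that have no rows
groups <- split(res, list(res$Sector, res$quantile), drop = TRUE)

# List names are built as "<Sector>.<quantile>"
print(names(groups))
print(groups[["Consumer Discretionary.1"]])
```

Each element of `groups` is then a regular data frame that can be processed on its own, for example with lapply.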

The data used above is:

ll <- list(list(list(structure(list(Growth = 0.227833070205427, quantile = structure(1L, .Label = "1", class = "factor"), 
                                    Sector = structure(1L, .Label = "Consumer Discretionary", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.00580189434527657, quantile = structure(1L, .Label = "2", class = "factor"), 
                                    Sector = structure(1L, .Label = "Consumer Discretionary", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.280654630370414, quantile = structure(1L, .Label = "3", class = "factor"), 
                                    Sector = structure(1L, .Label = "Consumer Discretionary", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
                list(structure(list(Growth = 0.0970004606893047, quantile = structure(1L, .Label = "1", class = "factor"), 
                                    Sector = structure(1L, .Label = "Consumer Staples", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.054821203483339, quantile = structure(1L, .Label = "2", class = "factor"), 
                                    Sector = structure(1L, .Label = "Consumer Staples", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.00837169953085215, quantile = structure(1L, .Label = "3", class = "factor"), 
                                    Sector = structure(1L, .Label = "Consumer Staples", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
                list(structure(list(Growth = -0.078767963284149, quantile = structure(1L, .Label = "1", class = "factor"), 
                                    Sector = structure(1L, .Label = "Energy", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = -0.069104950106169, quantile = structure(1L, .Label = "2", class = "factor"), 
                                    Sector = structure(1L, .Label = "Energy", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = -0.27207135756175, quantile = structure(1L, .Label = "3", class = "factor"), 
                                    Sector = structure(1L, .Label = "Energy", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
                list(structure(list(Growth = 0.009642535558954, quantile = structure(1L, .Label = "1", class = "factor"), 
                                    Sector = structure(1L, .Label = "Financials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.0117244867054771, quantile = structure(1L, .Label = "2", class = "factor"), 
                                    Sector = structure(1L, .Label = "Financials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = -0.185284889832411, quantile = structure(1L, .Label = "3", class = "factor"), 
                                    Sector = structure(1L, .Label = "Financials", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
                list(structure(list(Growth = 0.239390715659085, quantile = structure(1L, .Label = "1", class = "factor"), 
                                    Sector = structure(1L, .Label = "Health Care", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = -0.0162271493055311, quantile = structure(1L, .Label = "2", class = "factor"), 
                                    Sector = structure(1L, .Label = "Health Care", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = -0.067303679327545, quantile = structure(1L, .Label = "3", class = "factor"), 
                                    Sector = structure(1L, .Label = "Health Care", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
                list(structure(list(Growth = 0.0620349870410483, quantile = structure(1L, .Label = "1", class = "factor"), 
                                    Sector = structure(1L, .Label = "Industrials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.0821803720980501, quantile = structure(1L, .Label = "2", class = "factor"), 
                                    Sector = structure(1L, .Label = "Industrials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.137729664907273, quantile = structure(1L, .Label = "3", class = "factor"), 
                                    Sector = structure(1L, .Label = "Industrials", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
                list(structure(list(Growth = -0.0843930112785794, quantile = structure(1L, .Label = "1", class = "factor"), 
                                    Sector = structure(1L, .Label = "Information Technology", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = -0.172018997118367, quantile = structure(1L, .Label = "2", class = "factor"), 
                                    Sector = structure(1L, .Label = "Information Technology", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = -0.298718947065689, quantile = structure(1L, .Label = "3", class = "factor"), 
                                    Sector = structure(1L, .Label = "Information Technology", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
                list(structure(list(Growth = 0.0170747596874905, quantile = structure(1L, .Label = "1", class = "factor"), 
                                    Sector = structure(1L, .Label = "Materials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.190415482682349, quantile = structure(1L, .Label = "2", class = "factor"), 
                                    Sector = structure(1L, .Label = "Materials", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.221341415148432, quantile = structure(1L, .Label = "3", class = "factor"), 
                                    Sector = structure(1L, .Label = "Materials", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
                list(structure(list(Growth = 0.168638361539387, quantile = structure(1L, .Label = "1", class = "factor"), 
                                    Sector = structure(1L, .Label = "Real Estate", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.0810611988754563, quantile = structure(1L, .Label = "2", class = "factor"), 
                                    Sector = structure(1L, .Label = "Real Estate", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.0365040437639329, quantile = structure(1L, .Label = "3", class = "factor"), 
                                    Sector = structure(1L, .Label = "Real Estate", class = "factor")), class = "data.frame", row.names = "2002-01-31")), 
                list(structure(list(Growth = 0.111350872628164, quantile = structure(1L, .Label = "1", class = "factor"), 
                                    Sector = structure(1L, .Label = "Utilities", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = -0.0978660942657028, quantile = structure(1L, .Label = "2", class = "factor"), 
                                    Sector = structure(1L, .Label = "Utilities", class = "factor")), class = "data.frame", row.names = "2002-01-31"), 
                     structure(list(Growth = 0.112770511307641, quantile = structure(1L, .Label = "3", class = "factor"), 
                                    Sector = structure(1L, .Label = "Utilities", class = "factor")), class = "data.frame", row.names = "2002-01-31"))))
