Merging lists together and creating a new column based on information contained in one of the lists

Problem description

I am trying to create a new column in one of 2 separate lists.

One element of df1 looks like this:

$`c(5, 19)`
$`c(5, 19)`[[1]]
               Feature           Gain         Cover    Frequency
 1:     plaza_eliptica 0.948578681145 0.53759794901 0.2794117647
 2:               wind 0.014083116347 0.10011187610 0.1343137255
 3:               temp 0.011637657812 0.08581378948 0.1460784314
 4:               year 0.006344014204 0.12430881478 0.1137254902
 5:           humidity 0.004941509318 0.03941622272 0.0862745098
 6:          barometer 0.003729098869 0.03750491037 0.0715686275
 7:             season 0.003482740507 0.02015837244 0.0254901961
 8:              month 0.003016223359 0.03462645560 0.0539215686
 9:                day 0.002926824939 0.00525381171 0.0578431373
10:            weekday 0.000644114655 0.01335391670 0.0176470588
11:      week_of_month 0.000587970927 0.00074890364 0.0117647059
12: workday_on_holiday 0.000025880557 0.00107281595 0.0009803922
13:            holiday 0.000002167362 0.00003216151 0.0009803922

The first element of df2 looks like this:

[[1]]
[[1]][[1]]
     date         c_farolillo
[1,] "2016-01-01" "17"       

[[1]][[2]]
     date         c_farolillo
[1,] "2016-01-02" "9"        

[[1]][[3]]
     date         c_farolillo
[1,] "2016-01-03" "8"        

[[1]][[4]]
     date         c_farolillo
[1,] "2016-01-04" "3"        

[[1]][[5]]
     date         c_farolillo
[1,] "2016-01-05" "4"        

[[1]][[6]]
     date         c_farolillo
[1,] "2016-01-06" "4" 

I am trying to take the first element of df2 and create new columns in df1 from it. That is, I want to take this element:

[[1]]
[[1]][[1]]
     date         c_farolillo
[1,] "2016-01-01" "17"       

and add its columns to element 1 of df1, like so:

$`c(5, 19)`
$`c(5, 19)`[[1]]
               Feature           Gain         Cover    Frequency   date       c_farolillo
 1:     plaza_eliptica 0.948578681145 0.53759794901 0.2794117647  2016-01-01      17
 2:               wind 0.014083116347 0.10011187610 0.1343137255  2016-01-01      17
 3:               temp 0.011637657812 0.08581378948 0.1460784314  2016-01-01      17
 4:               year 0.006344014204 0.12430881478 0.1137254902  2016-01-01      17
 5:           humidity 0.004941509318 0.03941622272 0.0862745098  2016-01-01      17
 6:          barometer 0.003729098869 0.03750491037 0.0715686275  2016-01-01      17
 7:             season 0.003482740507 0.02015837244 0.0254901961  2016-01-01      17
 8:              month 0.003016223359 0.03462645560 0.0539215686  2016-01-01      17
 9:                day 0.002926824939 0.00525381171 0.0578431373  2016-01-01      17
10:            weekday 0.000644114655 0.01335391670 0.0176470588  2016-01-01      17
11:      week_of_month 0.000587970927 0.00074890364 0.0117647059  2016-01-01      17
12: workday_on_holiday 0.000025880557 0.00107281595 0.0009803922  2016-01-01      17
13:            holiday 0.000002167362 0.00003216151 0.0009803922  2016-01-01      17

Then take element 2 of df2:

[[1]][[2]]
     date         c_farolillo
[1,] "2016-01-02" "9"        

and do the same thing, but for element 2 of df1.

$`c(5, 19)`[[2]]
               Feature          Gain        Cover   Frequency   date       c_farolillo
 1:     plaza_eliptica 0.95025739085 0.5490795291 0.283433134  2016-01-02       9
 2:               temp 0.01236820897 0.0832973356 0.150698603  2016-01-02       9
 3:               wind 0.01196041617 0.0895609496 0.125748503  2016-01-02       9
 4:               year 0.00604315510 0.1158975396 0.112774451  2016-01-02       9
 5:             season 0.00511480982 0.0173938219 0.027944112  2016-01-02       9
 6:           humidity 0.00500999458 0.0578155014 0.086826347  2016-01-02       9
 7:          barometer 0.00325812831 0.0340156062 0.071856287  2016-01-02       9
 8:              month 0.00323898173 0.0354103288 0.062874251  2016-01-02       9
 9:                day 0.00220665600 0.0067323511 0.058882236  2016-01-02       9
10:            weekday 0.00050300478 0.0103857430 0.014970060  2016-01-02       9
11: workday_on_holiday 0.00001964502 0.0002228799 0.000998004  2016-01-02       9
12:      week_of_month 0.00001960867 0.0001884139 0.002994012  2016-01-02       9

The goal is for all 3 lists in each of the 2 main lists to contain the merged data. That is, for list 3, element 3 of df1 would be merged with element 3 of df2:

[[2]][[3]]
     date         pza_del_carmen
[1,] "2016-01-03" "10" 

$`c(7, 1, 2, 18)`[[3]]
               Feature          Gain        Cover    Frequency   date        pza_del_carmen
 1:      pza_de_espana 0.75620312440 0.2776437590 0.1729106628  2016-01-03      10
 2:             retiro 0.21115176179 0.2195341962 0.1498559078  2016-01-03      10
 3:   escuelas_aguirre 0.01304161322 0.0993235815 0.0970220941  2016-01-03      10
 4:               wind 0.00497255534 0.0963420148 0.1123919308  2016-01-03      10
 5:               temp 0.00490558802 0.1068475040 0.1585014409  2016-01-03      10
 6:          barometer 0.00356537931 0.0580338186 0.0787704131  2016-01-03      10
 7:           humidity 0.00201778550 0.0233865914 0.0672430355  2016-01-03      10
 8:               year 0.00177749645 0.0517034409 0.0547550432  2016-01-03      10
 9:                day 0.00086333491 0.0048338563 0.0509125841  2016-01-03      10
10:              month 0.00047874430 0.0234141348 0.0211335255  2016-01-03      10
11:            weekday 0.00040798584 0.0168542292 0.0144092219  2016-01-03      10
12:      week_of_month 0.00032152928 0.0032202756 0.0105667627  2016-01-03      10
13:             season 0.00026657228 0.0186674991 0.0105667627  2016-01-03      10
14: weekend_on_holiday 0.00002652936 0.0001950987 0.0009606148  2016-01-03      10

Data 1:

Edit:

The data is the same as the Info_assessment data in list2.

Edit:

New data:

    list1 <- list(`c(5, 19)` = list(structure(list(Feature = c("plaza_eliptica", 
"wind", "temp", "year", "humidity", "barometer", "season", "month", 
"day", "weekday", "week_of_month", "workday_on_holiday", "holiday"
), Gain = c(0.948578681144529, 0.0140831163472628, 0.0116376578118342, 
0.00634401420383024, 0.0049415093180091, 0.00372909886882749, 
0.00348274050673969, 0.00301622335931412, 0.00292682493887959, 
0.000644114654618996, 0.000587970926885777, 0.0000258805573006903, 
0.00000216736196828243), Cover = c(0.537597949014824, 0.100111876095501, 
0.0858137894753769, 0.12430881477959, 0.0394162227230228, 0.0375049103727748, 
0.0201583724440218, 0.034626455595298, 0.005253811712761, 0.0133539166971052, 
0.000748903637236591, 0.00107281594659352, 0.000032161505893596
), Frequency = c(0.279411764705882, 0.134313725490196, 0.146078431372549, 
0.113725490196078, 0.0862745098039216, 0.0715686274509804, 0.0254901960784314, 
0.053921568627451, 0.057843137254902, 0.0176470588235294, 0.0117647058823529, 
0.000980392156862745, 0.000980392156862745)), row.names = c(NA, 
-13L), class = c("data.table", "data.frame")), 
    structure(list(Feature = c("plaza_eliptica", "temp", "wind", 
    "year", "season", "humidity", "barometer", "month", "day", 
    "weekday", "workday_on_holiday", "week_of_month"), Gain = c(0.950257390847805, 
    0.0123682089682263, 0.0119604161685161, 0.00604315510198456, 
    0.00511480981946408, 0.00500999457778123, 0.00325812831159771, 
    0.00323898173138714, 0.00220665599964529, 0.000503004779421417, 
    0.0000196450237473069, 0.0000196086704235932), Cover = c(0.549079529057103, 
    0.0832973355514094, 0.0895609496061689, 0.115897539589901, 
    0.0173938218615296, 0.0578155014108067, 0.0340156061873294, 
    0.0354103287593173, 0.00673235113002399, 0.0103857430401735, 
    0.000222879883826733, 0.000188413922410228), Frequency = c(0.283433133732535, 
    0.150698602794411, 0.125748502994012, 0.112774451097804, 
    0.0279441117764471, 0.0868263473053892, 0.0718562874251497, 
    0.062874251497006, 0.0588822355289421, 0.0149700598802395, 
    0.000998003992015968, 0.0029940119760479)), row.names = c(NA, 
    -12L), class = c("data.table", "data.frame"))), 
    `c(7, 1, 2, 18)` = list(structure(list(Feature = c("pza_de_espana", 
    "retiro", "escuelas_aguirre", "wind", "temp", "barometer", 
    "humidity", "year", "day", "month", "weekday", "week_of_month", 
    "season", "weekend_on_holiday"), Gain = c(0.762835844259031, 
    0.205459059740918, 0.0130315791677542, 0.0045078890564497, 
    0.00444974962904841, 0.00339293826829134, 0.00189508238873358, 
    0.00187978643588582, 0.00100750177875752, 0.000538521180289064, 
    0.000402209068457385, 0.000300268511436018, 0.0002522936065142, 
    0.0000472769084332351), Cover = c(0.293886140204331, 0.227015081557907, 
    0.0916711798129263, 0.0951455374927713, 0.102043766809557, 
    0.0520602895145079, 0.0284397058958519, 0.0521635564204478, 
    0.00571869176893915, 0.0177917404833809, 0.0133466738877007, 
    0.00350419034156103, 0.0168233263876777, 0.000390119422439669
    ), Frequency = c(0.175908221797323, 0.154875717017208, 0.101338432122371, 
    0.111854684512428, 0.136711281070746, 0.0736137667304015, 
    0.0678776290630975, 0.0583173996175908, 0.0506692160611855, 
    0.0248565965583174, 0.0181644359464627, 0.011472275334608, 
    0.0124282982791587, 0.00191204588910134)), row.names = c(NA, 
    -14L), class = c("data.table", "data.frame")), 
        structure(list(Feature = c("pza_de_espana", "retiro", 
        "escuelas_aguirre", "wind", "temp", "barometer", "humidity", 
        "year", "day", "month", "weekday", "week_of_month", "season", 
        "weekend_on_holiday"), Gain = c(0.762803211914528, 0.205468329058334, 
        0.0130409115957786, 0.00452752500477332, 0.0044356364989903, 
        0.00339296767537418, 0.00189335858734865, 0.00188130563208465, 
        0.00100789616848372, 0.000544801039619805, 0.000402657864212878, 
        0.000301123014739554, 0.000252994653373448, 0.0000472812923590283
        ), Cover = c(0.293865486823143, 0.227012786737776, 0.0915793870076463, 
        0.0953268282831992, 0.101871655299658, 0.0520671739749038, 
        0.0284167576945319, 0.0521819149815037, 0.00571869176893915, 
        0.0179064814899808, 0.0133374946071727, 0.00349960070129703, 
        0.0168256212078097, 0.000390119422439669), Frequency = c(0.175908221797323, 
        0.154875717017208, 0.10038240917782, 0.112810707456979, 
        0.135755258126195, 0.0736137667304015, 0.0678776290630975, 
        0.0583173996175908, 0.0506692160611855, 0.0258126195028681, 
        0.0181644359464627, 0.011472275334608, 0.0124282982791587, 
        0.00191204588910134)), row.names = c(NA, -14L), class = c("data.table", 
        "data.frame"))))

New data 2:

list2 <- list(`c(5, 19)` = list(Info_assessment = list(structure(c("2016-01-01", 
"17"), .Dim = 1:2, .Dimnames = list(NULL, c("date", "c_farolillo"
))), structure(c("2016-01-02", "9"), .Dim = 1:2, .Dimnames = list(
    NULL, c("date", "c_farolillo")))), X_test = list(structure(c(1, 
1, 2016, 1, 1, 1, 0, 1, 1, 1, 0, 1, 52.2692307692308, 5.46153846153846, 
84.9615384615385, 30.1315384615385, 25), .Dim = c(1L, 17L), .Dimnames = list(
    NULL, c("day", "month", "year", "quarter", "semester", "weekday", 
    "weekend", "season", "holiday", "workday_on_holiday", "weekend_on_holiday", 
    "week_of_month", "temp", "wind", "humidity", "barometer", 
    "plaza_eliptica"))), structure(c(2, 1, 2016, 1, 1, 0, 1, 
1, 0, 0, 0, 1, 47.7307692307692, 10.4230769230769, 77.0769230769231, 
30.1834615384615, 29), .Dim = c(1L, 17L), .Dimnames = list(NULL, 
    c("day", "month", "year", "quarter", "semester", "weekday", 
    "weekend", "season", "holiday", "workday_on_holiday", "weekend_on_holiday", 
    "week_of_month", "temp", "wind", "humidity", "barometer", 
    "plaza_eliptica")))), Y_test = list(structure(17, .Dim = c(1L, 
1L), .Dimnames = list(NULL, "c_farolillo")), structure(9, .Dim = c(1L, 
1L), .Dimnames = list(NULL, "c_farolillo")))), `c(7, 1, 2, 18)` = list(
    Info_assessment = list(structure(c("2016-01-01", "12"), .Dim = 1:2, .Dimnames = list(
        NULL, c("date", "pza_del_carmen"))), structure(c("2016-01-02", 
    "10"), .Dim = 1:2, .Dimnames = list(NULL, c("date", "pza_del_carmen"
    )))), X_test = list(structure(c(1, 1, 2016, 1, 1, 1, 0, 1, 
    1, 1, 0, 1, 52.2692307692308, 5.46153846153846, 84.9615384615385, 
    30.1315384615385, 28, 17, 6), .Dim = c(1L, 19L), .Dimnames = list(
        NULL, c("day", "month", "year", "quarter", "semester", 
        "weekday", "weekend", "season", "holiday", "workday_on_holiday", 
        "weekend_on_holiday", "week_of_month", "temp", "wind", 
        "humidity", "barometer", "pza_de_espana", "escuelas_aguirre", 
        "retiro"))), structure(c(2, 1, 2016, 1, 1, 0, 1, 1, 0, 
    0, 0, 1, 47.7307692307692, 10.4230769230769, 77.0769230769231, 
    30.1834615384615, 21, 24, 5), .Dim = c(1L, 19L), .Dimnames = list(
        NULL, c("day", "month", "year", "quarter", "semester", 
        "weekday", "weekend", "season", "holiday", "workday_on_holiday", 
        "weekend_on_holiday", "week_of_month", "temp", "wind", 
        "humidity", "barometer", "pza_de_espana", "escuelas_aguirre", 
        "retiro")))), Y_test = list(structure(12, .Dim = c(1L, 
    1L), .Dimnames = list(NULL, "pza_del_carmen")), structure(10, .Dim = c(1L, 
    1L), .Dimnames = list(NULL, "pza_del_carmen")))))

Tags: r

Solution


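Both objects are lists of lists, so we can use Map over the corresponding top-level elements and a second Map over the inner lists. The Info_assessment element of each list2 entry is extracted first with lapply.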

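library(data.table)
# Outer Map walks the two top-level lists in parallel; inner Map pairs each table with its 1 x 2 matrix.
Map(function(lst1, lst2) Map(function(dat1, dat2) dat1[, 
     colnames(dat2) := .(dat2[1], dat2[2])][], lst1, lst2), list1,
       lapply(list2, `[[`, "Info_assessment") )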
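
To make the nested Map pattern easier to try out, here is a minimal sketch on toy data shaped like the question's lists. The names toy1, toy2, info, assessments and merged are hypothetical and not part of the original post. The sketch also makes two choices the answer does not: it works on a copy() of each table so the inputs are left unmodified, and it wraps the column names in parentheses on the left of :=, which is the documented way to pass a character vector of names.

```r
library(data.table)

# Toy stand-ins for list1 (hypothetical data, same nesting as in the question).
toy1 <- list(
  a = list(data.table(Feature = c("x", "y"), Gain = c(0.9, 0.1)),
           data.table(Feature = c("y", "x"), Gain = c(0.6, 0.4))),
  b = list(data.table(Feature = c("u", "v", "w"), Gain = c(0.5, 0.3, 0.2)),
           data.table(Feature = c("u", "w"), Gain = c(0.7, 0.3)))
)

# Hypothetical helper: builds a 1 x 2 character matrix like the
# Info_assessment entries in the question.
info <- function(date, value, name) {
  matrix(c(date, value), nrow = 1, dimnames = list(NULL, c("date", name)))
}

# Toy stand-in for list2.
toy2 <- list(
  a = list(Info_assessment = list(info("2016-01-01", "17", "c_farolillo"),
                                  info("2016-01-02", "9", "c_farolillo"))),
  b = list(Info_assessment = list(info("2016-01-01", "12", "pza_del_carmen"),
                                  info("2016-01-02", "10", "pza_del_carmen")))
)

# Keep only the Info_assessment element of each top-level entry.
assessments <- lapply(toy2, `[[`, "Info_assessment")

# Map pairs elements by position, so both structures must line up.
stopifnot(identical(names(toy1), names(assessments)),
          identical(lengths(toy1), lengths(assessments)))

merged <- Map(function(tables, infos) {
  Map(function(dt, m) {
    out <- copy(dt)                          # leave the input table untouched
    out[, (colnames(m)) := as.list(m[1, ])]  # single row is recycled down the table
    out[]
  }, tables, infos)
}, toy1, assessments)

merged$a[[1]]
```

Because the values come from a character matrix, both new columns are character. Converting them afterwards (for example with as.Date and as.numeric) is left as a follow-up step.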
