Extract data based on the values in between

Problem description

I have some data that looks like this:

          INFLATION.EXPECTATIONS..YEAR.ON.YEAR.CHANGE.IN.HICP          X   X.1        X.2        X.3         X.4         X.5         X.6         X.7         X.8
1                                               TARGET_PERIOD FCT_SOURCE POINT       T0_0   F0_0T0_4    F0_5T0_9    F1_0T1_4    F1_5T1_9    F2_0T2_4    F2_5T2_9
2                                                        1999          1     1                                20          70          10                        
3                                                        1999          2     1                    10          30          60                                    
4                                                        1999          3    .8                    20          50          30                                    
5                                                        1999          4   1.2                                40          60                                    
395                                                   2003Dec         93                                                                                        
396                                                   2003Dec         94   1.9                                            20          50          30            
397                                                   2003Dec         95   1.5 4.95049505 8.91089109 15.84158416 20.79207921 20.79207921 15.84158416  8.91089109
398                                                                                                                                                             
399  CORE INFLATION EXPECTATIONS; YEAR-ON-YEAR CHANGE IN CORE                                                                                                   
400                                                                                                                                                             
401                                                                                                                                                             
402      GROWTH EXPECTATIONS; YEAR-ON-YEAR CHANGE IN REAL GDP                                                                                                   
403                                             TARGET_PERIOD FCT_SOURCE POINT       T0_0   F0_0T0_4    F0_5T0_9    F1_0T1_4    F1_5T1_9    F2_0T2_4    F2_5T2_9
404                                                      1999          1     2                                                        43          47          10
405                                                      1999          2   2.2                                                        50          50            
406                                                      1999          3   1.9                                                        70          30            
407                                                      1999          4   2.4                                                        20          60          20
797                                                    2003Q4         93                                                                                        
798                                                    2003Q4         94   2.5                                                        10          30          50
799                                                    2003Q4         95   2.5 2.97029703 4.95049505  4.95049505 11.38613861 11.38613861 14.85148515 14.85148515
800                                                                                                                                                             
801    EXPECTED UNEMPLOYMENT RATE; PERCENTAGE OF LABOUR FORCE                                                                                                   
802                                             TARGET_PERIOD FCT_SOURCE POINT       T9_0   F9_0T9_4    F9_5T9_9  F10_0T10_4  F10_5T10_9  F11_0T11_4  F11_5T11_9
803                                                      1999          1  10.2                                 0 33.33333333 55.55555556 11.11111111            
804                                                      1999          2    10                                40          60                                    
805                                                      1999          3  10.7                                            10          80          10            
1198                                                  2003Dec         95   9.5         26       24.5        24.5        11.5        11.5           1           1

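The .csv file contains several datasets in a single file.

  1. How can I filter the data so that I get a data frame containing the rows from INFLATION.EXPECTATIONS to CORE INFLATION, and also the rows from GROWTH EXPECTATIONS to EXPECTED UNEMPLOYMENT RATE?
  2. Then use something like janitor::row_to_names(myData, row_number = 2) to replace the column names.

I have a list of data frames of this kind, and I would like to apply the function to every element of the list, extracting the relevant information from each one.

Data: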

    myData <- structure(list(V1 = structure(c(17L, 18L, 2L, 2L, 2L, 11L, 11L, 
11L, 11L, 1L, 14L, 1L, 1L, 16L, 18L, 2L, 2L, 2L, 12L, 12L, 12L, 
12L, 1L, 15L, 18L, 2L, 2L, 11L, 11L, 1L, 13L), .Label = c("", 
"1999", "1999Dec", "1999Nov", "1999Q3", "2000", "2000Dec", "2000Nov", 
"2000Q3", "2003", "2003Dec", "2003Q4", "ASSUMPTIONS", "CORE INFLATION EXPECTATIONS; YEAR-ON-YEAR CHANGE IN CORE", 
"EXPECTED UNEMPLOYMENT RATE; PERCENTAGE OF LABOUR FORCE", "GROWTH EXPECTATIONS; YEAR-ON-YEAR CHANGE IN REAL GDP", 
"INFLATION EXPECTATIONS; YEAR-ON-YEAR CHANGE IN HICP", "TARGET_PERIOD"
), class = "factor"), V2 = structure(c(1L, 68L, 2L, 10L, 17L, 
64L, 65L, 66L, 67L, 1L, 1L, 1L, 1L, 1L, 68L, 2L, 10L, 17L, 64L, 
65L, 66L, 67L, 1L, 1L, 68L, 2L, 10L, 66L, 67L, 1L, 1L), .Label = c("", 
"1", "10", "11", "14", "16", "17", "18", "19", "2", "20", "23", 
"24", "26", "28", "29", "3", "31", "32", "33", "34", "35", "36", 
"37", "38", "39", "4", "40", "43", "45", "46", "47", "5", "50", 
"52", "53", "54", "55", "56", "59", "6", "60", "61", "62", "63", 
"64", "65", "67", "68", "7", "70", "71", "72", "73", "76", "85", 
"86", "87", "88", "89", "9", "90", "91", "92", "93", "94", "95", 
"FCT_SOURCE"), class = "factor"), V3 = structure(c(1L, 96L, 16L, 
16L, 13L, 62L, 1L, 28L, 23L, 1L, 1L, 1L, 1L, 1L, 96L, 55L, 59L, 
28L, 55L, 1L, 67L, 67L, 1L, 1L, 96L, 31L, 29L, 34L, 91L, 1L, 
1L), .Label = c("", ".25", ".28", ".3", ".4", ".5", ".53", ".6", 
".64", ".68", ".7", ".71", ".8", ".9", ".97", "1", "1.1", "1.2", 
"1.3", "1.33", "1.38", "1.4", "1.5", "1.53", "1.6", "1.7", "1.8", 
"1.9", "10", "10.1", "10.2", "10.3", "10.4", "10.5", "10.6", 
"10.7", "10.73", "10.78", "10.8", "10.9", "11", "11.02", "11.1", 
"11.16", "11.2", "11.22", "11.24", "11.3", "11.4", "11.5", "11.6", 
"11.7", "11.9", "12", "2", "2.02", "2.04", "2.1", "2.2", "2.25", 
"2.26", "2.3", "2.31", "2.36", "2.37", "2.4", "2.5", "2.6", "2.7", 
"2.75", "2.8", "2.9", "3", "3.1", "3.2", "7.7", "7.8", "8", "8.1", 
"8.3", "8.5", "8.6", "8.75", "8.8", "9", "9.2", "9.3", "9.39", 
"9.4", "9.43", "9.5", "9.6", "9.7", "9.8", "9.9", "POINT"), class = "factor"), 
    V4 = structure(c(1L, 57L, 1L, 1L, 1L, 1L, 1L, 1L, 43L, 1L, 
    1L, 1L, 1L, 1L, 57L, 1L, 1L, 1L, 1L, 1L, 1L, 29L, 1L, 1L, 
    58L, 1L, 1L, 1L, 32L, 1L, 1L), .Label = c("", ".1001001", 
    ".2004008", ".29910269", ".69375619", ".7", ".999001", "0", 
    "1", "1.01010101", "1.1", "1.18694362", "1.2", "1.3986014", 
    "1.49700599", "1.6", "1.98019802", "1.998002", "10", "11", 
    "11.11111111", "15", "16", "17", "19", "2", "2.02020202", 
    "2.3", "2.97029703", "20", "21", "26", "3", "3.5", "3.5892323", 
    "3.96039604", "30", "37", "37.9", "4", "4.54545455", "4.60921844", 
    "4.95049505", "40", "5", "5.05050505", "5.08982036", "5.09490509", 
    "5.5", "50", "6", "60", "70", "8", "9", "9.8", "T0_0", "T9_0"
    ), class = "factor"), V5 = structure(c(1L, 86L, 1L, 11L, 
    28L, 1L, 1L, 1L, 84L, 1L, 1L, 1L, 1L, 1L, 86L, 1L, 1L, 1L, 
    1L, 1L, 1L, 54L, 1L, 1L, 87L, 1L, 1L, 11L, 33L, 1L, 1L), .Label = c("", 
    ".5", ".6006006", ".82378943", ".99009901", "0", "1", "1.01010101", 
    "1.2012012", "1.29611167", "10", "10.37067461", "11", "12", 
    "13", "14", "15", "17", "18.28171828", "18.8", "19", "2", 
    "2.00400802", "2.27948464", "2.52525253", "2.8", "2.97029703", 
    "20", "21.21212121", "22", "22.22222222", "23", "24.5", "25", 
    "26", "26.9", "28", "29", "29.9", "3", "3.03030303", "3.26409496", 
    "3.35353113", "3.7", "30", "34.34343434", "35", "4", "4.5", 
    "4.60921844", "4.7", "4.76602183", "4.8951049", "4.95049505", 
    "40", "40.59405941", "43.56435644", "45", "5", "5.26315789", 
    "5.5", "5.55555556", "5.69430569", "5.94059406", "50", "6", 
    "6.08175474", "6.1", "6.2", "6.41751201", "6.5277921", "6.78642715", 
    "60", "68", "7", "7.14360752", "7.19280719", "7.92079208", 
    "75", "8", "8.08080808", "8.4", "8.68263473", "8.91089109", 
    "9.09090909", "F0_0T0_4", "F9_0T9_4"), class = "factor"), 
    V6 = structure(c(1L, 103L, 41L, 61L, 81L, 1L, 1L, 1L, 26L, 
    1L, 1L, 1L, 1L, 1L, 103L, 1L, 1L, 1L, 1L, 1L, 1L, 73L, 1L, 
    1L, 104L, 5L, 74L, 8L, 49L, 1L, 1L), .Label = c("", ".5", 
    ".5988024", ".98", "0", "1.01010101", "1.93814019", "10", 
    "10.22044088", "11", "11.46560319", "11.47704591", "12", 
    "12.05599666", "12.68731269", "13", "13.31830865", "13.40105912", 
    "14", "14.1", "14.2", "14.61461461", "14.6609717", "14.8", 
    "15", "15.84158416", "16", "16.16161616", "17", "17.08291708", 
    "17.22870049", "17.3", "18", "18.3046303", "18.58141858", 
    "19", "2", "2.4024024", "2.52525253", "2.97029703", "20", 
    "21.05263158", "21.3", "22", "22.22222222", "22.9", "23", 
    "24", "24.5", "25", "25.64870259", "26", "27.1", "28", "29", 
    "29.29292929", "3", "3.0223655", "3.48953141", "3.76610505", 
    "30", "32.1", "32.57007906", "33.86613387", "35", "36.36363636", 
    "38.38383838", "39", "4", "4.45103858", "4.5", "4.70753205", 
    "4.95049505", "40", "45", "48.51485149", "49.49494949", "5", 
    "5.55555556", "5.94059406", "50", "51.48514851", "55", "6", 
    "6.06060606", "60", "7", "7.29355033", "7.3", "7.51503006", 
    "7.92079208", "70", "73", "75", "8", "8.80903491", "80", 
    "9", "9.09090909", "9.4", "90", "95", "F0_5T0_9", "F9_5T9_9"
    ), class = "factor"), V7 = structure(c(1L, 112L, 102L, 95L, 
    66L, 1L, 1L, 35L, 36L, 1L, 1L, 1L, 1L, 1L, 112L, 1L, 1L, 
    1L, 1L, 1L, 1L, 7L, 1L, 1L, 113L, 72L, 95L, 66L, 8L, 1L, 
    1L), .Label = c("", "10", "10.2", "10.92184369", "11", "11.11111111", 
    "11.38613861", "11.5", "12", "12.5", "12.76180698", "12.87128713", 
    "13", "13.13131313", "13.86138614", "14", "14.64646465", 
    "14.83150891", "15", "15.78947368", "15.84158416", "15.86826347", 
    "16.5", "16.94915254", "17.82178218", "18", "18.18181818", 
    "18.4", "18.5", "19", "2", "2.02020202", "2.5", "2.97029703", 
    "20", "20.79207921", "21", "21.05263158", "21.42018153", 
    "22.22222222", "22.4", "22.74937858", "23.17682318", "23.23232323", 
    "23.4", "23.5", "23.87612388", "23.9134977", "24", "24.24242424", 
    "24.27", "24.76100431", "24.8", "25", "25.8", "26", "26.26262626", 
    "26.50190878", "26.55310621", "26.6", "27", "27.87460039", 
    "28.2", "28.97102897", "3", "30", "30.5", "31", "31.36863137", 
    "32", "33", "33.33333333", "34.83033932", "35", "36.36363636", 
    "37.04692474", "4", "4.19161677", "40", "40.1", "42.5", "44.04404404", 
    "44.44444444", "45", "45.29246795", "45.45454545", "47.36842105", 
    "5", "5.55555556", "5.7", "50", "55", "6", "6.49728861", 
    "60", "65", "7.50750751", "7.63131814", "7.91295747", "7.91859762", 
    "7.92079208", "70", "75", "8.47457627", "8.6", "80", "9", 
    "9.09090909", "9.9009901", "90", "95", "F1_0T1_4", "F10_0T10_4"
    ), class = "factor"), V8 = structure(c(1L, 109L, 4L, 1L, 
    1L, 1L, 1L, 93L, 49L, 1L, 1L, 1L, 1L, 1L, 109L, 89L, 93L, 
    102L, 1L, 1L, 4L, 8L, 1L, 1L, 110L, 96L, 1L, 77L, 10L, 1L, 
    1L), .Label = c("", ".99009901", "1.2", "10", "100", "11", 
    "11.11111111", "11.38613861", "11.48851149", "11.5", "11.96834817", 
    "12", "12.12121212", "12.38850347", "12.5", "12.87128713", 
    "12.9", "13", "13.7", "14.12825651", "14.64646465", "14.93271741", 
    "15", "15.34354221", "15.46906188", "15.55333998", "15.84158416", 
    "15.9154325", "16", "16.01601602", "16.30390144", "16.4", 
    "16.5", "16.78817127", "16.83168317", "17", "17.82178218", 
    "17.96407186", "18", "18.18181818", "18.2", "18.5", "18.81188119", 
    "19.64107677", "2", "2.5", "20", "20.1", "20.79207921", "21", 
    "21.05263158", "21.77822178", "21.78217822", "21.9", "21.95608782", 
    "23", "23.16070475", "23.23232323", "23.3", "23.5", "24", 
    "24.1", "24.24242424", "24.39331863", "25", "25.5", "25.6", 
    "26", "26.02866866", "26.46709175", "26.7", "27.27272727", 
    "27.47252747", "29.07438362", "3", "3.03030303", "30", "30.5", 
    "31", "31.3", "33.33333333", "33.43343343", "34.76953908", 
    "35", "36.84210526", "38.29616235", "40", "42.5", "43", "45", 
    "45.29246795", "5", "50", "52.63157895", "55", "55.55555556", 
    "58.89", "6", "60", "65", "7.07070707", "70", "75", "8", 
    "80", "9.09090909", "90", "95", "F1_5T1_9", "F10_5T10_9"), class = "factor"), 
    V9 = structure(c(1L, 115L, 1L, 1L, 1L, 1L, 1L, 78L, 25L, 
    1L, 1L, 1L, 1L, 1L, 115L, 91L, 96L, 78L, 1L, 1L, 78L, 17L, 
    1L, 1L, 114L, 11L, 1L, 7L, 5L, 1L, 1L), .Label = c("", ".2", 
    ".3", "0", "1", "1.01010101", "10", "10.8", "100", "11", 
    "11.11111111", "12", "12.48751249", "13", "14", "14.8", "14.85148515", 
    "14.9118284", "15", "15.18481518", "15.23244313", "15.3", 
    "15.34653465", "15.48", "15.84158416", "16", "16.16161616", 
    "16.23246493", "16.45193261", "16.46181322", "16.46276025", 
    "16.5", "16.56686627", "17", "17.5", "17.6", "17.8", "17.84646062", 
    "18", "18.18181818", "18.2", "18.35728953", "18.53710625", 
    "18.68686869", "19", "19.19191919", "19.8", "2", "2.0979021", 
    "2.2", "20", "20.07992008", "20.64128257", "20.79207921", 
    "21", "21.03688933", "21.0958608", "21.68825742", "22.01289543", 
    "22.05143661", "22.22222222", "22.72727273", "23.42342342", 
    "23.49869452", "23.76237624", "24", "24.8", "25", "26.31578947", 
    "27", "27.27272727", "27.72277228", "28", "29", "29.74051896", 
    "3.00852244", "3.03030303", "30", "33.33333333", "33.66336634", 
    "35", "36.36363636", "38.1", "38.88888889", "4", "4.5", "4.54545455", 
    "4.70753205", "40", "45", "47", "47.5", "5", "5.26315789", 
    "5.5", "50", "53.53535354", "55", "6", "6.38722555", "6.40640641", 
    "6.5", "60", "65", "7.5", "70", "75", "8", "8.5", "80", "9", 
    "90", "95", "F11_0T11_4", "F2_0T2_4"), class = "factor"), 
    V10 = structure(c(1L, 95L, 1L, 1L, 1L, 1L, 1L, 1L, 89L, 1L, 
    1L, 1L, 1L, 1L, 95L, 11L, 1L, 1L, 1L, 1L, 72L, 20L, 1L, 1L, 
    94L, 1L, 1L, 11L, 9L, 1L, 1L), .Label = c("", ".1", ".1998002", 
    ".21562789", ".3003003", ".38", ".89820359", "0", "1", "1.4", 
    "10", "10.98772023", "11", "11.52058792", "12", "12.4750499", 
    "12.66201396", "12.8", "13", "14.85148515", "15", "15.34653465", 
    "15.7", "15.84158416", "16", "16.32047478", "16.4", "16.41983258", 
    "16.5", "16.73346693", "17.5", "17.82178218", "17.839445", 
    "18.21355236", "18.68686869", "19.19191919", "2", "2.997003", 
    "20", "20.2020202", "21", "21.03688933", "21.05263158", "21.78217822", 
    "23.38303445", "23.42342342", "23.76237624", "25", "25.34645511", 
    "26.4", "27", "27.27272727", "27.77777778", "29.74051896", 
    "3", "3.9", "30", "33", "33.33333333", "33.66336634", "35", 
    "36.36363636", "4.2", "4.5", "40", "45", "47.5", "5", "5.0331525", 
    "5.11022044", "5.22842116", "50", "54.54545455", "6", "6.4", 
    "6.5", "60", "7.4", "7.5", "7.60584095", "70", "75", "8", 
    "8.08080808", "8.09190809", "8.19180819", "8.4", "8.5", "8.91089109", 
    "80", "9", "9.91433347", "90", "F11_5T11_9", "F2_5T2_9"), class = "factor"), 
    V11 = structure(c(1L, 70L, 1L, 1L, 1L, 1L, 1L, 1L, 44L, 1L, 
    1L, 1L, 1L, 1L, 70L, 1L, 1L, 1L, 1L, 1L, 13L, 18L, 1L, 1L, 
    69L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", ".0998004", 
    ".3", ".3996004", ".5", ".501002", ".99009901", "1", "1.01010101", 
    "1.12405899", "1.4", "1.998002", "10", "10.89108911", "11", 
    "11.11111111", "11.2", "11.38613861", "11.88118812", "12", 
    "12.12121212", "12.9258517", "13", "13.1", "13.13131313", 
    "14", "14.73788328", "15", "15.46906188", "15.55333998", 
    "15.75817641", "15.77591758", "15.84158416", "15.93429158", 
    "16.01601602", "17", "18", "18.37598227", "18.72866037", 
    "2", "2.02020202", "2.40571489", "2.5", "2.97029703", "20", 
    "22.72727273", "25", "26", "3", "3.1", "3.6", "30", "4", 
    "4.7952048", "4.81580352", "5", "5.94059406", "6", "6.56565657", 
    "7.07070707", "7.07876371", "7.3", "7.68463074", "70", "8", 
    "9", "9.09090909", "9.9009901", "F12_0", "F3_0T3_4"), class = "factor"), 
    V12 = structure(c(1L, 47L, 1L, 1L, 1L, 1L, 1L, 1L, 8L, 1L, 
    1L, 1L, 1L, 1L, 48L, 1L, 1L, 1L, 1L, 1L, 1L, 14L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", ".2", ".2997003", 
    ".3", ".6", ".8", ".9", ".99009901", "1", "1.01010101", "10", 
    "11", "11.17705242", "11.38613861", "11.39742319", "12", 
    "13.13131313", "15", "2", "2.5", "2.97029703", "20", "3", 
    "30", "4", "4.19161677", "4.68594217", "5", "5.25", "5.94059406", 
    "6", "6.08782435", "6.16232465", "6.56565657", "7.07070707", 
    "7.27623954", "7.50750751", "7.9", "7.92079208", "8", "8.47457627", 
    "9.05804378", "9.09090909", "9.6201232", "9.9009901", "9.94358251", 
    "F3_5", "F3_5T3_9"), class = "factor"), V13 = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 26L, 
    1L, 1L, 1L, 1L, 1L, 1L, 12L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L), .Label = c("", ".3", ".5988024", ".99009901", "0", 
    "1", "1.01010101", "1.98019802", "10", "11", "11.7938553", 
    "11.88118812", "13.7487636", "2", "3.003003", "4.78564307", 
    "5", "5.25", "6.16232465", "7", "7.9", "7.92079208", "8.08080808", 
    "9", "9.09090909", "F4_0"), class = "factor")), row.names = c(1L, 
2L, 3L, 4L, 5L, 395L, 396L, 397L, 398L, 399L, 400L, 401L, 402L, 
403L, 404L, 405L, 406L, 407L, 797L, 798L, 799L, 800L, 801L, 802L, 
803L, 804L, 805L, 1198L, 1199L, 1200L, 1201L), class = "data.frame")

Tags: r

Solution


We can use grep to find the row index of each marker, then use : or seq to build the sequence of row indices between them:

    library(dplyr)

    # grep() returns the row index of each marker; `:` spans the rows in between
    myData %>%
      slice(c(grep('^INFLATION EXPECTATIONS', V1):grep('CORE INFLATION', V1),
              grep('GROWTH EXPECTATIONS', V1):grep('EXPECTED UNEMPLOYMENT RATE', V1)))
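
The index arithmetic can be checked on a small vector. The following is a minimal base R sketch, not part of the original answer: rows_between is a hypothetical helper and the labels in x are made up to mimic column V1. It stops early when a pattern does not match exactly one element, since : expects a single start and a single end.

```r
# Hypothetical helper: indices from the single element matching `start`
# through the single element matching `end` (both inclusive).
rows_between <- function(x, start, end) {
  from <- grep(start, x)
  to   <- grep(end, x)
  # `:` needs exactly one start and one end, so fail loudly otherwise
  stopifnot(length(from) == 1L, length(to) == 1L, from <= to)
  seq(from, to)
}

# Made-up labels that mimic column V1 of myData
x <- c("INFLATION EXPECTATIONS", "1999", "2003Dec", "",
       "CORE INFLATION EXPECTATIONS", "",
       "GROWTH EXPECTATIONS", "1999", "2003Q4", "",
       "EXPECTED UNEMPLOYMENT RATE", "1999")

# The ^ anchor keeps the first pattern from also matching the CORE INFLATION row
idx <- c(rows_between(x, "^INFLATION EXPECTATIONS", "CORE INFLATION"),
         rows_between(x, "GROWTH EXPECTATIONS", "EXPECTED UNEMPLOYMENT RATE"))
idx
```

As with the slice() call, the row holding the end marker is included in each range.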

If there are several such sets, use map2:

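    library(dplyr)
    library(purrr)

    # Pair each start pattern with its end pattern, build one index range per pair,
    # then flatten the ranges into a single integer vector for slice()
    map2(c('^INFLATION EXPECTATIONS', 'GROWTH EXPECTATIONS'),
         c('CORE INFLATION', 'EXPECTED UNEMPLOYMENT RATE'),
         ~ grep(.x, myData$V1):grep(.y, myData$V1)) %>%
      flatten_int() %>%
      slice(myData, .)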
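
The question also asks to promote the header row to column names and to repeat the extraction for a list of data frames, which the code above does not cover. Below is a rough base R sketch under several assumptions: extract_block, toy and data_list are hypothetical names, each block is assumed to have its TARGET_PERIOD header on the row directly below the title, and the header is promoted by hand as a stand-in for janitor::row_to_names.

```r
# Hypothetical helper: pull one block out of `df` and use its header row
# (assumed to sit directly below the title row) as the column names.
extract_block <- function(df, start, end) {
  v1   <- as.character(df$V1)
  from <- grep(start, v1)
  to   <- grep(end, v1)
  stopifnot(length(from) == 1L, length(to) == 1L, to - from >= 3L)
  header <- unname(vapply(df[from + 1L, ], as.character, character(1)))
  rows   <- seq(from + 2L, to - 1L)   # skip title and header, stop before end marker
  rows   <- rows[v1[rows] != ""]      # drop blank separator rows
  body   <- df[rows, , drop = FALSE]
  names(body) <- header
  body
}

# Toy stand-in for one of the real data frames
toy <- data.frame(
  V1 = c("INFLATION EXPECTATIONS", "TARGET_PERIOD", "1999", "2003Dec", "",
         "CORE INFLATION EXPECTATIONS", "",
         "GROWTH EXPECTATIONS", "TARGET_PERIOD", "1999", "2003Q4", "",
         "EXPECTED UNEMPLOYMENT RATE"),
  V2 = c("", "POINT", "1", "1.5", "", "", "", "", "POINT", "2", "2.5", "", ""),
  stringsAsFactors = FALSE
)

starts <- c("^INFLATION EXPECTATIONS", "GROWTH EXPECTATIONS")
ends   <- c("CORE INFLATION", "EXPECTED UNEMPLOYMENT RATE")

# Apply every start/end pair to every data frame in the list
data_list <- list(a = toy, b = toy)
result <- lapply(data_list, function(df) {
  Map(function(s, e) extract_block(df, s, e), starts, ends)
})
```

Unlike the slice() call, this sketch leaves out the title row and the end-marker row, so each element of result holds only the data rows of one block. The values keep their original type (character here, factor in myData) and would still need converting to numbers.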
