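r - Select values across multiple columns given a condition
Problem description
I am trying to select all yearly values after 2011 together with their corresponding month values, padding with NA where the resulting columns differ in length.
I tried: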
rank_data %>% filter(across(starts_with('year'))>2011)
It returns output for only a single column:
jan year feb year2 mar year3 apr year4 may year5 jun year6 jul year7 aug year8 sep year9 oct year10 nov year11 dec
1: 205.3 2014 188.2 2014 167.8 1920 122.1 1882 134.2 1906 152.0 1948 173.6 1882 191.3 1985 196.5 1957 234.6 1998 224.2 1931 232.6
2: 203.2 2016 179.0 1950 162.9 1868 119.5 1993 133.4 1925 147.3 1980 156.4 1932 181.1 1900 193.6 1927 211.6 1862 222.5 1928 230.1
year12 win year13 spr year14 sum year15 aut year16 ann year17
1: 1979 529.4 1960 348.9 1983 438.8 1928 527.8 1944 1547.8 2008
2: 1994 522.5 1994 340.4 1903 426.6 2007 519.6 1923 1530.2 2015
Reproducible code:
rank_data <- structure(list(jan = c(268.1, 263.1, 235.2, 223.3, 219.2, 218.3
), year = c(1928, 1948, 2008, 1877, 1995, 1990), feb = c(287.6,
241.9, 213.7, 205.1, 191.9, 191.2), year2 = c(2020, 2002, 1997,
1990, 1958, 1923), mar = c(225.3, 190.7, 187.8, 187.2, 175.9,
173.9), year3 = c(1981, 1903, 2019, 1947, 1994, 1912), apr = c(147.4,
147.1, 143.1, 138.1, 132.4, 128.4), year4 = c(1920, 1867, 1970,
1913, 1947, 2012), may = c(166.5, 161.1, 153.4, 142.1, 141.8,
139.1), year5 = c(1967, 1924, 1886, 1920, 2015, 1993), jun = c(205.6,
173.7, 172.5, 170.6, 161.1, 161.1), year6 = c(2012, 1928, 2007,
1907, 1872, 1998), jul = c(196.1, 191.4, 183.6, 180.7, 180.3,
178.9), year7 = c(1939, 1888, 1920, 1988, 2009, 1880), aug = c(260.9,
231, 209.7, 205.4, 203.9, 197.7), year8 = c(1956, 1917, 1891,
1927, 1879, 2004), sep = c(287.1, 238.1, 223.3, 208.8, 207.6,
206.2), year9 = c(1918, 1950, 1869, 1935, 1866, 1872), oct = c(286.1,
278.8, 263.8, 259.6, 249.9, 248.4), year10 = c(1967, 1903, 1954,
2000, 1938, 1870), nov = c(306.7, 260.9, 253.6, 252.6, 242.3,
235.8), year11 = c(2009, 2015, 1929, 2000, 1951, 1954), dec = c(343.1,
266, 246.8, 246.5, 238.9, 238), year12 = c(2015, 1993, 1868,
1986, 1929, 2006), win = c(688.7, 625.9, 582.1, 560, 558.4, 546.6
), year13 = c(2016, 1995, 2014, 1990, 2020, 1877), spr = c(457.3,
408.9, 375.8, 372.8, 371.8, 371.8), year14 = c(1920, 1947, 1979,
2006, 1981, 1913), sum = c(499.7, 489.5, 483.8, 468.6, 452.1,
446.9), year15 = c(1879, 2012, 1956, 1912, 1927, 2020), aut = c(673,
668.7, 580.8, 567.9, 560.8, 554.5), year16 = c(2000, 1954, 1872,
1935, 1903, 1981), ann = c(1758.2, 1691.1, 1690, 1660.7, 1648.5,
1624.6), year17 = c(1872, 1954, 2000, 1877, 1903, 2012)), row.names = c(NA,
6L), class = "data.frame")
Expected output:
jan year feb year2 mar year3 ...
205.3 2014 287.6 2020 187.8 2019 ...
203.2 2016 188.2 2014 NA NA ...
Solution
Perhaps we can use indexing to replace with NA the values whose corresponding 'year' value is less than or equal to 2011
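# Positions of the "year" columns, and of the value column just before each one
i1 <- grep("^year", names(rank_data))
i2 <- i1 - 1
# NA where the year is <= 2011, 1 otherwise (NA^TRUE is NA, NA^FALSE is 1)
tmp <- NA^(rank_data[i1] <= 2011)
# Multiplying by the mask blanks out the unwanted values and years
rank_data[i2] <- rank_data[i2] * tmp
rank_data[i1] <- rank_data[i1] * tmp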
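As a minimal sketch of why the `NA^` mask works, here is the same idea on a small made-up data frame. The names `toy`, `val`, `drop` and `mask` are hypothetical and not part of the question's data. In R, `NA^TRUE` evaluates to `NA` and `NA^FALSE` evaluates to `1`, so multiplying by the mask blanks out only the flagged entries.

```r
# Toy data; the names here are illustrative only
toy <- data.frame(val = c(10, 20, 30), year = c(2005, 2015, 2011))

# TRUE where the year is 2011 or earlier, i.e. where the entry should be dropped
drop <- toy$year <= 2011

# NA^TRUE is NA, NA^FALSE is 1
mask <- NA^drop

# Multiplying keeps the entries where mask is 1 and turns the others into NA
toy$val  <- toy$val  * mask
toy$year <- toy$year * mask
print(toy)
```

Only the second row survives here, because 2015 is the only year after 2011.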
If we want to remove the columns that are all NA, together with their associated 'year' columns
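# Keep value/year column pairs that still contain at least one non-NA value
i3 <- rep(sapply(rank_data[i2], function(x) any(!is.na(x))), each = 2)
# Drop the NAs from each remaining column
lst1 <- lapply(rank_data[i3], function(x) x[complete.cases(x)])
# Pad every column with NA to the length of the longest one, then bind
mx <- max(lengths(lst1))
do.call(cbind, lapply(lst1, `length<-`, mx))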
-output
# feb year2 mar year3 apr year4 may year5 jun year6 nov year11 dec year12 win year13 sum year15 ann year17
#[1,] 287.6 2020 187.8 2019 128.4 2012 141.8 2015 205.6 2012 260.9 2015 343.1 2015 688.7 2016 489.5 2012 1624.6 2012
#[2,] NA NA NA NA NA NA NA NA NA NA NA NA NA NA 582.1 2014 446.9 2020 NA NA
#[3,] NA NA NA NA NA NA NA NA NA NA NA NA NA NA 558.4 2020 NA NA NA NA
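The padding step relies on `length<-`, which extends a vector with NA when it is given a length longer than the vector. A small sketch of that step alone, using a hypothetical list `lst` of unequal-length vectors (not taken from the question):

```r
# Hypothetical vectors of unequal length
lst <- list(a = c(1, 2, 3), b = c(4), c = c(5, 6))

# The longest element decides the number of rows
mx <- max(lengths(lst))

# `length<-`(x, mx) extends x to length mx, filling the tail with NA
padded <- do.call(cbind, lapply(lst, `length<-`, mx))
print(padded)
```

Note that `cbind` on plain vectors returns a matrix; wrapping the result in `as.data.frame()` is one way to get a data frame back, if that is what is needed.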