Reordering columns by date

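Problem description

I have a dataset with 10 date columns, but some of the dates are out of order. That is, the variable date1 should hold the earliest date, date2 the second earliest, ..., and date10 the latest. I previously wrote two nested for loops using the nth function from the Rfast package, but I got an error related to the Rcpp package that I could not fix. Is there a more efficient way to do this?

Here is a sample of my dataset. As you can see, the dates in the 5th observation are out of order. TloBankruptcy4FileDate holds the earliest date, so its value should go to TloBankruptcy1FileDate. The next earliest date is currently in TloBankruptcy3FileDate but should be assigned to TloBankruptcy2FileDate.

I want a dataset that still has 10 rows and 10 columns, with the values reassigned across the variables accordingly.

I hope that is clear. Thanks!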

structure(list(TloBankruptcy1FileDate = structure(c(NA, NA, NA, 
NA, 14992, 16764, NA, NA, NA, NA), format.sas = "MMDDYY", class = "Date"), 
    TloBankruptcy2FileDate = structure(c(NA, NA, NA, NA, 14713, 
    10101, NA, NA, NA, NA), format.sas = "MMDDYY", class = "Date"), 
    TloBankruptcy3FileDate = structure(c(NA, NA, NA, NA, 12892, 
    NA, NA, NA, NA, NA), format.sas = "MMDDYY", class = "Date"), 
    TloBankruptcy4FileDate = structure(c(NA, NA, NA, NA, 9282, 
    NA, NA, NA, NA, NA), format.sas = "MMDDYY", class = "Date"), 
    TloBankruptcy5FileDate = structure(c(NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), format.sas = "MMDDYY", class = "Date"), 
    TloBankruptcy6FileDate = structure(c(NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), format.sas = "MMDDYY", class = "Date"), 
    TloBankruptcy7FileDate = structure(c(NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), format.sas = "MMDDYY", class = "Date"), 
    TloBankruptcy8FileDate = structure(c(NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), format.sas = "MMDDYY", class = "Date"), 
    TloBankruptcy9FileDate = structure(c(NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), format.sas = "MMDDYY", class = "Date"), 
    TloBankruptcy10FileDate = structure(c(NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_), format.sas = "MMDDYY", class = "Date")), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

Tags: r, tidyverse, data-wrangling

Solution


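library(tidyverse)

# df is the data frame from the question
df %>%
  rowid_to_column() %>%                        # remember each original row
  pivot_longer(-rowid) %>%                     # one row per (rowid, date column)
  group_by(rowid) %>%
  arrange(value) %>%                           # earliest first; NA sorts last
  mutate(name = str_c("f", row_number())) %>%  # rank within each original row
  pivot_wider() %>%                            # back to one column per rank
  ungroup() %>%
  arrange(rowid)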
#> # A tibble: 10 x 11
#>    rowid f1         f2         f3         f4         f5     f6    
#>    <int> <date>     <date>     <date>     <date>     <date> <date>
#>  1     1 NA         NA         NA         NA         NA     NA    
#>  2     2 NA         NA         NA         NA         NA     NA    
#>  3     3 NA         NA         NA         NA         NA     NA    
#>  4     4 NA         NA         NA         NA         NA     NA    
#>  5     5 1995-06-01 2005-04-19 2010-04-14 2011-01-18 NA     NA    
#>  6     6 1997-08-28 2015-11-25 NA         NA         NA     NA    
#>  7     7 NA         NA         NA         NA         NA     NA    
#>  8     8 NA         NA         NA         NA         NA     NA    
#>  9     9 NA         NA         NA         NA         NA     NA    
#> 10    10 NA         NA         NA         NA         NA     NA    
#> # ... with 4 more variables: f7 <date>, f8 <date>, f9 <date>, f10 <date>
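
The pipeline above renames the columns to f1 through f10, whereas the question asks to keep the original variable names. As a rough alternative sketch in base R (not part of the original answer), each row can be sorted on the underlying day counts and written back into the existing columns. The small data frame and the names date1 to date3 below are made up for illustration; they stand in for the question's ten TloBankruptcy*FileDate columns.

```r
# Hypothetical stand-in for the question's data: 2 rows, 3 date columns
df <- data.frame(
  date1 = as.Date(c("2011-01-18", NA)),
  date2 = as.Date(c("2010-04-14", "1997-08-28")),
  date3 = as.Date(c("1995-06-01", "2015-11-25"))
)

# Convert every Date column to its numeric day count and build a matrix
# with the same shape as df (columns are filled in order)
m <- matrix(unlist(lapply(df, as.numeric), use.names = FALSE),
            nrow = nrow(df))

# Sort each row ascending; na.last = TRUE keeps the NAs and moves them
# to the right. apply() returns the rows as columns, hence t().
sorted <- t(apply(m, 1, sort, na.last = TRUE))

# Write the sorted values back as Dates, keeping the original column names
df[] <- lapply(seq_len(ncol(sorted)), function(j) {
  as.Date(sorted[, j], origin = "1970-01-01")
})

print(df)
```

This sketch assumes every column is of class Date and that the data frame has at least two rows and two columns; any extra attributes on the columns (such as format.sas in the question's data) are not preserved.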
