R: pivot longer with multiple observations per row

Question

I think I have almost the same problem as this question: R Pivot multiple columns from wide to long. But when I tried to follow along, I got hopelessly lost in the regular expressions.

I am also trying to pivot my data longer, and I also have multiple columns I want to keep. My current data:

FollowUpScans<-structure(list(study_id = c(40, 44, 49, 61, 66, 67, 68, 84, 86, 
94, 95, 101, 123, 126, 131, 153, 154, 155, 156, 161, 166, 169, 
175, 185, 199, 203, 207, 211, 217, 221, 227, 256, 257, 259, 266, 
275, 284, 301, 306, 307, 309, 313, 320, 353, 382, 392, 398, 401, 
402, 412, 415, 428, 431, 433, 434, 436), Score1 = c(3, 0, 4, 
4, NA, 0, 0, 5, 0, 0, 7, 0, 4, 0, 4, 2, 3, 1, 0, 2, 2, 0, 3, 
0, 0, 0, 9, 0, 0, 0, 6, 0, 0, 7, 5, 7, 0, 0, 8, 0, 0, 0, 5, 0, 
3, 0, 5, 0, 2, 0, 0, 0, 0, 7, 0, 2), TimeBetweenScans = structure(c(316, 
113, 335, 104, 7, 42, 30, 643, 404, 40, 171, 51, 449, 56, 104, 
79, 116, 65, 39, 1193, 142, 106, 221, 36, 125, 137, 927, 63, 
156, 32, 411, 201, 160, 166, 459, 212, 50, 312, 1627, 354, 33, 
62, 842, 174, 216, 17, 214, 24, 149, 72, 9, 13, 42, 771, 113, 
122), class = "difftime", units = "days"), Score2 = c(NA, 0, 
7, NA, NA, NA, 0, 7, NA, 5, 8, 0, NA, NA, NA, 8, NA, NA, 9, NA, 
NA, 0, 4, NA, NA, 0, 9, 2, 0, NA, NA, NA, NA, NA, NA, NA, 4, 
1, 8, NA, NA, 3, NA, 0, 8, NA, 5, NA, 7, NA, 0, 3, NA, 7, NA, 
4), TimeBetweenScans2 = structure(c(NA, 139, 660, NA, NA, NA, 
84, 1794, NA, 221, 320, 227, NA, NA, NA, 989, NA, NA, 411, NA, 
NA, 216, 474, NA, NA, 372, 1006, 429, 447, NA, NA, NA, NA, NA, 
NA, NA, 313, 530, 1706, NA, NA, 130, NA, 300, 264, NA, 268, NA, 
382, NA, 38, 138, NA, 1200, 166, 475), class = "difftime", units = "days"), 
    Score3 = c(NA, NA, NA, NA, NA, NA, 7, NA, NA, 8, NA, NA, 
    NA, NA, NA, 8, NA, NA, NA, NA, NA, 1, 4, NA, NA, 0, NA, 5, 
    0, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA, 
    NA, NA, NA, 5, NA, NA, NA, NA, NA, NA, 8, 0, 4), TimeBetweenScans3 = structure(c(NA, 
    NA, NA, NA, NA, NA, 467, NA, NA, 394, NA, NA, NA, NA, NA, 
    1097, NA, NA, NA, NA, NA, 266, 796, NA, NA, 941, NA, 533, 
    470, NA, NA, NA, NA, NA, NA, NA, NA, 783, NA, NA, NA, NA, 
    NA, NA, NA, NA, 388, NA, NA, NA, NA, NA, NA, 1512, 180, 640
    ), class = "difftime", units = "days"), Score4 = c(NA, NA, 
    NA, NA, NA, NA, 8, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, 5, NA, NA, NA, 1, NA, 5, 0, NA, NA, NA, NA, 
    NA, NA, NA, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA), TimeBetweenScans4 = structure(c(NA, 
    NA, NA, NA, NA, NA, 826, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, 497, NA, NA, NA, 1102, NA, 567, 1204, 
    NA, NA, NA, NA, NA, NA, NA, NA, 1574, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), class = "difftime", units = "days"), 
    Score5 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, 1, NA, 
    NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), 
    TimeBetweenScans5 = structure(c(NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 575, 
    NA, NA, NA, 1225, NA, NA, 1266, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA), class = "difftime", units = "days")), row.names = c(NA, 
-56L), class = c("tbl_df", "tbl", "data.frame"))

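Instead of columns that look like this: study_id, Score1, TimeBetweenScans, Score2, TimeBetweenScans2, Score3, TimeBetweenScans3, etc.

I would like it to end up looking like: study_id, Score, Time, Occurence

The "Occurence" column would just hold 1, 2, 3, 4, etc. to show which column the value came from. The study_id column should ideally be kept, since it shows which "person" the row came from.

Any help would be greatly appreciated! Thanks!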

Tags: r, tidyverse

Answer


You could try:

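library(dplyr)
library(tidyr)

FollowUpScans %>%
  # give the first time column a numeric suffix like the others
  rename(TimeBetweenScans1 = TimeBetweenScans) %>%
  pivot_longer(-study_id,
    # ".value" keeps Score/TimeBetweenScans as columns; "Time" holds the suffix
    names_to = c(".value", "Time"),
    names_pattern = "([A-Za-z]+)([0-9]+)")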

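The steps are:

  • Rename the column that would otherwise cause problems (TimeBetweenScans has no numeric suffix, unlike the others).
  • Call pivot_longer, specifying that the column names follow the pattern "any number of letters followed by any number of digits". You can use a different regex pattern from the one I shared here. For example, "(.*)(\\d+)" would also work for this particular dataset, because every suffix is a single digit; with a greedy .*, a name like Score10 would be split into Score1 and 0.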
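The question asks for columns named study_id, Score, Time and Occurence, whereas the call above stores the numeric suffix in a column called Time. A minimal sketch of one way to get the requested names, assuming a small hypothetical tibble `toy` with the same column layout as FollowUpScans (plain numbers stand in for the difftime values):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data mirroring the layout of FollowUpScans
toy <- tibble(
  study_id = c(40, 44),
  Score1 = c(3, 0),
  TimeBetweenScans = c(316, 113),
  Score2 = c(NA, 0),
  TimeBetweenScans2 = c(NA, 139)
)

long <- toy %>%
  # add the missing suffix to the first time column
  rename(TimeBetweenScans1 = TimeBetweenScans) %>%
  # shorten the prefix so the pivoted value column is simply "Time"
  rename_with(~ sub("^TimeBetweenScans", "Time", .x)) %>%
  pivot_longer(
    -study_id,
    # ".value" turns the letter part into column names (Score, Time);
    # the digit part goes into the Occurence column
    names_to = c(".value", "Occurence"),
    names_pattern = "([A-Za-z]+)([0-9]+)",
    # store the suffix as an integer rather than a character string
    names_transform = list(Occurence = as.integer)
  )

print(long)
```

The spelling "Occurence" is kept only to match the column name used in the question. `names_transform` requires a reasonably recent tidyr; without it, the suffix column is character.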

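If you do not rename first, I suspect you will end up with too many rows, because TimeBetweenScans has no numeric suffix for the pattern to match. You should end up with nrow(FollowUpScans) * 5 rows, that is, 56 * 5 = 280.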
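The row count can be checked directly. The sketch below assumes a hypothetical two-occurrence tibble `toy` that has already been renamed, so the expected count is nrow(toy) * 2 rather than * 5. It also shows the `values_drop_na` argument of pivot_longer, which may be useful here because most later scans are missing.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, already renamed so every column has a suffix
toy <- tibble(
  study_id = c(40, 44),
  Score1 = c(3, 0),
  TimeBetweenScans1 = c(316, 113),
  Score2 = c(NA, 0),
  TimeBetweenScans2 = c(NA, 139)
)

full <- toy %>%
  pivot_longer(-study_id,
    names_to = c(".value", "Time"),
    names_pattern = "([A-Za-z]+)([0-9]+)")

# one output row per study_id per numeric suffix
stopifnot(nrow(full) == nrow(toy) * 2)

# same pivot, but dropping rows whose value columns are all NA
trimmed <- toy %>%
  pivot_longer(-study_id,
    names_to = c(".value", "Time"),
    names_pattern = "([A-Za-z]+)([0-9]+)",
    values_drop_na = TRUE)

print(nrow(full))
print(nrow(trimmed))
```

Rows where only one of the two values is missing are still kept with `values_drop_na = TRUE`, so a scan with a time but no score would remain in the result.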

