Spreading a single key against multiple values

Problem description

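My data frame looks like the one below; the real data runs to hundreds of rows and columns.

The goal is to spread Attribute_Value against each of the columns V1, V2, ..., VN. That is, the dates that appear in a column such as V1 should become column names, and the corresponding Attribute_Value should appear under each of those columns.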

    df1 <- data.frame(ROW_ID = c("23416","23416","23416"),
                      Process_ID = c("SLT","SLT","SLT"),
                      Operation_Code = c("SLT","SLT","SLT"),
                      Resource_Group_Code = c("BD","BD","BT"),
                      Location_Code = c("JS","JS","JS"),
                      Resource_Code = c("B-T234","B-T234","B-T234"),
                      Resource_Desc = c("699","699","699"),
                      iDeleteFlag = c("N","N","N"),
                      Attribute_Code = c("RA002","RA002","RA002"),
                      Attribute_Value = c("266","269","298"),
                      Capacity_Type = c("s","s","s"),
                      Planning_Version = c("PDMT","PDMT","PDMT"),
                      "V1" = c("2021-10-10", "2021-10-31", "2021-11-07"),
                      "V2" = c("2021-10-17", "", "2021-11-14"),
                      "V3" = c("2021-10-24", "", "2021-11-21"),
                      "V4" = c("", "2021-11-07", ""),
                      "V5" = c("", "2021-11-21", ""))

The desired output is:

    df2 <- data.frame(ROW_ID = c("23416","23416","23416"),
                      Process_ID = c("SLT","SLT","SLT"),
                      Operation_Code = c("SLT","SLT","SLT"),
                      Resource_Group_Code = c("BD","BD","BT"),
                      Location_Code = c("JS","JS","JS"),
                      Resource_Code = c("B-T234","B-T234","B-T234"),
                      Resource_Desc = c("699","699","699"),
                      iDeleteFlag = c("N","N","N"),
                      Attribute_Code = c("RA002","RA002","RA002"),
                      Capacity_Type = c("s","s","s"),
                      Planning_Version = c("PDMT","PDMT","PDMT"),
                      "2021-10-10" = c("266", "", ""),
                      "2021-10-17" = c("266", "", ""),
                      "2021-10-24" = c("266", "", ""),
                      "2021-10-31" = c("", "269", ""),
                      "2021-11-07" = c("", "269", "298"),
                      "2021-11-14" = c("", "", "298"),
                      "2021-11-21" = c("", "269", "298"),
                      check.names = FALSE)  # keep the date strings as column names

My code is below; it does not produce the desired output:

    RA002variable_2021ANeg <- gather(RA002variable_2021ANeg, key, value,
                                     -ROW_ID, -Process_ID, -Operation_Code,
                                     -Resource_Group_Code, -Location_Code,
                                     -Resource_Code, -Resource_Desc, -iDeleteFlag,
                                     -Attribute_Code1, -Capacity_Type,
                                     -Planning_Version, -Attribute_Value) %>%
      mutate(key = Attribute_Value) %>%
      select(-Attribute_Value) %>%
      spread(key, value)

Tags: r

Solution


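gather() and spread() have been superseded by pivot_longer() and pivot_wider(). Although gather() and spread() still work, it is better to get used to the new functions.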
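As a rough illustration of how the two interfaces correspond, here is a minimal sketch on a made-up data frame (`toy` and its columns are hypothetical and not part of the question). The `names_to` / `values_to` arguments of pivot_longer() play the role of `key` / `value` in gather(), and `names_from` / `values_from` do the same for pivot_wider() versus spread().

```r
library(tidyr)

# Hypothetical toy data, used only to compare the two interfaces
toy <- data.frame(id = c(1, 2),
                  a = c("x", "y"),
                  b = c("p", "q"),
                  stringsAsFactors = FALSE)

# Older interface: wide -> long, keeping `id` as an identifier
long_old <- gather(toy, key = "key", value = "value", -id)

# Newer interface: same reshaping with pivot_longer()
long_new <- pivot_longer(toy, cols = -id, names_to = "key", values_to = "value")

# Long -> wide again with each interface
wide_old <- spread(long_old, key, value)
wide_new <- pivot_wider(long_new, names_from = key, values_from = value)
```

Both routes should give back the original three columns; the long tables hold the same rows, although the row order may differ between gather() and pivot_longer().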

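Because your ROW_ID is not unique for each row, I create an extra index column (just the row number), gather columns V1:V5 into long form, and then spread the dates into column names: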

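library(dplyr)
library(tidyr)

df1 %>%
  mutate(index = row_number()) %>%  # unique row key, since ROW_ID repeats
  pivot_longer(V1:V5, names_to = "name", values_to = "value") %>%
  select(-name) %>%                 # the V1..V5 labels are no longer needed
  filter(value != "") %>%           # drop the empty date cells
  pivot_wider(names_from = "value", values_from = "Attribute_Value")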
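The pipeline above leaves NA where a row has no value for a date, keeps the helper index column, and orders the date columns by first appearance. To get closer to the desired output in the question, one possible variation is sketched below. It assumes tidyr 1.1.0 or later (for `names_sort` and a scalar `values_fill`), and it uses `df1_small`, a trimmed copy of the question's df1 with only a few identifier columns, to keep the example short.

```r
library(dplyr)
library(tidyr)

# Trimmed copy of df1 from the question (only some identifier columns kept)
df1_small <- data.frame(
  ROW_ID = c("23416", "23416", "23416"),
  Resource_Group_Code = c("BD", "BD", "BT"),
  Attribute_Value = c("266", "269", "298"),
  V1 = c("2021-10-10", "2021-10-31", "2021-11-07"),
  V2 = c("2021-10-17", "", "2021-11-14"),
  V3 = c("2021-10-24", "", "2021-11-21"),
  V4 = c("", "2021-11-07", ""),
  V5 = c("", "2021-11-21", ""),
  stringsAsFactors = FALSE
)

result <- df1_small %>%
  mutate(index = row_number()) %>%   # unique row key, since ROW_ID repeats
  pivot_longer(V1:V5, names_to = "name", values_to = "value") %>%
  select(-name) %>%
  filter(value != "") %>%            # drop the empty date cells
  pivot_wider(
    names_from = value,
    values_from = Attribute_Value,
    names_sort = TRUE,   # ISO-formatted dates sort chronologically as text
    values_fill = ""     # empty string instead of NA for missing dates
  ) %>%
  select(-index)         # the helper column is no longer needed

result
```

Sorting the names as text only gives chronological order because the dates are in YYYY-MM-DD form; other date formats would need to be converted first.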
