Merge rows into one based on a condition, replacing the value in one row with the value from another

Problem description

I have a dataset in R that looks like this:

A <- c("X", "Y", "Z", "W", "U")
B <- c("apple", "pear", "apple", "pear", "pear")
C <- c("december", "december", "june", "june", "march")
D <- c("Winter", "Summer", "Winter", "Summer", "Summer")
df <- data.frame(A,B,C,D);df

  A     B        C      D
1 X apple december Winter
2 Y  pear december Summer
3 Z apple     june Winter
4 W  pear     june Summer
5 U  pear    march Summer

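I want to merge rows that share the same value in column C (combining row 1 with row 2, and row 3 with row 4), but I also want to replace the values in column B, taking column D into account. Basically, when two rows have the same value in C (e.g. "december"), the value of B where D is "Summer" ("pear") should always be replaced by the value of B where D is "Winter" ("apple"). I would like to end up with a data frame like this: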

  A     B        C             D
1 X apple december Winter,Summer
2 Z apple     june Winter,Summer
3 U  pear    march        Summer

When two rows are merged, I really want to keep both values from column D.

Does anyone have an idea?

Tags: r, conditional-statements, row

Solution


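A data.table option

library(data.table)

setDT(df)[
  ,
  # Within each C group that contains a "Winter" row, overwrite the
  # "Summer" entries of A and B with the "Winter" entries
  c(
    lapply(
      setNames(.(A, B), c("A", "B")),
      function(x) if ("Winter" %in% D) replace(x, D == "Summer", x[D == "Winter"]) else x
    ),
    .(D = D)
  ),
  C
][
  ,
  # Collapse each C group to one row, joining the distinct values
  lapply(.SD, function(x) toString(unique(x))),
  C
][,
  # Restore the original column order
  .SD,
  .SDcols = names(df)
]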

   A     B        C              D
1: X apple december Winter, Summer
2: Z apple     june Winter, Summer
3: U  pear    march         Summer
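
For readers who prefer to avoid extra packages, the same idea can be sketched in base R. This is only an illustration, not the answerer's method: `merge_group` is a hypothetical helper name, and the sketch assumes each value of C has at most one "Winter" row and that every row in a group should collapse into a single output row.

```r
# Rebuild the example data (values copied from the dput() output)
df <- data.frame(
  A = c("X", "Y", "Z", "W", "U"),
  B = c("apple", "pear", "apple", "pear", "pear"),
  C = c("december", "december", "june", "june", "march"),
  D = c("Winter", "Summer", "Winter", "Summer", "Summer"),
  stringsAsFactors = FALSE
)

# merge_group is a hypothetical helper, not part of any package.
# It collapses all rows of one C group into a single row.
merge_group <- function(g) {
  # Take A and B from the "Winter" row if there is one, else from the first row
  keep <- if ("Winter" %in% g$D) which(g$D == "Winter")[1] else 1
  data.frame(
    A = g$A[keep],
    B = g$B[keep],
    C = g$C[1],
    D = toString(unique(g$D)),  # keep every distinct D value of the group
    stringsAsFactors = FALSE
  )
}

# Split by C in order of first appearance, merge each group, then recombine
groups <- split(df, factor(df$C, levels = unique(df$C)))
result <- do.call(rbind, lapply(groups, merge_group))
rownames(result) <- NULL
result
```

For the example data this should give the same three rows as the data.table output above. Unlike that version, it always keeps exactly one row per value of C, so it may behave differently on data that breaks the stated assumptions.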

Data

> dput(df)
structure(list(A = c("X", "Y", "Z", "W", "U"), B = c("apple",
"pear", "apple", "pear", "pear"), C = c("december", "december",
"june", "june", "march"), D = c("Winter", "Summer", "Winter",
"Summer", "Summer")), class = "data.frame", row.names = c(NA,
-5L))
