Create a new column from multiple existing columns in R

Problem description

Can someone guide me on how to create 4 new variables in R? I want to create 4 new variables from the following data, for example:

data$VarApple = var1 through var6 = "apple"
data$varBerry = var1 through var6 = "berry"
data$varPear = var1 through var6 = "pear"
data$varBanana = var1 through var6 = "banana"


data = data.frame(var1 = c("apple","pear","berry","apple","pear","banana","berry"),
       var2 = c("banana","apple","berry","apple","banana","banana","berry"),
       var3 = c("berry","pear","pear","apple","berry","banana","apple"),
       var4 = c("apple","banana","apple","pear","berry","pear","berry"),
       var5 = c("banana","pear","pear","apple","apple","banana","berry"),
       var6 = c("pear","berry","apple","apple","banana","banana","apple"))

Tags: r

Solution


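We can use table() to create a frequency count on the unlist()ed dataset, turn the counts into 0/1 indicators with > 0 and a unary +, and cbind() the result with the original data: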

cbind(data, +(table(seq_len(nrow(data))[row(data)], unlist(data)) >0))
#      var1   var2   var3   var4   var5   var6 apple banana berry pear
#1  apple banana  berry  apple banana   pear     1      1     1    1
#2   pear  apple   pear banana   pear  berry     1      1     1    1
#3  berry  berry   pear  apple   pear  apple     1      0     1    1
#4  apple  apple  apple   pear  apple  apple     1      0     0    1
#5   pear banana  berry  berry  apple banana     1      1     1    1
#6 banana banana banana   pear banana banana     0      1     0    1
#7  berry  berry  apple  berry  berry  apple     1      0     1    0
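
The one-liner above packs several steps together. As a rough sketch of what it does, the same computation can be written out step by step. The intermediate names (row_id, values, counts, indicator, result) are illustrative only and are not part of the original answer; unclass() is added here just to make the matrix conversion explicit.

```r
data <- data.frame(
  var1 = c("apple","pear","berry","apple","pear","banana","berry"),
  var2 = c("banana","apple","berry","apple","banana","banana","berry"),
  var3 = c("berry","pear","pear","apple","berry","banana","apple"),
  var4 = c("apple","banana","apple","pear","berry","pear","berry"),
  var5 = c("banana","pear","pear","apple","apple","banana","berry"),
  var6 = c("pear","berry","apple","apple","banana","banana","apple")
)

# Row index of every cell, in the same column-major order unlist() uses
row_id <- as.vector(row(data))

# All cell values as a single vector (var1 first, then var2, ...)
values <- unlist(data, use.names = FALSE)

# Cross-tabulate: one row per original row, one column per distinct value
counts <- table(row_id, values)

# Counts above zero become TRUE; unary + turns TRUE/FALSE into 1/0
indicator <- +(unclass(counts) > 0)

# Attach the indicator columns to the original data
result <- cbind(data, indicator)
```

Each indicator column is 1 when the value appears at least once in that row, regardless of how many times it appears.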

Or with mtabulate() from qdapTools:

library(qdapTools)
cbind(data, +(mtabulate(as.data.frame(t(data))) > 0))

Or a variation of the above:

cbind(data, +(mtabulate(asplit(data, 1)) > 0))

Or an option with cSplit_e() from splitstackshape:

library(tidyr)
library(splitstackshape)
data %>%
  unite(newcol, everything(), remove = FALSE) %>%
  cSplit_e('newcol', '_', mode = 'binary', type = 'character', drop = TRUE)
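
For comparison, if adding packages is not an option, the same indicators can probably be built in base R by testing each value against the whole data frame. This is a different technique from the ones above, shown only as a minimal sketch. The names fruits, flags and result2 are made up for illustration, and the new column names simply follow the ones requested in the question.

```r
data <- data.frame(
  var1 = c("apple","pear","berry","apple","pear","banana","berry"),
  var2 = c("banana","apple","berry","apple","banana","banana","berry"),
  var3 = c("berry","pear","pear","apple","berry","banana","apple"),
  var4 = c("apple","banana","apple","pear","berry","pear","berry"),
  var5 = c("banana","pear","pear","apple","apple","banana","berry"),
  var6 = c("pear","berry","apple","apple","banana","banana","apple")
)

# Values to look for, in the order used in the question
fruits <- c("apple", "berry", "pear", "banana")

# For each value: 1 if it occurs anywhere in var1..var6 of a row, else 0
flags <- sapply(fruits, function(f) +(rowSums(data == f) > 0))

# Use the column names requested in the question
colnames(flags) <- c("VarApple", "varBerry", "varPear", "varBanana")

result2 <- cbind(data, flags)
```

Because the list of values is written out explicitly, a value that never occurs in the data still gets a column (all zeros), which the table()-based approach would not produce.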
