Looping over all possible column combinations with some constraints

Problem description

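xx is sample data. It contains the variables dep1, dep2, dep3, bet1, bet2 and bet3. I want to select every possible combination of 2 columns, but skip combinations whose columns share the same name (ignoring the number). In this example there are 9 such combinations: {dep1:bet1, dep1:bet2, dep1:bet3, dep2:bet1, ...}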

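Below is the code I want to run for every combination (here I run it for just one). In the last line I also added code to keep track of which variables went into the calculation. I believe a regular expression would help here. Any help is appreciated!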

xx<-data.frame(id=1:10,
               category=c(rep("A",5),rep("B",5)),
               dep1=sample(1:5,10,replace = T),
               dep2=sample(1:5,10,replace = T),
               dep3=sample(1:5,10,replace = T),
               bet1=sample(1:5,10,replace = T),
               bet2=sample(1:5,10,replace = T),
               bet3=sample(1:5,10,replace = T))

library(dplyr)  # select, mutate, group_by, summarise, if_else, %>%
library(tidyr)  # gather, unite, spread

xx%>%select(category,dep1,bet1)%>%
  mutate(vdep=if_else(dep1>3,1,0),
        vbet=if_else(bet1>3,1,0))%>%
  group_by(category)%>%
  summarise(vdep=mean(vdep),
            vbet=mean(vbet))%>%ungroup()%>%
  gather(variable,value,-category)%>%
  mutate(variable=as.factor(variable))%>%
  unite(variable,category,col = "new")%>%
  spread(new,value)%>%
  mutate(first="dep1",second="bet1")  # record which columns were used

Tags: r, loops, tidyverse

Solution


If I understand correctly, the following should do it:

# the data 
xx<-data.frame(id=1:10,
               category=c(rep("A",5),rep("B",5)),
               dep1=sample(1:5,10,replace = T),
               dep2=sample(1:5,10,replace = T),
               dep3=sample(1:5,10,replace = T),
               bet1=sample(1:5,10,replace = T),
               bet2=sample(1:5,10,replace = T),
               bet3=sample(1:5,10,replace = T))

# Getting the column names with "dep" or "bet"
cols = names(xx)[grepl("dep|bet", names(xx))]
deps = cols[grepl("dep", cols)]
bets = cols[grepl("bet", cols)]

# Getting all possible combinations of these columns
comb = expand.grid(deps, bets)
comb

#   Var1 Var2
# 1 dep1 bet1
# 2 dep2 bet1
# 3 dep3 bet1
# 4 dep1 bet2
# 5 dep2 bet2
# 6 dep3 bet2
# 7 dep1 bet3
# 8 dep2 bet3
# 9 dep3 bet3

# Transposing the dataframe containing these combinations, so that
# we can directly use sapply / lapply on the columns later
comb = data.frame(t(comb), stringsAsFactors = FALSE)

# For each combination, subset the dataframe xx
result = sapply(comb, function(x){
  xx[, x]
}, simplify = FALSE)

result

# $X1
#     dep1 bet1
# 1     1    5
# 2     1    5
# 3     2    2
# 4     2    2
# 5     1    5
# 6     3    3
# 7     1    1
# 8     2    2
# 9     3    2
# 10    1    5
# 
# $X2
#     dep2 bet1
# 1     1    5
# 2     2    5
# 3     4    2
# 4     5    2
# 5     1    5
# 6     5    3
# 7     2    1
# 8     1    2
# 9     4    2
# 10    4    5
# 
# $X3
#     dep3 bet1
# 1     3    5
# 2     2    5
# 3     4    2
# 4     3    2
# 5     3    5
# 6     2    3
# 7     1    1
# 8     4    2
# 9     5    2
# 10    5    5
# 
# $X4
#     dep1 bet2
# 1     1    5
# 2     1    1
# 3     2    1
# 4     2    2
# 5     1    2
# 6     3    2
# 7     1    3
# 8     2    3
# 9     3    5
# 10    1    1
# 
# $X5
#     dep2 bet2
# 1     1    5
# 2     2    1
# 3     4    1
# 4     5    2
# 5     1    2
# 6     5    2
# 7     2    3
# 8     1    3
# 9     4    5
# 10    4    1
# 
# $X6
#      dep3 bet2
# 1     3    5
# 2     2    1
# 3     4    1
# 4     3    2
# 5     3    2
# 6     2    2
# 7     1    3
# 8     4    3
# 9     5    5
# 10    5    1
# 
# $X7
#      dep1 bet3
# 1     1    3
# 2     1    2
# 3     2    5
# 4     2    1
# 5     1    3
# 6     3    2
# 7     1    4
# 8     2    1
# 9     3    1
# 10    1    3
# 
# $X8
#     dep2 bet3
# 1     1    3
# 2     2    2
# 3     4    5
# 4     5    1
# 5     1    3
# 6     5    2
# 7     2    4
# 8     1    1
# 9     4    1
# 10    4    3
# 
# $X9
#     dep3 bet3
# 1     3    3
# 2     2    2
# 3     4    5
# 4     3    1
# 5     3    3
# 6     2    2
# 7     1    4
# 8     4    1
# 9     5    1
# 10    5    3
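
The code above only subsets xx for each pair. To also run the summary from the question on every pair, one possible approach is sketched below. It is only a sketch under a few assumptions: `summarise_pair` and its `cutoff` argument are hypothetical names introduced here, `set.seed(1)` is added purely for reproducibility, and `pivot_longer()` / `pivot_wider()` are used in place of `gather()` / `spread()` (this assumes dplyr >= 1.0 and tidyr >= 1.0).

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the example is reproducible
xx <- data.frame(id = 1:10,
                 category = c(rep("A", 5), rep("B", 5)),
                 dep1 = sample(1:5, 10, replace = TRUE),
                 dep2 = sample(1:5, 10, replace = TRUE),
                 dep3 = sample(1:5, 10, replace = TRUE),
                 bet1 = sample(1:5, 10, replace = TRUE),
                 bet2 = sample(1:5, 10, replace = TRUE),
                 bet3 = sample(1:5, 10, replace = TRUE))

# Column names by prefix, then every dep/bet pair as plain strings
deps <- grep("^dep", names(xx), value = TRUE)
bets <- grep("^bet", names(xx), value = TRUE)
comb <- expand.grid(first = deps, second = bets, stringsAsFactors = FALSE)

# Hypothetical helper: share of values above `cutoff` per category
# for one dep column `d` and one bet column `b`, returned as one wide row.
summarise_pair <- function(d, b, data = xx, cutoff = 3) {
  data %>%
    group_by(category) %>%
    summarise(vdep = mean(.data[[d]] > cutoff),
              vbet = mean(.data[[b]] > cutoff),
              .groups = "drop") %>%
    pivot_longer(c(vdep, vbet), names_to = "variable", values_to = "value") %>%
    unite("new", variable, category) %>%
    pivot_wider(names_from = new, values_from = value) %>%
    mutate(first = d, second = b)  # record which columns were used
}

# One row per combination, stacked into a single data frame
all_pairs <- bind_rows(lapply(seq_len(nrow(comb)), function(i) {
  summarise_pair(comb$first[i], comb$second[i])
}))

all_pairs
```

Passing `stringsAsFactors = FALSE` to `expand.grid()` keeps the column names as character strings, so the rows of `comb` can be iterated directly and the transposing step is not needed in this variant.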
