r - Getting the frequency of different combinations in R from binary data
Question
I have a table of binary data that looks like this:
middle-circle triangles-inside straight-rays split-rays triangle-rays grouped-rays sep-lines
1 0 0 0 1 0 1
0 1 0 1 0 0 0
0 0 0 0 0 0 0
0 1 0 1 0 0 0
0 1 0 1 0 0 0
0 0 0 0 0 0 0
0 0 1 0 0 0 0
I would like to know how often the different combinations occur. I read the same question on Stack Overflow and applied the following code to my data:
library(gtools)
# get all vars present in each row
present <- lapply(seq(nrow(det)), function(i) names(which(det[i,] == 1)))
# get all pairs
all.pairs <- gtools::combinations(n = ncol(det), r = 2, colnames(det))
# count times pairs appear
count <- apply(all.pairs, 1, function(x){
there <- lapply(x, function(y) sapply(present, `%in%`, x = y))
sum(Reduce(`&`, there))
})
cbind(all.pairs, count)
I get the following result:
count
[1,] "grouped_rays" "middle_circle" "0"
[2,] "grouped_rays" "separation_lines" "0"
[3,] "grouped_rays" "split _rays" "0"
[4,] "grouped_rays" "straight_rays" "0"
[5,] "grouped_rays" "triangle_rays" "0"
[6,] "grouped_rays" "triangles_inside" "0"
[7,] "middle_circle" "separation_lines" "0"
[8,] "middle_circle" "split _rays" "0"
[9,] "middle_circle" "straight_rays" "0"
[10,] "middle_circle" "triangle_rays" "0"
[11,] "middle_circle" "triangles_inside" "0"
[12,] "separation_lines" "split _rays" "0"
[13,] "separation_lines" "straight_rays" "0"
[14,] "separation_lines" "triangle_rays" "0"
[15,] "separation_lines" "triangles_inside" "0"
[16,] "split _rays" "straight_rays" "0"
[17,] "split _rays" "triangle_rays" "0"
[18,] "split _rays" "triangles_inside" "0"
[19,] "straight_rays" "triangle_rays" "0"
[20,] "straight_rays" "triangles_inside" "0"
[21,] "triangle_rays" "triangles_inside" "0"
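My questions: is it possible to get not only the pairwise combinations but all combinations? And why is the count always "0"? I am trying to get a list like the one above that contains every possible combination together with how often it occurs. It should look like this: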
count
[1,] "grouped_rays" "middle_circle" "sep-lines" "2"
[2,] "grouped_rays" "separation_lines" "triangles inside" "0"
[3,] "grouped_rays" "split _rays" "1"
And of course all the other possible combinations as well. This is just an example.
Answer
Perhaps this produces the expected result?
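# For each row, collect every pair of columns that are both 1 (x holds the binary data)
tt <- do.call(rbind, apply(x == 1, 1, function(y) {
  z <- names(y[y])
  if (length(z) > 1) t(combn(z, 2))
}))
# Sort the names within each pair so that order does not matter, then tabulate
table(apply(tt, 1, function(y) paste(sort(y), collapse = " ")))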
# middle.circle sep.lines middle.circle triangle.rays
# 1 1
# sep.lines triangle.rays split.rays triangles.inside
# 1 3
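The code above counts pairs only. To also cover larger combinations, as the question asks, one possible extension is sketched below. It is not part of the original answer. The data frame `x` is a hand-typed reconstruction of the sample table (with the dotted column names seen in the output above), and the helper `row_combos` and the " + " separator are made up for illustration.

```r
# Assumed reconstruction of the sample table from the question.
x <- data.frame(
  middle.circle    = c(1, 0, 0, 0, 0, 0, 0),
  triangles.inside = c(0, 1, 0, 1, 1, 0, 0),
  straight.rays    = c(0, 0, 0, 0, 0, 0, 1),
  split.rays       = c(0, 1, 0, 1, 1, 0, 0),
  triangle.rays    = c(1, 0, 0, 0, 0, 0, 0),
  grouped.rays     = c(0, 0, 0, 0, 0, 0, 0),
  sep.lines        = c(1, 0, 0, 0, 0, 0, 0)
)

# Hypothetical helper: for one row, list every subset of size >= 2
# of the columns whose value is 1, as a single pasted label.
row_combos <- function(y) {
  z <- sort(names(y)[y == 1])
  if (length(z) < 2) return(character(0))
  unlist(lapply(2:length(z), function(k) {
    combn(z, k, FUN = paste, collapse = " + ")
  }))
}

# Gather the labels from all rows and count how often each one occurs.
combos <- unlist(lapply(seq_len(nrow(x)), function(i) row_combos(unlist(x[i, ]))))
counts <- table(combos)
counts
```

This only lists combinations that occur at least once. Combinations that never occur would have to be added with a count of 0 separately. Note also that the number of subsets per row grows exponentially with the number of 1s in that row, so this is only practical when rows have few features present.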