r - r - 查找所有组合,不包括来自同一组的组合
问题描述
我有一组点。我想要一个点与其他所有点的组合,不包括与它在同一组中的点。
考虑示例数据:
Group Longitude Latitude
Group1 1.5 4.0
Group1 3.0 5.0
Group1 1.3 3.0
Group2 2.0 8.0
Group2 0.7 6.5
Group3 2.0 3.0
然后我想要的输出将是这样的:
GroupA LongitudeA LatitudeA GroupB LongitudeB LatitudeB
Group1 1.5 4.0 Group2 2.0 8.0
Group1 1.5 4.0 Group2 0.7 6.5
Group1 1.5 4.0 Group3 2.0 3.0
Group1 3.0 5.0 Group2 2.0 8.0
Group1 3.0 5.0 Group2 0.7 6.5
Group1 3.0 5.0 Group3 2.0 3.0
等等
可重现的数据:
structure(list(Group = c("Group 1", "Group 1", "Group 1", "Group 2",
"Group 2", "Group 3"), Longitude = c(1.5, 3, 1.3, 2, 0.7, 2),
Latitude = c(4, 5, 3, 8, 6.5, 3)), class = "data.frame", row.names = c(NA,
-6L), spec = structure(list(cols = list(Group = structure(list(), class =
c("collector_character",
"collector")), Longitude = structure(list(), class = c("collector_double",
"collector")), Latitude = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"))
解决方案
我们可以通过过滤所有行的笛卡尔 SQL 连接来解决这个问题Group = Group_B
。这是对sqldf
包的简单操作。
data_a <-
structure(
list(
Group = c("Group 1", "Group 1", "Group 1", "Group 2",
"Group 2", "Group 3"),
Longitude = c(1.5, 3, 1.3, 2, 0.7, 2),
Latitude = c(4, 5, 3, 8, 6.5, 3)
),
class = "data.frame",
row.names = c(NA,-6L),
spec = structure(list(
cols = list(
Group = structure(list(), class =
c("collector_character",
"collector")),
Longitude = structure(list(), class = c("collector_double",
"collector")),
Latitude = structure(list(), class = c("collector_double",
"collector"))
),
default = structure(list(), class = c("collector_guess",
"collector")),
skip = 1
), class = "col_spec")
)
读取数据后,我们将其复制到第二个数据框中,并通过附加_B
到每个列名称来重命名第二个数据框中的列。
data_b <- data_a
colnames(data_b) <- paste(colnames(data_b),"_B",sep="")
然后我们加载sqldf
并执行一个 SQL 查询,该查询创建 和 的笛卡尔积,将 中的每一行data_a
与中的每一行data_b
匹配。该子句消除了结果集中等于的每一行。data_a
data_b
where
Group
Group_B
library(sqldf)
sqlStmt <- c('select * from data_a, data_b
where "Group" <> "Group_B"')
joined <- sqldf(sqlStmt)
...和输出:
> joined
Group Longitude Latitude Group_B Longitude_B Latitude_B
1 Group 1 1.5 4.0 Group 2 2.0 8.0
2 Group 1 1.5 4.0 Group 2 0.7 6.5
3 Group 1 1.5 4.0 Group 3 2.0 3.0
4 Group 1 3.0 5.0 Group 2 2.0 8.0
5 Group 1 3.0 5.0 Group 2 0.7 6.5
6 Group 1 3.0 5.0 Group 3 2.0 3.0
7 Group 1 1.3 3.0 Group 2 2.0 8.0
8 Group 1 1.3 3.0 Group 2 0.7 6.5
9 Group 1 1.3 3.0 Group 3 2.0 3.0
10 Group 2 2.0 8.0 Group 1 1.5 4.0
11 Group 2 2.0 8.0 Group 1 3.0 5.0
12 Group 2 2.0 8.0 Group 1 1.3 3.0
13 Group 2 2.0 8.0 Group 3 2.0 3.0
14 Group 2 0.7 6.5 Group 1 1.5 4.0
15 Group 2 0.7 6.5 Group 1 3.0 5.0
16 Group 2 0.7 6.5 Group 1 1.3 3.0
17 Group 2 0.7 6.5 Group 3 2.0 3.0
18 Group 3 2.0 3.0 Group 1 1.5 4.0
19 Group 3 2.0 3.0 Group 1 3.0 5.0
20 Group 3 2.0 3.0 Group 1 1.3 3.0
21 Group 3 2.0 3.0 Group 2 2.0 8.0
22 Group 3 2.0 3.0 Group 2 0.7 6.5
如果输出需要消除组被反转的行(即Group
值 2 与Group_B
等于 1 匹配的点),可以按如下方式执行此操作:
sqlStmt <- c('select * from data_a, data_b
where "Group" < "Group_B"')
joined <- sqldf(sqlStmt)
joined
...和输出:
> joined
Group Longitude Latitude Group_B Longitude_B Latitude_B
1 Group 1 1.5 4.0 Group 2 2.0 8.0
2 Group 1 1.5 4.0 Group 2 0.7 6.5
3 Group 1 1.5 4.0 Group 3 2.0 3.0
4 Group 1 3.0 5.0 Group 2 2.0 8.0
5 Group 1 3.0 5.0 Group 2 0.7 6.5
6 Group 1 3.0 5.0 Group 3 2.0 3.0
7 Group 1 1.3 3.0 Group 2 2.0 8.0
8 Group 1 1.3 3.0 Group 2 0.7 6.5
9 Group 1 1.3 3.0 Group 3 2.0 3.0
10 Group 2 2.0 8.0 Group 3 2.0 3.0
11 Group 2 0.7 6.5 Group 3 2.0 3.0
推荐阅读
- windows - 从远程服务器渲染 jupyter notebook 时出现奇怪的线条而不是单元格
- python - 在 PyCham 中设置工作目录似乎不起作用
- django - Django模型:如果大小写匹配,如何更新值,否则保持当前值
- javascript - 优化及时确定许可证使用的过程
- python - 为什么我的模型的性能停滞不前?
- r - 清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单清单
- sql - 关于 ON DELETE SET DEFAULT 的问题。默认值是否需要存在于引用的表中?
- bash - Shell:将字符串添加到每行的末尾,与模式匹配。文件名在另一个文件中给出
- php - 如何将 updateOrCreate 与 hasMany 关系一起使用?
- flutter - Flutter ClipPath 阴影