首页 > 解决方案 > Neo4j 笛卡尔积限制 - 查询优化

问题描述

给定以下架构

在此处输入图像描述

我需要获取Collection每个列表以及用户指定的一组ProductCard(它的Product变体)匹配标准:

我从这样的查询开始

MATCH (c:Collection {type: 'selected_collection_type'})<-[:FROM_COLLECTION]-(:Product)-[:OF_TYPE]->(pt1:ProductType {title: '1st product type'}), (c)<-[:FROM_COLLECTION]-(:Product)-[:OF_TYPE]->(pt2:ProductType {title: '2nd product type'}),(c)<-[:FROM_COLLECTION]-(:Product)-[:OF_TYPE]->(pt3:ProductType {title: '3rd product type'}), (c)<-[:FROM_COLLECTION]-(:Product)-[:OF_TYPE]->(pt4:ProductType {title: '4th product type'})
CALL apoc.cypher.run('
WITH {c} AS c, {pt1} AS pt1, {pt2} AS pt2, {pt3} AS pt3, {pt4} AS pt4
    MATCH (pt1)<-[:OF_TYPE]-(p1:Product)-[:FROM_COLLECTION]->(c), (pt2)<-[:OF_TYPE]-(p2:Product)-[:FROM_COLLECTION]->(c), (pt3)<-[:OF_TYPE]-(p3:Product)-[:FROM_COLLECTION]->(c), (pt4)<-[:OF_TYPE]-(p4:Product)-[:FROM_COLLECTION]->(c), (pc1:ProductCard)-[:VARIANT_OF]->(p1), (pc2:ProductCard)-[:VARIANT_OF]->(p2), (pc3:ProductCard)-[:VARIANT_OF]->(p3), (pc4:ProductCard)-[:VARIANT_OF]->(p4)
    WHERE (pc1.price + pc2.price + pc3.price + pc4.price < price_margin_for_set)
    RETURN pc1, pc2, pc3, pc4, (p1.weight + p2.weight + p3.weight + p4.weight) AS sweight ORDER BY sweight DESC LIMIT 1
', {c:c, pt1:pt1, pt2:pt2, pt3:pt3, pt4:pt4}) YIELD value
RETURN c, value ORDER BY value.sweight DESC LIMIT 8;

它适用于多达 3 种选定的产品类型,但是当我添加第 4 种产品类型时,速度会急剧下降。这里的问题是我只需要从子查询返回的 1 个集合,但是从所有产品变体(Product可以有 1..~10 ProductCard)计算的笛卡尔积对于 4 种类型来说相当大。

如何优化此查询以获得从子查询返回 1 组匹配价格标准所需的性能/减少变化计数?

这是EXPLAIN 解释

编辑:稍微改变了查询

WITH ['Product Type 1', 'Product Type 2', 'Product Type 3', 'Product Type 4'] as types
MATCH (c:Collection)<-[:FROM_COLLECTION]-(:Product)-[:OF_TYPE]->(pt:ProductType)
WHERE pt.title in types AND c.type = 'collection type'
WITH c, size(types) as inputCnt, count(DISTINCT pt) as cnt
WHERE cnt = inputCnt
CALL apoc.cypher.run('
WITH {c} AS c 
MATCH (c)<-[:FROM_COLLECTION]-(p1:Product)-[:OF_TYPE]->(:ProductType {title: "Product Type 1"})
MATCH (pc1:ProductCard)-[:VARIANT_OF]->(p1)
MATCH (c)<-[:FROM_COLLECTION]-(p2:Product)-[:OF_TYPE]->(:ProductType {title: "Product Type 2"})
MATCH (pc2:ProductCard)-[:VARIANT_OF]->(p2)
MATCH (c)<-[:FROM_COLLECTION]-(p3:Product)-[:OF_TYPE]->(:ProductType {title: "Product Type 3"})
MATCH (pc3:ProductCard)-[:VARIANT_OF]->(p3)
MATCH (c)<-[:FROM_COLLECTION]-(p4:Product)-[:OF_TYPE]->(:ProductType {title: "Product Type 4"})
MATCH (pc4:ProductCard)-[:VARIANT_OF]->(p4)
WHERE (pc1.price + pc2.price + pc3.price + pc4.price < 1000)
RETURN pc1, pc2, pc3, pc4, (p1.weight + p2.weight + p3.weight + p4.weight) AS sweight ORDER BY sweight DESC LIMIT 1
', {c:c}) YIELD value
RETURN DISTINCT c, value LIMIT 8;

解释

整个查询

子查询说明

子查询

标签: graphneo4jcypher

解决方案


打破笛卡尔积的最简单方法是添加逻辑,使一组明显胜过其他组。例子

WHERE p1.value > p2.value > p3.value > p4.value

但是,此查询的性质仍然会导致许多集具有微小的微不足道的变化。我会改为 return WHERE p1.value < 1000 WITH COLLECT(p1) as possible,并处理其余的客户端以最小化数据传输。(如果你真的想要,你可以在收集时按值对 p1 排序,然后对最小的 3 求和,然后过滤掉任何值大于1000-smallest_set但感觉有点过分的东西)


推荐阅读