How do I compare all possible values within a partition in MySQL?

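Problem description

I'm trying to compare the amounts each user spent across their transactions with each retailer. Here is a sample input table: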

user_id|retailer_id|amount_spent
1      |2          |30
1      |2          |10
1      |2          |28

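Now, for each distinct user at the same retailer, I want to check whether the amounts spent across all purchases are within 30% of each other. Take the first and second transactions: the amounts differ by 67% ($30 vs. $10), which is above the 30% threshold. However, the third row, with $28 spent, is within 30% of the first row's $30. So those two transactions, rows 1 and 3, satisfy the criterion.

Current query: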
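To make the arithmetic concrete, here is a minimal Python sketch that checks every pair of amounts rather than only neighbouring ones. The helper names `relative_diff` and `pairs_within` are hypothetical, and measuring the difference as a fraction of the larger amount is an assumption chosen to match the 67% figure above.

```python
from itertools import combinations

def relative_diff(a, b):
    """Difference between a and b as a fraction of the larger value (assumed convention)."""
    return abs(a - b) / max(a, b)

def pairs_within(amounts, threshold=0.3):
    """Return every pair of amounts whose relative difference is at most threshold."""
    return [(a, b) for a, b in combinations(amounts, 2)
            if relative_diff(a, b) <= threshold]

amounts = [30, 10, 28]  # sample amounts for user 1 at retailer 2
matching = pairs_within(amounts)
print(matching)
```

Only the (30, 28) pair passes: 30 vs. 10 differs by about 67% and 10 vs. 28 by about 64%.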

   select distinct a.customer_id, a.purchase_date 

from 
(
select 
  customer_id,
  retailer,
  purchase_date,
  purchase_amount,
  Lag(purchase_amount) over (partition by customer_id,retailer) as previous_amt

  from tbl
)a 

where abs(a.purchase_amount-a.previous_amt)/a.purchase_amount <=0.3

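The output is empty because the query only compares consecutive transaction amounts. It never compares rows 1 and 3, which do satisfy the condition; the query should return those 2 rows.

How should I adjust my query from here?

Tags: mysql, sql

Solution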


Consider the following:

DROP TABLE IF EXISTS my_table;

CREATE TABLE my_table
(id SERIAL PRIMARY KEY
,user_id INT NOT NULL 
,retailer_id INT NOT NULL
,amount_spent INT NOT NULL
);

INSERT INTO my_table (user_id,retailer_id,amount_spent) VALUES
(1,2,30),
(1,2,10),
(1,2,28),
(1,3,10),
(1,3,40),
(2,1,20);

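The following query shows all rows for which no other row with the same (user_id, retailer_id) combination has an amount within 30% of it (my arithmetic or logic may be slightly off, but hopefully you get the idea):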

SELECT a.*
  FROM my_table a
  LEFT 
  JOIN 
     ( SELECT y.*
         FROM my_table x
         JOIN my_table y 
           ON y.id <> x.id
          AND y.user_id = x.user_id
          AND y.retailer_id = x.retailer_id
          -- "within 30%" of x means between 70% and 130% of x.amount_spent
          AND y.amount_spent BETWEEN x.amount_spent * 0.7 AND x.amount_spent * 1.3
     ) b
    ON b.id = a.id
 WHERE b.id IS NULL;

   +----+---------+-------------+--------------+
   | id | user_id | retailer_id | amount_spent |
   +----+---------+-------------+--------------+
   |  2 |       1 |           2 |           10 |
   |  4 |       1 |           3 |           10 |
   |  5 |       1 |           3 |           40 |
   |  6 |       2 |           1 |           20 |
   +----+---------+-------------+--------------+
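The question actually asks for the rows that do satisfy the criterion, so it may help to see the inverse: a plain self-join that lists each matching pair once. The sketch below is run through Python's built-in `sqlite3` module as a stand-in for MySQL, purely so the example is self-contained. The column aliases (`first_id`, `second_amount`, and so on) are hypothetical. "Within 30%" is assumed to mean between 70% and 130% of the lower-id row's amount.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for a runnable demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table (
  id INTEGER PRIMARY KEY,          -- auto-assigned 1..6, like SERIAL in the original
  user_id INT NOT NULL,
  retailer_id INT NOT NULL,
  amount_spent INT NOT NULL
);
INSERT INTO my_table (user_id, retailer_id, amount_spent) VALUES
  (1,2,30),(1,2,10),(1,2,28),(1,3,10),(1,3,40),(2,1,20);
""")

# Self-join: pair every row x with every later row y of the same user and retailer,
# keeping the pair only if y's amount lies within 70%..130% of x's amount.
pairs_sql = """
SELECT x.user_id, x.retailer_id,
       x.id AS first_id,  x.amount_spent AS first_amount,
       y.id AS second_id, y.amount_spent AS second_amount
  FROM my_table x
  JOIN my_table y
    ON y.id > x.id                       -- report each pair once
   AND y.user_id = x.user_id
   AND y.retailer_id = x.retailer_id
   AND y.amount_spent BETWEEN x.amount_spent * 0.7 AND x.amount_spent * 1.3
 ORDER BY x.id, y.id
"""
pairs = conn.execute(pairs_sql).fetchall()
print(pairs)
```

With the sample data, the only pair returned is rows 1 and 3 (amounts 30 and 28) for user 1 at retailer 2. Because the comparison is relative to the lower-id row, a pair sitting right at the boundary could be classified differently if the roles were swapped; a symmetric rule would need to compare against, for example, the larger of the two amounts.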

If necessary, we can refine this further as follows:

 SELECT a.user_id
      , a.retailer_id
   FROM my_table a
   LEFT 
   JOIN 
      ( SELECT y.*
          FROM my_table x
          JOIN my_table y 
            ON y.id <> x.id
           AND y.user_id = x.user_id
           AND y.retailer_id = x.retailer_id
           AND y.amount_spent BETWEEN x.amount_spent * 0.7 AND x.amount_spent * 1.3
      ) b
     ON b.id = a.id
  WHERE b.id IS NULL
  GROUP 
     BY a.user_id
      , a.retailer_id 
 HAVING COUNT(*) > 1;

  +---------+-------------+
  | user_id | retailer_id |
  +---------+-------------+
  |       1 |           3 |
  +---------+-------------+
