How do I get the percentage of a result set for each day?

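Problem description

I am trying to retrieve the percentage of available products for specific merchants over the last 30 days.

Example of the desired result: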

20210504 merchant1 20%
20210504 merchant2 30%
20210505 merchant1 25%
20210505 merchant2 35%

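There are three tables:

  1. availability (availability information per product, merchant, and date)
  2. products (holds the manufacturer ID we want to filter on)
  3. merchants (merchant information)

Minimal example: https://www.db-fiddle.com/f/wtnK5R4DWi7Dy6LwLaP4mX/0

This only returns the percentage for one merchant and one day: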

-- get percentage of available products per merchant over time
SELECT
   m.name AS metric,
   t.s AS AMOUNT_AVAILABLE,
   count(*) AS AMOUNT_TOTAL,
   t.s / count(*) AS percentage
FROM availability p
CROSS JOIN (
    SELECT count(*) AS s FROM availability p2
    INNER JOIN products mp on p2.SKU = mp.SKU
    WHERE
            availability = 'sofort lieferbar'
        AND date = curdate() - interval 1 day -- testing for one day, but we want a time series
        AND mp.MANUFACTURER_ID = 1
        -- AND p2.merchant_id = p.merchant_id -- does not work
        -- AND merchant_id = 2
        -- GROUP BY merchant_id
    )  t
INNER JOIN products mp on p.SKU = mp.SKU
INNER JOIN merchants m ON m.id = p.MERCHANT_ID

WHERE
        p.date = curdate() - interval 1 day 
    and mp.MANUFACTURER_ID = 1
    -- and merchant_id = 2
GROUP BY
    merchant_id

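Now I am trying to somehow combine the cross join with the FROM table so that I get the information per merchant and per date. How can the cross join be joined to the FROM table?

Data and schema: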

create table merchants
(
    id           tinyint unsigned                                      not null
        primary key,
    name         varchar(255)                                          null
);
 INSERT INTO merchants (id, name) VALUES (1, 'Amazon');
 INSERT INTO merchants (id, name) VALUES (2, 'eBay');
 
 
 create table availability
(
    DATE         date                                                                                not null,
    SKU          char(10)                                                                            not null,
    merchant_id  tinyint unsigned                                                                    not null,
    availability enum ('sofort lieferbar', 'verzögert lieferbar', 'nicht lieferbar', 'außer Handel') null,
    constraint DATE
        unique (DATE, SKU, merchant_id)
);

INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-11', '1', 1, 'sofort lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-11', '1', 2, 'nicht lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-12', '1', 1, 'sofort lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-12', '1', 2, 'nicht lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-13', '1', 1, 'nicht lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-13', '1', 2, 'sofort lieferbar');


create table products
(
    SKU               char(8)                                      not null
        primary key,
    NAME              varchar(255)                                 null,
    MANUFACTURER_ID   mediumint unsigned                           null,
    updated           datetime default CURRENT_TIMESTAMP           not null on update CURRENT_TIMESTAMP
);

INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('1', 'Sneaker', 1, '2021-05-12 02:27:46');
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('2', 'Ball', 1, '2021-05-12 02:27:46');
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('3', 'Pen', 2, '2021-05-12 02:27:46');
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('4', 'Paper', 2, '2021-05-12 02:27:46');

Tags: mysql

Solution


I wrote a query that seems to work with the data you provided. Let me know if there are any problems and I'll see what I can do.

   SELECT
       concat('merchant', t.ID) as merchant,
       t.Date,
       g.prod_available / t.all_prod_from_merch AS percentage_available 
    FROM
       (
          SELECT
             # gets the total number of products per merchant and date in the time range
             Date,
             ID,
             count(merchant_ID) AS all_prod_from_merch 
          FROM
             merchants m 
             JOIN
                availability a 
                ON m.ID = a.merchant_ID 
          WHERE
             Date < curdate() 
             AND Date >= curdate() - interval 10 day 
          GROUP BY
             merchant_ID,
             Date
       )
       t 
       LEFT JOIN
          (
             SELECT
                merchant_ID,
                Date,
                COUNT(merchant_ID) AS prod_available 
             FROM
                availability 
             WHERE
                AVAILABILITY = 'sofort lieferbar' 
                AND date IN 
                (
                   SELECT
                      Date 
                   FROM
                      availability 
                   WHERE
                      date < curdate() 
                      AND date >= curdate() - interval 10 day 
                   GROUP BY
                      Date
                )
             GROUP BY
                merchant_ID,
                Date 
          )
          g 
          ON g.merchant_ID = t.ID 
          AND g.Date = t.Date 
    ORDER BY
       t.date;

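The first SELECT in the join gets the total number of products for each merchant and date in the time range. The second gets the number of available products for each merchant and date. The outer SELECT at the top then only computes the fraction.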
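
As an alternative sketch, not the answerer's query, the two counts can also be computed in a single pass with conditional aggregation, which avoids joining two derived tables. The version below assumes the schema from the question, adds the manufacturer filter from the question's own attempt (MANUFACTURER_ID = 1), and uses the 30-day window the question asks for; the column aliases (report_date, amount_available, amount_total, percentage) are made up for illustration.

```sql
-- Sketch: percentage of 'sofort lieferbar' rows per date and merchant.
-- Assumes the availability / products / merchants tables from the question.
SELECT
    a.DATE                                              AS report_date,
    m.name                                              AS merchant,
    -- count only the rows that are available; NULL availability counts as 0
    SUM(CASE WHEN a.availability = 'sofort lieferbar'
             THEN 1 ELSE 0 END)                         AS amount_available,
    COUNT(*)                                            AS amount_total,
    100 * SUM(CASE WHEN a.availability = 'sofort lieferbar'
                   THEN 1 ELSE 0 END) / COUNT(*)        AS percentage
FROM availability a
INNER JOIN products  mp ON mp.SKU = a.SKU
INNER JOIN merchants m  ON m.id   = a.merchant_id
WHERE a.DATE >= CURDATE() - INTERVAL 30 DAY   -- window from the question
  AND a.DATE <  CURDATE()
  AND mp.MANUFACTURER_ID = 1                  -- assumed filter, taken from the question
GROUP BY a.DATE, m.id, m.name
ORDER BY a.DATE, m.name;
```

Because every group contains at least one row, COUNT(*) is never zero here, so no division-by-zero guard is needed. A merchant with no rows at all on a given date simply does not appear in the result for that date.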

