首页 > 解决方案 > Postgresql:需要帮助用纯 SQL 替换我的编程解决方案(跨多个表的多列聚合)

问题描述

我有以下三个表。

Customer
+----+------------+-----------+
| ID | First_Name | Last_Name |
+----+------------+-----------+

Purchase
+----+-----+-------+
| ID | VIN | PRICE |
+----+-----+-------+

Part
+-----+-----------+--------------+------+
| VIN | PART_NAME | ORDER_NUMBER | COST |
+-----+-----------+--------------+------+

对于每个客户,我需要找到:

我目前在 Python 中工作,并且已经成功地使用了几个查询,以及坦率地说 hacky 的 Python 代码,以达到预期的结果。但是,我想尽可能多地使用 SQL 来做这件事,因为这是一个介绍性数据库课程的作业。

  1. 首先,我找到我们向其购买车辆的所有客户的 ID,以及从他们那里购买的车辆数量,并将其存储在一个列表 (CUSTOMER_LIST) 中。
SELECT T1.ID, COUNT(*) 
FROM CUSTOMER AS T1, PURCHASE AS T2 
WHERE T1.ID = T2.ID
GROUP BY T1.ID
  1. 对于列表中的每个 ID,我查询每个客户的名字和姓氏,并将它们连接起来,并将它们附加到 CUSTOMER_LIST。
SELECT CONCAT(FIRST_NAME, ' ', LAST_NAME)
FROM CUSTOMER
WHERE ID = (CUSTOMER_LIST[0])

该列表现在包含客户的 ID、售出的车辆数量和他们的全名。

  1. 然后我查询从客户那里购买的每辆车。
SELECT VIN
FROM PURCHASE
WHERE ID = (CUSTOMER_LIST[0])
  1. 对于从该特定客户处购买的每辆车,我检索了车辆所需的零件总数以及零件的​​总成本。
SELECT COUNT(*) FROM PART WHERE VIN = (VEHICLE_LIST[0])
SELECT SUM(COST) FROM PART WHERE VIN = (VEHICLE_LIST[0])

这些值除以购买的车辆数量,得出平均零件数量和平均零件成本。平均值附加到 CUSTOMER_LIST。

  1. 最后,我查询了从客户那里购买的车辆的平均价格。
SELECT to_char(AVG(PRICE), '9999999999D99') FROM PURCHASE WHERE ID = (CUSTOMER_LIST[0])

这些值也附加到 CUSTOMER_LIST。

这个样本数据

+---------------+------------+-----------+
|      ID       | First_Name | Last_Name |
+---------------+------------+-----------+
| S530460864050 | JOHN       | SMITH     |
+---------------+------------+-----------+

+---------------+-------------------+-------+
|      ID       |        VIN        | PRICE |
+---------------+-------------------+-------+
| S530460864050 | 1GCHG39R5W1012259 |  2500 |
| S530460864050 | 1FD0X4HT5FEB20353 |  5000 |
+---------------+-------------------+-------+

+-------------------+-----------------------------+----------------------+-------+
|        VIN        |          PART_NAME          |     ORDER_NUMBER     | COST  |
+-------------------+-----------------------------+----------------------+-------+
| 1GCHG39R5W1012259 | Spark Plug Asm              | 1FD0X4HT5FEB20353-01 | 20.84 |
| 1GCHG39R5W1012259 | Filter Asm,Oil              | 1FD0X4HT5FEB20353-01 | 58.83 |
| 1GCHG39R5W1012259 | Switch Asm-Ignition & Start | 1FD0X4HT5FEB20353-01 | 13.72 |
| 1GCHG39R5W1012259 | Bearing Asm-Front Wheel     | 1FD0X4HT5FEB20353-02 | 61.52 |
| 1GCHG39R5W1012259 | Element-Air Cleaner         | 1FD0X4HT5FEB20353-02 | 21.85 |

应导致以下结果

+------------+---------------+-------------------------+-------------------------+-----------------------+
| Full Name  | Vehicles Sold | Average Cost of Vehicle | Average Number of Parts | Average Cost of Parts |
+------------+---------------+-------------------------+-------------------------+-----------------------+
| JOHN SMITH |             2 |                 3750.00 |                     2.5 |                 35.35 |
+------------+---------------+-------------------------+-------------------------+-----------------------+

我用我的解决方案得出了正确的值,但是代码不是以非常直观或高效的方式编写的,如果可能的话,我想完全通过 SQL 来实现它。

标签: sqlpostgresql

解决方案


这应该这样做:

SELECT
  c.first_name || ' ' || c.last_name AS "Full Name", 
  COUNT(DISTINCT p.VIN) AS "Vehicles Sold", 
  AVG(p.PRICE) AS "Average Cost of Vehicle", 
  MAX(num_parts_per_vehicle * 1.00) / COUNT(DISTINCT p.VIN) AS "Average Number of Parts", 
  MAX(cost_parts_per_vehicle) AS "Average Cost of Parts"
FROM customer c
LEFT JOIN purchase p ON c.ID = p.ID -- Get car info
LEFT JOIN (
  SELECT p.id,
    -- avg parts per vehicle
    COUNT(pt.PART_NAME) / COUNT(DISTINCT p.VIN) AS num_parts_per_vehicle,
    -- avg cost per vehicle
    CAST(SUM(pt.COST) / COUNT(pt.ORDER_NUMBER) AS DECIMAL(10,2)) AS cost_parts_per_vehicle
  FROM purchase p
  INNER JOIN part pt ON p.VIN = pt.VIN
  GROUP BY 1 -- Get info per customer
) pt ON c.id = pt.id -- Get summarized parts info
GROUP BY c.first_name, c.last_name;
;

SQL小提琴

我看到的一个潜在问题是计算Average number of partsAverage cost of parts似乎计算不同。对于零件数量,您有两辆车和五个零件,平均为 2.5。对于平均成本,五个零件的总成本为 176.76 美元,平均为 35.35 美元,这意味着您只计算了一辆车。您似乎正在使用不同的逻辑来计算这两个值。


推荐阅读