首页 > 解决方案 > Oracle查询将多级多对多关系分组为独立组

问题描述

基于对学生和学科交叉引用的多级 MANY TO MANY 依赖,如何形成一个 ORACLE sql/plsql 查询来对 Student_Subject 表进行分组,从而使组独立且排他如下?我在实际表中有 519 行。我正在寻找一个高性能的解决方案

示例源表:

在此处输入图像描述

预期目标输出:

在此处输入图像描述

实际源表数据创建查询

CREATE TABLE student_subject ( student_name, subject_name ) AS
 SELECT 'SMITH', 'CHEMISTRY' FROM DUAL UNION ALL
 SELECT 'ROBIN', 'PHYSICS' FROM DUAL UNION ALL
 SELECT 'SAM', 'PHYSICS' FROM DUAL UNION ALL
 SELECT 'SAM', 'MATH' FROM DUAL UNION ALL
 SELECT 'JENNY', 'MATH' FROM DUAL UNION ALL
 SELECT 'JENNY', 'ACCOUNTS' FROM DUAL UNION ALL
 SELECT 'DON', 'ENGLISH' FROM DUAL UNION ALL
 SELECT 'DON', 'SPANISH' FROM DUAL UNION ALL
 SELECT 'RON', 'HISTORY' FROM DUAL UNION ALL
 SELECT 'JOVAN', 'HISTORY' FROM DUAL UNION ALL
 -- etc. for an additional 509 more rows

有关475 行的示例集,请参阅此 dbfiddle 。

标签: sqloracleplsqlmany-to-manyhierarchy

解决方案


我正在寻找一个高性能的解决方案

如果你想要高性能,那就忘记 SQL。使用您的(高度连接的)数据,没有办法使用CONNECT BY不会查看大量额外、不必要的数据的技巧。

这是一个简单的 PL/SQL 块,它可以很快完成。到目前为止,我相信比纯 SQL 更快。(像往常一样,如果这意味着要学习新东西,我会很高兴地吃乌鸦……)

subject_name这是通过从尚未分配给组的行中查找最低的记录来工作的。这些获得分配的下一个组号。然后,我们无限循环地寻找与新组中的行直接相关的行并将它们分配给新组。当我们找不到更多直接相关的行时,该组就完成了,我们开始下一个组。当我们无法创建任何新组时,我们就完成了。

-- Add a column to student_subject to hold the group number results
ALTER TABLE student_subject ADD ( group_number NUMBER );

-- Let's time it...
SET TIMING ON

-- Go through and assign group numbers to each record.
-- Doing it iteratively in PL/SQL means we can easily look at each row only once.
BEGIN

  UPDATE student_subject SET group_number = null;
  
  LOOP
    UPDATE student_subject ss
    SET group_number = ( SELECT nvl(max(group_number),0)+1 FROM student_subject ss2 )  -- Assign next group number
    WHERE subject_name = ( SELECT min(subject_name) FROM student_subject ss2 WHERE group_number IS NULL );
    
    EXIT WHEN SQL%ROWCOUNT = 0;  -- done..  all rows have groups now.
  
    LOOP
      UPDATE student_subject ss
      SET group_number = ( SELECT max(group_number) FROM student_subject ss2 )
      WHERE group_number is null
      AND EXISTS ( SELECT 'direct relation' FROM student_subject ss2 WHERE (ss2.subject_name = ss.subject_name OR ss2.student_name = ss.student_name ) AND ss2.group_number = ( SELECT max(group_number) FROM student_subject ss3 ) );
      
      EXIT WHEN SQL%ROWCOUNT = 0;  -- finished group.  move to next group
    END LOOP;
  END LOOP;

END;
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.128

(SELECT...您可以通过用PL/SQL 变量替换更新中的表达式来进一步优化它。例如,跟踪当前组号,而不是MAX()每次都选择。我只是懒惰。


推荐阅读