首页 > 解决方案 > Oracle sql:过滤仅相差很小时间的“假”(重复)行

问题描述

几天前我发布了一个类似的问题,但不幸的是,我误解了客户对我必须解决的问题的陈述(最准确地说,我忘记了其中的一部分)。这就是为什么这篇文章看起来像是重复的,但事实并非如此。

这是我的原始帖子:Oracle sql:过滤仅相差很小的重复行

所以让我们再来一次:

我有一个 Oracle 表,其中包含由 parquimeters 触发的事件警报。警报具有打开/关闭状态,当警报处于打开 (PKN_EVENTSTATUS='Open') 和关闭 (PKN_EVENTSTATUS='Close') 之间的一小段时间 -RECEIVEDDATE- (例如 Open RECEIVEDDATE = x | Close RECEIVEDDATE = x + 30 秒 - 当然 - PKN_EVENTNAME 是相同的)这两个事件行(打开/关闭)都被视为“假”警报(由 parquimeter 因“错误”而触发的“故障警报”)所以两者都应该被删除。

我需要创建一个 Oracle SQL 查询来选择所有“假”警报,以便我可以删除它们。同样,在“打开”和“关闭”状态之间具有微小时间差异 (RECEIVEDDATE) 的警报。

我开始创建一个可能有效的查询,但它非常慢,所以我什至无法测试它,因为它需要太长时间。我很确定它可以优化,但现在找不到如何优化,所以我希望任何人都可以帮助我。

我当前非常慢的查询:

select t1.ID, t1.PKN_EVENTNAME, t1.PKN_EVENTSTATUS, t1.RECEIVEDDATE
from PARQUIMETERS_ALARMS t1
where
  exists
     (select 'x'
      from   PARQUIMETERS_ALARMS t2
      where  t1.PKN_EVENTNAME = t2.PKN_EVENTNAME
             and ((t1.PKN_EVENTSTATUS = 'Open' and t2.PKN_EVENTSTATUS = 'Close'
             and abs(t1.RECEIVEDDATE - t2.RECEIVEDDATE) * 24 * 60 * 60 < 30) -- < 30 sec
             or (t1.PKN_EVENTSTATUS = 'Close' and t2.PKN_EVENTSTATUS = 'Open'
             and abs(t2.RECEIVEDDATE - t1.RECEIVEDDATE) * 24 * 60 * 60 < 30))) -- < 30 sec

标签: sqloracleoracle11gsubqueryquery-performance

解决方案


exists使用两个条件而不是一个条件可能会更快:

select t1.id, t1.pkn_eventname, t1.pkn_eventstatus, t1.receiveddate
from parquimeters_alarms t1
where
    (
        t1.pkn_eventstatus = 'Open'
        and exists (
            select 1
            from parquimeters_alarms t2
            where 
                t2.pkn_eventname = t1.pkn_eventname 
                and t2.pkn_eventstatus = 'Close'
                and t2.receiveddate < t1.receiveddate + 30 / 60 / 60 / 24
        )
    )
    or (
        t1.pkn_eventstatus = 'Close'
        and exists (
            select 1
            from parquimeters_alarms t2
            where 
                t2.pkn_eventname = t1.pkn_eventname 
                and t2.pkn_eventstatus = 'Open'
                and t1.receiveddate < t2.receiveddate + 30 / 60 / 60 / 24
        )
    )
    

此查询可能会利用(pkn_eventname, pkn_eventstatus, receiveddate).

您还可以考虑union all,这避免了对or条件的需要:

select t1.id, t1.pkn_eventname, t1.pkn_eventstatus, t1.receiveddate
from parquimeters_alarms t1
where
    t1.pkn_eventstatus = 'Open'
    and exists (
        select 1
        from parquimeters_alarms t2
        where 
            t2.pkn_eventname = t1.pkn_eventname 
            and t2.pkn_eventstatus = 'Close'
            and t2.receiveddate < t1.receiveddate + 30 / 60 / 60 / 24
    )
union all
select t1.id, t1.pkn_eventname, t1.pkn_eventstatus, t1.receiveddate
from parquimeters_alarms t1
where 
    t1.pkn_eventstatus = 'Close'
    and exists (
        select 1
        from parquimeters_alarms t2
        where 
            t2.pkn_eventname = t1.pkn_eventname 
            and t2.pkn_eventstatus = 'Open'
            and t1.receiveddate < t2.receiveddate + 30 / 60 / 60 / 24
    )

推荐阅读