Compare column values and return ERROR or OK

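Problem description

I need to validate a source system against a target system and make sure the values match between them. The problem is that the source system is a mess, and it is hard to validate.

I have the following sample data. All of the rows should come out as OK, but they show up as ERROR. Does anyone know a way to do the comparison so that all of the rows below come out as OK?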

CREATE TABLE #testdata (
    ID INT
    ,ValueSource VARCHAR(800)
    ,ValueDestination VARCHAR(800)
    ,Value_Varchar_Check AS (
        CASE 
            WHEN coalesce(ValueSource, '0') = coalesce(ValueDestination, '0')
                THEN 'OK'
            ELSE 'ERROR'
            END
        )
    )

INSERT INTO #testdata (
    ID
    ,ValueSource
    ,ValueDestination
    )
SELECT 1
    ,'hepatitis c,other (specify)'
    ,'hepatitis c, other (specify)'
UNION ALL    
SELECT 2
    ,'lung problems / asthma,lung problems / asthma'
    ,'lung problems / asthma'    
UNION ALL    
SELECT 3
    ,'lung problems / asthma,diabetes'
    ,'diabetes, lung problems / asthma'    
UNION ALL    
SELECT 4
    ,'seizures/epilepsy,hepatitis c,seizures/epilepsy'
    ,'hepatitis c, seizures/epilepsy'

Tags: sql, sql-server, tsql

Solution


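I don't think you can write this as a computed column, because the comparison is quite tricky to compute. If you are using SQL Server 2016 or later, you can use STRING_SPLIT to convert the ValueSource and ValueDestination values into tables, then sort the items alphabetically with a query like the one below. (Note that TRIM, used here, requires SQL Server 2017 or later.)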

SELECT DISTINCT ID, TRIM(value) AS value,
       DENSE_RANK() OVER (PARTITION BY ID ORDER BY TRIM(value)) AS rn
FROM testdata
CROSS APPLY STRING_SPLIT(ValueSource, ',')

For ValueSource, this produces:

ID  value                   rn
1   hepatitis c             1
1   other (specify)         2
2   lung problems / asthma  1
3   diabetes                1
3   lung problems / asthma  2
4   hepatitis c             1
4   seizures/epilepsy       2
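
To see the split, trim, and de-duplicate step in isolation, here is a minimal sketch that runs against a single literal string instead of the table. It uses LTRIM(RTRIM(...)) in place of TRIM, on the assumption that you may be on SQL Server 2016, where TRIM is not available. STRING_SPLIT itself needs database compatibility level 130 or higher. The alias `item` is just an illustrative name.

```sql
-- Split one comma-separated list, strip surrounding spaces from each item,
-- and drop duplicates. LTRIM(RTRIM(...)) stands in for TRIM so the query
-- also runs on SQL Server 2016.
SELECT DISTINCT LTRIM(RTRIM(value)) AS item
FROM STRING_SPLIT('seizures/epilepsy,hepatitis c, seizures/epilepsy', ',')
ORDER BY item;
```

This returns two rows, `hepatitis c` and `seizures/epilepsy`, which is the same normalized set that row 4 produces in the table above.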

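You can then FULL OUTER JOIN these two tables on ID, value, and rn, and detect an error whenever there is a NULL on either side (since that means the values for a given ID and rn do not match):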

WITH t1 AS (
  SELECT DISTINCT ID, TRIM(value) AS value,
         DENSE_RANK() OVER (PARTITION BY ID ORDER BY TRIM(value)) AS rn
  FROM testdata
  CROSS APPLY STRING_SPLIT(ValueSource, ',')
),
t2 AS (
  SELECT DISTINCT ID, TRIM(value) AS value,
         DENSE_RANK() OVER (PARTITION BY ID ORDER BY TRIM(value)) AS rn
  FROM testdata
  CROSS APPLY STRING_SPLIT(ValueDestination, ',')
)
SELECT COALESCE(t1.ID, t2.ID) AS ID,
       CASE WHEN COUNT(CASE WHEN t1.value IS NULL OR t2.value IS NULL THEN 1 END) > 0 THEN 'Error'
            ELSE 'OK'
       END AS Status
FROM t1
FULL OUTER JOIN t2 ON t2.ID = t1.ID AND t2.rn = t1.rn AND t2.value = t1.value
GROUP BY COALESCE(t1.ID, t2.ID)

Output (for your sample data):

ID  Status
1   OK
2   OK
3   OK
4   OK

Demo on SQLFiddle
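
Since every sample row comes out as OK, it may help to check that a real mismatch is reported. The sketch below runs the same comparison against two hypothetical rows supplied inline through a VALUES list (the CTE name `sample` and IDs 5 and 6 are made up for illustration): row 5 holds the same items in a different order, while row 6 is missing `asthma` on the source side.

```sql
-- Hypothetical test rows: ID 5 should match, ID 6 should not.
WITH sample AS (
  SELECT *
  FROM (VALUES
    (5, 'diabetes,asthma', 'asthma, diabetes'),
    (6, 'diabetes',        'diabetes, asthma')
  ) AS v (ID, ValueSource, ValueDestination)
),
t1 AS (
  -- Normalized, ranked items from the source side
  SELECT DISTINCT ID, TRIM(value) AS value,
         DENSE_RANK() OVER (PARTITION BY ID ORDER BY TRIM(value)) AS rn
  FROM sample
  CROSS APPLY STRING_SPLIT(ValueSource, ',')
),
t2 AS (
  -- Normalized, ranked items from the destination side
  SELECT DISTINCT ID, TRIM(value) AS value,
         DENSE_RANK() OVER (PARTITION BY ID ORDER BY TRIM(value)) AS rn
  FROM sample
  CROSS APPLY STRING_SPLIT(ValueDestination, ',')
)
SELECT COALESCE(t1.ID, t2.ID) AS ID,
       -- Any unmatched item on either side marks the whole ID as an error
       CASE WHEN COUNT(CASE WHEN t1.value IS NULL OR t2.value IS NULL THEN 1 END) > 0 THEN 'Error'
            ELSE 'OK'
       END AS Status
FROM t1
FULL OUTER JOIN t2 ON t2.ID = t1.ID AND t2.rn = t1.rn AND t2.value = t1.value
GROUP BY COALESCE(t1.ID, t2.ID)
ORDER BY ID;
```

Row 5 comes back as OK and row 6 as Error, because `asthma` has no partner on the source side and `diabetes` ends up with a different rank on each side.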

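You can then wrap the whole query above in a CTE (call it t3) and use it to update your original table. This only works if Value_Varchar_Check is a regular column rather than the computed column from the question, since computed columns cannot be updated: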

-- Prefix with: WITH t1 AS (...), t2 AS (...), t3 AS (<the full SELECT above>)
UPDATE t
SET t.Value_Varchar_Check = t3.Status
FROM testdata t
JOIN t3 ON t.ID = t3.ID

Output:

ID  ValueSource                                         ValueDestination                    Value_Varchar_Check
1   hepatitis c,other (specify)                         hepatitis c, other (specify)        OK
2   lung problems / asthma,lung problems / asthma       lung problems / asthma              OK
3   lung problems / asthma,diabetes                     diabetes, lung problems / asthma    OK
4   seizures/epilepsy,hepatitis c,seizures/epilepsy     hepatitis c, seizures/epilepsy      OK

Demo on SQLFiddle
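
Put together, the complete update might look like the following sketch. It is not taken from the linked demo: the temp table `#check_demo` is a hypothetical stand-in for your table, and it assumes Value_Varchar_Check is declared as an ordinary VARCHAR column so that it can be written to.

```sql
-- Hypothetical demo table; Value_Varchar_Check is a plain column here (assumption),
-- not the computed column from the question.
CREATE TABLE #check_demo (
    ID INT,
    ValueSource VARCHAR(800),
    ValueDestination VARCHAR(800),
    Value_Varchar_Check VARCHAR(10) NULL
);

-- Rows 3 and 4 from the question's sample data
INSERT INTO #check_demo (ID, ValueSource, ValueDestination)
VALUES (3, 'lung problems / asthma,diabetes', 'diabetes, lung problems / asthma'),
       (4, 'seizures/epilepsy,hepatitis c,seizures/epilepsy', 'hepatitis c, seizures/epilepsy');

WITH t1 AS (
  SELECT DISTINCT ID, TRIM(value) AS value,
         DENSE_RANK() OVER (PARTITION BY ID ORDER BY TRIM(value)) AS rn
  FROM #check_demo
  CROSS APPLY STRING_SPLIT(ValueSource, ',')
),
t2 AS (
  SELECT DISTINCT ID, TRIM(value) AS value,
         DENSE_RANK() OVER (PARTITION BY ID ORDER BY TRIM(value)) AS rn
  FROM #check_demo
  CROSS APPLY STRING_SPLIT(ValueDestination, ',')
),
t3 AS (
  -- One row per ID with the overall comparison result
  SELECT COALESCE(t1.ID, t2.ID) AS ID,
         CASE WHEN COUNT(CASE WHEN t1.value IS NULL OR t2.value IS NULL THEN 1 END) > 0 THEN 'Error'
              ELSE 'OK'
         END AS Status
  FROM t1
  FULL OUTER JOIN t2 ON t2.ID = t1.ID AND t2.rn = t1.rn AND t2.value = t1.value
  GROUP BY COALESCE(t1.ID, t2.ID)
)
UPDATE t
SET t.Value_Varchar_Check = t3.Status
FROM #check_demo t
JOIN t3 ON t.ID = t3.ID;

-- Inspect the result
SELECT ID, Value_Varchar_Check FROM #check_demo ORDER BY ID;
```

Both rows should end up with OK, in line with the output shown above. If the stored status can go stale as the data changes, a view built on the t3 query may be a better fit than a stored column.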

