首页 > 解决方案 > 如何在使用 SSIS 加载数据时生成 ID 以保持多级 XML 的关系完整性?

问题描述

我正在尝试使用 SSIS 将 xml 文件加载到 sql server 数据库中。我的xml数据结构如下:

<PRODUCT xmlns="http://www.areabc.com/ABC" version="5.252">
  <APPLICANTS seller id="0">
    <APPLICANT first_name="ABC" last_name="DEF" middle_name="" xmlns:d3p1="http://www.areabc.com/InternalUse">
      <INFO state="" country="" />
      <ADDRESS duration="10" description="">
      </ADDRESS>
    </APPLICANT>
  </APPLICANTS>
  .
  .
  .
  
</PRODUCT>

每个 XML 文件大致给了我 20 个表(申请表、信息表、地址表等)((一些 XML 文件不包含所有节点))。我有 30 个要加载的 xml 文件。当我将数据转储到 SQL 服务器表中时,SSIS 自动生成表 ID,如申请人 ID 或 INFO_ID 等。现在这个申请人 ID 或 INFO_ID 是相同的,对于来自 30 个 XML 的所有 30 行的申请人 ID = 5 或 INFO_ID = 8文件。所有 20 张桌子也是如此(图片)

在此处输入图像描述.

我想知道如何为 SSIS 中的每个表生成唯一的 ID,以便它们可以连接并保持关系完整性。示例:可以将申请人表与地址表连接起来。

感谢任何帮助

标签: sql-serverxmlssisrelational-databaseprimary-key

解决方案


这是一个概念性的例子。

您可以根据自己的需要进行调整。

SQL

-- DDL and sample data population, start
USE tempdb;
GO

DROP TABLE IF EXISTS #city;
DROP TABLE IF EXISTS #state;

-- parent table
CREATE TABLE #state  (
   stateID INT IDENTITY PRIMARY KEY, 
   stateName VARCHAR(30), 
   abbr CHAR(2), 
   capital VARCHAR(30)
);
-- child table (1-to-many)
CREATE TABLE #city (
   cityID INT IDENTITY, 
   stateID INT NOT NULL FOREIGN KEY REFERENCES #state(stateID), 
   city VARCHAR(30), 
   [population] INT,
   PRIMARY KEY (cityID, stateID, city)
);
-- mapping table to preserve IDENTITY ids
DECLARE @idmapping TABLE (GeneratedID INT PRIMARY KEY,
    NaturalID VARCHAR(20) NOT NULL UNIQUE);

DECLARE @xml XML =
N'<root>
   <state>
      <StateName>Florida</StateName>
      <Abbr>FL</Abbr>
      <Capital>Tallahassee</Capital>
      <cities>
         <city>
            <city>Miami</city>
            <population>470194</population>
         </city>
         <city>
            <city>Orlando</city>
            <population>285713</population>
         </city>
      </cities>
   </state>
   <state>
      <StateName>Texas</StateName>
      <Abbr>TX</Abbr>
      <Capital>Austin</Capital>
      <cities>
         <city>
            <city>Houston</city>
            <population>2100263</population>
         </city>
         <city>
            <city>Dallas</city>
            <population>5560892</population>
         </city>
      </cities>
   </state>
</root>';
-- DDL and sample data population, end

;WITH rs AS 
(
    SELECT stateName   = p.value('(StateName/text())[1]', 'VARCHAR(30)'),
           abbr         = p.value('(Abbr/text())[1]', 'CHAR(2)'),
           capital      = p.value('(Capital/text())[1]', 'VARCHAR(30)')
    FROM   @xml.nodes('/root/state') AS t(p)
 )
 MERGE #state AS o
 USING rs ON 1 = 0
 WHEN NOT MATCHED THEN
    INSERT(stateName, abbr, capital)  
       VALUES(rs.stateName, rs.Abbr, rs.Capital)
 OUTPUT inserted.stateID, rs.stateName 
   INTO @idmapping (GeneratedID, NaturalID);

;WITH Details AS 
(
    SELECT NaturalID   = p.value('(StateName/text())[1]', 'VARCHAR(30)'),
           city         = c.value('(city/text())[1]', 'VARCHAR(30)'),
           [population]   = c.value('(population/text())[1]', 'INT')
    FROM   @xml.nodes('/root/state') AS A(p)   -- parent
      CROSS APPLY A.p.nodes('cities/city') AS B(c) -- child
) 
INSERT #city (stateID, city, [Population])
SELECT m.GeneratedID, d.city, d.[Population]
FROM   Details AS d
   INNER JOIN @idmapping AS m ON d.NaturalID = m.NaturalID;

-- test
SELECT * FROM #state;
SELECT * FROM @idmapping;
SELECT * FROM #city;

推荐阅读