首页 > 解决方案 > 合并表中的记录并更新相关表

问题描述

我有一个主表,其中包含链接到其他各种表的用户。有时由于导入的数据错误,此主表中有重复项,我想合并它们。请参见下表。

表:用户

UserID    Username    FirstName    LastName
1         Main        John         Doe
2         Duplicate   John         Doo

表:记录1

RecordID  RecordName      CreatedUserID   UpdatedUserID
1         Test record 1   1               2
2         Test record 2   2               null
3         Test record 3   2               null

CreatedUserID 和 UpdatedUserID 是 Users.UserID 的外部列。
所以目前如果我想合并用户 1 和 2,我会用这些 SQL 语句来做:

UPDATE Records1 SET UpdatedUserID = 1 WHERE UpdatedUserID = 2
UPDATE Records1 SET CreatedUserID = 1 WHERE CreatedUserID = 2
DELETE FROM Users WHERE UserID = 2

这只是一个示例子集,但实际上,有很多相关的记录表,我必须为其添加额外的 SQL-Update 语句。

我知道我可能在这里碰运气,但是也许有办法完成上述操作(批量更新所有相关表并删除“重复”记录),而不是手动更新每个外部字段和每个相关表。users 表基本上是链接到所有其他表的基表,因此为每个表创建单独的语句相当麻烦,所以如果有快捷方式可用,那就太好了。

标签: sql-servertsql

解决方案


由于识别重复帐户的过程将是手动的,因此(通常)将处理成对的帐户。(我假设 Inspector 不能在您的 UI 中勾选 15 个用户帐户作为重复项并提交整个批次进行处理。)

像下面这样的存储过程可能是一个好的开始:

create procedure MergeUsers
  @RetainedUserId Int, -- UserId   that is being kept.
  @VictimUserId Int -- UserId   that is to be removed.

as
  begin

  -- Validate the input.
  --   Optional, but you may want some reality checks.
  --   (Usernames are probably unique already, eh?)
  declare @UsernameMatch as Int, @FirstNameMatch as Int, @LastNameMatch as Int, @EmailMatch as Int;
  select
    @UsernameMatch = case when R.Username = V.Username then 1 else 0 end,
    @FirstNameMatch = case when R.FirstName = V.FirstName then 1 else 0 end,
    @LastNameMatch = case when R.LastName = V.LastName then 1 else 0 end,
    @EmailMatch = case when R.Email= V.Emailthen 1 else 0 end
    from Users as R inner join
      Users as V on V.UserId = @VictimUserId and R.UserId = @RetainedUserId;
  if @UsernameMatch + @FirstNameMatch + @LastNameMatch + @EmailMatch < 2
    begin
    -- The following message should be enhanced to provide a better clue as to which user
    --   accounts are being processed and what did or didn't match.
    RaIsError( 'MergeUsers: The two user accounts should have something in common.', 25, 42 );
    return;
    end;

  -- Update all of the related tables.
  --   Using a single pass through each table and updating all of the appropriate columns may improve performance.
  --   The   case   expression will only alter the values which reference the victim user account.
  update Records1
    set
      CreatedUserId = case when CreatedUserId  = @VictimId then @RetainedUserId else CreatedUserId end,
      UpdatedUserId = case when UpdatedUserId = @VictimId then @RetainedUserId else UpdatedUserId end
    where CreatedUserId = @VictimUserId or UpdatedUserId = @VictimUserId;

  update Records2
    set ...
    where ...;

  -- Houseclean   Users .
  delete from Users
    where UserId = @VictimUserId;

  end;

注意:剩下的练习是在 SP 中添加try/catch和事务,以确保合并是全有或全无操作。


推荐阅读