Python difflib: writing differences to a new file

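Problem description

I am trying to run a Python script that compares the contents of 2 files and merges them into a new file.

I have 2 database schemas to compare. For demonstration purposes, I have named them file1 and file2.

file1 looks like this.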

USE [sql-database-one]
GO
/****** Object:  Table [dbo].[PeopleInfo]    Script Date: 23/06/2021 17:36:21 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[PeopleInfo](
    [PersonId] [int] NOT NULL,
    [FirstName] [text] NOT NULL,
    [LastName] [text] NOT NULL,
    [Age] [int] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
/****** Object:  Index [PK__PeopleIn__AA2FFBE5FDC5E42D]    Script Date: 23/06/2021 17:36:21 ******/
ALTER TABLE [dbo].[PeopleInfo] ADD PRIMARY KEY CLUSTERED 
(
    [PersonId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

file2 looks like this.

USE [sql-database-two]
GO
/****** Object:  Table [dbo].[PeopleInfo]    Script Date: 23/06/2021 17:54:57 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[PeopleInfo](
    [PersonId] [int] NOT NULL,
    [FirstName] [text] NOT NULL,
    [LastName] [text] NOT NULL,
    [Age] [int] NULL,
    [SecondName] [text] NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
/****** Object:  Index [PK__PeopleIn__AA2FFBE59EF41C87]    Script Date: 23/06/2021 17:54:58 ******/
ALTER TABLE [dbo].[PeopleInfo] ADD PRIMARY KEY CLUSTERED 
(
    [PersonId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

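This is just a mock-up for testing purposes. Apart from the database name (and the auto-generated comment lines), the only difference between the two files is the table: file2 has an additional column named SecondName.

Using difflib I can iterate over the lines and print the differences, and I am happy with that result.

What I want to achieve now is to merge these files into a single file, with every difference placed at the correct position.

So far, my code looks like this.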

import difflib

# Print only the changed lines ("! " entries) reported by context_diff
with open("backup.sql") as file1, open("backup2.sql") as file2:
    diff = difflib.context_diff(file1.readlines(), file2.readlines())
    delta = ''.join(x[1:] for x in diff if x.startswith('! '))
    # delta = ''.join(diff)
    print(delta)

# Write backup.sql, then append every backup2.sql line that backup.sql lacks
with open('backup.sql') as files1, open('backup2.sql') as files2, \
        open('result.sql', 'w') as result:
    f1 = [i.strip() for i in files1.readlines()]
    f2 = [j.strip() for j in files2.readlines()]
    f1 += [item for item in f2 if item not in f1]
    for line in f1:
        result.write(line + '\n')

print('Diffs successfully written in result.sql')

I expected the new file to contain the new SecondName column right after Age, but what I get is:

USE [sql-database-one]
GO
/****** Object:  Table [dbo].[PeopleInfo]    Script Date: 23/06/2021 17:36:21 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[PeopleInfo](
[PersonId] [int] NOT NULL,
[FirstName] [text] NOT NULL,
[LastName] [text] NOT NULL,
[Age] [int] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
/****** Object:  Index [PK__PeopleIn__AA2FFBE5FDC5E42D]    Script Date: 23/06/2021 17:36:21 ******/
ALTER TABLE [dbo].[PeopleInfo] ADD PRIMARY KEY CLUSTERED
(
[PersonId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
USE [sql-database-two]
/****** Object:  Table [dbo].[PeopleInfo]    Script Date: 23/06/2021 17:54:57 ******/
[Age] [int] NULL,
[SecondName] [text] NOT NULL
/****** Object:  Index [PK__PeopleIn__AA2FFBE59EF41C87]    Script Date: 23/06/2021 17:54:58 ******/

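All the differences are appended at the bottom of the file. Can anyone help me, or point me in the right direction on how to write the differences at the correct lines?

Thank you very much.

Tags: python-3.x, difflib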

Solution
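
The appended lines come from the statement f1 += [item for item in f2 if item not in f1]: list concatenation can only place the missing lines after everything already in f1, so their original positions are lost. One possible approach, shown here as a minimal sketch rather than a definitive implementation, is to let difflib.SequenceMatcher align the two line lists and then walk its opcodes in order. The helper names merge_lines and merge_files are hypothetical. The conflict policy is also an assumption: where a block differs in both files, the version from the second file is kept, and lines that exist only in the first file are preserved.

```python
import difflib


def merge_lines(base, other):
    """Merge two lists of lines while keeping every line in document order.

    Assumed policy (not taken from the question): where both sides differ,
    the lines from `other` win; lines found only in `base` are kept.
    """
    matcher = difflib.SequenceMatcher(None, base, other, autojunk=False)
    merged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(base[i1:i2])    # identical on both sides
        elif tag == "delete":
            merged.extend(base[i1:i2])    # present only in base: keep it
        elif tag == "insert":
            merged.extend(other[j1:j2])   # present only in other: add in place
        else:  # tag == "replace"
            merged.extend(other[j1:j2])   # both differ: prefer other
    return merged


def merge_files(base_path, other_path, out_path):
    """Read both files, merge them line by line, and write the result."""
    with open(base_path) as f1, open(other_path) as f2:
        merged = merge_lines(f1.readlines(), f2.readlines())
    with open(out_path, "w") as out:
        out.writelines(merged)


# Example call with the file names used in the question:
# merge_files("backup.sql", "backup2.sql", "result.sql")
```

Because the lines are never stripped, the indentation of the schema is preserved. With this policy each differing block is resolved in favor of backup2.sql, so the SecondName line is expected to land directly after Age rather than at the end of the file. If both versions of a changed block should be kept instead, the replace branch could extend merged with base[i1:i2] before other[j1:j2].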
