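python - Merging PDF pages with PyPDF2 fails
Problem description
Given these demo files:
test.pdf: "Hello"
tomerge1.pdf: "1"
tomerge2.pdf: "2"
In output.pdf, I would like to have:
- Page 1: page 1 of test.pdf merged with page 1 of tomerge1.pdf, i.e. "Hello 1"
- Page 2: page 1 of test.pdf merged with page 1 of tomerge2.pdf, i.e. "Hello 2"
This is what I used: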
from PyPDF2 import PdfFileWriter, PdfFileReader
outputpdf = PdfFileWriter()
inputpdf = PdfFileReader(open("test.pdf", "rb"))
tomerge1 = PdfFileReader(open("tomerge1.pdf", "rb"))
tomerge2 = PdfFileReader(open("tomerge2.pdf", "rb"))
page = inputpdf.getPage(0)
page.mergePage(tomerge1.getPage(0))
outputpdf.addPage(page)
# exit()
# if we stop here, the output is "Hello 1", which is good
# Why isn't "Hello 1" remembered here?
# del page # doesn't change anything
page = inputpdf.getPage(0)
page.mergePage(tomerge2.getPage(0))
outputpdf.addPage(page)
with open("output.pdf", "wb") as f:
    outputpdf.write(f)
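Unfortunately, it doesn't work: the output is not "Hello 1" / "Hello 2" but "Hello 2" / "Hello 2".
Question: how can I get the expected behavior? (Ideally without the file size growing quickly when there are 10 or 20 pages.)
Solution
While working on a similar exercise, I found that you need to read once and merge once: each page you get from a reader should be merged into only one time. The fix is to create two readers for the input file ("test.pdf"), one for each merge. Example code below: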
from PyPDF2 import PdfFileWriter, PdfFileReader

addressfile = open("Documents/addresses.pdf", "rb")
xwfile = "Downloads/input.pdf"
crosswordfile = open(xwfile, "rb")

# Two readers over the same file, so each merge gets its own page object
xword = PdfFileReader(crosswordfile)
xw2 = PdfFileReader(crosswordfile)
addr = PdfFileReader(addressfile)

xwpage = xword.getPage(0)
xp2 = xw2.getPage(0)
addpage1 = addr.getPage(1)  # page indices are zero-based
addpage2 = addr.getPage(2)

# Read once, merge once
xwpage.mergePage(addpage1)
xp2.mergePage(addpage2)

pdfWriter = PdfFileWriter()
pdfWriter.addPage(xwpage)
pdfWriter.addPage(xp2)
with open("/home/paula/xw.pdf", "wb") as res:
    pdfWriter.write(res)

crosswordfile.close()
addressfile.close()
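As far as I can tell, the reason this helps is that a single reader hands back the same page object every time you ask for page 0, and merging changes that object in place, so the writer ends up holding the same page twice. The toy model below is only an analogy in plain Python, not PyPDF2's actual code: ToyPage, ToyReader and build are made-up names, and the exact text it produces differs from what the question reports. The symptom it reproduces is the important one: with one shared reader both output pages come out identical, with two readers they do not.

```python
class ToyPage:
    """Stand-in for a PDF page; `text` plays the role of the page contents."""

    def __init__(self, text):
        self.text = text

    def merge(self, other):
        # Changes this page in place, the way an overlay merge would.
        self.text = f"{self.text} {other.text}"


class ToyReader:
    """Stand-in for a reader that builds its pages once and then reuses them."""

    def __init__(self, texts):
        self._pages = [ToyPage(t) for t in texts]

    def get_page(self, index):
        return self._pages[index]  # the same object on every call


def build(readers):
    """Merge overlay i into page 0 taken from readers[i]; return the output texts."""
    output = []
    for reader, overlay in zip(readers, ["1", "2"]):
        page = reader.get_page(0)
        page.merge(ToyPage(overlay))
        output.append(page)  # stores a reference, not a copy
    return [p.text for p in output]


shared = ToyReader(["Hello"])
one_reader = build([shared, shared])
two_readers = build([ToyReader(["Hello"]), ToyReader(["Hello"])])

print(one_reader)   # ['Hello 1 2', 'Hello 1 2']
print(two_readers)  # ['Hello 1', 'Hello 2']
```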
So in your code it would be:
from PyPDF2 import PdfFileWriter, PdfFileReader

testfile = open("test.pdf", "rb")
outputpdf = PdfFileWriter()
inputpdf1 = PdfFileReader(testfile)
inputpdf2 = PdfFileReader(testfile)
tomerge1 = PdfFileReader(open("tomerge1.pdf", "rb"))
tomerge2 = PdfFileReader(open("tomerge2.pdf", "rb"))
page1 = inputpdf1.getPage(0)
page1.mergePage(tomerge1.getPage(0))
outputpdf.addPage(page1)
# exit()
# No need to stop here, the output will have both "Hello 1" and "Hello 2"
# Using two readers for the same file fools PyPDF2 into thinking they
# are two different files, i.e. that we are merging from two separate sources
page2 = inputpdf2.getPage(0)
page2.mergePage(tomerge2.getPage(0))
outputpdf.addPage(page2)
with open("output.pdf", "wb") as f:
    outputpdf.write(f)
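For the 10 or 20 page case, the same idea can be put in a loop: one fresh reader per output page. This is only a sketch under some assumptions. merge_overlays and its reader_cls / writer_cls parameters are names I made up (the two parameters exist only so the reader and writer classes can be swapped out), and it relies on the same legacy PdfFileReader / PdfFileWriter API used above, which was removed in PyPDF2 3.0. I have not measured the output size; since every page comes from its own reader, shared resources may well be written more than once, so it may not satisfy the size concern in the question.

```python
def merge_overlays(base_path, overlay_paths, output_path,
                   reader_cls=None, writer_cls=None):
    """Write one output page per overlay: page 0 of base merged with page 0 of the overlay."""
    if reader_cls is None or writer_cls is None:
        # Legacy camelCase API (PyPDF2 before 3.0), same as in the code above.
        from PyPDF2 import PdfFileReader, PdfFileWriter
        reader_cls = reader_cls or PdfFileReader
        writer_cls = writer_cls or PdfFileWriter

    writer = writer_cls()
    open_files = []
    try:
        for overlay_path in overlay_paths:
            base_file = open(base_path, "rb")
            open_files.append(base_file)
            overlay_file = open(overlay_path, "rb")
            open_files.append(overlay_file)

            # A fresh reader per iteration gives a fresh page object,
            # so each page is read once and merged once.
            page = reader_cls(base_file).getPage(0)
            page.mergePage(reader_cls(overlay_file).getPage(0))
            writer.addPage(page)

        # The input files stay open until here because the pages are
        # still needed when the writer produces the output.
        with open(output_path, "wb") as out:
            writer.write(out)
    finally:
        for f in open_files:
            f.close()
```

With the demo files from the question, the call would be merge_overlays("test.pdf", ["tomerge1.pdf", "tomerge2.pdf"], "output.pdf").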