Premature end of file - XWPFDocument to PdfConverter

Problem description

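Rather than appending the documents' contents to a CTBody, I used the XWPFDocument class to transfer all the data from the Word documents into an empty document, but I get the error below. It is raised when I convert the XWPFDocument to PDF.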

fr.opensagres.poi.xwpf.converter.core.XWPFConverterException: org.apache.xmlbeans.XmlException: error: Premature end of file.

        FileInputStream fis   = new FileInputStream("1.docx");
        FileInputStream fis1  = new FileInputStream("2.docx");

        XWPFDocument xdoc = new XWPFDocument(OPCPackage.open(fis));
        XWPFDocument xdoc1 = new XWPFDocument(OPCPackage.open(fis1));

        CTBody ct = xdoc.getDocument().getBody();
        CTBody ct1 = xdoc1.getDocument().getBody();

        XWPFDocument doc = new XWPFDocument();
        doc.createStyles();

        doc.getDocument().addNewBody().set(ct);
        doc.getDocument().addNewBody().set(ct1);

        FileOutputStream out = new FileOutputStream( new File("test.pdf"));
        PdfOptions opt = PdfOptions.create();
        PdfConverter.getInstance().convert(doc, out, opt);

        doc.write(out);
        doc.close();
        out.close();

Here is the stack trace.

  fr.opensagres.poi.xwpf.converter.core.XWPFConverterException: org.apache.xmlbeans.XmlException: error: Premature end of file.
at fr.opensagres.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:71)
at fr.opensagres.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:39)
at fr.opensagres.poi.xwpf.converter.core.AbstractXWPFConverter.convert(AbstractXWPFConverter.java:46)
at trafficMan.MainApp.mergeDocument(MainApp.java:513)
at trafficMan.MainApp$2.actionPerformed(MainApp.java:609)
at javax.swing.AbstractButton.fireActionPerformed(Unknown Source)
at javax.swing.AbstractButton$Handler.actionPerformed(Unknown Source)
at javax.swing.DefaultButtonModel.fireActionPerformed(Unknown Source)
at javax.swing.DefaultButtonModel.setPressed(Unknown Source)
at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(Unknown Source)
at java.awt.Component.processMouseEvent(Unknown Source)
at javax.swing.JComponent.processMouseEvent(Unknown Source)
at java.awt.Component.processEvent(Unknown Source)
at java.awt.Container.processEvent(Unknown Source)
at java.awt.Component.dispatchEventImpl(Unknown Source)
at java.awt.Container.dispatchEventImpl(Unknown Source)
at java.awt.Component.dispatchEvent(Unknown Source)
at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
at java.awt.Container.dispatchEventImpl(Unknown Source)
at java.awt.Window.dispatchEventImpl(Unknown Source)
at java.awt.Component.dispatchEvent(Unknown Source)
at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
at java.awt.EventQueue.access$500(Unknown Source)
at java.awt.EventQueue$3.run(Unknown Source)
at java.awt.EventQueue$3.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(Unknown Source)
at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(Unknown Source)
at java.awt.EventQueue$4.run(Unknown Source)
at java.awt.EventQueue$4.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(Unknown Source)
at java.awt.EventQueue.dispatchEvent(Unknown Source)
at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.run(Unknown Source)
Caused by: org.apache.xmlbeans.XmlException: error: Premature end of file.
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3448)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1272)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1259)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
at org.openxmlformats.schemas.wordprocessingml.x2006.main.StylesDocument$Factory.parse(Unknown Source)
at org.apache.poi.xwpf.usermodel.XWPFDocument.getStyle(XWPFDocument.java:557)
at fr.opensagres.poi.xwpf.converter.core.styles.XWPFStylesDocument.<init>(XWPFStylesDocument.java:196)
at fr.opensagres.poi.xwpf.converter.core.styles.XWPFStylesDocument.<init>(XWPFStylesDocument.java:190)
at fr.opensagres.poi.xwpf.converter.core.XWPFDocumentVisitor.createStylesDocument(XWPFDocumentVisitor.java:182)
at fr.opensagres.poi.xwpf.converter.core.XWPFDocumentVisitor.<init>(XWPFDocumentVisitor.java:175)
at fr.opensagres.poi.xwpf.converter.pdf.internal.PdfMapper.<init>(PdfMapper.java:155)
at fr.opensagres.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:56)
... 40 more
 Caused by: org.xml.sax.SAXParseException; systemId: file://; lineNumber: 1; columnNumber: 1; Premature end of file.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3422)
... 51 more

Tags: java, pdf, ms-word, apache-poi, pdf-conversion

Solution


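Even this code simply appends multiple document bodies. It first creates a new XWPFDocument, which already contains one CTBody. It then calls XWPFDocument.getDocument() to get the org.openxmlformats.schemas.wordprocessingml.x2006.main.CTDocument1 of that new XWPFDocument and adds two new CTBody elements to it. After that, the CTDocument1 has three CTBody elements.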

But according to Office Open XML, the type CT_Document can have only one CT_Body element.

The following XML Schema fragment defines the contents of the CT_Document element:

<complexType name="CT_Document">
 <complexContent>
  <extension base="CT_DocumentBase">
   <sequence>
    <element name="body" type="CT_Body" minOccurs="0" maxOccurs="1"/>
   </sequence>
  </extension>
 </complexContent>
</complexType>

As you can see, the body element of type CT_Body has maxOccurs="1": it may occur at most once.
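To make the maxOccurs="1" rule concrete, here is a minimal sketch that uses only the JDK's built-in schema validator, not Apache POI. It assumes a deliberately simplified schema: the CT_DocumentBase extension is dropped, namespaces are omitted, and CT_Body is reduced to a list of plain `p` elements. The class name `BodyCountCheck` and the method `isValid` are made up for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class BodyCountCheck {
    // Simplified stand-in for the real schema (assumption): no namespaces,
    // no CT_DocumentBase, and CT_Body holds only <p> elements.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:complexType name='CT_Body'><xs:sequence>"
        + "<xs:element name='p' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
        + "</xs:sequence></xs:complexType>"
        + "<xs:complexType name='CT_Document'><xs:sequence>"
        + "<xs:element name='body' type='CT_Body' minOccurs='0' maxOccurs='1'/>"
        + "</xs:sequence></xs:complexType>"
        + "<xs:element name='document' type='CT_Document'/>"
        + "</xs:schema>";

    private static Schema compileSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    // Returns true if the XML string validates against the simplified schema.
    public static boolean isValid(String xml) {
        try {
            compileSchema().newValidator()
                    .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema violation, e.g. a second <body>
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // One body: allowed.
        System.out.println(isValid("<document><body><p>a</p></body></document>"));
        // Three bodies, as produced by calling addNewBody() twice on a new document: rejected.
        System.out.println(isValid("<document><body/><body/><body/></document>"));
    }
}
```

With this simplified schema, the first call prints true and the second prints false. It only illustrates the cardinality rule; it does not reproduce how the converter or XMLBeans react to such a document.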

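Merging two Word documents is not simply a matter of concatenating the document bodies. All the elements in the bodies need to be merged into a single CTBody element. The other parts of each Word file (theme, styles, font table, comments, numbering, media, and so on) need to be merged too. I don't know of any free Java library, apart from OpenOffice/LibreOffice, that can do this properly.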
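The first of these steps, putting the content of both bodies into a single body element, can be sketched at the raw XML level with the JDK's DOM API. This is only an illustration under stated assumptions, not a working document merger: it operates on hand-written `w:document` strings rather than on real .docx packages, the class `BodyMergeSketch` and its methods are hypothetical, and it ignores styles, numbering, media and every other part mentioned above. It also assumes that the trailing `w:sectPr` of the target body should stay last and that the source's `w:sectPr` can be dropped, which loses the source's section settings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class BodyMergeSketch {
    public static final String W =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static Element body(Document doc) {
        return (Element) doc.getElementsByTagNameNS(W, "body").item(0);
    }

    private static boolean isSectPr(Node n) {
        return n.getNodeType() == Node.ELEMENT_NODE
                && W.equals(n.getNamespaceURI())
                && "sectPr".equals(n.getLocalName());
    }

    // Copies the children of source's body into target's single body,
    // inserting them before target's trailing sectPr (if there is one).
    public static void appendBodyChildren(Document target, Document source) {
        Element targetBody = body(target);
        Node targetSectPr = null;
        for (Node n = targetBody.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (isSectPr(n)) targetSectPr = n;
        }
        for (Node n = body(source).getFirstChild(); n != null; n = n.getNextSibling()) {
            if (isSectPr(n)) continue; // assumption: keep only the target's section settings
            // insertBefore with a null reference node appends at the end
            targetBody.insertBefore(target.importNode(n, true), targetSectPr);
        }
    }

    public static void main(String[] args) {
        String ns = " xmlns:w='" + W + "'";
        Document a = parse("<w:document" + ns + "><w:body><w:p/><w:sectPr/></w:body></w:document>");
        Document b = parse("<w:document" + ns + "><w:body><w:p/><w:p/><w:sectPr/></w:body></w:document>");
        appendBodyChildren(a, b);
        System.out.println("bodies: " + a.getElementsByTagNameNS(W, "body").getLength());
        System.out.println("children of body: " + body(a).getChildNodes().getLength());
    }
}
```

The result still has exactly one body, which is what the schema requires. Any style, numbering or image references inside the copied paragraphs would point at parts that exist only in the source file, which is why the remaining parts have to be merged as well.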

