docx4j: Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1

Problem description

We are getting the error below:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
    at org.docx4j.convert.in.xhtml.TableHelper.setupTblGrid(TableHelper.java:227)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.traverse(XHTMLImporterImpl.java:1021)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.traverse(XHTMLImporterImpl.java:1304)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.traverse(XHTMLImporterImpl.java:1284)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.traverse(XHTMLImporterImpl.java:1284)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.traverse(XHTMLImporterImpl.java:1284)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.traverse(XHTMLImporterImpl.java:1284)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.traverse(XHTMLImporterImpl.java:1284)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.traverse(XHTMLImporterImpl.java:1284)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.traverse(XHTMLImporterImpl.java:825)
    at org.docx4j.convert.in.xhtml.XHTMLImporterImpl.convert(XHTMLImporterImpl.java:698)
    at com.deloitte.abbvie.practice.Docx4jGenerateDocument.main(Docx4jGenerateDocument.java:51)

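This happens when we try to generate a Word document from the HTML below. As far as we can tell, the exception is thrown as soon as we add a nested table inside a <td></td>.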

test2.html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy for Java (vers. 2009-12-01), see jtidy.sourceforge.net" />
<title></title>
</head>
<body>
<table style="width: 100%; border-collapse: collapse; border: 0.5px solid black;">
<tbody>
<tr>
<td bgcolor="#E5DFEC" valign="top" width="20%" style="border-collapse: collapse; border: 0.5px solid black; border-bottom: 0.0px solid; border-right: 0.0px solid; border-left: 0.0px solid;">
<div style="color: #000066; font-size: 15px;">Test</div>
</td>
<td width="80%" style="border-collapse: collapse; border: 0.5px solid black; border-bottom: 0.1px solid; border-right: 0.0px solid;">
<h3 style="font-weight: bold; text-decoration: underline;"><u style="font-family: Calibri;">TEST 1:</u></h3>
<p><img src="Docs_UAT_ISSUE2//a2x03000000GyjIAAS_image_12117863437.png" /></p>
<h3 style="font-weight: bold; text-decoration: underline;"><u style="font-family: Calibri;">TEST 2:</u></h3>
<table align="left" class="ql-table-blob" border="1" style="width: 6.5in; margin-left: 6.75pt;" width="624">
<tbody>
<tr>
<td colspan="3" rowspan="1" valign="bottom" style="width: 6.5in;" width="624">
<p class="ListEnd" style="margin-left: 0in;"><b>ABC <a target="_blank">Tests</a></b></p>
</td>
</tr>
<tr>
<td colspan="1" rowspan="1" valign="bottom" style="width: 139.25pt;" width="186">
<p class="ListEnd" style="margin-left: 0in;"><b>PQR</b></p>
</td>
<td colspan="1" rowspan="1" valign="bottom" style="width: 175.5pt;" width="234">
<p class="ListEnd" style="margin-left: 0in;"><b>MNO</b></p>
</td>
<td colspan="1" rowspan="1" valign="bottom" style="width: 153.25pt;" width="204">
<p class="ListEnd" style="margin-left: 0in;"><b>XYZ</b></p>
</td>
</tr>
<tr style=";">
<td colspan="1" rowspan="1" valign="top" style="width: 139.25pt;" width="186">
<p class="ListEnd" style="margin-left: 0in;">Hematocrit<br />
 Hemoglobin<br />
 Red blood cell (RBC) count<br />
 White blood cell (WBC) count<br />
 Neutrophils<br />
 Bands<br />
 Lymphocytes<br />
 Monocytes<br />
 Basophils<br />
 Eosinophils<br />
 Platelet count (estimate not &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; acceptable)</p>
</td>
<td colspan="1" rowspan="3" valign="top" style="width: 175.5pt;" width="234">
<p class="ListEnd" style="margin-left: 0in;">Blood urea nitrogen (BUN) or Urea<br />
 dfgsdf<br />
dfgsdfggCreatinine dfgdfesg (Cockcroft-Gault calculation)<br />
 Tfgsdfgotal fgdfgsd<br />
 Dirfgsdgsdfgdect and indiredfgsdfgct bilirubin<br />
 Lactate njkkh (LDH)<br />
 jkljljl<br />
 Alanine hjkhjkhjkhjk (ljj;klj;kl/ALT)<br />
 Aspartate jkjkl (SGOT/AST)<br />
 Alkaline sdfsdgdf<br />
 Sofdgbsdfgdium<br />
 Pofgsfgtassium<br />
 Cagdrfgdflcium<br />
 Inorfgrdgdfganic phosphorus<br />
 Urifgdrhdfgc acid<br />
 fgdfgf protein<br />
 Glfghdhucose<br />
 Bicarfghgfdhbonate/CO2<br />
 fghdfhh</p>
</td>
<td colspan="1" rowspan="3" valign="top" style="width: 153.25pt;" width="204">
<p class="ListEnd" style="margin-left: 0in;">A</p>
<p class="ListEnd" style="margin-left: 0in;">B</p>
</td>
</tr>
<tr>
<td colspan="1" rowspan="1" valign="bottom" style="width: 139.25pt;" width="186">
<p class="ListEnd" style="margin-left: 0in;"><b>C</b></p>
</td>
</tr>
<tr>
<td colspan="1" rowspan="1" valign="top" style="width: 139.25pt;" width="186">
<p class="ListEnd" style="margin-left: 0in;">C<br />
 A<br />
 psdcsdfH<br />
 Protesdfsdain<br />
 Blodfsdod<br />
 Glucosdfsdgsdfse<br />
 Leukosdfafcyte edfsdfasdfsterase<br />
 Nitrdfdfsitdfgsddes<br />
 Bilirudfgsdfdgsdgfffgbin<br />
 Urobilidfsddnogen<br />
 Micrsdvsdfdrfgoscopic analdfgdfgfdgysis (as dfgdfgsdfneeded)</p>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</body>
</html>

This is the Java code we use to generate the docx file with docx4j.

package com.deloitte.abbvie.practice;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.math.BigInteger;

import org.docx4j.convert.in.xhtml.XHTMLImporterImpl;
import org.docx4j.jaxb.Context;
import org.docx4j.model.structure.PageDimensions;
import org.docx4j.model.structure.PageSizePaper;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.wml.Body;
import org.docx4j.wml.SectPr;
import org.docx4j.wml.SectPr.PgMar;

import com.deloitte.abbvie.common.DocumentUtil;

public class Docx4jGenerateDocument {
    static DocumentUtil documentUtil = new DocumentUtil();

    public static void main(String[] args) throws Exception {
//      String finalHTMLString = "<ul> <li>Coffee</li> <li>Tea</li> <li>Milk</li> <li>CHILD <ul> <li>Child 1</li> <li>Child 2</li> <li>Child 3</li> </ul> </li> </ul> <ol> <li>Coffee</li> <li>Tea</li> <li>Milk</li> <li>CHILD 1 <ol> <li>Child 1</li> <li>Child 2</li> <li>Child 3</li> <li>CHILD 2 <ol> <li>Child 1</li> <li>Child 2</li> <li>Child 3</li> </ol> </li> </ol> </li> </ol>";

//      OR
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(
                "C:\\Users\\naveekhan\\Documents\\Naveed_Project\\Projects_Workspace\\DocumentGenerationProtocolOpsManual-SP\\src\\main\\java\\com\\deloitte\\abbvie\\practice\\test2.html"))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append("\n");
            }
        }
        String finalHTMLString = sb.toString();
//==============================================================        

//      finalHTMLString = documentUtil.replaceCheckBoxs(finalHTMLString);
//      finalHTMLString = documentUtil.fixWhitespaceIssue(finalHTMLString);
//      finalHTMLString = documentUtil.cleanHTML(finalHTMLString);
        
//      System.out.println(finalHTMLString);

        String filePath = "C:\\Users\\naveekhan\\Documents\\Naveed_Project\\Projects_Workspace\\DocumentGenerationProtocolOpsManual-SP\\src\\main\\java\\com\\deloitte\\abbvie\\practice\\GenerateDocument-06-11-2020.docx";

        if (!finalHTMLString.isEmpty()) {
            // actual code to generate WORD document
            WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage(PageSizePaper.LETTER, false);
            XHTMLImporterImpl xHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
            wordMLPackage.getMainDocumentPart().getContent().addAll(xHTMLImporter.convert(finalHTMLString, null));
            File exportFile = new File(filePath);
            // setting page margin
            Body body = wordMLPackage.getMainDocumentPart().getJaxbElement().getBody();
            PageDimensions page = new PageDimensions();
            PgMar pgMar = page.getPgMar();
            pgMar.setLeft(BigInteger.valueOf(750));
            pgMar.setRight(BigInteger.valueOf(750));
            SectPr sectPr = Context.getWmlObjectFactory().createSectPr();
            body.setSectPr(sectPr);
            sectPr.setPgMar(pgMar);
            wordMLPackage.save(exportFile);
        }
    }

}

Can anyone, or the library's maintainer, suggest a solution to this problem?

Tags: docx4j

Solution


See https://github.com/plutext/docx4j-ImportXHTML/issues/29

According to that issue report, the workaround is to remove align="left" from the table.
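If the HTML is generated and cannot easily be edited by hand, the attribute could be stripped in code before the string is passed to the importer. The following is only a minimal sketch of that idea using a regular expression from the Java standard library; the class name TableAlignStripper and the method stripTableAlign are hypothetical helpers, not part of docx4j, and a regex like this assumes reasonably tidy markup (for example JTidy output, as in the question) rather than arbitrary HTML.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of docx4j: removes the align attribute
// from <table> opening tags so the importer never sees it.
public class TableAlignStripper {

    // Matches an align="..." (or align='...') attribute inside a <table ...> opening tag.
    // Group 1 captures everything in the tag before the attribute so it can be kept.
    private static final Pattern TABLE_ALIGN = Pattern.compile(
            "(<table\\b[^>]*?)\\s+align\\s*=\\s*(\"[^\"]*\"|'[^']*')",
            Pattern.CASE_INSENSITIVE);

    /** Returns the markup with the align attribute removed from each table opening tag. */
    public static String stripTableAlign(String html) {
        return TABLE_ALIGN.matcher(html).replaceAll("$1");
    }

    public static void main(String[] args) {
        String html = "<table align=\"left\" border=\"1\"><tr><td align=\"left\">x</td></tr></table>";
        // Only the table's own align attribute is dropped; the td's is left alone.
        System.out.println(stripTableAlign(html));
    }
}
```

In the question's code this might be applied as finalHTMLString = TableAlignStripper.stripTableAlign(finalHTMLString); just before the call to xHTMLImporter.convert. Attributes such as valign are not affected, because the pattern requires whitespace immediately before the word align.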

If you can't do that, fixing the problem is more involved.

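The first step towards a fix would be to migrate docx4j-ImportXHTML to the current version of https://github.com/flyingsaucerproject/flyingsaucer, to see whether the problem has gone away. I suspect it hasn't, in which case the next step would be to try to fix it in the Flying Saucer codebase.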

