itext - iText HTML to PDF: content overflows the document
Problem description
I am trying to convert this HTML, which has no CSS at all:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
</head>
<body>
<div id="184981f8-654a-4e90-a0f5-e75d1edaf2ca" class="act">
<div id="b995877a-0d3c-439f-984e-f9f809d124a5" class="footnotes">
<table>
<tbody>
<tr>
<td id="f29aca16-143d-1fc6-8f6a-d2aa116cde25">1</td>
<td>Ezechiel HAVRENNE is a lecturer at the University of Luxembourg on Investment Funds. Views expressed
in this article reflect some of the author’s experience to date on the subject matter. As the
Luxembourg investment fund market continues to develop these views may – and will most likely
continue to – evolve in one way or another. This article should in no way be construed as legal,
business or structuring advice rendered by the author or any other entity, nor should it be
construed as reflecting the views of such entity(ies)
</td>
</tr>
<tr>
<td id="434b1865-a5ea-1f96-b0fa-09ea9e4fb76a">2</td>
<td>The Preqin Quarterly Update: Private Debt, Q3 2020, 7 October 2020, page 12; <a
id="0e11d32d-c25b-65c1-8266-39da10bb62f3"
href="https://www.preqin.com/insights/research/quarterly-updates/preqin-quarterly-update-private-debt-q3-2020"
target="_blank" class="tech_external" rel="noopener">https://www.preqin.com/insights/research/quarterly-updates/preqin-quarterly-update-private-debt-q3-2020</a>
(accessed 15 March 2021). These figures drastically contrast with those reported by Lipper as of
October 2016, whereby “<em>the gross AuM of all funds that invest primarily in loan participations
was approximately USD 218 billon</em>” as mentioned in IOSCO’s final report; IOSCO
FR03/2017, ib., page 4
</td>
</tr>
<tr>
<td id="6bf035e5-d434-1eec-a550-58147bed84a0">3</td>
<td>According to EU recommendation 2003/361, 2 factors determine whether a business is an SME: (i) the
number of employees and (ii) either turnover or balance sheet total. A medium-sized company has up
to 250 employees, a turnover of up to €50 million or a balance sheet total of up to €43 million.
A small-sized company has up to 50 employees & a turnover or balance sheet total of up to €10
million. A micro-company has up to 10 employees & a turnover or balance sheet total of up to
€2 million
</td>
</tr>
<tr>
<td id="5028557e-4efe-1066-9fd4-28809a6d0653">4</td>
<td>For instance, one of the driving forces that has led European jurisdictions to consider permitting
funds to originate loans was the adoption of the EU regulation on European long-term investment
funds allowing funds the origination of loans under certain conditions. As a result, many
jurisdictions in Europe now allow loan originations by funds
</td>
</tr>
<tr>
<td id="cd0ac4df-9139-1c0a-9dd0-c15cca78845a">5</td>
<td>See IOSCO’s final report FR03/2017, <em>Findings of the Survey on Loan Funds</em>, February 2017,
page 4 <a id="76d9ff09-04f9-61a4-a311-2cfee0e19245"
href="https://www.iosco.org/library/pubdocs/pdf/IOSCOPD555.pdf" target="_blank"
class="tech_external" rel="noopener">https://www.iosco.org/library/pubdocs/pdf/IOSCOPD555.pdf</a>
(accessed 13 April 2021)
</td>
</tr>
<tr>
<td id="a0dd548b-cfa4-182c-9472-624a6be46538">6</td>
<td>See the Glossary of Summaries published on EUR-Lex, <a id="3052c250-b9c1-60f7-b36c-45ab06665101"
href="https://eur-lex.europa.eu/summary/glossary/sme.html"
target="_blank" class="tech_external"
rel="noopener">https://eur-lex.europa.eu/summary/glossary/sme.html</a>
(accessed 13 April 2021) as well as the European Commission’s page titled “<em>Access to finance
for SMEs</em>”,<a id="b8b721ff-fd48-67aa-aaac-e5b1d0d02b60"
href="https://ec.europa.eu/growth/access-to-finance_en" target="_blank"
class="tech_external" rel="noopener">
https://ec.europa.eu/growth/access-to-finance_en</a> (accessed 13 April 2021)
</td>
</tr>
<tr>
<td id="d98d8f00-f797-1b37-9540-36713cfdc8a7">7</td>
<td><em>Ib.</em></td>
</tr>
<tr>
<td id="3868e384-a464-1b26-933a-8ec3a95f86d5">8</td>
<td>For more information see <a id="dc357707-f043-68ce-a7bc-c9a5d9d86c7d"
href="https://ec.europa.eu/growth/smes/cosme_en" target="_blank"
class="tech_external" rel="noopener">https://ec.europa.eu/growth/smes/cosme_en</a>
(accessed 13 April 2021)
</td>
</tr>
<tr>
<td id="6766e322-fdf8-16b8-99e4-006e43fdecbd">9</td>
<td>See the European Commission’s page titled “COSME Financial Instruments”, <a
id="62cbd917-994d-6388-b0db-786a5c792685"
href="https://ec.europa.eu/growth/access-to-finance/cosme-financial-instruments_en"
target="_blank" class="tech_external" rel="noopener">https://ec.europa.eu/growth/access-to-finance/cosme-financial-instruments_en</a>
(accessed 13 April 2021)
</td>
</tr>
<tr>
<td id="11773190-b10f-1399-b71f-3a5fcfa5a5fc">10</td>
<td>Even if the eligibility for participation in the COSME LGF programme was extended to Loan
Origination funds it does not appear from the EIF’s register published as at 31 January 2021 that
any would have made the list. See<a id="cf5536ce-bff2-6220-9ed7-e4011b938b0e"
href="https://www.eif.org/what_we_do/guarantees/single_eu_debt_instrument/cosme-loan-facility-growth/cosme_lgf_signatures.pdf"
target="_blank" class="tech_external" rel="noopener">
https://www.eif.org/what_we_do/guarantees/single_eu_debt_instrument/cosme-loan-facility-growth/cosme_lgf_signatures.pdf</a>
(accessed 13 April 2021)
</td>
</tr>
<tr>
<td id="12b455e1-ceff-10b6-ba3d-df5b441fe989">11</td>
<td>Those associated countries include Iceland, Montenegro, Turkey, the Republic of North Macedonia,
Albania, Serbia, Bosnia and Herzegovina, and Kosovo
</td>
</tr>
<tr>
<td id="d8103a16-44fa-1096-8295-d478456b0117">12</td>
<td>Connor Hussey, Luxembourg private debt industry grows 36% from 2019, Private Funds CFO, 3 December
2020, <a id="0facc75b-6776-606c-b47d-e2025d559bf2"
href="https://www.privatefundscfo.com/luxembourg-private-debt-industry-grows-36-2-from-2019"
target="_blank" class="tech_external" rel="noopener">https://www.privatefundscfo.com/luxembourg-private-debt-industry-grows-36-2-from-2019</a>/
(accessed 13 April 2021). These figures should be in line with the then reality based on the 2017
final report of IOSCO whereby it stated that “<em>in Luxembourg, the net AuM of all domestic Loan
Funds (i.e., Funds with their primary activity engaged in lending and across various loan
activities, encompassing also activities such as microfinance, real estate debt or
infrastructure financing) is EUR 37.3 bn, constituting 1% of all domestic Funds</em>”, IOSCO
FR03/2017, ib., page 9
</td>
</tr>
<tr>
<td id="228c3276-de18-1393-9860-66ff5272b741">13</td>
<td>KPMG – ALFI Private Debt Fund Survey 2020, pages 4 and 5, <br><a
id="6d4a0dff-557a-603a-8b28-c47bd843b6b4"
href="https://assets.kpmg/content/dam/kpmg/lu/pdf/private-debt-fund-survey-2020.pdf"
target="_blank" class="tech_external" rel="noopener">https://assets.kpmg/content/dam/kpmg/lu/pdf/private-debt-fund-survey-2020.pdf?</a>utm_source=Sailthru&utm_medium=email&utm_campaign=Loan%20Note%203%20December%202020&utm_term=PDI_LONENOTE_SUBSCRIBER<br>
</td>
</tr>
</tbody>
</table>
</div>
</div>
</body>
</html>
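However, every time I run HtmlConverter.convertToPdf() with the HTML content passed as a string, the content is clipped and I get the following result:
However, when I remove the last tr element, I get the expected result:
What do you think causes this? Is it because the table element has too many children?
--- Question update ---
After reading @CptCave's comment, I tried the word-break CSS property, which should work in this case, and changed the HTML to this format: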
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<style>
.word-break{
word-break: break-all;
}
</style>
</head>
<body>
<div id="b995877a-0d3c-439f-984e-f9f809d124a5" class="footnotes">
<table class="word-break">
<tbody>
<tr>
<td id="7673aebd-bc37-198d-932f-987fb16fb503">94</td>
<td>See ESMA Consultation Paper Guidelines on transaction reporting, reference data, order record
keeping & clock synchronisation, 23 December 2015, ESMA/2015/1909, p. 49; <a
id="5326eab7-02a4-69ec-9069-2d0c8eb5f180"
href="https://www.esma.europa.eu/sites/default/files/library/2015-1909_guidelines_on_transaction_reporting_reference_data_order_record_keeping_and_clock_synchronisation.pdf"
target="_blank" class="tech_external" rel="noopener">https://www.esma.europa.eu/sites/default/files/library/2015-1909_guidelines_on_transaction_reporting_reference_data_order_record_keeping_and_clock_synchronisation.pdf</a>
(accessed on 13 April 2021)
</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>
But I got this result:
The solution was to add inline CSS:
<table style="word-wrap: break-word"/>
So, to finish, I changed the document structure with jsoup before the conversion:
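// Requires org.jsoup.Jsoup and org.jsoup.nodes.Document
Document document = Jsoup.parse(html);
// Add the inline style to every table (this overwrites any existing style attribute)
document.getElementsByTag("table").forEach(table ->
        table.attr("style", "word-wrap: break-word"));
// Serialize the modified document back to a string for the converter
String fixedHtml = document.html();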
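One caveat with the jsoup snippet above is that setting the style attribute replaces whatever inline style a table already had. The following is a minimal sketch of a merge helper that uses only the Java standard library; the class name InlineStyleMerger and the method mergeStyle are hypothetical names introduced for illustration.

```java
final class InlineStyleMerger {
    /**
     * Appends a CSS declaration to an existing inline style value
     * instead of discarding what is already there.
     */
    static String mergeStyle(String existingStyle, String declaration) {
        String existing = existingStyle == null ? "" : existingStyle.trim();
        if (existing.isEmpty()) {
            // Nothing to preserve: the new declaration becomes the whole style.
            return declaration;
        }
        // Make sure the existing declarations are terminated before appending.
        String separator = existing.endsWith(";") ? " " : "; ";
        return existing + separator + declaration;
    }

    public static void main(String[] args) {
        System.out.println(mergeStyle("", "word-wrap: break-word"));
        System.out.println(mergeStyle("width: 100%", "word-wrap: break-word"));
    }
}
```

With jsoup, this could be used as table.attr("style", InlineStyleMerger.mergeStyle(table.attr("style"), "word-wrap: break-word")), since jsoup returns an empty string for a missing attribute.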
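Solution
As far as I can tell, your problem is caused by missing word wrapping. Your last table row contains a very long unbreakable string: the link with the UTM tags. If you remove the UTM tags from it, the clipping no longer occurs.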
<tr>
<td id="228c3276-de18-1393-9860-66ff5272b741">13</td>
<td>KPMG – ALFI Private Debt Fund Survey 2020, pages 4 and 5, <br><a
id="6d4a0dff-557a-603a-8b28-c47bd843b6b4"
href="https://assets.kpmg/content/dam/kpmg/lu/pdf/private-debt-fund-survey-2020.pdf"
target="_blank" class="tech_external" rel="noopener">https://assets.kpmg/content/dam/kpmg/lu/pdf/private-debt-fund-survey-2020.pdf</a><br>
</td>
</tr>
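To locate such a string in a larger document, a rough diagnostic can help. The sketch below, using only the Java standard library, returns the longest whitespace-free run in the visible text of an HTML string. The class name UnbreakableRunFinder and the short list of inline tags are assumptions made for illustration, and the tag stripping is deliberately crude (it is not a real HTML parser and does not decode entities).

```java
import java.util.regex.Pattern;

final class UnbreakableRunFinder {
    // Inline tags do not interrupt a run of text, so they are removed without leaving a gap.
    private static final Pattern INLINE_TAGS =
            Pattern.compile("</?(?:a|em|strong|span|b|i)\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Every other tag (td, br, div, comments, ...) is treated as a break in the text.
    private static final Pattern OTHER_TAGS = Pattern.compile("<[^>]*>");

    /** Returns the longest whitespace-free run in the visible text of the given HTML. */
    static String longestRun(String html) {
        String text = INLINE_TAGS.matcher(html).replaceAll("");
        text = OTHER_TAGS.matcher(text).replaceAll(" ");
        String longest = "";
        for (String token : text.trim().split("\\s+")) {
            if (token.length() > longest.length()) {
                longest = token;
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        String html = "<td>See <a href=\"https://example.org/x\">"
                + "https://example.org/report.pdf?</a>utm_source=mail<br>end</td>";
        System.out.println(longestRun(html));
    }
}
```

Because the closing anchor tag is removed without a gap, the link text and the UTM parameters that follow it are reported as one run, which matches how they behave during layout.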
A more durable solution is to enable word wrapping with CSS by setting the overflow-wrap property to break-word.
There is a complete example in the iText KB: https://kb.itextpdf.com/home/it7kb/examples/pdfhtml-support-for-overflow-wrap-word-break-css-properties
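As a rough illustration of the CSS approach, the sketch below inserts a style rule into the HTML string before it is handed to the converter. It uses only the Java standard library. The class name WrapStyleInjector, the method injectWrapRule, and the td selector are assumptions chosen to match the table in the question, and the sketch assumes the pdfHTML version in use honors overflow-wrap as described in the KB example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WrapStyleInjector {
    // Hypothetical rule: the td selector is an assumption based on the question's table.
    static final String RULE = "<style>td { overflow-wrap: break-word; }</style>";

    private static final Pattern HEAD_END =
            Pattern.compile("</head>", Pattern.CASE_INSENSITIVE);

    /** Inserts the wrap rule just before the closing head tag, or prepends it if there is none. */
    static String injectWrapRule(String html) {
        Matcher matcher = HEAD_END.matcher(html);
        if (!matcher.find()) {
            // No head element found: put the rule in front of the markup.
            return RULE + html;
        }
        int headEnd = matcher.start();
        return html.substring(0, headEnd) + RULE + html.substring(headEnd);
    }

    public static void main(String[] args) {
        System.out.println(injectWrapRule("<html><head></head><body></body></html>"));
    }
}
```

The returned string can then be passed to HtmlConverter.convertToPdf() in place of the original HTML. If the rule has no effect in a given setup, the inline style shown in the question update remains the fallback.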