marklogic - Ingesting XML stored in an archive via MLCP

Problem description

When I import XML documents stored in an archive via MLCP with the following command:

mlcp import -mode local -host localhost -input_file_path "D:\xmlworkflow\test" -input_file_type archive -username admin -password admin -port 8000 -database Documents -input_file_pattern ".*/*.zip" -output_uri_prefix "/modules/"

I get the following error:

18/08/10 11:09:41 INFO contentpump.LocalJobRunner: Content type: XML 
18/08/10 11:09:41 INFO contentpump.FileAndDirectoryInputFormat: Total input paths to process : 2 
18/08/10 11:09:41 ERROR contentpump.LocalJobRunner: Error getting input splits: 
18/08/10 11:09:41 ERROR contentpump.LocalJobRunner: Not type information in Archive name

I am using MarkLogic 8.0-7.1.

Does anyone know what causes this error?

Tags: marklogic, marklogic-8, mlcp

Solution

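The input file type archive refers to an MLCP archive zip file, that is, one created by an MLCP archive export (-output_type archive). An ordinary zip of XML files is not in that format, which is why the import fails with the error above.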

I think you meant to use -input_compressed instead. Something like:

mlcp.bat import -mode local -host localhost -input_file_path "D:\xmlworkflow\test" -input_compressed -username xxx -password yyy -port 8000 -database Documents -input_file_pattern ".*/*.zip" -output_uri_prefix "/modules/"

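To selectively import specific files, I suggest using a transform: have it pass $content through if the file should be ingested, or return () (the empty sequence) if the file should be skipped.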

Documentation on MLCP transforms can be found here:

http://docs.marklogic.com/guide/mlcp/import#id_82518
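
As a rough illustration of the transform idea above, here is a minimal sketch of an XQuery transform module. The module namespace, the module name filter.xqy, and the rule "keep only URIs ending in .xml" are hypothetical choices made for this example and are not part of the original question; adapt the test to whatever distinguishes the files you want.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace chosen for this sketch. :)
module namespace filter = "http://example.com/mlcp/filter";

(: MLCP calls this function once per input document.
   $content is a map holding the document under the keys "uri" and "value".
   Returning $content ingests the document; returning () skips it. :)
declare function filter:transform(
  $content as map:map,
  $context as map:map
) as map:map*
{
  let $uri := map:get($content, "uri")
  return
    (: Assumed selection rule: keep only documents whose URI ends in .xml :)
    if (fn:ends-with($uri, ".xml"))
    then $content
    else ()
};
```

Assuming the module is installed in the modules database of the target app server as /filter.xqy, it would be referenced by adding -transform_module /filter.xqy -transform_namespace "http://example.com/mlcp/filter" to the import command. The function name transform is the MLCP default, so -transform_function should not be needed here.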
