MarkLogic: removing duplicate nodes/elements

Problem description

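I have thousands of documents that contain duplicate element nodes. How can I find and remove the duplicate title elements in these XML files?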

I tried fn:distinct-values(), but it causes performance problems.

For example, 01.xml:

<doc>
     <pdf>1</pdf>
     <title>Head First JavaScript</title>
     <title>Head First JavaScript</title>
</doc>

02.xml

<doc>
    <pdf>0</pdf>
    <title>Python: Programming Basics for Absolute Beginners </title>
    <title>Python: Programming Basics for Absolute Beginners </title>
</doc>

Expected result, 01.xml:

<doc>
     <pdf>1</pdf>
     <title>Head First JavaScript</title>

</doc>

02.xml

<doc>
    <pdf>0</pdf>
    <title>Python: Programming Basics for Absolute Beginners </title>

</doc>

Tags: marklogic

Solution


Hi, please try the code below.

    let $doc :=
      <doc>
        <title>Head First JavaScript</title>
        <title>Head First JavaScript</title>
        <title>hellao</title>
        <title>hello</title>
        <title>hello</title>
        <title>Python: Programming Basics for Absolute Beginners </title>
        <title>ahello</title>
        <title>Python: Programming Basics for Absolute Beginners </title>
      </doc>

    (: Keep a title only if no earlier sibling title has the same text.
       Comparing against preceding-sibling::title rather than node()
       avoids accidental matches with other elements such as pdf. :)
    for $data in $doc/title[not(. = preceding-sibling::title)]
    return $data
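
Note that the query above only returns the unique titles; it does not change any stored document. To actually remove the duplicates in the database, a minimal sketch could invert the predicate and pass each repeated title to xdmp:node-delete. This assumes the documents live in a collection named "books" (a hypothetical name, not from the question) and that doc is the root element, as in the samples:

```xquery
xquery version "1.0-ml";

(: Assumption: the affected documents are in a collection called "books".
   Replace this with however your documents are actually selected. :)
for $dup in fn:collection("books")/doc/title[. = preceding-sibling::title]
(: $dup is a title whose text already appeared in an earlier sibling title,
   so the first occurrence is kept and every later copy is deleted. :)
return xdmp:node-delete($dup)
```

Because the comparison only looks at siblings inside a single document, it avoids the cross-document work that fn:distinct-values() may have been doing. With thousands of documents it may still be worth running this in smaller batches rather than in one large transaction.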
