Find un-referenced variables in xml through XPath

Problem description

The following XML structure is an example of metadata of a given screen:

<page>
    <context>
        <variable name="used" type="String" />
        <variable name="unused" type="String" />
        <variable name="temp" type="Number" />
    </context>
    <actions>
        <assign>
             <from>"Test"</from>
             <to>used</to>
        </assign>
        <assign>
             <from>1</from>
             <to>temp</to>
        </assign>
    </actions>
 </page>

I'm looking for an XPath expression that can return me a list of variables that are un-referenced in the page. In this example, it is the unused variable.

Given that:

not(/page/actions//*/text() = 'unused') => Returns true (unreferenced)
not(/page/actions//*/text() = 'used') => Returns false

and

/page/context/variable[not(/page/actions//*/text() = @name)] => Return unused variable node
/page/context/variable[/page/actions//*/text() = @name] => Returns the used and temp variable node

This all works as long as the text exactly matches the name of the variable. However, since the text is an expression, it can contain more than just the variable name, and the name can appear anywhere in the string.

So I thought of using the contains(haystack, needle) to do the same as above.

Given that:

/page/actions//*[contains(text() , 'temp')] => Returns the <to> element whose text contains temp
not(/page/actions//*[contains(text() , 'unused')]) => Returns true

I thought that one of these would work:

/page/context/variable[not(/page/actions//*[contains(text(), @name)])]

I assume this fails because @name is evaluated in the scope of the element matched by //*, not in the scope of the variable node.

Nor does this one work:

/page/context/variable[not(contains(/page/actions//*/text(), @name))]

which returns all 3 variables.

Can anyone guide me as to:

  1. Why does it work with equality and not with the contains?
  2. What expression would return me the correct result?

Ideally, this is achieved by using version 1.0 of the XPath specification.

Tags: xpath

Solution


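The reason it works with equality is that your predicate is filtering the /page/context/variable elements, so @name refers to the attribute of that variable element, and = is a set-wise comparison: it tests whether any of the (many) /page/actions//*/text() node values is equal to the (single) @name.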

There are two problems with the XPath expressions you tried with contains():

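  1. Inside the predicate on the action elements, @name resolves against the element that predicate is currently filtering (an assign, from, or to element), none of which has a name attribute. It has no way of knowing that you mean the name attribute of the /page/context/variable element.
  2. There can be more than one text() node, and the first argument of contains() expects a single item. In XPath 1.0 a node-set passed there is converted to the string value of its first node in document order, so only that one node is tested, which is why all 3 variables are returned. In XPath 2.0 or later, passing more than one item is a type error.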
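
The second point can be illustrated outside XPath. The sketch below is a Python approximation, not an XPath engine: it assumes the XML from the question, uses only xml.etree.ElementTree, and the helper names walk and text_nodes are hypothetical. It mimics how XPath 1.0 turns a node-set into a string by taking the first node in document order, and contrasts that with the any-node semantics of =.

```python
import xml.etree.ElementTree as ET

XML = """<page>
    <context>
        <variable name="used" type="String" />
        <variable name="unused" type="String" />
        <variable name="temp" type="Number" />
    </context>
    <actions>
        <assign>
             <from>"Test"</from>
             <to>used</to>
        </assign>
        <assign>
             <from>1</from>
             <to>temp</to>
        </assign>
    </actions>
</page>"""


def walk(elem):
    """Yield the text-node children of elem and of its descendants, in document order."""
    if elem.text is not None:
        yield elem.text
    for child in elem:
        yield from walk(child)
        if child.tail is not None:
            yield child.tail  # text that follows child is a text child of elem


def text_nodes(actions):
    """Approximate /page/actions//*/text(): text nodes whose parent is a descendant of actions."""
    for child in actions:
        yield from walk(child)


root = ET.fromstring(XML)
nodes = list(text_nodes(root.find("actions")))
names = [v.get("name") for v in root.findall("context/variable")]

# '=' against a node-set: true if ANY node's string value equals the name.
equals_any = {n: any(t == n for t in nodes) for n in names}

# contains(node-set, name) in XPath 1.0: only the FIRST node is converted to a string.
first = nodes[0]
contains_first = {n: n in first for n in names}

print(equals_any)
print(contains_first)
```

With this input the first text node is the whitespace between <assign> and <from>, so the contains-style test is false for every name and not(...) keeps all three variables. A processor that strips whitespace-only text nodes would test "Test" instead, with the same outcome.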

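The following XPath binds @name to a variable and then tests whether any text() node under /page/actions contains the value of $name. Note that the let expression requires XPath 3.0 or later; in XPath 2.0 the same binding can be written as for $name in @name return not(...).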

/page/context/variable[
  let $name := @name 
  return not(/page/actions//*/text()[contains(., $name)]) 
]
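
The question asks for XPath 1.0, which has variable references but no way to bind a variable inside an expression. One possible workaround is to run the outer loop in the host language and apply the contains-style test to each text node there. The following is a minimal sketch under that assumption, using Python's standard xml.etree.ElementTree; the function name unreferenced_variables is hypothetical and the XML is the sample from the question.

```python
import xml.etree.ElementTree as ET

XML = """<page>
    <context>
        <variable name="used" type="String" />
        <variable name="unused" type="String" />
        <variable name="temp" type="Number" />
    </context>
    <actions>
        <assign>
             <from>"Test"</from>
             <to>used</to>
        </assign>
        <assign>
             <from>1</from>
             <to>temp</to>
        </assign>
    </actions>
</page>"""


def unreferenced_variables(root):
    """Return the names of <variable> elements not mentioned in any text under <actions>."""
    actions = root.find("actions")
    # itertext() yields every text chunk under <actions> in document order.
    # It also yields whitespace owned by <actions> itself, which is harmless here.
    texts = list(actions.itertext())
    result = []
    for var in root.findall("context/variable"):
        name = var.get("name")
        # Same test as contains(., $name), applied to each text node separately.
        if not any(name in t for t in texts):
            result.append(name)
    return result


root = ET.fromstring(XML)
print(unreferenced_variables(root))
```

Like contains(), this is a plain substring test, so a variable named use would count as referenced by the text used. If that matters, the expression text would need to be tokenized before matching.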
