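python - Exact match of a substring in a string in Python
Problem description
I know this question is a common one, but my example below is a bit more complex than the title suggests.
Suppose I have the following "test.xml" file: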
<?xml version="1.0" encoding="UTF-8"?>
<test:xml xmlns:test="http://com/whatever/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <parent xsi:type="parentType">
    <child xsi:type="childtype">
      <grandchild>
        <greatgrandchildone>greatgrandchildone</greatgrandchildone>
        <greatgrandchildtwo>greatgrandchildtwo</greatgrandchildtwo>
      </grandchild><!--random comment -->
    </child>
    <child xsi:type="childtype">
      <greatgrandchildthree>greatgrandchildthree</greatgrandchildthree>
      <greatgrandchildfour>greatgrandchildfour</greatgrandchildfour><!--another random comment -->
    </child>
    <child xsi:type="childtype">
      <greatgrandchildthree>greatgrandchildthree</greatgrandchildthree>
      <greatgrandchildfour>greatgrandchildfour</greatgrandchildfour><!--third random comment -->
    </child>
  </parent>
</test:xml>
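In the program below, I do two main things:
- Find all nodes in the XML that carry a "type" attribute
- Loop over every node of the XML and work out whether it is a descendant of an element that carries a "type" attribute
Here is my code: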
from lxml import etree
import re

xmlDoc = etree.parse("test.xml")
root = xmlDoc.getroot()
nsmap = {
    'xsi': 'http://www.w3.org/2001/XMLSchema-instance'
}
nodesWithType = []


def check_type_in_path(nodesWithType, path, root):
    typesInPath = []
    elementType = ""
    for node in nodesWithType:
        print("checking node: ", node, " and path: ", path)
        if re.search(r"\b{}\b".format(
                node), path, re.IGNORECASE) is not None:
            element = root.find('.//{0}'.format(node))
            elementType = element.attrib.get(f"{{{nsmap['xsi']}}}type")
            if elementType is not None:
                print("found an element for this path. adding to list")
                typesInPath.append(elementType)
        else:
            print("element: ", node, " not found in path: ", path)
    print("path ", path, " has types: ", elementType)
    print("-------------------")
    return typesInPath


def get_all_node_types(xmlDoc):
    nodesWithType = []
    root = xmlDoc.getroot()
    for node in xmlDoc.iter():
        path = "/".join(xmlDoc.getpath(node).strip("/").split('/')[1:])
        if "COMMENT" not in path.upper():
            element = root.find('.//{0}'.format(path))
            elementType = element.attrib.get(f"{{{nsmap['xsi']}}}type")
            if elementType is not None:
                nodesWithType.append(path)
    return nodesWithType


nodesWithType = get_all_node_types(xmlDoc)
print("nodesWithType: ", nodesWithType)
for node in xmlDoc.xpath('//*'):
    path = "/".join(xmlDoc.getpath(node).strip("/").split('/')[1:])
    typesInPath = check_type_in_path(nodesWithType, path, root)
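The code should return all the types contained in a given path. For example, consider the path parent/child[3]/greatgrandchildfour. This path is a descendant (direct or distant) of two nodes that carry a "type" attribute: parent and parent/child[3]. I would therefore expect the list of types for that particular node (typesInPath) to contain both "parentType" and "childtype".
However, according to the printout below, the list for this node contains only the "parentType" type and not "childtype". The main point of this logic is to check whether the path of a typed node is contained in the path of the node in question (hence the exact string match). But this clearly does not work. I am not sure whether that is because the array-index notation (such as [3]) in the condition prevents it from matching, or whether there is some other cause.
For the example above, the printed output is: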
checking node: parent and path: parent/child[3]/greatgrandchildfour
found an element for this path. adding to list
checking node: parent/child[1] and path: parent/child[3]/greatgrandchildfour
element: parent/child[1] not found in path: parent/child[3]/greatgrandchildfour
checking node: parent/child[2] and path: parent/child[3]/greatgrandchildfour
element: parent/child[2] not found in path: parent/child[3]/greatgrandchildfour
checking node: parent/child[3] and path: parent/child[3]/greatgrandchildfour
element: parent/child[3] not found in path: parent/child[3]/greatgrandchildfour
path parent/child[3]/greatgrandchildfour has types: parentType
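The output above is consistent with how the re module reads the pattern: in an unescaped pattern, "[3]" is a character class rather than literal brackets, and a trailing \b needs a word character on one side. The following is a minimal sketch of that behaviour using only the two strings from the example. The variable names are illustrative and are not part of the original program.

```python
import re

node = "parent/child[3]"
path = "parent/child[3]/greatgrandchildfour"

# Unescaped, "[3]" is a character class that matches the single character "3",
# so the pattern searches for the text "parent/child3", which is not in path.
unescaped = re.search(r"\b{}\b".format(node), path, re.IGNORECASE)

# re.escape makes the brackets literal, but the trailing \b then falls between
# "]" and "/", two non-word characters, so there is no word boundary there.
escaped = re.search(r"\b{}\b".format(re.escape(node)), path, re.IGNORECASE)

# Anchoring at the start and requiring "/" or end-of-string after the node
# avoids \b altogether.
anchored = re.search(r"^{}(?:/|$)".format(re.escape(node)), path, re.IGNORECASE)

print(unescaped, escaped, anchored)
```

Only the third search finds a match. The plain "parent" entry matches in the original code because it contains no regex metacharacters and is followed by "/", which does form a word boundary after the "t".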
Solution
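One possible approach, sketched under assumptions rather than taken from an accepted answer, is to drop the regular expression and compare the paths segment by segment. A typed node is an ancestor (or the node itself) exactly when its "/"-separated segments are a leading run of the target path's segments. The helper names is_ancestor_or_self and types_in_path, and the typed_nodes dictionary mapping each typed path to its xsi:type value, are hypothetical. The original program looks the type up with root.find instead.

```python
def is_ancestor_or_self(node_path, path):
    """Return True if node_path equals path or is a leading run of its segments."""
    node_parts = node_path.split("/")
    path_parts = path.split("/")
    return path_parts[:len(node_parts)] == node_parts


def types_in_path(typed_nodes, path):
    """Collect the types of every typed node that lies on the given path."""
    return [node_type for node_path, node_type in typed_nodes.items()
            if is_ancestor_or_self(node_path, path)]


# Hypothetical input: typed paths from the example, mapped to their xsi:type.
typed_nodes = {
    "parent": "parentType",
    "parent/child[1]": "childtype",
    "parent/child[2]": "childtype",
    "parent/child[3]": "childtype",
}

print(types_in_path(typed_nodes, "parent/child[3]/greatgrandchildfour"))
# prints ['parentType', 'childtype']
```

Comparing whole segments also keeps parent/child[1] from being treated as a prefix of parent/child[10]. Unlike the original re.IGNORECASE search, this comparison is case-sensitive, which matches XML element names being case-sensitive.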