python - How to add XML tags based on a specific condition using Python
Problem description
Sample XML file
<ArticleSet>
  <Article>
    <ForeName>a</ForeName>
    <LastName>b</LastName>
    <Affiliation>harvard university of science. abc@gmail.com</Affiliation>
    <Keywords>-</Keywords>
  </Article>
  <Article>
    <ForeName>a</ForeName>
    <LastName>b</LastName>
    <Affiliation>-</Affiliation>
    <Keywords>-</Keywords>
  </Article>
  <Article>
    <ForeName>a</ForeName>
    <LastName>b</LastName>
    <Affiliation>harvard university of science. ghi@yahoo.co.in</Affiliation>
    <Keywords>-</Keywords>
  </Article>
</ArticleSet>
Sample code
from xml.etree import ElementTree as etree
import re
root = etree.parse("sampleinput.xml").getroot()
for article in root.iter("Affiliation"):
    if(article.text != "-"):
        email = re.search(r'[\w\.-]+@[\w\.-]+', article.text)
        c = etree.Element("<Email>")
        c.text = email.group(0)
        etree.write(article,c)
Desired output: the updated XML file
<?xml version="1.0"?>
<ArticleSet>
  <Article>
    <ForeName>a</ForeName>
    <LastName>b</LastName>
    <Affiliation>harvard university of science. abc@gmail.com</Affiliation>
    <Keywords>-</Keywords>
    <Email>abc@gmail.com</Email>
  </Article>
  <Article>
    <ForeName>a</ForeName>
    <LastName>b</LastName>
    <Affiliation>-</Affiliation>
    <Keywords>-</Keywords>
    <Email>-</Email>
  </Article>
  <Article>
    <ForeName>a</ForeName>
    <LastName>b</LastName>
    <Affiliation>harvard university of science. ghi@yahoo.co.in</Affiliation>
    <Keywords>-</Keywords>
    <Email>ghi@yahoo.co.in</Email>
  </Article>
</ArticleSet>
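I want to extract the email address from the <Affiliation> tag, create a new tag named <Email>, and store the extracted email in it. If <Affiliation> equals -, then <Email>-</Email> should be stored in that article instead.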
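The extraction rule described above can be tried on its own before touching any XML. The following is a minimal sketch; the helper name `extract_email` and the constant `EMAIL_PATTERN` are hypothetical, and the pattern is the one from the question with the redundant escape inside the character class dropped.

```python
import re

# Same pattern as in the question; "." needs no escape inside a character class.
EMAIL_PATTERN = re.compile(r"[\w.-]+@[\w.-]+")

def extract_email(text):
    """Return the first email-like substring of text, or '-' if there is none."""
    match = EMAIL_PATTERN.search(text or "")
    return match.group(0) if match else "-"

print(extract_email("harvard university of science. abc@gmail.com"))  # abc@gmail.com
print(extract_email("-"))                                             # -
```

Note that this loose pattern would also capture a trailing period if one directly followed the address, so a stricter pattern may be needed for real data.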
Error
Traceback (most recent call last):
  File "C:/Users/Ghost Rider/Documents/Python/addingTagsToXML.py", line 11, in <module>
    etree.write(article,c)
AttributeError: module 'xml.etree.ElementTree' has no attribute 'write'
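The error message suggests the cause: `write` is a method of `ElementTree` instances, not a function of the `xml.etree.ElementTree` module. The short check below is only a sketch to illustrate that difference; it uses a tiny inline document and an in-memory buffer (both assumptions made for the example) rather than the file from the question.

```python
import io
import xml.etree.ElementTree as ET

# The module itself exposes no write() function...
module_has_write = hasattr(ET, "write")

# ...but an ElementTree instance wrapping a root element does.
tree = ET.ElementTree(ET.fromstring("<ArticleSet><Article/></ArticleSet>"))
tree_has_write = hasattr(tree, "write")

# Serialize to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
tree.write(buffer)
serialized = buffer.getvalue().decode("utf-8")

print(module_has_write, tree_has_write)
print(serialized)
```

Separately, the tag name passed to `Element` should be `"Email"` rather than `"<Email>"`; the angle brackets are added by the serializer.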
Solution
You can try this:
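import re
import xml.etree.ElementTree as ET

tree = ET.parse('filename.xml')
root = tree.getroot()
for article in root.findall('Article'):
    # Create the new <Email> element for this article
    child = ET.Element('Email')
    # Look up <Affiliation> by name rather than by position
    affiliation = article.find('Affiliation')
    text = affiliation.text if affiliation is not None and affiliation.text else '-'
    match = re.search(r'[\w.-]+@[\w.-]+', text)
    # Fall back to '-' when the affiliation is '-' or contains no email
    child.text = match.group() if match else '-'
    # Append <Email> as the last child of <Article>
    article.append(child)
# write() is a method of the ElementTree object, not of the module
tree.write('filename.xml')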
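To try the approach without reading or overwriting a file, the same idea can be run on an in-memory string. This is a sketch under a few assumptions: `SAMPLE` is a shortened copy of the question's input, `add_email_tags` is a hypothetical helper name, and `ET.SubElement` is used as a shorthand for creating and appending the child in one step.

```python
import re
import xml.etree.ElementTree as ET

# Shortened copy of the sample input from the question (two articles only).
SAMPLE = """<ArticleSet>
<Article><ForeName>a</ForeName><LastName>b</LastName><Affiliation>harvard university of science. abc@gmail.com</Affiliation><Keywords>-</Keywords></Article>
<Article><ForeName>a</ForeName><LastName>b</LastName><Affiliation>-</Affiliation><Keywords>-</Keywords></Article>
</ArticleSet>"""

def add_email_tags(root):
    """Append an <Email> child to every <Article> directly under root (in place)."""
    # findall() returns a list, so appending children while looping is safe.
    for article in root.findall("Article"):
        text = article.findtext("Affiliation", default="-") or "-"
        match = re.search(r"[\w.-]+@[\w.-]+", text)
        # SubElement creates the element and appends it to the article.
        ET.SubElement(article, "Email").text = match.group(0) if match else "-"
    return root

root = add_email_tags(ET.fromstring(SAMPLE))
emails = [article.findtext("Email") for article in root.findall("Article")]
print(emails)  # ['abc@gmail.com', '-']
```

To persist the result, the root can be wrapped with `ET.ElementTree(root)` and written out with its `write` method.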