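r - Cannot extract nodes from XML: xml_find_all does not work as expected
Problem description
My question may be fairly simple, but I'm having trouble working with XML. I have a list of metabolites and a database where I can find information about them in XML format. I'm trying to build a synonym table so that I can translate the metabolite names I have into names better suited to downstream analysis. Below is a simple piece of code in which I try to access the synonyms node, but for some reason it doesn't work. The same approach worked when I tried it on another XML file. Also, any tips on how to build this table would be greatly appreciated.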
library(xml2)
metabolites <- read_xml('<?xml version="1.0" encoding="UTF-8"?>
<hmdb xmlns="http://www.hmdb.ca">
<metabolite>
<version>4.0</version>
<creation_date>2005-11-16 15:48:42 UTC</creation_date>
<update_date>2019-01-11 19:13:56 UTC</update_date>
<accession>HMDB0000001</accession>
<status>quantified</status>
<secondary_accessions>
<accession>HMDB00001</accession>
<accession>HMDB0004935</accession>
</secondary_accessions>
<name>1-Methylhistidine</name>
<cs_description>1-Methylhistidine, also known as 1-mhis...</cs_description>
<description>One-methylhistidine (1-MHis) is derived ...</description>
<synonyms>
<synonym>(2S)-2-amino-3-(1-Methyl-1H-imidazol-4-yl)propanoic acid</synonym>
<synonym>1-Methylhistidine</synonym>
<synonym>Pi-methylhistidine</synonym>
<synonym>(2S)-2-amino-3-(1-Methyl-1H-imidazol-4-yl)propanoate</synonym>
<synonym>1 Methylhistidine</synonym>
</synonyms>
<chemical_formula>C7H11N3O2</chemical_formula>
<average_molecular_weight>169.1811</average_molecular_weight>
</metabolite>
</hmdb>')
syn <- xml_find_all(metabolites, "//synonyms")
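Thanks!
Solution
It has to do with the namespace declaration. The root element declares a default namespace (xmlns="http://www.hmdb.ca"), so every element in the document belongs to it, and an unprefixed XPath name such as //synonyms matches nothing. xml2 exposes the default namespace under the prefix d1, which has to be used in the XPath expression. See the discussion here: https://github.com/r-lib/xml2/issues/222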
library(xml2)
metabolites <- read_xml('<hmdb xmlns="http://www.hmdb.ca">
<metabolite>
<version>4.0</version>
<creation_date>2005-11-16 15:48:42 UTC</creation_date>
<update_date>2019-01-11 19:13:56 UTC</update_date>
<accession>HMDB0000001</accession>
<status>quantified</status>
<secondary_accessions>
<accession>HMDB00001</accession>
<accession>HMDB0004935</accession>
</secondary_accessions>
<name>1-Methylhistidine</name>
<cs_description>1-Methylhistidine, also known as 1-mhis...</cs_description>
<description>One-methylhistidine (1-MHis) is derived ...</description>
<synonyms>
<synonym>(2S)-2-amino-3-(1-Methyl-1H-imidazol-4-yl)propanoic acid</synonym>
<synonym>1-Methylhistidine</synonym>
<synonym>Pi-methylhistidine</synonym>
<synonym>(2S)-2-amino-3-(1-Methyl-1H-imidazol-4-yl)propanoate</synonym>
<synonym>1 Methylhistidine</synonym>
</synonyms>
<chemical_formula>C7H11N3O2</chemical_formula>
<average_molecular_weight>169.1811</average_molecular_weight>
</metabolite>
</hmdb>')
# the default namespace is registered under the prefix d1
xml_ns(metabolites)
#> d1 <-> http://www.hmdb.ca
# doesn't work: an unprefixed name only matches elements in no namespace
xml_find_all(metabolites, "//synonyms")
#> {xml_nodeset (0)}
# works: the d1 prefix selects elements in the HMDB namespace
xml_find_all(metabolites, "//d1:synonyms")
#> {xml_nodeset (1)}
#> [1] <synonyms>\n <synonym>(2S)-2-amino-3-(1-Methyl-1H-imidazol-4-yl)pro ...
Created on 2019-11-09 by the reprex package (v0.3.0)
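The question also asked for tips on building the synonym table. The following is only a minimal sketch of one possible approach, not part of the original answer: it assumes one row per (metabolite, synonym) pair is the desired layout, uses a shortened stand-in for the HMDB document, and the helper name build_synonym_table is hypothetical.

```r
library(xml2)

# Shortened stand-in for the HMDB file, with the same default namespace
# as the document in the question.
metabolites <- read_xml('<hmdb xmlns="http://www.hmdb.ca">
  <metabolite>
    <accession>HMDB0000001</accession>
    <secondary_accessions>
      <accession>HMDB00001</accession>
    </secondary_accessions>
    <name>1-Methylhistidine</name>
    <synonyms>
      <synonym>Pi-methylhistidine</synonym>
      <synonym>1 Methylhistidine</synonym>
    </synonyms>
  </metabolite>
</hmdb>')

# Hypothetical helper (not an xml2 function): returns a data frame with
# one row per (metabolite, synonym) pair.
build_synonym_table <- function(doc) {
  ns <- xml_ns(doc)  # the default namespace appears here as prefix d1
  mets <- xml_find_all(doc, "//d1:metabolite", ns)
  rows <- lapply(mets, function(m) {
    # Relative paths keep the search inside the current metabolite.
    syn <- xml_text(xml_find_all(m, "./d1:synonyms/d1:synonym", ns))
    # "./d1:accession" is a direct child only, so the entries under
    # secondary_accessions are not picked up.
    acc <- xml_text(xml_find_first(m, "./d1:accession", ns))
    nm <- xml_text(xml_find_first(m, "./d1:name", ns))
    data.frame(
      accession = rep(acc, length(syn)),
      name = rep(nm, length(syn)),
      synonym = syn,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

synonym_table <- build_synonym_table(metabolites)
print(synonym_table)
```

The resulting table can then be matched against your own metabolite names, for example with merge() on the synonym column. If typing the d1: prefix everywhere is inconvenient, xml2 also provides xml_ns_strip(), which removes the default namespace from the document so that unprefixed XPath names match; whether that is acceptable depends on whether you need the namespace later.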