r - Extracting a datetime column from XML in R returns the wrong value
Problem description
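I am trying to extract XML data in R with the code below, and I am new to the process. Every data point looks correct except the NEW_DATE column. For the Id = 1 row, NEW_DATE = 852163200000 rather than the 1997-01-02T00:00:00 shown in the raw XML below. It seems that when I parse the session response, NEW_DATE comes back as a character value that I cannot interpret. The only code I changed for this post is the proxy URL, which I replaced with # placeholders.
Any help is greatly appreciated!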
library(XML)
library(RCurl)
library(xml2)
library(httr)
library(rvest)
library(dplyr)
library(tidyverse)
#setup proxy
my_proxy = use_proxy(url="##.#.##.##:####")
#setup session and response
my_session = html_session("https://data.treasury.gov/feed.svc/DailyTreasuryYieldCurveRateData",my_proxy)
my_response = my_session$response
#check status
status_code(my_session)
status_code(my_response)
#retrieve content XML
content_parsed = content(my_session$response, as = "parsed")
#convert list to data frame
ust.df = data.frame(t(sapply(content_parsed$d,c)))
#<xs:datetime> data type is used to represent date and time in YYYY-MM-DDThh:mm:ss
#list column names
colnames(ust.df)
#remove X__metadata column
ust.df = ust.df %>%
select(-1)
#replace Date with "" in NEW_DATE column
ust.df$NEW_DATE = gsub("Date", "", paste(ust.df$NEW_DATE))
#replace (,),/ with "" in NEW_DATE column
ust.df$NEW_DATE =gsub("[[:punct:]]", "", ust.df$NEW_DATE)
#fix $NEW_DATE format -- 12 digits
#Id =1 NEW_DATE = 852163200000 instead of 1997-01-02T00:00:00 listed below
XML sample for reference (Id = 1)
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xml:base="http://data.treasury.gov/Feed.svc/">
<title type="text">DailyTreasuryYieldCurveRateData</title>
<id>http://data.treasury.gov/feed.svc/DailyTreasuryYieldCurveRateData</id>
<updated>2021-03-11T17:17:56Z</updated>
<link rel="self" title="DailyTreasuryYieldCurveRateData" href="DailyTreasuryYieldCurveRateData" />
<entry>
<id>http://data.treasury.gov/Feed.svc/DailyTreasuryYieldCurveRateData(1)</id>
<title type="text" />
<updated>2021-03-11T17:17:56Z</updated>
<author>
<name />
</author>
<link rel="edit" title="DailyTreasuryYieldCurveRateDatum" href="DailyTreasuryYieldCurveRateData(1)" />
<category term="TreasuryDataWarehouseModel.DailyTreasuryYieldCurveRateDatum" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
<content type="application/xml">
<m:properties>
<d:Id m:type="Edm.Int32">1</d:Id>
<d:NEW_DATE m:type="Edm.DateTime">1997-01-02T00:00:00</d:NEW_DATE>
<d:BC_1MONTH m:type="Edm.Double" m:null="true" />
<d:BC_2MONTH m:type="Edm.Double" m:null="true" />
<d:BC_3MONTH m:type="Edm.Double">5.190000057220459</d:BC_3MONTH>
<d:BC_6MONTH m:type="Edm.Double">5.3499999046325684</d:BC_6MONTH>
<d:BC_1YEAR m:type="Edm.Double">5.630000114440918</d:BC_1YEAR>
<d:BC_2YEAR m:type="Edm.Double">5.96999979019165</d:BC_2YEAR>
<d:BC_3YEAR m:type="Edm.Double">6.130000114440918</d:BC_3YEAR>
<d:BC_5YEAR m:type="Edm.Double">6.3000001907348633</d:BC_5YEAR>
<d:BC_7YEAR m:type="Edm.Double">6.4499998092651367</d:BC_7YEAR>
<d:BC_10YEAR m:type="Edm.Double">6.5399999618530273</d:BC_10YEAR>
<d:BC_20YEAR m:type="Edm.Double">6.8499999046325684</d:BC_20YEAR>
<d:BC_30YEAR m:type="Edm.Double">6.75</d:BC_30YEAR>
<d:BC_30YEARDISPLAY m:type="Edm.Double">0</d:BC_30YEARDISPLAY>
</m:properties>
</content>
</entry>
</feed>
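For comparison, here is a minimal sketch of reading NEW_DATE straight from the XML shown above with xml2, which keeps the 1997-01-02T00:00:00 text intact. The `atom_xml` string is a trimmed copy of the sample (only two properties kept), the variable names are illustrative, and the UTC time zone is an assumption, since the feed itself does not state one.

```r
library(xml2)

# Trimmed copy of the Id = 1 entry from the sample above (illustrative only)
atom_xml <- '<feed xmlns="http://www.w3.org/2005/Atom" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry>
    <content type="application/xml">
      <m:properties>
        <d:Id m:type="Edm.Int32">1</d:Id>
        <d:NEW_DATE m:type="Edm.DateTime">1997-01-02T00:00:00</d:NEW_DATE>
      </m:properties>
    </content>
  </entry>
</feed>'

doc <- read_xml(atom_xml)

# Collect the namespace prefixes declared in the document (d, m, and the default)
ns <- xml_ns(doc)

# Pull the text of every d:NEW_DATE element
new_date_chr <- xml_text(xml_find_all(doc, ".//d:NEW_DATE", ns))

# Assumption: treat the timestamp as UTC, because the feed gives no offset
new_date <- as.POSIXct(new_date_chr, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
```

With the real response, the same XPath would only apply if the server actually returned the Atom XML rather than JSON.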
Solution
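A likely explanation, assuming the service answered this request with JSON rather than XML: the question's code reads `content_parsed$d` and strips "Date" and punctuation from the column, which matches the OData JSON convention of writing an Edm.DateTime as `/Date(852163200000)/`. Under that assumption the remaining digits are milliseconds since 1970-01-01 00:00:00 UTC, so the value is not wrong, only encoded differently. The sketch below converts it back using base R; `raw_ms` and `new_date` are illustrative names, and UTC is assumed.

```r
# Value left in NEW_DATE after the gsub() clean-up in the question
raw_ms <- "852163200000"

# Assumption: the digits are milliseconds since the Unix epoch (UTC),
# so divide by 1000 to get seconds before converting
new_date <- as.POSIXct(as.numeric(raw_ms) / 1000,
                       origin = "1970-01-01", tz = "UTC")

# Render in the same layout as the XML sample
format(new_date, "%Y-%m-%dT%H:%M:%S")
# "1997-01-02T00:00:00"
```

The same conversion could be applied to the whole column, for example `ust.df$NEW_DATE = as.POSIXct(as.numeric(ust.df$NEW_DATE) / 1000, origin = "1970-01-01", tz = "UTC")`, provided every entry has already been reduced to plain digits.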