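html - PowerShell - Parsing an ATOM RSS file
Problem description
[Sample RSS file][1]
I am trying to parse an RSS file from a website. The RSS file has rss.channel.item and so on. I am able to parse everything except the "description". It keeps returning HTML tags, and I would like to extract the text inside it, which says who it is from, the affected areas, and a description of the event. I want to format everything so that it displays the proper headings, information, and so on.
Any ideas on how to accomplish this?
Code: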
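cls
# Download the feed to a local file (URL and file name are elided in the original post).
Invoke-WebRequest -Uri "" -OutFile c:\""
[xml]$content = Get-Content c:\""
$feed = $content.rss.channel
foreach ($msg in $feed.item) {
    [PSCustomObject]@{
        'Date-Time'   = [datetime]$msg.pubDate
        # Reads the <link> element; the item's heading itself is in $msg.title.
        'Title'       = $msg.link.InnerText
        # Returns the raw CDATA payload, HTML tags included.
        'description' = $msg.description.InnerText
    }
}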
Sample RSS file:
<item>
<title>Account Management Planned Outage</title>
<link><![CDATA[https://*.service-now.*/sp?id=service_status&service=5569a0344ffe72487e415cd01310c72e]]></link>
<pubDate>06 Apr 2020 11:51:42 -0400</pubDate>
<guid isPermaLink="false">014451c11b0c54547746766dcc4bcb96</guid>
<description><![CDATA[<p><strong>People and Locations Impacted: </strong><br />All students, faculty, staff at all State locations<br /><br /><strong>IT Service(s) Impacted:</strong>
<br />Enterprise Directory services: ldap.*.edu, dirapps.*.*.edu, and ldap-prime.*.*.edu. No outages will occur, but services will be restarted.<br /><br /><strong>Date and Time:</strong><br />Services may be affected from 05:30ET until 06:59ET on Tuesday, 04/07/2020.<br />
<br /><strong>Technical Information:</strong><br />This alert will be updated as new information becomes available. State IT users can view additional details in ServiceNow.</p>]]></description>
Solution
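One possible approach, offered as a sketch rather than a definitive answer: the description comes back with tags because the CDATA section holds an HTML fragment, so the text has to be pulled out of that fragment after the XML is parsed. The example below assumes the description always follows the pattern shown in the sample (a `<strong>` heading followed by its text, separated by `<br />` tags). The function names `ConvertTo-PlainText` and `ConvertFrom-DescriptionHtml` are hypothetical helpers made up for this illustration, and the inline feed is a shortened copy of the sample wrapped in a minimal `rss`/`channel` element so the snippet runs on its own. Regular expressions are not a full HTML parser, so markup that departs from this pattern may need different handling.

```powershell
# Hypothetical helper: turn an HTML fragment into plain text, one line per <br> or </p>.
function ConvertTo-PlainText {
    param([string]$Html)

    # Treat <br>, <br/>, <br /> and closing </p> as line breaks.
    $text = $Html -replace '<br\s*/?>|</p>', "`n"
    # Drop every remaining tag.
    $text = $text -replace '<[^>]+>', ''
    # Decode entities such as &amp;.
    $text = [System.Net.WebUtility]::HtmlDecode($text)
    # Trim each line and discard the empty ones.
    $lines = $text -split '\r?\n' | ForEach-Object { $_.Trim() } | Where-Object { $_ }
    $lines -join "`n"
}

# Hypothetical helper: split the description into heading/text pairs,
# assuming each heading is wrapped in <strong>...</strong>.
function ConvertFrom-DescriptionHtml {
    param([string]$Html)

    $fields  = [ordered]@{}
    $pattern = '(?si)<strong>(?<name>.*?)</strong>(?<body>.*?)(?=<strong>|$)'
    foreach ($m in [regex]::Matches($Html, $pattern)) {
        # Heading text without the trailing colon.
        $name = (ConvertTo-PlainText $m.Groups['name'].Value).TrimEnd(':').Trim()
        if ($name) {
            $fields[$name] = ConvertTo-PlainText $m.Groups['body'].Value
        }
    }
    $fields
}

# Shortened copy of the sample item; the rss/channel wrapper is an assumption.
[xml]$content = @'
<rss><channel><item>
<title>Account Management Planned Outage</title>
<description><![CDATA[<p><strong>People and Locations Impacted: </strong><br />All students, faculty, staff at all State locations<br /><br /><strong>Date and Time:</strong><br />Services may be affected from 05:30ET until 06:59ET on Tuesday, 04/07/2020.</p>]]></description>
</item></channel></rss>
'@

foreach ($msg in $content.rss.channel.item) {
    # Start with the title, then add one property per heading found in the description.
    $row    = [ordered]@{ Title = $msg.title }
    $fields = ConvertFrom-DescriptionHtml $msg.description.InnerText
    foreach ($name in $fields.Keys) { $row[$name] = $fields[$name] }
    [PSCustomObject]$row
}
```

Each item then becomes an object whose property names are taken from the headings in the description, so it can be piped to `Format-List` or `Export-Csv` like any other object. If only the plain text is needed, calling `ConvertTo-PlainText $msg.description.InnerText` on its own should be enough.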