c# - Extract all data from a list
Question
I am using HTML Agility Pack to extract data. I want to extract all of the list items from this source:
<div id="feature-bullets" class="a-section a-spacing-medium a-spacing-top-small">
<ul class="a-unordered-list a-vertical a-spacing-mini">
<li><span class="a-list-item">
some data 1
</span></li>
<li><span class="a-list-item">
some data 2
</span></li>
<li><span class="a-list-item">
some data 3
</span></li>
<li><span class="a-list-item">
some data 4
</span></li>
</ul>
My code so far:
string source = someSource;
var htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(source);
How can I extract all of the list items to get a result similar to this:
List value 1 is: some data 1
List value 2 is: some data 2
List value 3 is: some data 3
List value 4 is: some data 4
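Solution
The source I am using is amazon.co.uk/dp/B07VD9F419. I am trying to extract the data in the bullet points.
Install the additional NuGet package Fizzler.Systems.HtmlAgilityPack to enable the QuerySelector and QuerySelectorAll extension methods. They take the same CSS selector syntax that querySelector uses in JavaScript.
Consider the following example.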
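using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack;

class Program
{
    // Reuse a single HttpClient for the lifetime of the application.
    private static readonly HttpClient client = new HttpClient();

    static async Task Main(string[] args)
    {
        // Download the product page as a string.
        string source = await client.GetStringAsync("https://www.amazon.co.uk/dp/B07VD9F419");
        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(source);
        // CSS selector: every span.a-list-item inside a list item under div#feature-bullets.
        IEnumerable<HtmlNode> nodes = htmlDoc.DocumentNode.QuerySelectorAll("div#feature-bullets ul li span.a-list-item");
        foreach (HtmlNode node in nodes)
        {
            // Print a separator line, then the trimmed text of the item.
            Console.WriteLine(new string('-', 20) + Environment.NewLine + node.InnerText.Trim());
        }
        Console.ReadKey();
    }
}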
Console output
--------------------
In addition to body weight, it also gives you a realistic picture of your health and fitness with 13 data points, such as body composition, muscle volume etc.
--------------------
High precision With a series of algori thms complexes and advanced bioelectric Impedance Analysis (BIA), provides accurate state dose.
--------------------
Weighs from 100g to 150kg so it can also weigh fruits and vegetables in addition to adults and children.
--------------------
Stores up to 16 profiles
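The example above prints a separator between items rather than the numbered format asked for in the question. The following is a minimal sketch of that numbered format, run against the sample markup from the question instead of the live page. It assumes the same two NuGet packages are installed. The class name ListItemDemo and the helper ExtractItems are hypothetical names chosen for illustration, the closing </div> is added to the sample, and the call to HtmlEntity.DeEntitize is an optional extra that decodes entities such as &amp; in the item text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;
using Fizzler.Systems.HtmlAgilityPack;

public class ListItemDemo
{
    // Sample markup copied from the question (closing </div> added here).
    public const string Source = @"<div id=""feature-bullets"" class=""a-section a-spacing-medium a-spacing-top-small"">
<ul class=""a-unordered-list a-vertical a-spacing-mini"">
<li><span class=""a-list-item"">
some data 1
</span></li>
<li><span class=""a-list-item"">
some data 2
</span></li>
<li><span class=""a-list-item"">
some data 3
</span></li>
<li><span class=""a-list-item"">
some data 4
</span></li>
</ul>
</div>";

    // Hypothetical helper: returns the trimmed text of every bullet item.
    public static List<string> ExtractItems(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode
            .QuerySelectorAll("div#feature-bullets ul li span.a-list-item")
            .Select(node => HtmlEntity.DeEntitize(node.InnerText).Trim())
            .ToList();
    }

    public static void Main()
    {
        List<string> items = ExtractItems(Source);
        for (int i = 0; i < items.Count; i++)
        {
            // The index is zero-based, so add 1 for display.
            Console.WriteLine($"List value {i + 1} is: {items[i]}");
        }
    }
}
```

With the sample markup this should produce lines in the "List value N is: ..." form requested in the question. Collecting the items into a List<string> first keeps the index available for numbering and lets the same helper be reused on HTML downloaded with HttpClient.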