Using HtmlAgilityPack to get specific node information

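Problem description

My goal is to use C# to extract specific data from an HTML file. The file lists shops, including name, address, and contact information. Each shop's details are nested in different tags. I want to read each field separately. At the moment, the code gives me the whole block for each shop.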

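    // Requires: using System.Collections.Generic; using System.IO; using HtmlAgilityPack;
    var path = @"C:\Users\filePath";
    string[] files = Directory.GetFiles(path, "HtmlFile.html");
    HtmlDocument doc = new HtmlDocument();
    var titles = new List<Customer>();

    foreach (string file in files)
    {
        doc.Load(file);

        // SelectNodes returns null when nothing matches, so guard before iterating.
        var teasers = doc.DocumentNode.SelectNodes("//div[@class = 'teaser__content']");
        if (teasers == null) continue;

        foreach (HtmlAgilityPack.HtmlNode nodeName in teasers)
        {
            Customer customer = new Customer();

            // InnerText of the teaser div is the text of the whole block, not just the name.
            customer.name = nodeName.InnerText;
            titles.Add(customer);
        }
    }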

To get the address, I tried adding customer.adress = nodeName.SelectSingleNode("//span[@class ='dh-user-info__adress']").InnerText

This results in every shop getting the same address.

Here is a snippet of the HTML I am working with:

                <div class="teaser__content">
                                        <a href="https://www.dashandwerk.de/aachen/baecker-und-konditoren/printenbaeckerei-klein-e-k/"><div class="teaser__logo-responsive"><img src="./Bäcker- und Konditoren-Innung Regio Aachen _ DasHandwerk.de_files/1843_logo-800x557.jpg" alt="logo"></div></a>
                                    <div class="teaser__title">
                    <a href="https://www.dashandwerk.de/aachen/baecker-und-konditoren/printenbaeckerei-klein-e-k/">Printenbäckerei Klein e. K.</a>
                </div>
                <div class="teaser__description">
                    <div class="dh-user-info">

                                                        <div class="dh-user-info__item">
                                <i class="fa fa-map-marker" aria-hidden="true"></i>
                                <span class="dh-user-info__adress">Franzstr. 91</span>
                                <br>
                                <span class="dh-user-info__place">52064 Aachen</span>
                            </div>
                        
                                                        <div class="dh-user-info__item">
                                <i class="fa fa-phone" aria-hidden="true"></i>
                                <span class="dh-user-info__tel">Tel.: 0241 474350</span>
                                                                        <br>
                                    <span class="dh-user-info__fax">Fax.: 0241 4743522</span>
                                                                </div>
                        
                    </div>
                </div>

                <!-- START tags -->
                    <div class="teaser__tags">
                        <a href="https://www.dashandwerk.de/aachen/baecker-und-konditoren/" class="tag">Bäcker- und Konditoren-Innung Regio Aachen</a>                      </div>
                <!-- END tags -->
                                        <a href="https://www.dashandwerk.de/aachen/baecker-und-konditoren/printenbaeckerei-klein-e-k/"><div class="teaser__logo" style="background-image:url(https://www.dashandwerk.de/wp-content/uploads/business_uploads/1843_logo-200x139.jpg);"></div></a>
                                    <div class="teaser__ratings">
                                        </div>
            </div>

So my question is: how do I get the individual address, city, phone number, and so on for each shop?

Tags: c#, nodes, html-agility-pack, selectnodes

Solution
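
A likely cause of the repeated address: in XPath, an expression that starts with `//` is evaluated from the document root, even when `SelectSingleNode` is called on a child node. Every iteration therefore finds the first `dh-user-info__adress` span in the whole file. Prefixing the expression with a dot (`.//span[...]`) makes it relative to the current node. The following is a minimal sketch under some assumptions: the question does not show the `Customer` class, so the property names here are hypothetical, as are the `ShopParser` class and the `TextOrNull` helper. The HTML in `Main` is made-up placeholder data that only mirrors the class names from the snippet above.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical container type; the question's Customer class is not shown.
public class Customer
{
    public string Name { get; set; }
    public string Address { get; set; }
    public string Place { get; set; }
    public string Tel { get; set; }
    public string Fax { get; set; }
}

public static class ShopParser
{
    // Hypothetical helper: returns the trimmed, entity-decoded text of the
    // first node matching a relative XPath, or null when no node matches.
    private static string TextOrNull(HtmlNode context, string relativeXPath)
    {
        HtmlNode node = context.SelectSingleNode(relativeXPath);
        return node == null ? null : HtmlEntity.DeEntitize(node.InnerText).Trim();
    }

    public static List<Customer> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var customers = new List<Customer>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection teasers =
            doc.DocumentNode.SelectNodes("//div[@class='teaser__content']");
        if (teasers == null) return customers;

        foreach (HtmlNode teaser in teasers)
        {
            // The leading "." keeps each query inside the current teaser block.
            customers.Add(new Customer
            {
                Name    = TextOrNull(teaser, ".//div[@class='teaser__title']/a"),
                Address = TextOrNull(teaser, ".//span[@class='dh-user-info__adress']"),
                Place   = TextOrNull(teaser, ".//span[@class='dh-user-info__place']"),
                Tel     = TextOrNull(teaser, ".//span[@class='dh-user-info__tel']"),
                Fax     = TextOrNull(teaser, ".//span[@class='dh-user-info__fax']")
            });
        }
        return customers;
    }
}

public static class Program
{
    public static void Main()
    {
        // Placeholder markup with two blocks, to show that each shop
        // gets its own address instead of the first one in the document.
        string html =
            "<div class=\"teaser__content\">" +
            "<div class=\"teaser__title\"><a href=\"#\">Shop A</a></div>" +
            "<span class=\"dh-user-info__adress\">Street 1</span>" +
            "<span class=\"dh-user-info__place\">00001 Town A</span>" +
            "</div>" +
            "<div class=\"teaser__content\">" +
            "<div class=\"teaser__title\"><a href=\"#\">Shop B</a></div>" +
            "<span class=\"dh-user-info__adress\">Street 2</span>" +
            "<span class=\"dh-user-info__place\">00002 Town B</span>" +
            "</div>";

        foreach (Customer c in ShopParser.Parse(html))
        {
            Console.WriteLine($"{c.Name} | {c.Address} | {c.Place} | {c.Tel} | {c.Fax}");
        }
    }
}
```

To apply this to the file from the question, the file contents could be passed in with `ShopParser.Parse(File.ReadAllText(file))` inside the existing loop over `files`. Fields that a shop does not list (for example a missing fax number) come back as null rather than throwing, which is why the lookup goes through a helper instead of calling `.InnerText` directly on the result of `SelectSingleNode`.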

