How do I split my data into separate lists?

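Problem description

I need to get some data from a URL, but each div contains three elements. I need to split them apart and put each one into its own list. The HTML at the URL looks like this: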

            <div class="nesne row nobetciDiv ">
              <div class="col-md-4 tablo yuksek">
                  <div class="hucre hucre-ortala">
                        <a href="tel:0242 237 67 22">ARAT ECZANESİ</a>  // I need to put this in the first list.
                          <br>
                         <a href="tel:0242 237 67 22">0(242) 237-67-22</a> // I need to put this in the second list.
                     </div>
                </div>
            
            
          <div class="col-md-8 tablo yuksek">
             <div class="hucre hucre-ortala">
                 <a href="https://maps.google.com/maps?q=36.8905816274,30.6800764847" class="nadres" target="_blank">
    <img src="/Resim/Upload/mapi.png" class="mapi">
 K.KARABEKIR CD.EGITIM  ARASTIRMA HASTANESI ACIL KARSISI </a> // I need to put this in the third list.
                 </div>
            </div> 
          </div>

I tried this:

List<string> pharmacyName = new List<string>();
            List<string> pharmacyAdress = new List<string>();
            List<string> pharmacyNumber = new List<string>();
            Uri url = new Uri("https://www.antalyaeo.org.tr/tr/nobetci-eczaneler");

            WebClient client = new WebClient();
            client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
            string html = client.DownloadString(url);

            HtmlAgilityPack.HtmlDocument document = new HtmlAgilityPack.HtmlDocument();
            document.LoadHtml(html);
            var xpath = "//text()[not(normalize-space())]";
            var emptyNodes = document.DocumentNode.SelectNodes(xpath);

             
            foreach (HtmlNode emptyNode in emptyNodes)
            {
                emptyNode.ParentNode
                         .ReplaceChild(HtmlTextNode.CreateNode(""), emptyNode);
            }
             

            HtmlNodeCollection title = document.DocumentNode.SelectNodes("//div[contains(@class,'nobetciDiv')]");

            foreach (var item in title)
            {
                HtmlNode x = item.SelectSingleNode("//div[contains(@class,'col-md-4')]");
                HtmlNode a = x.SelectSingleNode("//a");
                pharmacyName.Add(a.InnerText); 
// This gives me something like: ARAT ECZANESİ0(242) 237-67-22. I can't separate the two values, and I can't get the third one.
            }

Sorry for my poor English; I mostly used code to describe the problem. Thanks for any help!

Tags: c#, html, asp.net, asp.net-core, html-agility-pack

Solution


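First, <div class="col-md-4 tablo yuksek"></div> contains two <a></a> elements, so a single SelectSingleNode call cannot return both of them. Second, the third <a></a> is inside <div class="col-md-8 tablo yuksek"></div>, not <div class="col-md-4 tablo yuksek"></div>. Note also that an XPath expression beginning with // searches from the document root even when it is called on a child node; start it with .// to search only inside the current node. Try changing: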

foreach (var item in title)
            {
                HtmlNode x = item.SelectSingleNode("//div[contains(@class,'col-md-4')]");
                HtmlNode a = x.SelectSingleNode("//a");
                pharmacyName.Add(a.InnerText); 
// This gives me something like: ARAT ECZANESİ0(242) 237-67-22. I can't separate the two values, and I can't get the third one.
            }

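to:

foreach (var item in title)
        {
            // The leading "." keeps each XPath relative to the current item;
            // a bare "//" would search the whole document again on every iteration.
            HtmlNodeCollection links = item.SelectNodes(".//div[contains(@class,'col-md-4')]//a[@href]");
            pharmacyName.Add(links[0].InnerText.Trim());    // first link: pharmacy name
            pharmacyNumber.Add(links[1].InnerText.Trim());  // second link: phone number
            // The address link lives in the col-md-8 column.
            HtmlNode address = item.SelectSingleNode(".//div[contains(@class,'col-md-8')]//a[contains(@class,'nadres')]");
            pharmacyAdress.Add(address.InnerText.Replace("\r\n", "").Trim());
        }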
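
For reference, here is a minimal self-contained sketch of the same idea that runs against an inline HTML string instead of the live page. The `PharmacyParser` class, its `Parse` method, and the two-row `SampleHtml` are hypothetical and only shaped like the markup in the question; the sketch assumes the HtmlAgilityPack NuGet package is installed. It also guards against `SelectNodes` returning null, which it does when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class PharmacyParser
{
    // Hypothetical sample: two rows shaped like the markup in the question.
    public const string SampleHtml = @"
<div class=""nesne row nobetciDiv"">
  <div class=""col-md-4 tablo yuksek""><div class=""hucre hucre-ortala"">
    <a href=""tel:1"">FIRST ECZANESI</a><br>
    <a href=""tel:1"">0(242) 111-11-11</a>
  </div></div>
  <div class=""col-md-8 tablo yuksek""><div class=""hucre hucre-ortala"">
    <a href=""https://maps.example/1"" class=""nadres""> FIRST STREET 1 </a>
  </div></div>
</div>
<div class=""nesne row nobetciDiv"">
  <div class=""col-md-4 tablo yuksek""><div class=""hucre hucre-ortala"">
    <a href=""tel:2"">SECOND ECZANESI</a><br>
    <a href=""tel:2"">0(242) 222-22-22</a>
  </div></div>
  <div class=""col-md-8 tablo yuksek""><div class=""hucre hucre-ortala"">
    <a href=""https://maps.example/2"" class=""nadres""> SECOND STREET 2 </a>
  </div></div>
</div>";

    public static (List<string> Names, List<string> Numbers, List<string> Addresses) Parse(string html)
    {
        var names = new List<string>();
        var numbers = new List<string>();
        var addresses = new List<string>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Absolute path: find every pharmacy row in the document.
        var rows = doc.DocumentNode.SelectNodes("//div[contains(@class,'nobetciDiv')]");
        if (rows == null) return (names, numbers, addresses); // no match yields null

        foreach (HtmlNode row in rows)
        {
            // Relative paths (leading ".") stay inside the current row.
            var links = row.SelectNodes(".//div[contains(@class,'col-md-4')]//a[@href]");
            var address = row.SelectSingleNode(
                ".//div[contains(@class,'col-md-8')]//a[contains(@class,'nadres')]");
            if (links == null || links.Count < 2 || address == null) continue; // skip malformed rows

            names.Add(links[0].InnerText.Trim());
            numbers.Add(links[1].InnerText.Trim());
            addresses.Add(address.InnerText.Trim());
        }
        return (names, numbers, addresses);
    }
}

public static class Program
{
    public static void Main()
    {
        var result = PharmacyParser.Parse(PharmacyParser.SampleHtml);
        for (int i = 0; i < result.Names.Count; i++)
        {
            Console.WriteLine($"{result.Names[i]} | {result.Numbers[i]} | {result.Addresses[i]}");
        }
    }
}
```

Because every per-row query starts with `.//`, the second row yields its own name, number, and address rather than repeating the first row's values, which is what happens when the queries start with `//`.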
