Unable to get child elements using the PHP DOM parser

Problem description

Here is my HTML:

<div class="product4Col">
   <div class="fluidprodCol">
      <div class="fluid">
         <a href="url1">Title 1</a>  
      </div>
      <div class="fluid">
         <div>
            <a id="">Add To Bag</a>
         </div>
      </div>
      <div class="fluid productName title">
         <a href="url2">Subtitle 1</a>
      </div>
      <div class="fluid productName price"><label>₹</label>2,999
      </div>
      <div class="fluid productName">
         <div class="colorSwatch" ></div>
         <div class="colorSwatch" ></div>
         <div class="colorSwatch" ></div>
      </div>
   </div>
</div>
<div class="product4Col">
   <div class="fluidprodCol">
      <div class="fluid">
         <a href="url11">Title 2</a>     
      </div>
      <div class="fluid">
         <div>
            <a id="">Add To Bag</a>
         </div>
      </div>
      <div class="fluid productName title">
         <a href="url22">Subtitle 2</a>
      </div>
      <div class="fluid productName price"><label>₹</label>2,999
      </div>
      <div class="fluid productName">
         <div class="colorSwatch" ></div>
         <div class="colorSwatch" ></div>
      </div>
   </div>
</div>

I want to get the following output:

1: url1 , Title 1 , url2 , Subtitle 1, 3 colorSwatch

2: url11 ,Title 2 , url22 , Subtitle 2, 2 colorSwatch

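I tried the code below, but it does not work as expected: I cannot get the second-level data. I want to get the URLs, the titles, and the number of color swatches. I need help solving this.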

$dataop = file_get_contents('http://localhost/dataimport.html');


$doc = new DOMDocument();
$doc->loadHTML($dataop);
$xpath = new DomXPath($doc);
$nodeList = $xpath->query("//div[@class='product4Col']");
foreach($nodeList as $prg){
    echo "<br>------------------<br>";
    $nodeListnx = $prg->query("//div[@class='fluidprodCol']");
    foreach($nodeListnx as $prgnx){
        echo "<p>new</p>";
    }
    echo "<br>------------------<br>";
}

Tags: php, dom

Solution


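This code fixes the second use of query(): it is called on $xpath rather than on $prg, and $prg is passed as the context node for the nested search. I have also added a . at the start of the query so that it only reads the content of that node.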
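To see why the leading . matters, here is a minimal sketch. It assumes a trimmed two-product snippet in place of the real page; the $html string is illustrative and not part of the original question. It runs the same query with and without the dot, passing the first product as the context node both times.

```php
<?php
// Hypothetical two-product snippet, trimmed from the HTML in the question.
$html = '<div class="product4Col"><div class="fluidprodCol"><a href="url1">Title 1</a></div></div>'
      . '<div class="product4Col"><div class="fluidprodCol"><a href="url11">Title 2</a></div></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Use the first product as the context node for both queries.
$first = $xpath->query("//div[@class='product4Col']")->item(0);

// Absolute path: "//" starts at the document root, so the context node is ignored.
$absolute = $xpath->query("//div[@class='fluidprodCol']", $first);

// Relative path: ".//" starts at the context node, so only its descendants match.
$relative = $xpath->query(".//div[@class='fluidprodCol']", $first);

echo $absolute->length, " ", $relative->length, "\n";
```

With this snippet the absolute query should match both fluidprodCol elements, while the relative one matches only the element inside the first product.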

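Because the query extracts the <a> tags inside this element, the code picks out the data from the 1st and 3rd links only. It then looks for the elements with the colorSwatch class. I am not sure what you want to do with them, so it just outputs their content...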

$dataop = file_get_contents('http://localhost/dataimport.html');

$doc = new DOMDocument();
$doc->loadHTML($dataop);
$xpath = new DOMXPath($doc);
$nodeList = $xpath->query("//div[@class='product4Col']");
foreach ($nodeList as $prg) {
    echo "<br>------------------<br>";
    // Relative query (leading "."), with $prg passed as the context node.
    $nodeListnx = $xpath->query(".//div[@class='fluidprodCol']//a", $prg);
    // Links 0 and 2 are the title and subtitle; link 1 is "Add To Bag".
    echo $nodeListnx->item(0)->getAttribute('href') . " " . $nodeListnx->item(0)->textContent . "<br />";
    echo $nodeListnx->item(2)->getAttribute('href') . " " . $nodeListnx->item(2)->textContent . "<br />";

    // Output the content of each color swatch in this product.
    $colorSwatchs = $xpath->query(".//div[@class='colorSwatch']", $prg);
    foreach ($colorSwatchs as $colorSwatch) {
        echo $colorSwatch->textContent . "<br />";
    }
    echo "<br>------------------<br>";
}
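The question asks for one line per product with the swatch count rather than the swatch contents. The following is one possible sketch of that, not the original answer's method. It assumes the sample HTML is held in a string, and summarizeProducts is a hypothetical helper name introduced for illustration. It selects only anchors that carry an href (which skips "Add To Bag") and reads the length of the swatch node list.

```php
<?php
// Sample input trimmed from the HTML in the question; swap in the real page source.
$html = <<<HTML
<div class="product4Col">
  <div class="fluidprodCol">
    <div class="fluid"><a href="url1">Title 1</a></div>
    <div class="fluid"><div><a>Add To Bag</a></div></div>
    <div class="fluid productName title"><a href="url2">Subtitle 1</a></div>
    <div class="fluid productName">
      <div class="colorSwatch"></div>
      <div class="colorSwatch"></div>
      <div class="colorSwatch"></div>
    </div>
  </div>
</div>
<div class="product4Col">
  <div class="fluidprodCol">
    <div class="fluid"><a href="url11">Title 2</a></div>
    <div class="fluid"><div><a>Add To Bag</a></div></div>
    <div class="fluid productName title"><a href="url22">Subtitle 2</a></div>
    <div class="fluid productName">
      <div class="colorSwatch"></div>
      <div class="colorSwatch"></div>
    </div>
  </div>
</div>
HTML;

// Hypothetical helper: returns one summary line per product4Col block.
function summarizeProducts(string $html): array
{
    libxml_use_internal_errors(true); // collect markup warnings instead of printing them
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($doc);

    $lines = [];
    $n = 0;
    foreach ($xpath->query("//div[@class='product4Col']") as $product) {
        $n++;
        $parts = [];
        // Only anchors with an href attribute, relative to this product.
        foreach ($xpath->query(".//a[@href]", $product) as $a) {
            $parts[] = $a->getAttribute('href');
            $parts[] = trim($a->textContent);
        }
        // Count the swatches instead of printing their (empty) content.
        $count = $xpath->query(".//div[@class='colorSwatch']", $product)->length;
        $parts[] = $count . ' colorSwatch';
        $lines[] = $n . ': ' . implode(' , ', $parts);
    }
    return $lines;
}

echo implode("\n", summarizeProducts($html)), "\n";
```

Note that [@class='colorSwatch'] matches only when the class attribute is exactly that string, which holds for the sample HTML. The separator spacing here is uniform, so it differs slightly from the output shown in the question.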
