Retrieve text with DOMDocument, but exclude the inner h1 tag

Question

I have some HTML and am trying to retrieve its text without the contents of the <h1> tag.

$html = '<div class="mytext">
           <h1>Title of document</h1>
           This is the text that I want, without the title.
         </div>';

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // set before loading; it has no effect afterwards
$dom->loadHTML($html);
$xp = new DOMXPath($dom);
foreach ($xp->query('//div[@class="mytext"]') as $node) {
  // nodeValue concatenates the text of all descendants, including the <h1>
  $description = $node->nodeValue;
  echo $description;
}

The expected result is: This is the text that I want, without the title.

The current result is: Title of document This is the text that I want, without the title

How can I get the text without the h1 tag?

Tags: php, xpath, domdocument

Solution

Try this:

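// text() selects only the div's direct text nodes, so the <h1> element is skipped;
// [normalize-space()] drops text nodes that contain only whitespace.
foreach ($xp->query('//div[@class="mytext"]/text()[normalize-space()]') as $node) {
   $description = trim($node->nodeValue); // trim the surrounding indentation and newlines
   echo $description;
}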
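
For reference, here is a minimal, self-contained sketch of the same approach, which may help when trying it outside the original script. The function name directText is hypothetical and not part of the original answer. The libxml_use_internal_errors call is an added assumption to keep parser warnings quiet, and joining multiple text nodes with a single space is just one possible choice.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text that
// sits directly inside <div class="mytext">, skipping child elements such as <h1>.
function directText(string $html): string
{
    $dom = new DOMDocument();
    // Assumption: collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($dom);
    $parts = [];
    // Only direct, non-blank text nodes of the div are selected.
    foreach ($xp->query('//div[@class="mytext"]/text()[normalize-space()]') as $node) {
        $parts[] = trim($node->nodeValue);
    }
    // Assumption: several text nodes are joined with one space.
    return implode(' ', $parts);
}

$html = '<div class="mytext">
           <h1>Title of document</h1>
           This is the text that I want, without the title.
         </div>';

echo directText($html), PHP_EOL;
```

With the sample markup from the question, this should print only the sentence that follows the heading. Note that text inside other child elements (for example a <p> or <span>) would be skipped as well, since only the div's own text nodes are matched.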
