Opening several JavaScript links from a page

Problem description

I am using "fabpot/goutte": "^3.2" with PHP 7.3.5.

I am trying to access the page https://www.forexfactory.com/calendar.php?month=nov.2019 and click the links that open each event's detail box.


I tried to filter all of these links, but no links are found:

$subCrawler->filter('td.calendar__cell.calendar__detail.detail > a')->each(function ($node) {
    $link = $node->link();
    print $link->getUri() ."\n"; // a Link object cannot be cast to a string
    print $node->text() ."\n";
});

Any suggestions on how to click the links with Goutte and get the Source text and the Usual Effect text?

Tags: php, goutte

Solution


Using Goutte:

<?php
require 'vendor/autoload.php';

use Goutte\Client;
use Symfony\Component\DomCrawler\Crawler;

$limit   = 20;  // maximum number of calendar rows to query
$count   = 0;
$results = [];

$client  = new Client();
$crawler = $client->request('GET', 'https://www.forexfactory.com/calendar.php?month=nov.2019');

// Goutte does not execute JavaScript, so the detail links cannot be clicked.
// Instead, read each row's event ID and request its detail fragment directly.
$crawler->filter('.calendar_row')->each(function (Crawler $row) use (&$results, &$count, $limit) {
    if ($count >= $limit) {
        return;
    }
    $count++;

    $eventId  = $row->attr('data-eventid');
    $response = file_get_contents('https://www.forexfactory.com/flex.php?do=ajax&contentType=Content&flex=calendar_mainCal&details=' . $eventId);
    if ($response === false) {
        return; // request failed, skip this row
    }

    // The HTML fragment is wrapped in CDATA markers; strip them before parsing.
    $response = str_replace(['<![CDATA[', ']]>'], '', $response);

    $subCrawler = new Crawler('<!DOCTYPE html><html><body>' . $response . '</body></html>');

    $subCrawler->filter('.calendarspecs__spec')->each(function (Crawler $labelCell) use (&$results) {
        if (trim($labelCell->text()) !== 'Source') {
            return;
        }

        // Collect the text and href of every link that follows the "Source" label.
        $links = $labelCell->nextAll()->filter('a')->each(function (Crawler $a) {
            return ['text' => $a->text(), 'href' => $a->attr('href')];
        });

        // The second link, when present, points to the latest release.
        $results[] = [
            'sourceTEXT' => $links[0]['text'] ?? null,
            'sourceURL'  => $links[0]['href'] ?? null,
            'latestURL'  => $links[1]['href'] ?? null,
        ];
    });
});

echo "<pre>"; var_dump($results); echo "</pre>";
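The detail request above is made by appending the event ID to a fixed URL. As a minimal sketch, not part of the original answer, that URL could be built in one place so the ID is always URL-encoded. The helper name `buildDetailUrl` and the event ID `12345` are hypothetical; the query parameters are simply the ones that appear in the URL above, and whether the site still accepts them is an assumption.

```php
<?php
// Hypothetical helper (not from the original answer): builds the detail URL
// for one calendar event from the query parameters used in the answer.
function buildDetailUrl(string $eventId): string
{
    // http_build_query URL-encodes each value; '&' is passed explicitly so the
    // separator does not depend on the arg_separator.output ini setting.
    $query = http_build_query([
        'do'          => 'ajax',
        'contentType' => 'Content',
        'flex'        => 'calendar_mainCal',
        'details'     => $eventId,
    ], '', '&');

    return 'https://www.forexfactory.com/flex.php?' . $query;
}

// '12345' is a made-up event ID used only for illustration.
echo buildDetailUrl('12345'), "\n";
```

The returned string can be passed to `file_get_contents()` in place of the concatenated URL.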

Using Simple HTML DOM (simple_html_dom.php has to be downloaded separately):

<?php
include('simple_html_dom.php');
$html = file_get_html('https://www.forexfactory.com/calendar.php?month=nov.2019');

$x = 1;
$LIMIT = 10;

foreach($html->find('.calendar_row') as $e){
    $x++;
    $EVENTID = $e->attr['data-eventid'];
    $EVENTNAME = $e->find('.event')[0]->find('div')[0]->innertext;
    echo "<h4>".$EVENTNAME."</h4><br>";

    // Fetch the raw response as a string; it is parsed only after the CDATA markers are stripped.
    $API_RESPONSE = file_get_contents('https://www.forexfactory.com/flex.php?do=ajax&contentType=Content&flex=calendar_mainCal&details='.$EVENTID);

    $API_RESPONSE = str_replace(array("<![CDATA[", "]]>"), "", $API_RESPONSE);
    $API_RESPONSE = str_get_html($API_RESPONSE);
    foreach($API_RESPONSE->find('.calendarspecs__spec') as $LEFT_TD){

        $LEFT_TD_INNER_TEXT = trim($LEFT_TD->innertext);
        if($LEFT_TD_INNER_TEXT == "Source" || $LEFT_TD_INNER_TEXT == "Usual Effect"){
            echo $LEFT_TD_INNER_TEXT.": ".$LEFT_TD->next_sibling()->innertext."<br>";
        }
    }

    if($x>$LIMIT)
        break;
    echo "<hr>";

}
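The parsing step shared by both approaches (strip the CDATA markers, find the label cells, read the cell next to each label) can be tried offline. The following is a rough sketch under stated assumptions: it uses PHP's built-in DOM extension rather than Goutte or Simple HTML DOM, the function name `extractSpecs` is hypothetical, and the sample fragment is invented to resemble the structure the code above relies on (a `calendarspecs__spec` label cell followed by a value cell). The real response may be laid out differently.

```php
<?php
// Sketch only: extractSpecs is a hypothetical helper built on PHP's DOM extension.
// It returns label => value text for each wanted label found in a detail fragment.
function extractSpecs(string $fragment, array $wanted): array
{
    // Strip the CDATA markers that wrap the HTML fragment.
    $fragment = str_replace(['<![CDATA[', ']]>'], '', $fragment);

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML('<html><body>' . $fragment . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath  = new DOMXPath($doc);
    $labels = $xpath->query(
        '//*[contains(concat(" ", normalize-space(@class), " "), " calendarspecs__spec ")]'
    );

    $specs = [];
    foreach ($labels as $labelCell) {
        $label = trim($labelCell->textContent);
        if (!in_array($label, $wanted, true)) {
            continue;
        }
        // Walk forward to the next element sibling, skipping whitespace text nodes.
        $value = $labelCell->nextSibling;
        while ($value !== null && $value->nodeType !== XML_ELEMENT_NODE) {
            $value = $value->nextSibling;
        }
        if ($value !== null) {
            $specs[$label] = trim($value->textContent);
        }
    }
    return $specs;
}

// Invented sample fragment; the real markup is an assumption.
$sample = <<<'HTML'
<![CDATA[<table>
<tr><td class="calendarspecs__spec">Source</td><td><a href="https://example.com/source">Example Bureau</a></td></tr>
<tr><td class="calendarspecs__spec">Usual Effect</td><td>Actual greater than Forecast is good for currency</td></tr>
<tr><td class="calendarspecs__spec">Frequency</td><td>Monthly</td></tr>
</table>]]>
HTML;

print_r(extractSpecs($sample, ['Source', 'Usual Effect']));
```

For a live run, the string returned by `file_get_contents()` for a detail URL would take the place of `$sample`.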


Is this what you are looking for?

