php - Opening several JavaScript links from a page
Problem description
I am using "fabpot/goutte": "^3.2" with PHP 7.3.5.
I am trying to visit https://www.forexfactory.com/calendar.php?month=nov.2019 and click the detail link in each calendar row, which opens a box with that event's details.
I tried to filter all of these links, but no links are found:
$subCrawler->filter('td.calendar__cell.calendar__detail.detail > a')->each(function ($node) {
    // link() returns a Link object; getUri() gives its URL as a string
    $link = $node->link();
    print $link->getUri() . "\n";
    print $node->text() . "\n";
});
Any suggestions on how to click these links with Goutte and get the Source text and the Usual Effect text?
Solution
Using Goutte:
<?php
require 'vendor/autoload.php';

use Goutte\Client;
use Symfony\Component\DomCrawler\Crawler;

$LIMIT = 20; // maximum number of calendar rows to process
$client = new Client();
$crawler = $client->request('GET', 'https://www.forexfactory.com/calendar.php?month=nov.2019');
$resArray = array();

// each() has no early exit, so cut the row list down to $LIMIT with slice().
$crawler->filter('.calendar_row')->slice(0, $LIMIT)->each(function ($node) use (&$resArray) {
    $EVENTID = $node->attr('data-eventid');
    if (!$EVENTID) {
        return; // skip rows that carry no event id
    }

    // Goutte does not run JavaScript, so request the details endpoint directly.
    $API_RESPONSE = file_get_contents('https://www.forexfactory.com/flex.php?do=ajax&contentType=Content&flex=calendar_mainCal&details='.$EVENTID);
    if ($API_RESPONSE === false) {
        return; // request failed
    }

    // Strip the CDATA wrapper and embed the fragment in a full HTML document.
    $API_RESPONSE = str_replace(array("<![CDATA[", "]]>"), "", $API_RESPONSE);
    $html = "<!DOCTYPE html>\n<html>\n<body>\n".$API_RESPONSE."\n</body>\n</html>";

    $subcrawler = new Crawler($html);
    $subcrawler->filter('.calendarspecs__spec')->each(function ($LEFT_TD) use (&$resArray) {
        $LEFT_TD_INNER_TEXT = trim($LEFT_TD->text());
        if ($LEFT_TD_INNER_TEXT == "Source") {
            // Collect text and href of every link in the cells after the label.
            $TEMP = array();
            $LEFT_TD->nextAll()->filter('a')->each(function ($LINK) use (&$TEMP) {
                array_push($TEMP, $LINK->text(), $LINK->attr('href'));
            });

            $EVENT = array();
            $EVENT['sourceTEXT'] = $TEMP[0] ?? null; // text of the first link
            $EVENT['sourceURL'] = $TEMP[1] ?? null;  // href of the first link
            $EVENT['latestURL'] = $TEMP[3] ?? null;  // href of the second link, if any
            array_push($resArray, $EVENT);
        }
    });
});

echo "<pre>"; var_dump($resArray); echo "</pre>";
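The parsing step can be tried without network access. The following is a minimal sketch, not the answer's own method: it uses PHP's built-in DOM extension instead of Goutte, and it assumes the AJAX response is an HTML fragment wrapped in CDATA in which each `.calendarspecs__spec` label cell is followed by a sibling cell holding its value. The sample payload (including the `calendarspecs__specdescription` class) is made up for illustration, and `extractSpecs` is a hypothetical helper name. It also picks up the Usual Effect text asked about in the question.

```php
<?php
// Hypothetical helper: returns label => value text for the requested labels.
function extractSpecs(string $response, array $labels): array
{
    // Strip the CDATA wrapper, as the answer's code does.
    $fragment = str_replace(array('<![CDATA[', ']]>'), '', $response);

    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate imperfect markup
    $doc->loadHTML('<!DOCTYPE html><html><body>' . $fragment . '</body></html>');
    libxml_clear_errors();

    // Match elements whose class list contains exactly "calendarspecs__spec".
    $xpath = new DOMXPath($doc);
    $query = '//*[contains(concat(" ", normalize-space(@class), " "), " calendarspecs__spec ")]';

    $specs = array();
    foreach ($xpath->query($query) as $labelCell) {
        $label = trim($labelCell->textContent);
        if (!in_array($label, $labels, true)) {
            continue;
        }
        // Assumption: the value lives in the next element sibling of the label.
        $valueCell = $labelCell->nextSibling;
        while ($valueCell !== null && $valueCell->nodeType !== XML_ELEMENT_NODE) {
            $valueCell = $valueCell->nextSibling;
        }
        if ($valueCell !== null) {
            $specs[$label] = trim($valueCell->textContent);
        }
    }
    return $specs;
}

// Made-up sample payload; the real response layout is an assumption.
$sample = '<![CDATA[<table><tr><td class="calendarspecs__spec">Source</td>'
    . '<td class="calendarspecs__specdescription"><a href="https://example.com/">Example Bureau</a></td></tr>'
    . '<tr><td class="calendarspecs__spec">Usual Effect</td>'
    . '<td class="calendarspecs__specdescription">Actual greater than Forecast is good for currency</td></tr></table>]]>';

print_r(extractSpecs($sample, array('Source', 'Usual Effect')));
```

If the real markup differs from this assumed layout, only the XPath query and the sibling lookup should need adjusting.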
Using Simple HTML DOM, which has to be downloaded separately as simple_html_dom.php:
<?php
include('simple_html_dom.php');

$html = file_get_html('https://www.forexfactory.com/calendar.php?month=nov.2019');
$x = 0;
$LIMIT = 10; // maximum number of calendar rows to process

foreach ($html->find('.calendar_row') as $e) {
    $x++;
    $EVENTID = $e->attr['data-eventid'];
    $EVENTNAME = $e->find('.event')[0]->find('div')[0]->innertext;
    echo "<h4>".$EVENTNAME."</h4><br>";

    // Fetch the raw details response as a string so the CDATA wrapper can be stripped.
    $API_RESPONSE = file_get_contents('https://www.forexfactory.com/flex.php?do=ajax&contentType=Content&flex=calendar_mainCal&details='.$EVENTID);
    $API_RESPONSE = str_replace(array("<![CDATA[", "]]>"), "", (string) $API_RESPONSE);
    $API_RESPONSE = str_get_html($API_RESPONSE);

    // str_get_html() returns false for an empty or oversized string.
    if ($API_RESPONSE) {
        foreach ($API_RESPONSE->find('.calendarspecs__spec') as $LEFT_TD) {
            $LEFT_TD_INNER_TEXT = trim($LEFT_TD->innertext);
            $VALUE_TD = $LEFT_TD->next_sibling(); // cell holding the value for this label
            if ($VALUE_TD && ($LEFT_TD_INNER_TEXT == "Source" || $LEFT_TD_INNER_TEXT == "Usual Effect")) {
                echo $LEFT_TD_INNER_TEXT.": ".$VALUE_TD->innertext."<br>";
            }
        }
    }

    if ($x >= $LIMIT)
        break;
    echo "<hr>";
}
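Both versions build the details URL by concatenating the event ID onto a fixed string. A small sketch of the same step using `http_build_query`, which URL-encodes the ID; `buildDetailsUrl` is a hypothetical helper name, the event ID `12345` is a placeholder, and the query parameters are copied from the URL used above:

```php
<?php
// Hypothetical helper: builds the AJAX details URL for one calendar event.
function buildDetailsUrl(string $eventId): string
{
    $query = http_build_query(array(
        'do' => 'ajax',
        'contentType' => 'Content',
        'flex' => 'calendar_mainCal',
        'details' => $eventId, // URL-encoded by http_build_query()
    ), '', '&');

    return 'https://www.forexfactory.com/flex.php?' . $query;
}

// Placeholder event ID, for illustration only.
echo buildDetailsUrl('12345'), "\n";
```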