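php - How to get reviews from Google Business using cURL in PHP
Problem description
I am trying to get the reviews from Google Business. The goal is to fetch the page with curl and then read the value of the element whose jsaction attribute contains pane.rating.moreReviews.
How can I fix the code below so that it works with curl?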
function curl($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.94 Safari/537.36');
    $html = curl_exec($ch);
    curl_close($ch);
    return $html;
}
$html = curl("https://www.google.com/maps?cid=12909283986953620003");
$DOM = new DOMDocument();
$DOM->loadHTML($html);
$finder = new DomXPath($DOM);
$classname = 'pane.rating.moreReviews';
$nodes = $finder->query("//*[contains(@jsaction, '$classname')]");
foreach ($nodes as $node) {
    $check_reviews = $node->nodeValue;
    $ses_key = preg_replace('/[^0-9]+/', '', $check_reviews);
}
// result should be: 166
echo $ses_key;
If I try var_dump($html); I get:
string(348437) " "
The number changes on every page refresh.
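Solution
Get Google reviews with PHP cURL and without an API key
How to find the CID, if you have the business open in Google Maps:
- Search Google Maps for the business name.
- Make sure it is the only result that shows up.
- Replace http:// with view-source: in the URL.
- Press CTRL+F and search the source code for "ludocid".
- The CID is the run of digits that follows "ludocid\u003d", up to the last digit.
Or use this tool: https://ryanbradley.com/tools/google-cid-finder/
Example
ludocid\\u003d16726544242868601925\
Hint: use the ".quote" class in your CSS to style the output.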
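The manual CID lookup above could also be done in code. The following is a minimal sketch, not part of the original answer: `extract_cid` is a hypothetical helper name, and it assumes the page source contains the marker in one of the forms shown in the example (`ludocid` followed by an escaped or plain equals sign and then digits). The CID is kept as a string because a 20-digit value does not fit in a 64-bit PHP integer.

```php
<?php
// Hypothetical helper (not from the original answer): returns the digits that
// follow "ludocid\u003d" (with one or more backslashes) or "ludocid=".
function extract_cid(string $source): ?string
{
    // '\\\\+' in a single-quoted PHP string is the regex \\+ : one or more literal backslashes.
    if (preg_match('/ludocid(?:\\\\+u003d|=)(\d+)/', $source, $m)) {
        return $m[1]; // keep as string: the value exceeds PHP_INT_MAX
    }
    return null; // marker not found
}

// Sample input modelled on the example line above.
$sample = 'foo ludocid\\u003d16726544242868601925\\ bar';
var_dump(extract_cid($sample));
```

If the function returns null, the marker was not present in the text that was passed in, so the fallback is the manual search or the tool mentioned above.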
PHP
<?php
/*
Get Google-Reviews with PHP cURL & without API Key
=====================================================
How to find the CID - If you have the business open in Google Maps:
- Do a search in Google Maps for the business name
- Make sure it’s the only result that shows up.
- Replace http:// with view-source: in the URL
- Click CTRL+F and search the source code for “ludocid”
- CID will be the numbers after “ludocid\\u003d” and till the last number
or use this tool: https://pleper.com/index.php?do=tools&sdo=cid_converter
Example
-------
```TXT
ludocid\\u003d16726544242868601925\
```
> HINT: Use the class ".quote" in your CSS to style the output
###### Copyright 2019 Igor Gaffling
*/
$cid = '16726544242868601925'; // The CID you want to see the reviews for
$show_only_if_with_text = false; // true OR false
$show_only_if_greater_x = 0; // 0-4
$show_rule_after_review = false; // true OR false
$show_blank_star_till_5 = false; // true OR false: pad ratings with ☆ up to 5 stars
/* ------------------------------------------------------------------------- */
$ch = curl_init('https://www.google.com/maps?cid='.$cid);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla / 5.0 (Windows; U; Windows NT 5.1; en - US; rv:1.8.1.6) Gecko / 20070725 Firefox / 2.0.0.6");
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies.txt');
$result = curl_exec($ch);
curl_close($ch);
$pattern = '/window\.APP_INITIALIZATION_STATE(.*);window\.APP_FLAGS=/ms';
if ( preg_match($pattern, $result, $match) ) {
    $match[1] = trim($match[1], ' =;'); // fix json
    $reviews = json_decode($match[1]);
    $reviews = ltrim($reviews[3][6], ")]}'"); // fix json
    $reviews = json_decode($reviews);
    //$customer = $reviews[0][1][0][14][18];
    //$reviews = $reviews[0][1][0][14][52][0];
    $customer = $reviews[6][18]; // NEW IN 2020
    $reviews = $reviews[6][52][0]; // NEW IN 2020
}
if (isset($reviews)) {
    // Escape scraped text before echoing it into the page.
    echo '<div class="quote"><strong>'.htmlspecialchars((string) $customer).'</strong><br>';
    foreach ($reviews as $review) {
        if ($show_only_if_with_text == true and empty($review[3])) continue;
        if ($review[4] <= $show_only_if_greater_x) continue;
        for ($i=1; $i <= $review[4]; ++$i) echo '⭐'; // RATING
        if (!empty($show_blank_star_till_5))
            for ($i=1; $i <= 5-$review[4]; ++$i) echo '☆'; // RATING
        echo '<p>'.htmlspecialchars((string) $review[3]).'<br>'; // TEXT
        echo '<small>'.htmlspecialchars((string) $review[0][1]).'</small></p>'; // AUTHOR
        if ($show_rule_after_review == true) echo '<hr size="1">';
    }
    echo '</div>';
}
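The script above indexes straight into the decoded data, so a missing level produces warnings rather than a clean failure. The sketch below shows one way the same two steps (dropping the `)]}'` guard prefix, then reading `[6][18]` and `[6][52][0]`) might be written defensively. `decode_guarded_json` and `dig` are hypothetical helper names, the payload is synthetic test data rather than a real Google response, and the index paths are simply the ones used in the answer; the commented-out older indices suggest they can change.

```php
<?php
// Hypothetical helper: strip the ")]}'" guard prefix if present, then decode to arrays.
function decode_guarded_json(string $raw): ?array
{
    $prefix = ")]}'";
    if (strncmp($raw, $prefix, strlen($prefix)) === 0) {
        $raw = substr($raw, strlen($prefix));
    }
    $data = json_decode(trim($raw), true);
    return is_array($data) ? $data : null; // null when the text is not a JSON array/object
}

// Hypothetical helper: follow a list of keys into nested arrays,
// returning $default as soon as one level is missing.
function dig($data, array $path, $default = null)
{
    foreach ($path as $key) {
        if (!is_array($data) || !array_key_exists($key, $data)) {
            return $default;
        }
        $data = $data[$key];
    }
    return $data;
}

// Synthetic payload shaped like the indices read above: [6][18] and [6][52][0].
$place = array_fill(0, 53, null);
$place[18] = 'Example Cafe';                                       // business name
$place[52] = [[[['', 'Jane Doe'], null, null, 'Great coffee', 5]]]; // [0] is the review list
$payload = array_fill(0, 7, null);
$payload[6] = $place;
$raw = ")]}'\n" . json_encode($payload);

$data     = decode_guarded_json($raw);
$customer = dig($data, [6, 18], '');
$reviews  = dig($data, [6, 52, 0], []);
echo $customer, ': ', count($reviews), " review(s)\n";
```

With helpers like these, a layout change yields an empty name and an empty review list instead of a chain of warnings, which makes it easier to notice that the index paths need updating.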