
How to extract anchor tags like this

Problem description

So, suppose I'm searching Google for "White Russian". Once we do that, we get some model cards, as shown in the picture below:

[Image: search_white_russian_on_google_sample_pic]

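Now, if you look at the HTML in the Inspector, you'll see that the href for each of these cards sits inside an anchor tag, as in the picture below (... stands for extra stuff):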

[Image: anchor tag]

<a class="a-no-hover-decoration" href="https://www.liquor.com/recipes/white-russian/" .....>

So I'm interested in extracting the href from these anchor tags (if they exist for the search).

My attempt:

import requests
from bs4 import BeautifulSoup

req = requests.get("https://www.google.com/search?q=White+Russian")
soup = BeautifulSoup(req.text, 'html.parser')
soup.find_all("a", {"class": "a-no-hover-detection"}) # this returns Nothing

I'm somewhat new to web scraping, so any help would be appreciated.

My second question: for any given random search, how can I detect whether such model cards are present or not?

Thanks.

Tags: python, html, css, web-scraping, beautifulsoup

Solution


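Note that the attempt in the question searches for the class a-no-hover-detection, whereas the anchor in the markup uses a-no-hover-decoration. The code below also sends a browser-like User-Agent header, presumably because Google may return different markup to requests without one. You can also use the SelectorGadget Chrome extension to grab CSS selectors visually.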

from bs4 import BeautifulSoup
import requests
import lxml  # not used directly; importing it just confirms the 'lxml' parser is installed

headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

response = requests.get('https://www.google.com/search?q=white russian', headers=headers).text
soup = BeautifulSoup(response, 'lxml')
# select() method: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#searching-by-css-class
for result in soup.select('.cv2VAd .a-no-hover-decoration'):
  link = result['href']
  print(link)

Output:

https://www.liquor.com/recipes/white-russian/
https://www.delish.com/cooking/recipe-ideas/a29091466/white-russian-cocktail-recipe/
https://www.kahlua.com/en-us/drinks/white-russian/
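
For the second question (detecting whether a search has such cards at all), one possible approach is to wrap the selector in a small function and check whether it returns anything. The sketch below is only an illustration: the function name `extract_card_links` and the two HTML snippets are hypothetical, hand-written stand-ins for a real response, and the selector is the one used above, which may stop matching whenever Google changes its markup.

```python
from bs4 import BeautifulSoup

# Selector copied from the answer above; treat it as an assumption that may
# need updating, since Google's class names change over time.
CARD_LINK_SELECTOR = ".cv2VAd .a-no-hover-decoration"


def extract_card_links(html):
    """Return the href of every card link found in html (empty list if none)."""
    soup = BeautifulSoup(html, "html.parser")
    # Skip any matching anchor that has no href attribute.
    return [a["href"] for a in soup.select(CARD_LINK_SELECTOR) if a.has_attr("href")]


# Hypothetical snippets standing in for real Google responses.
with_cards = """
<div class="cv2VAd">
  <a class="a-no-hover-decoration" href="https://www.liquor.com/recipes/white-russian/">card</a>
</div>
"""
without_cards = '<div><a href="https://example.com/">plain result</a></div>'

print(extract_card_links(with_cards))           # ['https://www.liquor.com/recipes/white-russian/']
print(bool(extract_card_links(without_cards)))  # False
```

An empty list then means "no cards were found with this selector", which can also happen if the markup has changed, so it is worth checking the selector in the Inspector from time to time.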

Alternatively, you can do this with the Google Search Engine Results API from SerpApi. It's a paid API with a free trial of 5,000 searches.

Code to integrate:

from serpapi import GoogleSearch

params = {
  "api_key": "YOUR_API_KEY",
  "engine": "google",
  "q": "White Russian",
  "google_domain": "google.com",
}

search = GoogleSearch(params)
results = search.get_dict()

for result in results['recipes_results']:
  link = result['link']
  print(link)

Output:

https://www.liquor.com/recipes/white-russian/
https://www.delish.com/cooking/recipe-ideas/a29091466/white-russian-cocktail-recipe/
https://www.kahlua.com/en-us/drinks/white-russian/
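
With the API approach, the same presence check might look like the sketch below. It assumes that the `recipes_results` key is simply missing from the returned dict when a search has no such cards, in which case indexing it directly would raise a `KeyError`. The helper name `recipe_links` and the sample dicts are hypothetical; they only mimic the shape of what `search.get_dict()` returns.

```python
def recipe_links(results):
    """Return card links from a results dict, or [] when there are no cards."""
    # .get() with a default avoids a KeyError when 'recipes_results' is absent.
    return [item["link"] for item in results.get("recipes_results", []) if "link" in item]


# Hypothetical dicts shaped like the output of search.get_dict().
with_cards = {"recipes_results": [{"link": "https://www.liquor.com/recipes/white-russian/"}]}
without_cards = {"organic_results": [{"link": "https://example.com/"}]}

print(recipe_links(with_cards))     # ['https://www.liquor.com/recipes/white-russian/']
print(recipe_links(without_cards))  # []
```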

Disclaimer: I work for SerpApi.

