How can I find prices on the Udemy website using web scraping?

Question

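I am using Python's Beautiful Soup package to find the price of a course. With Beautiful Soup I get the price in dollars, and when I convert it to rupees it does not match the price shown on the website.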

price in udemy website : 700
price by beautiful soup : 13.99$

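I tried to work out the logic by computing the ratio between the two prices for several courses, but that did not work. Here is my code: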

from bs4 import BeautifulSoup
import requests
page = requests.get('https://www.udemy.com/course/python-data-science-machine-learning-bootcamp/')
soup = BeautifulSoup(page.content, 'html.parser')
for sp in soup.find_all('span',class_='price-text__current'):
   print(sp)

I get this output:

<span class="price-text__current" data-purpose="discount-price-text">
<span class="sr-only">Current price:</span> $13.99
</span>
</span>

Tags: python, web-scraping, beautifulsoup

Answer


You have to send headers with your request, in this case Accept-Language:

from bs4 import BeautifulSoup
import requests

# For French people:
hd = {'Accept-Language': 'fr,fr-FR'}
page = requests.get('https://www.udemy.com/course/python-data-science-machine-learning-bootcamp/',
                    headers = hd)
soup = BeautifulSoup(page.content, 'html.parser')
for sp in soup.find_all('span',class_='price-text__current'):
    print(sp)

# For US people:
hd = {'Accept-Language': 'en,en-US'}
page = requests.get('https://www.udemy.com/course/python-data-science-machine-learning-bootcamp/',
                    headers = hd)
soup = BeautifulSoup(page.content, 'html.parser')
for sp in soup.find_all('span',class_='price-text__current'):
    print(sp)

Output:

<span class="price-text__current" data-purpose="discount-price-text">
<span class="sr-only">Prix actuel :</span> 15,99 €
</span>
<span class="price-text__current" data-purpose="discount-price-text">
<span class="sr-only">Current price:</span> €12.99
</span>

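Be careful, though: the response seems to differ from one request to the next. The website is probably tracking you on the server side. You will also have to experiment with the headers to get the value you want.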
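
Because the price text changes format with the locale ("$13.99", "15,99 €", "€12.99"), comparing values across requests is easier once the text is split into a currency symbol and a number. The following is only a minimal sketch, not part of the original answer: parse_price is a hypothetical helper, and it assumes that a '.' or ',' followed by exactly two digits is the decimal mark, which may not hold for every currency.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Split a price string such as '$13.99' or '15,99 €' into
    (currency_symbol, amount). Hypothetical helper for illustration."""
    # French pages often separate the number and the symbol with a no-break space.
    text = text.replace("\xa0", " ").strip()
    match = re.search(r"\d[\d.,]*", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    number = match.group().rstrip(".,")
    # Everything that is not part of the number is treated as the currency.
    currency = (text[:match.start()] + text[match.end():]).strip()

    # Assumption: the last '.' or ',' is the decimal mark only when exactly
    # two digits follow it; any other separator is a thousands separator.
    sep = max(number.rfind("."), number.rfind(","))
    if sep != -1 and len(number) - sep - 1 == 2:
        whole, frac = number[:sep], number[sep + 1:]
    else:
        whole, frac = number, ""
    whole = whole.replace(".", "").replace(",", "")
    amount = Decimal(f"{whole}.{frac}" if frac else whole)
    return currency, amount


for raw in ("$13.99", "15,99 €", "€12.99"):
    print(parse_price(raw))
```

Pass in only the price text itself, not the whole span: the screen-reader label ("Current price:" or "Prix actuel :") would otherwise end up in the currency part.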

