How to make Python web scraping update better

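Problem description

I am trying to build a Python stock price checker. It works, but it updates very slowly. It runs continuously and fetches its data from https://money.cnn.com.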

import requests, time, os
from bs4 import BeautifulSoup as bs
import simpleaudio as sa

original = 0


while True:
    pfe = requests.get('https://money.cnn.com/quote/quote.html?symb=pfe')
    soup = bs(pfe.content, 'lxml').body
    price_pfe = soup.find('td', {'class':'wsod_last'}).span.contents[0]
    if (price_pfe != original):
        print("Pfizer price: " + price_pfe)
        original = price_pfe

Are there any tricks to make it update faster?

Tags: python, web-scraping

Solution


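Using only lxml and XPath instead of BeautifulSoup makes parsing roughly twice as fast. The question's code already uses lxml as the parser behind BeautifulSoup, so the saving comes from skipping the BeautifulSoup layer: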

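import requests, time
from lxml import html

original = 0

while True:
    start_time = time.time()
    # Download the quote page.
    pfe = requests.get('https://money.cnn.com/quote/quote.html?symb=pfe')
    # Parse the HTML directly with lxml, without the BeautifulSoup wrapper.
    tree = html.fromstring(pfe.content)
    # The last price sits in <td class="wsod_last"><span>...</span></td>.
    price_pfe = float(tree.xpath("//td[@class='wsod_last']/span")[0].text_content().strip())
    print(price_pfe)
    # Report only when the price has changed since the previous poll.
    if price_pfe != original:
        print("Pfizer price: " + str(price_pfe))
        original = price_pfe
    # Time taken by this iteration (download plus parsing).
    print("--- %s seconds ---" % (time.time() - start_time))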
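
Parsing is only part of each iteration. The loop above also opens a fresh connection for every request, and the download may well dominate the measured time. The following is a minimal sketch, not part of the original answer, that reuses one connection through requests.Session and pauses between polls. The helper names extract_price and watch_price, and the interval and max_polls parameters, are hypothetical and chosen for illustration. Whether connection reuse helps depends on the server honoring keep-alive, which is assumed here.

```python
import time

import requests
from lxml import html


def extract_price(content):
    """Return the quoted price as a float, or None if the cell is missing."""
    tree = html.fromstring(content)
    cells = tree.xpath("//td[@class='wsod_last']/span")
    if not cells:
        return None
    # Drop thousands separators so values such as "1,234.50" parse cleanly.
    return float(cells[0].text_content().strip().replace(",", ""))


def watch_price(url, interval=5.0, max_polls=None):
    """Print the price whenever it changes (hypothetical helper).

    interval (seconds between polls) and max_polls (None = run forever)
    are assumptions for this sketch, not values from the original answer.
    """
    last = None
    polls = 0
    # A single Session pools connections, so later requests can skip
    # the TCP/TLS handshake if the server keeps the connection open.
    with requests.Session() as session:
        while max_polls is None or polls < max_polls:
            response = session.get(url, timeout=10)
            response.raise_for_status()
            price = extract_price(response.content)
            if price is not None and price != last:
                print("Pfizer price: " + str(price))
                last = price
            polls += 1
            # Pause so the site is not requested in a tight loop.
            time.sleep(interval)
```

Calling watch_price('https://money.cnn.com/quote/quote.html?symb=pfe') would run the same check as the loop above. Keeping the parsing in its own function also makes it possible to test the XPath against a saved copy of the page without touching the network.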
