How do I scrape?

Question

import requests
from bs4 import BeautifulSoup

url = "https://www.jab.de/tr/en/productadvancedsearch?searchTerm=&page=1"
website = requests.get(url)
html = website.content
soup = BeautifulSoup(html,"html.parser")

urunListesi = soup.find("section",{"class":"results"}).find("div",{"class":"col-item details"})
# print(urunListesi)

for urun in urunListesi:
    link = urun.div.a.get("href")
    print(link)
    print("----------------------------\n")

When I run this code, it returns None. Can you help me?

Tags: python, selenium, web-scraping, beautifulsoup

Answer


Here is an example of how I scrape the price of Bitcoin. After it, I'll explain how everything works. Try it out!

import requests
from bs4 import BeautifulSoup

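# Fetch the Google results page for the query (a plain string; no f-string needed).
cmc = requests.get("https://www.google.com/search?q=what+is+the+price+of+bitcoin")
soup = BeautifulSoup(cmc.content, "html.parser")
# Uncomment to dump the HTML so you can look up the class names yourself:
# with open("soup.txt", "w", encoding="utf-8") as f:
#     f.write(soup.prettify())
class_of_text = "BNeawe iBp4i AP7Wnd"
# Look for a div with this class string, then a div with the same class string inside it.
price = soup.find("div", attrs={'class':class_of_text}).find("div", attrs={'class':class_of_text}).text

print("Here is the price of bitcoin:")
print(price)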

Explanation:

import requests
from bs4 import BeautifulSoup

cmc = requests.get("https://www.google.com/search?q=what+you+want+to+scrape+with+pluses+instead+of+spaces")
soup = BeautifulSoup(cmc.content, "html.parser")

with open("soup.txt", "w", encoding="utf-8") as f:
    f.write(soup.prettify())

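This writes a file named soup.txt containing the page's HTML. Open that file and find the text you want to scrape; Cmd/Ctrl+F will help you locate it. Then copy the class of the element that contains it. It should look something like this: BNeawe s3v9rd AP7Wnd. Store it in a variable. Then set a new variable equal to soup.find("div", attrs={'class':Your_First_Variable}).find("div", attrs={'class':Your_First_Variable}).text. That second variable will hold the scraped text.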
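The nested lookup described above raises an AttributeError as soon as either find() returns None, which is probably what happens when the class name is not in the page. Here is a minimal sketch of the same lookup with that case handled. The helper find_nested_text and the sample HTML (including the price in it) are made up for illustration; real Google markup differs and changes often.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the contents of soup.txt.
SAMPLE_HTML = """
<div class="BNeawe iBp4i AP7Wnd">
  <div class="BNeawe iBp4i AP7Wnd">42,000.00 United States Dollar</div>
</div>
"""


def find_nested_text(soup, class_string):
    """Return the text of the inner div (or the outer div if nothing is
    nested inside it), or None when no div has this class string."""
    outer = soup.find("div", attrs={"class": class_string})
    if outer is None:
        return None
    inner = outer.find("div", attrs={"class": class_string})
    target = inner if inner is not None else outer
    return target.get_text(strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
price = find_nested_text(soup, "BNeawe iBp4i AP7Wnd")
missing = find_nested_text(soup, "no such class")
print(price)
print(missing)
```

If the function returns None on the real page, the class string you copied is likely not in the HTML that requests received, so check soup.txt again.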

Once you have found the class in soup.txt, you can delete these lines:

with open("soup.txt", "w", encoding="utf-8") as f:
    f.write(soup.prettify())
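Coming back to the code in the question: find() returns only the first matching tag, so the for loop there walks over that one tag's children instead of a list of products. A rough sketch of the same idea with find_all() follows. The sample HTML and the extract_links helper are hypothetical and only mimic the structure the question's selectors expect; I have not checked the real page. If its results are filled in by JavaScript, requests will not see them and something like Selenium may be needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics what the question's selectors look for.
SAMPLE_HTML = """
<section class="results">
  <div class="col-item details"><div><a href="/product/1">One</a></div></div>
  <div class="col-item details"><div><a href="/product/2">Two</a></div></div>
</section>
"""


def extract_links(html):
    """Collect the first href found in every 'col-item details' div."""
    soup = BeautifulSoup(html, "html.parser")
    section = soup.find("section", {"class": "results"})
    if section is None:
        # The section is not in the HTML we received.
        return []
    links = []
    # find_all() returns every match; find() would return only the first.
    for item in section.find_all("div", {"class": "col-item details"}):
        anchor = item.find("a", href=True)
        if anchor is not None:
            links.append(anchor["href"])
    return links


links = extract_links(SAMPLE_HTML)
print(links)
```

An empty list from the real page would suggest the section or the product divs are missing from the downloaded HTML, which you can confirm by searching soup.txt.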

If you need further clarification or help, please leave a comment.

