Python BeautifulSoup: finding text in HTML

Problem description

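I am trying to run a Google search for a keyword, go through all the results, and check whether each page contains text such as "</div>", "<!DOCTYPE>", or something else. The HTML is being fetched, but my if statement always reports that the div does not exist on any site.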

Code:

from google import google
import urllib.request
from bs4 import BeautifulSoup


def google_scrape(url):
    thepage = urllib.request.urlopen(url)
    soup = BeautifulSoup(thepage, "html.parser")
    return soup.html

i = 1
query = 'תקליטן'
for url in google.search(query, 10):
    print("Trying : %s" % (url.link))
    try :
        html = google_scrape(url.link)
        if "</div>" in html:
            print("He have it")
        else:
            print("He doesnt have it")
    except Exception as e: print(e)
    #print(url.link)

Output:

Trying : https://www.youtube.com/?hl=iw&gl=IL    
He doesnt have it    
Trying : None

'NoneType' object has no attribute 'timeout'

Trying : https://he.wikipedia.org/wiki/%D7%99%D7%95%D7%98%D7%99%D7%95%D7%91    
He doesnt have it    
Trying : https://en.wikipedia.org/wiki/YouTube    
He doesnt have it    
Trying : https://www.facebook.com/youtube/

Tags: python, python-3.x, beautifulsoup, google-search

Solution
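
A likely cause, sketched here under assumptions rather than taken from an accepted answer: `google_scrape` returns `soup.html`, which is a BeautifulSoup `Tag`, not a string. The `in` operator on a `Tag` checks its direct children, so a substring test such as `"</div>" in html` never matches. The sample markup and the variable names below are hypothetical and only illustrate the difference; no network access is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page standing in for a fetched search result.
SAMPLE_HTML = "<!DOCTYPE html><html><body><div>hello</div></body></html>"

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# soup.html is a Tag. "in" on a Tag tests membership among its direct
# children, so a plain string like "</div>" is not found this way.
tag_check = "</div>" in soup.html

# Option 1: serialize the parsed document and search the resulting string.
string_check = "</div>" in str(soup)

# Option 2: ask the parser whether a <div> element exists at all.
element_check = soup.find("div") is not None

print(tag_check, string_check, element_check)
```

In the question's code, this would probably mean returning `str(soup)` (or the `soup` object itself, then calling `soup.find("div")`) from `google_scrape`. Separately, the `'NoneType' object has no attribute 'timeout'` line appears right after `Trying : None`, which suggests that some results have no link; skipping results where `url.link` is `None` before calling `urlopen` should avoid that error.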

