How to extract the href of a tag in Python

Problem description

I want to extract the href attribute from the tags that have a given class. How can I do this? Here is my code:

from bs4 import BeautifulSoup
import requests
from pprint import pprint
def search_manga(titolo):
    i = 0
    e = 0
    base_url = "https://beta.mangaeden.com/it/it-directory/?title="
    titolo = titolo.replace(" ", "+")
    url = base_url + titolo
    r = requests.get(url)
    soup = BeautifulSoup(r.content, "html.parser")
    manga_list = soup.find(id = 'mangaList')
    a_tag = manga_list.find_all(class_='openManga')
    print(a_tag)
    a_tag_array=[]
    for link in a_tag:
        link = a_tag.get('href')
        print(link)
manga_name = input("inserisci il nome del manga: ")
search_manga(manga_name)

This is the output:

AttributeError: ResultSet object has no attribute 'get'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?

How can I fix this?

Tags: python, html, web

Solution


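The error is raised because the loop calls a_tag.get('href') on the whole ResultSet returned by find_all(), instead of calling link.get('href') on each element. Looking at the HTML response for your query, there appears to be only one element with the class "openManga", so the code can be simplified as follows: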

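from bs4 import BeautifulSoup as BS
import requests


def getHref(title):
    with requests.Session() as sess:
        # Passing params lets requests URL-encode the title
        r = sess.get('https://beta.mangaeden.com/it/it-directory/',
                     params={'title': title})
        r.raise_for_status()  # raise an exception on an HTTP error status
        soup = BS(r.text, 'html.parser')
        a = soup.find_all('a', class_='openManga')
        if a:
            # Index the first match; .get() exists on a Tag, not on the ResultSet
            return a[0].get('href', None)
    # Falls through and returns None when no matching element is found


print(getHref('Berserk'))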
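
If a search returns more than one match and you want every link, not just the first, the original loop can be kept and the call moved to each element. Below is a minimal sketch of that idea. The SAMPLE_HTML string and the extract_hrefs name are made up for illustration and only stand in for the real page, whose exact markup is an assumption here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real search-results page.
SAMPLE_HTML = """
<table id="mangaList">
  <tr><td><a class="openManga" href="/it/it-manga/berserk/">Berserk</a></td></tr>
  <tr><td><a class="openManga" href="/it/it-manga/berserk-colored/">Berserk Colored</a></td></tr>
  <tr><td><a class="openManga">No href here</a></td></tr>
  <tr><td><a class="other" href="/ignored/">Ignored</a></td></tr>
</table>
"""


def extract_hrefs(html, css_class="openManga"):
    """Return the href of every <a> tag carrying css_class (illustrative helper)."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # find_all() returns a ResultSet; iterate over it and call .get() on each Tag
    for tag in soup.find_all("a", class_=css_class):
        href = tag.get("href")  # None when the attribute is missing
        if href:
            links.append(href)
    return links


hrefs = extract_hrefs(SAMPLE_HTML)
print(hrefs)
```

With a live page you would pass r.text from the request above in place of SAMPLE_HTML. The returned paths are relative, so they may need to be joined with the site's base URL before use.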
