How to convert text stored in a variable into a BeautifulSoup object in Python

Problem description

    for i in range(self.length):
        print(colored('Title', 'green', attrs=['bold']))
        print(self.url.entries[i].title)
        print(colored('Link', 'green', attrs=['bold']))
        print(self.url.entries[i].link)
        print(colored('Description', 'green', attrs=['bold']))
        soup = BeautifulSoup(self.url.entries[i].summary, 'html.parser')
        for s in soup.find_all('p'):
            print(s)

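In this code, I need to convert the text stored in self.url.entries[i].summary into a BeautifulSoup object for each description section of the RSS feed, so that only the <p> tags are printed. However, I can't find a way to do this.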

The text stored in one of the self.url.entries[i].summary values is:

    <img alt="APTOPIX Haiti Earthquake" src="https://i.cbc.ca/1.6143421.1629203330!/fileImage/httpImage/image.jpg_gen/derivatives/16x9_460/aptopix-haiti-earthquake.jpg" title="People affected by the Saturday" width="460" />                <p>Heavy rain from Tropical Storm Grace forced a temporary halt to the government's response to the deadly earthquake that battered the impoverished Caribbean nation on Saturday. </p>

Tags: python, html, beautifulsoup, rss

Solution


You can do it like this.

    from bs4 import BeautifulSoup

    my_text = ''' <img alt="APTOPIX Haiti Earthquake" src="https://i.cbc.ca/1.6143421.1629203330!/fileImage/httpImage/image.jpg_gen/derivatives/16x9_460/aptopix-haiti-earthquake.jpg" title="People affected by the Saturday" width="460" />                <p>Heavy rain from Tropical Storm Grace forced a temporary halt to the government's response to the deadly earthquake that battered the impoverished Caribbean nation on Saturday. </p>'''

    # Create a BeautifulSoup object. The 'lxml' parser requires the lxml
    # package to be installed; the built-in 'html.parser' also works here.
    soup = BeautifulSoup(my_text, 'lxml')
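
Once the string has been parsed, the <p> tags can be pulled out of the resulting object. The following is a minimal sketch of one way to do that, not necessarily what the answer's author had in mind. The helper name paragraph_texts, the shortened sample string, and the example.invalid image URL are assumptions made for illustration; it uses the built-in 'html.parser' so that no extra parser package is needed.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical stand-in for the text in entries[i].summary.
summary = (
    '<img alt="APTOPIX Haiti Earthquake" '
    'src="https://example.invalid/image.jpg" width="460" />'
    '                <p>Heavy rain from Tropical Storm Grace forced '
    'a temporary halt. </p>'
)


def paragraph_texts(html):
    """Return the stripped text of every <p> tag found in an HTML string."""
    # Any str can be passed straight to BeautifulSoup; no conversion is needed.
    soup = BeautifulSoup(html, 'html.parser')
    return [p.get_text(strip=True) for p in soup.find_all('p')]


# Print only the paragraph text, skipping the <img> tag.
for text in paragraph_texts(summary):
    print(text)
```

In the loop from the question, the same helper could presumably be called as paragraph_texts(self.url.entries[i].summary). Printing the tag itself, as the question's code does, shows the surrounding <p> markup, while get_text() returns only the text inside it.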
