python - BeautifulSoup: extracting one tag from an RSS feed
Question
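Using Python 3 and BeautifulSoup, I am trying to parse an RSS feed. Inside each <description> tag there is an <a> tag and an <img> tag.
I only want to get the href of the <a> tag and the src of the <img> tag.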
import requests
from bs4 import BeautifulSoup
from bs4 import CData
tp_api = "https://timesofindia.indiatimes.com/rssfeeds/-2128936835.cms"
response = requests.get(tp_api)
soup = BeautifulSoup(response.text, 'xml')
results = soup.find_all('item')
records = []
for result in results:
    # .string returns the raw description text, markup included
    main = result.find('description').string
    images = main
    print(main)
The response I get:
<a href="https://timesofindia.indiatimes.com/india/maharashtra-congress-demands-complete-loan-waiver-for-flood-hit-farmers/articleshow/70675961.cms"><img border="0" hspace="10" align="left" style="margin-top:3px;margin-right:5px;" src="https://timesofindia.indiatimes.com/photo/70675961.cms" /></a>The Congress on Wednesday sought a complete loan waiver for farmers affected by floods in Maharashtra and demanded that the state government provide them an assistance of Rs 60,000 per hectare of crop damage.
Answer
import requests
from bs4 import BeautifulSoup
from bs4 import CData
tp_api = "https://timesofindia.indiatimes.com/rssfeeds/-2128936835.cms"
response = requests.get(tp_api)
soup = BeautifulSoup(response.text, 'html.parser')
results = soup.find_all('item')
records = []
for result in results:
    # Parse the description text a second time so its embedded HTML becomes tags
    main = BeautifulSoup(result.find('description').string, 'html.parser')
    a_tag = main.find('a')
    images = a_tag
    print(a_tag)
Output:
<a href="https://timesofindia.indiatimes.com/india/delhi-hc-stays-jnu-inquiry-against-teachers-for-participating-in-protest/articleshow/70676842.cms"><img align="left" border="0" hspace="10" src="https://timesofindia.indiatimes.com/photo/70676842.cms" style="margin-top:3px;margin-right:5px;"/></a>
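The answer above prints the whole <a> tag, while the question asks for the link's href and the image's src. A minimal sketch of that last step, assuming each description holds at most one <a> and one <img>. The helper name extract_link_and_image is hypothetical, and the sample string is a shortened copy of the response shown in the question, so no network request is needed:

```python
from bs4 import BeautifulSoup


def extract_link_and_image(description_html):
    """Return (href, src) from a description fragment, or None for a missing part.

    Hypothetical helper for illustration; not part of the original answer.
    """
    fragment = BeautifulSoup(description_html, "html.parser")
    a_tag = fragment.find("a")
    img_tag = fragment.find("img")
    # Tag.get returns None instead of raising when the attribute is absent
    href = a_tag.get("href") if a_tag else None
    src = img_tag.get("src") if img_tag else None
    return href, src


# Shortened copy of the description text shown in the question
sample = (
    '<a href="https://timesofindia.indiatimes.com/india/'
    'maharashtra-congress-demands-complete-loan-waiver-for-flood-hit-farmers/'
    'articleshow/70675961.cms">'
    '<img border="0" hspace="10" align="left" '
    'src="https://timesofindia.indiatimes.com/photo/70675961.cms" /></a>'
    'The Congress on Wednesday sought a complete loan waiver.'
)

href, src = extract_link_and_image(sample)
print(href)
print(src)
```

Inside the loop from the answer, the same helper could be called on result.find('description').string. Returning None for a missing tag keeps items without an image from raising an exception.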