How to remove a div from a BeautifulSoup filter result

Problem description

I am trying to remove a div, selected by class, from a BeautifulSoup result, as shown below:

response = requests.get(url)
# success
cnbeta_article_content = BeautifulSoup(response.content, "html.parser").find("div", {"class": "cnbeta-article-body"})
# failed
removed_share_content = BeautifulSoup(cnbeta_article_content, "html.parse").find("div", {"class": "article-share-code"}).decompose()
result_text = removed_share_content.prettify()
return result_text

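I first get the div with class cnbeta-article-body, then try to delete the div with class article-share-code from that filtered result, but it does not seem to work. What should I do to fix it? The URL is: https://www.cnbeta.com/articles/tech/1097507.htm

Tags: python-3.x

Solution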


The HTML of the div with class cnbeta-article-body is as follows:


<div class="cnbeta-article-body">
<div class="article-summary">
<div class="topic"><a href="https://www.cnbeta.com/topics/741.htm" target="_blank"><img src="https://static.cnbetacdn.com/topics/9a78aa447fb90ef.png" title="手机 - OnePlus 一加"/></a></div>
<p>一加即将发布年度旗舰一加9系列,随着新旗舰的到来,一加8系列机型价格开始下调。今天,一加宣布,<strong>一加8 Pro最高优惠1000元,起售价只要4599元,支持24期免息分期</strong>,提供青空、黑镜、蓝调三种配色。</p> </div>
<div class="article-content" id="artibody">
<div class="article-global"><p><strong>访问:</strong></p><p><a href="https://click.aliyun.com/m/1000245338/" target="_blank"><strong><span style="color: rgb(192, 0, 0);">2021阿里云上云采购季:采购补贴、充值返券、爆款抢先购……</span></strong></a></p></div> <div class="article-topic"><p>
<strong>访问购买页面:</strong>
</p>
<p>
<a href="https://c.duomai.com/track.php?site_id=242986&amp;aid=942&amp;euid=&amp;t=http%3A%2F%2Fwww.oneplus.com" target="_blank">一加自营旗舰店</a>
</p></div><p style="text-align:center"><img src="https://static.cnbetacdn.com/article/2021/0304/8158fddd4c92c53.jpg"/></p><p style="text-align: left;">一加 8 Pro最大的看点之一是屏幕,<strong>其屏幕尺寸为6.78英寸,分辨率为2K+,刷新率为120Hz,触控采样率为240Hz,被称之为“屏幕机皇”。</strong></p><p style="text-align: left;">DisplayMate评价一加8 Pro:<strong>教科书般完<a data-link="1" href="https://c.duomai.com/track.php?site_id=242986&amp;euid=&amp;t=https://mideajiadian.jd.com/" target="_blank">美的</a>校准精度和性能表现</strong>,创造13项智能<a data-link="1" href="https://c.duomai.com/track.php?site_id=242986&amp;euid=&amp;t=https://shouji.jd.com/" target="_blank">手机</a>显示记录。</p><p style="text-align: left;">规格方面,一加8 Pro搭载高通骁龙865旗舰平台,前置1600万像素,后置4800万超清四摄,电池容量为4510mAh,支持30W Warp无线闪充、Warp闪充30T有线充电。</p><p style="text-align: left;">此外,一加8T全面现货发售,起售价3399元,一加8降至3299元。</p><p style="text-align: center;"><a href="https://static.cnbetacdn.com/article/2021/0304/cfdb4208167012e.jpg" target="_blank"><img src="https://static.cnbetacdn.com/thumb/article/2021/0304/cfdb4208167012e.jpg"/></a><a href="https://static.cnbetacdn.com/article/2021/0304/e7bb8bbd2e5b913.jpg" target="_blank"><img src="https://static.cnbetacdn.com/thumb/article/2021/0304/e7bb8bbd2e5b913.jpg"/></a><a href="https://static.cnbetacdn.com/article/2021/0304/2e6cad84f505f43.jpg" target="_blank"><img src="https://static.cnbetacdn.com/thumb/article/2021/0304/2e6cad84f505f43.jpg"/></a><a href="https://static.cnbetacdn.com/article/2021/0304/b77d443a3761049.jpg" target="_blank"><img src="https://static.cnbetacdn.com/thumb/article/2021/0304/b77d443a3761049.jpg"/></a><a href="https://static.cnbetacdn.com/article/2021/0304/ea9ebd51f33109f.jpg" target="_blank"><img src="https://static.cnbetacdn.com/thumb/article/2021/0304/ea9ebd51f33109f.jpg"/></a><a href="https://static.cnbetacdn.com/article/2021/0304/db97040429b984a.jpg" target="_blank"><img src="https://static.cnbetacdn.com/thumb/article/2021/0304/db97040429b984a.jpg"/></a><a href="https://static.cnbetacdn.com/article/2021/0304/6a9943e38585768.jpg" target="_blank"><img src="https://static.cnbetacdn.com/thumb/article/2021/0304/6a9943e38585768.jpg"/></a><a href="https://static.cnbetacdn.com/article/2021/0304/06ceef5e21085e6.jpg" target="_blank"><img src="https://static.cnbetacdn.com/thumb/article/2021/0304/06ceef5e21085e6.jpg"/></a><a href="https://static.cnbetacdn.com/article/2021/0304/c052adf0ce81f58.jpg" target="_blank"><img src="https://static.cnbetacdn.com/thumb/article/2021/0304/c052adf0ce81f58.jpg"/></a><a href="https://static.cnbetacdn.com/article/2021/0304/5e3036dd27cbd45.jpg" target="_blank"><img src="https://static.cnbetacdn.com/thumb/article/2021/0304/5e3036dd27cbd45.jpg"/></a></p> </div>
<div class="tac">
<div class="tal cbv"><script type="text/javascript"><!--
google_ad_client = "ca-pub-3507708728694406";
/* cnBeta.COM 文章页文末通栏 #1 */
google_ad_slot = "1385693419";
google_ad_width = 810;
google_ad_height = 100;
//-->
</script>
<script src="//pagead2.googlesyndication.com/pagead/show_ads.js" type="text/javascript">
</script></div>
<div class="tal cbv">
<a href="https://click.aliyun.com/m/1000245337/" target="_blank"><img src="https://static.cnbetacdn.com/article/2021/03/7bcc0f26b07694b.jpg"/></a>
</div>
<div class="tal cbv">
<script type="text/javascript"><!--
google_ad_client = "ca-pub-3507708728694406";
/* cnBeta.COM 文章页文末通栏 #2 */
google_ad_slot = "8489727379";
google_ad_width = 810;
google_ad_height = 100;
//-->
</script>
<script src="//pagead2.googlesyndication.com/pagead/show_ads.js" type="text/javascript">
</script>
</div>
<div class="cbv810">
<div class="left500"><script type="text/javascript">
        (function() {
            var s = "_" + Math.random().toString(36).slice(2);
            document.write('<div style="" id="' + s + '"></div>');
            (window.slotbydup = window.slotbydup || []).push({
                id: "u4395341",
                container: s
            });
        })();
</script><script async="async" defer="defer" src="//cpro.baidustatic.com/cpro/ui/c.js" type="text/javascript">
</script>
</div>
<div class="right300"><script type="text/javascript"><!--
google_ad_client = "ca-pub-3507708728694406";
/* cnBeta.COM V5 文章页文末画中画 #2 */
google_ad_slot = "5755245019";
google_ad_width = 300;
google_ad_height = 250;
//-->
</script>
<script src="//pagead2.googlesyndication.com/pagead/show_ads.js" type="text/javascript">
</script></div>
</div> </div>
<div class="article-share-code">
<div class="share-unit"><div class="share-btns bdsharebuttonbox"><a class="bds_tsina share-btn weibo" data-cmd="tsina" href="#" title="分享到新浪微博">新浪微博</a><a class="bds_qzone share-btn qzone" data-cmd="qzone" href="#" title="分享到QQ空间">QQ空间</a><a class="bds_tqq share-btn tqq" data-cmd="tqq" href="#" title="分享到腾讯微博">腾讯微博</a><a class="bds_sqq share-btn sqq" data-cmd="sqq" href="#" title="分享到QQ好友">QQ好友</a><a class="bds_weixin share-btn weixin" data-cmd="weixin" href="#" title="分享到微信">微信</a><a class="bds_douban share-btn douban" data-cmd="douban" href="#" title="分享到豆瓣网">豆瓣网</a><a class="bds_youdao share-btn youdao" data-cmd="youdao" href="#" title="分享到有道云笔记">有道云笔记</a><a class="bds_tieba share-btn tieba" data-cmd="tieba" href="#" title="分享到百度贴吧">百度贴吧</a><a class="bds_linkedin share-btn linkedin" data-cmd="linkedin" href="#" title="分享到linkedin">Linkedin</a><div class="more"></div></div></div>
<label><img src="//static.cnbetacdn.com/share/r2.gif"/></label>
</div>
<div class="article-global"></div> </div>

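Notice that the div with class article-share-code is a child node of the div with class cnbeta-article-body. If you delete a parent node, all of its child nodes are deleted along with it.

So if you run the following code, the child nodes are deleted as well: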

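import requests
from bs4 import BeautifulSoup
res = requests.get("https://www.cnbeta.com/articles/tech/1097507.htm")
soup = BeautifulSoup(res.text, "html.parser")
# Removes the whole article body, including the nested article-share-code div
soup.find("div", {"class": "cnbeta-article-body"}).decompose()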
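The parent/child behaviour can be checked offline on a small fragment. The sketch below assumes a hand-written HTML string as a simplified stand-in for the real page; it is not the actual cnBeta markup, only the two class names are taken from it.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the page structure shown above.
html = """
<div class="cnbeta-article-body">
  <div class="article-content">text</div>
  <div class="article-share-code">share buttons</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Removing the parent takes every descendant with it.
soup.find("div", {"class": "cnbeta-article-body"}).decompose()

# Neither the parent nor the nested share div can be found afterwards.
parent_left = soup.find("div", {"class": "cnbeta-article-body"})
child_left = soup.find("div", {"class": "article-share-code"})
print(parent_left, child_left)
```

Both lookups come back as None, because find returns None when nothing matches.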

To delete only the div with class article-share-code, use the following code:

soup.find("div", {"class": "article-share-code"}).decompose()
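Applied to the code in the question, two details seem to matter. First, decompose() modifies the tree in place and returns None, so calling prettify() on its result fails. Second, the Tag returned by the first find can be searched directly; it does not need to be passed to BeautifulSoup again. The following is a minimal sketch under those assumptions. The helper name extract_article_body and the sample HTML are made up for illustration; only the class names come from the question.

```python
from bs4 import BeautifulSoup


def extract_article_body(html):
    """Return the article body markup without the share block.

    Hypothetical helper for illustration; class names come from the question.
    """
    soup = BeautifulSoup(html, "html.parser")
    body = soup.find("div", {"class": "cnbeta-article-body"})
    if body is None:
        return None

    # Search inside the Tag directly instead of re-parsing it.
    share = body.find("div", {"class": "article-share-code"})
    if share is not None:
        # decompose() edits the tree in place and returns None,
        # so its result must not be assigned or chained.
        share.decompose()

    return body.prettify()


# Made-up sample markup standing in for the downloaded page.
sample = """
<div class="cnbeta-article-body">
  <div class="article-content"><p>Article text</p></div>
  <div class="article-share-code"><a href="#">share</a></div>
</div>
"""
result = extract_article_body(sample)
print(result)
```

With the live page, the same helper could be called with response.content from requests.get(url), as in the question.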
