Regex matches the whole string instead of the first occurrence in Python

Problem description

Hi, I have a string:
http://www.yifysubtitles.com/subtitles/blockers2018720pwebripx264-ytsam-arabic-128849"><span class="text-muted">subtitle</span> Blockers.2018.720p.WEBRip.x264-[YTS.AM]</a></td><td class="other-cell"></td><td class="uploader-cell"><a href="/user/SHINAWY">SHINAWY</a></td><td class="download-cell"><a href="/subtitles/blockers-arabic-yify-128849" class="subtitle-download" >download</a></td></tr><tr data-id="128835"><td class="rating-cell"><span class="label">0</span></td><td class="flag-cell"><span class="flag flag-cn"></span><span class="sub-lang">Chinese</span></td><td><a href="/subtitles/blockers2018720pblurayx264-ytsmecht-chinese-128835"><span class="text-muted">subtitle</span> Blockers.2018.720p.BluRay.x264-[YTS.ME].cht </a></td><td class="other-cell"></td><td class="uploader-cell"><a href="/user/osamawang">osamawang</a></td><td class="download-cell"><a href="/subtitles/blockers-chinese-yify-128835" class="subtitle-download" >download</a></td></tr><tr data-id="128543" class="high-rating"><td class="rating-cell"><span class="label label-success">6</span></td><td class="flag-cell"><span class="flag flag-gb"></span><span class="sub-lang">English</span></td><td><a href="/subtitles/blockers2018web-dlx264-fgt-english-128543"><span class="text-muted">subtitle</span> Blockers.2018.WEB-DL.x264-FGT</a></td><td class="other-cell"></td><td class="uploader-cell"><a href="/user/sub">sub</a></td><td class="download-cell"><a href="/subtitles/blockers-english-yify-128543" class="subtitle-download" >download</a></td></tr><tr data-id="128633"><td class="rating-cell"><span class="label">0</span></td><td class="flag-cell"><span class="flag flag-rs"></span><span class="sub-lang">Serbian</span></td><td><a href="/subtitles/blockers2018720pblurayx264ytsag-serbian-128633"><span class="text-muted">subtitle</span> Blockers.2018.720p.BluRay.x264.[YTS.AG]</a></td><td class="other-cell"></td><td class="uploader-cell"><a href="/user/TesneGace">TesneGace</a></td><td 
class="download-cell"><a href="/subtitles/blockers-serbian-yify-128633" class="subtitle-download" >download</a></td></tr><tr data-id="128702"><td class="rating-cell"><span class="label label-success">2</span></td><td class="flag-cell"><span class="flag flag-es"></span><span class="sub-lang">Spanish</span></td><td><a href="/subtitles/blockers2018720pblurayx264ytsag-spanish-128702"><span class="text-muted">subtitle</span> Blockers.2018.720p.BluRay.x264.[YTS.AG]</a></td><td class="other-cell"></td><td class="uploader-cell"><a href="/subtitles/blockers-english-yify-128543

I am trying to match the first occurrence of the English yify link, /subtitles/blockers-english-yify-128543

My pattern is re.search(r'/subtitles/.+\-english\-yify-\d+', text)

But my code returns the whole string instead of just that link. Please help.

My regex is available here

Tags: python, regex, string

Solution


Your string is actually HTML, so you should use an HTML parser instead of a regular expression. I recommend the excellent lxml.html parser.
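As a rough illustration of the parser approach, here is a minimal sketch. It uses the standard library's html.parser as a stand-in for lxml.html so that it runs without installing anything; the class name DownloadLinkParser, the variable names, and the shortened sample markup are assumptions made for this example, not part of the original question.

```python
from html.parser import HTMLParser


class DownloadLinkParser(HTMLParser):
    """Collect the href of every <a class="subtitle-download"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("class") == "subtitle-download":
            href = attr_map.get("href")
            if href:
                self.links.append(href)


# Abbreviated sample modelled on the rows in the question, not the full page.
html_text = (
    '<table>'
    '<tr><td class="download-cell"><a href="/subtitles/blockers-arabic-yify-128849" '
    'class="subtitle-download" >download</a></td></tr>'
    '<tr><td class="download-cell"><a href="/subtitles/blockers-chinese-yify-128835" '
    'class="subtitle-download" >download</a></td></tr>'
    '<tr><td class="download-cell"><a href="/subtitles/blockers-english-yify-128543" '
    'class="subtitle-download" >download</a></td></tr>'
    '</table>'
)

parser = DownloadLinkParser()
parser.feed(html_text)
parser.close()

# Keep only the English download links and take the first one, if any.
english_links = [href for href in parser.links if "-english-yify-" in href]
first_english = english_links[0] if english_links else None
print(first_english)
```

With lxml installed, roughly the same lookup could be written as lxml.html.fromstring(html_text).xpath('//a[@class="subtitle-download"]/@href'), followed by the same filtering step.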

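To answer your question: regular expression quantifiers are greedy by default, which means the .+ part consumes as many characters as it can while still letting the rest of the pattern match. As a result, the match runs from the first /subtitles/ to the last -english-yify- and includes everything in between.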
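The effect of greediness can be seen on a small sample. The sketch below is only an illustration under stated assumptions: the text is a shortened stand-in for the markup in the question, and the bounded pattern is one possible fix rather than the only one. Note that simply switching to the lazy quantifier .+? is not enough here, because the search still starts at the first /subtitles/ and then extends until an English link is reached; restricting the characters allowed between the two literals keeps the match inside a single href.

```python
import re

# Abbreviated sample modelled on the markup in the question, not the full page.
text = (
    '<a href="/subtitles/blockers-arabic-yify-128849" class="subtitle-download" >download</a>'
    '<a href="/subtitles/blockers-chinese-yify-128835" class="subtitle-download" >download</a>'
    '<a href="/subtitles/blockers-english-yify-128543" class="subtitle-download" >download</a>'
    '<a href="/subtitles/blockers-english-yify-128543'
)

# Greedy: runs from the first /subtitles/ to the LAST -english-yify-<digits>.
greedy = re.search(r'/subtitles/.+-english-yify-\d+', text)

# Lazy: still starts at the first /subtitles/, but stops at the FIRST
# -english-yify-<digits>, so it still swallows the Arabic and Chinese links.
lazy = re.search(r'/subtitles/.+?-english-yify-\d+', text)

# Bounded: the slug may not contain a quote or a slash, so the match
# cannot cross from one href into the next.
bounded = re.search(r'/subtitles/[^"/]+?-english-yify-\d+', text)

print(len(greedy.group()), len(lazy.group()))
print(bounded.group())
```

The character class [^"/] is a design choice for this particular markup, where every link is enclosed in double quotes; for other inputs a different boundary, such as [\w-], may be a better fit.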

