Scraping the path of an m3u8 file

Problem description

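I am currently trying to scrape a unique value from the m3u8 URL path of an embedded video, as a self-study exercise. Every embedded video on the site shares the same URL path except for this unique value.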

For example, on the page https://headlines.yahoo.co.jp/videonews/ann?a=20190526-00000026-ann-int, I can find the m3u8 path through the Network tab of the inspector:

https://gw-yvpub.c.yimg.jp/v1/hls/CFukHuaO2W13gxbJ/video.m3u8

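The unique value here is CFukHuaO2W13gxbJ. However, for the life of me I cannot find this value anywhere in the page source or anywhere else in the inspector tabs. Is it possible to find this URL in the page source, or to find where it is generated?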

Side note: before the request for the m3u8 file is made, a request is made to this blob URL:

blob:https://s.yimg.jp/f23ed5ca-7a95-4409-bf66-c26c577157d2

Thanks in advance for any guidance!

Tags: javascript, web-scraping, blob, http-live-streaming, m3u8

Solution


The m3u8 URLs are present in the response to a request made to this URL:

https://feapi-yvpub.yahooapis.jp/v1/content/1576087?appid=dj0zaiZpPVZMTVFJR0FwZWpiMyZzPWNvbnN1bWVyc2VjcmV0Jng9YjU-&output=json&space_id=2078710316&domain=headlines.yahoo.co.jp&ak=044ddff76151606c2d97ada9daa3ea45&device_type=1100&thumb_width=1204&thumb_height=676&thumb_priority=l&thumb_bd=0
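
As a rough sketch of how the m3u8 URLs might be pulled out of that JSON response once it has been fetched and parsed: the structure of the response is not shown here, so the hypothetical helper `collectM3u8Urls` below walks the whole object instead of assuming any field names. The `sample` object is made up purely for illustration.

```javascript
// Recursively collect every string that ends in ".m3u8" (optionally followed
// by a query string) from an already-parsed JSON value.
function collectM3u8Urls(value, found = []) {
  if (typeof value === "string") {
    if (/\.m3u8(\?.*)?$/.test(value)) found.push(value);
  } else if (Array.isArray(value)) {
    for (const item of value) collectM3u8Urls(item, found);
  } else if (value !== null && typeof value === "object") {
    for (const key of Object.keys(value)) collectM3u8Urls(value[key], found);
  }
  return found;
}

// Hypothetical response shape; the real field names are an assumption.
const sample = {
  items: [
    {
      video: {
        streams: [
          { url: "https://gw-yvpub.c.yimg.jp/v1/hls/CFukHuaO2W13gxbJ/video.m3u8" },
        ],
      },
    },
  ],
};

const urls = collectM3u8Urls(sample);

// Pull the unique path segment (the part between "/hls/" and the next "/")
// out of each URL; entries that do not match yield undefined.
const ids = urls.map((u) => (u.match(/\/hls\/([^/]+)\//) || [])[1]);

console.log(urls, ids);
```

With a real response, the parsed body would take the place of `sample`.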

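Some of the values in that URL appear in the source of the page you linked. The space ID (2078710316), for example, is set here, though note that the appID in this snippet differs from the appid in the request above: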

<script type="text/javascript">
YAHOO.JP.srch.dlink.onLoad(function(sl) {
    sl.setParams({"serviceCode":"nws","appID":"dj0zaiZpPWlzQ3RiOHo1cGxBNSZzPWNvbnN1bWVyc2VjcmV0Jng9ODQ-","articleID":"20190526-00000026-ann","category":null,"mediaID":"ann","spaceID":2078710316,"linkCount":"5","launchAfterDocLoad":false});
});
</script>

The content ID (1576087 in the request above) can also be seen in the page source, for example in the player's embed script tag:

<script type="text/javascript" class="yvpub-player" src="https://s.yimg.jp/images/yvpub/player/js/embed.js?contentid=1576087&amp;width=602&amp;height=338&amp;propertyname=jp_news&amp;spaceid=2078710316&amp;repeat=0&amp;recommend=0&amp;autostart=1" data-composed="1"></script>
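
A minimal sketch of how these pieces could be combined, assuming the page HTML has already been downloaded as a string. The helpers `extractEmbedParams` and `buildContentApiUrl` are hypothetical names; the `appid`, `ak`, and remaining query parameters are simply copied from the request observed above, and whether they stay valid across requests is unknown.

```javascript
// Extract the content ID and space ID from the yvpub embed script tag found
// in raw page HTML. Returns null if no such tag is present.
function extractEmbedParams(html) {
  const match = html.match(
    /src="(https:\/\/s\.yimg\.jp\/images\/yvpub\/player\/js\/embed\.js\?[^"]*)"/
  );
  if (!match) return null;
  // In page source the parameter separators are HTML-escaped as "&amp;".
  const src = match[1].replace(/&amp;/g, "&");
  const params = new URL(src).searchParams;
  return { contentId: params.get("contentid"), spaceId: params.get("spaceid") };
}

// Rebuild the content API URL with the same parameters, in the same order,
// as the observed request.
function buildContentApiUrl({ contentId, spaceId, appid, ak, domain }) {
  const url = new URL(
    `https://feapi-yvpub.yahooapis.jp/v1/content/${encodeURIComponent(contentId)}`
  );
  url.search = new URLSearchParams({
    appid,
    output: "json",
    space_id: spaceId,
    domain,
    ak,
    device_type: "1100",
    thumb_width: "1204",
    thumb_height: "676",
    thumb_priority: "l",
    thumb_bd: "0",
  }).toString();
  return url.toString();
}

// The embed tag as it appears in the page source.
const html =
  '<script type="text/javascript" class="yvpub-player" src="https://s.yimg.jp/images/yvpub/player/js/embed.js?contentid=1576087&amp;width=602&amp;height=338&amp;propertyname=jp_news&amp;spaceid=2078710316&amp;repeat=0&amp;recommend=0&amp;autostart=1" data-composed="1"></script>';

const embed = extractEmbedParams(html);
const apiUrl = buildContentApiUrl({
  ...embed,
  // Copied from the observed request; reuse across requests is an assumption.
  appid: "dj0zaiZpPVZMTVFJR0FwZWpiMyZzPWNvbnN1bWVyc2VjcmV0Jng9YjU-",
  ak: "044ddff76151606c2d97ada9daa3ea45",
  domain: "headlines.yahoo.co.jp",
});

console.log(embed, apiUrl);
```

Fetching `apiUrl` and parsing the JSON body would then be the next step, provided the access key is still accepted.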

The ak value, 044ddff76151606c2d97ada9daa3ea45, is an access key, I think. I am not sure whether it can be reused across requests, so check the API documentation if there is any. Judging by its length (32 hexadecimal characters), it looks like a random hash, and that could pose problems.

