Youtube-dl --dump-json returns different extractor output for a playlist when imported from Python vs. called from the CLI

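Problem description

I can't reproduce the exact output of the `--flat-playlist -j` and `--flat-playlist -J` arguments given to the youtube-dl CLI when I use a direct Python call instead.

Maybe I haven't worked out the exact options I need to supply, or am I using the wrong function/method entirely?

I like this command because it is fast and gives me only the YouTube IDs without digging into each video (as a simple benchmark, I can do 4000 videos in about 30 seconds, versus 70 minutes for the same 4000 through regular info extraction).

The CLI version is below. Note that the CLI somehow iterates over the entries, whereas the direct method call does not...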

[root@tjds temp]# /usr/local/bin/youtube-dl --flat-playlist -J  https://www.youtube.com/channel/UC6RNSPFcqY4BblL2Jg9SUtw 
{"extractor": "youtube:playlist", "_type": "playlist", "uploader": "Patrick Best", "entries": [{"url": "IaNEfhPmhPM", "_type": "url", "ie_key": "Youtube", "id": "IaNEfhPmhPM", "title": "Rona in Aurora - the death of a big box"}, {"url": "JN-2vTf8-WM", "_type": "url", "ie_key": "Youtube", "id": "JN-2vTf8-WM", "title": "RealWorldNumbers - How to Test a Raspberry Pi Powered by Alkaline AA Batteries - After Poweroff"}, {"url": "NmQmM36ja3o", "_type": "url", "ie_key": "Youtube", "id": "NmQmM36ja3o", "title": "RealWorldNumbers - How to Test a Raspberry Pi Powered by Alkaline AA Batteries - During the Test"}, {"url": "MsrgW1Rdlso", "_type": "url", "ie_key": "Youtube", "id": "MsrgW1Rdlso", "title": "Raspberry Pi powered by AA batteries direct"}], "id": "UU6RNSPFcqY4BblL2Jg9SUtw", "title": "Uploads from Patrick Best", "extractor_key": "YoutubePlaylist", "uploader_id": "patrickscottbest", "uploader_url": "https://www.youtube.com/user/patrickscottbest", "webpage_url": "https://www.youtube.com/playlist?list=UU6RNSPFcqY4BblL2Jg9SUtw", "webpage_url_basename": "playlist"}



[root@tjds temp]# /usr/local/bin/youtube-dl --flat-playlist -j  https://www.youtube.com/channel/UC6RNSPFcqY4BblL2Jg9SUtw 
{"url": "IaNEfhPmhPM", "_type": "url", "ie_key": "Youtube", "id": "IaNEfhPmhPM", "title": "Rona in Aurora - the death of a big box"}
{"url": "JN-2vTf8-WM", "_type": "url", "ie_key": "Youtube", "id": "JN-2vTf8-WM", "title": "RealWorldNumbers - How to Test a Raspberry Pi Powered by Alkaline AA Batteries - After Poweroff"}
{"url": "NmQmM36ja3o", "_type": "url", "ie_key": "Youtube", "id": "NmQmM36ja3o", "title": "RealWorldNumbers - How to Test a Raspberry Pi Powered by Alkaline AA Batteries - During the Test"}
{"url": "MsrgW1Rdlso", "_type": "url", "ie_key": "Youtube", "id": "MsrgW1Rdlso", "title": "Raspberry Pi powered by AA batteries direct"}
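
Since the goal here is only the video IDs, each line of the `-j` output can be parsed on its own. A minimal sketch, assuming the output has already been captured as a string; `ids_from_json_lines` is a hypothetical helper name, not part of youtube-dl:

```python
import json


def ids_from_json_lines(text):
    """Return the 'id' of every JSON object that sits on its own line in text."""
    ids = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith('{'):
            continue  # skip blank lines and any non-JSON log output
        ids.append(json.loads(line)['id'])
    return ids
```

Fed the four lines above, this should give back the four IDs in the order they were printed.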

Here is my Python script:

        import youtube_dl

        # Same channel URL as in the CLI calls above
        providedURL = 'https://www.youtube.com/channel/UC6RNSPFcqY4BblL2Jg9SUtw'
        ydl_opts = {
            'extract_flat': True,  # --flat-playlist according to options.py
            'dumpjson': True,  # lowercase -j
            # 'dump_single_json': True,  # uppercase -J
        }
        with youtube_dl.YoutubeDL(ydl_opts) as ydl:
            object = ydl.extract_info(providedURL, download=False)
            print(object)
            print(len(object))

The result seems to show only the channel information, without iterating:

[youtube:channel] UC6RNSPFcqY4BblL2Jg9SUtw: Downloading channel page
{'_type': 'url', 'url': 'https://www.youtube.com/playlist?list=UU6RNSPFcqY4BblL2Jg9SUtw', 'ie_key': 'YoutubePlaylist', 'extractor': 'youtube:channel', 'webpage_url': 'https://www.youtube.com/channel/UC6RNSPFcqY4BblL2Jg9SUtw', 'webpage_url_basename': 'UC6RNSPFcqY4BblL2Jg9SUtw', 'extractor_key': 'YoutubeChannel'}

7

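Can anyone point me in the right direction or tell me the best way to accomplish this quick task? Is there a template component I need to address? Do I need to process the response before it is ready to display?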

>> print (youtube_dl.version.unicode_literals)
_Feature((2, 6, 0, 'alpha', 2), (3, 0, 0, 'alpha', 0), 131072)

python --version
Python 3.6.5

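Update:

As suggested, using the returned 'url' value seems to be a good enough workaround to continue my work. I still don't know why I'm not seeing the same result as with the CLI... There seems to be different logic there that automatically loops through the channel's list without having to run the command a second time.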

>>> with youtube_dl.YoutubeDL(ydl_opts) as ydl:
...     object = ydl.extract_info(providedURL, download=False)
...     print (len(object))
...     object2 = ydl.extract_info(object['url'], download=False)
...     print (len(object2))
...     print (object2)
... 
[youtube:channel] UC6RNSPFcqY4BblL2Jg9SUtw: Downloading channel page
7   


[download] Downloading playlist: Uploads from Patrick Best
[youtube:playlist] playlist Uploads from Patrick Best: Downloading 4 videos
[download] Downloading video 1 of 4
[download] Downloading video 2 of 4
[download] Downloading video 3 of 4
[download] Downloading video 4 of 4
[download] Finished downloading playlist: Uploads from Patrick Best
11
{'_type': 'playlist', 'entries': [{'_type': 'url', 'url': 'IaNEfhPmhPM', 'ie_key': 'Youtube', 'id': 'IaNEfhPmhPM', 'title': 'Rona in Aurora - the death of a big box'}, {'_type': 'url', 'url': 'JN-2vTf8-WM', 'ie_key': 'Youtube', 'id': 'JN-2vTf8-WM', 'title': 'RealWorldNumbers - How to Test a Raspberry Pi Powered by Alkaline AA Batteries - After Poweroff'}, {'_type': 'url', 'url': 'NmQmM36ja3o', 'ie_key': 'Youtube', 'id': 'NmQmM36ja3o', 'title': 'RealWorldNumbers - How to Test a Raspberry Pi Powered by Alkaline AA Batteries - During the Test'}, {'_type': 'url', 'url': 'MsrgW1Rdlso', 'ie_key': 'Youtube', 'id': 'MsrgW1Rdlso', 'title': 'Raspberry Pi powered by AA batteries direct'}], 'id': 'UU6RNSPFcqY4BblL2Jg9SUtw', 'title': 'Uploads from Patrick Best', 'uploader': 'Patrick Best', 'uploader_id': 'patrickscottbest', 'uploader_url': 'https://www.youtube.com/user/patrickscottbest', 'extractor': 'youtube:playlist', 'webpage_url': 'https://www.youtube.com/playlist?list=UU6RNSPFcqY4BblL2Jg9SUtw', 'webpage_url_basename': 'playlist', 'extractor_key': 'YoutubePlaylist'}
>>> 
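
The two-step workaround above can be wrapped in a small helper that keeps following `'_type': 'url'` results until something else (here, the playlist) comes back. This is only a sketch: `resolve_flat`, `extract_with_youtube_dl`, and the `max_hops` limit are hypothetical names introduced for illustration, and the extraction function is passed in so the loop can be tried without network access.

```python
def resolve_flat(url, extract, max_hops=5):
    """Follow '_type': 'url' results until a non-URL result is returned.

    extract is any callable that maps a URL to an info dict.
    max_hops guards against a result that keeps pointing at itself.
    """
    info = extract(url)
    hops = 0
    while info.get('_type') == 'url' and hops < max_hops:
        info = extract(info['url'])
        hops += 1
    return info


def extract_with_youtube_dl(url):
    """Single flat extraction, using the same options as the script above."""
    import youtube_dl  # imported lazily so resolve_flat works without the library
    with youtube_dl.YoutubeDL({'extract_flat': True}) as ydl:
        return ydl.extract_info(url, download=False)
```

With these, `resolve_flat(providedURL, extract_with_youtube_dl)` should end at the same playlist dictionary as `object2` in the transcript.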

Tags: python, youtube-dl

Solution


I believe that if you mean a playlist, you should pass your URL like this:

url = 'https://www.youtube.com/playlist?list=UU6RNSPFcqY4BblL2Jg9SUtw'

You should then be able to print the object and iterate over each dictionary, for example:

object = ydl.extract_info(url, download=False)
for jsn in object.get('entries'):
    print(jsn)
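
The difference from the CLI may also come down to the option value. As far as I can tell from youtube-dl's options.py, `--flat-playlist` stores `'in_playlist'` in `extract_flat` rather than `True`, and with `True` the library returns the first URL result (the channel-to-playlist redirect) without resolving it. Likewise, `dumpjson` looks like the CLI option name, while the `YoutubeDL` parameter is, I believe, `forcejson`. A sketch under those assumptions; `FLAT_OPTS`, `flat_playlist_entries`, and `to_json_lines` are hypothetical names:

```python
import json

# Assumption: this mirrors what the CLI's --flat-playlist sets.
# Check options.py in your installed version before relying on it.
FLAT_OPTS = {'extract_flat': 'in_playlist', 'quiet': True}


def flat_playlist_entries(url, opts=FLAT_OPTS):
    """Return the flat entry dicts for a channel or playlist URL."""
    import youtube_dl  # imported lazily so to_json_lines works without the library
    with youtube_dl.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(url, download=False)
    return list(info.get('entries') or [])


def to_json_lines(entries):
    """Render one JSON object per line, similar in shape to the -j output."""
    return '\n'.join(json.dumps(entry) for entry in entries)
```

Something like `print(to_json_lines(flat_playlist_entries(url)))` should then work with either the channel URL or the playlist URL, without the second call.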
