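python - How can I programmatically get the metadata of all packages available in the AUR on Arch Linux?
Question
How can I programmatically get the metadata of every package available in the Arch Linux AUR, including packages that are not installed locally? Preferably in Python.
I tried AurJson, a set of APIs for accessing package metadata, but it requires a search keyword of a minimum length before it will return any package metadata.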
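Answer
This is an interesting question!
AUR packages
You can get a list of all AUR packages from https://aur.archlinux.org/packages.gz.
You can then use the info request of the AurJson interface and batch several packages into each request (I am not sure what the maximum per request is).
Do be nice and throttle your requests! Something like this will get you started...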
import requests

packages = requests.get('https://aur.archlinux.org/packages.gz').text.splitlines()
batch_size = 50
package_infos = {}
while packages:
    batch, packages = packages[:batch_size], packages[batch_size:]
    for result in requests.get(
        'https://aur.archlinux.org/rpc.php/rpc/',
        params={'v': 5, 'type': 'info', 'arg[]': batch},
    ).json()['results']:
        package_infos[result['Name']] = result
    break  # Replace this with throttling code :)
print(package_infos)
The result is
{'adwaita-dark-darose': {'Depends': ['gnome-themes-standard'],
                         'Description': 'Adwaita theme hacked to use my custom '
                                        'color scheme. (Dark blues instead of '
                                        'greys.)',
                         'FirstSubmitted': 1493136022,
                         'ID': 464990,
                         'Keywords': [],
                         'LastModified': 1511841278,
                         'License': ['GPL'],
                         'Maintainer': 'darose',
                         'MakeDepends': ['glib2', 'gtk3'],
                         'Name': 'adwaita-dark-darose',
                         'NumVotes': 3,
                         'OutOfDate': None,
                         'PackageBase': 'adwaita-dark-darose',
                         'PackageBaseID': 121780,
                         'Popularity': 0.024409,
                         'URL': 'none',
                         'URLPath': '/cgit/aur.git/snapshot/adwaita-dark-darose.tar.gz',
                         'Version': '3.22.3-10'},
 'atari-adventure': {'Depends': ['stella'],
                     'Description': 'The original Adventure game for the old '
                                    'Atari 2600 game console',
                     'FirstSubmitted': 1247592088,
                     'ID': 214107,
                     'Keywords': [],
                     'LastModified': 1437534447,
                     'License': ['unknown'],
                     'Maintainer': 'darose',
                     'Name': 'atari-adventure',
                     'NumVotes': 2,
                     'OutOfDate': None,
                     'PackageBase': 'atari-adventure',
                     'PackageBaseID': 28288,
                     'Popularity': 0,
                     'URL': 'http://www.atariage.com/software_page.html?SoftwareID=802',
                     'URLPath': '/cgit/aur.git/snapshot/atari-adventure.tar.gz',
                     'Version': '1.0-3'},
 ....
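The `break` in the loop above stands in for throttling. As a minimal sketch of what could replace it, the version below splits the package names into batches and sleeps between requests. The names `parse_package_list`, `batched`, `fetch_all`, `fetch_batch`, and `delay` are hypothetical and not part of any AUR API. Skipping blank lines and `#` comment lines in the list is a defensive assumption, and the one-second default delay is a guess rather than a documented limit.

```python
import time
from typing import Callable, Dict, Iterator, List


def parse_package_list(text: str) -> List[str]:
    """Return package names from the package list text, one name per line.

    Assumption: blank lines and lines starting with '#' carry no package name.
    """
    names = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith('#'):
            names.append(line)
    return names


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_all(
    names: List[str],
    fetch_batch: Callable[[List[str]], List[dict]],
    batch_size: int = 50,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, dict]:
    """Call fetch_batch once per batch, pausing `delay` seconds between calls."""
    infos: Dict[str, dict] = {}
    for index, batch in enumerate(batched(names, batch_size)):
        if index:
            sleep(delay)  # throttle every request after the first
        for result in fetch_batch(batch):
            infos[result['Name']] = result
    return infos
```

In real use, `fetch_batch` would wrap the `requests.get(...).json()['results']` call from the snippet above. Passing the fetcher in as a parameter keeps the batching and throttling logic testable without touching the network.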
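Arch packages
(I misunderstood the original question, but here is my original answer.)
You can use Python's tarfile library to look inside the Arch database files, which according to the Arch Linux wiki are tar.gz files.
So, assuming you have downloaded core.db/community.db/extra.db from a mirror (e.g. https://mirrors.edge.kernel.org/archlinux/core/os/x86_64/core.db / https://mirrors.edge.kernel.org/archlinux/community/os/x86_64/community.db / https://mirrors.edge.kernel.org/archlinux/extra/os/x86_64/extra.db, but please use a mirror closer to you), you can read them like this (Python 3):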
import tarfile

tf = tarfile.open('core.db', 'r:gz')
for member in tf.getmembers():
    if member.name.endswith('/desc'):
        with tf.extractfile(member) as fp:
            print(fp.read().decode())
        print('-' * 40)
This prints out the desc files in their raw format, e.g.
%FILENAME%
archlinux-keyring-20180404-1-any.pkg.tar.xz
%NAME%
archlinux-keyring
%VERSION%
20180404-1
%DESC%
Arch Linux PGP keyring
%CSIZE%
684236
%ISIZE%
948224
%MD5SUM%
9ba27bf598d60f2ea6320339289a2401
%SHA256SUM%
6f0f2f8d72742da18b61b7e4a1900d419c718b6d9dcad804763b80a12cc9abaf
%PGPSIG%
iQEzBAABCAAdFiEE82kWh9hnuBtRzgfZu+Q3cUhzKKkFAlrEfLMACgkQu+Q3cUhzKKmE7ggAgNjBAz6FkFqy2+Q0Rfzt0ZibYT/KW6ibQoKgpxDQNkzcl/1ZVzS4rkZRjHkBJd8fKI2n6NtiijwiQBPBsTI8t4+nVD19C4zZbDHzTdABm4EaDdJg+ya635Df8xMqt6GNzxV5DmABioSww2ebY9EuSwl3yvMNTQUI8hAjWPfOirDRZDic9DEYvhPabUn9NlLzShQeDIZP/R0ejDCfBIcu2NMX+NSUg41w0+LGrLNpqdnI+ej0n3X6NDkvCZwvvC3DPCWs1PAhFS5yC5dve4pDBjf8fLuJBPbRQJx6Se0K0CCoeUVA2V4ld2HLXor1aLG0bijF2QhMLzHmW4XxWbpWLA==
%URL%
https://projects.archlinux.org/archlinux-keyring.git/
%LICENSE%
GPL
%ARCH%
any
%BUILDDATE%
1522826386
%PACKAGER%
Bartłomiej Piotrowski <bpiotrowski@archlinux.org>
Edit: you can also parse the database entries into dictionaries with something like
import collections

def read_aur_db_entry(fp):
    # Each field starts with a %KEY% line; the lines after it are its value.
    db_entry = collections.defaultdict(str)
    key = None
    for line in fp.readlines():
        if line.startswith(b'%') and line.endswith(b'%\n'):
            key = line[1:-2].decode()
            continue
        db_entry[key] += line.decode()
    return {key: value.strip() for (key, value) in db_entry.items()}
So you get
{'ARCH': 'any',
'BUILDDATE': '1522826386',
'CSIZE': '684236',
'DESC': 'Arch Linux PGP keyring',
'FILENAME': 'archlinux-keyring-20180404-1-any.pkg.tar.xz',
'ISIZE': '948224',
'LICENSE': 'GPL',
'MD5SUM': '9ba27bf598d60f2ea6320339289a2401',
'NAME': 'archlinux-keyring',
'PACKAGER': 'Bartłomiej Piotrowski <bpiotrowski@archlinux.org>',
'PGPSIG': 'iQEzBAABCAAdFiEE82kWh9hnuBtRzgfZu+Q3cUhzKKkFAlrEfLMACgkQu+Q3cUhzKKmE7ggAgNjBAz6FkFqy2+Q0Rfzt0ZibYT/KW6ibQoKgpxDQNkzcl/1ZVzS4rkZRjHkBJd8fKI2n6NtiijwiQBPBsTI8t4+nVD19C4zZbDHzTdABm4EaDdJg+ya635Df8xMqt6GNzxV5DmABioSww2ebY9EuSwl3yvMNTQUI8hAjWPfOirDRZDic9DEYvhPabUn9NlLzShQeDIZP/R0ejDCfBIcu2NMX+NSUg41w0+LGrLNpqdnI+ej0n3X6NDkvCZwvvC3DPCWs1PAhFS5yC5dve4pDBjf8fLuJBPbRQJx6Se0K0CCoeUVA2V4ld2HLXor1aLG0bijF2QhMLzHmW4XxWbpWLA==',
'SHA256SUM': '6f0f2f8d72742da18b61b7e4a1900d419c718b6d9dcad804763b80a12cc9abaf',
'URL': 'https://projects.archlinux.org/archlinux-keyring.git/',
'VERSION': '20180404-1'}
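Putting the two pieces together, a rough sketch of reading a whole database into a dictionary keyed by package name might look like the following. The helper names `parse_desc` and `read_db` are hypothetical. The sketch assumes every package has a member whose name ends in `/desc`, as in the loop above, and it opens the archive with mode `'r:*'` so that tarfile detects the compression itself. Fields with several lines (a dependency list, for example) are kept as one newline-joined string, matching the parser above.

```python
import collections
import io
import tarfile


def parse_desc(data: bytes) -> dict:
    """Parse the bytes of one desc file into a {field: value} dictionary."""
    entry = collections.defaultdict(list)
    key = None
    for raw in data.decode('utf-8').splitlines():
        line = raw.strip()
        if len(line) > 2 and line.startswith('%') and line.endswith('%'):
            key = line[1:-1]  # a %KEY% header line starts a new field
        elif line and key is not None:
            entry[key].append(line)
    # Multi-line values are joined with newlines
    return {k: '\n'.join(v) for k, v in entry.items()}


def read_db(fileobj) -> dict:
    """Read every */desc member of a database archive from a file object."""
    packages = {}
    with tarfile.open(fileobj=fileobj, mode='r:*') as tf:
        for member in tf.getmembers():
            if member.isfile() and member.name.endswith('/desc'):
                with tf.extractfile(member) as fp:
                    entry = parse_desc(fp.read())
                # Fall back to the member name if a NAME field is missing
                packages[entry.get('NAME', member.name)] = entry
    return packages
```

For a downloaded file this could be called as `read_db(open('core.db', 'rb'))`, where `core.db` is the file from the mirror mentioned earlier.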