How to fix Python requests pagination that stops working

Problem description

I am using the code below to make a Python request. I want to get all product results for the query "v".

import re
import requests

url = 'https://www.walmart.com/store/1003-York-pa/search?query=ice%20cream'
api_url = 'https://www.walmart.com/store/electrode/api/search'

word = 'v'  # the search term

params = {
    'query': word,
    'cat_id': 0,
    'ps': 24,
    'offset': 0,
    'prg': 'desktop',
    'stores': re.search(r'store/(\d+)', url).group(1)
}

# First request, without a page number
data1 = requests.get(api_url, params=params).json()

for page in range(0, 319):
    params = {
        'query': word,
        'cat_id': 0,
        'page': page,  # try to move to the next page
        'ps': 24,
        'offset': 0,
        'prg': 'desktop',
        'stores': re.search(r'store/(\d+)', url).group(1)
    }
    data = requests.get(api_url, params=params).json()

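The website search shows 319 pages of results, but the API stops returning results at page 100. I want to get the results from all pages. How can I do that?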

Tags: python, python-requests

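Solution

Try incrementing the offset parameter instead of page. In the code below, offset advances by 24 on each request, which matches the ps value of 24: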

import re
import requests


url = 'https://www.walmart.com/store/1003-York-pa/search?query=ice%20cream'
api_url = 'https://www.walmart.com/store/electrode/api/search'

params = {
    'query': 'ice cream',
    'cat_id': 0,
    'ps': 24,
    'offset': 0,
    'prg': 'desktop',
    'stores': re.search(r'store/(\d+)', url).group(1)
}

count = 0

for offset in range(0, 319, 24):
    params['offset'] = offset  # advance by the page size ('ps') on each request

    data = requests.get(api_url, params=params).json()

    for i in data.get('items', []):
        print(i['title'])
        count += 1

print('Total', count)

Prints:

...

Paletas Helados Mexico Rompope Premium Bolis, 5.0 fl oz, 6 count
Blue Ribbon Classics Homemade Vanilla Frozen Treat Bar
Dean's Country Fresh Fudge Bars, 30 oz
Blue Ribbon Classics Star Frozen Treat Bar
Dean Foods Deans Country Fresh  Vanilla Bars, 12 ea
Blue Ribbon Classics Orange Dream Frozen Treat Bar
OUTSHINE Creamy Coconut Frozen Fruit Bars, 6 Ct. Box | Gluten Free
La Michoacana Variety Pack Paletas, 12 ct, 36 fl oz
Breyers CarbSmart Frozen Dairy Dessert Vanilla Bars 6 ct
Total 484
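
The loop above hard-codes the upper bound of the offset range. As a variation, the same offset idea can be written so that it keeps requesting batches until one comes back empty. This is only a minimal sketch under some assumptions: that an empty 'items' list means the results have run out, and that stepping the offset by the ps value is the right stride for this endpoint. The names fetch_page and iter_items are hypothetical helpers introduced here for illustration, and max_requests is an arbitrary safety cap.

```python
import re
from typing import Callable, Iterator

API_URL = 'https://www.walmart.com/store/electrode/api/search'
STORE_URL = 'https://www.walmart.com/store/1003-York-pa/search?query=ice%20cream'


def fetch_page(query: str, offset: int, page_size: int = 24) -> list:
    """Request one batch of results starting at `offset` (hypothetical helper)."""
    # Imported lazily so iter_items can be exercised without the network library.
    import requests

    params = {
        'query': query,
        'cat_id': 0,
        'ps': page_size,
        'offset': offset,
        'prg': 'desktop',
        'stores': re.search(r'store/(\d+)', STORE_URL).group(1),
    }
    response = requests.get(API_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json().get('items', [])


def iter_items(fetch: Callable[[int], list],
               page_size: int = 24,
               max_requests: int = 1000) -> Iterator[dict]:
    """Yield items batch by batch, stepping the offset until a batch is empty."""
    for n in range(max_requests):
        items = fetch(n * page_size)
        if not items:  # assumption: an empty batch means there is nothing left
            break
        yield from items


def print_all_titles(query: str) -> int:
    """Print every title for `query` and return how many were seen."""
    count = 0
    for item in iter_items(lambda offset: fetch_page(query, offset)):
        print(item['title'])
        count += 1
    return count
```

Calling print_all_titles('ice cream') would run the whole crawl. Passing the fetch function in as an argument keeps the stopping logic separate from the HTTP call, so it can be checked against canned data before hitting the real site.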
