
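Problem description

I'm using the requests library to fetch data from a remote server and save it in a model, but I need to handle pagination. At the moment I only load one page from the server.

The paginated response looks like this: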

{
"status": "success",
"count": 32,
"total": 32,
"next": "https://pimber.ly/api/v2/products?sinceId=5c3ca8470985af0016229b5b",
"previous": "https://pimber.ly/api/v2/products?maxId=5c3ca8470985af0016229b04",
"sinceId": "5c3ca8470985af0016229b04",
"maxId": "5c3ca8470985af0016229b5b",
"data": [
    {
        "Primary ID": "API_DOCS_PROD1",
        "Product Name": "Example Product 1",
        "Product Reference": "Example Reference 1",
        "Buyer": "Example Buyer 1",
        "_id": "5c3ca8470985af0016229b04",
        "primaryId": "API_DOCS_PROD1"
    },

I tried to handle this with a Python generator, but it doesn't do anything:

_plimber_data = response.json()
yield _plimber_data
_next = _plimber_data['next']
print(_next)
for page in _next:
    _next_page = session.get(_plimber_data, params={'next': page}).json()
    yield _next_page['next']

    for _data in page:
        Product.objects.create(
            qr_id=_data['primaryId'],
            ean_code=_data['EAN'],
            description=_data['Description105'],
            category=_data['Category'],
            marketing_text=_data['Marketing Text'],
            bullet=_data['Bullet 1'],
            brand_image=_data['Brand Image'],
            image=_data['Images']
        )
        logger.debug(f'Something went wrong {_data}')
        print(f'This is the Data:{_data}')

Can someone explain how to handle this so that I can load all of the data into the database? Thanks.

Tags: python, django, python-requests

Solution


OK, I solved it. There are two parts. First, a generator function:

import requests
from django.conf import settings  # assumes a Django project, per the tags
from requests.exceptions import HTTPError


def _get_product():
    """
    Fetch products from the server, following the `next` link page by page.
    """
    headers = {
        'Accept': 'application/json',
        'Content-Type': 'application/json',
        'Authorization': settings.TOKEN
    }

    # Start at the first page, then follow `next` until it is empty.
    url = settings.API_DOMAIN
    while url:
        try:
            response = requests.get(url, headers=headers)
            response.raise_for_status()
        except HTTPError as http_err:
            print(f'HTTP error occurred: {http_err}')
            return

        _plimber_data = response.json()
        # Yield every product on the current page, including the first one.
        for _data in _plimber_data['data']:
            yield _data
        url = _plimber_data.get('next')
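
The same idea can be sketched without the network, which makes the loop easy to check. This is only a minimal sketch, not the code from the answer: `iter_items` and `fetch_page` are hypothetical names, and `fetch_page` stands in for something like `requests.get(url, headers=headers).json()`. It assumes every page carries a `data` list and that the last page has `next` set to `None` (or leaves it out).

```python
def iter_items(first_url, fetch_page):
    """Yield every item from every page, following each page's `next` link.

    `fetch_page` is a hypothetical callable that takes a URL and returns
    the decoded JSON page as a dict.
    """
    url = first_url
    seen = set()  # guard against an API that keeps returning the same link
    while url and url not in seen:
        seen.add(url)
        page = fetch_page(url)
        yield from page.get('data', [])
        url = page.get('next')


# Fake two-page API, used only for illustration.
FAKE_PAGES = {
    'page1': {'data': [{'primaryId': 'A'}, {'primaryId': 'B'}], 'next': 'page2'},
    'page2': {'data': [{'primaryId': 'C'}], 'next': None},
}

ids = [item['primaryId'] for item in iter_items('page1', FAKE_PAGES.__getitem__)]
print(ids)  # ['A', 'B', 'C']
```

Passing the fetch function in as an argument keeps the pagination logic separate from HTTP details, so it can be exercised with fake pages like the ones above.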

Then I iterate over the generator and save the data:

def run(self):
    # Iterating the generator is what actually triggers the HTTP requests.
    _page_data = _get_product()
    for _product in _page_data:
        Product.objects.create(
            qr_id=_product['primaryId'],
            ean_code=_product['EAN'],
            description=_product['Description105'],
            category=_product['Category'],
            marketing_text=_product['Marketing Text'],
            bullet=_product['Bullet 1'],
            # These two fields arrive as lists, so join them into one string.
            brand_image='\n'.join(_product['Brand Image']),
            image='\n'.join(_product['Images'])
        )
        logger.debug(f'Saved product {_product}')
        print(f'This is the Data:{_product}')
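
It is worth noting why the loop in `run` is needed: calling a generator function only creates a generator object, and none of its body runs until something iterates over it. That is one likely reason a generator can appear to do nothing. A small sketch with hypothetical names (`numbers`, `log`):

```python
log = []


def numbers():
    # This body does not run when numbers() is called, only when iterated.
    log.append('started')
    yield 1
    yield 2


gen = numbers()          # creates the generator; nothing has run yet
before = list(log)       # snapshot of the log: still empty
values = list(gen)       # iterating drives the generator to completion
print(before, values, log)  # [] [1, 2] ['started']
```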
