Looping over URLs and grouping the JSON results

Problem description

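I am trying to parse JSON data for several campsites by looping over a URL with the campsite number as a variable. I then want to group the results by UnitId, but the loop does not seem to run for each campsite.

I first tried doing this with requests alone, then read that pandas seemed better suited, but I can't get it to work.

Here is a sample of my code. For some reason, only results for UnitId 5098 are shown, and the filter does not work.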

# Is one of these campsites available
# Unit 5095
# Unit 5096
# Unit 5097
# Unit 5099

import requests
import pandas as pd

result = []
for i in range(5095, 5099):
    resp = requests.get("https://calirdr.usedirect.com/rdr/rdr/fd/availability/getbyunit/"+str(i)+"/startdate/2020-11-01/nights/30/true?")

result.extend(resp.json())

df = pd.DataFrame(result)

df.groupby(['UnitId', 'StartTime', 'IsFree', 'IsWalkin'])
print(df.groupby(['UnitId', 'StartTime', 'IsFree', 'IsWalkin']).groups)

The output from my code:

{(5098, '2020-11-01T00:00:00', False, False): [0], (5098, '2020-11-02T00:00:00', False, False): [1], (5098, '2020-11-03T00:00:00', False, False): [2], (5098, '2020-11-04T00:00:00', False, False): [3], (5098, '2020-11-05T00:00:00', False, False): [4], (5098, '2020-11-06T00:00:00', False, False): [5], (5098, '2020-11-07T00:00:00', False, False): [6], (5098, '2020-11-08T00:00:00', False, False): [7], (5098, '2020-11-09T00:00:00', False, False): [8], (5098, '2020-11-10T00:00:00', False, False): [9], (5098, '2020-11-11T00:00:00', False, False): [10], (5098, '2020-11-12T00:00:00', False, False): [11], (5098, '2020-11-13T00:00:00', False, False): [12], (5098, '2020-11-14T00:00:00', False, False): [13], (5098, '2020-11-15T00:00:00', False, False): [14], (5098, '2020-11-16T00:00:00', False, False): [15], (5098, '2020-11-17T00:00:00', False, False): [16], (5098, '2020-11-18T00:00:00', False, False): [17], (5098, '2020-11-19T00:00:00', False, False): [18], (5098, '2020-11-20T00:00:00', False, False): [19], (5098, '2020-11-21T00:00:00', False, False): [20], (5098, '2020-11-22T00:00:00', False, False): [21], (5098, '2020-11-23T00:00:00', False, False): [22], (5098, '2020-11-24T00:00:00', False, False): [23], (5098, '2020-11-25T00:00:00', False, False): [24], (5098, '2020-11-26T00:00:00', False, False): [25], (5098, '2020-11-27T00:00:00', False, False): [26], (5098, '2020-11-28T00:00:00', False, False): [27], (5098, '2020-11-29T00:00:00', False, False): [28], (5098, '2020-11-30T00:00:00', False, False): [29]}

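The result I would like to print. This is just an example, since it is hard to find a campsite that is actually available. Thanks, everyone, for the help. This is just a hobby and I am by no means a programmer, but it is very cool, especially Python. Thank you.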

(5095, '2020-11-05T00:00:00', False, False):
(5096, '2020-11-12T00:00:00', False, False):
(5099, '2020-11-25T00:00:00', False, False):

Tags: python, json, pandas

Solution


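You have to indent result.extend(resp.json()) so that it sits inside the body of the for loop. As written, it runs only once, after the loop has finished, so only the last response (unit 5098) ends up in result. Note also that range(5095, 5099) stops at 5098; use range(5095, 5100) if unit 5099 should be included. In addition, you may want to consider using DataFrame.filter() to keep only the columns you need. For example: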

from datetime import datetime

import pandas as pd
import requests
from tabulate import tabulate

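result = []
for unit_id in range(5095, 5099):
    # Request availability for this unit (30 nights from 2020-11-01, per the URL).
    resp = requests.get(
        f"https://calirdr.usedirect.com/rdr/rdr/fd/"
        f"availability/getbyunit/{unit_id}/startdate/2020-11-01/nights/30/true?").json()
    # Indented, so every unit's records are collected, not just the last one.
    result.extend(resp)

# Keep only the columns of interest.
filter_by = ['UnitId', 'StartTime', 'IsFree', 'IsWalkin']
df = pd.DataFrame(result)
df = df.filter(items=filter_by)
# Shorten the ISO timestamp to a plain date.
df['StartTime'] = df['StartTime'].apply(lambda d: datetime.fromisoformat(d).strftime("%Y-%m-%d"))
# Keep only the nights that are free.
df = df[df['IsFree']]
print(tabulate(df, headers=filter_by))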


Output:

      UnitId  StartTime    IsFree    IsWalkin
--  --------  -----------  --------  ----------
60      5097  2020-11-01   True      False
78      5097  2020-11-19   True      False
87      5097  2020-11-28   True      False
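
The question also asked about grouping by UnitId, which the code above does not cover. Below is a minimal sketch of one way to do it, run on hand-written sample records rather than the live endpoint. The records are hypothetical and only assume the four fields used above; the variable names free and free_dates_by_unit are likewise just for illustration.

```python
import pandas as pd

# Hypothetical sample records standing in for the API response.
# Only the four fields used in the answer above are assumed.
records = [
    {"UnitId": 5095, "StartTime": "2020-11-05T00:00:00", "IsFree": True, "IsWalkin": False},
    {"UnitId": 5095, "StartTime": "2020-11-06T00:00:00", "IsFree": False, "IsWalkin": False},
    {"UnitId": 5096, "StartTime": "2020-11-12T00:00:00", "IsFree": True, "IsWalkin": False},
    {"UnitId": 5097, "StartTime": "2020-11-01T00:00:00", "IsFree": True, "IsWalkin": False},
    {"UnitId": 5097, "StartTime": "2020-11-19T00:00:00", "IsFree": True, "IsWalkin": False},
]

df = pd.DataFrame(records)

# Keep only free nights and cut the timestamp down to its date part.
free = df[df["IsFree"]].copy()
free["StartTime"] = free["StartTime"].str[:10]

# Group by unit and collect each unit's free dates into a list.
free_dates_by_unit = free.groupby("UnitId")["StartTime"].apply(list).to_dict()

for unit_id, dates in free_dates_by_unit.items():
    print(unit_id, dates)
```

Filtering on IsFree before grouping keeps each group small and means a unit with no free nights simply does not appear in the result.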
