Resolving the "Protocol not known" error in Pandas read_json

Problem description

I am trying to load some weather data from the NASA website into a Pandas DataFrame, but I get a ValueError: Protocol not known. Here is my code:

import pandas as pd
import requests 

cmd = 'https://power.larc.nasa.gov/cgi-bin/v1/DataAccess.py?request=execute&identifier=SinglePoint&parameters=T10M&startDate=20200101&endDate=20200103&userCommunity=SSE&tempAverage=DAILY&outputList=JSON,ASCII&lat=-0.2739&lon=36.3765&user=anonymous' 

df = pd.read_json(requests.get(cmd).text, lines=True, orient='table')  

Note: I tried several combinations of read_json parameters, all with the same result.

Here is the traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-26-cf1926f4d68e> in <module>
      7 """
      8 
----> 9 df = pd.read_json(requests.get(cmd).text, lines=True, orient='table')

D:\Anaconda3\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
    197                 else:
    198                     kwargs[new_arg_name] = new_arg_value
--> 199             return func(*args, **kwargs)
    200 
    201         return cast(F, wrapper)

D:\Anaconda3\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
    294                 )
    295                 warnings.warn(msg, FutureWarning, stacklevel=stacklevel)
--> 296             return func(*args, **kwargs)
    297 
    298         return wrapper

D:\Anaconda3\lib\site-packages\pandas\io\json\_json.py in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit, encoding, lines, chunksize, compression, nrows)
    591 
    592     compression = infer_compression(path_or_buf, compression)
--> 593     filepath_or_buffer, _, compression, should_close = get_filepath_or_buffer(
    594         path_or_buf, encoding=encoding, compression=compression
    595     )

D:\Anaconda3\lib\site-packages\pandas\io\common.py in get_filepath_or_buffer(filepath_or_buffer, encoding, compression, mode, storage_options)
    219 
    220         try:
--> 221             file_obj = fsspec.open(
    222                 filepath_or_buffer, mode=mode or "rb", **(storage_options or {})
    223             ).open()

D:\Anaconda3\lib\site-packages\fsspec\core.py in open(urlpath, mode, compression, encoding, errors, protocol, newline, **kwargs)
    427     ``OpenFile`` object.
    428     """
--> 429     return open_files(
    430         [urlpath],
    431         mode,

D:\Anaconda3\lib\site-packages\fsspec\core.py in open_files(urlpath, mode, compression, encoding, errors, name_function, num, protocol, newline, auto_mkdir, expand, **kwargs)
    278     be used as a single context
    279     """
--> 280     fs, fs_token, paths = get_fs_token_paths(
    281         urlpath,
    282         mode,

D:\Anaconda3\lib\site-packages\fsspec\core.py in get_fs_token_paths(urlpath, mode, num, name_function, storage_options, protocol, expand)
    598                     "share the same protocol"
    599                 )
--> 600         cls = get_filesystem_class(protocol)
    601         optionss = list(map(cls._get_kwargs_from_urls, urlpath))
    602         paths = [cls._strip_protocol(u) for u in urlpath]

D:\Anaconda3\lib\site-packages\fsspec\registry.py in get_filesystem_class(protocol)
    189     if protocol not in registry:
    190         if protocol not in known_implementations:
--> 191             raise ValueError("Protocol not known: %s" % protocol)
    192         bit = known_implementations[protocol]
    193         try:

ValueError: Protocol not known: {
 "features": [
  {
   "geometry": {
    "coordinates": [
     36.37651,
     -0.27389,
     2178.98
    ],
    "type": "Point"
   },
   "properties": {
    "parameter": {
     "T10M": {
      "20200101": 15.57,
      "20200102": 15.7,
      "20200103": 16.01
     }
    }
   },
   "type": "Feature"
  }
 ],
 "header": {
  "api_version": "1.1.0",
  "endDate": "20200103",
  "fillValue": "-999",
  "startDate": "20200101",
  "title": "NASA/POWER SRB/FLASHFlux/MERRA2/GEOS 5.12.4 (FP-IT) 0.5 x 0.5 Degree Daily Averaged Data"
 },
 "messages": [],
 "outputs": {
  "ascii": "https

My question is how to load this input into a DataFrame correctly.

Tags: python, json, python-3.x, pandas

Solution


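You can try the following example to load the data using pd.json_normalize (I am assuming you only want the values under the "T10M" key). The traceback shows that read_json handed the response text to fsspec as if it were a path or URL, so the example parses the response with the .json() method of the requests response instead of passing the raw text to read_json: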

import requests
import pandas as pd


url = 'https://power.larc.nasa.gov/cgi-bin/v1/DataAccess.py?request=execute&identifier=SinglePoint&parameters=T10M&startDate=20200101&endDate=20200103&userCommunity=SSE&tempAverage=DAILY&outputList=JSON,ASCII&lat=-0.2739&lon=36.3765&user=anonymous' 
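# Parse the response as JSON, pick the T10M mapping of date -> value,
# and transpose so that the dates become the index.
df = pd.json_normalize(requests.get(url).json()['features'][0]['properties']['parameter']['T10M']).T
# Convert the 'YYYYMMDD' index labels to datetimes.
df.index = pd.to_datetime(df.index)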

print(df)

Output:

                0
2020-01-01  15.57
2020-01-02  15.70
2020-01-03  16.01
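
As an offline variation on the answer above, the same extraction can be sketched without a network call. The payload below is copied from the response fragment visible in the traceback, trimmed to the keys the sketch needs. The helper name parameter_to_series is hypothetical, and treating the header's fillValue as a missing-data marker is an assumption, not something the original answer states.

```python
import pandas as pd

# Sample payload copied from the response fragment shown in the traceback;
# only the keys needed for this sketch are kept.
payload = {
    "features": [
        {
            "properties": {
                "parameter": {
                    "T10M": {"20200101": 15.57, "20200102": 15.7, "20200103": 16.01}
                }
            }
        }
    ],
    "header": {"fillValue": "-999"},
}


def parameter_to_series(payload, name):
    """Hypothetical helper: return one parameter as a date-indexed Series."""
    values = payload["features"][0]["properties"]["parameter"][name]
    series = pd.Series(values, name=name, dtype="float64")
    # The keys are 'YYYYMMDD' strings, so give the format explicitly.
    series.index = pd.to_datetime(series.index, format="%Y%m%d")
    # Assumption: the header's fillValue marks missing observations.
    fill_value = float(payload["header"]["fillValue"])
    return series.replace(fill_value, float("nan"))


t10m = parameter_to_series(payload, "T10M")
print(t10m)
```

With a live response, the dictionary returned by requests.get(url).json() could be passed in place of the sample payload. A Series is used here instead of json_normalize plus a transpose because the parameter is already a flat mapping of date to value; either approach should give the same three values.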
