Creating a DataFrame from a JSON file

Problem description

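I'm trying to create a DataFrame from a JSON file, without success. I'm not sure why, since this is my first time working with JSON files, but from what I found while looking for a solution, the data appears to be heavily nested.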

Here is the JSON: https://stats.nba.com/stats/leagueLeaders?ActiveFlag=No&LeagueID=00&PerMode=Totals&Scope=S&Season=All+Time&SeasonType=Regular+Season&StatCategory=AST

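I tried creating the DataFrame with pandas' .read_json() using different orient values, with open() plus json.load(), and with requests. The requests approach actually gave me a DataFrame, although a highly unstructured one.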

AST_LDR = pd.read_json('C:\\Users\\user\\Desktop\\python\\Kobe Bryant\\AssistLeaders.json')

#Error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-20-55a99c3e3802> in <module>
----> 1 AST_LDR = pd.read_json('C:\\Users\\user\\Desktop\\python\\Kobe Bryant\\AssistLeaders.json')
      2 
      3 

~\Anaconda3\lib\site-packages\pandas\io\json\json.py in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit, encoding, lines, chunksize, compression)
    425         return json_reader
    426 
--> 427     result = json_reader.read()
    428     if should_close:
    429         try:

~\Anaconda3\lib\site-packages\pandas\io\json\json.py in read(self)
    535             )
    536         else:
--> 537             obj = self._get_object_parser(self.data)
    538         self.close()
    539         return obj

~\Anaconda3\lib\site-packages\pandas\io\json\json.py in _get_object_parser(self, json)
    554         obj = None
    555         if typ == 'frame':
--> 556             obj = FrameParser(json, **kwargs).parse()
    557 
    558         if typ == 'series' or obj is None:

~\Anaconda3\lib\site-packages\pandas\io\json\json.py in parse(self)
    650 
    651         else:
--> 652             self._parse_no_numpy()
    653 
    654         if self.obj is None:

~\Anaconda3\lib\site-packages\pandas\io\json\json.py in _parse_no_numpy(self)
    869         if orient == "columns":
    870             self.obj = DataFrame(
--> 871                 loads(json, precise_float=self.precise_float), dtype=None)
    872         elif orient == "split":
    873             decoded = {str(k): v for k, v in compat.iteritems(

ValueError: Expected object or value



#------------

import json
with open('C:\\Users\\user\\Desktop\\python\\Kobe Bryant\\AssistLeaders.json') as f:
    data = json.load(f)
AST_LDR = pd.DataFrame(data)

#Error

---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
<ipython-input-19-f61dd4edbb9e> in <module>
      1 import json
      2 with open('C:\\Users\\user\\Desktop\\python\\Kobe Bryant\\AssistLeaders.json') as f:
----> 3     data = json.load(f)
      4 AST_LDR = pd.DataFrame(data)

~\Anaconda3\lib\json\__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    294         cls=cls, object_hook=object_hook,
    295         parse_float=parse_float, parse_int=parse_int,
--> 296         parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
    297 
    298 

~\Anaconda3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    346             parse_int is None and parse_float is None and
    347             parse_constant is None and object_pairs_hook is None and not kw):
--> 348         return _default_decoder.decode(s)
    349     if cls is None:
    350         cls = JSONDecoder

~\Anaconda3\lib\json\decoder.py in decode(self, s, _w)
    335 
    336         """
--> 337         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338         end = _w(s, end).end()
    339         if end != len(s):

~\Anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
    353             obj, end = self.scan_once(s, idx)
    354         except StopIteration as err:
--> 355             raise JSONDecodeError("Expecting value", s, err.value) from None
    356         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Tags: json, python-3.x, pandas, dataframe

Solution


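JSON is a very flexible format. pd.read_json only accepts a few layouts, and more often than not real-world data doesn't fit any of them. You are better off treating the data as a dictionary, extracting the parts you need, and building the DataFrame from those: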

import pandas as pd
import requests

url = 'https://stats.nba.com/stats/leagueLeaders?ActiveFlag=No&LeagueID=00&PerMode=Totals&Scope=S&Season=All+Time&SeasonType=Regular+Season&StatCategory=AST'
data = requests.get(url).json()

# Rows live under resultSet.rowSet; the matching column names under resultSet.headers
df = pd.DataFrame(data['resultSet']['rowSet'], columns=data['resultSet']['headers'])

Result:

   PLAYER_ID    PLAYER_NAME    GP    MIN   FGM    FGA  FG_PCT    FG3M    FG3A  FG3_PCT   FTM   FTA  FT_PCT    OREB    DREB   REB    AST     STL    BLK     TOV    PF    PTS  AST_TOV  STL_TOV  EFG_PCT  TS_PCT  GP_RANK  MIN_RANK  FGM_RANK  FGA_RANK  FG_PCT_RANK  FG3M_RANK  FG3A_RANK  FG3_PCT_RANK  FTM_RANK  FTA_RANK  FT_PCT_RANK  OREB_RANK  DREB_RANK  REB_RANK  AST_RANK  STL_RANK  BLK_RANK  TOV_RANK  PF_RANK  PTS_RANK  AST_TOV_RANK  STL_TOV_RANK  EFG_PCT1  TS_PCT1
0        304  John Stockton  1504  47766  7039  13658   0.515   845.0  2202.0    0.384  4788  5796   0.826   966.0  3085.0  4051  15806  3265.0  315.0  4244.0  3942  19711    3.724    0.769    0.546   0.608        4         9        69        93          109        153        169            88        40        48          183        407        246       371         1         1       400         2       14        45             5           164        50       18
1        467     Jason Kidd  1391  50116  6219  15557   0.400  1988.0  5701.0    0.349  3103  3954   0.785  1768.0  6957.0  8725  12091  2684.0  450.0  4003.0  2572  17529    3.020    0.670    0.464   0.507       11         5       101        56         1190         10          7           293       149       152          447        138         27        56         2         2       284         5      171        85            30           275       881      888
2        959     Steve Nash  1217  38073  6321  12892   0.490  1685.0  3939.0    0.428  3060  3384   0.904   643.0  2999.0  3642  10335   899.0  102.0  3478.0  1982  17387    2.972    0.258    0.556   0.605       36        46        96       116          267         22         35            13       154       221            3        601        262       436         3       222       861        13      411        87            37          1030        32       22
3        349   Mark Jackson  1296  39117  4793  10731   0.447   734.0  2213.0    0.332  2169  2818   0.770  1281.0  3682.0  4963  10334  1608.0  117.0  3155.0  2230  12489    3.275    0.510    0.481   0.522       23        38       227       207          748        193        165           413       324       322          561        273        176       257         4        32       806        23      289       223            19           571       650      679
4      77142  Magic Johnson   906  33245  6211  11951   0.520   325.0  1074.0    0.303  4960  5850   0.848  1601.0  4958.0  6559  10141  1724.0  374.0  3506.0  2050  17707    2.892    0.492    0.533   0.610      227        98       102       145           91        392        369           520        37        45           91        180         80       138         5        21       345        11      368        80            46           625        98       16
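
If you would rather work from the saved file, as in the question, the same idea applies. Below is a minimal sketch, not a definitive implementation. The helper names result_set_to_frame and load_saved_json are hypothetical, and the sample payload is made up purely to mirror the resultSet / headers / rowSet layout used above. Note that "Expecting value: line 1 column 1 (char 0)" means the parser found nothing it could read at the very first character, which usually points to an empty file or to content that is not JSON at all, so the loader checks for that before parsing.

```python
import json

import pandas as pd


def result_set_to_frame(payload):
    """Build a DataFrame from a dict shaped like
    {'resultSet': {'headers': [...], 'rowSet': [[...], ...]}}."""
    result_set = payload["resultSet"]
    return pd.DataFrame(result_set["rowSet"], columns=result_set["headers"])


def load_saved_json(path):
    """Read a saved response from disk, raising a clearer error when the
    file is empty or does not contain JSON (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    if not text.strip():
        raise ValueError(f"{path!r} is empty, so there is nothing to parse")
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # Show the start of the file; HTML or an error page is a common culprit
        raise ValueError(
            f"{path!r} is not valid JSON; it starts with {text[:40]!r}"
        ) from err


# Made-up payload with the same nesting as the real response (values are not real data)
sample = {
    "resultSet": {
        "headers": ["PLAYER_ID", "PLAYER_NAME", "AST"],
        "rowSet": [[1, "Player A", 100], [2, "Player B", 90]],
    }
}

df = result_set_to_frame(sample)
print(df)
```

With a real file you would call result_set_to_frame(load_saved_json(path)). If load_saved_json raises, the file itself needs to be re-saved; no choice of orient in pd.read_json will help in that case.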
