JSONDecodeError: Expecting value: line 1 column 1 (char 0) (parsing problem?)

Problem description

I am trying to load and then scrape the dynamic website https://www.expresslanes.com/map-your-trip

res = requests.get('https://www.expresslanes.com/themes/custom/transurbangroup/js/on-the-road/entry_exit.js?v=1.x')

I am using Jupyter Notebook, but when I run the command type(res.json())

I get this error message:

    ---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11132/3073830826.py in <module>
----> 1 type(res.json())

e:\python\lib\site-packages\requests\models.py in json(self, **kwargs)
    899             if encoding is not None:
    900                 try:
--> 901                     return complexjson.loads(
    902                         self.content.decode(encoding), **kwargs
    903                     )

e:\python\lib\json\__init__.py in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    344             parse_int is None and parse_float is None and
    345             parse_constant is None and object_pairs_hook is None and not kw):
--> 346         return _default_decoder.decode(s)
    347     if cls is None:
    348         cls = JSONDecoder

e:\python\lib\json\decoder.py in decode(self, s, _w)
    335 
    336         """
--> 337         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338         end = _w(s, end).end()
    339         if end != len(s):

e:\python\lib\json\decoder.py in raw_decode(self, s, idx)
    353             obj, end = self.scan_once(s, idx)
    354         except StopIteration as err:
--> 355             raise JSONDecodeError("Expecting value", s, err.value) from None
    356         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Please help me. Thanks in advance!

Tags: json, web-scraping, jsondecoder

Solution


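The URL returns JavaScript source rather than JSON, so res.json() fails on the very first character. I assume you want to load the entryExits structure into Python: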

import re
import json
import requests


url = "https://www.expresslanes.com/themes/custom/transurbangroup/js/on-the-road/entry_exit.js?v=1.x"

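# The URL serves JavaScript source, not JSON, so read it as plain text.
js_doc = requests.get(url).text

# Capture the object literal assigned to entryExits (re.S lets "." span newlines).
match = re.search(r"entryExits = (\{.*\});", js_doc, flags=re.S)
if match is None:
    raise ValueError("entryExits assignment not found in the response")
data = json.loads(match.group(1))

# pretty print the data:
print(json.dumps(data, indent=4))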

Prints:

{
    "Northbound": {
        "entries": {
            "183NO": {
                "id": "183NO",
                "label": "Jones Branch Drive/Route 123",
                "latitude": "38.9275663011649600",
                "longitude": "-77.2125236759185300",
                "path": "495North",
                "index": "11",
                "details": {
                    "title": "Jones Branch Drive/Route 123 Access",
                    "image": "/images/flash/access-maps/jones-branch-drive/large-access-static.jpg",
                    "description": [
                        "From Jones Branch Drive/Route 123 in Tysons Corner you can travel north or south in the Express Lanes.",
                        "From the northbound or southbound Express Lanes you can exit onto Jones Branch Drive/Route 123."
                    ]
                },
                "exits": [
                    {
                        "id": "181ND",
                        "ods": [
                            "1038"
                        ]
                    }
                ]
            },
            "185NO": {
                "id": "185NO",

...
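
To try the approach without a network connection, here is a minimal sketch that runs the same regular expression on a small inline string. SAMPLE_JS is a hypothetical, shortened stand-in for entry_exit.js, and the helper name extract_entry_exits is made up for illustration; the real file is only assumed to contain an assignment of the form entryExits = {...}; as the pattern above implies.

```python
import json
import re

# Hypothetical, shortened stand-in for the contents of entry_exit.js.
# The real file is assumed to contain an assignment of this shape.
SAMPLE_JS = """
var entryExits = {
    "Northbound": {
        "entries": {
            "183NO": {"id": "183NO", "label": "Jones Branch Drive/Route 123"}
        }
    }
};
"""


def extract_entry_exits(js_source):
    """Return the object literal assigned to entryExits as a Python dict."""
    match = re.search(r"entryExits = (\{.*\});", js_source, flags=re.S)
    if match is None:
        raise ValueError("no 'entryExits = {...};' assignment found")
    return json.loads(match.group(1))


# Parsing the whole file as JSON fails at its first character, because
# the text starts with JavaScript rather than a JSON value.
try:
    json.loads(SAMPLE_JS.strip())
except json.JSONDecodeError as err:
    print("raw file is not JSON:", err)

# After extraction the data is an ordinary nested dict.
data = extract_entry_exits(SAMPLE_JS)
for direction, section in data.items():
    for entry_id, entry in section["entries"].items():
        print(direction, entry_id, entry["label"])
```

Note that the greedy .* matches up to the last }; in the text, so this sketch relies on no other };-terminated statement following the entryExits object. If the real file has more statements after it, the captured text would not be valid JSON and a narrower pattern would be needed.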
