Error parsing multi-line JSON objects in Python

Problem description

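I am trying to parse multiple multi-line JSON objects, separated by commas, in Python. However, the data fails to parse whether I use json.load, treat it as a list, or treat it as a jsonlines object.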

Input: the data is stored in a single file in the following form

{
    "0": "mdm-898040540420",
    "1": {
        "dchannel": "FR al"
    },
    "2": {
        "dchannel": "FR Website"
    },
    "3": {
        "dcountry": "BDF"
    }
},
{
    "0": "mdm-846290540037",
    "1": {
        "dchannel": "FR alk"
    },
    "2": {
        "dchannel": "FR Website"
    },
    "3": {
        "dcountry": "BDF"
    }
},......

And so on; it is essentially many small JSON objects in a single file.

I tried enclosing the whole file in [], like [{json1},{json2}...], and using

with open("C:\\Users\\viv\\Downloads\\2020_11_21-10_31_03_PM_v2.json", 'r') as f:
    object_list = []
    for line in f.readlines():
        object_list.append(json.loads(line))

and also, without the [], enclosing the whole file in {} and using the json library. Neither approach manages to parse it.

Any way to parse it would be much appreciated. I want to generate a CSV as output, like this:

id               dchannel             dcountry
mdm-846290540037,"FR al, FR Website", BDF 

Error messages:

1. While trying 
df = pd.read_json("C:\\Users\\viv\\Downloads\\2020_11_21-10_31_03_PM_v2.json", lines=True)
df.head()

    self._parse_no_numpy()
  File "D:\workspace\BillingDashboard\venv\lib\site-packages\pandas\io\json\_json.py", line 1093, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None
ValueError: Expected object or value


2. While running:

entitiesList = []
print("Started Reading JSON file which contains multiple JSON document")
with open("C:\\Users\\viv\\Downloads\\edited_file_b.json",'r') as f:
    for jsonObj in f:
        entitiesDict = json.loads(jsonObj)
        entitiesList.append(entitiesDict)




  File "D:/workspace/BillingDashboard/bsdf_json_csv_converter.py", line 12, in <module>
    entitiesDict = json.loads(jsonObj)
  File "C:\python37\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "C:\python37\lib\json\decoder.py", line 337, in decode
Started Reading JSON file which contains multiple JSON document
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\python37\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 2)

Process finished with exit code 1

Tags: python, json, parsing

Solution


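The problem is most likely the format of the JSON data. A sequence of objects separated by commas is not a single valid JSON document, and because each object spans several lines, parsing the file line by line (json.loads on each line, or pd.read_json with lines=True) cannot work either.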

For example, if the original JSON looks like this:

{"0": "mdm-898040540420",
    "1": {
        "dchannel": "FR al"
    },
    "2": {
        "dchannel": "FR Website"
    },
    "3": {
        "dcountry": "BDF"
    }
},
{
    "0": "mdm-846290540037",
    "1": {
        "dchannel": "FR alk"
    },
    "2": {
        "dchannel": "FR Website"
    },
    "3": {
        "dcountry": "BDF"
    }
}
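
To see why input in this shape is rejected, here is a minimal sketch that feeds a shortened version of it to the standard json module. The helper name try_parse and the shortened sample strings are illustrative assumptions, not part of the original question.

```python
import json


def try_parse(text):
    """Return None if text parses as JSON, else the decoder's error message."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as exc:
        return exc.msg


# Two objects separated by a comma: valid pieces, but not one JSON document.
whole_text = '{"0": "mdm-898040540420"},\n{"0": "mdm-846290540037"}'
print(try_parse(whole_text))

# A single line of a pretty-printed object is not valid JSON on its own.
first_line = "{\n"
print(try_parse(first_line))
```

The first call fails because the decoder finds more text after the first complete object; the second fails because one line of a multi-line object is only a fragment.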

You can try wrapping this data in {"test": []} and parsing it with json.loads(text) (I don't think the choice of parser matters in your case).

{"test":
    [{"0": "mdm-898040540420",
        "1": {
            "dchannel": "FR al"
        },
        "2": {
            "dchannel": "FR Website"
        },
        "3": {
            "dcountry": "BDF"
        }
    },
    {
        "0": "mdm-846290540037",
        "1": {
            "dchannel": "FR alk"
        },
        "2": {
            "dchannel": "FR Website"
        },
        "3": {
            "dcountry": "BDF"
        }
    }]
}
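
The wrapping can also be done in code rather than by editing the file by hand. The following is a minimal sketch under some assumptions: the text contains only JSON objects separated by commas, possibly with a trailing comma, and parse_comma_separated_objects is a hypothetical helper name. It wraps the text in a bare [] array instead of {"test": []}, which is a variation on the suggestion above; both are valid JSON.

```python
import json


def parse_comma_separated_objects(text):
    """Parse text of the form '{...}, {...}, ...' into a list of dicts."""
    # Drop surrounding whitespace and a trailing comma after the last object.
    body = text.strip().rstrip(",")
    # Enclosing the objects in [] turns the text into one valid JSON array.
    return json.loads("[" + body + "]")


sample = """
{
    "0": "mdm-898040540420",
    "1": {"dchannel": "FR al"}
},
{
    "0": "mdm-846290540037",
    "1": {"dchannel": "FR alk"}
},
"""

records = parse_comma_separated_objects(sample)
print(len(records), records[0]["0"])
```

For a real file, the text would come from reading the whole file at once (for example f.read()), not from iterating over its lines.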

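The following should work:

import json

# './jsonpath.json' holds the data after wrapping it in {"test": [...]}
with open('./jsonpath.json', 'r') as f:
    data = json.loads(f.read())
print(data)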
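
Once the data is parsed, the CSV described in the question could be produced roughly as follows. This is a sketch, not a definitive implementation: it assumes key "0" always holds the id and that every other value is a small dict containing either "dchannel" or "dcountry". The names flatten and to_csv are hypothetical helpers, and the inline JSON string stands in for the contents of the wrapped file.

```python
import csv
import io
import json


def flatten(record):
    """Collapse one parsed object into a row: id, joined dchannels, dcountry."""
    channels, countries = [], []
    for value in record.values():
        if isinstance(value, dict):
            if "dchannel" in value:
                channels.append(value["dchannel"])
            if "dcountry" in value:
                countries.append(value["dcountry"])
    return {
        "id": record.get("0", ""),
        "dchannel": ", ".join(channels),
        "dcountry": ", ".join(countries),
    }


def to_csv(records):
    """Render the flattened records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "dchannel", "dcountry"])
    writer.writeheader()
    for record in records:
        writer.writerow(flatten(record))
    return buffer.getvalue()


data = json.loads("""
{"test": [
    {"0": "mdm-898040540420",
     "1": {"dchannel": "FR al"},
     "2": {"dchannel": "FR Website"},
     "3": {"dcountry": "BDF"}},
    {"0": "mdm-846290540037",
     "1": {"dchannel": "FR alk"},
     "2": {"dchannel": "FR Website"},
     "3": {"dcountry": "BDF"}}
]}
""")

csv_text = to_csv(data["test"])
print(csv_text)
```

The csv module quotes the dchannel field automatically because the joined value contains a comma. To write to disk instead, the same writer can be given a file opened with newline="".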
