Converting a Valve data structure to JSON

Question

I have a data structure that looks very similar to JSON:

"items"
{
    "first"
    {
        "a"     "1"
        "b"     "2"
        "c"     "3"
        "d"     "4"
        "e"     "5"
    }
    "second"
    {
        "f"     "6"
        "g"     "7"
        "h"     "8"
        "i"     "9"
        "j"     "10"
    }
}

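The problem is that this format does not work with a JSON parser. Is there a way in Python to convert it to JSON so I can work with my data? I tried json.loads(json.dumps(data)), but that does not work. When I try to look up data in it, e.g. jsonObj['items'], I get TypeError: string indices must be integers

What I am aiming for is something like this: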

"items" :
{
    "first" :
    {
        "a" : "1",
        "b" : "2",
        "c" : "3",
        "d" : "4",
        "e" : "5"
    },
    "second" :
    {
        "f" : "6",
        "g" : "7",
        "h" : "8",
        "i" : "9",
        "j" : "10"
    }
}

Tags: python, json

Answer
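
Some context on the error in the question: json.dumps applied to a string produces a JSON string literal, and json.loads turns that literal back into the same Python str, so the Valve structure is never actually parsed. Indexing a str with 'items' is what raises the TypeError. A minimal sketch, assuming data held the raw text (the sample value below is illustrative, not taken from the asker's program):

```python
import json

# Assumption: `data` is the raw Valve-style text held in a Python str.
data = '"items"\n{\n    "first"\n    {\n        "a"     "1"\n    }\n}\n'

round_tripped = json.loads(json.dumps(data))

# The round trip only encodes and decodes the string; the result is still a str.
print(type(round_tripped).__name__)
print(round_tripped == data)

try:
    round_tripped["items"]
except TypeError as exc:
    # The exact wording of the message differs between Python versions.
    print("TypeError:", exc)
```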


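Happily, the atoms used by that format are similar enough to Python's that we can use the tokenize and ast modules for an ad-hoc parser.

It will probably break badly on malformed input, but it works for your example data :)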

import tokenize
import token
import ast
import io
import json


def parse_valve_format(data):
    dest = {}
    stack = [dest]
    for tok in tokenize.tokenize(io.BytesIO(data.encode()).readline):
        if tok.type == token.STRING:
            ts = ast.literal_eval(tok.string)
            if isinstance(stack[-1], str):
                # already a string on the stack?
                # this has to be a key-value setting
                key = stack.pop(-1)
                stack[-1][key] = ts
            else:
                # otherwise treat it as a key; a value or a { should follow
                stack.append(ts)
        elif tok.type == token.OP and tok.string == "{":
            obj = {}
            key = stack.pop(-1)
            stack[-1][key] = obj
            stack.append(obj)
        elif tok.type == token.OP and tok.string == "}":
            assert isinstance(stack[-1], dict), "stray }"
            stack.pop(-1)
    return dest


result_dict = parse_valve_format(
    """
"items"
{
    "first"
    {
        "a"     "1"
        "b"     "2"
        "c"     "3"
        "d"     "4"
        "e"     "5"
    }
    "second"
    {
        "f"     "6"
        "g"     "7"
        "h"     "8"
        "i"     "9"
        "j"     "10"
    }
}
"""
)

print(json.dumps(result_dict, indent=2))

Output:

{
  "items": {
    "first": {
      "a": "1",
      "b": "2",
      "c": "3",
      "d": "4",
      "e": "5"
    },
    "second": {
      "f": "6",
      "g": "7",
      "h": "8",
      "i": "9",
      "j": "10"
    }
  }
}
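
If borrowing Python's tokenizer feels too fragile, the same stack idea can be sketched with a small regular expression instead. This is an alternative under stated assumptions, not the method above: parse_valve_text and TOKEN_RE are hypothetical names, every key and value is assumed to be double-quoted, and // line comments (which Valve's KeyValues files are commonly described as allowing) are skipped. Unquoted tokens are silently ignored and escape sequences are left as written.

```python
import json
import re

# Hypothetical helper, not part of the answer above. A token is a quoted
# string, a brace, or a // comment running to the end of the line.
TOKEN_RE = re.compile(r'"((?:[^"\\]|\\.)*)"|([{}])|//[^\n]*')


def parse_valve_text(text):
    root = {}
    stack = [root]      # dicts currently open, innermost last
    pending_key = None  # a key still waiting for its value or its "{"
    for match in TOKEN_RE.finditer(text):
        string, brace = match.group(1), match.group(2)
        if string is not None:
            if pending_key is None:
                pending_key = string
            else:
                stack[-1][pending_key] = string
                pending_key = None
        elif brace == "{":
            if pending_key is None:
                raise ValueError("'{' without a preceding key")
            child = {}
            stack[-1][pending_key] = child
            stack.append(child)
            pending_key = None
        elif brace == "}":
            if pending_key is not None or len(stack) == 1:
                raise ValueError("unexpected '}'")
            stack.pop()
        # Comments match the pattern but set neither group, so they are skipped.
    if pending_key is not None or len(stack) != 1:
        raise ValueError("unexpected end of input")
    return root


sample = '''
"items"
{
    // a comment line, assumed to be legal in this format
    "first"
    {
        "a"     "1"
    }
}
'''
print(json.dumps(parse_valve_text(sample), indent=2))
```

Unlike the assert in the tokenize version, this sketch raises ValueError on unbalanced braces or a dangling key, so malformed input fails loudly rather than producing a partial dict.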
