Loading a dictionary into binary - bad char in struct format

Problem description

I have a very large dictionary of the following shape:

{'rounding': (4, [(900, 1), (4406, 0), (5772, 1), (6210, 1)]), 'thee': (5, [(901, 1), (3452, 1), (3803, 1), (4178, 1), (5793, 1)]), 'hotdog': (13, [(902, 2), (902, 2), (996, 1), (1765, 1), (2602, 1), (3824, 1), (4701, 1), (4924, 1), (5544, 1), (5741, 1), (5984, 1), (6972, 1), (7236, 2), (7236, 2), (7469, 1)]), 'hotdogs': (9, [(902, 1), (1765, 2), (1765, 2), (4924, 0), (5110, 1), (5228, 1), (6883, 1), (7034, 1), (7236, 1), (8638, 1)]),} 

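It goes on like this for roughly 450,000 terms, so it is a very large dictionary. I need to write this object to a binary file. I have been following these resources:

The code that causes the error is (the whole program is a few thousand lines long):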

inverted_index = {word:(document_frequency[word], d[word]) for word in d}
json_dict = json.dumps(inverted_index)
struct.pack(json_dict)

This produces the error:

  File "take_7.py", line 179, in <module>
    main()
  File "take_7.py", line 176, in main
    driver(sys.argv[1])
  File "take_7.py", line 164, in driver
    struct.pack('i', json_dict)
struct.error: bad char in struct format
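For context, here is a minimal sketch of where this message most likely comes from. The first argument to struct.pack is always a format string, so passing the JSON text alone makes struct try to read '{' as a format character. The json_text value and the try_pack helper are hypothetical stand-ins, not part of the original program.

```python
import struct

# Hypothetical stand-in for the real json_dict string.
json_text = '{"thee": [5, [[901, 1]]]}'


def try_pack(*args):
    """Return the packed bytes, or the error message if packing fails."""
    try:
        return struct.pack(*args)
    except (struct.error, TypeError) as exc:
        return str(exc)


# The JSON text is taken as the format string; '{' is not a format code.
format_misuse = try_pack(json_text)

# A valid format, but 'i' expects an integer, not a string, so this fails too.
type_mismatch = try_pack('i', json_text)

print(format_misuse)
print(type_mismatch)
```

Either way, struct is meant for fixed-layout values such as integers and fixed-length byte fields, not for arbitrary text.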

I looked through the struct documentation and then tried:

inverted_index = {word:(document_frequency[word], d[word]) for word in d}
json_dict = json.dumps(inverted_index)
binary_file = struct.pack('s', bytes(json_dict, 'utf-8'))

which runs without an error. However:

print(binary_file) yields b'{'

print(struct.unpack('s', binary_file)) yields (b'{',)
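The single byte is expected behaviour: in a struct format, 's' with no count means '1s', so only the first byte is kept. The sketch below shows an explicit count and a length-prefixed variant. The sample_index data and the '>I' length prefix are illustrative assumptions, not something the original program uses.

```python
import json
import struct

# Hypothetical stand-in for the real 450,000-term index.
sample_index = {'thee': (5, [(901, 1), (3452, 1)])}
payload = json.dumps(sample_index).encode('utf-8')

# A bare 's' is the same as '1s': exactly one byte survives.
truncated = struct.pack('s', payload)

# An explicit count keeps the whole payload.
full = struct.pack(f'{len(payload)}s', payload)

# Length-prefixed variant: a 4-byte big-endian unsigned length, then the bytes,
# so the reader knows how many bytes to unpack.
framed = struct.pack(f'>I{len(payload)}s', len(payload), payload)
(length,) = struct.unpack_from('>I', framed)
(restored,) = struct.unpack_from(f'{length}s', framed, 4)

print(truncated)
print(restored == payload)
```

Since the payload is already a bytes object, struct adds little here beyond the length prefix; the bytes could be written to a file directly.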

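How can I convert my dict (as described above) to a binary file so that I can save it to disk and later read it back from disk for use?

Tags: python, python-3.x, dictionary, struct

Solution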
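One possible approach, offered as a sketch rather than the only answer, is to skip struct and use the standard pickle module, which writes an arbitrary Python object to a binary file and reads it back with tuples intact. The save_index and load_index helpers, the sample data and the file name are all hypothetical.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the real inverted_index.
sample_index = {
    'rounding': (4, [(900, 1), (4406, 0), (5772, 1), (6210, 1)]),
    'thee': (5, [(901, 1), (3452, 1), (3803, 1), (4178, 1), (5793, 1)]),
}


def save_index(index, path):
    """Serialize the index to a binary file."""
    with open(path, 'wb') as fh:
        pickle.dump(index, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_index(path):
    """Read the index back from disk."""
    with open(path, 'rb') as fh:
        return pickle.load(fh)


# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    index_path = os.path.join(tmp_dir, 'inverted_index.bin')
    save_index(sample_index, index_path)
    restored_index = load_index(index_path)

print(restored_index == sample_index)
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code. If the file has to be read by other languages, writing json.dumps(inverted_index).encode('utf-8') to a file opened in 'wb' mode is an alternative, with the caveat that tuples come back as lists.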

