Parsing two JSON files

Question

I have two JSON files with exactly the same structure; only their values differ. This is the code I use to parse one file and save the data to a CSV file:

#!/usr/bin/env python3

import json

filename_json = "/root/test.json"

with open(filename_json, "r") as read_file:
    data = json.load(read_file)
    file_csv = open("/tmp/tmp.csv", "w")
    file_csv.write("Source IP;Source Port;Destination IP;Destination Port \n")

    for file in data:
        destination_ip = file["destination_ip"]
        source_ip = file["source_ip"]
        source_port = file["source_port"]
        destination_port = file["destination_port"]
        source_port = str(source_port)
        destination_port = str(destination_port)
        
        inference = (source_ip + ";" + source_port + ";" + destination_ip + ";" +  destination_port + "\n")

        file_csv.write(inference)
    file_csv.close()

...and I have another JSON file of the same kind, so this parsing works for it as well.

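Could you tell me how to make sure that the first result from the first JSON file is written to the CSV file (after the first row with the column names, of course), and the first result from the second JSON file is written to the second line of the CSV file? In general, the resulting CSV file should look like this: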

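Source IP     Source Port  Destination IP  Destination Port  (no such column; just a description showing which JSON file each row comes from and how the rows should be written)
192.168.1.1   25           192.168.1.2     25                first parsed block from the first JSON file
192.168.1.2   25           192.168.1.1     25                first parsed block from the second JSON file
192.168.1.9   21           192.168.1.8     21                second parsed block from the first JSON file
192.168.1.8   21           192.168.1.9     21                second parsed block from the second JSON file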

And so on, in this order. Thank you very much!

Tags: python-3.x

Answer


You can have a look at the documentation on reading and writing CSV files: https://docs.python.org/3/library/csv.html

For example:

import json
import csv


j1 = json.loads('[{"c1": "bla", "c2": 5}, {"c1": "bla1", "c2": 6}, {"c1": "bla2", "c2": 7}]')
j2 = json.loads('[{"c1": "blub", "c2": 1}, {"c1": "blub", "c2": 2}, {"c1": "blub", "c2": 3}]')


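# newline="" is what the csv documentation recommends for files passed to csv.writer
with open("./yourfile.csv", "w", newline="", encoding="utf-8") as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(["c1", "c2"])  # header row
    # Alternate rows: entry i of the first list, then entry i of the second
    for i in range(len(j1)):
        csv_writer.writerow([j1[i]['c1'], j1[i]['c2']])
        csv_writer.writerow([j2[i]['c1'], j2[i]['c2']])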
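
Applied to the data from the question, the same idea could look roughly like the sketch below. It assumes both files contain a list of objects with the keys source_ip, source_port, destination_ip and destination_port, as in the question's code, and that both lists have the same length. The function name interleave_json_to_csv and the second input path in the usage comment are made up for illustration.

```python
import csv
import json

# Keys taken from the question's code; the header text mirrors its CSV header.
FIELDS = ["source_ip", "source_port", "destination_ip", "destination_port"]
HEADER = ["Source IP", "Source Port", "Destination IP", "Destination Port"]


def interleave_json_to_csv(json_path_1, json_path_2, csv_path):
    """Write records from two JSON files to one CSV, alternating between them."""
    with open(json_path_1, "r", encoding="utf-8") as f1:
        records_1 = json.load(f1)
    with open(json_path_2, "r", encoding="utf-8") as f2:
        records_2 = json.load(f2)

    # newline="" is recommended by the csv docs for files passed to csv.writer.
    with open(csv_path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file, delimiter=";")
        writer.writerow(HEADER)
        # zip stops at the shorter list, so equal lengths are assumed here.
        for rec_1, rec_2 in zip(records_1, records_2):
            writer.writerow([rec_1[key] for key in FIELDS])
            writer.writerow([rec_2[key] for key in FIELDS])


# Hypothetical usage (the second path is an assumption, not from the question):
# interleave_json_to_csv("/root/test.json", "/root/test2.json", "/tmp/tmp.csv")
```

Passing delimiter=";" keeps the semicolon-separated layout of the question's original output, and csv.writer converts the integer ports to text on its own, so the explicit str() calls are no longer needed.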

If the JSON files have different lengths, you can use the longer one for the range call and handle the missing entries inside the loop:

import json
import csv


j1 = json.loads('[{"c1": "bla", "c2": 5}, {"c1": "bla1", "c2": 6}, {"c1": "bla2", "c2": 7}]')
j2 = json.loads('[{"c1": "blub", "c2": 1}, {"c1": "blub", "c2": 2}, {"c1": "blub", "c2": 3}, {"c1": "blub", "c2": 9}]')


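# newline="" is what the csv documentation recommends for files passed to csv.writer
with open("./yourfile.csv", "w", newline="", encoding="utf-8") as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(["c1", "c2"])  # header row
    # Loop up to the length of the longer list; the shorter one raises IndexError once exhausted
    for i in range(max(len(j1), len(j2))):
        try:
            csv_writer.writerow([j1[i]['c1'], j1[i]['c2']])
        except IndexError:
            print('First json file out of data')
        try:
            csv_writer.writerow([j2[i]['c1'], j2[i]['c2']])
        except IndexError:
            print('Second json file out of data')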
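
As an alternative to catching IndexError, the standard library's itertools.zip_longest can pair the entries and pad the shorter list. The following is only a sketch of that variant, not the method used above; the helper name interleave and the sentinel object are made up for illustration.

```python
from itertools import zip_longest


def interleave(first, second):
    """Return items alternating between the two lists, keeping any leftovers."""
    missing = object()  # sentinel, so that None remains a legal list item
    result = []
    for a, b in zip_longest(first, second, fillvalue=missing):
        if a is not missing:
            result.append(a)
        if b is not missing:
            result.append(b)
    return result


j1 = [{"c1": "bla", "c2": 5}, {"c1": "bla1", "c2": 6}]
j2 = [{"c1": "blub", "c2": 1}, {"c1": "blub", "c2": 2}, {"c1": "blub", "c2": 3}]

# Each inner list is one CSV row, ready for csv_writer.writerows(rows).
rows = [[item["c1"], item["c2"]] for item in interleave(j1, j2)]
# rows == [["bla", 5], ["blub", 1], ["bla1", 6], ["blub", 2], ["blub", 3]]
```

The leftover entries of the longer list simply follow at the end, which matches what the try/except version writes, only without the printed messages.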

Another approach, using pandas, is to manipulate the indexes of the input data and sort them after concatenating:

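from io import StringIO

import numpy as np
import pandas as pd

# Recent pandas versions deprecate passing a literal JSON string to read_json, so wrap it in StringIO
df1 = pd.read_json(StringIO('[{"c1": "bla", "c2": 1}, {"c1": "bla1", "c2": 3}, {"c1": "bla2", "c2": 5}, {"c1": "bla3", "c2": 7}]'))
df2 = pd.read_json(StringIO('[{"c1": "blub", "c2": 2}, {"c1": "blub1", "c2": 4}, {"c1": "blub2", "c2": 6}, {"c1": "blub3", "c2": 8}, {"c1": "blub4", "c2": 10}]'))

# Give the first frame the even index values and the second frame the odd ones
df1.index = np.arange(0, len(df1) * 2, 2)
df2.index = np.arange(1, len(df2) * 2, 2)
# Sorting by index after concatenation interleaves the rows
output = pd.concat([df1, df2]).sort_index()

output.to_csv('output.csv', index=False)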
