Building a weighted histogram from two binary files

Problem description

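I have two binary files that I need to iterate over simultaneously, so that each value read from one file corresponds to the value at the same position in the other file. I sort the values from the first file into histogram bins, and the values from the second file serve as the weights of those values.

I tried the following code: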

import math  # needed for math.ceil / math.sqrt below
import numpy as np
import struct
import matplotlib.pyplot as plt

low = np.inf
high = -np.inf

struct_fmt = 'f'
struct_len = struct.calcsize(struct_fmt)
struct_unpack = struct.Struct(struct_fmt).unpack_from

file = "/projects/current/real-core-snaps/core4_256_velx_0009.bin"
file2 = "/projects/current/real-core-snaps/core4_256_dens_0009.bin"

def read_chunks(f, length):
    while True:
        data = f.read(length)
        if not data: break
        yield data

loop = 0

with open(file,"rb") as f:
    for chunk in read_chunks(f, struct_len):   
        x = struct_unpack(chunk)
        low = np.minimum(x, low)
        high = np.maximum(x, high)
        loop += 1

nbins = math.ceil(math.sqrt(loop)) 

bin_edges = np.linspace(low, high, nbins + 1)
total = np.zeros(nbins, np.int64)


f = open(file,"rb")
f2 = open(file2,"rb")

for chunk1,chunk2 in zip(read_chunks(f, struct_len),read_chunks(f2, struct_len)):
    subtotal,e = np.histogram(struct_unpack(chunk1),bins=bin_edges,weights=struct_unpack(chunk2))
    total = np.add(total,subtotal,out=total,casting="unsafe")

plt.hist(bin_edges[:-1], bins=bin_edges, weights=total)
plt.savefig('hist-veldens.svg')

But the resulting histogram is nonsensical. What exactly am I doing wrong?

The data files are available at https://drive.google.com/file/d/1fhia2CGzl_aRX9Q9Ng61W-4XJGQe1OCV/view?usp=sharing and https://drive.google.com/file/d/1CrhQjyG2axSFgK9LGytELbxjy3Ndon1S/view?usp=sharing

Tags: python, numpy, histogram

Solution


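The error is that total = np.zeros(nbins, np.int64) gives every element of the array total an integer type. In a weighted histogram, subtotal does not hold counts but sums of floating-point weights, so total should also be of type float. With an integer total and casting="unsafe", the fractional part of each partial sum is silently truncated when it is written back. Declaring the accumulator as total = np.zeros(nbins, np.float64) fixes this.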
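The effect described above can be reproduced without the original data files. The following is only a minimal sketch: the helper name accumulate_weighted_hist and the synthetic values and weights are made up for illustration and stand in for the velocity and density data. It accumulates one-element weighted histograms, as the question's loop does, once into an int64 array and once into a float64 array.

```python
import numpy as np

def accumulate_weighted_hist(values, weights, bin_edges, dtype, chunk=1):
    """Sum per-chunk weighted histograms into an accumulator of the given dtype.

    Hypothetical helper mirroring the loop in the question.
    """
    total = np.zeros(len(bin_edges) - 1, dtype)
    for start in range(0, len(values), chunk):
        subtotal, _ = np.histogram(values[start:start + chunk],
                                   bins=bin_edges,
                                   weights=weights[start:start + chunk])
        # With an integer `total`, casting="unsafe" truncates the float sum
        # back to an integer on every iteration.
        np.add(total, subtotal, out=total, casting="unsafe")
    return total

# Synthetic stand-ins for the two files (assumed data, not the real snapshots).
rng = np.random.default_rng(0)
values = rng.normal(size=1000)
weights = rng.uniform(0.0, 0.9, size=1000)  # every weight is below 1

bin_edges = np.linspace(values.min(), values.max(), 11)

# Reference result computed in a single call.
reference, _ = np.histogram(values, bins=bin_edges, weights=weights)

int_total = accumulate_weighted_hist(values, weights, bin_edges, np.int64)
float_total = accumulate_weighted_hist(values, weights, bin_edges, np.float64)

print("int64 accumulator:  ", int_total)
print("float64 accumulator:", float_total)
print("matches reference:  ", np.allclose(float_total, reference))
```

Because every weight in this sketch is below 1, each partial sum written into the int64 accumulator is truncated to 0, so that histogram stays empty, while the float64 accumulator agrees with the single-call reference. With real data the loss is less extreme but still distorts the result.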

