What is the best way to read a structure from a binary file containing IP header fragments?

Question

During my computer networks lab, I had to read many binary files that contain packets in IPv4 format. Here is the IPv4 header format.

The following structure encapsulates all the essential parts of the IP header.

struct ip_header {
    uint8_t version;
    uint8_t header_length;
    uint8_t service_type;
    uint16_t total_length;
    uint16_t identification;
    uint8_t flags;
    uint16_t fragment_offset;
    uint8_t ttl;
    uint8_t protocol;
    uint16_t checksum;
    uint32_t src;
    uint32_t dest;
    /* other fields for options if needed */
};

One way to get the data from the binary file into a structured format is to read the file byte by byte and then convert each group of bytes into the corresponding field of the structure above. Reading the file itself is not an issue.

I want to know whether this is the only way to do it, or whether there is some nicer, more "magical" way to achieve the same thing. Also, I recently learned that endianness can cause problems when reading this kind of file, because the fields have different sizes.

Tags: c++, c

Answer


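If your IPv4 headers are stored in the same format as they arrive "on the wire" (which is the usual way to store them), with the source and destination addresses as the last fields in the header, this should do it: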

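#include <fstream>
#include <iostream>

#include <arpa/inet.h>  // ntohs, ntohl
#include <netinet/ip.h> // a common place to find an "iphdr" definition

// add a streaming operator for reading an iphdr
std::istream& operator>>(std::istream& is, iphdr& ip) {
    return is.read(reinterpret_cast<char*>(&ip), sizeof(iphdr));
}

// add a streaming operator for writing an iphdr
std::ostream& operator<<(std::ostream& os, const iphdr& ip) {
    return os.write(reinterpret_cast<const char*>(&ip), sizeof(iphdr));
}

int main() {
    // open in binary mode so no bytes are translated while reading
    std::ifstream ips("ipheaders", std::ios::binary);
    if(ips) {
        iphdr h;
        // note: this reads fixed 20-byte headers and assumes no IP options (ihl == 5)
        while(ips >> h) {
            // 8-bit fields are converted to unsigned so they print as numbers,
            // not characters; multi-byte fields are stored in network byte
            // order, so they go through ntohs/ntohl before printing
            std::cout << unsigned(h.version) << "\n"
                      << unsigned(h.ihl) << "\n"
                      << unsigned(h.tos) << "\n"
                      << ntohs(h.tot_len) << "\n"
                      << ntohs(h.id) << "\n"
                      << ntohs(h.frag_off) << "\n"
                      << unsigned(h.ttl) << "\n"
                      << unsigned(h.protocol) << "\n"
                      << ntohs(h.check) << "\n"
                      << ntohl(h.saddr) << "\n"
                      << ntohl(h.daddr) << "\n";
        }
    }
}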

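The first 4 bits in the physical header are always version, but as @Mirco shows, the endianness of the machine you compile the program on matters when you fiddle with bit fields. The first 4 bits that come over the network and are stored in the file are still version, and that remains true if you write the iphdr back to disk with the added operator<<. If you want to be portable, read and write the IP headers exactly as they have looked since IPv4 was invented.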
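If you would rather not depend on bit-field layout or on the host's byte order at all, the byte-by-byte approach from the question can be written once and reused. The following is only a minimal sketch: `parsed_ip_header`, `be16`, `be32`, `parse_ip_header` and `read_ip_header` are names made up for this example, and it assumes the file holds plain 20-byte headers in network byte order.

```cpp
#include <cstdint>
#include <istream>

// Hypothetical struct for this sketch: all fields are held in host byte order.
struct parsed_ip_header {
    uint8_t  version;
    uint8_t  header_length;   // in 32-bit words
    uint8_t  service_type;
    uint16_t total_length;
    uint16_t identification;
    uint8_t  flags;           // upper 3 bits of byte 6
    uint16_t fragment_offset; // lower 13 bits of bytes 6-7
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t checksum;
    uint32_t src;
    uint32_t dest;
};

// Combine big-endian (network order) bytes into host integers.
inline uint16_t be16(const unsigned char* p) {
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

inline uint32_t be32(const unsigned char* p) {
    return (static_cast<uint32_t>(p[0]) << 24) |
           (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8) |
            static_cast<uint32_t>(p[3]);
}

// Decode the fixed 20-byte part of an IPv4 header from a raw buffer.
parsed_ip_header parse_ip_header(const unsigned char* b) {
    parsed_ip_header h{};
    h.version         = static_cast<uint8_t>(b[0] >> 4);
    h.header_length   = static_cast<uint8_t>(b[0] & 0x0F);
    h.service_type    = b[1];
    h.total_length    = be16(b + 2);
    h.identification  = be16(b + 4);
    h.flags           = static_cast<uint8_t>(b[6] >> 5);
    h.fragment_offset = static_cast<uint16_t>(be16(b + 6) & 0x1FFF);
    h.ttl             = b[8];
    h.protocol        = b[9];
    h.checksum        = be16(b + 10);
    h.src             = be32(b + 12);
    h.dest            = be32(b + 16);
    return h;
}

// Read one fixed-size header from a stream opened in binary mode.
// If header_length > 5, the caller still has to skip the option bytes.
bool read_ip_header(std::istream& is, parsed_ip_header& out) {
    unsigned char raw[20];
    if (!is.read(reinterpret_cast<char*>(raw), sizeof raw)) return false;
    out = parse_ip_header(raw);
    return true;
}
```

Because every multi-byte value is assembled with shifts, the result should be the same on little-endian and big-endian hosts, at the cost of a copy per header.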

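Luckily, the layout of the IP header matches the alignment requirements of the fundamental data types on most systems. If you find a system on which you can't create an IP header struct that matches the raw data, you most likely won't find netinet/ip.h there either, but if you are still worried about it, you can add compile-time checks: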

    static_assert(alignof(uint8_t) == 1);
    static_assert(alignof(uint16_t) == 2);
    static_assert(alignof(uint32_t) == 4);
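
The alignment checks say nothing directly about the struct itself, so a stricter variant is to assert its size and field offsets. The sketch below does this for a hand-written struct, `wire_ip_header`, which is a hypothetical mirror of the fixed 20-byte header made up for this example (the version and IHL nibbles are kept in one byte to avoid bit fields); the same idea should carry over to the `iphdr` from your system header.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the fixed IPv4 header as it appears on the wire.
// Multi-byte fields would still hold network byte order after a raw read.
struct wire_ip_header {
    uint8_t  version_ihl; // version in the high nibble, IHL in the low nibble
    uint8_t  tos;
    uint16_t tot_len;
    uint16_t id;
    uint16_t frag_off;    // 3 flag bits followed by a 13-bit offset
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t check;
    uint32_t saddr;
    uint32_t daddr;
};

// Compilation fails if the compiler inserted padding anywhere.
static_assert(sizeof(wire_ip_header) == 20, "unexpected padding in header");
static_assert(offsetof(wire_ip_header, tot_len) == 2, "tot_len misplaced");
static_assert(offsetof(wire_ip_header, id) == 4, "id misplaced");
static_assert(offsetof(wire_ip_header, frag_off) == 6, "frag_off misplaced");
static_assert(offsetof(wire_ip_header, ttl) == 8, "ttl misplaced");
static_assert(offsetof(wire_ip_header, check) == 10, "check misplaced");
static_assert(offsetof(wire_ip_header, saddr) == 12, "saddr misplaced");
static_assert(offsetof(wire_ip_header, daddr) == 16, "daddr misplaced");
```

If any of these assertions fails on your platform, reading the struct directly from the file is not safe there and a field-by-field decode is the safer route.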
