What do I need to do to make the Boost.Beast HTTP parser find the end of the body?

Question

I am trying to use boost::beast::http::parser. My parser is defined like this:

boost::beast::http::parser<false, boost::beast::http::string_body> response_parser;

The callback for the asynchronous read looks like this:

void AsyncHttpsRequest::on_response_read(const boost::system::error_code &error_code, uint32_t bytes_transferred)
{
    if (bytes_transferred > 0)
    {
        response_parser.put(boost::asio::buffer(data_buffer, bytes_transferred), http_error_code);
        std::cout << "Parser status: " << http_error_code.message() << std::endl;
        std::cout << "Read " << bytes_transferred << " bytes of HTTPS response" << std::endl;
        std::cout << std::string(data_buffer, bytes_transferred) << std::endl;
    }
    if (error_code)
    {
        std::cout << "Error during HTTPS response read: " << error_code.message() << std::endl;
        callback(error_code, response_parser.get());
    }
    else
    {
        if (response_parser.is_done())
        {
            callback(error_code, response_parser.get());
        }
        else
        {
            std::cout << "Response is not yet finished, reading more" << std::endl;
            read_response();
        }
    }
}

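Everything works when the response has no body: response_parser.is_done() returns true. However, when the response contains a body, it always returns false, even after the body has been read completely. The response also has a Content-Length header that matches the number of bytes in the body, so that is not the problem.

The Boost documentation says that if the semantics of the message indicate a body is expected, and the entire body has been parsed, response_parser.is_done() should return true.

When I send the request with Connection: keep-alive, I get stuck reading the response, because the server has nothing more to send and response_parser is not done yet. When I use Connection: close, my completion callback is called, but the parsed boost::beast::http::message has no body in it. However, my logging to standard output shows that a body is present and was read completely.

What do I need to do to make boost::beast::http::parser recognize the end of the body and return true from is_done() when the number of bytes read from the body equals Content-Length?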

Tags: c++, http, boost, boost-beast

Answer

Your expectation is right.

Background, details, and caveats:

You can observe that it does work:

Live On Coliru

#include <boost/asio/buffer.hpp>
#include <boost/beast/http.hpp>
#include <algorithm>
#include <iostream>
#include <iomanip>
#include <random>
#include <string>
#include <string_view>
using boost::system::error_code;
namespace http = boost::beast::http;

int main() {
    std::mt19937 prng { std::random_device{}() };
    std::uniform_int_distribution<size_t> packet_size { 1, 372 };

    std::string const response = 
"HTTP/1.1 200 OK\r\n"
"Age: 207498\r\n"
"Cache-Control: max-age=604800\r\n"
"Content-Type: text/html; charset=UTF-8\r\n"
"Date: Sat, 20 Mar 2021 23:24:40 GMT\r\n"
"Etag: \"3147526947+ident\"\r\n"
"Expires: Sat, 27 Mar 2021 23:24:40 GMT\r\n"
"Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT\r\n"
"Server: ECS (bsa/EB15)\r\n"
"Vary: Accept-Encoding\r\n"
"X-Cache: HIT\r\n"
"Content-Length: 1256\r\n"
"\r\n"
"<!doctype html>\n<html>\n<head>\n    <title>Example Domain</title>\n\n    <meta charset=\"utf-8\" />\n    <meta http-equiv=\"Content-type\" content=\"text/html; charset=utf-8\" />\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\" />\n    <style type=\"text/css\">\n    body {\n        background-color: #f0f0f2;\n        margin: 0;\n        padding: 0;\n        font-family: -apple-system, system-ui, BlinkMacSystemFont, \"Segoe UI\", \"Open Sans\", \"Helvetica Neue\", Helvetica, Arial, sans-serif;\n        \n    }\n    div {\n        width: 600px;\n        margin: 5em auto;\n        padding: 2em;\n        background-color: #fdfdff;\n        border-radius: 0.5em;\n        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);\n    }\n    a:link, a:visited {\n        color: #38488f;\n        text-decoration: none;\n    }\n    @media (max-width: 700px) {\n        div {\n            margin: 0 auto;\n            width: auto;\n        }\n    }\n    </style>    \n</head>\n\n<body>\n<div>\n    <h1>Example Domain</h1>\n    <p>This domain is for use in illustrative examples in documents. You may use this\n    domain in literature without prior coordination or asking for permission.</p>\n    <p><a href=\"https://www.iana.org/domains/example\">More information...</a></p>\n</div>\n</body>\n</html>\n";

    std::string const input = response + response;
    std::string_view emulated_stream = input;

    error_code ec;
    while (not emulated_stream.empty()) {
        std::cout << "== Emulated stream of " << emulated_stream.size()
                  << " remaining" << std::endl;

        http::parser<false, http::string_body> response_parser;

        while (not (ec or response_parser.is_done() or emulated_stream.empty())) {
            auto next     = std::min(packet_size(prng), emulated_stream.size());
            auto consumed = response_parser.put(
                boost::asio::buffer(emulated_stream.data(), next), ec);

            std::cout << "Consumed " << consumed << std::boolalpha
                      << "\tHeaders done:" << response_parser.is_header_done()
                      << "\tDone:" << response_parser.is_done()
                      << "\tChunked:" << response_parser.chunked()
                      << "\t" << ec.message() << std::endl;

            if (ec == http::error::need_more)
                ec.clear();

            emulated_stream.remove_prefix(consumed);
        }

        auto res = response_parser.release();

        std::cout << "== Content length " << res["Content-Length"] << " and body "
                  << res.body().length() << std::endl;
        std::cout << "== Headers: " << res.base() << std::endl;
    }

    std::cout << "== Stream depleted " << ec.message() << std::endl;
}

Prints e.g.

== Emulated stream of 3182 remaining
Consumed 101    Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 234    Headers done:true   Done:false  Chunked:false   Success
Consumed 305    Headers done:true   Done:false  Chunked:false   Success
Consumed 326    Headers done:true   Done:false  Chunked:false   Success
Consumed 265    Headers done:true   Done:false  Chunked:false   Success
Consumed 216    Headers done:true   Done:false  Chunked:false   Success
Consumed 144    Headers done:true   Done:true   Chunked:false   Success
== Content length 1256 and body 1256
== Headers: HTTP/1.1 200 OK
Age: 207498
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Sat, 20 Mar 2021 23:24:40 GMT
Etag: "3147526947+ident"
Expires: Sat, 27 Mar 2021 23:24:40 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (bsa/EB15)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1256

== Emulated stream of 1591 remaining
Consumed 204    Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 0  Headers done:false  Done:false  Chunked:false   need more
Consumed 131    Headers done:true   Done:false  Chunked:false   Success
Consumed 355    Headers done:true   Done:false  Chunked:false   Success
Consumed 137    Headers done:true   Done:false  Chunked:false   Success
Consumed 139    Headers done:true   Done:false  Chunked:false   Success
Consumed 89 Headers done:true   Done:false  Chunked:false   Success
Consumed 87 Headers done:true   Done:false  Chunked:false   Success
Consumed 66 Headers done:true   Done:false  Chunked:false   Success
Consumed 355    Headers done:true   Done:false  Chunked:false   Success
Consumed 28 Headers done:true   Done:true   Chunked:false   Success
== Content length 1256 and body 1256
== Headers: HTTP/1.1 200 OK
Age: 207498
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Sat, 20 Mar 2021 23:24:40 GMT
Etag: "3147526947+ident"
Expires: Sat, 27 Mar 2021 23:24:40 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (bsa/EB15)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1256

== Stream depleted Success

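Perhaps

  • your stream content is not actually valid HTTP

  • your response does not have a Content-Length header at all. In that case need_eof() returns true once header parsing is complete. From the documentation:

    Depending on the contents of the header, the parser may require an end-of-file notification to know where the end of the body lies. If this function returns true, it will be necessary to call put_eof when there will never be additional data from the input.

  • your packets are too small. You can see this effect if you reduce the packet size distribution to an extreme: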

     std::uniform_int_distribution<size_t> packet_size { 1, 3 };
    

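    This results in nothing ever being consumed. From the documentation:

    In some cases there may be an insufficient number of octets in the input buffer in order to make forward progress. This is indicated by the code error::need_more. When this happens, the caller should place additional bytes into the buffer sequence and call put again. The error code error::need_more is special. When this error is returned, a subsequent call to put may succeed if the buffers have been updated.

    In your real code you would not keep retrying with the same small amounts, because the buffer would simply accumulate and eventually satisfy the requirements for making progress.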

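BONUS: Simplify!

The good news is that you usually do not need anything this complicated. In most cases you can use http::read or http::async_read to read directly into a response object.

This does the whole dance with the parser under the hood, without you having to worry about the details:

Live On Coliru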

boost::beast::flat_buffer buf;
boost::system::error_code ec;
for (http::response<http::string_body> res; !ec && read(pipe, buf, res, ec); res.clear()) {
    std::cout << "== Content length " << res["Content-Length"] << " and body "
              << res.body().length() << std::endl;
    std::cout << "== Headers: " << res.base() << std::endl;
}

std::cout << "== Stream depleted " << ec.message() << "\n" << std::endl;

That's it. It still prints:

== Content length 1256 and body 1256
== Headers: HTTP/1.1 200 OK
Age: 207498
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Sat, 20 Mar 2021 23:24:40 GMT
Etag: "3147526947+ident"
Expires: Sat, 27 Mar 2021 23:24:40 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (bsa/EB15)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1256

== Content length 1256 and body 2512
== Headers: HTTP/1.1 200 OK
Age: 207498
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Sat, 20 Mar 2021 23:24:40 GMT
Etag: "3147526947+ident"
Expires: Sat, 27 Mar 2021 23:24:40 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (bsa/EB15)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1256

== Stream depleted end of stream
