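c++ - libcurl does not download image files correctly
Problem description
I have created this very basic curl wrapper and can use it to download HTML pages, but I run into a problem when I try to fetch an image (I have not tried other file types).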
class BasicCurlWrapper
{
    CURL* m_curlHandle{ nullptr };
    std::string m_current_url{};
    std::string m_destinationFilePath{};
    std::ofstream m_outputFile{};
    std::ios_base::openmode m_fileOpenMode{ std::ios::out };
    bool m_verbose{ false };
public:
    BasicCurlWrapper()
    {
        m_curlHandle = curl_easy_init();
    }
    ~BasicCurlWrapper()
    {
        curl_easy_cleanup(m_curlHandle);
        //curl_global_cleanup();
    }
    void downloadUrl(const std::string& url, const std::string& destination, std::ios_base::openmode openmode = std::ios::out)
    {
        if (m_outputFile.is_open()) {
            m_outputFile.close();
        }
        m_current_url = url;
        m_destinationFilePath = destination;
        m_fileOpenMode = openmode;
        char errbuf[CURL_ERROR_SIZE] = { 0 };
        curl_easy_setopt(m_curlHandle, CURLOPT_URL, url.data());
        curl_easy_setopt(m_curlHandle, CURLOPT_VERBOSE, m_verbose ? 1L : 0L); //Switch on full protocol/debug output while testing
        curl_easy_setopt(m_curlHandle, CURLOPT_NOPROGRESS, 1L); //disable progress meter, set to 0L to enable it
        curl_easy_setopt(m_curlHandle, CURLOPT_FOLLOWLOCATION, 1L);
        curl_easy_setopt(m_curlHandle, CURLOPT_USERAGENT, "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.89 Safari/537.36");
        curl_easy_setopt(m_curlHandle, CURLOPT_WRITEFUNCTION, BasicCurlWrapper::write_data);
        curl_easy_setopt(m_curlHandle, CURLOPT_WRITEDATA, this);
        curl_easy_setopt(m_curlHandle, CURLOPT_FAILONERROR, 1L);
        curl_easy_setopt(m_curlHandle, CURLOPT_ERRORBUFFER, errbuf);
        //curl_easy_setopt(m_curlHandle, CURLOPT_ACCEPT_ENCODING, "");
        //curl_easy_setopt(m_curlHandle, CURLOPT_SSLCERT, "C:/msys64/mingw64/ssl/certs/ca-bundle.crt");
        auto res = curl_easy_perform(m_curlHandle);
        if (m_outputFile.is_open()) {
            m_outputFile.close();
        }
        if (res == CURLE_OK) {
            std::cout << "Downloaded file\n";
        } else {
            std::cout << "ERROR: " << curl_easy_strerror(res) << '\n' << errbuf << '\n';
        }
    }
    void setVerbose(bool cond)
    {
        m_verbose = cond;
    }
    //https://curl.haxx.se/mail/lib-2008-09/0250.html
    static std::size_t write_data(const char* ptr, const std::size_t size, const std::size_t nmemb, void* classIntance)
    {
        if (nmemb > 0) {
            static_cast<BasicCurlWrapper*>(classIntance)->writeToFile(ptr, nmemb);
        }
        return nmemb;
    }
private:
    void writeToFile(const char* ptr, const std::size_t nmemb)
    {
        if (!m_outputFile.is_open()) {
            m_outputFile.open(m_destinationFilePath, m_fileOpenMode);
        }
        if (m_outputFile.is_open()) {
            std::cout << "Writing data amount: " << nmemb << '\n';
            m_outputFile.write(ptr, nmemb);
        } else {
            auto errorMsg{ std::string{"Unable to open file: " + m_destinationFilePath } };
            throw std::runtime_error{ errorMsg };
        }
    }
};
I use it like this:
BasicCurlWrapper cr;
cr.setVerbose(true);
cr.downloadUrl("https://icons.iconarchive.com/icons/google/noto-emoji-activities/512/52730-soccer-ball-icon.png", "ball.png", std::ios::out | std::ios::binary);
This does download something:
‰PNG
¾M&S»Á€>öÝÀKþ駟ªC²²²Ð½{wÕ5–-[†…*7Þx½zõ¢C˜ž––L›6
555ŠÛŽ1þ³ºÂr'Å·Íê>ð^ùpAmèÀŽãœ.—«–@èEÀŒ±yJÛ)©éâàÔóÚÄ™ÄA]]¦NŠ¦æfÅ÷uÍ5Tò—+Ö[‡¾òŠªúÕ×^CvŸ>gtò'É·ý›œü¹QYñÇÝér¹þmöçpÁð^¯w€AJÛFâR€–tîܹ=Ï cä`íÚµX»vâëÙív,X°€ªþa…$I¸ë®»T•¾ðÂqß}÷µÏàÛÖä:„ŠŠ
Šbª$€Ðÿ.
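Although it starts with PNG, this is not a valid PNG, and the original file is 39 KB. Do I have to send some extra headers or something? I want to be able to download any file I specify.
I used vcpkg to get libcurl: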
curl:x64-windows 7.68.0
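Edit:
I have updated the code to reflect the answer from @Some programmer dude: I now use write to output the data to the file. This fixed the example image I was using.
The problem I have now is with another image that I am trying to download.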
cr.downloadUrl("https://v217.mangabeast.com/manga/Onepunch-Man/0130-007.png", "image.png", std::ios::out | std::ios::binary);
The file image.png now contains the following text:
error code: 1010
I can download this image simply by using the following command:
curl -O <url>
I do not pass anything extra to the curl command, so what do I need to pass in libcurl?
Here is the output of the request:
* STATE: INIT => CONNECT handle 0x24781b66728; line 1605 (connection #-5000)
* Added connection 0. The cache now contains 1 members
* STATE: CONNECT => WAITRESOLVE handle 0x24781b66728; line 1646 (connection #0)
* Trying 104.31.15.158:443...
* TCP_NODELAY set
* STATE: WAITRESOLVE => WAITCONNECT handle 0x24781b66728; line 1725 (connection #0)
* Connected to v217.mangabeast.com (104.31.15.158) port 443 (#0)
* STATE: WAITCONNECT => SENDPROTOCONNECT handle 0x24781b66728; line 1781 (connection #0)
* Marked for [keep alive]: HTTP default
* schannel: SSL/TLS connection with v217.mangabeast.com port 443 (step 1/3)
* schannel: checking server certificate revocation
* schannel: sending initial handshake data: sending 184 bytes...
* schannel: sent initial handshake data: sent 184 bytes
* schannel: SSL/TLS connection with v217.mangabeast.com port 443 (step 2/3)
* schannel: failed to receive handshake, need more data
* STATE: SENDPROTOCONNECT => PROTOCONNECT handle 0x24781b66728; line 1796 (connection #0)
* schannel: SSL/TLS connection with v217.mangabeast.com port 443 (step 2/3)
* schannel: encrypted data got 2709
* schannel: encrypted data buffer: offset 2709 length 4096
* schannel: sending next handshake data: sending 93 bytes...
* schannel: SSL/TLS connection with v217.mangabeast.com port 443 (step 2/3)
* schannel: encrypted data got 258
* schannel: encrypted data buffer: offset 258 length 4096
* schannel: SSL/TLS handshake complete
* schannel: SSL/TLS connection with v217.mangabeast.com port 443 (step 3/3)
* schannel: stored credential handle in session cache
* STATE: PROTOCONNECT => DO handle 0x24781b66728; line 1815 (connection #0)
> GET /manga/Onepunch-Man/0130-007.png HTTP/1.1
Host: v217.mangabeast.com
User-Agent: User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.89 Safari/537.36
Accept: */*
* STATE: DO => DO_DONE handle 0x24781b66728; line 1870 (connection #0)
* STATE: DO_DONE => PERFORM handle 0x24781b66728; line 1991 (connection #0)
* schannel: client wants to read 16384 bytes
* schannel: encdata_buffer resized 17408
* schannel: encrypted data buffer: offset 0 length 17408
* schannel: encrypted data got 674
* schannel: encrypted data buffer: offset 674 length 17408
* schannel: decrypted data length: 611
* schannel: decrypted data added: 611
* schannel: decrypted cached: offset 611 length 16384
* schannel: encrypted data length: 34
* schannel: encrypted cached: offset 34 length 17408
* schannel: decrypted data length: 5
Edit 2:
I have now added some error checking along with fail-on-error (CURLOPT_FAILONERROR). I get the following:
ERROR: HTTP response code said error
The requested URL returned error: 403 Forbidden
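I don't understand how I get a 403 when I can fetch the image with cURL on the command line.
Edit 3:
I just noticed that the user agent string itself contained the text "User-Agent:". After putting in a valid user agent, I got the file!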
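For illustration, CURLOPT_USERAGENT takes only the header value, and libcurl adds the header name itself. That is why the verbose log above shows "User-Agent: User-Agent: Mozilla/5.0 ...". A minimal sketch of guarding against this mistake follows. The helper name userAgentValue is hypothetical, not part of libcurl or of the original wrapper.

```cpp
#include <string>

// Hypothetical helper: returns the bare header value that CURLOPT_USERAGENT
// expects, dropping a leading "User-Agent:" name if one was pasted in.
std::string userAgentValue(const std::string& raw)
{
    const std::string prefix{ "User-Agent:" };
    if (raw.compare(0, prefix.size(), prefix) != 0) {
        return raw; // already a bare value
    }
    // Skip the header name and any spaces that follow it.
    const auto start = raw.find_first_not_of(' ', prefix.size());
    return start == std::string::npos ? std::string{} : raw.substr(start);
}

// Usage with the handle from the question (assumed to be valid):
//   const std::string agent = userAgentValue("User-Agent: Mozilla/5.0 ...");
//   curl_easy_setopt(m_curlHandle, CURLOPT_USERAGENT, agent.c_str());
```

Passing the string literal without the prefix directly, as the asker ended up doing, is of course the simpler fix. The helper only matters if the value comes from configuration or user input.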
Solution
You have two problems, both stemming from treating the data you receive as text.
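The first problem is that you open the file in text mode, which can mean that some bytes are translated into other bytes (or even into several other bytes). The most common such translation is the newline '\n', which on Windows is usually translated into the two-character sequence '\r' and '\n'.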
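As a rough sketch of why binary mode matters, the function below writes a buffer in binary mode and reports the resulting file size. The name writeBinaryAndMeasure and the file path passed to it are assumptions made for this example. The PNG signature is a convenient test input because it contains both "\r\n" and a lone "\n", so a text-mode stream on Windows would change the byte count.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Writes `data` to `path` in binary mode and returns the resulting file size.
// In binary mode the stream performs no newline translation, so the size on
// disk always equals data.size(), including on Windows.
std::size_t writeBinaryAndMeasure(const std::string& path, const std::string& data)
{
    {
        std::ofstream out{ path, std::ios::out | std::ios::binary | std::ios::trunc };
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
    } // the stream is flushed and closed here

    // Reopen positioned at the end so tellg() reports the file size.
    std::ifstream in{ path, std::ios::in | std::ios::binary | std::ios::ate };
    return in ? static_cast<std::size_t>(in.tellg()) : 0;
}
```

Calling it with the eight signature bytes, for example std::string{ "\x89PNG\r\n\x1a\n", 8 }, should report a size of 8. Without std::ios::binary, a Windows build would be expected to produce a larger file.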
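The second problem is that your writeToFile function assumes the data is a null-terminated string, which it is not. The null terminator used for strings is simply a byte with the value 0, and arbitrary binary data (such as a PNG image) will contain zero bytes. You need to write the data with the write function, passing the actual length of the data in bytes, which is the product of the size and nmemb arguments of the cURL write callback.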
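A minimal sketch of such a callback is shown below. It has the shape libcurl expects for CURLOPT_WRITEFUNCTION but deliberately avoids the curl headers so it can be tried in isolation. The name writeCallback, and the choice to pass a std::ofstream pointer as the user data, are assumptions made for this example rather than the asker's design.

```cpp
#include <cstddef>
#include <fstream>

// Has the shape libcurl expects for CURLOPT_WRITEFUNCTION. `userdata` is
// assumed to point at a std::ofstream that was opened in binary mode.
std::size_t writeCallback(char* ptr, std::size_t size, std::size_t nmemb, void* userdata)
{
    auto* out = static_cast<std::ofstream*>(userdata);
    const std::size_t total = size * nmemb; // number of bytes in this chunk

    // write() copies exactly `total` bytes and does not stop at a zero byte,
    // unlike operator<< on a char pointer.
    out->write(ptr, static_cast<std::streamsize>(total));

    // Returning a count different from `total` tells libcurl the write failed.
    return out->good() ? total : 0;
}
```

With libcurl, the function would be registered through CURLOPT_WRITEFUNCTION and the stream's address through CURLOPT_WRITEDATA. Returning size * nmemb rather than nmemb alone keeps the callback correct even if size is ever not 1.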
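To solve the first problem, open the file in binary mode by adding the std::ios::binary flag when you open it. The second problem can be solved by using the write function mentioned above.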