
Problem description

I am trying to log in to a website with curl. The login page is:

https://web.spaggiari.eu/home/app/default/login.php

After logging in, I want to download a file from the site with this command:

curl -Lv https://web.spaggiari.eu/fml/app/default/xml_export.php?stampa=%3Astampa%3A&report_name=&tipo=agenda&data=03+11+20&autore_id=6583250&tipo_export=EVENTI_AGENDA_STUDENTI&quad=%3Aquad%3A&materia_id=&classe_id=%3Aclasse_id%3A&gruppo_id=%3Agruppo_id%3A&ope=RPT&dal=2020-11-03&al=2020-11-03&formato=xls

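However, without logging in, the command downloads the page source instead of the xls file I want. The verbose output of this command shows that authentication is required: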

* Expire in 1 ms for 1 (transfer 0x5605f44e3f50)
* Expire in 2 ms for 1 (transfer 0x5605f44e3f50)
*   Trying 159.69.199.242...
* TCP_NODELAY set
* Expire in 149997 ms for 3 (transfer 0x5605f44e3f50)
* Expire in 200 ms for 4 (transfer 0x5605f44e3f50)
* Connected to web.spaggiari.eu (159.69.199.242) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: CN=*.spaggiari.eu
*  start date: May 29 00:00:00 2020 GMT
*  expire date: May 29 12:00:00 2022 GMT
*  subjectAltName: host "web.spaggiari.eu" matched certs "*.spaggiari.eu"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=GeoTrust RSA CA 2018
*  SSL certificate verify ok.
> GET /fml/app/default/xml_export.php?stampa=%3Astampa%3A HTTP/1.1
> Host: web.spaggiari.eu
> User-Agent: curl/7.64.0
> Accept: */*
> 
< HTTP/1.1 302 Found
< Server: nginx/1.18.0
< Date: Wed, 04 Nov 2020 10:07:44 GMT
< Content-Type: text/html; charset=UTF-8
< Content-Length: 0
< Connection: keep-alive
< X-Frame-Options: SAMEORIGIN;
< Content-Security-Policy: script-src 'self' filesystem: 'unsafe-eval' 'unsafe-inline' *.spaggiari.eu https://ajax.googleapis.com/ https://cdnjs.cloudflare.com/ https://cdn.jsdelivr.net/ https://code.jquery.com/ https://d31qbv1cthcecs.cloudfront.net/atrk.js https://fonts.googleapis.com/ https://www.google-analytics.com/ https://www.google.com/recaptcha/ https://www.googletagmanager.com/ https://www.gstatic.com/recaptcha/;frame-ancestors 'self' file: *.spaggiari.eu;
< Set-Cookie: PHPSESSID=u2nkberujpq5t8ja8sh7u21jl2htt5vn; path=/; secure; HttpOnly
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Location: ../../../home/app/default/login.php
< X-ZVersion: c
< Pragma: public
< Cache-Control: public, must-revalidate, proxy-revalidate
< 
* Connection #0 to host web.spaggiari.eu left intact
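
The request line in the trace above contains only `?stampa=%3Astampa%3A`, which suggests the shell cut the unquoted URL at the first `&` and ran the remaining pieces as separate background commands and variable assignments. A minimal sketch of the quoted form follows. The helper name `fetch_report` and the output file name are illustrative assumptions, and the request still needs a valid session to get past the login redirect.

```shell
#!/bin/sh
# Single quotes keep every '&' inside the URL instead of letting the shell
# treat it as a command separator.
report_url='https://web.spaggiari.eu/fml/app/default/xml_export.php?stampa=%3Astampa%3A&report_name=&tipo=agenda&data=03+11+20&autore_id=6583250&tipo_export=EVENTI_AGENDA_STUDENTI&quad=%3Aquad%3A&materia_id=&classe_id=%3Aclasse_id%3A&gruppo_id=%3Agruppo_id%3A&ope=RPT&dal=2020-11-03&al=2020-11-03&formato=xls'

# Hypothetical helper: request the report and write the body to the file
# named in $1 rather than to the terminal.
fetch_report() {
    curl -Lv -o "$1" "$report_url"
}

# Usage:
#   fetch_report agenda.xls
```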

Using the browser's DevTools, I found that the authentication request is actually a form POST to web.spaggiari.eu/auth-p7/app/default/AuthApi4.php?a=aLoginPwd. I first tried logging in with the usual curl commands:

curl --anyauth --user mail:password web.spaggiari.eu/auth-p7/app/default/AuthApi4.php?a=aLoginPwd

curl --user mail:password https://web.spaggiari.eu/auth-p7/app/default/AuthApi4.php?a=aLoginPwd

curl --data mail:password https://web.spaggiari.eu/auth-p7/app/default/AuthApi4.php?a=aLoginPwd
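
None of these three commands sends what a login form handler reads. `--user` makes curl use HTTP authentication (an `Authorization` header), and `--data mail:password` sends that literal string as the body with no field names. The sketch below only prints what each option would put on the wire; it makes no network request, and the variable names are illustrative.

```shell
#!/bin/sh
# What `--user mail:password` produces for Basic authentication:
# a request header, not a form body.
basic_header="Authorization: Basic $(printf '%s' 'mail:password' | base64)"
printf '%s\n' "$basic_header"

# What `--data mail:password` sends as the request body: the literal string,
# which contains no name=value pair for a form handler to pick up.
raw_body='mail:password'
printf '%s\n' "$raw_body"
```

A form login instead expects named fields in the body, such as the `uid` and `pwd` fields visible in the request captured from DevTools.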

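With these commands I could not solve the problem, and the output was always the same. After searching the internet for solutions, I realized that I could use the browser's DevTools to copy the requests made during login as cURL commands (I am using "Edge Version 88.0.680.1 (Official Build) dev (64 bit)").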

This is the curl command I captured when I tried to log in to the site:

curl 'https://web.spaggiari.eu/auth-p7/app/default/AuthApi4.php?a=aLoginPwd' \
  -H 'Connection: keep-alive' \
  -H 'Accept: */*' \
  -H 'X-Requested-With: XMLHttpRequest' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4295.0 Safari/537.36 Edg/88.0.680.1' \
  -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' \
  -H 'Origin: https://web.spaggiari.eu' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Referer: https://web.spaggiari.eu/home/app/default/login.php?target=atv&mode=' \
  -H 'Accept-Language: it,it-IT;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6' \
  -H 'Cookie: _ga=GA1.2.1118066416.1604149840; webrole=gen; webidentity=S6583250C; __auc=b7176b856757ec8351961876044; weblogin=mail.example@example.it; PHPSESSID=cjgqkc6oih4k77ufg2v20enhgkl168em; __asc=4f7867e8175923893ed8b4d9596' \
  --data-raw 'cid=&uid=mail.example%40example.it&pwd=password&pin=&target=' \
  --compressed
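
The copied command hard-codes a `Cookie` header taken from the browser session. A hedged alternative is to send only the form fields and let curl store whatever cookies the server sets. The sketch below assumes the server does not require the remaining browser headers, which has not been verified; the helper name `spaggiari_login` and the jar file name `cookies.txt` are illustrative.

```shell
#!/bin/sh
# Assumed file name for the cookie jar; any writable path works.
COOKIE_JAR='cookies.txt'
LOGIN_URL='https://web.spaggiari.eu/auth-p7/app/default/AuthApi4.php?a=aLoginPwd'

# Hypothetical helper: POST the login form and save the cookies from the response.
# $1 = e-mail address, $2 = password.
spaggiari_login() {
    # --data-urlencode percent-encodes the values (for example '@' in the e-mail);
    # curl joins all --data* options with '&' into one request body.
    curl --silent --show-error \
        --cookie-jar "$COOKIE_JAR" \
        -H 'X-Requested-With: XMLHttpRequest' \
        --data-urlencode "uid=$1" \
        --data-urlencode "pwd=$2" \
        --data 'cid=&pin=&target=' \
        "$LOGIN_URL"
}

# Usage:
#   spaggiari_login 'mail.example@example.it' 'password'
```

Later requests can then read the same file with `--cookie "$COOKIE_JAR"` instead of a pasted `Cookie` header.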

This is the curl command I captured when the login succeeded:

curl 'https://web.spaggiari.eu/home/app/default/login_ok_redirect.php' \
  -H 'Connection: keep-alive' \
  -H 'Cache-Control: max-age=0' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'Origin: https://web.spaggiari.eu' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4295.0 Safari/537.36 Edg/88.0.680.1' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'Sec-Fetch-Mode: navigate' \
  -H 'Sec-Fetch-User: ?1' \
  -H 'Sec-Fetch-Dest: document' \
  -H 'Referer: https://web.spaggiari.eu/home/app/default/login.php?target=atv&mode=' \
  -H 'Accept-Language: it,it-IT;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6' \
  -H 'Cookie: _ga=GA1.2.1118066416.1604149840; webrole=gen; webidentity=S6583250C; __auc=b7176b841757ec8351585876044; _gid=GA1.2.115190717.1604390974; weblogin=mail.example@example.it; __asc=8123be1b178230d092a7920630f; LAST_REQUESTED_TARGET=atv; PHPSESSID=ghcl8g9otsje4psuq6vflk4kqhbsho9q' \
  --data-raw 'custcode=&login=mail.example%40example.it&password=password&pin=' \
  --compressed

This is the curl command I captured when downloading the xls file I want:

curl 'https://web.spaggiari.eu/fml/app/default/xml_export.php?stampa=%3Astampa%3A&report_name=&tipo=agenda&data=03+11+20&autore_id=6583250&tipo_export=EVENTI_AGENDA_STUDENTI&quad=%3Aquad%3A&materia_id=&classe_id=%3Aclasse_id%3A&gruppo_id=%3Agruppo_id%3A&ope=RPT&dal=2020-11-03&al=2020-11-03&formato=xls' \
  -H 'Connection: keep-alive' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4295.0 Safari/537.36 Edg/88.0.680.1' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'Sec-Fetch-Site: none' \
  -H 'Sec-Fetch-Mode: navigate' \
  -H 'Sec-Fetch-User: ?1' \
  -H 'Sec-Fetch-Dest: document' \
  -H 'Accept-Language: it,it-IT;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6' \
  -H 'Cookie: _ga=GA1.2.1118066416.1604149840; webrole=gen; webidentity=S6583250C; __auc=b4701b841757ec8351585876044; weblogin=mail.example@example.it; PHPSESSID=cjgqkc6oih4k77ufg2v20edmmkl168em; __asc=4f6735e8175928173ed8b4d6783' \
  --compressed

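Even with these commands, I cannot log in and download the xls file. Ask me for any other details that might help solve the problem. Is there a solution I can try? Thanks to everyone for the help.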

Edit:

This is the output of the first command obtained from DevTools. The last line suggests that I am logged in to the site.

* Expire in 11 ms for 1 (transfer 0x56303982af50)
* Expire in 14 ms for 1 (transfer 0x56303982af50)
*   Trying 159.69.199.244...
* TCP_NODELAY set
* Expire in 149978 ms for 3 (transfer 0x56303982af50)
* Expire in 200 ms for 4 (transfer 0x56303982af50)
* Connected to web.spaggiari.eu (159.69.199.244) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: CN=*.spaggiari.eu
*  start date: May 29 00:00:00 2020 GMT
*  expire date: May 29 12:00:00 2022 GMT
*  subjectAltName: host "web.spaggiari.eu" matched certs "*.spaggiari.eu"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=GeoTrust RSA CA 2018
*  SSL certificate verify ok.
> POST /auth-p7/app/default/AuthApi4.php?a=aLoginPwd HTTP/1.1
> Host: web.spaggiari.eu
> Accept-Encoding: deflate, gzip
> Connection: keep-alive
> Accept: */*
> X-Requested-With: XMLHttpRequest
> User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4295.0 Safari/537.36 Edg/88.0.680.1
> Content-Type: application/x-www-form-urlencoded; charset=UTF-8
> Origin: https://web.spaggiari.eu
> Sec-Fetch-Site: same-origin
> Sec-Fetch-Mode: cors
> Sec-Fetch-Dest: empty
> Referer: https://web.spaggiari.eu/home/app/default/login.php?target=atv&mode=
> Accept-Language: it,it-IT;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6
> Cookie: _ga=GA1.2.1118066416.1604149840; webrole=gen; webidentity=S6583250C; __auc=b7176b841757ec8351585876044; weblogin=t.dordoni@ccgraphos.it; PHPSESSID=cjgqkc6oih4k77ufg2v20edmmkl168em; __asc=4f2356e8175928173ed8b4d9596
> Content-Length: 63
> 
* upload completely sent off: 63 out of 63 bytes
< HTTP/1.1 200 OK
< Server: nginx/1.18.0
< Date: Wed, 04 Nov 2020 11:00:29 GMT
< Content-Type: application/json
< Transfer-Encoding: chunked
< Connection: keep-alive
< X-Frame-Options: SAMEORIGIN;
< Content-Security-Policy: script-src 'self' filesystem: 'unsafe-eval' 'unsafe-inline' *.spaggiari.eu https://ajax.googleapis.com/ https://cdnjs.cloudflare.com/ https://cdn.jsdelivr.net/ https://code.jquery.com/ https://d31qbv1cthcecs.cloudfront.net/atrk.js https://fonts.googleapis.com/ https://www.google-analytics.com/ https://www.google.com/recaptcha/ https://www.googletagmanager.com/ https://www.gstatic.com/recaptcha/;frame-ancestors 'self' file: *.spaggiari.eu;
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Set-Cookie: PHPSESSID=46e4ipq6v6dh32mv1mvdvf4sb4ukpb84; path=/; secure; HttpOnly
< X-ZVersion: c
< Content-Encoding: gzip
< Pragma: public
< Cache-Control: public, must-revalidate, proxy-revalidate
< 
* Connection #0 to host web.spaggiari.eu left intact
{"time":"2020-11-04T12:00:28+01:00","data":{"auth":{"verified":true,"loggedIn":true,"actionRequested":false,"hints":[],"errors":[],"accountInfo":{"type":"S","id":6583250,"cognome":"NAME","nome":"SURNAME","cid":"MIIT0065"},"redirects":[],"aMode":"sam","mMode":"SEML","errCod":[]},"pfolio":false},"error":[],"api":{"env":"production","AuthSpa":{"version":"2.8.4"}
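
A script can check the `loggedIn` field of this response before attempting the download. The sketch below uses a plain text match, which assumes the server keeps emitting the field without extra whitespace; a JSON parser would be more robust. The function name `is_logged_in` and the shortened sample are illustrative.

```shell
#!/bin/sh
# Succeeds when the JSON read from stdin reports a logged-in session.
is_logged_in() {
    grep -q '"loggedIn":true'
}

# Offline check against a shortened copy of the response shown above.
sample='{"data":{"auth":{"verified":true,"loggedIn":true,"errors":[]}}}'
if printf '%s\n' "$sample" | is_logged_in; then
    echo 'login accepted'
else
    echo 'login rejected'
fi
```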

But when I run the last command above to download the xls file, the output is as follows and the file is not downloaded.

* Expire in 7 ms for 1 (transfer 0x55ca86595f50)
* Expire in 9 ms for 1 (transfer 0x55ca86595f50)
*   Trying 212.83.134.163...
* TCP_NODELAY set
* Expire in 149986 ms for 3 (transfer 0x55ca86595f50)
* Expire in 200 ms for 4 (transfer 0x55ca86595f50)
* Connected to web.spaggiari.eu (212.83.134.163) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: CN=*.spaggiari.eu
*  start date: May 29 00:00:00 2020 GMT
*  expire date: May 29 12:00:00 2022 GMT
*  subjectAltName: host "web.spaggiari.eu" matched certs "*.spaggiari.eu"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=GeoTrust RSA CA 2018
*  SSL certificate verify ok.
> GET /fml/app/default/xml_export.php?stampa=%3Astampa%3A&report_name=&tipo=agenda&data=03+11+20&autore_id=6583250&tipo_export=EVENTI_AGENDA_STUDENTI&quad=%3Aquad%3A&materia_id=&classe_id=%3Aclasse_id%3A&gruppo_id=%3Agruppo_id%3A&ope=RPT&dal=2020-11-03&al=2020-11-03&formato=xls HTTP/1.1
> Host: web.spaggiari.eu
> Accept-Encoding: deflate, gzip
> Connection: keep-alive
> Upgrade-Insecure-Requests: 1
> User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4295.0 Safari/537.36 Edg/88.0.680.1
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
> Sec-Fetch-Site: none
> Sec-Fetch-Mode: navigate
> Sec-Fetch-User: ?1
> Sec-Fetch-Dest: document
> Accept-Language: it,it-IT;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6
> Cookie: _ga=GA1.2.1118066416.1604149840; webrole=gen; webidentity=S6583250C; __auc=b7176b841757ec8351585876044; weblogin=t.dordoni@ccgraphos.it; PHPSESSID=cjgqkc6oih4k77ufg2v20edmmkl168em; __asc=4f2356e8175928173ed8b4d9596
> 
< HTTP/1.1 302 Found
< Server: nginx/1.18.0
< Date: Wed, 04 Nov 2020 11:10:03 GMT
< Content-Type: text/html; charset=UTF-8
< Content-Length: 0
< Connection: keep-alive
< X-Frame-Options: SAMEORIGIN;
< Content-Security-Policy: script-src 'self' filesystem: 'unsafe-eval' 'unsafe-inline' *.spaggiari.eu https://ajax.googleapis.com/ https://cdnjs.cloudflare.com/ https://cdn.jsdelivr.net/ https://code.jquery.com/ https://d31qbv1cthcecs.cloudfront.net/atrk.js https://fonts.googleapis.com/ https://www.google-analytics.com/ https://www.google.com/recaptcha/ https://www.googletagmanager.com/ https://www.gstatic.com/recaptcha/;frame-ancestors 'self' file: *.spaggiari.eu;
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Location: ../../../home/app/default/login.php
< X-ZVersion: c
< Pragma: public
< Cache-Control: public, must-revalidate, proxy-revalidate
< 
* Connection #0 to host web.spaggiari.eu left intact
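
Comparing the two traces points to a likely cause. The login response sets a new `PHPSESSID`, but the download request still sends the older value pasted from the browser, so the server probably sees an unauthenticated session and redirects to the login page. The sketch below extracts both values to show the mismatch, then defines a download helper that reads cookies from a jar file. It assumes the login request was made with `--cookie-jar` on the same file; the names `session_id`, `download_report`, and `cookies.txt` are illustrative, and the site may require additional steps such as the `login_ok_redirect.php` request.

```shell
#!/bin/sh
# Pull the PHPSESSID value out of a Set-Cookie or Cookie header line.
session_id() {
    printf '%s\n' "$1" | sed -n 's/.*PHPSESSID=\([^;]*\).*/\1/p'
}

# Values copied from the two traces above (Cookie header shortened).
issued=$(session_id 'Set-Cookie: PHPSESSID=46e4ipq6v6dh32mv1mvdvf4sb4ukpb84; path=/; secure; HttpOnly')
sent=$(session_id 'Cookie: webrole=gen; PHPSESSID=cjgqkc6oih4k77ufg2v20edmmkl168em; __asc=4f2356e8175928173ed8b4d9596')
[ "$issued" = "$sent" ] || echo 'download request did not send the session issued at login'

# Hypothetical helper: fetch a URL with the cookies saved by an earlier
# `curl --cookie-jar "$COOKIE_JAR" ...` login request.
# $1 = quoted report URL, $2 = output file.
COOKIE_JAR='cookies.txt'
download_report() {
    curl --silent --show-error --location \
        --cookie "$COOKIE_JAR" --cookie-jar "$COOKIE_JAR" \
        --output "$2" "$1"
}

# Usage:
#   download_report 'https://web.spaggiari.eu/fml/app/default/xml_export.php?...&formato=xls' agenda.xls
```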

Tags: bash, shell, curl, post
