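python - How to download a file after completing a login form at a redirect link
Problem description
I want to download some .tgz files from a website using Python code. When I click a file link, it goes to another page that asks me to fill in a form (for login); after I fill in the form, it returns to the file link and the download starts. I tried python3 with requests, but without success.
My code: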
import requests
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
payload={'username':'salvandi69@gmail.com','password':'123asdzxc'}
myurl="https://eogdata.mines.edu/wwwdata/viirs_products/dnb_composites/v10//201707/vcmslcfg/SVDNB_npp_20170701-20170731_75N060W_vcmslcfg_v10_c201708061200.tgz"
myurl2="https://eogauth.mines.edu/auth/realms/master/protocol/openid-connect/auth?response_type=code&scope=email%20openid&client_id=eogdata_oidc&state=VyIetf3UzkQbxOjX-jJ-ae5lMaM&redirect_uri=https%3A%2F%2Feogdata.mines.edu%2Feog%2Foauth2callback&nonce=DRL2KruY5oxbgo2G6HxNHX-CgiMoxfF6FdGOV-FK65o"
r = requests.post(myurl2, verify=False, data=payload, timeout=6)
print(r.text)
myurl is the file link, and myurl2 is the link it redirects to. Result:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" class="login-pf">
<head>
<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="robots" content="noindex, nofollow">
<meta name="viewport" content="width=device-width,initial-scale=1"/>
<title>Log in to Earth Observation Group Login</title>
<link rel="icon" href="/auth/resources/afx5f/login/eog/img/favicon.ico" />
<link href="/auth/resources/afx5f/common/keycloak/node_modules/patternfly/dist/css/patternfly.min.css" rel="stylesheet" />
<link href="/auth/resources/afx5f/common/keycloak/node_modules/patternfly/dist/css/patternfly-additions.min.css" rel="stylesheet" />
<link href="/auth/resources/afx5f/common/keycloak/lib/zocial/zocial.css" rel="stylesheet" />
<link href="/auth/resources/afx5f/login/eog/css/login.css" rel="stylesheet" />
</head>
<body class="">
<div class="login-pf-page">
<div id="kc-header" class="login-pf-page-header">
<div id="kc-header-wrapper" class=""><div class="kc-logo-text"><span>EOG</span></div></div>
</div>
<div class="card-pf ">
<header class="login-pf-header">
<div id="kc-locale">
<div id="kc-locale-wrapper" class="">
<div class="kc-dropdown" id="kc-locale-dropdown">
<a href="#" id="kc-current-locale-link">English</a>
<ul>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=de">Deutsch</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=no">Norsk</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=ru">Русский</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=sv">Svenska</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=pt-BR">Português (Brasil)</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=lt">Lietuvių</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=en">English</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=it">Italiano</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=fr">Français</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=zh-CN">中文简体</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=es">Español</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=cs">Čeština</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=ja">日本語</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=sk">Slovenčina</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=pl">Polish</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=ca">Català</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=nl">Nederlands</a></li>
<li class="kc-dropdown-item"><a href="/auth/realms/master/protocol/openid-connect/auth?kc_locale=tr">tr</a></li>
</ul>
</div>
</div>
</div>
<h1 id="kc-page-title"> We are sorry...
</h1>
</header>
<div id="kc-content">
<div id="kc-content-wrapper">
<div id="kc-error-message">
<p class="instruction">Invalid Request</p>
</div>
</div>
</div>
</div>
</div>
</body>
</html>
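Solution
The main problem: you send the POST to the URL of the login page, but the form does not have to submit to that URL. You should check <form action=...> to get the correct URL for the POST.
I use BeautifulSoup to get this URL from the HTML.
I don't have a username and password to test everything, but at least now the POST returns a page with the login form and the message "Invalid username or password." instead of a page with "Invalid Request".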
import requests
from bs4 import BeautifulSoup as BS
s = requests.Session()
#s.headers.update({'User-Agent': 'Mozilla/5.0'})
# --- use tgz to get login page -------
url_tgz = "https://eogdata.mines.edu/wwwdata/viirs_products/dnb_composites/v10//201707/vcmslcfg/SVDNB_npp_20170701-20170731_75N060W_vcmslcfg_v10_c201708061200.tgz"
r = s.get(url_tgz)
#print(r.status_code)
#print(r.history)
print('\n--- url page ---\n')
print(r.url)
# --- find url in form ---
soup = BS(r.text, 'html.parser')
item = soup.find('form')
url = item['action']
print('\n--- url form ---\n')
print(url)
print('\n--- url page == url in form ---\n')
print( r.url == url )
# --- login ---
payload = {
    'username': 'salvandi69@gmail.com',
    'password': '123asdzxc',
    'credentialId': '',
}
r = s.post(url, data=payload)
#print(r.status_code)
#print(r.history)
#print(r.url)
#print(r.text)
# --- result ---
print('\n--- login ---\n')
soup = BS(r.text, 'html.parser')
item = soup.find('span', {'class': 'kc-feedback-text'})
if item:
    print('Message:', item.text)
else:
    print("Can't see error message")
print('\n--- end ---\n')
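The code above stops at the login step. With valid credentials, the remaining steps could look roughly like the sketch below. It is untested against the real site and rests on several assumptions: the helper names parse_login_form and download_with_login are hypothetical; the form is parsed with the standard library's html.parser instead of BeautifulSoup, only to keep the sketch dependency-free; the first request is assumed to land on the login page; and the server is assumed to redirect back to the file after a successful POST, as the question describes. The session argument is expected to be a requests.Session (or anything with the same get/post interface).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _FormParser(HTMLParser):
    """Collect the action and the named <input> fields of the first <form>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action", "")
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Keep existing values so hidden fields are sent back unchanged.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def parse_login_form(html, page_url):
    """Return (absolute action URL, input fields) or (None, {}) if no form."""
    parser = _FormParser()
    parser.feed(html)
    if parser.action is None:
        return None, {}
    # The action may be relative, so resolve it against the page URL.
    return urljoin(page_url, parser.action), parser.fields


def download_with_login(session, file_url, username, password, dest_path):
    """Hypothetical helper: log in through the form, then save the file."""
    # Assumption: requesting the file redirects to the login page.
    r = session.get(file_url)
    action_url, fields = parse_login_form(r.text, r.url)
    if action_url is None:
        raise RuntimeError("no login form found")
    fields.update({"username": username, "password": password})

    # Assumption: after a successful login the redirects end at the file.
    r = session.post(action_url, data=fields, stream=True)
    r.raise_for_status()
    if "text/html" in r.headers.get("Content-Type", ""):
        raise RuntimeError("got an HTML page back; login probably failed")

    # Write in chunks so a large .tgz is not held in memory.
    with open(dest_path, "wb") as f:
        for chunk in r.iter_content(chunk_size=1 << 20):
            f.write(chunk)
    return dest_path
```

With requests installed, a call might look like download_with_login(requests.Session(), url_tgz, "<username>", "<password>", "file.tgz"). If the site does not redirect back to the file after login, a second session.get(url_tgz, stream=True) on the same session would be the next thing to try.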