Steps involved in sending a POST request in Python to log in to a website

Question

I have read many similar questions here, but none of them solved my problem.

I am trying to log in to this website using Python, but the login does not succeed.

I found the form I want to fill in:

<form id="loginForm" class="form" method="post">
    <div class="form__group">
        <label for="email" class="form__label">Email Address</label>
        <input type="hidden" value="false" name="associateAccount"/>
        <input type="email" name="email" id="email" pattern="^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,64}$" class="form__input" placeholder="Enter Email Address" title="The email address must valid to continue." maxlength="63" autocomplete="email" required=""/>
        <span></span>
    </div>
    <div class="form__group">
        <label for="pin" class="form__label">Pin Number</label>
        <input type="password" name="pin" required="" id="pin" class="form__input" maxlength="" placeholder="Enter Pin" title="The pin must be valid to continue"/>
        <span></span>
        <a class="text-link text-link--forgotten-pin" id="ForgottenPinLink" href="/forgotten-pin/">Forgotten Your Pin?</a>
    </div>
    <div class="form__group">
        <input type="submit" id="login-submit" class="button button--primary button--wide button--large" value="Login"/>
    </div>
</form>

However, I am running into the following problem.

I have tried various approaches (urllib, mechanize, and requests); this is my current attempt:

import requests

url = 'https://www.puregym.com/Login/'
payload = {'email':'myusername', 'pin': 'mypin'}
r = requests.post(url, params=payload)
with open("requests_results.html", "w") as f:
    f.write(r.content)

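The result is always the same: it just returns the page I was already on. I suspect the code is wrong because I am posting to the wrong URL, but I don't know where to find the right one.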

If possible, I would really like an explanation of how to solve this with actual Python code.

Edit:

Inspecting the POST request with Chrome Developer Tools, I got:

Request URL: https://www.puregym.com/api/members/login/
Request Method: POST
Status Code: 200 
Remote Address: 123.45.57.89:123
Referrer Policy: no-referrer-when-downgrade

The payload (with my credentials replaced) is:

associateAccount:"false"
email:"myemail@email.com"
pin:"1234567890"

However, this code

import requests

# Fill in your details here to be posted to the login form.
payload = {"associateAccount":"false", "email":"myemail@email.com", "pin":"1234567890"}

LOGIN_URL = 'https://www.puregym.com/api/members/login/'

# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    p = s.post(LOGIN_URL, data=payload)
    # print the html returned or something more intelligent to see if it's a successful login page.
    print p.text

    # An authorised request.
    r = s.get('https://www.puregym.com/members/')
    print r.text

just returns

{
  "message": "An error has occurred."
}

Edit 2:

The current code is

import urllib2
from bs4 import BeautifulSoup
import requests
import re

payload = {
    "associateAccount":"false", 
    "email":"test@gmail.com",
    "pin":"123456789"
}

headers = {
    "accept": "application/json, text/javascript",
    "accept-encoding": "gzip, deflate, br",
    "accept-language": "en,pt-PT;q=0.9,pt;q=0.8,en-US;q=0.7,tr;q=0.6",
    "content-type": "application/json, text/javascript",
    "referer": "https://www.puregym.com/Login/?ReturnUrl=%2Fmembers%2F",
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5)",
    "x-requested-with": "XMLHttpRequest"
}

POST_URL = 'https://www.puregym.com/api/members/login/'
LOGIN_URL = 'https://www.puregym.com/Login/'

# Scrape the login page first to get __RequestVerificationToken from form
page = urllib2.urlopen(LOGIN_URL)
soup = BeautifulSoup(page, "html.parser")
form = soup.find("form", {"id": "__AjaxAntiForgeryForm"})
__requestverificationtoken = form.input.attrs['value']

headers['__requestverificationtoken'] = __requestverificationtoken

# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    p = s.post(POST_URL, data = payload, headers = headers)
    # print the html returned or something more intelligent to see if it's a successful login page.
    print(p.text)

But the response is

{"errorCode":"P2BBD16-190139","message":"We have had a little problem which is normally only temporary. You can try again or contact our Member Services Team on 03444 770 005 (between 8am and 10pm) who can help."}

Logging in through the browser works fine, though.

Tags: python, html, web-scraping

Answer


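Your POST request should also include headers, not just the payload. Most importantly, the __requestverificationtoken header is probably the site's CSRF token, which is there to improve security.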
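A token like this is unlikely to stay valid for long, so it is usually read from the login page shortly before posting. Tokens of this kind are also commonly tied to a cookie set when that page is served, which would mean the page has to be fetched with the same session that later posts the credentials. The following is a minimal sketch under those assumptions. It takes the form id `__AjaxAntiForgeryForm` from the question's scraping code and assumes the token is the value of the first input in that form. `extract_token` and `login` are illustrative names, and the sketch has not been tested against the live site.

```python
from html.parser import HTMLParser

LOGIN_URL = "https://www.puregym.com/Login/"
POST_URL = "https://www.puregym.com/api/members/login/"


class AntiForgeryTokenParser(HTMLParser):
    """Collect the value of the first <input> inside the anti-forgery form."""

    def __init__(self, form_id="__AjaxAntiForgeryForm"):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.token = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
        elif tag == "input" and self.in_form and self.token is None:
            self.token = attrs.get("value")

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def extract_token(html):
    """Return the token found in the page, or None if the form is missing."""
    parser = AntiForgeryTokenParser()
    parser.feed(html)
    return parser.token


def login(session, email, pin):
    """Fetch the login page and post the credentials with the same session.

    `session` is expected to be a requests.Session, so any cookies set by
    the login page are sent along with the POST (an assumption about what
    the site checks, not something the source confirms).
    """
    page = session.get(LOGIN_URL)
    token = extract_token(page.text)
    if token is None:
        raise ValueError("anti-forgery token not found on the login page")
    headers = {
        "__requestverificationtoken": token,
        "x-requested-with": "XMLHttpRequest",
    }
    payload = {"associateAccount": "false", "email": email, "pin": pin}
    return session.post(POST_URL, data=payload, headers=headers)
```

With requests installed, this could be called as `with requests.Session() as s: print(login(s, "myemail@email.com", "1234567890").text)`. The main difference from the code in Edit 2 is that the login page is fetched through the session rather than through a separate urllib2 call.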

payload = {"associateAccount":"false", "email":"myemail@email.com", "pin":"1234567890"}
headers = {
           "__requestverificationtoken": "_Q4aj4hNp5uOMv6EbrD8KRQA85nsLLKRQIAC8m2PHNAvI2OHfUrtW11MANIjJVWWhqFJTjq1wHGMhCLj4AyLYCUcEqA1",
           "accept": "application/json, text/javascript",
           "accept-encoding": "gzip, deflate, br",
           "accept-language": "en-US,en;q=0.9",
           "content-type": "application/json",
           "x-requested-with": "XMLHttpRequest"
           }

LOGIN_URL = 'https://www.puregym.com/api/members/login/'

# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    p = s.post(LOGIN_URL, data=payload, headers = headers)
    # print the html returned or something more intelligent to see if it's a successful login page.
    print(p.text)

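I don't have an account to test the page that requires login, but the code above returns "Your email address or PIN is incorrect. Please try again.", which suggests the request itself is being processed correctly.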
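One detail worth keeping in mind when adapting the code above is how requests encodes the payload. `params=` (used in the first attempt in the question) appends the fields to the URL as a query string, `data=` with a dict sends a form-encoded body, and `json=` sends a JSON body. The sketch below uses only the standard library to show roughly what each form looks like. It is an illustration of the encodings, not a claim about which one this particular endpoint expects; the request payload shown in the browser's developer tools is the best guide for that.

```python
import json
from urllib.parse import urlencode

payload = {
    "associateAccount": "false",
    "email": "myemail@email.com",
    "pin": "1234567890",
}

# params=payload: the fields end up in the URL, and the request body is empty.
url_with_params = "https://www.puregym.com/Login/?" + urlencode(payload)

# data=payload: the body is form-encoded
# (Content-Type: application/x-www-form-urlencoded unless overridden).
form_body = urlencode(payload)

# json=payload: the body is a JSON document
# (requests sets Content-Type: application/json itself).
json_body = json.dumps(payload)

print(url_with_params)
print(form_body)
print(json_body)
```

If a `content-type: application/json` header is set by hand while the body is passed with `data=`, the header and the actual body format disagree, which some servers reject. Passing the dict with `json=` keeps the two consistent.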

