Authorization problem with redirects on strava.com using Scrapy: the log says Strava redirects me from /login to /login

Problem description

I really need your help: I have already tried everything! Goal: log in at https://www.strava.com/login using Scrapy.

Here is my code:

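import scrapy
from scrapy.http import FormRequest
from scrapy.utils.response import open_in_browser


class StravaSpider(scrapy.Spider):
    name = 'strava'
    # While logged out, this URL is redirected to /login (see the log below).
    start_urls = ('https://www.strava.com/dashboard',)

    def parse(self, response):
        # Read the CSRF token from the element named "csrf-token" (typically a <meta> tag).
        token = response.xpath('//*[@name="csrf-token"]/@content').get()
        # Fill in and submit the first form on the page; credentials are placeholders.
        return FormRequest.from_response(response,
                                         formdata={
                                             'authenticity_token': token,
                                             'plan': "",
                                             'email': 'login',
                                             'password': 'password'},
                                         #dont_filter=True,
                                         #meta={'dont_redirect': True, 'handle_httpstatus_list': [302]},
                                         callback=self.scrape_page)

    def scrape_page(self, response):
        # Reached only if a response survives the redirects; open it for manual inspection.
        print('okkkk', '\n\n\n\n')
        open_in_browser(response)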

The problem is the redirects:

2020-08-16 20:18:49 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.strava.com/dashboard> from <POST https://www.strava.com/session>
2020-08-16 20:18:49 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.79 Safari/537.36
2020-08-16 20:18:49 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.strava.com/login> from <GET https://www.strava.com/dashboard>
2020-08-16 20:18:49 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36
2020-08-16 20:18:50 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.strava.com/dashboard> from <GET https://www.strava.com/login>
2020-08-16 20:18:50 [scrapy.dupefilters] DEBUG: Filtered duplicate request: <GET https://www.strava.com/dashboard> - no more duplicates will be shown (see DUPEFILTER_DEBUG to show all duplicates)
2020-08-16 20:18:50 [scrapy.core.engine] INFO: Closing spider (finished)
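
To make the loop in this log easier to see, here is a minimal sketch that pulls the redirect hops and the assigned User-Agent strings out of the lines above. It uses only the standard library; the names `LOG`, `parse_redirects` and `parse_user_agents` are made up for this illustration and are not part of Scrapy. The regular expressions assume the exact message wording shown in this log.

```python
import re

# Log lines copied from the question (timestamps trimmed for brevity).
LOG = """\
[scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.strava.com/dashboard> from <POST https://www.strava.com/session>
[scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.79 Safari/537.36
[scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.strava.com/login> from <GET https://www.strava.com/dashboard>
[scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36
[scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.strava.com/dashboard> from <GET https://www.strava.com/login>
"""

# Matches: Redirecting (302) to <GET url> from <POST url>
REDIRECT_RE = re.compile(
    r"Redirecting \((\d{3})\) to <(\w+) ([^>\s]+)> from <(\w+) ([^>\s]+)>"
)
# Matches: Assigned User-Agent <rest of line>
USER_AGENT_RE = re.compile(r"Assigned User-Agent (.+)$")


def parse_redirects(log_text):
    """Return a list of (status, source, target) tuples, one per redirect line."""
    hops = []
    for line in log_text.splitlines():
        match = REDIRECT_RE.search(line)
        if match:
            status, to_method, to_url, from_method, from_url = match.groups()
            hops.append(
                (int(status), f"{from_method} {from_url}", f"{to_method} {to_url}")
            )
    return hops


def parse_user_agents(log_text):
    """Return the User-Agent strings in the order they were assigned."""
    agents = []
    for line in log_text.splitlines():
        match = USER_AGENT_RE.search(line)
        if match:
            agents.append(match.group(1))
    return agents


if __name__ == "__main__":
    for status, source, target in parse_redirects(LOG):
        print(f"{status}: {source} -> {target}")
    print("distinct User-Agents:", len(set(parse_user_agents(LOG))))
```

Run on these lines, the sketch reports three hops: POST /session to /dashboard, /dashboard to /login, and /login back to /dashboard, which the duplicate filter then drops. It also shows two different User-Agent strings being assigned between the requests. Whether the changing User-Agent has anything to do with the failed login is not established by the log; it is only something the log makes visible.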

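I disabled the duplicate filter, added handle_httpstatus_list, and added scrapy-redirect to the settings... nothing worked. Please, no bs4 or Selenium: I have already written this program with them, and now I need it with Scrapy only. This authorization is driving me to tears.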

Tags: python, python-3.x, web-scraping, scrapy

Solution

