See what for loop has checked

Problem description

I don't really know what to call this issue, so sorry for the undescriptive title. My program checks whether an element exists on multiple paths of a website. It has a base URL and appends different paths of the domain to check; the paths are stored in a JSON file (name.json). In its current state, the program prints 1 if the element is found and 2 if not. I want it to print the URL instead of 1 or 2. My problem is that the ids are consumed before the final for loop. When I try to print fullurl there, I only get the last id in my JSON file printed multiple times (because the per-request value isn't being saved), instead of each unique URL.

import json
import grequests
from bs4 import BeautifulSoup

idlist = json.loads(open('name.json').read())

baseurl = 'https://steamcommunity.com/id/'


complete_urls = []

for uid in idlist:
    fullurl = baseurl + uid
    complete_urls.append(fullurl)

rs = (grequests.get(fullurl) for fullurl in complete_urls)
resp = grequests.map(rs)

for r in resp:
    soup = BeautifulSoup(r.text, 'lxml')

    if soup.find('span', class_='actual_persona_name'):
        print('1')

    else:
        print('2')

Tags: python, for-loop, python-requests

Solution


Because grequests.map returns the responses in the same order as the requests, you can use enumerate to match each response to the full URL that was requested.

import json
import grequests
from bs4 import BeautifulSoup

with open('name.json') as f:
    idlist = json.load(f)

baseurl = 'https://steamcommunity.com/id/'

# Build the full URL for every id up front so the list can be indexed later
complete_urls = [baseurl + uid for uid in idlist]

rs = (grequests.get(fullurl) for fullurl in complete_urls)
resp = grequests.map(rs)

for index, r in enumerate(resp):  # enumerate gives the position of each response
    url = complete_urls[index]  # the URL requested at that same position
    if r is None:  # grequests.map yields None for a request that failed
        print(url, 'request failed')
        continue
    soup = BeautifulSoup(r.text, 'lxml')
    print(url)
    if soup.find('span', class_='actual_persona_name'):
        print('1')
    else:
        print('2')
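
If the goal is to print only the URLs where the element was found, the same order guarantee lets you pair the two lists with zip instead of indexing. The following is a minimal offline sketch, not the original code: urls_with_element, has_persona_name, and the sample data are hypothetical names made up for illustration, and the standard-library HTMLParser stands in for BeautifulSoup so the example runs without a network connection or extra packages. In the real program the bodies would be r.text for each response, or None where the request failed.

```python
from html.parser import HTMLParser


class PersonaNameFinder(HTMLParser):
    """Sets .found once a <span class="actual_persona_name"> start tag is seen."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # The class attribute may hold several space-separated names, or be absent
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "actual_persona_name" in classes:
            self.found = True


def has_persona_name(html):
    """Return True if the HTML text contains the persona-name span."""
    finder = PersonaNameFinder()
    finder.feed(html)
    return finder.found


def urls_with_element(urls, bodies):
    """Pair each URL with its body by position and keep the URLs that match.

    A body of None represents a failed request and is skipped.
    """
    return [url for url, body in zip(urls, bodies)
            if body is not None and has_persona_name(body)]


# Hypothetical stand-ins for complete_urls and the response texts
sample_urls = [
    "https://steamcommunity.com/id/alpha",
    "https://steamcommunity.com/id/beta",
    "https://steamcommunity.com/id/gamma",
]
sample_bodies = [
    '<span class="actual_persona_name">Alpha</span>',
    "<div>profile not found</div>",
    None,  # simulates a request that failed
]

for url in urls_with_element(sample_urls, sample_bodies):
    print(url)
```

Pairing with zip avoids reusing the loop variable fullurl from the earlier loop, which is why the original attempt kept printing the last id: after that loop finishes, fullurl only holds its final value.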
