Scraping a table from each option of a drop-down menu in Python

Question

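I am trying to scrape all of the data from this website:
http://www.dartsdatabase.co.uk/PlayerStats.aspx?statKey=1&pg=7

However, I can't work out how to iterate over the "stat" drop-down menu. Each of its options loads a table that I need to scrape.

So far I have the following code, which lists the label and value of each option in the drop-down: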

import requests
from bs4 import BeautifulSoup

url = 'http://www.dartsdatabase.co.uk/PlayerStats.aspx'
response = requests.get(url).text
soup = BeautifulSoup(response, "lxml")

# Every <option> inside the "stat" drop-down
drop = soup.find('select', {'name': 'stat'}).find_all("option")

# Visible labels and the values the form submits
options = []
val = []
for opt in drop:
    options.append(opt.text)
    val.append(opt['value'])

Any help would be greatly appreciated!

Tags: python, python-3.x, web-scraping, drop-down-menu, beautifulsoup

Answer


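Issue POST requests that change the stat parameter. You can collect the appropriate values from the value attribute of the options on the page.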

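from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup as bs

URL = 'http://www.dartsdatabase.co.uk/PlayerStats.aspx?statKey=1&pg=7'

# Form fields submitted by the page; 'stat' selects the drop-down option
data = {
    'nameSearch': '',
    'dateFrom': '02/10/2017',
    'dateTo': '02/10/2019',
    'organStat': 'All',
    'stat': '1',
    'tourns': 'All',
    'pg': '7'
}

def get_soup(session):
    # POST the current form data and parse the returned page
    r = session.post(URL, data=data)
    return bs(r.content, 'lxml')

def get_table(soup):
    # The stats table is the <table> that directly follows a <br>.
    # Wrapping the HTML in StringIO avoids the pandas deprecation warning
    # for passing a literal HTML string to read_html.
    return pd.read_html(StringIO(str(soup.select_one('br + table'))))[0]

with requests.Session() as s:
    soup = get_soup(s)
    print(get_table(soup))  # table for the initial request (stat=1)

    # Values of the remaining options; the first option is skipped on the
    # assumption that it is the one already fetched above
    stats = [i['value'] for i in soup.select('[name="stat"] option')][1:]
    for i in stats:
        data['stat'] = i
        print(get_table(get_soup(s)))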
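
The answer prints each table, so the results are lost once the loop ends and nothing records which option produced which table. One possible way to keep track, sketched below, is to pair every option label with the form data that selects it. The helper name `stat_payloads`, the sample HTML, and the labels in it are hypothetical and only illustrate the idea; the real page's markup may differ.

```python
from bs4 import BeautifulSoup


def stat_payloads(html, base_data):
    """Yield (label, form_data) for each option of the 'stat' drop-down.

    base_data is left unmodified; each payload is a fresh copy with its
    'stat' field set to that option's value attribute.
    """
    soup = BeautifulSoup(html, 'html.parser')
    for opt in soup.select('select[name="stat"] option'):
        value = opt.get('value')
        if value is None:  # skip options that carry no value attribute
            continue
        yield opt.get_text(strip=True), {**base_data, 'stat': value}


# Hypothetical markup standing in for the real page (labels are made up)
SAMPLE_HTML = """
<select name="stat">
  <option value="1">Averages</option>
  <option value="2">180s</option>
</select>
"""

base = {'stat': '1', 'pg': '7'}
payloads = list(stat_payloads(SAMPLE_HTML, base))
for label, form_data in payloads:
    print(label, form_data)
```

With the real page, each `form_data` could be sent with `s.post(..., data=form_data)` as in the answer, and the parsed table stored in a dict under its `label` instead of being printed.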
