How do I use Selenium and Python to scrape information from a table for every dropdown option?

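Problem description

I am trying to help someone who works for a nonprofit. I am currently trying to get information from the STL County Boards/Commissions website ( https://boards.stlouisco.com/ ).

I am running into trouble for a few reasons:

I was going to use BeautifulSoup, but the actual data is not displayed until you select a Board/Commission from the dropdown bar at the top, so I have switched to Selenium, which I am new to.

Is this task possible? When I look at the site's HTML, I see that the information is not stored in the page; it is pulled from another location and displayed only according to the option selected in the dropdown menu.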

function ShowMemberList(selectedBoard) {
        ClearMeetingsAndMembers();
        var htmlString = "";
        var boardsList = [{"id":407,"name":"Aging Ahead","isActive":true,"description":"... ...1.","totalSeats":14}];
        var totalMembers = boardsList[$("select[name='BoardsList'] option:selected").index() - 1].totalSeats;
        $.get("/api/boards/" + selectedBoard + "/members", function (data) {
            if (data.length > 0) {
                htmlString += "<table id=\"MemberTable\" class=\"table table-hover\">";
                htmlString += "<thead><th>Member Name</th><th>Title</th><th>Position</th><th>Expiration Date</th></thead><tbody>";
                for (var i = 0; i < totalMembers; i++) {
                    if (i < data.length) {
                        htmlString += "<tr><td>" + FormatString(data[i].firstName) + " " + FormatString(data[i].lastName) + "</td><td>" + FormatString(data[i].title) + "</td><td>" + FormatString(data[i].position) + "</td><td>" + FormatString(data[i].expirationDate) + "</td></tr>";
                    } else {
                        htmlString += "<tr><td colspan=\"4\">---Vacant Seat---</td></tr>" 
                    }
                }
                htmlString += "</tbody></table>";
            } else {
                htmlString = "<span id=\"MemberTable\">There was no data found for this board.</span>";
            }
            $("#Results").append(htmlString);
        });
    }

So far I have this (not much); it opens the page and selects each board from the list:

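from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait

driver = webdriver.Chrome()
driver.get("https://boards.stlouisco.com/")
# Wait for the dropdown to be present, then wrap it in Select.
select = Select(WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, 'BoardsList'))))
options = select.options

for board in options:
    select.select_by_visible_text(board.text)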

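From here I would like to scrape the information in the MemberTable, but I don't know how to move forward, whether it is within my abilities, or even whether it is possible with Selenium.

I tried find_by with a few different elements to click the member table, but got errors. I also tried to locate the member table after my selection, but the element could not be found. Any tips/pointers/advice appreciated!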

Tags: python, selenium, web-scraping, html-select, webdriverwait

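Solution

To select each Board/Commission from the dropdown and scrape the page, you have to induce WebDriverWait for element_to_be_clickable(), and you can use the following locator strategy:

Code: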

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
# executable_path is the Selenium 3 way to point at chromedriver; Selenium 4
# deprecated it in favour of a Service object and later releases removed it.
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get("https://boards.stlouisco.com/")
# Wait until the dropdown is clickable, then wrap it in Select.
select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'BoardsList'))))
# Click each option in turn; the first is the "---Choose a Board---" placeholder.
for option in select.options:
    option.click()
    print("Scrapping :"+option.text)

Console output:

Scrapping :---Choose a Board---
Scrapping :Aging Ahead
Scrapping :Aging Ahead Advisory Council
Scrapping :Air Pollution & Noise Control Appeal Board
Scrapping :Animal Care & Control Advisory Board
Scrapping :Bi-State Development Agency (Metro)
Scrapping :Board Of Examiners For Mechanical Licensing
Scrapping :Board of Freeholders
Scrapping :Boundary Commission
Scrapping :Building Code Review Committee
Scrapping :Building Commission & Board Of Building Appeals
Scrapping :Business Advisory Council
Scrapping :Center for Educational Media
Scrapping :Civil Service Commission
Scrapping :Commission On Disabilities
Scrapping :County Health Advisory Board
Scrapping :Domestic And Family Violence Council
Scrapping :East-West Gateway Council of Governments Board of Directors
Scrapping :Economic Development Collaborative Advisory Board
Scrapping :Economic Rescue Team
Scrapping :Electrical Code Review Committee
Scrapping :Electrical Examiners, Board Of
Scrapping :Emergency Communications System Commission
Scrapping :Equalization, Board Of
Scrapping :Fire Standards Commission
Scrapping :Friends of the Kathy J. Weinman Shelter for Battered Women, Inc.
Scrapping :Fund Investment Advisory Committee
Scrapping :Historic Building Commission
Scrapping :Housing Authority
Scrapping :Housing Resources Commission
Scrapping :Human Relations Commission
Scrapping :Industrial Development Authority Board
Scrapping :Justice Services Advisory Board
Scrapping :Lambert Airport Eastern Perimeter Joint Development Commission
Scrapping :Land Clearance For Redevelopment Authority
Scrapping :Lemay Community Improvement District
Scrapping :Library Board
Scrapping :Local Emergency Planning Committee
Scrapping :Mechanical Code Review Committee
Scrapping :Metropolitan Park And Recreation District Board Of Directors (Great Rivers Greenway)
Scrapping :Metropolitan St. Louis Sewer District
Scrapping :Metropolitan Taxicab Commission
Scrapping :Metropolitan Zoological Park and Museum District Board
Scrapping :Municipal Court Judges
Scrapping :Older Adult Commission
Scrapping :Parks And Recreation Advisory Board
Scrapping :Planning Commission
Scrapping :Plumbing Code Review Committee
Scrapping :Plumbing Examiners, Board Of
Scrapping :Police Commissioners, Board Of
Scrapping :Port Authority Board Of Commissioners
Scrapping :Private Security Advisory Committee
Scrapping :Productive Living Board
Scrapping :Public Transportation Commission of St. Louis County
Scrapping :Regional Arts Commission
Scrapping :Regional Convention & Sports Complex Authority
Scrapping :Regional Convention & Visitors Commission
Scrapping :REJIS Commission
Scrapping :Restaurant Commission
Scrapping :Retirement Board Of Trustees
Scrapping :St. Louis Airport Commission
Scrapping :St. Louis County Children's Service Fund Board
Scrapping :St. Louis County Clean Energy Development Board (PACE)
Scrapping :St. Louis County Workforce Development Board
Scrapping :St. Louis Economic Development Partnership
Scrapping :St. Louis Regional Health Commission
Scrapping :St. Louis-Jefferson Solid Waste Management District
Scrapping :Tax Increment Financing Commission of St. Louis County
Scrapping :Transportation Board
Scrapping :Waste Management Commission
Scrapping :World Trade Center - St. Louis
Scrapping :Zoning Adjustment,  Board of
Scrapping :Zoo-Museum District - Art Museum Subdistrict Board of Commissioners
Scrapping :Zoo-Museum District - Botanical Garden Subdistrict Board of Commissioners
Scrapping :Zoo-Museum District - Missouri History Museum Subdistrict Board of Commissioners
Scrapping :Zoo-Museum District - St. Louis Science Center Subdistrict Board of Commissioners
Scrapping :Zoo-Museum District - Zoological Park Subdistrict Board of Commissioners
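
The code in this answer selects each option but does not yet read the table. A minimal sketch of that remaining step follows. It assumes the page behaves as the ShowMemberList script in the question suggests: after each selection the old results are cleared and a new element with id MemberTable is appended. The names parse_member_table and scrape_all_boards are hypothetical, and the Selenium imports sit inside the function so the parsing helper can be tried without a browser.

```python
from html.parser import HTMLParser


class _RowCollector(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_member_table(html):
    """Return the body rows of a MemberTable HTML fragment as lists of text."""
    collector = _RowCollector()
    collector.feed(html)
    collector.close()
    return collector.rows


def scrape_all_boards(driver, timeout=10):
    """Select every board in the dropdown and return {board name: rows}."""
    # Imported here so parse_member_table() works without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import Select, WebDriverWait

    wait = WebDriverWait(driver, timeout)
    select = Select(wait.until(EC.element_to_be_clickable((By.ID, "BoardsList"))))
    results = {}
    # Skip index 0, the "---Choose a Board---" placeholder.
    for index in range(1, len(select.options)):
        previous = driver.find_elements(By.ID, "MemberTable")
        select.select_by_index(index)
        name = select.first_selected_option.text
        # Assumption: ClearMeetingsAndMembers() removes the old table from the
        # DOM, so wait for it to go stale before looking for the new one.
        if previous:
            wait.until(EC.staleness_of(previous[0]))
        table = wait.until(EC.presence_of_element_located((By.ID, "MemberTable")))
        results[name] = parse_member_table(table.get_attribute("outerHTML"))
    return results
```

With a driver created as in the answer's code, scrape_all_boards(driver) would return a dict keyed by board name. A vacant seat shows up as the single-cell row ['---Vacant Seat---'], and a board with no data yields an empty list. If the site only hides the old table instead of removing it, the staleness wait would time out and would need to be replaced with a different condition.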

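Because the script in the question fetches its rows from /api/boards/<id>/members, it may be possible to skip the browser and request that JSON directly. The sketch below has not been run against the live site and rests on several assumptions: that the endpoint answers a plain GET without cookies or special headers, that board ids can be taken from the boardsList array embedded in the page, and that the JSON fields are the ones the script reads (firstName, lastName, title, position, expirationDate). The function names are hypothetical, and since FormatString is not shown in the question, missing values are simply treated as empty strings.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://boards.stlouisco.com"


def fetch_members(board_id, base_url=BASE_URL):
    """GET /api/boards/<id>/members and decode the JSON list (assumed shape)."""
    with urlopen(f"{base_url}/api/boards/{board_id}/members", timeout=30) as resp:
        return json.load(resp)


def members_to_rows(members, total_seats):
    """Mirror the page's loop: one row per seat, padded with vacant seats."""
    if not members:
        # The page shows "There was no data found for this board." here.
        return []
    rows = []
    for i in range(total_seats):
        if i < len(members):
            m = members[i]
            name = f"{m.get('firstName') or ''} {m.get('lastName') or ''}".strip()
            rows.append([
                name,
                m.get("title") or "",
                m.get("position") or "",
                m.get("expirationDate") or "",
            ])
        else:
            rows.append(["---Vacant Seat---"])
    return rows
```

For the one board visible in the question's boardsList (id 407, "Aging Ahead", totalSeats 14), the call would be members_to_rows(fetch_members(407), 14). Whether the endpoint accepts requests from outside the page is something to verify before relying on this route.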