Loading dependent dropdown options while web scraping

Problem description

I am trying to scrape data from the following website: http://www.equibase.com/stats/View.cfm?tf=meet&tb=jockey&rbt=TB

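I want the VBA code to perform the following steps:

  1. Go to the URL.
  2. Click "Jockey".
  3. Select a track from the dropdown list, say "ALBUQUERQUE".
  4. Depending on the selected track, the page loads the "Available Meets" dropdown.

Now I want to select the first meet from this dropdown.

My code selects the value "ALBUQUERQUE" from the first dropdown, but the second dropdown is not populated with data.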

Sub extract()

Dim ie As New InternetExplorer
Dim doc As New HTMLDocument
Dim optionText As String

optionText = "ALBUQUERQUE"
ie.Visible = True
Url = "http://www.equibase.com/stats/View.cfm?tf=meet&tb=jockey&rbt=TB"
ie.Navigate Url

Application.StatusBar = "Navigating to URL..."

Do
    DoEvents
Loop Until ie.ReadyState = READYSTATE_COMPLETE

Do While ie.Busy
    DoEvents
Loop

Set doc = ie.Document

Set jockeyButton = doc.getElementsByClassName("scMainTab")
   
For Each Button In jockeyButton
    If Button.getAttribute("href") = "#jockey" Then
        Button.Click
        Exit For
    End If
Next Button

Set tracksDropdown = doc.getElementById("selAvailTracks")

''AT THIS POINT, IT SHOULD AUTOMATICALLY LOAD THE SECOND DROP DOWN BUT IT IS NOT HAPPENING

ie.Quit
Set ie = Nothing

End Sub

How can I select the first item from the second dropdown?

Tags: excel, vba, web-scraping

Solution


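The magic word is "HTML events". For a selection in a dropdown to take effect, the dropdown's change event must be triggered. Otherwise nothing happens.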

You cannot assign the visible text "ALBUQUERQUE" to the first dropdown. The option displayed as "ALBUQUERQUE" has the value "ALB:USA":

<select id="selAvailTracks" name="selAvailTracks" class="scTrackSelects">
  <option value=""> Available Tracks </option>
  <option value="ALB:USA">ALBUQUERQUE</option>
  <option value="AQU:USA">AQUEDUCT</option>
  <option value="ARP:USA">ARAPAHOE PARK</option>
  <option value="AZD:USA">ARIZONA DOWNS</option>
  <option value="AP :USA">ARLINGTON</option>
  <option value="ASD:CAN">ASSINIBOIA DOWNS</option>
  <option value="ATO:USA">ATOKAD DOWNS</option>
  <option value="BEL:USA">BELMONT PARK</option>
  ...
  ...
  ...
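If you only know the visible text of an option, one way to get the matching value is to loop over the dropdown's options and compare their text. The following is a minimal sketch, not part of the original answer: the function name GetOptionValueByText is hypothetical, and it assumes the dropdown is passed in as a late-bound object, as in the macro further down.

```vba
'Hypothetical helper, not part of the original answer.
'Returns the value attribute of the first option whose visible text
'matches optionText (case-insensitive, surrounding spaces ignored).
'Returns an empty string when no option matches.
Private Function GetOptionValueByText(dropdown As Object, optionText As String) As String

  Dim i As Long

  For i = 0 To dropdown.Options.Length - 1
    If StrComp(Trim$(dropdown.Options(i).Text), Trim$(optionText), vbTextCompare) = 0 Then
      GetOptionValueByText = dropdown.Options(i).Value
      Exit Function
    End If
  Next i
End Function
```

With the excerpt above, calling GetOptionValueByText(nodeTracksDropdown, "ALBUQUERQUE") after the dropdown has been fetched should give "ALB:USA", which can then be assigned to the dropdown's Value. Check for an empty result before using it, since the track may not be in the list.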

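Another way to select an entry is by the index of the desired element. This is what is used for dropdown no. 2.

Try this macro to make the selections, including dropdown 2: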

Sub Extract()

'Declare all variables
Dim url As String
Dim browser As Object
Dim htmlDoc As Object
Dim nodeTracksDropdown As Object
Dim dateDropdown As Object
Dim trackInDropdown As String

  'Initialize variables
  trackInDropdown = "ALB:USA" 'You can also get this from a cell of a table
  url = "http://www.equibase.com/stats/View.cfm?tf=meet&tb=jockey&rbt=TB"

  'Initialize Internet Explorer, set visibility,
  'call URL and wait until page is fully loaded
  Set browser = CreateObject("internetexplorer.application")
  browser.Visible = True
  browser.navigate url
  Do Until browser.ReadyState = 4: DoEvents: Loop
  'Short break to load dynamic content
  Application.Wait (Now + TimeSerial(0, 0, 3))

  'Shortening document reference
  Set htmlDoc = browser.document

  'Get first dropdown, select track, trigger change event
  'and wait a second to set up the second dropdown
  Set nodeTracksDropdown = htmlDoc.getElementById("selAvailTracks")
  nodeTracksDropdown.Value = trackInDropdown
  Call TriggerEvent(htmlDoc, nodeTracksDropdown, "change")
  Application.Wait (Now + TimeSerial(0, 0, 1))

  'Get second dropdown, select the entry at index 1 (indexes start at 0),
  'trigger change event and wait a second to set up the following elements
  Set dateDropdown = htmlDoc.getElementById("selAvailRaceMeets")
  dateDropdown.selectedIndex = 1
  Call TriggerEvent(htmlDoc, dateDropdown, "change")
  Application.Wait (Now + TimeSerial(0, 0, 1))

  'Do whatever you want here
  '...
  '...
  '...

  'Clean up
  'browser.Quit
  'Set browser = Nothing
  'Set nodeTracksDropdown = Nothing
  'Set dateDropdown = Nothing
End Sub

This procedure triggers the HTML event:

Private Sub TriggerEvent(htmlDocument As Object, htmlElementWithEvent As Object, eventType As String)

  Dim theEvent As Object

  htmlElementWithEvent.Focus
  Set theEvent = htmlDocument.createEvent("HTMLEvents")
  theEvent.initEvent eventType, True, False
  htmlElementWithEvent.dispatchEvent theEvent
End Sub
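The fixed one-second pauses in the macro may be too short on a slow connection. A possible alternative is to poll until the dependent dropdown actually contains options. This is only a sketch under assumptions: the function name WaitForOptions and its parameters are hypothetical, and it assumes the second dropdown keeps the id "selAvailRaceMeets" and ends up with at least two options (a placeholder plus one meet) once loading has finished.

```vba
'Hypothetical helper, not part of the original answer.
'Polls the document until the element with the given id has at least
'minOptions options, or until timeoutSeconds have passed.
'Returns True if the options appeared in time, otherwise False.
Private Function WaitForOptions(htmlDoc As Object, dropdownId As String, _
                                minOptions As Long, timeoutSeconds As Long) As Boolean

  Dim deadline As Date
  Dim dropdown As Object

  deadline = DateAdd("s", timeoutSeconds, Now)
  Do
    DoEvents
    'Fetch the element again on every pass in case the page replaced it
    Set dropdown = htmlDoc.getElementById(dropdownId)
    If Not dropdown Is Nothing Then
      If dropdown.Options.Length >= minOptions Then
        WaitForOptions = True
        Exit Function
      End If
    End If
  Loop While Now < deadline
End Function
```

In the macro, the pause after the first TriggerEvent call could then be replaced by a check such as If Not WaitForOptions(htmlDoc, "selAvailRaceMeets", 2, 10) Then Exit Sub, so the code continues as soon as the meets are available and stops cleanly if they never arrive.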
