Web scraping by clicking a button

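Problem description

Team, I am trying to click a Load More button. I can click it once and run the macro without any problem, but only once. I need help with the following points:

  1. I am trying to automate the code so that it clicks the button repeatedly until the page has loaded all of the data for web scraping.

  2. I also need code that checks whether the Load More button exists on the web page before the data is scraped into Excel. If the Load More button cannot be found, execution should continue with the next piece of code. (FYI, the Load More button is at the bottom of my web page.)

Thanks. If my question is unclear, please let me know.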

Below is the HTML code before the Load More button is clicked:

<button type="button" class="btn primary btn-primary modal-button-print add-notes" data-bind="click: getNotes, visible: isLoadMoreButtonEnable() &amp;&amp; !$root.providerShouldAcceptDecline()">
  <i class="fa fa-refresh" aria-hidden="true"></i>Load More
</button>    

Below is the HTML code after the Load More button has been clicked repeatedly until all of the data has loaded:

<button class="btn primary btn-primary modal-button-print add-notes" style="display: none;" type="button" data-bind="click: getWoNotes, visible: isLoadMoreNotesButtonEnable() &amp;&amp; !$root.providerShouldAcceptDecline()">
                <i class="fa fa-refresh" aria-hidden="true"></i>Load More Work Order Notes
            </button>

The difference I can see in the HTML above is that style="display: none;" is added once I have clicked the button enough times for the page to load all of its data.

I have a sample website that is similar to my web page. I am using this link here only to show how the page loads on my site.

Sub abc()

  
Set IE = New InternetExplorer
Link = my url
.
.
.
.
For L = 2 To Lr1
    IE.navigate Link 
    Set Html = New MSHTML.HTMLDocument
    Set Ws = Scraping
    Do
    DoEvents: Loop Until IE.readyState = READYSTATE_COMPLETE
    Application.Wait (Now + TimeValue("00:00:05"))
    IE.document.querySelector("button[type=button]").Click
    Do
    DoEvents: Loop Until IE.readyState = READYSTATE_COMPLETE
    Application.Wait (Now + TimeValue("00:00:05"))
    IE.document.querySelector("button[type=button]").Click
    Do
    DoEvents: Loop Until IE.readyState = READYSTATE_COMPLETE
    Application.Wait (Now + TimeValue("00:00:05"))
    IE.document.querySelector("button[type=button]").Click
    Do
    DoEvents: Loop Until IE.readyState = READYSTATE_COMPLETE
    Application.Wait (Now + TimeValue("00:00:05"))

    Html.body.innerHTML = IE.document.querySelectorAll(".list").Item(1).outerHTML
    Set Tariku = Html.querySelectorAll(".columns")
    Set data = Html.querySelectorAll(".datalist")
        With Ws

        ' Do all the stuff  

        End With
        IE.document.querySelector("#Logout").Click
        IE.Quit
       Exit Sub
    
  Next L

End Sub

Tags: html, excel, vba, web-scraping

Solution


You can try this. If it doesn't work, could you post the URL?

Sub Abc()

Dim browser As Object
Dim url As String
Dim nodeButton As Object
Dim noButtonFound As Boolean

  url = "Your URL here"

  'Initialize Internet Explorer, set visibility,
  'call URL and wait until page is fully loaded
  Set browser = CreateObject("internetexplorer.application")
  browser.Visible = False
  browser.navigate url
  Do Until browser.ReadyState = 4: DoEvents: Loop

  'Click the button for as long as it is found and visible
  Do
    'Try to get the button (the first <button> element on the page)
    Set nodeButton = browser.document.getElementsByTagName("button")(0)

    If nodeButton Is Nothing Then
      'No button found, leave loop
      noButtonFound = True
    ElseIf nodeButton.Style.display = "none" Then
      'Button is hidden (style="display: none;"), so all data is loaded
      noButtonFound = True
    Else
      'Button is visible (it may have no style attribute at all), click it
      nodeButton.Click

      'Wait for more data to load
      Application.Wait (Now + TimeSerial(0, 0, 5))
    End If
  Loop Until noButtonFound

  'All dynamic data has been loaded
  'Do whatever you want with the page here
  'I don't think you need a new HTML document for that
End Sub
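
If the first button element on the page is not the Load More button, the loop may click the wrong element. Below is a minimal sketch of a variant that targets the button by class and stops after a maximum number of clicks. The names IsHiddenByStyle, ClickLoadMoreUntilDone and maxClicks are hypothetical, and the selector "button.add-notes" is an assumption based on the class names in the posted HTML; it may need adjusting if several buttons share that class.

```vba
'Hypothetical helper: True when an inline style string hides the element
Function IsHiddenByStyle(ByVal styleText As String) As Boolean
  Dim normalized As String
  'Strip spaces and lower-case, so "display: none;" and "DISPLAY:none" both match
  normalized = LCase$(Replace(styleText, " ", ""))
  IsHiddenByStyle = (InStr(normalized, "display:none") > 0)
End Function

'Hypothetical helper: clicks Load More until it is missing, hidden,
'or maxClicks is reached. Returns the number of clicks performed.
Function ClickLoadMoreUntilDone(ByVal doc As Object, _
                                Optional ByVal maxClicks As Long = 50) As Long
  Dim nodeButton As Object
  Dim clicks As Long

  Do While clicks < maxClicks
    'Assumed selector, taken from the class names in the question's HTML
    Set nodeButton = doc.querySelector("button.add-notes")

    'Button not on the page at all: nothing to click
    If nodeButton Is Nothing Then Exit Do

    'Button hidden via inline style: all data is loaded
    If IsHiddenByStyle(nodeButton.Style.cssText) Then Exit Do

    nodeButton.Click
    clicks = clicks + 1

    'Fixed 5-second pause, as in the answer above; adjust to the page's load time
    Application.Wait Now + TimeSerial(0, 0, 5)
  Loop

  ClickLoadMoreUntilDone = clicks
End Function
```

It could be called as ClickLoadMoreUntilDone(browser.document) once the page has finished loading. A return value of 0 means the button was missing or already hidden, which covers the "continue with the next code" case from the question.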
