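html - Web scraping by clicking a button
Problem description
Team, I am trying to click a Load More button. I can click it once and run the macro without problems, but only once. I need help with the following points.
I am trying to automate the code so that it clicks the button repeatedly until the page has loaded all of its data for web scraping.
I also need code that checks whether the Load More button exists on the page before the data is scraped into Excel. If no "Load More" button is found, execution should continue with the next piece of code. (FYI, Load More sits at the bottom of my page.)
Thanks. If my question is unclear, please reply.
Below is the HTML before clicking the Load More button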
<button type="button" class="btn primary btn-primary modal-button-print add-notes" data-bind="click: getNotes, visible: isLoadMoreButtonEnable() && !$root.providerShouldAcceptDecline()">
<i class="fa fa-refresh" aria-hidden="true"></i>Load More
</button>
Below is the HTML after clicking the Load More button repeatedly until all the data has loaded
<button class="btn primary btn-primary modal-button-print add-notes" style="display: none;" type="button" data-bind="click: getWoNotes, visible: isLoadMoreNotesButtonEnable() && !$root.providerShouldAcceptDecline()">
<i class="fa fa-refresh" aria-hidden="true"></i>Load More Work Order Notes
</button>
The difference I see in the HTML above is that style="display: none;" is added once I have clicked the button enough times for the page to load all of its data.
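That observation can be turned into a small test on the button's inline style text. The following is only a minimal sketch: `IsHiddenByInlineStyle` and `DemoIsHidden` are hypothetical names that do not appear in the original post, and the check assumes the page hides the button through an inline `display: none` rather than through a CSS class.

```vba
Option Explicit

' Hypothetical helper (not from the original post): returns True when an
' inline style string hides the element via "display: none".
Function IsHiddenByInlineStyle(ByVal styleText As String) As Boolean
    Dim normalized As String
    ' Lower-case and strip spaces so "display:none" and "DISPLAY: none;" both match
    normalized = Replace(LCase$(styleText), " ", "")
    IsHiddenByInlineStyle = (InStr(1, normalized, "display:none") > 0)
End Function

' Small demo: prints the result for a hidden and a non-hidden style string
Sub DemoIsHidden()
    Debug.Print IsHiddenByInlineStyle("display: none;")
    Debug.Print IsHiddenByInlineStyle("")
End Sub
```

Normalizing the text first avoids depending on the exact spacing and trailing semicolon, which a strict comparison against "display: none;" would require.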
I have a sample website that is similar to my page. I am using this link here only to show how the page loads on my site.
Sub abc()
    Set IE = New InternetExplorer
    Link = my url
    .
    .
    .
    .
    For L = 2 To Lr1
        IE.navigate Link
        Set Html = New MSHTML.HTMLDocument
        Set Ws = Scraping
        Do
            DoEvents: Loop Until IE.readyState = READYSTATE_COMPLETE
        Application.Wait (Now + TimeValue("00:00:05"))
        IE.document.querySelector("button[type=button]").Click
        Do
            DoEvents: Loop Until IE.readyState = READYSTATE_COMPLETE
        Application.Wait (Now + TimeValue("00:00:05"))
        IE.document.querySelector("button[type=button]").Click
        Do
            DoEvents: Loop Until IE.readyState = READYSTATE_COMPLETE
        Application.Wait (Now + TimeValue("00:00:05"))
        IE.document.querySelector("button[type=button]").Click
        Do
            DoEvents: Loop Until IE.readyState = READYSTATE_COMPLETE
        Application.Wait (Now + TimeValue("00:00:05"))
        Html.body.innerHTML = IE.document.querySelectorAll(".list").Item(1).outerHTML
        Set Tariku = Html.querySelectorAll(".columns")
        Set data = Html.querySelectorAll(".datalist")
        With Ws
            ' Do all the stuff
        End With
        IE.document.querySelector("#Logout").Click
        IE.Quit
        Exit Sub
    Next L
End Sub
Solution
You can try this. If it doesn't work, could you post the URL?
Sub Abc()
    Dim browser As Object
    Dim url As String
    Dim nodeButton As Object
    Dim noButtonFound As Boolean
    url = "Your URL here"
    'Initialize Internet Explorer, set visibility,
    'call URL and wait until page is fully loaded
    Set browser = CreateObject("internetexplorer.application")
    browser.Visible = False
    browser.navigate url
    Do Until browser.ReadyState = 4: DoEvents: Loop
    'Click button as often as found
    Do
        'Try to catch button
        Set nodeButton = browser.document.getElementsByTagName("button")(0)
        'Check if button was found
        If Not nodeButton Is Nothing Then
            'Check if it has a style attribute
            If nodeButton.hasAttribute("style") Then
                'Check if button is visible
                If nodeButton.getAttribute("style") <> "display: none;" Then
                    'Click button
                    nodeButton.Click
                    'Wait for more data to load
                    Application.Wait (Now + TimeSerial(0, 0, 5))
                Else
                    'Button is hidden, leave loop
                    noButtonFound = True
                End If
            Else
                'No style attribute found, leave loop
                noButtonFound = True
            End If
        Else
            'No button found, leave loop
            noButtonFound = True
        End If
    Loop Until noButtonFound
    'All dynamic data has been loaded
    'Do whatever you want here
    'But I think you don't need a new html document
End Sub
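One thing to watch in the loop above: it stops as soon as the button has no style attribute at all, which is how the button looks in the question's "before" HTML. A possible variant, offered only as a sketch, reads the element's `style.display` property instead, so a button without an inline style counts as visible. `ClickUntilHidden` and `MAX_CLICKS` are hypothetical names introduced here, the cap of 50 clicks is an arbitrary assumption, and taking the first `<button>` on the page simply mirrors the answer's own assumption.

```vba
Option Explicit

' Assumed safety cap so a button that never hides cannot loop forever
Private Const MAX_CLICKS As Long = 50

' Clicks the first <button> until it is missing or hidden and returns the
' number of clicks. doc is expected to be the document object of an
' already loaded InternetExplorer instance.
Function ClickUntilHidden(ByVal doc As Object) As Long
    Dim buttons As Object
    Dim btn As Object
    Dim clicks As Long

    Do While clicks < MAX_CLICKS
        Set buttons = doc.getElementsByTagName("button")
        If buttons.Length = 0 Then Exit Do           'No button on the page

        Set btn = buttons.Item(0)
        'Without an inline style, style.display is "" and the button counts as visible
        If LCase$(btn.Style.display) = "none" Then Exit Do

        btn.Click
        clicks = clicks + 1
        Application.Wait Now + TimeSerial(0, 0, 5)   'Same fixed 5 second wait as above
    Loop

    ClickUntilHidden = clicks
End Function
```

It could be called as `clicks = ClickUntilHidden(browser.document)` once the page has finished loading. If the page has no button at all, the function returns 0 and the macro simply carries on with the scraping code.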