html - VBA网页抓取代码错误与宏链接
问题描述
我已经完成了我创建的这段代码,看起来我遇到了一些问题。当您手动进入代码并运行它时,代码可以正常工作,但是每次我尝试使用宏按钮自动运行代码时都会遇到问题。
我收到运行时错误“70”:权限被拒绝。我不确定为什么代码在我自动运行时会跳闸并抛出此代码。
这个想法是能够在 excel 中输入城镇和州,它将搜索这两个网站的数据。
我附上了下面的代码
'start a new subroutine called SearchBot
Sub SearchBot1()
'dimension (declare or set aside memory for) our variables
Dim objIE As InternetExplorer 'special object variable representing the IE browser
Dim aEle As HTMLLinkElement 'special object variable for an <a> (link) element
Dim HTMLinputs As MSHTML.IHTMLElementCollection
Dim y As Integer 'integer variable we'll use as a counter
Dim result As String 'string variable that will hold our result link
'initiating a new instance of Internet Explorer and asigning it to objIE
Set objIE = New InternetExplorer
'make IE browser visible (False would allow IE to run in the background)
'objIE.Visible = True
'navigate IE to this web page (a pretty neat search engine really)
objIE.navigate "https://www.zillow.com/orange-county-ny/home-values/"
'wait here a few seconds while the browser is busy
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
'in the search box put cell "A2" value, the word "in" and cell "C1" value
objIE.document.getElementById("local-search").Value = _
Sheets("Sheet2").Range("B3").Value & ", " & Sheets("Sheet2").Range("B4").Value
'click the 'go' button
Set HTMLinputs = objIE.document.getElementsByTagName("button")
For Each input_element In HTMLinputs
If input_element.getAttribute("name") = "SubmitButton" Then
input_element.Click
Exit For
End If
Next input_element
'wait again for the browser
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
'price for home
Set Doc = objIE.document
Dim cclass As String
cclass = Trim(Doc.getElementsByClassName("value-info-list")(0).Children(0).innerText)
'MsgBox (cclass)
Dim aclass As Variant
aclass = Split(cclass, " ")
Range("Market_Price").Value = aclass(0)
'1-YR Forecast
cclass = Trim(Doc.getElementsByClassName("value-info-list")(0).Children(1).innerText)
'MsgBox (cclass)
Dim bclass As Variant
bclass = Split(cclass, " ")
Range("yr_forecast").Value = bclass(0)
'Median List Price
cclass = Trim(Doc.getElementsByClassName("value-info-list")(0).Children(2).innerText)
'MsgBox (cclass)
Dim dclass As Variant
dclass = Split(cclass, " ")
Range("Median_List_Price").Value = dclass(0)
'Median Sale Price
cclass = Trim(Doc.getElementsByClassName("value-info-list")(0).Children(3).innerText)
'MsgBox (cclass)
Dim eclass As Variant
eclass = Split(cclass, " ")
Range("Median_Sale_Price").Value = eclass(0)
'Health of market
cclass = Trim(Doc.getElementsByClassName("value-info-list")(1).Children(0).innerText)
'MsgBox (cclass)
Dim fclass As Variant
fclass = Split(cclass, " ")
Range("Healthy").Value = fclass(0)
' Home with Negative Equity
cclass = Trim(Doc.getElementsByClassName("value-info-list")(1).Children(1).innerText)
'MsgBox (cclass)
Dim gclass As Variant
gclass = Split(cclass, " ")
Range("Home_With_Negative_Equity").Value = gclass(0)
'Delinquent on Mortgage
cclass = Trim(Doc.getElementsByClassName("value-info-list")(1).Children(2).innerText)
'MsgBox (cclass)
Dim hclass As Variant
hclass = Split(cclass, " ")
Range("Delinquent_On_Mortgage").Value = hclass(0)
'Listings with price cut
cclass = Trim(Doc.getElementsByClassName("value-info-list")(2).Children(2).innerText)
'MsgBox (cclass)
Dim iclass As Variant
iclass = Split(cclass, " ")
Range("Price_Cut").Value = iclass(0)
'Breakeven Horizon
cclass = Trim(Doc.getElementsByClassName("value-info-list")(3).Children(2).innerText)
'MsgBox (cclass)
Dim jclass As Variant
jclass = Split(cclass, " ")
Range("Breakeven").Value = jclass(0)
'Rent List Price
cclass = Trim(Doc.getElementsByClassName("value-info-list")(3).Children(3).innerText)
'MsgBox (cclass)
Dim kclass As Variant
kclass = Split(cclass, " ")
Range("Rent_List_Price").Value = kclass(0)
'Rent List Price/sq ft
cclass = Trim(Doc.getElementsByClassName("value-info-list")(3).Children(4).innerText)
'MsgBox (cclass)
Dim lclass As Variant
lclass = Split(cclass, " ")
Range("Rent_sq").Value = lclass(0)
'close the browser
objIE.Quit
Set ws = ThisWorkbook.Worksheets("Engine")
'initiating a new instance of Internet Explorer and asigning it to objIE
Set objIE = New InternetExplorer
'make IE browser visible (False would allow IE to run in the background)
'objIE.Visible = True
'navigate IE to this web page (a pretty neat search engine really)
objIE.navigate "https://datausa.io/profile/geo/" & ws.Range("City_Search").Value & "-" & ws.Range("State_Search").Value
'wait here a few seconds while the browser is busy
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
Set Doc = objIE.document
Dim Data As String
Data = Trim(Doc.getElementsByClassName("stat")(0).Children(1).innerText)
'MsgBox (Data)
Dim adata As Variant
adata = Split(Data, "")
ws.Range("Population").Value = adata(0)
End Sub
'exit our SearchBot subroutine
如果有人可以帮助我,将不胜感激。我将继续解决问题,看看我是否可以让它工作。如果您对公式有任何疑问,请询问。
谢谢
解决方案
关于您的代码的注释:
- 我无法复制您的错误。有时您可能会遇到验证码请求。您可以通过设置来检查这一点
objIE.Visible = True
。 - 您正在打开一个 IE 实例,退出它,然后重新打开。您可以继续使用现有的并导航到新的 URL。这将减少代码量。
- 您有未声明的变量,这意味着您没有
Option Explicit
在代码顶部使用。 - 有点不清楚您打算在某些时候使用哪些工作表,因为您没有完全限定范围与它们的父工作表对象。因此我的假设如下。当您不符合条件时,范围对象将使用 Activesheet。
- 您的人口统计数据会返回下降百分比以及实际人口数据 - 这是您将许多其他结果拆分以获得子字符串时的意图吗?
- 通过以
value
eg类为目标doc.getElementsByClassName("value")
,您可以遍历返回的集合并完全避免使用Split
并大大减少代码量。 - 您可以关闭屏幕更新以获得更快的结果。
- 根据您执行此操作的频率,您可以切换到 XHR,如下所示,这要快得多。
我的代码:
我不得不做出一些假设,但以下内容从页面中获取信息。我假设所有信息都来自和去往,sheet2
除了Population
. 我在下面的屏幕截图中显示它,以便您可以一起查看所有结果。
XHR 和提琴手:
我曾经fiddler
在进行选择并按下搜索按钮时检查网络流量。这表明我提出了一个GET
请求,我使用fiddler的检查员提供的信息来制定正确的GET
请求。
提琴手结果:
请注意,如果您在太短的时间内尝试太多GET
请求,您最终会得到验证码。
CSS 选择器:
检查检索到的页面的 HTML,我可以看到所有相关值的 className 为value
我可以通过使用.value
where"."
表示类的 CSS 选择器来定位这些元素。
匹配元素示例:
由于匹配的元素很多,我使用 的.querySelectorAll
方法document
来检索NodeList
包含所有匹配项的 a。我遍历.Length
以NodeList
访问所需的值。我使用Select Case
通过索引位置来确定将值写入哪个命名范围。您可能需要验证我是否正确。
VBA:
Option Explicit
Public Sub GetInfo()
Dim html As New MSHTML.HTMLDocument, ws As Worksheet
Application.ScreenUpdating = False
Set ws = ThisWorkbook.Worksheets("Sheet2") '<== Must be sheet name where named ranges are
Dim searchString As String, URL As String
searchString = Replace$(LCase$(ws.Range("B3")), Chr$(32), Chr$(45)) & Chr$(45) & LCase$(ws.Range("B4")) 'Santa Ana in B3, CA in B4
URL = "https://www.zillow.com/" & searchString & "/home-values/"
html.body.innerHTML = GetHTML(URL)
Dim aNodeList As Object, i As Long
Set aNodeList = html.querySelectorAll(".value")
With ws
For i = 0 To aNodeList.Length - 1
Select Case i
Case 0 ' ZHVI
.Range("Market_Price") = aNodeList.item(i).innerText
Case 1 ' 1-yr forecast
.Range("yr_forecast") = aNodeList.item(i).innerText
Case 2 'Median listing price
.Range("Median_List_Price") = aNodeList.item(i).innerText
Case 3 'Median sale price
.Range("Median_Sale_Price") = aNodeList.item(i).innerText
Case 4 'Market Health Index
.Range("Healthy") = aNodeList.item(i).innerText
Case 5 'Homes with negative equity
.Range("Home_With_Negative_Equity") = aNodeList.item(i).innerText
Case 6 'Delinquent on mortgage
.Range("Delinquent_On_Mortgage") = aNodeList.item(i).innerText
Case 7 'Median list price / sq ft
Case 8 'Median sale price / sq ft
Case 9
Case 10 'Value Listings with price cut
.Range("Price_Cut") = aNodeList.item(i).innerText
Case 11 'Breakeven horizon
.Range("Breakeven") = aNodeList.item(i).innerText
Case 12
.Range("Rent_List_Price") = aNodeList.item(i).innerText ' Rent list price
Case 13 'List price / sq ft
.Range("Rent_sq") = aNodeList.item(i).innerText
End Select
Next i
URL = "https://datausa.io/profile/geo/" & searchString
html.body.innerHTML = GetHTML(URL)
ThisWorkbook.Worksheets("Engine").Range("Population") = html.querySelector(".stat-value").innerText
End With
Application.ScreenUpdating = True
End Sub
Public Function GetHTML(ByVal URL As String) As String
Dim sResponse As String
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
GetHTML = Mid$(sResponse, InStr(1, sResponse, "<!DOCTYPE "))
End Function
工作表2中的结果:
所需参考资料:
HTML 对象库
推荐阅读
- javascript - 如何为使用代理选项的 axios 设置超时?
- ios - . 为什么我收到错误消息?构建一个 React 应用程序
- pointers - 工厂构造类中的段错误
- ios - UIButton 文本在更新为新文本之前闪烁“旧”文本
- makefile - 比较 Makefile 中两个文件的名称
- r - R中的概率回归
- azure - 是否可以在 Azure B2C 中使用登录用户流后立即运行密码重置用户流?
- python - “NoneType”对象没有“编码”属性和 SMTP 334 错误
- python - 如何在 postgres 中强制 sqalchemy 浮点类型为实数
- python - 尝试在 Sagemaker 笔记本实例中进行预测时,Sagemaker 端点返回错误