Unable to embed images next to the relevant ASINs in a worksheet

Problem description

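I'm trying to create a script in VBA, in combination with IE, that scrapes image links from certain web pages and embeds the images next to a column of ASINs in a worksheet. Column A holds the list of ASINs, and I'd like to embed the corresponding images in column B. The script takes each ASIN from column A and forms a full link by appending that ASIN to the base URL https://www.amazon.in/dp/.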

Here are some ASINs:

B08SRFZX5Z
B08KKJQ8N7
B081RC61YN

They form the following links:

https://www.amazon.in/dp/B08SRFZX5Z
https://www.amazon.in/dp/B08KKJQ8N7
https://www.amazon.in/dp/B081RC61YN

I've tried:

Sub GetImages()
    Const URL$ = "https://www.amazon.in/dp/"
    Dim IE As Object, ws As Worksheet, cel As Range
    
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set IE = CreateObject("InternetExplorer.Application")

    For Each cel In ws.Range("A2:A" & ws.Cells(Rows.count, 1).End(xlUp).Row)
        IE.Visible = False
        IE.navigate URL & cel
        While IE.Busy Or IE.readyState < 4: DoEvents: Wend
        ws.Range(cel(1, 2).Address) = IE.document.querySelector("[id='imgTagWrapperId'] > img").getAttribute("src")
    Next cel
End Sub

How can I embed the images next to the list of ASINs in the worksheet?

Tags: vba, internet-explorer, web-scraping

Solution

Please test the following code (without IE):

Sub GetImagesNoIE()
 Dim imageUrl$, p As Shape, ws As Worksheet, cel As Range, rng As Range, arrSrc
 Dim Http As Object, HTMLDoc As Object    'late binding
 'to benefit from IntelliSense suggestions (early binding), add the next two references:
 'Microsoft WinHTTP Services, version 5.1
 'Microsoft HTML Object Library
 Const URL$ = "https://www.amazon.in/dp/"
 
 Set Http = CreateObject("WinHttp.WinHttpRequest.5.1")
 Set HTMLDoc = CreateObject("htmlfile")
 
 Set ws = ActiveSheet 'ThisWorkbook.Worksheets("Sheet1")
 'delete the previous (existing) shapes in column B:B, if any:
 Application.ScreenUpdating = False
 Application.EnableEvents = False
  Set rng = ws.Range("B:B")
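  Dim i As Long
  'loop backwards, since deleting while iterating forward can skip shapes;
  'Intersect returns Nothing when the shape is outside column B, so test with "Is Nothing":
  For i = ws.Shapes.Count To 1 Step -1
      If Not Intersect(rng, ws.Shapes(i).TopLeftCell) Is Nothing Then ws.Shapes(i).Delete
  Next i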

  For Each cel In ws.Range("A2:A" & ws.Cells(Rows.Count, 1).End(xlUp).Row)
    If cel.Value <> "" Then
        With Http
            .Open "GET", URL & cel.Value, False
            .setRequestHeader "User-Agent", "Firefox"
            .send
            HTMLDoc.body.innerHTML = .responseText
        End With
        If InStr(HTMLDoc.body.innerText, "We're sorry. The Web address you entered is not a functioning page on our site") = 0 Then
             imageUrl = HTMLDoc.querySelector("[id='imgTagWrapperId'] > img").getAttribute("src")
             Set p = ws.Shapes.AddPicture(imageUrl, msoFalse, msoTrue, cel.Offset(0, 1).Left, _
                                   cel.Offset(0, 1).Top, cel.Offset(0, 1).Width, cel.Offset(0, 1).Height)
            
        Else
            cel.Offset(0, 1).Value = "The Web address you entered is not a functioning page on our site"
        End If
    End If
  Next
  'restore the application settings switched off at the start:
  Application.ScreenUpdating = True
  Application.EnableEvents = True
  MsgBox "Ready..."
End Sub
