How do I find a div element by searching in Selenium, and then copy an attribute from that div using Selenium and Python?

Problem description

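I want to get the ASIN (an attribute) from a div element in the HTML, append it to amazon.com/dp/ to form a URL, and then visit that URL. The div has no id, but it is identified by the attribute data-index="1" on the div element, so I would like to know how to locate this div and then read its asin attribute specifically. Thanks for reading. I am using Python 3.7 and Selenium WebDriver.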

import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()

email = ('.')
password = ('.')
query = ('macbook')

urls = []
prices = []
names = []
descs = []

def search_amazon(query):
    driver.get('https://amazon.com/')
    searchBox = driver.find_element_by_id('twotabsearchtextbox')
    time.sleep(2)
    searchBox.send_keys(query)
    searchBox.send_keys(Keys.ENTER)
    time.sleep(3)

    firstResult = driver.find_element_by_name('data-index="1"')
    asin = firstResult.getAttribute('data-asin')
    print(asin)
    url = 'https://amazon.com/dp/' + asin
    driver.get(url)
    print(url)

    return url

search_amazon(query)

Tags: python, html, selenium, selenium-webdriver

Solution


You need to replace these two lines of code with the code I provide below.

firstResult = driver.find_element_by_name('data-index="1"')
asin = firstResult.getAttribute('data-asin')

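This fails because data-index is not the element's name; it is a separate attribute, and find_element_by_name only matches the name attribute. In addition, the Python method is get_attribute, not getAttribute. You can use the following CSS selector instead.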

firstResult = driver.find_element_by_css_selector('div[data-index="1"]>div')
asin = firstResult.get_attribute('data-asin')
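
The find_element_by_* helpers used here were deprecated in Selenium 4 and removed in 4.3, so on a newer installation the same lookup can be written with the generic find_element(by, value) call. The following is only a minimal sketch: the function name first_result_asin and its selector parameter are hypothetical and not part of the original answer, and driver is assumed to be a Selenium WebDriver.

```python
def first_result_asin(driver, selector='div[data-index="1"]>div'):
    """Return the data-asin value of the first element matching `selector`.

    `driver` is assumed to be a Selenium WebDriver, or any object that
    offers the same find_element(by, value) method.
    """
    # "css selector" is the string value of By.CSS_SELECTOR from
    # selenium.webdriver.common.by; it is spelled out here so that the
    # sketch needs no import of its own.
    element = driver.find_element("css selector", selector)
    # get_attribute returns None when the element lacks the attribute.
    return element.get_attribute("data-asin")
```

With a real browser this would be called as first_result_asin(driver) once the results page has loaded. If data-asin sits on the same div as data-index, as the question describes, the selector 'div[data-index="1"]' can be passed instead.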

Here is the working code.

import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()

email = ('.')
password = ('.')
query = ('macbook')

urls = []
prices = []
names = []
descs = []

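def search_amazon(query):
    # Open the home page and locate the search box by its id
    driver.get('https://amazon.com/')
    searchBox = driver.find_element_by_id('twotabsearchtextbox')
    time.sleep(2)
    # Type the query and submit it, then give the results page time to load
    searchBox.send_keys(query)
    searchBox.send_keys(Keys.ENTER)
    time.sleep(3)

    # Match on the data-index attribute with a CSS selector, then read data-asin
    firstResult = driver.find_element_by_css_selector('div[data-index="1"]>div')
    asin = firstResult.get_attribute('data-asin')
    print(asin)
    # Build the product URL from the ASIN and open it
    url = 'https://amazon.com/dp/' + asin
    driver.get(url)
    print(url)

    return url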

search_amazon(query)
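
One case the code above does not cover: get_attribute returns None when the matched element has no data-asin attribute, and 'https://amazon.com/dp/' + None raises a TypeError. A small guard makes that failure easier to read. This is only a sketch; the helper name product_url and its base parameter are hypothetical.

```python
def product_url(asin, base="https://amazon.com/dp/"):
    """Join `base` and `asin`, rejecting a missing or empty ASIN."""
    # None (attribute absent) and "" (attribute present but empty)
    # are both treated as "no ASIN found".
    if not asin:
        raise ValueError("no data-asin value found on the matched element")
    return base + asin.strip()
```

Inside search_amazon, the line that builds url could then read url = product_url(asin).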
