
Problem description

I am trying to extract some data (e.g. dealer name, address, phone number, and email ID) from the page https://www.mahindrausa.com/map-hours-directions-tractors-utvs-farming-equipment--dealership--locate-a-dealer using Python with the Selenium library, but I am unable to extract the text with the `find_element_by_xpath` command.

Whenever I run the program below, it gives me blank text along with an error, and I am not sure what I am doing wrong here. Here is the error:

NoSuchElementException: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="locationsAR"]/div/ul/li[1]/a[2]"}
  (Session info: chrome=83.0.4103.116)

Can someone help?

from selenium import webdriver
import pandas as pd

data={}
abvr=['AL','AR','AZ','CA','CO','CT','DE','FL','GA','IA','ID','IL','IN','KS','KY','LA','MA','MD','ME','MI','MN','MO','MS','MT','NC','ND','NH','NJ','NM','NV','NY','OH','OK','OR','PA','SC','SD','TN','TX','UT','VA','VT','WA','WI','WV','WY']
df=pd.DataFrame(columns=['Name','Address 1','Address 2','Phone#','Email'])
path=r"C:\Program Files\chromedriver.exe"
driver=webdriver.Chrome(path)
driver.get("https://www.mahindrausa.com/map-hours-directions-tractors-utvs-farming-equipment--dealership--locate-a-dealer")
a=driver.find_elements_by_class_name("locations-list")
for s in abvr:
    name="locations"+s
    for n in a:
        for k in n.find_elements_by_class_name("state-location"):
            count=1
            names= '//*[@id=\"'+name+'\"]/div/ul/li['+str(count)+']/h4'
            address1='//*[@id=\"'+name+'\"]/div/ul/li['+str(count)+']/span[1]/span[1]'
            address2='//*[@id=\"'+name+'\"]/div/ul/li['+str(count)+']/span[1]/span[2]'
            phone='//*[@id=\"'+name+'\"]/div/ul/li['+str(count)+']/span[2]/a'
            email='//*[@id=\"'+name+'\"]/div/ul/li['+str(count)+']/a[2]'
            data['Name']=k.find_element_by_xpath(names).text
            data['Address 1']=k.find_element_by_xpath(address1).text
            data['Address 2']=k.find_element_by_xpath(address2).text
            data['Phone#']=k.find_element_by_xpath(phone).text
            data['Email']=k.find_element_by_xpath(email).text
            df=df.append(data,ignore_index=True)
            count=+1
driver.quit()
print(df)

Tags: python, selenium, web-scraping

Solution


Well, if you want to collect the dealer list, there may be a simpler solution. The complete list (498 dealers) appears to be stored in a JavaScript variable named `jCollection`, which you can read with the following code:

from selenium import webdriver
driver=webdriver.Chrome()
driver.get("https://www.mahindrausa.com/map-hours-directions-tractors-utvs-farming-equipment--dealership--locate-a-dealer")
dealers = driver.execute_script("return jCollection;")
print(dealers)

Output

{'features': [{'geometry': {'coordinates': [36.16876, -84.07945],
    'type': 'Point'},
   'properties': {'address': '2401 N Charles G Seivers Blvd',
    'city': 'Clinton',
    'dealerCode': 'TOM08',
    'dealerName': " Tommy's Motorsports",
    ...
    }}]}
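Once you have that dictionary, flattening it into a DataFrame with the columns you want is straightforward. A minimal sketch using `pandas.json_normalize`, assuming the nested structure shown above (a small inline sample stands in here for the live `execute_script` result):

```python
import pandas as pd

# Hypothetical inline sample mimicking the structure printed above;
# in the real script this would come from:
#   jCollection = driver.execute_script("return jCollection;")
jCollection = {
    "features": [
        {
            "geometry": {"coordinates": [36.16876, -84.07945], "type": "Point"},
            "properties": {
                "address": "2401 N Charles G Seivers Blvd",
                "city": "Clinton",
                "dealerCode": "TOM08",
                "dealerName": " Tommy's Motorsports",
            },
        }
    ]
}

# Flatten each feature's nested dicts into flat columns such as
# 'properties_dealerName' and 'geometry_coordinates'
df = pd.json_normalize(jCollection["features"], sep="_")
print(df[["properties_dealerName", "properties_city", "properties_address"]])
```

With the full 498-entry `jCollection`, the same call yields one row per dealer, so there is no need to walk the DOM state by state.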
