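python-3.x - Unable to locate an element on a website using Selenium
Problem description
I am trying to scrape data from a business directory, but I keep getting an error because the data is not found.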
name = driver.find_elements_by_xpath('/html/body/div[3]/div/div/div[1]/div/div[1]/div/div[1]/h4')[0].text
# Results in: IndexError: list index out of range
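So I tried using WebDriverWait to make the code wait for the data to load, but it still does not find the element, even though the data has loaded on the website.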
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
from bs4 import BeautifulSoup
import requests
import time
url='https://www.dmcc.ae/business-search?directory=1&submissionGuid=2c8df029-a92e-4b5d-a014-7ef9948e664b'
driver = webdriver.Firefox()
driver.get(url)
wait=WebDriverWait(driver,50)
wait.until(EC.visibility_of_element_located((By.CLASS_NAME,'searched-list ng-scope')))
name = driver.find_elements_by_xpath('/html/body/div[3]/div/div/div[1]/div/div[1]/div/div[1]/h4')[0].text
print(name)
Solution
<iframe src="https://dmcc.secure.force.com/Business_directory_Page?initialWidth=987&childId=pym-0&parentTitle=List%20of%20Companies%20Registered%20in%20Dubai%2C%20DMCC%20Free%20Zone&parentUrl=https%3A%2F%2Fwww.dmcc.ae%2Fbusiness-search%3Fdirectory%3D1%26submissionGuid%3D2c8df029-a92e-4b5d-a014-7ef9948e664b" width="100%" scrolling="no" marginheight="0" frameborder="0" height="3657px"></iframe>
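The listing is rendered inside the iframe shown above, so handle the accept button and then switch to the iframe before looking for the element.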
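# Reuses driver, wait, By and EC from the question's code.
driver.get('https://www.dmcc.ae/business-search?directory=1&submissionGuid=2c8df029-a92e-4b5d-a014-7ef9948e664b')
# Click the accept button first.
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#hs-eu-confirmation-button"))).click()
# Wait for the iframe and switch the driver into it.
wait.until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,'#pym-0 > iframe')))
# Two classes on one element need a CSS selector, not By.CLASS_NAME.
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR,'.searched-list.ng-scope')))
# find_elements(By.XPATH, ...) also works on Selenium 4, where find_elements_by_xpath was removed.
name = driver.find_elements(By.XPATH, '//*[@id="directory_list"]/div/div/div/div[1]/h4')[0]
print(name.text)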
Output
1 BOXOFFICE DMCC
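Besides the iframe switch, the working code also replaces the locator (By.CLASS_NAME, 'searched-list ng-scope') with the CSS selector '.searched-list.ng-scope'. By.CLASS_NAME expects a single class name, so a space-separated value does not match an element that carries both classes. The sketch below shows one way to build such a selector from a class attribute. The function name class_attr_to_css is hypothetical and not part of Selenium, and it assumes plain class names that need no CSS escaping.

```python
def class_attr_to_css(class_attr: str) -> str:
    """Turn a space-separated class attribute into a CSS selector.

    Hypothetical helper for illustration; it is not part of Selenium
    and assumes class names that need no CSS escaping.
    """
    names = class_attr.split()
    if not names:
        raise ValueError("class attribute is empty")
    # Chain the classes with no spaces so that all of them must be
    # present on the same element.
    return "".join("." + name for name in names)


selector = class_attr_to_css("searched-list ng-scope")
print(selector)  # .searched-list.ng-scope
```

The resulting string can be passed as (By.CSS_SELECTOR, selector) to the same expected condition used in the answer.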