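python - How do I parse data by looping over a search field and appending the output to a dataset in Python?
Problem description
My question is related to an earlier one: How can I parse multiple attributes of a website that share the same class name in Python?
I want to run the parsing inside a loop over cap, append the parsed text to a vector or dataset at the end of each iteration, and then continue with the next value.
My loop currently looks like this: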
driver = webdriver.Chrome('pathtoChrome/chromedriver.exe')
caps = ['11100']
for cap in caps:
    driver.get("https://www.conad.it/")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@href='javascript:void(0)']"))).click() # accept the cookies
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='location-input']"))).send_keys(caps)
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//input[@class = 'btn btn-default btn-lg btn-block']"))).find_element_by_xpath("//input[@class = 'btn btn-default btn-lg btn-block']").click()
    #WebDriverWait(driver, 20).until(EC.element_to_be_clickable(driver.find_element_by_xpath("//input[@class = 'btn btn-default btn-lg btn-block']").click()
    print([item.text for item in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class,'col-md-8')]//p")))])
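The commented-out line is just another attempt. In both attempts, the CAP does not get answered when I include the .click() line inside the loop.
However, it works if I don't loop, i.e.: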
driver = webdriver.Chrome('pathtoChrome/chromedriver.exe')
driver.get("https://www.conad.it/")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@href='javascript:void(0)']"))).click() # accept the cookies
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='location-input']"))).send_keys('11100')
driver.find_element_by_xpath("//input[@class = 'btn btn-default btn-lg btn-block']").click()
print([item.text for item in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class,'col-md-8')]//p")))])
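I would like to write the results into a dataset or vector and then append the output of each following iteration to it, something like the code below. It should append the text found by typing 11100 or 11020, but it should not print anything when typing 11000, because there are no entries for that cap: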
driver = webdriver.Chrome('pathtoChrome/chromedriver.exe')
caps = ['11000', '11100', '11020']
data = []
for cap in caps:
    driver.get("https://www.conad.it/")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@href='javascript:void(0)']"))).click() # accept the cookies
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='location-input']"))).send_keys(caps)
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//input[@class = 'btn btn-default btn-lg btn-block']"))).find_element_by_xpath("//input[@class = 'btn btn-default btn-lg btn-block']").click()
    #WebDriverWait(driver, 20).until(EC.element_to_be_clickable(driver.find_element_by_xpath("//input[@class = 'btn btn-default btn-lg btn-block']").click()
    print(data.append([item.text for item in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class,'col-md-8')]//p")))]))
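Any help is appreciated!
Solution
Use a try..except block to handle the case where no items are found, then continue the loop. Note that the code below also passes the loop variable cap (not the whole caps list) to send_keys, clears the input before typing, and accepts the cookie banner once, before the loop.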
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome('pathtoChrome/chromedriver.exe') # same driver setup as in the question
caps = ['11000','11100', '11020','13022']
data = []
driver.get("https://www.conad.it/")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@href='javascript:void(0)']"))).click() # accept the cookies
for cap in caps:
    driver.get("https://www.conad.it/")
    # clear any previous value, then type the current CAP
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='location-input']"))).clear()
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='location-input']"))).send_keys(cap)
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"input.btn.btn-default.btn-lg.btn-block"))).click()
    try:
        # the wait raises if no result paragraphs become visible within 10 seconds
        data.append([item.text for item in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class,'col-md-8')]//p")))])
    except:
        print("no data found")
        continue
print(data)
Output:
[['Frazione Condemine 84, 11010 Sarre', 'Grand Chemin C/c Centreville 3, 11020 Saint-christophe', "Localita' Arensod 27, 11010 Sarre"], ['Grand Chemin C/c Centreville 3, 11020 Saint-christophe', "Localita' Perolle 21, 11024 Chatillon", 'Frazione Condemine 84, 11010 Sarre', "Localita' Arensod 27, 11010 Sarre"], ['Via Durio 26, 13019 Varallo', 'Via Brigate Garibaldi 24/a, 13019 Varallo']]
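Because CAPs with no results are skipped, the output list above has fewer entries than caps, so positions in data no longer line up with the CAPs that produced them. One possible way around this is to key the results by CAP. The following is only a minimal sketch of that loop pattern, not the answer's code: collect_by_cap, NoResults, fake_fetch and the _FAKE_SITE data are hypothetical names and values made up for illustration, and the stub stands in for the browser steps so the example runs without Selenium.

```python
from typing import Callable, Dict, List


class NoResults(Exception):
    """Hypothetical stand-in for the timeout raised when nothing is found."""


def collect_by_cap(
    caps: List[str], fetch: Callable[[str], List[str]]
) -> Dict[str, List[str]]:
    """Call fetch(cap) for every CAP and keep the results keyed by CAP."""
    results: Dict[str, List[str]] = {}
    for cap in caps:
        try:
            results[cap] = fetch(cap)
        except NoResults:
            # No entries for this CAP: store nothing and go to the next one.
            continue
    return results


# Stub data for illustration only; real code would drive the browser instead.
_FAKE_SITE = {
    "11100": ["Frazione Condemine 84, 11010 Sarre"],
    "11020": ["Localita' Perolle 21, 11024 Chatillon"],
}


def fake_fetch(cap: str) -> List[str]:
    """Return the stubbed addresses for a CAP, or raise if there are none."""
    if cap not in _FAKE_SITE:
        raise NoResults(cap)
    return _FAKE_SITE[cap]


data = collect_by_cap(["11000", "11100", "11020"], fake_fetch)
print(data)
```

In the Selenium version, fetch would wrap the send_keys, click and wait steps, and the except clause would catch selenium.common.exceptions.TimeoutException, which WebDriverWait.until raises when the condition is not met in time. Catching that specific exception instead of a bare except keeps unrelated errors visible.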