How do I parse data by looping over a search field and appending the output to a dataset in Python?

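Problem description

My question follows on from an earlier one: How to parse multiple attributes of a website that share the same class name in Python?

I want to wrap the parsing in a loop over cap, append the parsed text to a vector or dataset at the end of each iteration, and then continue from the top with the next cap.

My loop currently looks like this: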

driver = webdriver.Chrome('pathtoChrome/chromedriver.exe')
caps = ['11100']
for cap in caps:
   driver.get("https://www.conad.it/")
   WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@href='javascript:void(0)']"))).click() # accept the cookies
   WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='location-input']"))).send_keys(caps)
   WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//input[@class = 'btn btn-default btn-lg btn-block']"))).find_element_by_xpath("//input[@class = 'btn btn-default btn-lg btn-block']").click()
   #WebDriverWait(driver, 20).until(EC.element_to_be_clickable(driver.find_element_by_xpath("//input[@class = 'btn btn-default btn-lg btn-block']").click()
   print([item.text for item in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class,'col-md-8')]//p")))])

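The commented-out line is just another attempt. In both attempts, once the .click() line is inside the loop, the search for the CAP gets no response.

However, it works if I do not loop, i.e.: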

driver = webdriver.Chrome('pathtoChrome/chromedriver.exe')
driver.get("https://www.conad.it/")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@href='javascript:void(0)']"))).click() # accept the cookies
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='location-input']"))).send_keys('11100')
driver.find_element_by_xpath("//input[@class = 'btn btn-default btn-lg btn-block']").click()
print([item.text for item in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class,'col-md-8')]//p")))])

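I would like to write the results to a dataset or vector and append each subsequent iteration to it, something like the code below. It should append the text found when typing 11100 or 11020, but print nothing when typing 11000, because there are no entries for that cap.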

driver = webdriver.Chrome('pathtoChrome/chromedriver.exe')
caps = ['11000', '11100', '11020']
data = []

for cap in caps:
   driver.get("https://www.conad.it/")
   WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@href='javascript:void(0)']"))).click() # accept the cookies
   WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='location-input']"))).send_keys(caps)
   WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//input[@class = 'btn btn-default btn-lg btn-block']"))).find_element_by_xpath("//input[@class = 'btn btn-default btn-lg btn-block']").click()
   #WebDriverWait(driver, 20).until(EC.element_to_be_clickable(driver.find_element_by_xpath("//input[@class = 'btn btn-default btn-lg btn-block']").click() 
   print(data.append([item.text for item in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class,'col-md-8')]//p")))]))

Any help is appreciated!

Tags: python, selenium, loops

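Solution

Wrap the result lookup in a try..except block so that, when no items are found for a cap, the loop simply continues with the next one.

The code below also differs from the question's loop in three ways: it passes the loop variable cap (not the whole caps list) to send_keys, it clears the search field before typing, and it accepts the cookie banner once before the loop instead of on every iteration. In addition, list.append returns None, so print(data.append(...)) prints None; append inside the loop and print data afterwards.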

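from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome('pathtoChrome/chromedriver.exe')  # same driver setup as in the question
caps = ['11000', '11100', '11020', '13022']
data = []

driver.get("https://www.conad.it/")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@href='javascript:void(0)']"))).click()  # accept the cookies once

for cap in caps:
    driver.get("https://www.conad.it/")
    # Clear the search field, then type the current cap (not the whole list)
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='location-input']"))).clear()
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='location-input']"))).send_keys(cap)
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.btn.btn-default.btn-lg.btn-block"))).click()
    try:
        data.append([item.text for item in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class,'col-md-8')]//p")))])
    except TimeoutException:
        # No result paragraphs became visible within 10 seconds for this cap
        print("no data found")
        continue

print(data)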

Output:

[['Frazione Condemine 84, 11010 Sarre', 'Grand Chemin C/c Centreville 3, 11020 Saint-christophe', "Localita' Arensod 27, 11010 Sarre"], ['Grand Chemin C/c Centreville 3, 11020 Saint-christophe', "Localita' Perolle 21, 11024 Chatillon", 'Frazione Condemine 84, 11010 Sarre', "Localita' Arensod 27, 11010 Sarre"], ['Via Durio 26, 13019 Varallo', 'Via Brigate Garibaldi 24/a, 13019 Varallo']]
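One thing the nested list above does not record is which cap produced which sub-list: four caps were searched, but only three lists came back, because one search found nothing. A minimal sketch of one way to keep that link is shown here. It assumes a hypothetical `lookup` callable standing in for the Selenium steps inside the loop, and it uses the built-in LookupError where the real loop would hit a timeout. The names `collect_rows`, `rows_to_csv`, `SAMPLE` and `fake_lookup` are made up for illustration, and the cap-to-address mapping in `SAMPLE` is an assumption, not something the output confirms.

```python
import csv
import io


def collect_rows(caps, lookup):
    """Return one (cap, address) row per address found.

    `lookup` is a hypothetical stand-in for the Selenium steps: it takes a
    cap and returns a list of address strings, or raises LookupError when
    the search finds nothing.
    """
    rows = []
    for cap in caps:
        try:
            addresses = lookup(cap)
        except LookupError:
            print(f"no data found for {cap}")
            continue  # skip this cap and move on to the next one
        # extend() adds the rows in place; like append(), it returns None,
        # so it should not be wrapped in print()
        rows.extend((cap, address) for address in addresses)
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["cap", "address"])
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative data only: which cap returns which address is assumed here.
SAMPLE = {
    "13022": ["Via Durio 26, 13019 Varallo",
              "Via Brigate Garibaldi 24/a, 13019 Varallo"],
}


def fake_lookup(cap):
    # Dict indexing raises KeyError (a LookupError subclass) for unknown caps
    return SAMPLE[cap]


if __name__ == "__main__":
    print(rows_to_csv(collect_rows(["11000", "13022"], fake_lookup)))
```

In the real script, `lookup` would be a function wrapping the clear, send_keys, click and wait calls, re-raising or translating the timeout as needed. Flat (cap, address) rows like these can also be handed to a dataframe library if a tabular dataset is preferred over CSV text.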
