What condition tells Selenium it has scrolled past the last web element?

Problem description

I currently have a script that goes to TripAdvisor and tries to scrape every image under a particular filter. I'd like to know what condition my if statement should check so that it breaks out of the while loop, after which I parse the url list to get a clean url link for each image. I'm just confused about how to tell that I've reached the last web element on the page. The if statement is at the end, right before the final print loop. Any help is greatly appreciated!

# import dependencies
import time
import io
import re
import urllib.parse
import urllib.request

import pandas as pd
import requests
from bs4 import BeautifulSoup
from datetime import datetime
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# configure Chrome before creating the driver, so the options actually apply
options = webdriver.ChromeOptions()
options.headless = False
prefs = {"profile.default_content_setting_values.notifications": 2}
options.add_experimental_option("prefs", prefs)

driver = webdriver.Chrome("/Users/rishi/Downloads/chromedriver 3", options=options)
driver.maximize_window()

#open up website
driver.get(
    "https://www.tripadvisor.com/Hotel_Review-g28970-d84078-Reviews-Hyatt_Regency_Washington_on_Capitol_Hill-Washington_DC_District_of_Columbia.html#/media/84078/?albumid=101&type=2&category=101")

image_url = []

end = False
while not end:
    #wait until element is found and then store all webelements into list
    images = WebDriverWait(driver, 20).until(
        EC.presence_of_all_elements_located(
            (By.XPATH, '//*[@class="media-viewer-dt-root-GalleryImageWithOverlay__galleryImage--1Drp0"]')))

    #iterate through visible images and acquire their url based on background-image style
    for image in images:
        image_url.append(image.value_of_css_property("background-image"))

    #if you are at the end of the page then leave loop
    # if(length == end_length):
    #     end = True

    #move to next visible images in the array
    driver.execute_script("arguments[0].scrollIntoView();", images[-1])

    #wait one second
    time.sleep(1)

    # TODO: what condition detects that we've reached the end?
    # if <condition>:
    #     end = True

#clean the list to provide clear links
#(use start/stop here so we don't shadow the `end` flag from the loop above)
for i in range(len(image_url)):
    start = image_url[i].find('url("') + len('url("')
    stop = image_url[i].find('")')
    print(image_url[i][start:stop])

#print(image_url)

Tags: python, selenium, selenium-webdriver, web-scraping, selenium-chromedriver

Solution
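One common way to tell you've reached the end of an infinite-scroll gallery is to compare what you had before and after each scroll: if a scroll round adds no new items, you're at the bottom. Below is a minimal sketch of that termination logic with the page lookup factored out; `fetch_batch` is a hypothetical stand-in for the `WebDriverWait(...).until(...)` call plus `value_of_css_property("background-image")`, and the scroll/sleep step would go between rounds.

```python
def collect_until_no_new(fetch_batch, max_rounds=100):
    """Repeatedly fetch a batch of items, stopping once a round adds nothing new."""
    seen = []
    for _ in range(max_rounds):
        before = len(seen)
        for item in fetch_batch():
            if item not in seen:
                seen.append(item)
        if len(seen) == before:  # no new items after this round: we hit the end
            break
        # in the real script: scrollIntoView() on the last element, then sleep
    return seen

# Simulate three "scrolls"; the last view only repeats items already collected.
batches = iter([["a.jpg", "b.jpg"], ["b.jpg", "c.jpg"], ["c.jpg"]])
def fake_fetch():
    return next(batches, ["c.jpg"])  # once the data runs out, keep returning the last view

result = collect_until_no_new(fake_fetch)
print(result)  # ['a.jpg', 'b.jpg', 'c.jpg']
```

In the asker's loop this means: record `len(image_url)` (deduplicated) before scrolling, scroll, wait, collect again, and set `end = True` when the count stops growing.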


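For cleaning the collected values, a regular expression is a bit more robust than manual `find`/slice: it copes with single quotes, no quotes, and values that contain no `url(...)` at all. A sketch (the sample string is an illustrative made-up value, not one scraped from the page):

```python
import re

def extract_css_url(value):
    """Pull the URL out of a CSS background-image value like url("...")."""
    match = re.search(r'url\((["\']?)(.*?)\1\)', value)
    return match.group(2) if match else None

sample = 'url("https://media-cdn.tripadvisor.com/media/photo-s/example.jpg")'
print(extract_css_url(sample))  # https://media-cdn.tripadvisor.com/media/photo-s/example.jpg
```

The asker's final loop then becomes `for value in image_url: print(extract_css_url(value))`, skipping `None` results if desired.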