XPath for nested spans

Problem description

I have the following HTML:

<div id="aod-price-1" class="a-section a-spacing-none a-padding-none">
    <span class="a-price" data-a-size="l" data-a-color="base">
        <span class="a-offscreen">$79.58</span>
        <span aria-hidden="true">
            <span class="a-price-symbol">$</span>
            <span class="a-price-whole">
                "79"
                <span class="a-price-decimal">.</span>
            </span>
            <span class="a-price-fraction">58</span>
        </span>
    </span>
</div>

I am trying to extract $79.58.

I used:

priceFound = WebDriverWait(browser,10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-price']")))

This seems to work, but not quite the way I expected.

It returns:

$79
58

Two lines, with no decimal point.

I am trying to extract the full text string: $79.58

I even tried:

priceFound = WebDriverWait(browser,10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-offscreen']")))

priceFound = WebDriverWait(browser,10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-price-whole']")))

Neither of those worked.


Update based on the suggestions so far:

Note that priceFound is a list; the actual HTML contains several blocks like the one above (many prices).

<div id="aod-price-1" ... </div>
<div id="aod-price-2" ... </div>
<div id="aod-price-3" ... </div>
<div id="aod-price-4" ... </div>

For clarity I posted only one block, which is why I am selecting a list.

priceFound = WebDriverWait(browser, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-price']/span[@class='a-offscreen']")))

for price in priceFound:
    print(price.text)

This returns several blank lines (empty carriage returns, to be specific).

I wonder whether I need a .text reference somewhere in the XPath?
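
On the .text question: Selenium XPath locators have to resolve to elements, so a trailing /text() step is not an option; the text is read from the element after it is located. As a rough offline check that the path itself is sound, the sketch below runs the same path over the posted fragment with Python's standard-library xml.etree.ElementTree. This is only a stand-in for the browser, and it works only because this fragment happens to be well-formed XML. FRAGMENT is a shortened copy of the HTML above, and offscreen_prices is a name made up for this example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the fragment from the question (assumed representative).
FRAGMENT = """
<div id="aod-price-1" class="a-section a-spacing-none a-padding-none">
    <span class="a-price" data-a-size="l" data-a-color="base">
        <span class="a-offscreen">$79.58</span>
        <span aria-hidden="true">
            <span class="a-price-symbol">$</span>
            <span class="a-price-whole">79<span class="a-price-decimal">.</span></span>
            <span class="a-price-fraction">58</span>
        </span>
    </span>
</div>
""".strip()

root = ET.fromstring(FRAGMENT)

# Same path as in the question, written relative to the root (".//")
# because ElementTree only supports relative paths on an element.
matches = root.findall(".//span[@class='a-price']/span[@class='a-offscreen']")

# The path selects the element; its text is read from the element afterwards.
offscreen_prices = [element.text for element in matches]
print(offscreen_prices)
```

If the path matches here while .text comes back blank in Selenium, the locator is probably not the problem, and how the text is read from the element is the more likely culprit.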

Update 2:

I use the following to click the "See All Buying Choices" button. It does work. I then add a short wait for the prices to populate.

Expand_button_Element = browser.find_element_by_id("buybox-see-all-buying-choices")
Expand_button_Element.click()

Update 3:

wait = WebDriverWait(browser, 10)
# wait for panel to be visible
wait.until(EC.visibility_of_element_located((By.ID, "aod-container")))
# this wait is probably no longer needed but left in to be safe
priceFound = wait.until(EC.visibility_of_all_elements_located((By.XPATH, "//span[@class='a-price']")))
for price in priceFound:
    print(price.text)


This returns (for example): $77 23 $77 24 $79 59 $79 94 $78 95 $83 94 $79 99 $79 95 $79 99 $89 00
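
As a stopgap, the split output above could be stitched back together in Python. This is a minimal sketch that assumes .text on the a-price span always yields exactly two whitespace-separated parts (the symbol plus whole part, then the fraction), as in the output shown; join_price_parts is a hypothetical helper, not part of Selenium.

```python
def join_price_parts(visible_text):
    """Rebuild a price such as "$79.58" from text like "$79\\n58".

    Assumes exactly two whitespace-separated parts; anything else is
    rejected rather than guessed at.
    """
    parts = visible_text.split()
    if len(parts) != 2:
        raise ValueError(f"unexpected price text: {visible_text!r}")
    whole, fraction = parts
    # Drop a trailing "." in case the decimal separator was rendered.
    return f"{whole.rstrip('.')}.{fraction}"


print(join_price_parts("$79\n58"))
```

With the loop from the question this would be called as join_price_parts(price.text). It depends on how the page lays out the visible spans, so it is more fragile than reading the single span that already holds the full string.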

But when I try the suggested code below:

browser.find_element_by_id("buybox-see-all-buying-choices").click()
wait = WebDriverWait(browser, 10)
# wait for panel to be visible
wait.until(EC.visibility_of_element_located((By.ID, "aod-container")))
# this wait is probably no longer needed but left in to be safe
priceFound = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.a-price > span.a-offscreen")))
for price in priceFound:
    print(price.text)

I get the following error:

    priceFound = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.a-price > span.a-offscreen")))
  File "/home/codingArea/.local/lib/python3.8/site-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: 

This looks like a similar question:

How to get text from nested span tags in Selenium

I thought this would definitely work:

priceFound = wait.until(EC.visibility_of_all_elements_located((By.XPATH,"//span[@class='a-price']/span[@class='a-offscreen']")))

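But it times out, which makes no sense to me. I made sure I am running the latest Selenium.

Update 4: I used the following. It did not raise an error, but it produced blank lines (about 10 carriage returns).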

priceFound = browser.find_elements_by_css_selector('span.a-offscreen')

for price in priceFound:
    print(price.text)

Tags: python, selenium-webdriver, xpath

Solution


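You have tried several approaches but have not posted the result of each one. So, as a baseline, let's start with something simple and see whether it solves the problem. If it does not, we can work from there.

Let's use a simple CSS selector and add a wait for visibility. (Note: you are waiting for presence, which only means the element is in the DOM, not that it is visible.)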

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# this code starts after clicking link to open product panel with pricing
# (`browser` is the WebDriver instance already created in the question)
browser.find_element_by_id("buybox-see-all-buying-choices").click()
wait = WebDriverWait(browser, 10)
# wait for panel to be visible
wait.until(EC.visibility_of_element_located((By.ID, "aod-container")))
# this wait is probably no longer needed but left in to be safe
priceFound = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.a-price > span.a-offscreen")))
for price in priceFound:
    print(price.text)
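
A possible explanation for the blank lines and the timeout reported in the question: Selenium's .text returns only text that is rendered visibly, and the a-offscreen span looks like a screen-reader copy that the page's CSS hides. That is an assumption based on the class name and the symptoms, since the stylesheet is not shown. If it holds, a visibility wait on that span would time out and .text would be empty even though the string is in the DOM. The sketch below waits for presence instead and reads the textContent attribute, which does not depend on visibility. read_hidden_prices and fetch_prices are hypothetical helper names, and the selector is the one already used above.

```python
def read_hidden_prices(elements):
    """Return the stripped textContent of each element, skipping blanks.

    Works with any object exposing get_attribute(name), such as a
    Selenium WebElement.
    """
    prices = []
    for element in elements:
        raw = element.get_attribute("textContent") or ""
        if raw.strip():
            prices.append(raw.strip())
    return prices


def fetch_prices(browser, timeout=10):
    """Locate the (assumed hidden) price spans and read their text.

    Selenium is imported here so that read_hidden_prices can be used
    and tested without a browser installed.
    """
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(browser, timeout)
    # Presence rather than visibility: the spans may never become visible.
    elements = wait.until(
        EC.presence_of_all_elements_located(
            (By.CSS_SELECTOR, "span.a-price > span.a-offscreen")
        )
    )
    return read_hidden_prices(elements)
```

After the panel has opened, this would be used as: for price in fetch_prices(browser): print(price). If the output is still blank, the assumption about hidden spans is probably wrong for this page and the rendered markup is worth inspecting again.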
