python - XPath for nested spans
Question
I have the following HTML:
<div id="aod-price-1" class="a-section a-spacing-none a-padding-none">
<span class="a-price" data-a-size="l" data-a-color="base">
<span class="a-offscreen">$79.58</span>
<span aria-hidden="true">
<span class="a-price-symbol">$</span>
<span class="a-price-whole">
"79"
<span class="a-price-decimal">.</span>
</span>
<span class="a-price-fraction">58</span>
</span>
</span>
</div>
I am trying to extract $79.58.
I used:
priceFound = WebDriverWait(browser,10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-price']")))
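This seems to work, but not quite as I expected.
It returns:
$79
58
That is two lines, with no decimal point.
I am trying to extract the full text string: $79.58
I even tried: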
priceFound = WebDriverWait(browser,10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-offscreen']")))
and
priceFound = WebDriverWait(browser,10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-price-whole']")))
Neither of those two worked.
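As a sanity check that is independent of Selenium, the locator can be run against the posted snippet with Python's standard-library `xml.etree.ElementTree`. This is only a sketch: it assumes the posted block is representative of the live page, and ElementTree supports just a subset of XPath. It suggests that the `a-offscreen` span is matched and does hold the full string, so a failure is more likely related to how Selenium reads the element than to the XPath itself.

```python
import xml.etree.ElementTree as ET

# The snippet from the question. The quotes around 79 are assumed to be
# how the browser dev tools display a text node, so they are dropped here.
SNIPPET = """
<div id="aod-price-1" class="a-section a-spacing-none a-padding-none">
  <span class="a-price" data-a-size="l" data-a-color="base">
    <span class="a-offscreen">$79.58</span>
    <span aria-hidden="true">
      <span class="a-price-symbol">$</span>
      <span class="a-price-whole">79<span class="a-price-decimal">.</span></span>
      <span class="a-price-fraction">58</span>
    </span>
  </span>
</div>
"""

root = ET.fromstring(SNIPPET.strip())

# Same path as the XPath in the question, made relative to the root element.
offscreen = root.findall(".//span[@class='a-price']/span[@class='a-offscreen']")
prices = [span.text for span in offscreen]
print(prices)
```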
Update based on the suggestions so far:
Note that priceFound is a list: the actual HTML contains several blocks like the one above (many prices).
<div id="aod-price-1" ... </div>
<div id="aod-price-2" ... </div>
<div id="aod-price-3" ... </div>
<div id="aod-price-4" ... </div>
For clarity I posted only one block, which is why I am selecting a list.
priceFound = WebDriverWait(browser, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//span[@class='a-price']/span[@class='a-offscreen']")))
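for price in priceFound:
    print(price.text)
This returns several blank lines (empty carriage returns, to be specific).
I wonder whether I need a .text reference somewhere in the XPath?
Update 2:
I use the following to click the "See All Buying Choices" button. It does work. I then add a short wait for the prices to populate.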
Expand_button_Element = browser.find_element_by_id("buybox-see-all-buying-choices")
Expand_button_Element.click()
Update 3:
wait = WebDriverWait(browser, 10)
# wait for panel to be visible
wait.until(EC.visibility_of_element_located((By.ID, "aod-container")))
# this wait is probably no longer needed but left in to be safe
priceFound = wait.until(EC.visibility_of_all_elements_located((By.XPATH,"//span[@class='a-price']")))
for price in priceFound:
    print(price.text)
This produces (for example): $77 23 $77 24 $79 59 $79 94 $78 95 $83 94 $79 99 $79 95 $79 99 $89 00
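The two-part output above (whole dollars, then cents) can be rejoined after the fact. A minimal sketch, assuming each element's visible text always splits into exactly two whitespace-separated parts; `join_price` is a hypothetical helper, not part of Selenium:

```python
def join_price(visible_text):
    # Split the visible text into its whole and fraction parts.
    parts = visible_text.split()
    if len(parts) != 2:
        raise ValueError(f"unexpected price text: {visible_text!r}")
    whole, fraction = parts
    # Put back the decimal point that the visible text drops.
    return f"{whole}.{fraction}"


print(join_price("$79\n58"))
```

This is a workaround rather than a fix: it rebuilds the string from what is rendered instead of reading the span that already contains the full price.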
But when I try the code suggested below:
browser.find_element_by_id("buybox-see-all-buying-choices").click()
wait = WebDriverWait(browser, 10)
# wait for panel to be visible
wait.until(EC.visibility_of_element_located((By.ID, "aod-container")))
# this wait is probably no longer needed but left in to be safe
priceFound = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.a-price > span.a-offscreen")))
for price in priceFound:
    print(price.text)
I get the following error:
    priceFound = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.a-price > span.a-offscreen")))
  File "/home/codingArea/.local/lib/python3.8/site-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
This seems to be a similar question:
I thought this would surely work:
priceFound = wait.until(EC.visibility_of_all_elements_located((By.XPATH,"//span[@class='a-price']/span[@class='a-offscreen']")))
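But it times out, which makes no sense to me. I made sure I am running the latest Selenium.
Update 4: I used the following. It raised no error, but it produced blank lines (about 10 carriage returns).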
priceFound = browser.find_elements_by_css_selector('span.a-offscreen')
for price in priceFound:
    print(price.text)
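Answer
You tried several approaches but did not post the result of each one. So, as a baseline, let's start with something simple and see whether it solves the problem. If it does not, we can work from there.
Let's use a simple CSS selector and add a wait for visibility. (Note: you are using presence, which only means that the element is in the DOM, not that it is visible.)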
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# this code starts after clicking link to open product panel with pricing
browser.find_element_by_id("buybox-see-all-buying-choices").click()
wait = WebDriverWait(browser, 10)
# wait for panel to be visible
wait.until(EC.visibility_of_element_located((By.ID, "aod-container")))
# this wait is probably no longer needed but left in to be safe
priceFound = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.a-price > span.a-offscreen")))
for price in priceFound:
    print(price.text)