python - How do I print the link of an opened PDF using Selenium in Python?
Problem description
I can't print the link of the final PDF that opens after running the code below.
from selenium import webdriver
from selenium.webdriver.support import ui
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException

def page_is_loaded(driver):
    return driver.find_element_by_tag_name("body") != None

def check_exists_by_text(text):
    try:
        driver.find_element_by_link_text(text)
    except NoSuchElementException:
        return False
    return True

driver = webdriver.Chrome("C:/Users/Roshan/Desktop/sbi/chromedriver")
driver.maximize_window()
driver.get("http://www.careratings.com/brief-rationale.aspx")
wait = ui.WebDriverWait(driver, 10)
wait.until(page_is_loaded)
location_field = driver.find_element_by_name("txtfromdate")
location_field.send_keys("2019-05-06")
last_date = driver.find_element_by_name("txttodate")
last_date.send_keys("2019-05-21")
driver.find_element_by_xpath("//input[@name='btn_submit']").click()
if check_exists_by_text('Reliance Capital Limited'):
    elm = driver.find_element_by_link_text('Reliance Capital Limited')
    driver.implicitly_wait(5)
    elm.click()
    driver.implicitly_wait(50)
    #time.sleep(5)
    #driver.quit()
else:
    print("Company is not rated in the given Date range")
I expect the actual output to be the link to this PDF:
"http://www.carratings.com/upload/CompanyFiles/PR/Reliance%20Capital%20Ltd.-05-18-2019.pdf"
But I don't know how to print this link.
Solution
You need to find all the link elements in the results table and then extract the href and text from each one.
from selenium import webdriver
import os

# setup path to chrome driver
chrome_driver = os.getcwd() + '/chromedriver'
# initialise chrome driver
browser = webdriver.Chrome(chrome_driver)
# load url
browser.get('http://www.careratings.com/brief-rationale.aspx')
# setup date range
location_field = browser.find_element_by_name("txtfromdate")
location_field.send_keys("2019-05-06")
last_date = browser.find_element_by_name("txttodate")
last_date.send_keys("2019-05-21")
browser.find_element_by_xpath("//input[@name='btn_submit']").click()
# get all data rows
content = browser.find_elements_by_xpath('//*[@id="divManagementSpeak"]/table/tbody/tr/td/a')
# get text and href link from each element
collected_data = []
for item in content:
    url = item.get_attribute("href")
    description = item.get_attribute("innerText")
    collected_data.append((url, description))
Output (the contents of collected_data):
('http://www.careratings.com/upload/CompanyFiles/PR/Ashwini%20Frozen%20Foods-05-21-2019.pdf', 'Ashwini Frozen Foods')
('http://www.careratings.com/upload/CompanyFiles/PR/Vanita%20Cold%20Storage-05-21-2019.pdf', 'Vanita Cold Storage')
and so on.
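
To get back to the original question (printing the link for one particular company), the collected (url, description) pairs can be filtered by name. The following is only a minimal sketch: the helper name find_links and the sample list are assumptions made for illustration, and the two sample rows are copied from the output above. In a real run you would pass collected_data instead of sample.

```python
from typing import Iterable, List, Tuple


def find_links(rows: Iterable[Tuple[str, str]], company: str) -> List[str]:
    """Return the URLs whose description contains `company` (case-insensitive).

    `rows` is expected to hold (url, description) tuples, the same shape
    as the collected_data list built in the answer.
    """
    needle = company.strip().lower()
    return [url for url, description in rows
            if needle in description.strip().lower()]


# Hypothetical sample data: two rows copied from the output shown above.
sample = [
    ("http://www.careratings.com/upload/CompanyFiles/PR/"
     "Ashwini%20Frozen%20Foods-05-21-2019.pdf", "Ashwini Frozen Foods"),
    ("http://www.careratings.com/upload/CompanyFiles/PR/"
     "Vanita%20Cold%20Storage-05-21-2019.pdf", "Vanita Cold Storage"),
]

matches = find_links(sample, "Vanita Cold Storage")
if matches:
    for url in matches:
        print(url)
else:
    print("Company is not rated in the given date range")
```

Matching on a substring rather than on the exact text is a deliberate choice here, since the link text on the page may carry extra whitespace or a slightly different form of the company name. Because this reads the href values instead of clicking the link, it does not need to wait for the PDF to open in the browser.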