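python - Selenium - Downloading a file
Problem description
I am trying to run a script that visits the NASDAQ website and downloads the last 18 months of stock data for a list of companies. When I run the script below, it only gets as far as opening the Firefox page with the company information and the download button; the file itself is never downloaded.
Why?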
import os
import time

import pandas as pd
from selenium import webdriver


def pull_nasdaq_data(tickers, save_path, rm_path):
    # Configure Firefox to save files without showing the download dialog
    profile = webdriver.FirefoxProfile()
    profile.set_preference('browser.download.folderList', 2)  # 2 = use a custom location
    profile.set_preference('browser.download.manager.showWhenStarting', False)
    # Download into rm_path so the file is read from and removed from the same folder
    profile.set_preference('browser.download.dir', rm_path)
    profile.set_preference('browser.helperApps.neverAsk.saveToDisk', "text/plain, application/vnd.ms-excel, text/csv, application/csv, text/comma-separated-values, application/download, application/octet-stream, binary/octet-stream, application/binary, application/x-unknown")
    # Set up the WebDriver. The profile has to be passed in here; otherwise the
    # preferences above are never applied (Selenium 3 style arguments).
    driver = webdriver.Firefox(firefox_profile=profile, executable_path=r'C:\Users\Filippo Sebastio\Desktop\geckodriver.exe')
    popup = True  # Will there be a popup?
    for ticker in tickers:
        # Open the stock's historical data page
        site = 'http://www.nasdaq.com/symbol/' + ticker + '/historical'
        driver.get(site)
        # Choose the 18-month range from the drop-down
        data_range = driver.find_element_by_name('ddlTimeFrame')
        for option in data_range.find_elements_by_tag_name('option'):
            if option.text == '18 Months':
                option.click()
                break
        time.sleep(10)
        # Click the download link
        driver.find_element_by_id('lnkDownLoad').click()
        time.sleep(25)  # Wait for the file to download
        # Open the file from the download folder
        downloaded = os.path.join(rm_path, 'HistoricalQuotes.csv')
        data = pd.read_csv(downloaded)
        # Rename and save the file in the desired location
        file_loc = os.path.join(save_path, ticker + '.csv')
        data.to_csv(file_loc, index=False)
        # Delete the downloaded file (the original referenced an undefined removal_path)
        os.remove(downloaded)
        print("Downloaded: ", ticker)
        # Wait for the next page to load
        time.sleep(20)


tickers = ['tesla', 'mmm']
save_path = 'path/to/save/folder'  # placeholder: folder where I want the documents saved
rm_path = 'path/to/download/folder'  # placeholder: my download folder
pull_nasdaq_data(tickers, save_path, rm_path)
Solution
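Download preferences set on a FirefoxProfile only take effect if that profile is handed to the driver when it is created, so that is the first thing to check in a script like this one. Separately, a fixed time.sleep(25) either wastes time or is too short on a slow connection. A minimal sketch of polling the download folder instead is shown below. It assumes Firefox marks an unfinished download with a companion file ending in .part; the helper name wait_for_download and its parameters are hypothetical and not part of Selenium.

```python
import os
import time


def wait_for_download(folder, filename, timeout=60.0, poll=0.5):
    """Block until `filename` exists in `folder` with no `.part` companion.

    Returns the full path of the finished file, or raises TimeoutError.
    """
    target = os.path.join(folder, filename)
    partial = target + ".part"  # assumption: Firefox's in-progress suffix
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Finished only when the file is present and the partial file is gone
        if os.path.exists(target) and not os.path.exists(partial):
            return target
        time.sleep(poll)
    raise TimeoutError("download did not finish: " + target)
```

In the question's script this could stand in for the 25-second sleep, for example wait_for_download(rm_path, 'HistoricalQuotes.csv'), provided rm_path is the folder configured in browser.download.dir.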