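Problem description

I want to access this file automatically with Python 3. The URL is https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls

When you enter the URL manually in a browser, it prompts you to download the file, but I want to do this automatically in Python and load the data into a DataFrame (df).

Running the code below raises a URLError:
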

from urllib.request import urlretrieve
import pandas as pd

# Assign url of file: url
url = 'https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls'

# Save file locally
urlretrieve(url, 'my-sheet.xls')

# Read file into a DataFrame and print its head
df = pd.read_excel('my-sheet.xls')
print(df.head())

URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>

Tags: python, pandas

Solution


$ curl https://www.dax-indices.com/documents/dax-indices/Documents/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>307 Temporary Redirect</title>
</head><body>
<h1>Temporary Redirect</h1>
<p>The document has moved <a href="https://www.dax-indices.com/document/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls">here</a>.</p>
</body></html>

You are simply being redirected. There are ways to follow the redirect in code, but I would just change the URL to https://www.dax-indices.com/document/Resources/WeightingFiles/Ranking/2019/March/MDAX_RKC.20190329.xls
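
If you would rather handle the redirect in code than hard-code the new URL, one option is sketched below. It is only a minimal sketch: the helper name `download`, the 30-second timeout, and the `User-Agent` header are assumptions made for illustration, not something the site is known to require. It relies on the fact that `urllib.request.urlopen` follows 307 redirects for GET requests on its own, and it returns the final URL so you can see where the file actually came from.

```python
import urllib.request


def download(url, dest, timeout=30):
    """Save the resource at `url` to `dest` and return the final URL.

    `download` is a hypothetical helper, not part of any library.
    """
    # Assumption: a browser-like User-Agent; some servers reject the default one.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    # urlopen follows HTTP redirects (including 307) for GET requests.
    # The explicit timeout makes a stalled connection fail after `timeout` seconds.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = response.read()
        final_url = response.url  # URL after any redirects were followed
    with open(dest, "wb") as out:
        out.write(data)
    return final_url
```

Calling `download(url, 'my-sheet.xls')` with the URL from the question and then `pd.read_excel('my-sheet.xls')` would mirror the original script. Whether this clears the WinError 10060 timeout also depends on your network (for example a proxy or firewall), so treat it as a starting point.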

