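python - Iterating over multiple months to fetch different data
Problem description
I need to go through every day from 1 January 2015 to 5 February 2020.
The following script gives me one date per month (the month-end date) up to 5 February 2020: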
import pandas as pd
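# pd.datetime was removed in pandas 2.0; pd.Timestamp.now() is the replacement
date = pd.Timestamp.now().strftime("%Y%m%d")
# freq="M" means month end (pandas 2.2+ prefers the alias "ME")
dates = pd.date_range(start="20150101", end="20200205", freq="M").strftime("%Y%m%d")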
print(dates)
Result:
Index(['20150131', '20150228', '20150331', '20150430', '20150531', '20150630',
'20150731', '20150831', '20150930', '20151031', '20151130', '20151231',
'20160131', '20160229', '20160331', '20160430', '20160531', '20160630',
'20160731', '20160831', '20160930', '20161031', '20161130', '20161231',
'20170131', '20170228', '20170331', '20170430', '20170531', '20170630',
'20170731', '20170831', '20170930', '20171031', '20171130', '20171231',
'20180131', '20180228', '20180331', '20180430', '20180531', '20180630',
'20180731', '20180831', '20180930', '20181031', '20181130', '20181231',
'20190131', '20190228', '20190331', '20190430', '20190531', '20190630',
'20190731', '20190831', '20190930', '20191031', '20191130', '20191231',
'20200131'],
dtype='object')
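Note that the month-end list above stops at 20200131 and carries no start dates, so it cannot be dropped into the URL directly. A minimal sketch of one way to build (start, end) pairs per month instead, with the last pair clipped to the requested end date. The helper name month_ranges is hypothetical, not part of pandas:

```python
import pandas as pd


def month_ranges(start, end):
    """Return (first_day, last_day) strings for every month touched by [start, end]."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    ranges = []
    for period in pd.period_range(start, end, freq="M"):
        # Clip the first and last month to the requested bounds
        first = max(period.start_time, start)
        last = min(period.end_time.normalize(), end)
        ranges.append((first.strftime("%Y%m%d"), last.strftime("%Y%m%d")))
    return ranges


ranges = month_ranges("20150101", "20200205")
print(ranges[0])   # ('20150101', '20150131')
print(ranges[-1])  # ('20200201', '20200205')
```

Each pair can then be used as the startDate and endDate of one request.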
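The following script fetches the wind speed for every day in January 2015. In the main block I specify the API key, start date and end date used in the URL. I believe this is where the two scripts could be combined.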
import warnings

import pandas as pd
import requests
from urllib3.exceptions import InsecureRequestWarning

headers = {
    'scheme': 'https',
    'accept': 'application/json, text/plain, */*',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-GB,en;q=0.9,en-US;q=0.8,da;q=0.7',
    'origin': 'https://www.wunderground.com',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'cross-site',
    'user-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36'
}

# Here I get the relevant data (the dates and wind speed) and put the
# daily mean wind speed in a separate Series called dkk
def get_data(response):
    df = pd.DataFrame(response.json()["observations"])
    df["time"] = pd.to_datetime(df["valid_time_gmt"], unit='s')
    dkk = df.groupby(df["time"].dt.date)["wspd"].mean()
    return dkk

if __name__ == "__main__":
    date = pd.Timestamp.now().strftime("%d-%m-%Y")
    api_key = "xxxxxx"
    start_date = "20150101"
    end_date = "20150131"
    urls = [
        "https://api.weather.com/v1/location/EGNV:9:GB/observations/historical.json?apiKey=" + api_key + "&units=e&startDate=" + start_date + "&endDate=" + end_date
    ]
    # Here I collect the data for each URL and store it in df_transposed,
    # a date-indexed frame, which results in the output below.
    # verify=False disables TLS certificate checks, so the warning is silenced.
    warnings.simplefilter('ignore', InsecureRequestWarning)
    frames = []
    for url in urls:
        res = requests.get(url, headers=headers, verify=False)
        frames.append(get_data(res))
    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
    df_transposed = pd.concat(frames).to_frame("wspd")
    print(df_transposed)
Result:
wspd
2015-01-01 24.333333
2015-01-02 18.696970
...
2015-01-30 12.121212
2015-01-31 21.575758
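To see how the daily mean above is produced without calling the API, here is a small offline sketch. The payload is made up for illustration; only the keys observations, valid_time_gmt and wspd come from the script. It also shows that grouping by dt.date leaves datetime.date objects in the index, which matters for later date comparisons:

```python
import datetime

import pandas as pd

# Hypothetical payload with the same keys the script reads
payload = {
    "observations": [
        {"valid_time_gmt": 1420070400, "wspd": 20},  # 2015-01-01 00:00 UTC
        {"valid_time_gmt": 1420113600, "wspd": 30},  # 2015-01-01 12:00 UTC
        {"valid_time_gmt": 1420156800, "wspd": 10},  # 2015-01-02 00:00 UTC
    ]
}

df = pd.DataFrame(payload["observations"])
df["time"] = pd.to_datetime(df["valid_time_gmt"], unit="s")

# One mean per calendar day; the index holds plain datetime.date objects
daily = df.groupby(df["time"].dt.date)["wspd"].mean()
print(daily.tolist())  # [25.0, 10.0]

# Converting the index to a DatetimeIndex allows comparisons and slicing by date
daily.index = pd.to_datetime(daily.index)
```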
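The question is: I need to get the wind speed from 1 January 2015 to 5 February 2020. How can I best combine my scripts to get the desired output, which would be a two-column DataFrame with the date in the first column and the wind speed (wspd) in the second?
Desired output: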
wspd
2015-01-01 24.333333
2015-01-02 18.696970
2015-01-03 8.454545
2015-01-04 10.363636
2015-01-05 11.333333
...
2020-02-04 13.5
2020-02-05 7.1
The wspd values for the last two dates can be seen here:
https://www.wunderground.com/history/monthly/gb/darlington/EGNV/date/2020-2
Solution
Use Series.where:
s = df_transposed.index.to_series()
df_transposed = df_transposed.where((s >= '2015-01-01') & (s <= '2020-02-05'), 'XXX')
Edit
s = df_transposed.index.to_series()
df_transposed = df_transposed.where((s >= pd.to_datetime('2015-01-01')) &
                                    (s <= pd.to_datetime('2020-02-05')), 'XXX')
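The filter above only masks rows that are already present, so the data for every month still has to be requested first. A rough sketch of one way to do that, assuming the API accepts one month per request and that the query parameters match the URL in the question. The names daily_mean_wspd, fetch_month and collect are hypothetical, the request is made without the custom headers and verify=False used in the question, and normalize() is used instead of dt.date so that the index is a DatetimeIndex:

```python
import pandas as pd

BASE_URL = "https://api.weather.com/v1/location/EGNV:9:GB/observations/historical.json"


def daily_mean_wspd(payload):
    """Daily mean wind speed from one decoded JSON response."""
    obs = pd.DataFrame(payload["observations"])
    obs["time"] = pd.to_datetime(obs["valid_time_gmt"], unit="s")
    return obs.groupby(obs["time"].dt.normalize())["wspd"].mean()


def fetch_month(start_date, end_date, api_key):
    """Request one date range (assumed parameters, taken from the question's URL)."""
    import requests  # imported here so the rest can be exercised offline

    params = {"apiKey": api_key, "units": "e",
              "startDate": start_date, "endDate": end_date}
    response = requests.get(BASE_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


def collect(start, end, api_key, fetch=fetch_month):
    """Fetch month by month and return one date-indexed frame with a wspd column."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    parts = []
    for period in pd.period_range(start, end, freq="M"):
        first = max(period.start_time, start).strftime("%Y%m%d")
        last = min(period.end_time.normalize(), end).strftime("%Y%m%d")
        parts.append(daily_mean_wspd(fetch(first, last, api_key)))
    result = pd.concat(parts).sort_index().to_frame("wspd")
    # Label slicing on a sorted DatetimeIndex keeps both end points
    return result.loc[start:end]
```

Calling collect("20150101", "20200205", api_key) would then issue one request per month. The fetch argument exists so that a stand-in function can replace the network call when trying the logic out.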