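python - How to see the trend and residual patterns in daily data with a strong pattern
Problem description
I am trying to remove the pattern from a dataset that has a daily activity shape like the one below. I tried seasonal_decompose, which may not be appropriate.
What I want is to remove the expected peak-usage pattern and get at the trend or the spikes, just as you can when applying the seasonal_decompose function to monthly data.
Does anyone know how I can see the trend and the anomalous data in daily data like this?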
Edit: here is the code to reproduce the example above.
import pandas as pd
sample = {'EventTime': [pd.Timestamp('2020-09-21 00:00:00'), pd.Timestamp('2020-09-21 01:00:00'), pd.Timestamp('2020-09-21 02:00:00'), pd.Timestamp('2020-09-21 03:00:00'), pd.Timestamp('2020-09-21 04:00:00'), pd.Timestamp('2020-09-21 05:00:00'), pd.Timestamp('2020-09-21 06:00:00'), pd.Timestamp('2020-09-21 07:00:00'), pd.Timestamp('2020-09-21 08:00:00'), pd.Timestamp('2020-09-21 09:00:00'), pd.Timestamp('2020-09-21 10:00:00'), pd.Timestamp('2020-09-21 11:00:00'), pd.Timestamp('2020-09-21 12:00:00'), pd.Timestamp('2020-09-21 13:00:00'), pd.Timestamp('2020-09-21 14:00:00'), pd.Timestamp('2020-09-21 15:00:00'), pd.Timestamp('2020-09-21 16:00:00'), pd.Timestamp('2020-09-21 17:00:00'), pd.Timestamp('2020-09-22 01:00:00'), pd.Timestamp('2020-09-22 02:00:00'), pd.Timestamp('2020-09-22 03:00:00'), pd.Timestamp('2020-09-22 04:00:00'), pd.Timestamp('2020-09-22 05:00:00'), pd.Timestamp('2020-09-22 06:00:00'), pd.Timestamp('2020-09-22 07:00:00'), pd.Timestamp('2020-09-22 08:00:00'), pd.Timestamp('2020-09-22 09:00:00'), pd.Timestamp('2020-09-22 10:00:00'), pd.Timestamp('2020-09-22 11:00:00'), pd.Timestamp('2020-09-22 12:00:00'), pd.Timestamp('2020-09-22 13:00:00'), pd.Timestamp('2020-09-22 14:00:00'), pd.Timestamp('2020-09-22 15:00:00'), pd.Timestamp('2020-09-22 16:00:00'), pd.Timestamp('2020-09-22 17:00:00'), pd.Timestamp('2020-09-23 00:00:00'), pd.Timestamp('2020-09-23 01:00:00'), pd.Timestamp('2020-09-23 02:00:00'), pd.Timestamp('2020-09-23 03:00:00'), pd.Timestamp('2020-09-23 04:00:00'), pd.Timestamp('2020-09-23 05:00:00'), pd.Timestamp('2020-09-23 06:00:00'), pd.Timestamp('2020-09-23 07:00:00'), pd.Timestamp('2020-09-23 08:00:00'), pd.Timestamp('2020-09-23 09:00:00'), pd.Timestamp('2020-09-23 10:00:00'), pd.Timestamp('2020-09-23 11:00:00'), pd.Timestamp('2020-09-23 12:00:00'), pd.Timestamp('2020-09-23 13:00:00'), pd.Timestamp('2020-09-23 14:00:00'), pd.Timestamp('2020-09-23 15:00:00'), pd.Timestamp('2020-09-23 16:00:00'), pd.Timestamp('2020-09-23 17:00:00'), 
pd.Timestamp('2020-09-24 01:00:00'), pd.Timestamp('2020-09-24 02:00:00'), pd.Timestamp('2020-09-24 03:00:00'), pd.Timestamp('2020-09-24 04:00:00'), pd.Timestamp('2020-09-24 05:00:00'), pd.Timestamp('2020-09-24 06:00:00'), pd.Timestamp('2020-09-24 07:00:00'), pd.Timestamp('2020-09-24 08:00:00'), pd.Timestamp('2020-09-24 09:00:00'), pd.Timestamp('2020-09-24 10:00:00'), pd.Timestamp('2020-09-24 11:00:00'), pd.Timestamp('2020-09-24 12:00:00'), pd.Timestamp('2020-09-24 13:00:00'), pd.Timestamp('2020-09-24 14:00:00'), pd.Timestamp('2020-09-24 15:00:00'), pd.Timestamp('2020-09-24 16:00:00'), pd.Timestamp('2020-09-24 17:00:00'), pd.Timestamp('2020-09-25 00:00:00'), pd.Timestamp('2020-09-25 01:00:00'), pd.Timestamp('2020-09-25 02:00:00'), pd.Timestamp('2020-09-25 03:00:00'), pd.Timestamp('2020-09-25 04:00:00'), pd.Timestamp('2020-09-25 05:00:00'), pd.Timestamp('2020-09-25 06:00:00'), pd.Timestamp('2020-09-25 07:00:00'), pd.Timestamp('2020-09-25 08:00:00'), pd.Timestamp('2020-09-25 09:00:00'), pd.Timestamp('2020-09-25 10:00:00'), pd.Timestamp('2020-09-25 11:00:00'), pd.Timestamp('2020-09-25 12:00:00'), pd.Timestamp('2020-09-25 13:00:00'), pd.Timestamp('2020-09-25 14:00:00')],
'SpeedKbs': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1088.48, 58282.31, 83008.37, 58044.14, 34211.61, 27468.72, 25756.96, 14090.29, 5392.43, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1008.33, 44002.72, 47254.5, 37419.96, 23934.41, 19402.93, 18192.84, 9040.67, 3842.37, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1241.15, 43260.7, 56718.99, 41968.16, 33144.51, 22361.08, 28672.93, 21182.31, 5352.42, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 946.01, 46169.63, 51720.39, 37393.39, 27732.89, 25779.79, 24790.86, 15786.72, 4202.65, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 871.7, 37196.78, 40910.71, 26758.97, 17710.98, 16024.61, 15312.96, 9529.89]}
from statsmodels.tsa.seasonal import seasonal_decompose
seasonal_decompose(pd.DataFrame(sample).set_index("EventTime"), model='additive', period=1).plot();
Solution