Adding timestamp dates to X for machine learning

Problem description

This loop uses NumPy to append future dates to the prediction dataset:

# Future prediction, add dates here for which you want to predict
dates = ["2021-12-23", "2022-12-24", "2023-12-25", "2024-12-26", "2025-12-27",]
#convert to time stamp
for dt in dates:
  datetime_object = datetime.strptime(dt, "%Y-%m-%d")
  timestamp = datetime.timestamp(datetime_object)
  # to array X
  print(int(timestamp))
  np.append(X, int(timestamp))

It correctly prints these values:

1640214000
1671836400
1703458800
1735167600
1766790000
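
Note that `datetime.timestamp` treats a naive datetime as local time, so the numbers above depend on the machine's time zone (they correspond to midnight at UTC+1). The existing values in X, such as 1538352000, are exact multiples of 86400, which suggests they are midnight UTC. A minimal sketch of a time-zone-independent conversion, assuming the training timestamps really are UTC midnights; the helper name `to_utc_timestamp` is hypothetical:

```python
from datetime import datetime, timezone


def to_utc_timestamp(date_string):
    """Parse a YYYY-MM-DD string as midnight UTC and return whole epoch seconds."""
    naive = datetime.strptime(date_string, "%Y-%m-%d")
    # Attach UTC explicitly so the result does not depend on the local time zone
    aware = naive.replace(tzinfo=timezone.utc)
    return int(aware.timestamp())


dates = ["2021-12-23", "2022-12-24", "2023-12-25", "2024-12-26", "2025-12-27"]
future_timestamps = [to_utc_timestamp(d) for d in dates]
print(future_timestamps)
```

With this conversion, every future timestamp falls on the same daily grid as the values already in X.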

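The problem is that the code does not append these 5 timestamp values to the array X. (I suspect the e+09 notation is the cause, but I don't know how to make it work.)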

The structure of array X is:

array([[1.5383520e+09],
       [1.5384384e+09],
       [1.5385248e+09],
       (...)
       [1.6339968e+09],
       [1.6340832e+09],
       [1.6341696e+09]])
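
The e+09 in this output is only how NumPy displays float64 values; the stored numbers are ordinary floats, so the notation by itself does not prevent appending. A small sketch to illustrate, using a two-row stand-in for X (the name `X_demo` is an assumption for illustration):

```python
import numpy as np

# Two-row stand-in for X, written exactly as it appears in the printed output
X_demo = np.array([[1.5383520e+09], [1.5384384e+09]])

print(X_demo.dtype)        # the array holds 64-bit floats
print(int(X_demo[0, 0]))   # the stored value is a plain whole number of seconds

# Casting to integers for display shows the same values without the exponent
print(X_demo.astype(np.int64))
```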

After adding these timestamp values to X, the prediction code raises an error:

# Imports needed for the prediction step
from datetime import datetime
import numpy as np
from matplotlib import pyplot as plt

from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Define model
model = DecisionTreeRegressor()
# Fit to model
model.fit(X_train, Y_train)
# predict
predictions = model.predict(X)
print(mean_squared_error(Y, predictions))

Error:

ValueError: Found input variables with inconsistent numbers of samples: [766, 771]

This is raised on the last line, the call to mean_squared_error.

A second error also appears, because X and Y have different lengths:

ValueError: x and y must have same first dimension, but have shapes (771, 1) and (766, 1)

However, these 5 missing values of Y are precisely what the model is supposed to predict.
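
The mismatch arises because the five future rows have no known targets in Y, so an error metric can only be computed over the rows that do. One way to handle this is sketched below, under the assumption that the future timestamps are kept in a separate array. The name `X_future` is hypothetical, the data is synthetic, and for brevity the model is fitted on all known rows rather than on a train split:

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: 10 daily timestamps with made-up target values
X = (1538352000 + 86400 * np.arange(10)).reshape(-1, 1).astype(float)
Y = np.arange(10, dtype=float).reshape(-1, 1)

# Future timestamps live in their own array, so X and Y keep the same length
X_future = np.array([[1640214000.0], [1671836400.0]])

model = DecisionTreeRegressor()
model.fit(X, Y.ravel())

# Score only on rows that have a known target
known_predictions = model.predict(X)
print(mean_squared_error(Y, known_predictions))

# Predict the future rows separately; there is nothing to score them against
future_predictions = model.predict(X_future)
print(future_predictions.shape)
```

Keep in mind that a decision tree predicts piecewise-constant values, so timestamps beyond the training range will likely all receive the same prediction.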

Tags: python, pandas, machine-learning

Solution


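np.append does not modify X in place; it returns a new array, so the result must be assigned back to X. Inside the loop, use this: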

X = np.append(X, int(timestamp))
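
One caveat with the line above: called without an `axis` argument, `np.append` flattens both inputs, so a column-shaped X of shape (n, 1) comes back one-dimensional. A minimal sketch that keeps the column shape, using a small stand-in array in place of the real X (the name `future_timestamps` is an assumption for illustration):

```python
import numpy as np

# Three-row stand-in for the real X, which has shape (766, 1)
X = np.array([[1.6339968e+09], [1.6340832e+09], [1.6341696e+09]])
future_timestamps = [1640214000, 1671836400]

# Without axis, the result is flattened to one dimension
flat = np.append(X, future_timestamps[0])
print(flat.shape)  # (4,)

# With axis=0 and a (1, 1)-shaped value, each timestamp is added as a new row
for ts in future_timestamps:
    X = np.append(X, [[ts]], axis=0)
print(X.shape)  # (5, 1)
```

Each call to `np.append` copies the whole array, so when there are many values a single `np.vstack([X, np.array(future_timestamps).reshape(-1, 1)])` on the original X gives the same result in one step.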
