python - Training a model with a while loop
Problem description
I am trying to iterate while the length of my dataset S_train is <= a given number, in this case 11. This is what I have so far:
S_new = train
T_new = test
mu_new = mu
mu_test_new = mu_test
while len(S_new) <= 11:
    ground_test = T_new[target].values.tolist()
    acquisition_function = abs(mu_test - ground_test)
    max_item = np.argmax(acquisition_function)  # step 3: value in the test set that maximizes the absolute difference of the energy
    alpha_al = test.iloc[[max_item]]  # identify the minimum step in test set
    S_new = S_new.append(alpha_al)
    len(S_new)
    T_new = T_new.drop(test.index[max_item])
    len(T_new)
    gpr = GaussianProcessRegressor(
        # kernel is the covariance function of the Gaussian process (GP)
        kernel=Normalization(  # kernel equals to normalization -> normalizes a kernel using the cosine of angle formula, k_normalized(x,y) = k(x,y)/sqrt(k(x,x)*k(y,y))
            # graphdot.kernel.fix.Normalization(kernel), set kernel as marginalized graph kernel, which is used to calculate the similarity between 2 graphs
            # implement the random walk-based graph similarity kernel as Kashima, H., Tsuda, K., & Inokuchi, A. (2003). Marginalized kernels between labeled graphs. ICML
            Tang2019MolecularKernel()
        ),
        alpha=1e-4,  # value added to the diagonal of the kernel matrix during fitting
        optimizer=True,  # default optimizer of L-BFGS-B based on scipy.optimize.minimize
        normalize_y=True,  # normalize the y values so that the mean and variance are 0 and 1, respectively. Will be reversed when predictions are returned
        regularization='+',  # alpha (1e-4 in this case) is added to the diagonal of the kernel matrix
    )
    start_time = time.time()
    gpr.fit(S_new.graphs, S_new[target], repeat=1, verbose=True)  # fit the train set as graphs (independent variable) with train[target] as the dependent variable
    end_time = time.time()
    print("the total time consumption is " + str(end_time - start_time) + ".")
    gpr.kernel.hyperparameters
    rmse_training = []
    rmse_test = []
    mu_new = gpr.predict(S_new.graphs)
    print('Training set')
    print('MAE:', np.mean(np.abs(S_new[target] - mu_new)))
    print('RMSE:', np.std(S_new[target] - mu_new))
    rmse_training.append(np.std(S_new[target] - mu_new)
    mu_test_new = gpr.predict(T_new.graphs)
    print('Training set')
    print('MAE:', np.mean(np.abs(T_new[target] - mu_test_new)))
    print('RMSE:', np.std(T_new[target] - mu_test_new))
    rmse_test.append(np.std(T_new[target] - mu_test_new)
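Basically, I am finding the value in T_new that maximizes the absolute error between the i-th elements of T_new and mu_test, adding it to the set S_train, and then removing it from T_new. With the new S_train, I train my model again and then repeat the same steps I explained above. I have never used a while loop before, so I looked up the syntax. It looks correct to me, but I get this error message: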
File "<ipython-input-55-d284ca5f9d1f>", line 42
mu_test_new = gpr.predict(T_new.graphs)
^
SyntaxError: invalid syntax
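Do you know what is causing this? Any suggestions are much appreciated. Thank you, as always, for your help.
Solution
The problem is not the while loop. It is just a typo. Specifically, this line: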
rmse_training.append(np.std(S_new[target] - mu_new)
is missing a closing parenthesis.
If you try
rmse_training.append(np.std(S_new[target] - mu_new))
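the error you are seeing will go away. Note that the later line rmse_test.append(np.std(T_new[target] - mu_test_new) is missing its closing parenthesis in the same way and needs the same fix.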
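It is worth noting that an error reported on a particular line is sometimes caused by a syntax error on an earlier line, which is something to keep in mind when debugging.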
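To see this effect in isolation, here is a small sketch that compiles a three-line snippet whose second line lacks a closing parenthesis and prints the line number Python blames. The snippet and the helper name `syntax_error_line` are made up for illustration. The reported line depends on the interpreter version: versions before Python 3.10 typically point at the following statement (line 3 here), as in the question, while newer versions usually point at the line with the unclosed parenthesis.

```python
source = (
    "values = []\n"
    "values.append(abs(1 - 2)\n"  # line 2: closing parenthesis is missing
    "total = sum(values)\n"       # line 3: valid on its own
)


def syntax_error_line(code):
    """Return the line number Python reports for a syntax error, or None."""
    try:
        compile(code, "<example>", "exec")
    except SyntaxError as exc:
        return exc.lineno
    return None


reported = syntax_error_line(source)
# Close the parenthesis on line 2 and check again.
fixed = source.replace("abs(1 - 2)\n", "abs(1 - 2))\n")

print("reported line:", reported)
print("after fix:", syntax_error_line(fixed))
```

If the reported line looks correct by itself, checking the bracket balance of the line or two above it is usually the quickest next step.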