TensorFlow estimator.DNNClassifier does not give repeatable results

Problem description

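Every time I run the code below, I get a different "loss for final step" when training the model, and the evaluation accuracy that follows changes as well. I have checked that the input data coming out of train_test_split is constant. I have set a seed with tf.set_random_seed, turned off shuffle, and set num_threads to 1. I am using TensorFlow 1.8. Can anyone tell me what else I need to do?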

from __future__ import print_function
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

np.random.seed(1)
tf.set_random_seed(1)

df = pd.read_csv('diabetes.csv')
X = df.iloc[:,0:8]
y = df['Outcome']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                   stratify=None, random_state=1)

def create_feature_cols():
  return [
    tf.feature_column.numeric_column('Pregnancies'),
    tf.feature_column.numeric_column('Glucose'),
    tf.feature_column.numeric_column('BloodPressure'),
    tf.feature_column.numeric_column('SkinThickness'),
    tf.feature_column.numeric_column('Insulin'),
    tf.feature_column.numeric_column('BMI'),
    tf.feature_column.numeric_column('DiabetesPedigreeFunction'),
    tf.feature_column.numeric_column('Age')
  ]

input_func = tf.estimator.inputs.pandas_input_fn(x=X_train,y=y_train,
             batch_size=10,num_epochs=1000,shuffle=False,num_threads=1)
model  =  tf.estimator.DNNClassifier(hidden_units=[20,20],
          feature_columns=create_feature_cols(),n_classes=2)
model.train(input_fn=input_func,steps=1000)

eval_input_func = tf.estimator.inputs.pandas_input_fn(
      x=X_test,
      y=y_test,
      batch_size=10,
      num_epochs=1,
      shuffle=False,
      num_threads=1)
results = model.evaluate(eval_input_func)

Tags: python, repeat, tensorflow-estimator

Solution


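Here is some code the TensorFlow team sent me that resolves the problem. Instead of calling tf.set_random_seed, set the seed through tf.estimator.RunConfig and pass that config to the estimator.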

import tensorflow as tf
tf.reset_default_graph()
config = tf.estimator.RunConfig(tf_random_seed=234)

input1_col = tf.feature_column.numeric_column('input1')
input2_col = tf.feature_column.numeric_column('input2')
model = tf.estimator.DNNClassifier(hidden_units=[20,20],
          feature_columns=[input1_col, input2_col],n_classes=2,config=config)

import numpy as np
input1 = np.random.random(size=(100, 1))
input2 = np.random.random(size=(100, 1))
target = np.where(np.sum(input1 + input2, axis = 1) > 0, 1, 0)
def train_input_fn():
  return ({'input1': input1, 'input2': input2}, target)

model.train(input_fn=train_input_fn, steps = 10)
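
A likely reason this works: the Estimator builds its own graph when train() is called, so a seed set on the default graph beforehand does not reach it, whereas tf_random_seed in RunConfig is applied inside that graph. To confirm a fix like this, it helps to run training several times and compare the final losses. The sketch below is a minimal, TensorFlow-free illustration of that check. Both train_once and is_repeatable are hypothetical names introduced here, and the tiny logistic-regression loop is only a stand-in for a real training run.

```python
import math
import random


def train_once(seed=None, steps=200, lr=0.1):
    """Hypothetical stand-in for one training run; returns the last-step loss.

    All randomness (here only the weight initialisation) comes from `rng`,
    which is seeded inside the run itself.
    """
    rng = random.Random(seed)  # seed=None falls back to system entropy
    # Fixed toy data: the label is 1 when x1 + x2 > 1, otherwise 0.
    data = [((i / 10.0, j / 10.0), 1.0 if i + j > 10 else 0.0)
            for i in range(11) for j in range(11)]
    n = len(data)
    w = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    b = rng.uniform(-1, 1)
    loss = 0.0
    for _ in range(steps):
        grad_w0 = grad_w1 = grad_b = 0.0
        loss = 0.0
        for (x1, x2), y in data:
            p = 1.0 / (1.0 + math.exp(-(w[0] * x1 + w[1] * x2 + b)))
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            loss -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
            grad_w0 += (p - y) * x1
            grad_w1 += (p - y) * x2
            grad_b += p - y
        loss /= n
        # Plain full-batch gradient descent step.
        w[0] -= lr * grad_w0 / n
        w[1] -= lr * grad_w1 / n
        b -= lr * grad_b / n
    return loss


def is_repeatable(train_fn, runs=3):
    """Call train_fn several times; report whether all losses are identical."""
    losses = [train_fn() for _ in range(runs)]
    return len(set(losses)) == 1, losses


if __name__ == "__main__":
    print(is_repeatable(lambda: train_once(seed=234)))  # seeded runs
    print(is_repeatable(train_once))                    # unseeded runs
```

To apply the same check to the estimator, train_fn would build a new DNNClassifier with config=tf.estimator.RunConfig(tf_random_seed=...), train it, and return the loss reported by model.evaluate. That wiring is an assumption of this sketch and is not part of the code above.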
