TensorFlow does not train: "'DataFrame' objects are mutable, thus they cannot be hashed"

Problem description

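I want to build and train a neural network with tensorflow on the kaggle dataset "House Prices" (but without Keras; I already got it working with Keras). I use Python, and apart from the actual training my code runs fine. When training, however, I either get no error (but nothing is trained) or I get TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed.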

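I run the script in an IPython notebook on Google's colab, and I believe the main problem lies in the feed_dict input. However, I have no idea what is wrong here. batch_X contains 100x10 features, and batch_Y contains 100 labels. I think this might be the crucial snippet: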

train_data = { X: batch_X, Y_:batch_Y }

train_data is what I feed into sess.run(train_step, feed_dict=train_data).

Here is my code: https://colab.research.google.com/drive/1qabmzzicZVu7v72Be8kljM1pUaglb1bY

# train and train_normalized are the training data set (DataFrame)
# train_labels_normalized are the labels only

#Start session:
with tf.Session() as sess:
  sess.run(init)

  possible_indeces = list(range(0, train.shape[0]))
  iterations = 1000
  batch_size = 100

  for step in range(0, iterations):
    #draw batch indeces:
    batch_indeces = random.sample(possible_indeces, batch_size)
    #get features and respective labels
    batch_X = np.array(train_normalized.iloc[batch_indeces])
    batch_Y = np.array(train_labels_normalized.iloc[batch_indeces])

    train_data = { X: batch_X, Y_: batch_Y}

    sess.run(train_step, feed_dict=train_data)

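I expected it to run for a few minutes and return optimized weights (each of the 2 hidden layers has 48 nodes), so that I could make predictions. Instead, it either skips the code above or throws the error.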

Does anyone know what is going wrong?

TypeError Traceback (most recent call last)
<ipython-input-536-79506f90a868> in <module>()
     13     batch_Y = np.array(train_labels_normalized.iloc[batch_indeces])
     14 
---> 15     train_data = { X: batch_X, Y_: batch_Y}
     16 
     17     sess.run(train_step, feed_dict=train_data)

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in __hash__(self)
   1814     def __hash__(self):
   1815         raise TypeError('{0!r} objects are mutable, thus they cannot be'
-> 1816                         ' hashed'.format(self.__class__.__name__))
   1817
   1818     def __iter__(self):

TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed

Tags: tensorflow, training-data, kaggle

Solution


The problem stems from your seventh (testing) step.

#Set X to the test data
X = test_normalized.astype(np.float32)
print(type(X)) # **<class 'pandas.core.frame.DataFrame'>**
Y1 = tf.nn.sigmoid(tf.matmul(X, W1))
Y2 = tf.nn.sigmoid(tf.matmul(Y1, W2))
Y3 = tf.matmul(Y2, W3)

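You are setting X to a DataFrame. On the first run this does not affect anything, because the training step has already executed with X still bound to the placeholder. But when you run the sixth step again after the seventh, the name X now refers to the DataFrame, so { X: batch_X, Y_: batch_Y } tries to use a DataFrame as a dictionary key. Dictionary keys must be hashable, and pandas raises the TypeError shown in your traceback.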

Try renaming X to X_ in that step, so the placeholder is not overwritten:

X_ = test_normalized.astype(np.float32)
Y1 = tf.nn.sigmoid(tf.matmul(X_, W1))
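
The rebinding effect can be reproduced without TensorFlow or pandas. The following is only a minimal sketch: Placeholder and FakeFrame are hypothetical stand-in classes (not part of either library), FakeFrame copies the __hash__ behaviour visible in the traceback, and build_feed is a made-up helper that mimics building train_data.

```python
# Stand-ins so the sketch runs without TensorFlow or pandas.
# Both classes and build_feed are hypothetical, for illustration only.

class Placeholder:
    """Plays the role of a tf.placeholder: an ordinary hashable object."""
    def __init__(self, name):
        self.name = name


class FakeFrame:
    """Plays the role of a pandas DataFrame, which refuses to be hashed."""
    def __hash__(self):
        raise TypeError("{0!r} objects are mutable, thus they cannot be"
                        " hashed".format(self.__class__.__name__))


def build_feed(key, batch):
    """Mimics `train_data = {X: batch_X}`; returns the error text on failure."""
    try:
        return {key: batch}
    except TypeError as exc:
        return str(exc)


# "Step 6" on a fresh notebook: X is the placeholder, so it works as a dict key.
X = Placeholder("X")
first_run = build_feed(X, [[0.1, 0.2]])

# "Step 7" as written in the question: the global name X is rebound.
X = FakeFrame()
second_run = build_feed(X, [[0.1, 0.2]])

# The suggested fix: X stays the placeholder, the test data goes into X_.
X = Placeholder("X")
X_ = FakeFrame()
fixed_run = build_feed(X, [[0.1, 0.2]])

print(isinstance(first_run, dict))   # True
print(second_run)                    # 'FakeFrame' objects are mutable, thus they cannot be hashed
print(isinstance(fixed_run, dict))   # True
```

Because notebook cells share one global namespace, the order in which cells are executed decides what X refers to when the training cell runs, which would explain why the error only appears on a re-run.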

Also, your final evaluation does not work as written. Put it inside a tf.Session.

