
Problem description

Can anyone help me split the MNIST dataset into training, validation, and test sets using a ratio of our choosing?

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

I want a 70-20-10 split for training, validation, and testing.

Tags: tensorflow, machine-learning, keras, training-data, mnist

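Solution

This approach should do it. It calls the train_test_split function twice in succession to split the dataset into training, validation, and test sets. Note that train_test_split comes from scikit-learn (sklearn.model_selection), not from TensorFlow: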

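import numpy as np
from sklearn.model_selection import train_test_split

# dataX/dataY are assumed to be the full data set: merge the default
# 60,000/10,000 split returned by load_data() so the ratios apply to all samples
dataX = np.concatenate([x_train, x_test])
dataY = np.concatenate([y_train, y_test])

train_ratio = 0.70
validation_ratio = 0.20
test_ratio = 0.10

# train is now 70% of the entire data set
# x_test/y_test temporarily hold the remaining 30%, which is split again below
x_train, x_test, y_train, y_test = train_test_split(dataX, dataY, test_size=1 - train_ratio)

# test is now 10% of the initial data set
# validation is now 20% of the initial data set
x_val, x_test, y_val, y_test = train_test_split(x_test, y_test, test_size=test_ratio/(test_ratio + validation_ratio))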
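
If scikit-learn is not available, the same 70-20-10 split can be approximated with NumPy alone. The sketch below is only an illustration, not part of the original answer: the helper name split_by_ratio, its parameters, and the synthetic stand-in data (random arrays used in place of the real MNIST images) are all assumptions made for this example.

```python
import numpy as np


def split_by_ratio(x, y, ratios=(0.70, 0.20, 0.10), seed=0):
    """Hypothetical helper: shuffle x and y together, then cut into train/val/test."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of samples")
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    n = len(x)
    rng = np.random.default_rng(seed)  # fixed seed makes the split reproducible
    order = rng.permutation(n)         # one shuffled index shared by x and y

    n_train = int(round(ratios[0] * n))
    n_val = int(round(ratios[1] * n))
    train_idx = order[:n_train]
    val_idx = order[n_train:n_train + n_val]
    test_idx = order[n_train + n_val:]  # whatever remains goes to test

    return ((x[train_idx], y[train_idx]),
            (x[val_idx], y[val_idx]),
            (x[test_idx], y[test_idx]))


# Synthetic stand-in for MNIST: 1,000 random 28x28 "images" with labels 0-9.
data_rng = np.random.default_rng(42)
images = data_rng.integers(0, 256, size=(1000, 28, 28), dtype=np.uint8)
labels = data_rng.integers(0, 10, size=1000)

(x_tr, y_tr), (x_va, y_va), (x_te, y_te) = split_by_ratio(images, labels)
print(len(x_tr), len(x_va), len(x_te))
```

With the 1,000 synthetic samples above, the three parts contain 700, 200, and 100 samples. To use it on the real data, pass the merged MNIST arrays in place of images and labels.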
