Deep learning model trains indefinitely

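Problem description

I am training a deep learning model for meme image classification.

I am trying to combine a VGG model with an LSTM text classifier. Training runs indefinitely and never stops after the specified number of epochs. I have tried every approach I could think of to terminate it with early stopping, but it still runs without end.

My data folder contains two folders: one with the meme images and one with the label text files.

The code is below.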

label_df_clean = t

num_of_samples = label_df_clean.shape[0]
num_of_samples

# ## Glove

import os

import numpy as np
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

maxlen = 100
training_samples = num_of_samples
tag_vocabulary_size = 10000
max_words = tag_vocabulary_size

glove_dir = 'glove/glove.6B/'

embeddings_index = {}
# Each line of the GloVe file holds a word followed by its vector components.
with open(os.path.join(glove_dir, 'glove.6B.100d.txt'), encoding="UTF-8") as f:
    for line in f:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:], dtype='float32')
        embeddings_index[word] = coefs

print('Found %s word vectors.' % len(embeddings_index))

tokenizer = Tokenizer(num_words=max_words)
texts = []
for tag_list in label_df_clean['word_tags']:
    texts.append(' '.join(tag_list))
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
word_index = tokenizer.word_index
print('Found {} unique tokens'.format(len(word_index)))
tag_data = pad_sequences(sequences, maxlen=maxlen)

tag_data.shape

tag_data

embedding_dim = 100

embedding_matrix = np.zeros((max_words, embedding_dim))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if i < max_words:
        if embedding_vector is not None:
            # Words not found in embedding index will be all-zeros.
            embedding_matrix[i] = embedding_vector

Embedding matrix:

tag_data.shape

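# Assumes the standalone Keras package, matching the other keras imports here.
from keras import Input, layers
from keras.models import Model

tag_input = Input(shape=(None,), dtype='int32', name='tag')
embedded_tag = layers.Embedding(max_words, embedding_dim)(tag_input)
encoded_tag = layers.LSTM(512)(embedded_tag)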

# ## CONV2D

from keras.applications import VGG16

image_input = Input(shape=(150, 150, 3), name='image')
vgg16 = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(150, 150, 3))(image_input)
x = layers.Flatten()(vgg16) 
x = layers.Dense(256, activation='relu')(x)

import tensorflow

concatenated = layers.concatenate([x, encoded_tag], axis=-1)
output = layers.Dense(1, activation='sigmoid')(concatenated)

model = Model([image_input, tag_input], output)

# model.layers[1].trainable = False # freeze VGG16 convolutional base
model.layers[4].set_weights([embedding_matrix]) 
# model.layers[4].trainable = False # freeze GloVe word embedding

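class new_callback(tensorflow.keras.callbacks.Callback):
    # Keras calls the hook named on_epoch_end; a method named epoch_end is never invoked.
    def on_epoch_end(self, epoch, logs=None):
        accuracy = (logs or {}).get('accuracy')
        if accuracy is not None and accuracy > 0.65:  # select the accuracy
            print("\n !!! 65% accuracy, no further training !!!")
            self.model.stop_training = True

callbacks = new_callback()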

model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

model.summary()

# # model.layers[1].trainable = False # freeze VGG16 convolutional base
# model.layers[4].set_weights([embedding_matrix])
# model.layers[4].trainable = False # freeze GloVe word embedding
label_df_clean.head()

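from os import listdir

import cv2

dim = (150, 150)
X_image_train = []
X_tag_train = tag_data
y_train = label_df_clean["label"]

# Note: listdir returns names in arbitrary order, so the images are only aligned
# with the rows of label_df_clean if that order happens to match.
for fname in listdir(small_image_path):
    fpath = os.path.join(small_image_path, fname)
    im = cv2.imread(fpath)  # returns None if the file cannot be read
    im_resized = cv2.resize(im, dim, interpolation=cv2.INTER_AREA)
    X_image_train.append(im_resized)
#     y_train.append(1)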
    
# # add wrong tag samples
# num_negative_samples = len(y_train)
# for i in range(num_negative_samples):
#     image = X_image_train[i]
#     X_image_train.append(image)
#     j = (i + 1) % num_negative_samples # get a different tag
#     tag = X_tag_train[j]
#     X_tag_train = np.append(X_tag_train, tag) 
#     y_train
# from sklearn import preprocessing

# le = preprocessing.LabelEncoder()
# le.fit(y_train)
# y_train = le.transform(y_train)

X_image_train = np.array(X_image_train)
X_tag_train   = np.array(tag_data)
y_train       = np.array(y_train)

perm = np.arange(y_train.shape[0])
np.random.shuffle(perm)
X_image_train = X_image_train[perm]
X_tag_train   = X_tag_train[perm]
y_train       = y_train[perm]

X_image_train.shape, X_tag_train.shape, y_train.shape
X_image_train
X_tag_train
y_train

model.fit([X_image_train, X_tag_train], y_train, batch_size=64, epochs=10, callbacks=[callbacks])

The model keeps training and never stops. Note that the code comes from a Jupyter notebook, so please ignore indentation and print issues.
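To judge whether training is truly unbounded or only slow, it can help to work out how many batches the `fit` call is expected to run. The following is a rough sketch: `batch_size=64` and `epochs=10` come from the code above, while the sample count of 7000 and the helper name `total_training_steps` are made up for illustration.

```python
import math


def total_training_steps(num_samples, batch_size, epochs):
    """Return (steps per epoch, total steps) for array-based training.

    The last batch of an epoch may be smaller than batch_size, hence the ceil.
    """
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# Hypothetical dataset size; substitute label_df_clean.shape[0] in practice.
steps_per_epoch, total_steps = total_training_steps(
    num_samples=7000, batch_size=64, epochs=10
)
print(steps_per_epoch, total_steps)
```

If the progress bar shows a step count in this range and keeps advancing, the run is finite and the real question is how long each step takes, which can be considerable when all VGG16 layers are left trainable.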

If you spot a logic problem in the model training, please let me know.
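One detail that may be worth checking in any custom callback: Keras only calls hook methods by its own fixed names, such as `on_epoch_end`, and a method with any other name (for example `epoch_end`) is silently ignored. The toy loop below is not Keras. It is a stand-in with hypothetical classes (`ToyModel`, `ToyCallback`, `toy_fit`) and invented accuracy values that only mimics this name-based dispatch.

```python
class ToyModel:
    """Stand-in for a model; only carries the stop flag."""

    def __init__(self):
        self.stop_training = False


class ToyCallback:
    """Base class with a no-op hook, loosely modelled on a Keras-style callback."""

    def __init__(self):
        self.model = None

    def on_epoch_end(self, epoch, logs=None):
        pass


class MisnamedStop(ToyCallback):
    # Wrong hook name: the training loop never calls this method.
    def epoch_end(self, epoch, logs=None):
        if (logs or {}).get("accuracy", 0.0) > 0.65:
            self.model.stop_training = True


class CorrectStop(ToyCallback):
    # Correct hook name: overrides the no-op in the base class.
    def on_epoch_end(self, epoch, logs=None):
        if (logs or {}).get("accuracy", 0.0) > 0.65:
            self.model.stop_training = True


def toy_fit(callback, accuracies):
    """Run one fake epoch per accuracy value; return how many epochs ran."""
    model = ToyModel()
    callback.model = model
    epochs_run = 0
    for epoch, acc in enumerate(accuracies):
        epochs_run += 1
        callback.on_epoch_end(epoch, logs={"accuracy": acc})
        if model.stop_training:
            break
    return epochs_run


fake_accuracies = [0.50, 0.60, 0.70, 0.80, 0.90]  # invented values
misnamed_epochs = toy_fit(MisnamedStop(), fake_accuracies)
correct_epochs = toy_fit(CorrectStop(), fake_accuracies)
print(misnamed_epochs, correct_epochs)
```

In this sketch the misnamed callback never stops the run early, yet the loop still ends after the last epoch. A wrongly named hook would therefore explain a threshold that never triggers, but not by itself a run that exceeds its epoch count.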

Note: I am not using a label encoder because my training labels are binary.
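A sigmoid output trained with `binary_crossentropy` expects numeric targets of 0 and 1, so skipping the encoder is fine only if the label column already holds those values. A minimal sanity check might look like the sketch below; the helper name `check_binary_labels` and the sample labels are hypothetical.

```python
def check_binary_labels(labels):
    """Return the sorted distinct labels, raising if any value is not 0 or 1."""
    distinct = set(labels)
    if not distinct <= {0, 1}:
        raise ValueError(
            "expected only 0/1 labels, found: {!r}".format(list(distinct))
        )
    return sorted(distinct)


# Numeric 0/1 labels pass the check.
numeric_result = check_binary_labels([0, 1, 1, 0])
print(numeric_result)

# String labels would need encoding before training.
try:
    check_binary_labels(["funny", "not_funny"])
    strings_rejected = False
except ValueError:
    strings_rejected = True
print(strings_rejected)
```

In the notebook this could be called as `check_binary_labels(label_df_clean["label"].tolist())` before building `y_train`.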

I am currently using Keras 2.3.1 and TensorFlow 2.1.0.

Tags: machine-learning, deep-learning

Solution

