python - When training a model, it prints Epoch 1/2, then the cell shuts down and all variables are lost (using IBM lab)
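Problem description
When I try to train my LSTM model, the cell prints "Train on n samples ... Epoch 1/2", then shuts down and all variables are lost, so I have to run every cell again. I've attached images of the part of the code where I build the model and of what happens when I try to train it. Any help would be great! I read somewhere that the os-related code below might help (I don't know what it does), but it didn't change anything: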
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'
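For context on what those two lines do: `KMP_DUPLICATE_LIB_OK` is read by Intel's OpenMP runtime, and setting it tells the runtime to keep running when it finds more than one copy of itself loaded in the process instead of aborting. It does nothing about memory use. The sketch below assumes the flag has to be in the environment before TensorFlow or NumPy are imported to take effect; the helper name `openmp_duplicate_check_disabled` is hypothetical.

```python
import os

# Set the flag before importing TensorFlow/NumPy (assumption: the OpenMP
# runtime reads it when it initialises, so setting it later may do nothing).
os.environ["KMP_DUPLICATE_LIB_OK"] = "True"


def openmp_duplicate_check_disabled() -> bool:
    """Return True if the workaround flag is set in this process (hypothetical helper)."""
    return os.environ.get("KMP_DUPLICATE_LIB_OK", "").lower() == "true"


print(openmp_duplicate_check_disabled())
```

If the flag is set and the cell still dies, the duplicate-library check is probably not what is stopping the kernel.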
Below is my code, excluding data cleaning etc.
Image of the imported libraries (code below this link)
!pip install --upgrade tensorflow-gpu==2.0
!pip install plotly
!pip install --upgrade nbformat
!pip install nltk
!pip install spacy # spaCy is an open-source software library for advanced natural language processing
!pip install WordCloud
!pip install gensim # Gensim is an open-source library for unsupervised topic modeling and natural language processing
import nltk
nltk.download('punkt')
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud, STOPWORDS
import nltk
import re
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
import gensim
from gensim.utils import simple_preprocess
from gensim.parsing.preprocessing import STOPWORDS
# import keras
from tensorflow.keras.preprocessing.text import one_hot, Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Embedding, Input, LSTM, Conv1D, MaxPool1D, Bidirectional, Dropout
from tensorflow.keras.models import Model
nltk.download("stopwords")
Image of tokenization and padding (code below)
# split data into test and train
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(df.clean_joined, df.label, test_size = 0.20)
from nltk import word_tokenize
# Create a tokenizer to tokenize the words and create sequences of tokenized words
tokenizer = Tokenizer(num_words = total_words)
tokenizer.fit_on_texts(x_train)
train_sequences = tokenizer.texts_to_sequences(x_train)
test_sequences = tokenizer.texts_to_sequences(x_test)
embedding_dim = 100
embeddings_index = dict()
f = open('glove.6B.100d.txt')
for line in f:
    values = line.split()
    word = values[0]
    coefs = np.asarray(values[1:], dtype='float32')
    embeddings_index[word] = coefs
f.close()
print(f'Found {len(embeddings_index)} word vectors')
embedding_matrix = np.zeros((total_words, embedding_dim))
for word, index in tokenizer.word_index.items():
    if index > total_words -1:
        break
    else:
        embedding_vector = embeddings_index.get(word)
        if embedding_vector is not None:
            embedding_matrix[index] = embedding_vector
padded_train = pad_sequences(train_sequences,maxlen = 16274, padding = 'post', truncating = 'post')
padded_test = pad_sequences(test_sequences,maxlen = 16274, truncating = 'post')
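The `maxlen = 16274` above pads every text to the length of what is presumably the longest one. A common alternative, sketched here as an assumption rather than something the question confirms, is to derive the padding length from the distribution of sequence lengths so that a few very long texts do not set the size for all of them. The function name `suggest_maxlen` and the toy data are hypothetical.

```python
import numpy as np


def suggest_maxlen(sequences, percentile=95):
    """Return a padding length that covers `percentile` percent of the sequences."""
    lengths = np.array([len(seq) for seq in sequences])
    return int(np.percentile(lengths, percentile))


# Toy stand-in for train_sequences: twenty short texts plus one very long one.
toy_lengths = list(range(10, 110, 10)) * 2 + [16274]
toy_sequences = [[1] * n for n in toy_lengths]

print("longest sequence:", max(len(s) for s in toy_sequences))
print("suggested maxlen:", suggest_maxlen(toy_sequences))
```

With the real data this would be called as `suggest_maxlen(train_sequences)`, and the result passed as `maxlen` to both `pad_sequences` calls. Longer texts are then truncated, which trades some information for a much smaller input.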
Image of building the model and the model summary (code below)
# Sequential Model
model = Sequential()
model.add(Embedding(total_words, embedding_dim,
                    embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
                    trainable=False))
model.add(Dropout(0.2))
# Bi-Directional RNN and LSTM
model.add(Bidirectional(LSTM(128)))
# Dense layers
model.add(Dense(128, activation = 'relu'))
model.add(Dense(1,activation= 'sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, None, 100)         17437200
_________________________________________________________________
dropout (Dropout)            (None, None, 100)         0
_________________________________________________________________
bidirectional (Bidirectional (None, 256)               234496
_________________________________________________________________
dense (Dense)                (None, 128)               32896
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 129
=================================================================
Total params: 17,704,721
Trainable params: 267,521
Non-trainable params: 17,437,200
_________________________________________________________________
Image of what happens when I try to train the model - note that the cell shuts down (code below)
y_train = np.asarray(y_train)
# train the model
model.fit(padded_train, y_train, batch_size = 64, validation_split = 0.1, epochs = 2, verbose=2)
It may help to look at the image here, but essentially this gets printed and then the cell shuts down:
'Train on 14947 samples, validate on 1661 samples Epoch 1/2'
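The question does not establish why the cell shuts down. One possibility worth ruling out is that the kernel runs out of memory, since every sample is padded to 16274 tokens. The rough estimate below is only a sketch: it assumes `pad_sequences` produces its default `int32` output and that the embedding output is `float32`, and the helper names are hypothetical.

```python
def padded_array_gib(n_samples, maxlen, bytes_per_item=4):
    """Size in GiB of a dense (n_samples, maxlen) array; int32 takes 4 bytes."""
    return n_samples * maxlen * bytes_per_item / 2**30


def embedded_batch_gib(batch_size, maxlen, embedding_dim, bytes_per_item=4):
    """Size in GiB of one float32 batch after the Embedding layer."""
    return batch_size * maxlen * embedding_dim * bytes_per_item / 2**30


# 14947 training + 1661 validation samples, as printed by model.fit
n_train = 14947 + 1661
print(f"padded_train: {padded_array_gib(n_train, 16274):.2f} GiB")
print(f"one embedded batch: {embedded_batch_gib(64, 16274, 100):.2f} GiB")
```

These figures are lower bounds. During training the LSTM also keeps intermediate activations for each of the 16274 time steps so it can backpropagate through them, so real usage is considerably higher. Trying a much smaller `maxlen` or `batch_size` is a cheap way to check whether memory is the problem.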
Any help would be great!
Solution