How do I make predictions with my RNN model?

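Problem description

I built a model on a dataset whose samples have the form: Name, Gender.

I read the data from the dataset and split it into training and test data.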

import pandas as pd
from sklearn.model_selection import train_test_split

# Give the location of the file
loc = "/content/gdrive/My Drive/DatasetProject/name_gender_dataset.csv"
max_rows = 500000  # Reduction due to memory limitations

# Load only the needed columns, drop incomplete rows, strip whitespace from names
df = (pd.read_csv(loc, usecols=['Name', 'Gender'])
        .dropna(subset=['Name', 'Gender'])
        .assign(Name=lambda x: x.Name.str.strip())
        .head(max_rows))

# Split the cleaned frame (rather than the raw file contents) into train and test sets
names_train, names_test, gen_train, gen_test = train_test_split(
    df['Name'], df['Gender'], test_size=0.25, shuffle=True, random_state=123)

for name, gen in zip(names_train[:20], gen_train[:20]):
    print(name, gen)

After that, I used Tokenizers to create the input for my layers.

import numpy as np
import tensorflow as tf

# Character-level tokenizer for names, fitted on the training names only
encoder_train = tf.keras.preprocessing.text.Tokenizer(char_level=True)
encoder_train.fit_on_texts(names_train)

sequences = encoder_train.texts_to_sequences(names_train)
sequences = tf.keras.preprocessing.sequence.pad_sequences(sequences)
max_len = sequences.shape[1]  # padded length, reused for the test set

# Encode the test names with the same tokenizer so character indices match
sequences_test = encoder_train.texts_to_sequences(names_test)
sequences_test = tf.keras.preprocessing.sequence.pad_sequences(sequences_test, maxlen=max_len)

# Tokenizer for the gender labels, also fitted on the training labels only
encoder_gen_train = tf.keras.preprocessing.text.Tokenizer(lower=False, char_level=True)
encoder_gen_train.fit_on_texts(gen_train)

# Tokenizer indices start at 1, while SparseCategoricalCrossentropy expects
# class ids starting at 0, so shift by one (assumes single-character labels)
gender_vec_train = np.asarray(encoder_gen_train.texts_to_sequences(gen_train)) - 1
gender_vec_test = np.asarray(encoder_gen_train.texts_to_sequences(gen_test)) - 1

embedding_input_dim = max(encoder_train.index_word) + 1
embedding_output_dim = 32
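
One detail worth checking in the encoding step: a character-level vocabulary is typically ordered by character frequency, so vocabularies fitted on different sets of names can assign different integers to the same character. The sketch below uses a simplified pure-Python stand-in for the tokenizer to illustrate the effect; the helpers `fit_char_index` and `encode` are hypothetical and not part of Keras.

```python
from collections import Counter

def fit_char_index(texts):
    """Simplified stand-in for a char-level tokenizer: most frequent character gets index 1."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower())
    # most_common() sorts by descending count; ties keep first-seen order
    return {ch: i + 1 for i, (ch, _) in enumerate(counts.most_common())}

def encode(text, char_index):
    """Map each known character to its integer index, skipping unknown characters."""
    return [char_index[ch] for ch in text.lower() if ch in char_index]

# Two vocabularies fitted on different (made-up) name lists
train_index = fit_char_index(["anna", "bob"])
test_index = fit_char_index(["bobby", "ann"])

# The same name encodes differently under the two vocabularies
print(encode("bob", train_index))
print(encode("bob", test_index))
```

If the model is trained on one mapping and evaluated on another, the validation inputs no longer mean the same thing to the embedding layer, which is why a single vocabulary fitted on the training names is usually reused everywhere.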

My model is defined as follows:

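from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(input_dim=embedding_input_dim,
                    output_dim=embedding_output_dim,
                    mask_zero=True))
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(32))
model.add(Dense(64, activation='relu'))
# softmax yields a probability distribution over the two classes,
# which is what SparseCategoricalCrossentropy expects by default
model.add(Dense(2, activation='softmax'))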

After that, I compiled and trained my model.

model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.075),  # `lr` is the deprecated name
              metrics=['accuracy'])
history = model.fit(sequences,
                    gender_vec_train,
                    epochs=5,
                    batch_size=200,
                    validation_data=(sequences_test, gender_vec_test))
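
Sparse categorical cross-entropy uses each integer label as an index into the model's output vector, so with two output units the labels are expected to be 0 or 1. A minimal pure-Python sketch of the per-sample loss follows; the function name `sparse_cce` and the example probabilities are illustrative, and the numerical safeguards a real implementation adds are omitted.

```python
import math

def sparse_cce(label, probs):
    """Per-sample sparse categorical cross-entropy: -log of the probability at index `label`."""
    if not 0 <= label < len(probs):
        raise ValueError(f"label {label} is outside [0, {len(probs)})")
    return -math.log(probs[label])

probs = [0.2, 0.8]           # hypothetical model output for one name (two classes)
print(sparse_cce(1, probs))  # small loss: the model favours class 1
print(sparse_cce(0, probs))  # larger loss: the model gave class 0 little weight
```

Since Keras Tokenizer indices start at 1, labels encoded with a Tokenizer need to be shifted down by one before they are used with this loss.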

Finally, I want to make predictions. I would like to write something like: model.predict("Andrew")

What changes should I make to my model to be able to do this?

Tags: python, tensorflow, keras, model, predict

Solution
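
`model.predict` expects the same numeric input the model was trained on, so a raw string such as "Andrew" has to go through the same tokenizer and padding first. The following is a minimal sketch, assuming the tokenizer fitted on the training names and the trained model are passed in. The helper `predict_gender`, its parameters, and the `labels` list are illustrative names, not part of Keras. Padding is done by hand with NumPy to mirror the default left-padding and left-truncation of `pad_sequences`.

```python
import numpy as np

def predict_gender(name, model, tokenizer, max_len, labels):
    """Encode one name like the training data and return (label, probability)."""
    # Same character-level encoding as used for training
    seq = tokenizer.texts_to_sequences([name])[0]
    # Keep the last max_len characters and left-pad with zeros
    seq = seq[-max_len:]
    padded = np.zeros((1, max_len), dtype="int32")
    if seq:
        padded[0, max_len - len(seq):] = seq
    # model.predict returns one row of class probabilities per input row
    probs = np.asarray(model.predict(padded))[0]
    best = int(np.argmax(probs))
    return labels[best], float(probs[best])
```

With the objects from the question, this might be called as `predict_gender("Andrew", model, encoder_train, sequences.shape[1], labels)`, where `labels[i]` is the gender string for class id `i`. How that list is built depends on how the labels were encoded for training.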

