Error when building an LSTM model: "ValueError: Layer sequential expects 1 input(s), but it received 200 input tensors."

Problem description

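I am trying to build an LSTM model as part of a peptide sequence generator. The LSTM is the discriminator network, while an RNN generates candidate peptide sequences. However, I keep getting this error: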

ValueError: Layer sequential expects 1 input(s), but it received 200 input tensors.

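My input is a Python list of one-hot encoded arrays, one per peptide sequence (200 of them in the list). The error appears when I pass this list to the model, so I believe I have to transform my input somehow, but I don't know exactly how.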

Here is my code.

import tensorflow as tf
import numpy as np
import os
import pickle
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout
from string import punctuation

pos = "/content/drive/MyDrive/pepfun/Training_format_pos (1).txt"
neg = "/content/drive/MyDrive/pepfun/Training_format_neg.txt"

f = open(pos, 'r')
file_contents = f.read()
data = file_contents
f.close()
newdatapos = data.splitlines()
print(newdatapos)

f2 = open(neg, 'r')
file_contents2 = f2.read()
data2 = file_contents2
f2.close()
newdataneg = data2.splitlines()
print(newdataneg)

alldata = newdataneg + newdatapos

sequence_length = 53
BATCH_SIZE = 50
EPOCHS=30

import os, sys, math
import numpy as np
import pandas as pd
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_context("notebook", font_scale=1.4)

codes = ['A', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'K', 'L',
         'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'V', 'W', 'Y']
print(len(codes))

def one_hot_encode(seq):
    o = list(set(codes) - set(seq))
    s = pd.DataFrame(list(seq))    
    x = pd.DataFrame(np.zeros((len(seq),len(o)),dtype=int),columns=o)    
    a = s[0].str.get_dummies(sep=',')
    a = a.join(x)
    a = a.sort_index(axis=1)
    return np.array(a)

onehot = []

for item in alldata:
  onehot.append(one_hot_encode(item))

print(onehot)


N=53
for i,a in enumerate(onehot):
  rows, cols = a.shape
  if rows != N:
    onehot[i] = np.vstack([a, np.zeros((N - rows, cols), dtype=a.dtype)])

model = Sequential([
    LSTM(256,input_shape=[53,20], return_sequences=True),
    Dropout(0.3),
    LSTM(256),
    Dense(20, activation="softmax"),
])

model.summary()
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

model.fit(onehot, steps_per_epoch =4, batch_size=BATCH_SIZE, epochs=EPOCHS)

I am building my model based on the code given here: https://towardsdatascience.com/image-generation-in-10-minutes-with-generation-adversarial-networks-c2afc56bfa3b. The peptide sequences are my training data, and the input shape is (53, 20).

Tags: python, numpy, tensorflow, keras, bioinformatics

Solution
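
A likely cause: Keras treats a Python list of arrays passed to model.fit as a set of separate model inputs, one per list element, so a list of 200 arrays of shape (53, 20) looks like 200 input tensors to a Sequential model that expects one. The usual remedy is to combine them into a single 3-D array of shape (num_sequences, 53, 20). The sketch below shows that step with NumPy only. The helper names pad_and_stack and fake_one_hot, the random stand-in data, and the 0/1 labels (negative file = 0, positive file = 1) are assumptions made for illustration and are not part of the original code.

```python
import numpy as np

N_RESIDUES = 20   # size of the amino-acid alphabet (len(codes) in the question)
MAX_LEN = 53      # sequence_length in the question


def pad_and_stack(encoded, max_len=MAX_LEN):
    """Zero-pad each (length, 20) array to (max_len, 20) and stack into one 3-D array."""
    batch = np.zeros((len(encoded), max_len, N_RESIDUES), dtype=np.float32)
    for i, a in enumerate(encoded):
        rows = min(a.shape[0], max_len)   # sequences longer than max_len are truncated
        batch[i, :rows, :] = a[:rows]
    return batch


# Hypothetical stand-in data: random one-hot arrays of varying length,
# playing the role of one_hot_encode(item) for each peptide.
rng = np.random.default_rng(0)


def fake_one_hot(length):
    return np.eye(N_RESIDUES, dtype=int)[rng.integers(0, N_RESIDUES, size=length)]


neg_encoded = [fake_one_hot(n) for n in (10, 53, 31)]
pos_encoded = [fake_one_hot(n) for n in (7, 44)]

# One array for all sequences, in the same order as alldata = newdataneg + newdatapos.
X = pad_and_stack(neg_encoded + pos_encoded)

# Assumed labels for a discriminator: 0 for negatives, 1 for positives.
y = np.concatenate([np.zeros(len(neg_encoded)),
                    np.ones(len(pos_encoded))]).astype(np.float32)

print(X.shape, y.shape)   # (5, 53, 20) (5,)
```

With the real data, X would replace the onehot list and the call would become something like model.fit(X, y, batch_size=BATCH_SIZE, epochs=EPOCHS); steps_per_epoch can usually be left out for in-memory arrays. Note that the original call passes no targets at all, which the compiled loss needs. If the network is meant to be a real-versus-fake or positive-versus-negative discriminator, a final Dense(1, activation="sigmoid") layer with loss="binary_crossentropy" would match 0/1 labels better than Dense(20, activation="softmax") with categorical_crossentropy, though that depends on what the model is supposed to predict.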

