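python - Unable to unpickle object when predicting with gunicorn
Problem description
I am currently serving a next-word prediction model through an API. The model runs successfully under Flask, but unpickling the object fails when I deploy with gunicorn. The pickled object depends on its class definition, which I provide explicitly wherever it is needed.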
class LanguageModel(nn.Module):
    def __init__(self, vocab_size, embedding_size, hidden_size, n_layers=1, dropout_p=0.5):
        # Defining layers
        super(LanguageModel, self).__init__()
        self.n_layers = n_layers
        self.hidden_size = hidden_size
        self.embed = nn.Embedding(vocab_size, embedding_size)
        self.rnn = nn.LSTM(embedding_size, hidden_size, n_layers, batch_first=True)
        self.linear = nn.Linear(hidden_size, vocab_size)
        self.dropout = nn.Dropout(dropout_p)

    def init_weight(self):
        # self.embed.weight = nn.init.xavier_uniform(self.embed.weight)
        self.embed.weight.data.copy_(torch.from_numpy(new_w))
        self.linear.weight = nn.init.xavier_uniform(self.linear.weight)
        self.linear.bias.data.fill_(0)

# importing word indexes
with open(w2i, "rb") as f1:
    word2index = pickle.load(f1)
with open(i2w, "rb") as f2:
    index2word = pickle.load(f2)

# loading model
model = torch.load(wordModel)

def getNextWords(words):
    results = []
    data = [words]
    data = flatten([co.strip().split() + ['</s>'] for co in data])
    x = prepare_sequence(data, word2index)
    x = x.unsqueeze(1)
    x = batchify(x, 1)
    with torch.no_grad():
        hidden = model.init_hidden(1)
        for batch in getBatch(x, 1):
            inputs, targets = batch
            output, hidden = model(inputs, hidden)
            prob = output.exp()
            word_id = torch.multinomial(prob, num_samples=1).item()
            # word_probs = torch.multinomial(prob, num_samples=1).probs()
            word = index2word[word_id]
            results.append(word)
    return [res for res in results if res.isalpha()][:4]  # return results

app = Flask(__name__)

@app.route('/')
def home():
    return "Home"

@app.route('/getPredictions', methods=["POST"])
def getPredictions():
    #...... code .........
    resultJSON = {'inputPhrase': inputPhrase,
                  'predictions': predictions}  # predictions [nextPhrase]
    print('result: ', predictions)
    return jsonify(resultJSON)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=3001, debug=True)  # 10.2.1.29
Gunicorn wsgi.py file:
from m_api import app
import torch
import torch.nn as nn
from torch.autograd import Variable

if __name__ == "__main__":
    class LanguageModel(nn.Module):
        def __init__(self, vocab_size, embedding_size, hidden_size, n_layers=1, dropout_p=0.5):
            # Defining layers
            super(LanguageModel, self).__init__()
            self.n_layers = n_layers
            self.hidden_size = hidden_size
            self.embed = nn.Embedding(vocab_size, embedding_size)
            self.rnn = nn.LSTM(embedding_size, hidden_size, n_layers, batch_first=True)
            self.linear = nn.Linear(hidden_size, vocab_size)
            self.dropout = nn.Dropout(dropout_p)

        def init_weight(self):
            # self.embed.weight = nn.init.xavier_uniform(self.embed.weight)
            self.embed.weight.data.copy_(torch.from_numpy(new_w))
            self.linear.weight = nn.init.xavier_uniform(self.linear.weight)
            self.linear.bias.data.fill_(0)

    app.run()
This application runs perfectly when served by Flask, but when I use gunicorn it throws an error:
    model = torch.load(wordModel)
  File "/home/.conda/envs/sppy36/lib/python3.6/site-packages/torch/serialization.py", line 426, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/.conda/envs/sppy36/lib/python3.6/site-packages/torch/serialization.py", line 613, in _load
    result = unpickler.load()
AttributeError: Can't get attribute 'LanguageModel' on <module '__main__' from '/home/.conda/envs/sppy36/bin/gunicorn'>
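To work around this, I also included the class definition in the wsgi.py file, but the class definition still cannot be found when the pickled file is loaded. I still don't know where I need to put the class definition.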
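Solution
The problem is that, when the model is unpickled, the class definition is looked up in the __main__ module, which under gunicorn is the gunicorn executable (as the traceback shows). That is why explicitly defining the class in both .py files did not have the intended effect when running under gunicorn, yet worked under Flask, where the script that defines the class is itself __main__. To get around this, I defined the class explicitly in the gunicorn executable, and it worked. For now, this is the workable solution I have found.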
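The lookup behaviour described above can be reproduced with the standard library alone. The following is a minimal sketch, not the original application: the module name saved_from is hypothetical and stands in for whatever script was __main__ when the model was saved, and the plain LanguageModel class stands in for the real nn.Module subclass.

```python
import pickle
import sys
import types

# "saved_from" is a hypothetical module name standing in for the script
# that originally pickled the model (its __main__ at save time).
saved_from = types.ModuleType("saved_from")
sys.modules["saved_from"] = saved_from


class LanguageModel:
    """Plain stand-in for the real nn.Module subclass."""

    def __init__(self, hidden_size):
        self.hidden_size = hidden_size


# Make the class look as if it had been defined inside "saved_from".
LanguageModel.__module__ = "saved_from"
saved_from.LanguageModel = LanguageModel

payload = pickle.dumps(LanguageModel(128))

# The stream stores the module and class *names*, not the class body.
names_in_payload = b"saved_from" in payload and b"LanguageModel" in payload

# Simulate a process whose module of that name lacks the class,
# as gunicorn's __main__ does.
del saved_from.LanguageModel
try:
    pickle.loads(payload)
    load_failed = False
except AttributeError:
    load_failed = True

# Putting the class back on that module makes the same bytes load again.
saved_from.LanguageModel = LanguageModel
restored = pickle.loads(payload)
```

Because only the names are stored, the class has to be reachable under exactly the module it was saved from, which is why placing it in the gunicorn executable made the load succeed.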
gunicorn.py
#!/home/user/anaconda3/envs/envName/bin/python
import re
import sys
from gunicorn.app.wsgiapp import run
import torch
import torch.nn as nn
from torch.autograd import Variable

USE_CUDA = torch.cuda.is_available()

if __name__ == '__main__':
    # defining model class
    class LanguageModel(nn.Module):
        def __init__(self, vocab_size, embedding_size, hidden_size, n_layers=1, dropout_p=0.5):
            # Defining layers
            super(LanguageModel, self).__init__()
            self.n_layers = n_layers
            self.hidden_size = hidden_size
            self.embed = nn.Embedding(vocab_size, embedding_size)
            self.rnn = nn.LSTM(embedding_size, hidden_size, n_layers, batch_first=True)
            self.linear = nn.Linear(hidden_size, vocab_size)
            self.dropout = nn.Dropout(dropout_p)

        def init_weight(self):
            # self.embed.weight = nn.init.xavier_uniform(self.embed.weight)
            self.embed.weight.data.copy_(torch.from_numpy(new_w))
            self.linear.weight = nn.init.xavier_uniform(self.linear.weight)
            self.linear.bias.data.fill_(0)

        def init_hidden(self, batch_size):
            hidden = Variable(torch.zeros(self.n_layers, batch_size, self.hidden_size))
            context = Variable(torch.zeros(self.n_layers, batch_size, self.hidden_size))
            return (hidden.cuda(), context.cuda()) if USE_CUDA else (hidden, context)

        def detach_hidden(self, hiddens):
            return tuple([hidden.detach() for hidden in hiddens])

        def forward(self, inputs, hidden, is_training=False):
            embeds = self.embed(inputs)
            if is_training:
                embeds = self.dropout(embeds)
            out, hidden = self.rnn(embeds, hidden)
            return self.linear(out.contiguous().view(out.size(0) * out.size(1), -1)), hidden

    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(run())
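
A possible variation that avoids editing the gunicorn executable is to attach the class to the running __main__ module from application code before the model is loaded. This is only a sketch of that idea and was not part of the fix above: expose_on_main and build_model_class are hypothetical helpers, and the class built here is a plain stand-in for the real LanguageModel.

```python
import pickle
import sys


def expose_on_main(cls):
    """Attach cls to whatever module is currently running as __main__."""
    setattr(sys.modules["__main__"], cls.__name__, cls)
    return cls


def build_model_class():
    # Stand-in for the real LanguageModel. It is built inside a function so
    # that its only link to __main__ is the one created explicitly below.
    return type("LanguageModel", (), {"__module__": "__main__"})


model_cls = expose_on_main(build_model_class())
instance = model_cls()
instance.hidden_size = 128
payload = pickle.dumps(instance)  # records "__main__" and "LanguageModel"

# Simulate gunicorn's __main__, which does not define the class.
delattr(sys.modules["__main__"], "LanguageModel")

# The workaround: expose the class before unpickling.
expose_on_main(model_cls)
restored = pickle.loads(payload)
```

In the application this would presumably mean calling expose_on_main(LanguageModel) in m_api.py just before torch.load(wordModel), with the class defined at module level there.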