Deploying a Keras model with TensorFlow Serving gets 501 Server Error: Not Implemented for url: http://localhost:8501/v1/models/genre:predict

Problem description

I saved a Keras .h5 model to .pb using SavedModelBuilder. After deploying the model with the tensorflow/serving:1.14.0 Docker image, running my prediction code fails with "requests.exceptions.HTTPError: 501 Server Error: Not Implemented for url: http://localhost:8501/v1/models/genre:predict".

The model-building code is as follows:

from keras import backend as K
import tensorflow as tf   
from keras.models import load_model

model=load_model('/home/li/model.h5')

model_signature = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={'input': model.input}, outputs={'output': model.output})
#export_path = os.path.join(model_path,model_version)
export_path = "/home/li/genre/1" 

builder = tf.saved_model.builder.SavedModelBuilder(export_path)
builder.add_meta_graph_and_variables(
    sess=K.get_session(),
    tags=[tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        'predict':
        model_signature,
        'serving_default':
        model_signature
    })
builder.save()
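
As a side note, the 'serving_default' key is exactly tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY in the TF 1.x API, so the map can reference the constant instead of a hand-typed string; a minimal sketch:

import tensorflow as tf

# 'serving_default' is the signature TF Serving resolves when a request
# does not name one explicitly; using the constant avoids typos.
signature_def_map = {
    'predict': model_signature,
    tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
        model_signature,
}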

This gave me the .pb model: [screenshot of the exported model's directory structure]

When I run saved_model_cli show --dir /home/li/genre/1 --all, the information for the saved .pb model is as follows:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['predict']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1, 128, 1292)
        name: conv2d_1_input_2:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['output'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 19)
        name: dense_2_2/Softmax:0
  Method name is: tensorflow/serving/predict

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1, 128, 1292)
        name: conv2d_1_input_2:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['output'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 19)
        name: dense_2_2/Softmax:0
  Method name is: tensorflow/serving/predict

The command I used to deploy on the tensorflow/serving Docker image is

docker run -p 8501:8501 --name tfserving_genre --mount type=bind,source=/home/li/genre,target=/models/genre -e MODEL_NAME=genre -t tensorflow/serving:1.14.0 &

When I open http://localhost:8501/v1/models/genre in a browser, I get the message

{
 "model_version_status": [
  {
   "version": "1",
   "state": "AVAILABLE",
   "status": {
    "error_code": "OK",
    "error_message": ""
   }
  }
 ]
}
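
Since the status endpoint reports AVAILABLE, another useful check is TF Serving's REST metadata endpoint, which echoes the signatures the server actually loaded (and docker logs tfserving_genre shows the server-side log). A quick sketch using requests:

import requests

# The /metadata route returns the model spec plus the loaded signature
# definitions, mirroring the saved_model_cli output above.
resp = requests.get("http://localhost:8501/v1/models/genre/metadata")
resp.raise_for_status()
print(resp.json())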

The client prediction code is as follows:

import requests
import numpy as np
import os
import sys

from audio_to_spectrum_v2 import split_song_to_frames


# Define a Base client class for Tensorflow Serving
class TFServingClient:
    """
    This is a base class that implements a Tensorflow Serving client
    """
    TF_SERVING_URL_FORMAT = '{protocol}://{hostname}:{port}/v1/models/{endpoint}:predict'

    def __init__(self, hostname, port, endpoint, protocol="http"):
        self.protocol = protocol
        self.hostname = hostname
        self.port = port
        self.endpoint = endpoint

    def _query_service(self, req_json):
        """
        :param req_json: dict (as defined in https://cloud.google.com/ml-engine/docs/v1/predict-request)
        :return: dict
        """
        server_url = self.TF_SERVING_URL_FORMAT.format(protocol=self.protocol,
                                                       hostname=self.hostname,
                                                       port=self.port,
                                                       endpoint=self.endpoint)
        response = requests.post(server_url, json=req_json)
        response.raise_for_status()
        print(response.json())
        return np.array(response.json()['output'])


# Define a specific client for our genre model
class GenreClient(TFServingClient):
    # INPUT_NAME is the config value we used when saving the model (the only value in the `input_names` list)
    INPUT_NAME = "input"

    def load_song(self, song_path):
        """Load a song from path,slices to pieces, and extract features, returned as np.array format"""

        song_pieces = split_song_to_frames(song_path,False,30)
        return song_pieces

    def predict(self, song_path):
        song_pieces = self.load_song(song_path)

        # Create a request json dict
        req_json = {
                "instances": song_pieces.tolist()
        }
        print(req_json)
        return self._query_service(req_json)

def main():    
    song_path=sys.argv[1]

    print("file name:{}".format(os.path.split(song_path)[-1]))

    hostname = "localhost"
    port = "8501"
    endpoint="genre"
    client = GenreClient(hostname=hostname, port=port, endpoint=endpoint)

    prediction = client.predict(song_path)
    print(prediction)

if __name__=='__main__':
    main()
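
For reference, the script takes the song path as its only command-line argument; used as a module, it would look roughly like this (the module name and song path are assumptions for illustration):

from client_predict import GenreClient  # assuming the file above is client_predict.py

client = GenreClient(hostname="localhost", port="8501", endpoint="genre")
prediction = client.predict("/home/li/songs/example.mp3")  # hypothetical path
print(prediction)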

After running the prediction code, I get the following error message:

Traceback (most recent call last):
  File "client_predict.py", line 90, in <module>
    main()
  File "client_predict.py", line 81, in main
    prediction = client.predict(song_path)
  File "client_predict.py", line 69, in predict
    return self._query_service(req_json)
  File "client_predict.py", line 40, in _query_service
    response.raise_for_status()
  File "/home/li/anaconda3/lib/python3.7/site-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 501 Server Error: Not Implemented for url: http://localhost:8501/v1/models/genre:predict

I would like to know what causes this deployment problem and how to solve it. Thanks, everyone.

Tags: tensorflow, keras, python-requests, tensorflow-serving

Solution


I tried to print the response body with

import json

pred = json.loads(r.content.decode('utf-8'))  # r is the requests response object
print(pred)
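
Equivalently, the error body can be surfaced inside _query_service before raise_for_status() fires, since requests keeps the body even on an error status; a minimal sketch:

response = requests.post(server_url, json=req_json)
if not response.ok:
    # TF Serving returns the underlying error text in the response body,
    # which is far more informative than the bare 501 status line.
    print(response.text)
response.raise_for_status()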

The problem was caused by "conv implementation currently only supports the NHWC tensor format".

In the end, I changed the data format from NCHW to NHWC in the Conv2D layers.
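
The shape (-1, 1, 128, 1292) in the signature above is channels-first (NCHW), which the CPU conv kernels reject per the error message. In Keras this means building (and re-saving) the model with channels-last layers; a minimal sketch, with the filter count and kernel size assumed for illustration:

from keras.models import Sequential
from keras.layers import Conv2D

# channels_last == NHWC: the single channel moves to the last axis,
# so the input shape becomes (128, 1292, 1) instead of (1, 128, 1292).
model = Sequential([
    Conv2D(32, (3, 3), activation='relu',
           data_format='channels_last',
           input_shape=(128, 1292, 1)),
    # ... remaining layers unchanged ...
])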

