
Problem description

I am trying to invoke a MultiModel endpoint with a RandomCutForest model, and I get the error "Error loading model". I can invoke the endpoint with the models given in the examples. Am I missing something, such as a restriction on which models I can use?

For the MultiModel setup I used the following for inspiration:

https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/multi_model_xgboost_home_value/xgboost_multi_model_endpoint_home_value.ipynb

https://aws.amazon.com/blogs/machine-learning/save-on-inference-costs-by-using-amazon-sagemaker-multi-model-endpoints/

I am trying to deploy the "model.tar.gz" output by the RCF example below to the MultiModel endpoint:

https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/random_cut_forest/random_cut_forest.ipynb

model_name = 'model'
full_model_name = '{}.tar.gz'.format(model_name)
features = data  # one row of numeric features

# Serialize the features as a single CSV line
body = ','.join(map(str, features)) + '\n'
response = runtime_sm_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='text/csv',
    TargetModel=full_model_name,  # which artifact to load from the MME's S3 prefix
    Body=body,
)
print(response)

CloudWatch log error:

> 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Error loading model: Unable
> to load model: invalid load key, '{'. [17:28:59]
> /workspace/src/learner.cc:334: Check failed: fi->Read(&mparam_,
> sizeof(mparam_)) == sizeof(mparam_) (25 vs. 136) : BoostLearner: wrong
> model format 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Stack trace: 2020-04-27
> 17:28:59,005 [INFO ] W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   [bt] (0)
> /miniconda3/xgboost/libxgboost.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x24)
> [0x7f37ce1cacb4] 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9 com.amazonaws.ml.mms.wlm.WorkerThread
> - Backend response time: 0 2020-04-27 17:28:59,005 [INFO ] W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   [bt] (1)
> /miniconda3/xgboost/libxgboost.so(xgboost::LearnerImpl::Load(dmlc::Stream*)+0x4b5)
> [0x7f37ce266985] 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   [bt] (2)
> /miniconda3/xgboost/libxgboost.so(XGBoosterLoadModel+0x37)
> [0x7f37ce1bf417] 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   [bt] (3)
> /miniconda3/lib/python3.7/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c)
> [0x7f37ee993ec0] 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   [bt] (4)
> /miniconda3/lib/python3.7/lib-dynload/../../libffi.so.6(ffi_call+0x22d)
> [0x7f37ee99387d] 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   [bt] (5)
> /miniconda3/lib/python3.7/lib-dynload/_ctypes.cpython-37m-x86_64-linux-gnu.so(_ctypes_callproc+0x2ce)
> [0x7f37eeba91de] 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   [bt] (6)
> /miniconda3/lib/python3.7/lib-dynload/_ctypes.cpython-37m-x86_64-linux-gnu.so(+0x12c14)
> [0x7f37eeba9c14] 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   [bt] (7)
> /miniconda3/bin/python(_PyObject_FastCallKeywords+0x48b)
> [0x563d71b4218b] 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle -   [bt] (8)
> /miniconda3/bin/python(_PyEval_EvalFrameDefault+0x52cf)
> [0x563d71b91e8f] 2020-04-27 17:28:59,005 [INFO ]
> W-9003-b39b888fb4a3fa6cf83bb34a9-stdout
> com.amazonaws.ml.mms.wlm.WorkerLifeCycle -  2020-04-27 17:28:59,005
> [WARN ] W-9003-b39b888fb4a3fa6cf83bb34a9
> com.amazonaws.ml.mms.wlm.WorkerThread - Backend worker thread
> exception. java.lang.IllegalArgumentException: reasonPhrase contains
> one of the following prohibited characters: \r\n: Unable to load
> model: Unable to load model: invalid load key, '{'. [17:28:59]
> /workspace/src/learner.cc:334: Check failed: fi->Read(&mparam_,
> sizeof(mparam_)) == sizeof(mparam_) (25 vs. 136) : BoostLearner: wrong
> model format Stack trace:   [bt] (0)
> /miniconda3/xgboost/libxgboost.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x24)
> [0x7f37ce1cacb4]   [bt] (1)
> /miniconda3/xgboost/libxgboost.so(xgboost::LearnerImpl::Load(dmlc::Stream*)+0x4b5)
> [0x7f37ce266985]   [bt] (2)
> /miniconda3/xgboost/libxgboost.so(XGBoosterLoadModel+0x37)
> [0x7f37ce1bf417]   [bt] (3)
> /miniconda3/lib/python3.7/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c)
> [0x7f37ee993ec0]   [bt] (4)
> /miniconda3/lib/python3.7/lib-dynload/../../libffi.so.6(ffi_call+0x22d)
> [0x7f37ee99387d]   [bt] (5)
> /miniconda3/lib/python3.7/lib-dynload/_ctypes.cpython-37m-x86_64-linux-gnu.so(_ctypes_callproc+0x2ce)
> [0x7f37eeba91de]   [bt] (6)
> /miniconda3/lib/python3.7/lib-dynload/_ctypes.cpython-37m-x86_64-linux-gnu.so(+0x12c14)
> [0x7f37eeba9c14]   [bt] (7)
> /miniconda3/bin/python(_PyObject_FastCallKeywords+0x48b)
> [0x563d71b4218b]   [bt] (8)
> /miniconda3/bin/python(_PyEval_EvalFrameDefault+0x52cf)
> [0x563d71b91e8f]

Tags: amazon-web-services, machine-learning, xgboost, amazon-sagemaker

Solution


SageMaker Random Cut Forest is part of the built-in algorithm library and cannot be deployed behind a multi-model endpoint (MME). Built-in algorithms currently cannot be deployed to MMEs; XGBoost is the exception, because it has an open-source container: https://github.com/aws/sagemaker-xgboost-container
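To make the constraint concrete: an MME is defined by pointing a single container image at an S3 prefix with `Mode='MultiModel'`, so every `model.tar.gz` under that prefix must be loadable by that one image. A minimal sketch of the boto3 parameters (the image URI, bucket, names, and role below are placeholders, not values from the question):

```python
# Placeholder values; a real call needs a valid ECR image URI, role ARN, and boto3.
container = {
    'Image': '<account>.dkr.ecr.<region>.amazonaws.com/sagemaker-xgboost:latest',
    'ModelDataUrl': 's3://my-bucket/mme-models/',  # prefix holding every model.tar.gz
    'Mode': 'MultiModel',  # one container image serves ALL models under the prefix
}

# import boto3
# sm = boto3.client('sagemaker')
# sm.create_model(ModelName='mme-demo',
#                 ExecutionRoleArn='<role-arn>',
#                 PrimaryContainer=container)
```

Because every `TargetModel` artifact is loaded by this single image, an RCF `model.tar.gz` handed to the XGBoost container fails exactly as in the log above ("invalid load key" / "wrong model format").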

If you really need to deploy RCF to a multi-model endpoint, one option is to find a reasonably similar open-source implementation (for example rrcf looks quite solid: it is based on the same paper by Guha et al. and has 170+ GitHub stars) and build a custom MME Docker container. The documentation is here, and here is a good tutorial.
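A custom MME container has to implement the inference-handler contract of the model server (the `handle(data, context)` entry point that SageMaker's multi-model-server invokes per request). A minimal self-contained sketch, assuming CSV input like the `invoke_endpoint` call in the question; `DummyRCFService` and its mean-absolute-value "score" are illustrative stand-ins for a real rrcf-backed scorer:

```python
import json

class DummyRCFService:
    """Stand-in scorer; a real container would load an rrcf model from the
    model directory the server unpacks for the requested TargetModel."""
    def __init__(self):
        self.initialized = False

    def initialize(self, context):
        # In a real handler, context.system_properties['model_dir'] points
        # at the unpacked model.tar.gz; load the model artifact here.
        self.initialized = True

    def inference(self, features):
        # Placeholder anomaly score: mean absolute value of the features.
        return sum(abs(f) for f in features) / len(features)

_service = DummyRCFService()

def handle(data, context):
    """Entry point the model server calls; `data` is a list of request
    dicts whose 'body' holds the raw CSV payload."""
    if not _service.initialized:
        _service.initialize(context)
    if data is None:
        return None
    out = []
    for row in data:
        body = row.get('body')
        if isinstance(body, (bytes, bytearray)):
            body = body.decode('utf-8')
        features = [float(x) for x in body.strip().split(',')]
        out.append(json.dumps({'score': _service.inference(features)}))
    return out
```

The handler mirrors the question's request shape: a `text/csv` body of comma-separated floats in, one JSON score per request out.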

