Python AWS SageMaker: Algorithm Error caused by UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)

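Problem description

I am learning to train machine learning models and run batch transforms on AWS SageMaker using Python 3.6.

I was able to train a Linear Learner model successfully, but the transform job fails with this error:

UnexpectedStatusException: Error for Transform job linear-learner-2019-08-30-11-22-02-821: Failed. Reason: ClientError: See job logs for more information

which maps to this error in the CloudWatch logs:

Algorithm Error: (caused by UnicodeEncodeError)
Caused by: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)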
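For readers unfamiliar with this message, here is a minimal sketch of how Python produces it. It only illustrates the exception itself and makes no claim about where inside the SageMaker container it is raised. The sample string is a hypothetical stand-in with non-ASCII characters at positions 1 and 2.

```python
# Hypothetical sample: 'a', two non-ASCII characters, then 'b'.
text = "a\u00e9\u00e8b"

try:
    # The ASCII codec only covers code points 0-127, so this fails.
    text.encode("ascii")
    message = None
except UnicodeEncodeError as exc:
    # exc.start and exc.end delimit the offending slice of the input.
    message = str(exc)
    bad_span = (exc.start, exc.end)

print(message)
```

The reported positions index into the string being encoded, so "position 1-2" means the second and third characters of whatever text the algorithm tried to encode.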

The code I am using is below.
Create and train the model

import io
import os

import boto3
import numpy as np
import sagemaker
import sagemaker.amazon.common as smac

sess = sagemaker.Session()

linear = sagemaker.estimator.Estimator(container,
                                       role, 
                                       train_instance_count=1, 
                                       train_instance_type='ml.c4.xlarge',
                                       output_path=output_location,
                                       sagemaker_session=sess)
linear.set_hyperparameters(feature_dim=18,
                           predictor_type='regressor')

linear.fit({'train': s3_train_data})

Create the input and output S3 locations for the test data

batch_input = 's3://{}/{}/test/examples'.format(bucket, prefix)  # The location of the test dataset
batch_output = 's3://{}/{}/batch-inference'.format(bucket, prefix)  # The location to store the results of the batch transform job

print(batch_input)
print(batch_output)

Transform the test data

housing_test = strat_test_set
housing_test_inputs = full_pipeline.transform(housing_test)
housing_test_inputs = np.float32(housing_test_inputs)
housing_test_labels = strat_test_set['median_house_value'].values
housing_test_labels = np.float32(housing_test_labels)

Check that the number of features is consistent

print(housing_test_labels.shape)
print(housing_test_inputs.shape)
print(housing_labels.shape)
print(housing_inputs.shape)

The values returned by the code above are

(4128,)
(4128, 18)
(16512,)
(16512, 18)

Upload the test data to S3

buf = io.BytesIO()
smac.write_numpy_to_dense_tensor(buf, housing_test_inputs, housing_test_labels)
buf.seek(0)
key = 'examples'
boto3.resource('s3').Bucket(bucket).Object(os.path.join(prefix, 'test', key)).upload_fileobj(buf)
s3_test_data = 's3://{}/{}/test/{}'.format(bucket, prefix, key)
print('uploaded test data location: {}'.format(s3_test_data))

Predict values

transformer = linear.transformer(instance_count=1, instance_type='ml.m4.xlarge', output_path=batch_output)

transformer.transform(data=batch_input, data_type='S3Prefix', content_type='text/csv', split_type='Line')

transformer.wait()

The error I get in the console is:

UnexpectedStatusException                 Traceback (most recent call last)
 in <module>()
      3 transformer.transform(data=batch_input, data_type='S3Prefix', content_type='text/csv', split_type='Line')
      4 
----> 5 transformer.wait()

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/transformer.py in wait(self)
    227         """Placeholder docstring"""
    228         self._ensure_last_transform_job()
--> 229         self.latest_transform_job.wait()
    230 
    231     def _ensure_last_transform_job(self):

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/transformer.py in wait(self)
    344 
    345     def wait(self):
--> 346         self.sagemaker_session.wait_for_transform_job(self.job_name)
    347 
    348     @staticmethod

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/session.py in wait_for_transform_job(self, job, poll)
   1050         """
   1051         desc = _wait_until(lambda: _transform_job_status(self.sagemaker_client, job), poll)
-> 1052         self._check_job_status(job, desc, "TransformJobStatus")
   1053         return desc
   1054 

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/session.py in _check_job_status(self, job, desc, status_key_name)
   1077                 ),
   1078                 allowed_statuses=["Completed", "Stopped"],
-> 1079                 actual_status=status,
   1080             )
   1081 

UnexpectedStatusException: Error for Transform job linear-learner-2019-08-30-11-22-02-821: Failed. Reason: ClientError: See job logs for more information

Can someone suggest what the root cause of this error is and how to fix it?

Tags: python, unicode, character-encoding, amazon-sagemaker

Solution
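One likely cause, offered as an assumption rather than a confirmed diagnosis: the upload step writes the test set with smac.write_numpy_to_dense_tensor, which produces RecordIO-protobuf, while transformer.transform declares content_type='text/csv' and split_type='Line' for that same S3 object. The container would then be parsing binary records as text. Two ways to make the two sides agree are to declare content_type='application/x-recordio-protobuf' with split_type='RecordIO', or to upload the features as header-less CSV. The sketch below illustrates the second option using only the standard library. The names rows_to_csv_bytes and sample_rows are hypothetical, and the tiny matrix stands in for housing_test_inputs.

```python
import csv
import io


def rows_to_csv_bytes(rows):
    """Serialize feature rows as header-less CSV, one record per line.

    Labels are deliberately left out: for inference, only the feature
    columns are sent.
    """
    text_buf = io.StringIO()
    writer = csv.writer(text_buf, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
    # Plain numbers and separators only, so ASCII encoding is safe here.
    return text_buf.getvalue().encode("ascii")


# Hypothetical stand-in for housing_test_inputs (the real matrix has 18 columns).
sample_rows = [[0.5, -1.25, 3.0], [1.0, 2.0, -0.75]]
payload = rows_to_csv_bytes(sample_rows)

# A file-like object that could be passed to the existing upload_fileobj call.
upload_buf = io.BytesIO(payload)
```

With a payload like this at the batch_input location, the existing content_type='text/csv' and split_type='Line' arguments would describe the data accurately. Whether this resolves the failure would need to be confirmed against the job logs.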

