AWS SageMaker DeepAR validation error: Additional properties are not allowed ('training' was unexpected)

Problem description

I don't know what the problem is. Here is the code:

estimator = sagemaker.estimator.Estimator(
    image_uri=image_name,
    sagemaker_session=sagemaker_session,
    role=role,
    train_instance_count=1,
    train_instance_type="ml.m5.large",
    base_job_name="deepar-stock",
    output_path=s3_output_path,
)

hyperparameters = {
    "time_freq": "24H",
    "epochs": "100",
    "early_stopping_patience": "10",
    "mini_batch_size": "64",
    "learning_rate": "5E-4",
    "context_length": str(context_length),
    "prediction_length": str(prediction_length),
    "likelihood": "gaussian",
}

estimator.set_hyperparameters(**hyperparameters)

%%time

estimator.fit(inputs=f"{s3_data_path}/train/")

When I try to train the model, I get the following error (shown in full).

---------------------------------------------------------------------------
UnexpectedStatusException                 Traceback (most recent call last)
<timed eval> in <module>

/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in fit(self, inputs, wait, logs, job_name, experiment_config)
    681         self.jobs.append(self.latest_training_job)
    682         if wait:
--> 683             self.latest_training_job.wait(logs=logs)
    684 
    685     def _compilation_job_name(self):

/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in wait(self, logs)
   1626         # If logs are requested, call logs_for_jobs.
   1627         if logs != "None":
-> 1628             self.sagemaker_session.logs_for_job(self.job_name, wait=True, log_type=logs)
   1629         else:
   1630             self.sagemaker_session.wait_for_job(self.job_name)

/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in logs_for_job(self, job_name, wait, poll, log_type)
   3658 
   3659         if wait:
-> 3660             self._check_job_status(job_name, description, "TrainingJobStatus")
   3661             if dot:
   3662                 print()

/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in _check_job_status(self, job, desc, status_key_name)
   3218                 ),
   3219                 allowed_statuses=["Completed", "Stopped"],
-> 3220                 actual_status=status,
   3221             )
   3222 
UnexpectedStatusException: Error for Training job deepar-2021-07-31-22-25-54-110: Failed. Reason: ClientError: Unable to initialize the algorithm. Failed to validate input data configuration. (caused by ValidationError)

Caused by: Additional properties are not allowed ('training' was unexpected)

Failed validating 'additionalProperties' in schema:
    {'$schema': 'http://json-schema.org/draft-04/schema#',
     'additionalProperties': False,
     'anyOf': [{'required': ['train']}, {'required': ['state']}],
     'definitions': {'data_channel': {'properties': {'ContentType': {'enum': ['json',
                                                                              'json.gz',
                                                                              'parquet',
                                                                              'auto'],
                                                                     'type': 'string'},
                                                     'RecordWrapperType': {'enum': ['None'],

On instance:
    {'training': {'RecordWrapperType': 'None',
                  'S3DistributionType': 'FullyReplicated',
                  'TrainingInputMode': 'File'}}

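It says 'training' was unexpected. I don't know why 'training' appears in the last part of the message, under On instance:. I don't know how to fix this. I have looked at other pages for help, but I couldn't find a direct answer. I know my data is structured correctly. The error seems to be related to the hyperparameters, but I'm not sure. Please help!

Tags: python, amazon-sagemaker, deepar

Solution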


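I just needed to add one line that names the data channel explicitly, and change the fit call to use it, as shown here. When inputs is a plain S3 URI string, the SageMaker Python SDK puts it in a channel named "training", but the DeepAR schema in the error message only accepts "train" (see 'required': ['train']), so validation fails.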

data_channels = {"train": f"{s3_data_path}/train/"}

estimator.fit(inputs=data_channels)
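
To catch this kind of mistake before a training job is launched, the channel names can be checked locally. The following is only a rough sketch: build_deepar_channels and check_deepar_channels are hypothetical helpers (not part of the SageMaker SDK), s3://my-bucket/deepar is a placeholder path, and the allowed set assumes DeepAR takes a required "train" channel plus an optional "test" channel (the 'state' entry visible in the schema above is not covered here).

```python
# Hypothetical helpers, not part of the SageMaker SDK.
# Assumption: DeepAR accepts a required "train" channel and an optional "test" channel.
DEEPAR_CHANNELS = {"train", "test"}


def build_deepar_channels(s3_data_path, include_test=False):
    """Return a channel-name -> S3 URI mapping for estimator.fit(inputs=...)."""
    base = s3_data_path.rstrip("/")
    channels = {"train": f"{base}/train/"}
    if include_test:
        channels["test"] = f"{base}/test/"
    return channels


def check_deepar_channels(channels):
    """Raise ValueError early if a channel name is outside the assumed allowed set."""
    unexpected = set(channels) - DEEPAR_CHANNELS
    if unexpected:
        raise ValueError(f"Unexpected channel name(s): {sorted(unexpected)}")
    if "train" not in channels:
        raise ValueError("DeepAR needs a 'train' channel")
    return channels


# Placeholder bucket and prefix; substitute your own s3_data_path.
data_channels = check_deepar_channels(build_deepar_channels("s3://my-bucket/deepar"))
print(data_channels)

# Then, with the estimator from the question:
# estimator.fit(inputs=data_channels)
```

With this check in place, a dictionary keyed by "training" would raise a ValueError locally instead of failing later inside the training job.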
