ray.exceptions.RayTaskError(TuneError) HuggingFace+RayTune

Problem description

I am using Ray Tune with HuggingFace for hyperparameter tuning. Here is my code snippet:

from ray.tune.schedulers import PopulationBasedTraining
from ray.tune import uniform
from random import randint
scheduler = PopulationBasedTraining(
    mode = "max",
    metric='mean_accuracy',
    perturbation_interval=2,
    hyperparam_mutations={
        "weight_decay": lambda: uniform(0.0, 0.3),
        "learning_rate": lambda: uniform(1e-5, 5e-5),
        "per_gpu_train_batch_size": [16, 32, 64],
        "num_train_epochs": [2,3,4],
        "warmup_steps":lambda: randint(0, 500)
    }
)

best_trial = trainer.hyperparameter_search(
    direction="maximize",
    backend="ray",
    n_trials=4,
    keep_checkpoints_num=1,
    scheduler=scheduler)

However, it fails with the following error, and I don't understand why:

TuneError: ('Trials did not complete', [_inner_53895_00000, _inner_53895_00001, _inner_53895_00002, _inner_53895_00003])

Output: https://i.stack.imgur.com/1zmM7.png

Tags: python, machine-learning, huggingface-transformers, ray-tune

Solution
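
"Trials did not complete" is a generic wrapper, so the real cause is usually in the per-trial error logs. One thing that may be worth checking, offered as an assumption and not a confirmed diagnosis: `uniform` imported from `ray.tune` builds a search-space object instead of returning a number, so `lambda: uniform(0.0, 0.3)` may hand the scheduler an object where a float is expected, while `randint` from `random` does return a plain int. The sketch below uses only the standard library (no Ray) to show a mutation table in which every callable yields a concrete value. `sample_config` is a hypothetical helper for illustration; it is not Ray's actual perturbation logic.

```python
import random

# Same keys and ranges as the question, but every callable returns a
# plain number (stdlib random) instead of a search-space object.
hyperparam_mutations = {
    "weight_decay": lambda: random.uniform(0.0, 0.3),
    "learning_rate": lambda: random.uniform(1e-5, 5e-5),
    "per_gpu_train_batch_size": [16, 32, 64],
    "num_train_epochs": [2, 3, 4],
    "warmup_steps": lambda: random.randint(0, 500),
}


def sample_config(mutations):
    """Hypothetical helper: call callables, pick one item from lists."""
    config = {}
    for name, spec in mutations.items():
        config[name] = spec() if callable(spec) else random.choice(spec)
    return config


config = sample_config(hyperparam_mutations)
# Every value should be a plain int or float that a training loop can use.
print({name: type(value).__name__ for name, value in config.items()})
```

If the printed types are all `int` or `float`, the same dictionary can be passed as `hyperparam_mutations`. Depending on the Ray version, passing the search-space object directly without the `lambda` may also be supported. Whether this resolves the error in the question is not confirmed.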

