Google AI Platform: The replica master 0 exited with a non-zero status of 127

Problem description

There's a similar SO question: Tensorflow on ML Engine: The replica master 0 exited with a non-zero status of 1

But here, I'm encountering error "127" instead. As in that question, I launched a PyTorch custom training container on AI Platform (previously ML Engine), and after about 2 minutes I got the error message "The replica master 0 exited with a non-zero status of 127".

The documentation here doesn't quite say what "127" means: https://cloud.google.com/ai-platform/training/docs/troubleshooting#understanding_training_application_return_codes

Anyone have an idea?

Tags: google-cloud-ml

Solution
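One thing worth checking, offered as a guess rather than a confirmed diagnosis: in POSIX shells, exit status 127 conventionally means "command not found". If the container's entrypoint or command is run through a shell and names an executable that is not installed or not on PATH inside the image, the process would exit with 127 before any training code runs. The sketch below only reproduces that shell convention locally. The command name `hypothetical_missing_trainer` is made up for illustration and has nothing to do with the AI Platform API.

```python
import subprocess


def run_in_shell(command: str) -> int:
    """Run `command` through `sh -c` and return the shell's exit status."""
    completed = subprocess.run(
        ["sh", "-c", command],
        stderr=subprocess.DEVNULL,  # hide the shell's "not found" message
    )
    return completed.returncode


# Hypothetical executable that does not exist on PATH (assumption for the demo).
missing_status = run_in_shell("hypothetical_missing_trainer --epochs 1")

# A command that exists and succeeds, for comparison.
ok_status = run_in_shell("true")

print("missing command exit status:", missing_status)
print("existing command exit status:", ok_status)
```

If the first status printed is 127 on your machine as well, a reasonable next step is to run the training image locally with the same command and confirm that the entrypoint binary (for example, the Python interpreter or a launch script) actually exists at the path the container expects.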

