AI Platform Pipelines instance fails to deploy

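Problem description

We are using AI Platform Pipelines to manage a Kubeflow Pipelines installation on a GKE cluster. However, the usual deployment process through the UI seems to have stopped working. When I try to deploy a pipelines instance to an existing cluster, I get the following error: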

 Failed to create CustomResourceDefinition.

{"metadata":{},"status":"Failure","message":"CustomResourceDefinition.apiextensions.k8s.io \"applications.app.k8s.io\" is invalid: [spec.versions[0].schema.openAPIV3Schema: Required value: schemas are required, spec.versions: Invalid value: []apiextensions.CustomResourceDefinitionVersion{apiextensions.CustomResourceDefinitionVersion{Name:\"v1beta1\", Served:false, Storage:false, Schema:(*apiextensions.CustomResourceValidation)(nil), Subresources:(*apiextensions.CustomResourceSubresources)(nil), AdditionalPrinterColumns:[]apiextensions.CustomResourceColumnDefinition(nil)}}: must have exactly one version marked as storage version, status.storedVersions: Invalid value: []string(nil): must have at least one stored version, metadata.annotations[api-approved.kubernetes.io]: Required value: protected groups must have approval annotation \"api-approved.kubernetes.io\", see https://github.com/kubernetes/enhancements/pull/1111]","reason":"Invalid","details":{"name":"applications.app.k8s.io","group":"apiextensions.k8s.io","kind":"CustomResourceDefinition","causes":[{"reason":"FieldValueRequired","message":"Required value: schemas are required","field":"spec.versions[0].schema.openAPIV3Schema"},{"reason":"FieldValueInvalid","message":"Invalid value: []apiextensions.CustomResourceDefinitionVersion{apiextensions.CustomResourceDefinitionVersion{Name:\"v1beta1\", Served:false, Storage:false, Schema:(*apiextensions.CustomResourceValidation)(nil), Subresources:(*apiextensions.CustomResourceSubresources)(nil), AdditionalPrinterColumns:[]apiextensions.CustomResourceColumnDefinition(nil)}}: must have exactly one version marked as storage version","field":"spec.versions"},{"reason":"FieldValueInvalid","message":"Invalid value: []string(nil): must have at least one stored version","field":"status.storedVersions"},{"reason":"FieldValueRequired","message":"Required value: protected groups must have approval annotation \"api-approved.kubernetes.io\", see https://github.com/kubernetes/enhancements/pull/1111","field":"metadata.annotations[api-approved.kubernetes.io]"}]},"code":422}
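
The response is hard to read as a single line, so the individual causes can be pulled out with a short script. This is only a minimal sketch, assuming the response body has been saved as a string; the function name `summarize_status_causes` is hypothetical, and the sample is abridged to two of the four causes reported above.

```python
import json


def summarize_status_causes(status_json: str) -> list:
    """Return (field, reason, message) tuples for each cause in a Status response."""
    status = json.loads(status_json)
    # "details.causes" is where the response above lists each validation failure.
    causes = status.get("details", {}).get("causes", [])
    return [
        (c.get("field", ""), c.get("reason", ""), c.get("message", ""))
        for c in causes
    ]


# Abridged copy of the response above (two of the four causes, text unchanged).
sample = json.dumps({
    "status": "Failure",
    "reason": "Invalid",
    "code": 422,
    "details": {
        "name": "applications.app.k8s.io",
        "group": "apiextensions.k8s.io",
        "kind": "CustomResourceDefinition",
        "causes": [
            {
                "reason": "FieldValueRequired",
                "message": "Required value: schemas are required",
                "field": "spec.versions[0].schema.openAPIV3Schema",
            },
            {
                "reason": "FieldValueInvalid",
                "message": "Invalid value: []string(nil): must have at least one stored version",
                "field": "status.storedVersions",
            },
        ],
    },
})

for field, reason, message in summarize_status_causes(sample):
    print(f"{field}: {reason}: {message}")
```

Run against the full response, this would list all four failing fields (`spec.versions[0].schema.openAPIV3Schema`, `spec.versions`, `status.storedVersions`, and `metadata.annotations[api-approved.kubernetes.io]`) one per line.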

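I have tried multiple clusters, two separate projects, two separate regions, and three different GKE versions: 1.18.17-gke.700, 1.18.17-gke.1200, and 1.19.9-gke.1900. The same error occurs in every case. The clusters meet the resource requirements listed in the GCP documentation.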

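There isn't much information here, and I'm not sure how to debug this. If there is other useful information I can collect, please let me know. I can't determine which version of Kubeflow Pipelines is being used; as far as I can tell, it isn't visible until after the instance has been created.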

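Is this something I should take up with GCP support, or should I dig further to see whether there is a bug? I tried searching for some of the specific errors in the failure message above but didn't find much. The pull request it mentions has already been merged: https://github.com/kubernetes/enhancements/pull/1111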

Tags: google-cloud-platform, kubeflow, kubeflow-pipelines, google-cloud-ai-platform-pipelines
