Velero backup of a GKE cluster fails

Problem description

I am trying to use Velero to create backups of a GKE cluster. Velero is installed successfully in the GKE cluster, as shown below:

   $ kubectl get deployment/velero --namespace velero
   NAME     READY   UP-TO-DATE   AVAILABLE   AGE
   velero   1/1     1            1           43h 

   $ kubectl get pods --namespace velero
   NAME                      READY   STATUS    RESTARTS    AGE
   velero-847c69f497-hwv6l   1/1     Running     0          43h  

I ran the following command to start a backup:

  $ velero backup create cluster1-backup --include-namespaces default --snapshot-volumes
  Backup request "cluster1-backup" submitted successfully.
  Run `velero backup describe cluster1-backup` or `velero backup logs cluster1-backup` for more details.

The backup appears to have failed:

  $ velero backup describe cluster1-backup
   Name:         cluster1-backup
   Namespace:    velero
   Labels:       velero.io/storage-location=default
   Annotations:  velero.io/source-cluster-k8s-gitversion=v1.15.12-gke.20
   velero.io/source-cluster-k8s-major-version=1
   velero.io/source-cluster-k8s-minor-version=15+

   Phase:  Failed (run `velero backup logs cluster1-backup` for more information)

   Errors:    0
   Warnings:  0

   Namespaces:
   Included:  default
   Excluded:  <none>

   Resources:
   Included:        *
   Excluded:        <none>
   Cluster-scoped:  auto
   Label selector:  <none>
   Storage Location:  default
   Velero-Native Snapshot PVs:  true
   TTL:  720h0m0s
   Hooks:  <none>
   Backup Format Version:  1.1.0

   Started:    2020-10-05 09:57:12 +0000 UTC
   Completed:  <n/a>

   Expiration:  2020-11-04 09:57:12 +0000 UTC
   Velero-Native Snapshots: <none included>

  $ velero get backups
  NAME              STATUS   ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
  cluster1-backup   Failed   0        0          2020-10-05 09:57:12 +0000 UTC   29d       default            <none>

The backup logs show the following:

$ velero backup logs cluster1-backup
An error occurred: timed out waiting for download URL

I am using a public GKE cluster in a Shared VPC, with Master Authorized Networks enabled for 35.235.240.0/20 only. Any suggestions for fixing this issue?

Tags: kubernetes, backup, google-kubernetes-engine, velero

Solution


The issue is now resolved.

I found the following error in the Velero server logs:

 kubectl logs deployment/velero -n velero
 

time="2020-10-05T13:41:19Z" level=error msg="Error getting backup store for this location" backupLocation=default controller=backup-sync error="backup storage location's bucket name \"gs://bucketname/\" must not contain a '/' (if using a prefix, put it in the 'Prefix' field instead)" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:110" error.function=github.com/vmware-tanzu/velero/pkg/persistence.NewObjectBackupStore logSource="pkg/controller/backup_sync_controller.go:168"
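The server log can be long, so it may help to keep only the error-level lines. The following is a minimal sketch; `velero_errors` is a hypothetical helper name, and it assumes the `level=error` field format seen in the log line above.

```shell
# Hypothetical helper: keep only error-level lines from Velero server
# log text read on stdin. Assumes the logfmt-style "level=error" field
# shown in the log line above.
velero_errors() {
  # "|| true" keeps the exit status at 0 when no error lines are found
  grep 'level=error' || true
}

# Usage (needs access to the cluster):
#   kubectl logs deployment/velero -n velero | velero_errors
```

Filtering this way would have surfaced the bucket name error directly, whereas `velero backup logs` only reported a timeout.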

When I created the environment variable, the bucket name had a trailing "/".

It also seems that the environment variable should not include the "gs://" prefix; set it to the bare bucket name:

            BUCKET=bucketname
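If the bucket name comes from a script or a copied URL, one way to guard against this mistake is to normalize the value before using it. This is only a sketch: `normalize_bucket` is a hypothetical helper, not part of Velero or gsutil, and it only strips a leading `gs://` and trailing slashes. A value such as `bucket/prefix` would still contain a slash, and according to the error message the prefix belongs in the separate Prefix field.

```shell
# Hypothetical helper: strip an optional gs:// scheme and any trailing
# slashes so that only the bare bucket name remains.
normalize_bucket() {
  _nb_name="${1#gs://}"                  # drop a leading gs:// if present
  while [ "${_nb_name%/}" != "$_nb_name" ]; do
    _nb_name="${_nb_name%/}"             # drop one trailing slash per pass
  done
  printf '%s\n' "$_nb_name"
}

BUCKET="$(normalize_bucket "gs://bucketname/")"
echo "$BUCKET"
```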

If the bucket does not exist yet, create it as follows:

  gsutil mb gs://$BUCKET/

When installing the Velero server, do not prefix the bucket name with gs:// in the velero install command, as shown below:

 velero install --provider gcp --plugins velero/velero-plugin-for-gcp:v1.1.0 --bucket $BUCKET  --secret-file ./credentials-velero
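After installing, it can be worth confirming which bucket value the backup storage location actually holds. The sketch below assumes the location is named `default` in the `velero` namespace, as in the describe output in the question; `bucket_is_valid` is a hypothetical helper that only mirrors the "must not contain a '/'" rule from the error message.

```shell
# Hypothetical check mirroring the error message: the bucket name must
# be non-empty and must not contain "/".
bucket_is_valid() {
  case "$1" in
    ""|*/*) return 1 ;;
    *)      return 0 ;;
  esac
}

# Assumption: the BackupStorageLocation is named "default" and lives in
# the "velero" namespace. The block is skipped when kubectl is missing.
if command -v kubectl >/dev/null 2>&1; then
  configured="$(kubectl -n velero get backupstoragelocation default \
    -o jsonpath='{.spec.objectStorage.bucket}' 2>/dev/null || true)"
  if bucket_is_valid "$configured"; then
    echo "bucket looks fine: $configured"
  else
    echo "bucket is empty or contains '/': $configured"
  fi
fi
```

`velero backup-location get` shows the same location from the Velero CLI side.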


$ velero backup describe backup-test-ns
Name:         backup-test-ns
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:  New

Errors:    0
Warnings:  0

Namespaces:
   Included:  backup-test
   Excluded:  <none>

 Resources:
    Included:        *
    Excluded:        <none>
    Cluster-scoped:  auto

 Label selector:  <none>

 Storage Location:

 Velero-Native Snapshot PVs:  auto

 TTL:  720h0m0s

 Hooks:  <none>

 Backup Format Version:

 Started:    <n/a>
 Completed:  <n/a>

 Expiration:  <nil>

 Velero-Native Snapshots: <none included>

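Before attempting a fresh install, you may need to remove the existing Velero installation. To uninstall Velero, use the following commands:

      kubectl delete namespace/velero clusterrolebinding/velero
      kubectl delete crds -l component=velero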
