Spark on Kubernetes: spark-local-dir errors "already exists" / "must be unique"

Problem description

I'm struggling to understand the Spark documentation in order to set up the local directories correctly.

Setup:

I run Spark 3.1.2 on Kubernetes via the Spark Operator. The number of executor pods varies with the job size and the resources available on the cluster. A typical case: I start a job with 20 requested executors, 3 pods remain pending, and the job finishes with 17 executors.

Underlying problem:

I'm running into the error "node has insufficient resources: ephemeral-storage", because large amounts of data spill into the default local directories, which are created as emptyDir volumes on the Kubernetes nodes.

This is a known issue and should be solved by pointing the local-dir at mounted persistent volumes.

I tried two approaches, but neither of them works:

Approach 1:

Following the documentation at https://spark.apache.org/docs/latest/running-on-kubernetes.html#local-storage, I added the following options to the spark config:

"spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.claimName": "tmp-spark-spill"
"spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.storageClass": "csi-rbd-sc"
"spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.sizeLimit": "3000Gi"
"spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.path": ="/spill-data"
"spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.readOnly": "false"

The full YAML looks like:

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: job1
  namespace: spark
spec:
  serviceAccount: spark
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: "xxx/spark-py:app-3.1.2"
  imagePullPolicy: Always
  mainApplicationFile: local:///opt/spark/work-dir/nfs/06_dwh_core/jobs/job1/main.py
  sparkVersion: "3.0.0"
  restartPolicy:
    type: OnFailure
    onFailureRetries: 0
    onFailureRetryInterval: 10
    onSubmissionFailureRetries: 0
    onSubmissionFailureRetryInterval: 20
  sparkConf:
    "spark.default.parallelism": "400"
    "spark.sql.shuffle.partitions": "400"
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer"
    "spark.sql.debug.maxToStringFields": "1000"
    "spark.ui.port": "4045"
    "spark.driver.maxResultSize": "0" 
    "spark.kryoserializer.buffer.max": "512"
    "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.claimName": "tmp-spark-spill"
    "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.storageClass": "csi-rbd-sc"
    "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.sizeLimit": "3000Gi"
    "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.path": ="/spill-data"
    "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.readOnly": "false"
  driver:
    cores: 1
    memory: "20G"
    labels:
      version: 3.1.2
    serviceAccount: spark
    volumeMounts:
      - name: nfs
        mountPath: /opt/spark/work-dir/nfs
  executor:
    cores: 20
    instances: 20
    memory: "150G"
    labels:
      version: 3.0.0
    volumeMounts:
      - name: nfs
        mountPath: /opt/spark/work-dir/nfs
  volumes:
    - name: nfs
      nfs:
        server: xxx
        path: /xxx
        readOnly: false

Issue 1:

This leads to an error saying the PVC already exists, and effectively only one executor gets created.

io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://kubernetes.default.svc/api/v1/namespaces/spark-poc/persistentvolumeclaims. Message: persistentvolumeclaims "tmp-spark-spill" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=persistentvolumeclaims, name=tmp-spark-spill, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=persistentvolumeclaims "tmp-spark-spill" already exists, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}).

Do I have to define this local-dir claim for every single executor? Something like:

 "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.claimName": "tmp-spark-spill"
.
.
.
 "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-2.options.claimName": "tmp-spark-spill"
.
.
.
 "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-3.options.claimName": "tmp-spark-spill"
.
.
.

But how would I do that dynamically when the number of executors keeps changing? Shouldn't it be picked up automatically from the executor configuration?

Approach 2:

I created a PVC myself, mounted it as a volume, and set the local dir via a spark config parameter:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-spark-spill
  namespace: spark-poc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 3000Gi
  storageClassName: csi-rbd-sc
  volumeMode: Filesystem

and mounted it on the executors like this:

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: job1
  namespace: spark
spec:
  serviceAccount: spark
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: "xxx/spark-py:app-3.1.2"
  imagePullPolicy: Always
  mainApplicationFile: local:///opt/spark/work-dir/nfs/06_dwh_core/jobs/job1/main.py
  sparkVersion: "3.0.0"
  restartPolicy:
    type: OnFailure
    onFailureRetries: 0
    onFailureRetryInterval: 10
    onSubmissionFailureRetries: 0
    onSubmissionFailureRetryInterval: 20
  sparkConf:
    "spark.default.parallelism": "400"
    "spark.sql.shuffle.partitions": "400"
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer"
    "spark.sql.debug.maxToStringFields": "1000"
    "spark.ui.port": "4045"
    "spark.driver.maxResultSize": "0" 
    "spark.kryoserializer.buffer.max": "512"
    "spark.local.dir": "/spill"
  driver:
    cores: 1
    memory: "20G"
    labels:
      version: 3.1.2
    serviceAccount: spark
    volumeMounts:
      - name: nfs
        mountPath: /opt/spark/work-dir/nfs
  executor:
    cores: 20
    instances: 20
    memory: "150G"
    labels:
      version: 3.0.0
    volumeMounts:
      - name: nfs
        mountPath: /opt/spark/work-dir/nfs
      - name: pvc-spark-spill
        mountPath: /spill
  volumes:
    - name: nfs
      nfs:
        server: xxx
        path: /xxx
        readOnly: false
    - name: pvc-spark-spill
      persistentVolumeClaim:
        claimName: pvc-spark-spill

Issue 2:

This approach fails with the message that /spill must be unique.

 Message: Pod "job1-driver" is invalid: spec.containers[0].volumeMounts[7].mountPath: Invalid value: "/spill": must be unique.

Summary and question

It seems that every executor needs its own PVC, or at least its own folder on a PVC, to spill its data to. But how do I configure that correctly? I couldn't figure it out from the documentation.

Thanks for your help, Alex

Tags: apache-spark, kubernetes, pyspark

Solution


Spark should be able to create the PVCs dynamically by setting claimName=OnDemand. Attaching multiple pods to the same PVC will run into problems on the Kubernetes side, since a ReadWriteOnce claim can only be attached to a single node at a time.
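For example, a minimal sketch of what that could look like in the sparkConf block of the SparkApplication above, reusing your storage class (the 150Gi per-executor sizeLimit is just an assumed value, not something from the question):

"spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.claimName": "OnDemand"
"spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.storageClass": "csi-rbd-sc"
# assumed per-executor size; with OnDemand each executor gets its own PVC of this size
"spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.options.sizeLimit": "150Gi"
"spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.path": "/spill-data"
"spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1.mount.readOnly": "false"

With claimName set to OnDemand, Spark creates one PVC per executor pod (owned by that pod, so it is cleaned up with it), which avoids the "already exists" conflict from Approach 1.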

(Screenshot of the relevant configuration section of the documentation attached.)

You could also look at an NFS share, which works outside of Kubernetes-managed volumes. Example: https://www.datamechanics.co/blog-post/apache-spark-3-1-release-spark-on-kubernetes-is-now-ga
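If you go the NFS route, one option is to mount the share as a spark-local-dir volume directly in the spark config. A sketch, assuming your Spark build supports the nfs volume type described in the running-on-kubernetes docs; the server and path below are placeholders:

# placeholder NFS server and export path, replace with your own
"spark.kubernetes.executor.volumes.nfs.spark-local-dir-1.options.server": "nfs.example.com"
"spark.kubernetes.executor.volumes.nfs.spark-local-dir-1.options.path": "/spark-spill"
"spark.kubernetes.executor.volumes.nfs.spark-local-dir-1.mount.path": "/spill-data"
"spark.kubernetes.executor.volumes.nfs.spark-local-dir-1.mount.readOnly": "false"

Because the volume name starts with spark-local-dir-, Spark treats the mount as local scratch space. All executors then share the same NFS export, but each writes into its own blockmgr-* subdirectory, so no per-executor PVCs are needed.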

