HDFS NameNode format problem with AWS EBS in an EKS cluster

Problem description

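I have an EKS cluster with an EBS storage class/volumes. My Elasticsearch cluster runs fine on EBS storage (as persistent volumes/PVCs). I am trying to deploy the HDFS namenode image (bde2020/hadoop-namenode) using a StatefulSet, but it always gives me the following error: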

2020-05-09 08:59:02,400 INFO util.GSet: capacity      = 2^15 = 32768 entries
2020-05-09 08:59:02,415 INFO common.Storage: Lock on /hadoop/dfs/name/in_use.lock acquired by nodename 87@hdfs-name-0.hdfs-name.pulse.svc.cluster.local
2020-05-09 08:59:02,417 WARN namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: NameNode is not formatted.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:252)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:720)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:648)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:926)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1692)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1759)

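I checked the run.sh of this image, and it appears to format the namenode if the directory is empty. But this does not work in my case (EBS as a PVC). Any help would be greatly appreciated.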

My deployment YAML is:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hdfs-name
  labels:
    component: hdfs-name
spec:
  serviceName: hdfs-name
  replicas: 1
  selector:
    matchLabels:
      component: hdfs-name
  template:
    metadata:
      labels:
        component: hdfs-name
    spec:
      containers:
      - name: hdfs-name
        image: bde2020/hadoop-namenode
        env:
        - name: CLUSTER_NAME
          value: hdfs-k8s
        ports:
        - containerPort: 8020
          name: nn-rpc
        - containerPort: 50070
          name: nn-web
        volumeMounts:
        - name: hdfs-name-pv-claim
          mountPath: /hadoop/dfs/name 
  volumeClaimTemplates:
  - metadata:
      name: hdfs-name-pv-claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: ebs
      resources:
        requests:
          storage: 1Gi

Tags: kubernetes, hdfs

Solution


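With the ebs storage class, a lost+found folder is created automatically on the new volume. The name directory is therefore not empty, so the namenode format step never runs.
Having an initContainer (placed in the pod template's spec, alongside containers) delete the lost+found folder seems to work.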

initContainers:
  - name: delete-lost-found
    image: busybox
    command: ["sh", "-c", "rm -rf /hadoop/dfs/name/lost+found"]
    volumeMounts:
    - name: hdfs-name-pv-claim
      mountPath: /hadoop/dfs/name 
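
To illustrate why the lost+found folder blocks formatting, here is a minimal shell sketch. It is not the image's actual run.sh. It assumes the script decides whether to format by testing whether `ls -A` on the name directory prints anything, and it assumes the EBS volume is formatted as ext4, which creates lost+found at the filesystem root. The function `needs_format` and the temporary directory are hypothetical stand-ins for illustration.

```shell
#!/bin/sh
# Hypothetical helper that mimics an "is the name directory empty?" check.
# Assumption: the real run.sh makes a similar decision based on `ls -A`.
needs_format() {
    if [ -z "$(ls -A "$1")" ]; then
        echo "format"
    else
        echo "skip"
    fi
}

# Temporary directory standing in for /hadoop/dfs/name.
namedir="$(mktemp -d)"

# Case 1: a truly empty directory would be formatted.
empty_result="$(needs_format "$namedir")"

# Case 2: simulate a fresh ext4 volume, which contains lost+found.
mkdir "$namedir/lost+found"
lost_found_result="$(needs_format "$namedir")"

# Case 3: what the initContainer does before the namenode starts.
rm -rf "$namedir/lost+found"
cleaned_result="$(needs_format "$namedir")"

echo "empty:           $empty_result"
echo "with lost+found: $lost_found_result"
echo "after cleanup:   $cleaned_result"

# Remove the temporary directory.
rmdir "$namedir"
```

Under these assumptions, the second case prints `skip`, which matches the "NameNode is not formatted" failure in the log above, while the first and third cases print `format`.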
