Kubernetes - EC2 - elasticsearch 0/2 nodes are available: 2 Insufficient memory

Problem description

I am working on a Kubernetes Elasticsearch deployment.

I followed the documentation provided by elastic.co (https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-deploy-elasticsearch.html).

My Elasticsearch YAML file is as follows:

cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.8.0
  nodeSets:
  - name: default
    count: 1
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
    podTemplate:
      spec:
        initContainers:
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
        containers:
        - name: elasticsearch
          env:
          - name: ES_JAVA_OPTS
            value: -Xms2g -Xmx2g
          resources:
            requests:
              memory: 4Gi
              cpu: 0.5
            limits:
              memory: 4Gi
              cpu: 2
EOF

But when I describe the created pod, I get the following errors:

Name:           quickstart-es-default-0
Namespace:      default
Priority:       0
Node:           <none>
Labels:         common.k8s.elastic.co/type=elasticsearch
                controller-revision-hash=quickstart-es-default-55759bb696
                elasticsearch.k8s.elastic.co/cluster-name=quickstart
                elasticsearch.k8s.elastic.co/config-hash=178912897
                elasticsearch.k8s.elastic.co/http-scheme=https
                elasticsearch.k8s.elastic.co/node-data=true
                elasticsearch.k8s.elastic.co/node-ingest=true
                elasticsearch.k8s.elastic.co/node-master=true
                elasticsearch.k8s.elastic.co/node-ml=true
                elasticsearch.k8s.elastic.co/statefulset-name=quickstart-es-default
                elasticsearch.k8s.elastic.co/version=7.8.0
                statefulset.kubernetes.io/pod-name=quickstart-es-default-0
Annotations:    co.elastic.logs/module: elasticsearch
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  StatefulSet/quickstart-es-default
Init Containers:
  elastic-internal-init-filesystem:
    Image:      docker.elastic.co/elasticsearch/elasticsearch:7.8.0
    Port:       <none>
    Host Port:  <none>
    Command:
      bash
      -c
      /mnt/elastic-internal/scripts/prepare-fs.sh
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      POD_IP:     (v1:status.podIP)
      POD_NAME:  quickstart-es-default-0 (v1:metadata.name)
    Mounts:
      /mnt/elastic-internal/downward-api from downward-api (ro)
      /mnt/elastic-internal/elasticsearch-bin-local from elastic-internal-elasticsearch-bin-local (rw)
      /mnt/elastic-internal/elasticsearch-config from elastic-internal-elasticsearch-config (ro)
      /mnt/elastic-internal/elasticsearch-config-local from elastic-internal-elasticsearch-config-local (rw)
      /mnt/elastic-internal/elasticsearch-plugins-local from elastic-internal-elasticsearch-plugins-local (rw)
      /mnt/elastic-internal/probe-user from elastic-internal-probe-user (ro)
      /mnt/elastic-internal/scripts from elastic-internal-scripts (ro)
      /mnt/elastic-internal/transport-certificates from elastic-internal-transport-certificates (ro)
      /mnt/elastic-internal/unicast-hosts from elastic-internal-unicast-hosts (ro)
      /mnt/elastic-internal/xpack-file-realm from elastic-internal-xpack-file-realm (ro)
      /usr/share/elasticsearch/config/http-certs from elastic-internal-http-certificates (ro)
      /usr/share/elasticsearch/config/transport-remote-certs/ from elastic-internal-remote-certificate-authorities (ro)
      /usr/share/elasticsearch/data from elasticsearch-data (rw)
      /usr/share/elasticsearch/logs from elasticsearch-logs (rw)
  sysctl:
    Image:      docker.elastic.co/elasticsearch/elasticsearch:7.8.0
    Port:       <none>
    Host Port:  <none>
    Command:
      sh
      -c
      sysctl -w vm.max_map_count=262144
    Environment:
      POD_IP:     (v1:status.podIP)
      POD_NAME:  quickstart-es-default-0 (v1:metadata.name)
    Mounts:
      /mnt/elastic-internal/downward-api from downward-api (ro)
      /mnt/elastic-internal/elasticsearch-config from elastic-internal-elasticsearch-config (ro)
      /mnt/elastic-internal/probe-user from elastic-internal-probe-user (ro)
      /mnt/elastic-internal/scripts from elastic-internal-scripts (ro)
      /mnt/elastic-internal/unicast-hosts from elastic-internal-unicast-hosts (ro)
      /mnt/elastic-internal/xpack-file-realm from elastic-internal-xpack-file-realm (ro)
      /usr/share/elasticsearch/bin from elastic-internal-elasticsearch-bin-local (rw)
      /usr/share/elasticsearch/config from elastic-internal-elasticsearch-config-local (rw)
      /usr/share/elasticsearch/config/http-certs from elastic-internal-http-certificates (ro)
      /usr/share/elasticsearch/config/transport-certs from elastic-internal-transport-certificates (ro)
      /usr/share/elasticsearch/config/transport-remote-certs/ from elastic-internal-remote-certificate-authorities (ro)
      /usr/share/elasticsearch/data from elasticsearch-data (rw)
      /usr/share/elasticsearch/logs from elasticsearch-logs (rw)
      /usr/share/elasticsearch/plugins from elastic-internal-elasticsearch-plugins-local (rw)
Containers:
  elasticsearch:
    Image:       docker.elastic.co/elasticsearch/elasticsearch:7.8.0
    Ports:       9200/TCP, 9300/TCP
    Host Ports:  0/TCP, 0/TCP
    Limits:
      cpu:     2
      memory:  4Gi
    Requests:
      cpu:      500m
      memory:   4Gi
    Readiness:  exec [bash -c /mnt/elastic-internal/scripts/readiness-probe-script.sh] delay=10s timeout=5s period=5s #success=1 #failure=3
    Environment:
      ES_JAVA_OPTS:              -Xms2g -Xmx2g
      POD_IP:                     (v1:status.podIP)
      POD_NAME:                  quickstart-es-default-0 (v1:metadata.name)
      PROBE_PASSWORD_PATH:       /mnt/elastic-internal/probe-user/elastic-internal-probe
      PROBE_USERNAME:            elastic-internal-probe
      READINESS_PROBE_PROTOCOL:  https
      HEADLESS_SERVICE_NAME:     quickstart-es-default
      NSS_SDB_USE_CACHE:         no
    Mounts:
      /mnt/elastic-internal/downward-api from downward-api (ro)
      /mnt/elastic-internal/elasticsearch-config from elastic-internal-elasticsearch-config (ro)
      /mnt/elastic-internal/probe-user from elastic-internal-probe-user (ro)
      /mnt/elastic-internal/scripts from elastic-internal-scripts (ro)
      /mnt/elastic-internal/unicast-hosts from elastic-internal-unicast-hosts (ro)
      /mnt/elastic-internal/xpack-file-realm from elastic-internal-xpack-file-realm (ro)
      /usr/share/elasticsearch/bin from elastic-internal-elasticsearch-bin-local (rw)
      /usr/share/elasticsearch/config from elastic-internal-elasticsearch-config-local (rw)
      /usr/share/elasticsearch/config/http-certs from elastic-internal-http-certificates (ro)
      /usr/share/elasticsearch/config/transport-certs from elastic-internal-transport-certificates (ro)
      /usr/share/elasticsearch/config/transport-remote-certs/ from elastic-internal-remote-certificate-authorities (ro)
      /usr/share/elasticsearch/data from elasticsearch-data (rw)
      /usr/share/elasticsearch/logs from elasticsearch-logs (rw)
      /usr/share/elasticsearch/plugins from elastic-internal-elasticsearch-plugins-local (rw)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  elasticsearch-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  elasticsearch-data-quickstart-es-default-0
    ReadOnly:   false
  downward-api:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels -> labels
  elastic-internal-elasticsearch-bin-local:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  elastic-internal-elasticsearch-config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  quickstart-es-default-es-config
    Optional:    false
  elastic-internal-elasticsearch-config-local:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  elastic-internal-elasticsearch-plugins-local:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  elastic-internal-http-certificates:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  quickstart-es-http-certs-internal
    Optional:    false
  elastic-internal-probe-user:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  quickstart-es-internal-users
    Optional:    false
  elastic-internal-remote-certificate-authorities:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  quickstart-es-remote-ca
    Optional:    false
  elastic-internal-scripts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      quickstart-es-scripts
    Optional:  false
  elastic-internal-transport-certificates:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  quickstart-es-transport-certificates
    Optional:    false
  elastic-internal-unicast-hosts:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      quickstart-es-unicast-hosts
    Optional:  false
  elastic-internal-xpack-file-realm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  quickstart-es-xpack-file-realm
    Optional:    false
  elasticsearch-logs:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      
    SizeLimit:   <unset>
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  <unknown>          default-scheduler  running "VolumeBinding" filter plugin for pod "quickstart-es-default-0": pod has unbound immediate PersistentVolumeClaims
  Warning  FailedScheduling  <unknown>          default-scheduler  running "VolumeBinding" filter plugin for pod "quickstart-es-default-0": pod has unbound immediate PersistentVolumeClaims
  Warning  FailedScheduling  20m (x3 over 21m)  default-scheduler  0/2 nodes are available: 2 Insufficient memory.

Question 2: I created two EC2 servers (t2.large), one master and one worker, each with a 300 GB disk.

I have the following PV:

NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
pv0001   200Gi      RWO            Retain           Available

I am creating the claim for my Elasticsearch cluster with the following manifest:

cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.8.0
  nodeSets:
  - name: default
    count: 1
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 200Gi
        storageClassName: gp2
EOF

Storage class (which I created and set as the default):

NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION
gp2 (default)   kubernetes.io/aws-ebs   Delete          Immediate           false

kubectl describe pv:

Labels:          <none>
Annotations:     Finalizers:  [kubernetes.io/pv-protection]
StorageClass:    
Status:          Available
Claim:           
Reclaim Policy:  Retain
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        200Gi
Node Affinity:   <none>

kubectl describe pvc:

Namespace:     default
StorageClass:  gp2
Status:        Pending
Volume:        
Labels:        common.k8s.elastic.co/type=elasticsearch
               elasticsearch.k8s.elastic.co/cluster-name=quickstart
               elasticsearch.k8s.elastic.co/statefulset-name=quickstart-es-default
Annotations:   volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Mounted By:    quickstart-es-default-0
Events:
  Type     Reason              Age                 From                         Message
  ----     ------              ----                ----                         -------
  Warning  ProvisioningFailed  61s (x18 over 24m)  persistentvolume-controller  Failed to provision volume with StorageClass "gp2": Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead

But I get the following error: running "VolumeBinding" filter plugin for pod "quickstart-es-default-0": pod has unbound immediate PersistentVolumeClaims

My volume is in EC2 EBS.

Tags: elasticsearch, kubernetes

Solution


Based on the events from the pod, there are two issues you need to fix.


Resources

Warning  FailedScheduling  20m (x3 over 21m)  default-scheduler  0/2 nodes are available: 2 Insufficient memory.

The documentation you linked specifies that you need at least 2GiB of memory, so you should try changing the requested memory from 4Gi to 2Gi in both requests and limits. As mentioned in the AWS documentation, a t2.large has 2 vCPUs and 8GiB of memory, so your current request consumes almost all of the VM's resources.
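For example, the podTemplate of the nodeSet from the quickstart manifest above could be adjusted like this (a sketch; the 2Gi/1g values are suggestions that fit a t2.large, not exact requirements):

```yaml
podTemplate:
  spec:
    containers:
    - name: elasticsearch
      env:
      - name: ES_JAVA_OPTS
        value: -Xms1g -Xmx1g   # heap sized to roughly half the container memory limit
      resources:
        requests:
          memory: 2Gi          # reduced from 4Gi so the pod fits on a t2.large node
          cpu: 0.5
        limits:
          memory: 2Gi
          cpu: 2
```

Keeping requests equal to limits for memory preserves predictable scheduling while leaving headroom for the kubelet and system pods on each node.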


Volume binding

Warning  FailedScheduling  <unknown>          default-scheduler  running "VolumeBinding" filter plugin for pod "quickstart-es-default-0": pod has unbound immediate PersistentVolumeClaims

If you run kubectl describe on the PV and the PVC, you should be able to see more details about why they cannot be bound.

I think this is because there is no default storage class.

As stated in the documentation:

Depending on the installation method, your Kubernetes cluster may be deployed with an existing StorageClass that is marked as default. This default StorageClass is then used to dynamically provision storage for PersistentVolumeClaims that do not require any specific storage class. See the PersistentVolumeClaim documentation for details.

You can check whether you have a default storage class with:

kubectl get storageclass

The following command can be used to make your storage class the default:

kubectl patch storageclass <name_of_storageclass> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
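Equivalently, the default annotation can be set declaratively in the StorageClass manifest itself. A sketch assuming the gp2 / aws-ebs setup shown above:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"  # marks this class as the cluster default
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Delete
volumeBindingMode: Immediate
```

Note that only one StorageClass should carry this annotation at a time; PVCs that omit storageClassName will then be provisioned from it.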

There is a related question on the Elastic discussion forum.


Edit

Quoted from the documentation:

The Retain reclaim policy allows for manual reclamation of the resource. When the PersistentVolumeClaim is deleted, the PersistentVolume still exists and the volume is considered "released". But it is not yet available for another claim because the previous claimant's data remains on the volume. An administrator can manually reclaim the volume with the following steps.

  • Delete the PersistentVolume. The associated storage asset in external infrastructure (such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume) still exists after the PV is deleted.
  • Manually clean up the data on the associated storage asset accordingly.
  • Manually delete the associated storage asset, or if you want to reuse the same storage asset, create a new PersistentVolume with the storage asset definition.

So please try deleting your PV and PVC and recreating them.
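If you want to reuse the existing EBS volume, the recreated PV could look something like this (a sketch; the volumeID is a hypothetical placeholder that must be replaced with your actual EBS volume ID):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 200Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gp2            # must match the PVC's storageClassName for binding
  awsElasticBlockStore:
    volumeID: vol-0123456789abcdef0   # hypothetical placeholder, replace with your EBS volume ID
    fsType: ext4
```

With matching storageClassName, capacity, and access modes, the pending PVC created by the Elasticsearch operator should bind to this PV statically instead of waiting for dynamic provisioning.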

