Why is this index status red: .opendistro-ism-config

Problem description

I don't think I've even touched this index, yet it is turning my whole cluster red. I have no idea what it is or how to fix it; I tried adding another node, but that didn't work. In the Index Management view I can see it is the only red index. The problem index is .opendistro-ism-config. I tried changing the index's replica count, adding nodes, and so on, but nothing helped.

EDIT

As @Val asked, I have added the query output below. My index stays red, and this spams me with alerts on AWS, where the cluster is deployed. My other indices are allocated fine, so I removed them from the shard_sizes output, leaving only the problematic index. I have 4 x t2.small nodes with 35 GiB SSD each, so there is plenty of spare capacity in the cluster. This is not my production cluster, so it's not too bad, but it is annoying.

https://{{ES_DOMAIN}}/_cluster/allocation/explain?include_disk_info&include_yes_decisions
{
    "index": ".opendistro-ism-config",
    "shard": 1,
    "primary": true,
    "current_state": "unassigned",
    "unassigned_info": {
        "reason": "ALLOCATION_FAILED",
        "at": "2020-08-01T09:18:40.288Z",
        "failed_allocation_attempts": 5,
        "details": "failed shard on node [ex3PL3THRHmAxkvMjOwrQQ]: failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[.opendistro-ism-config][1]: obtaining shard lock timed out after 5000ms, previous lock details: [shard creation] trying to lock for [shard creation]]; ",
        "last_allocation_status": "no_valid_shard_copy"
    },
    "cluster_info": {
        "nodes": {
            "KnCBTiL1TZCGz1DNYfm9_A": {
                "node_name": "ef9116cc46563e2c73d12eb7a8887f4c",
                "least_available": {
                    "total_bytes": 36722737152,
                    "used_bytes": 2143232000,
                    "free_bytes": 34579505152,
                    "free_disk_percent": 94.2,
                    "used_disk_percent": 5.8
                },
                "most_available": {
                    "total_bytes": 36722737152,
                    "used_bytes": 2143232000,
                    "free_bytes": 34579505152,
                    "free_disk_percent": 94.2,
                    "used_disk_percent": 5.8
                }
            },
            "90rKZw_SSOSlOGWv_WyQQQ": {
                "node_name": "45cfd2c275112972c5e68e7e00295d45",
                "least_available": {
                    "total_bytes": 36722737152,
                    "used_bytes": 2144980992,
                    "free_bytes": 34577756160,
                    "free_disk_percent": 94.2,
                    "used_disk_percent": 5.8
                },
                "most_available": {
                    "total_bytes": 36722737152,
                    "used_bytes": 2144980992,
                    "free_bytes": 34577756160,
                    "free_disk_percent": 94.2,
                    "used_disk_percent": 5.8
                }
            },
            "2F_QTYueTs69Q7KhCped9w": {
                "node_name": "a8314d5f13c0043f8454997d973e8c03",
                "least_available": {
                    "total_bytes": 36722737152,
                    "used_bytes": 1957380096,
                    "free_bytes": 34765357056,
                    "free_disk_percent": 94.7,
                    "used_disk_percent": 5.3
                },
                "most_available": {
                    "total_bytes": 36722737152,
                    "used_bytes": 1957380096,
                    "free_bytes": 34765357056,
                    "free_disk_percent": 94.7,
                    "used_disk_percent": 5.3
                }
            },
            "8-oMtA69QvO3bKTAAUPeBw": {
                "node_name": "9c042bb3814270c16b4fba03ff85208d",
                "least_available": {
                    "total_bytes": 36722737152,
                    "used_bytes": 2140692480,
                    "free_bytes": 34582044672,
                    "free_disk_percent": 94.2,
                    "used_disk_percent": 5.8
                },
                "most_available": {
                    "total_bytes": 36722737152,
                    "used_bytes": 2140692480,
                    "free_bytes": 34582044672,
                    "free_disk_percent": 94.2,
                    "used_disk_percent": 5.8
                }
            }
        },
        "shard_sizes": {
            "[.opendistro-ism-config][2][r]_bytes": 56497,
            "[.opendistro-ism-config][0][p]_bytes": 53651,
            "[.opendistro-ism-config][0][r]_bytes": 53651,
            "[.opendistro-ism-config][4][p]_bytes": 33157,
            "[.opendistro-ism-config][2][p]_bytes": 56497
        }
    },
    "can_allocate": "no_valid_shard_copy",
    "allocate_explanation": "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster",
    "node_allocation_decisions": [
        {
            "node_id": "2F_QTYueTs69Q7KhCped9w",
            "node_name": "a8314d5f13c0043f8454997d973e8c03",
            "node_decision": "no",
            "store": {
                "found": false
            }
        },
        {
            "node_id": "8-oMtA69QvO3bKTAAUPeBw",
            "node_name": "9c042bb3814270c16b4fba03ff85208d",
            "node_decision": "no",
            "store": {
                "found": false
            }
        },
        {
            "node_id": "90rKZw_SSOSlOGWv_WyQQQ",
            "node_name": "45cfd2c275112972c5e68e7e00295d45",
            "node_decision": "no",
            "store": {
                "found": false
            }
        },
        {
            "node_id": "KnCBTiL1TZCGz1DNYfm9_A",
            "node_name": "ef9116cc46563e2c73d12eb7a8887f4c",
            "node_decision": "no",
            "store": {
                "found": false
            }
        }
    ]
}
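The allocation-explain payload above can be summarized programmatically when triaging a red index. The following is a minimal sketch (not part of the original question): the `SAMPLE` payload is abbreviated from the response shown above, and the field names follow the `_cluster/allocation/explain` output.

```python
import json

# Abbreviated sample taken from the allocation-explain response above.
SAMPLE = json.loads("""
{
  "index": ".opendistro-ism-config",
  "shard": 1,
  "primary": true,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "ALLOCATION_FAILED",
    "failed_allocation_attempts": 5
  },
  "can_allocate": "no_valid_shard_copy",
  "node_allocation_decisions": [
    {"node_name": "a8314d5f13c0043f8454997d973e8c03", "node_decision": "no", "store": {"found": false}},
    {"node_name": "9c042bb3814270c16b4fba03ff85208d", "node_decision": "no", "store": {"found": false}}
  ]
}
""")

def summarize(explain: dict) -> str:
    """Condense an allocation-explain response into one finding per line."""
    info = explain.get("unassigned_info", {})
    lines = [
        f"[{explain['index']}][{explain['shard']}] is {explain['current_state']}",
        f"reason={info.get('reason')} after {info.get('failed_allocation_attempts')} attempts",
        f"can_allocate={explain.get('can_allocate')}",
    ]
    for d in explain.get("node_allocation_decisions", []):
        # "store found: False" means no copy of the shard data exists on that node.
        lines.append(f"node {d['node_name']}: {d['node_decision']} (store found: {d['store']['found']})")
    return "\n".join(lines)

print(summarize(SAMPLE))
```

The key signals here are `reason=ALLOCATION_FAILED` with 5 failed attempts, and `can_allocate=no_valid_shard_copy` with every node reporting `store.found: false`.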

Tags: elasticsearch

Solution


The workaround to get your cluster working again is to manually reroute the shard.

Cause of the problem: this typically happens when a node holding a primary shard, whose replica is not assigned to any other node, gets disconnected from the master. When the node rejoins the cluster, the shard copy still locally allocated on it cannot release the resources (the shard lock) it was previously using, and by that time the master has already failed 5 attempts to assign the shard back to the node.

After 5 unsuccessful allocation attempts (the default value of `index.allocation.max_retries`), the master gives up, and allocation has to be triggered again manually.

Solution: run the following command to retry the failed allocations:

curl -XPOST 'localhost:9200/_cluster/reroute?retry_failed=true'
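The same call, followed by a health check, can be scripted. Below is a hedged sketch using only the Python standard library; `ES_HOST` is a placeholder for your own cluster endpoint, and it assumes the cluster is reachable without authentication.

```python
import json
import urllib.request

ES_HOST = "http://localhost:9200"  # placeholder: your cluster endpoint

def reroute_url(host: str) -> str:
    """URL that asks the master to retry shards that hit the failed-attempts limit."""
    return f"{host}/_cluster/reroute?retry_failed=true"

def retry_failed_allocations(host: str = ES_HOST) -> dict:
    """POST an empty reroute body; the retry_failed flag does the work."""
    req = urllib.request.Request(
        reroute_url(host),
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def cluster_status(host: str = ES_HOST) -> str:
    """Read the overall color (green/yellow/red) from _cluster/health."""
    with urllib.request.urlopen(f"{host}/_cluster/health") as resp:
        return json.load(resp)["status"]

if __name__ == "__main__":
    retry_failed_allocations()
    print("cluster status:", cluster_status())
```

After the reroute call, give the master a moment to re-attempt allocation, then check `_cluster/health`; once the primary is assigned again the index should leave the red state.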
