Redis Sentinel slave: wrong link status on the slave

Problem description

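I am trying to use Redis Sentinel for replication. On the development server with Redis 3 everything works fine, but in production with Redis 5 I ran into a problem. First I set up replication using replicaof in the slave's config, then I configured Sentinel: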

sentinel down-after-milliseconds mymaster 15000
sentinel failover-timeout mymaster 20000

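Sentinel found the master but did not discover the slave from Redis, so I tried adding the slave to Sentinel manually with sentinel known-slave mymaster SLAVE-IP 6379. After that change I restarted Sentinel. It then turned the slave into a master and left the old master broken, because master-link-status = err: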

SENTINEL failover mymaster
(error) NOGOODSLAVE No suitable slave to promote
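
A NOGOODSLAVE reply means Sentinel found no replica it was willing to promote. One way to narrow this down is to look at the fields Sentinel reports for each replica in SENTINEL slaves mymaster. The following is a minimal sketch, not Sentinel's actual selection logic: it assumes the reply for one replica is already available as a flat list of alternating field names and values, and the helper names and the sample address are hypothetical. The checks are only hints about fields worth inspecting.

```python
def pairs_to_dict(flat):
    """Turn a flat [field, value, field, value, ...] reply into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


def promotion_warnings(replica):
    """Return hints about why Sentinel might not promote this replica.

    These are heuristics for a human reader, not Sentinel's exact rules.
    """
    warnings = []
    flags = set(replica.get("flags", "").split(","))
    for bad in ("s_down", "o_down", "disconnected"):
        if bad in flags:
            warnings.append("flag " + bad)
    if replica.get("master-link-status") != "ok":
        warnings.append("master-link-status=" + str(replica.get("master-link-status")))
    if replica.get("slave-priority") == "0":
        warnings.append("slave-priority is 0")
    return warnings


# Hypothetical reply for one replica (address invented for illustration).
sample = [
    "name", "10.0.0.2:6379",
    "flags", "slave",
    "master-link-status", "err",
    "slave-priority", "100",
]
replica = pairs_to_dict(sample)
print(replica["name"], promotion_warnings(replica))
```

An empty list only means none of these particular hints fired. Sentinel applies further checks of its own, such as how recently the replica answered.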

Without Sentinel, replication between the Redis instances works fine.

redis-slave config

protected-mode yes
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 300
daemonize yes
supervised no
pidfile "/var/run/redis/redis-server.pid"
loglevel notice
logfile "/var/log/redis/redis-server.log"
databases 16
always-show-logo yes
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename "dump.rdb"
dir "/var/lib/redis"
replica-serve-stale-data yes
replica-read-only no
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
replica-priority 100
maxmemory 2000mb
maxmemory-policy allkeys-lru
lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
replica-lazy-flush no
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
aof-use-rdb-preamble yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
stream-node-max-bytes 4096
stream-node-max-entries 100
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
dynamic-hz yes
aof-rewrite-incremental-fsync yes
rdb-save-incremental-fsync yes

redis master config

protected-mode yes
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 60
daemonize yes
supervised no
pidfile "/var/run/redis/redis-server.pid"
loglevel notice
logfile "/var/log/redis/redis-server.log"
databases 16
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename "dump.rdb"
dir "/var/lib/redis"
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
replica-priority 100
maxmemory 4000mb
maxmemory-policy allkeys-lru
appendonly no
appendfilename "appendonly.aof"
appendfsync no
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes

Tags: redis, sentinel

Solution


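I had a similar problem. What I did was send a SENTINEL RESET command to every Sentinel instance, one after another, waiting at least 30 seconds between instances. This fixed the "master-link-status" and allowed the failover to happen.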

$ src/redis-cli -h 192.168.1.153 -p 26379 
> SENTINEL RESET mymaster
(wait 30 seconds)

$ src/redis-cli -h 192.168.1.154 -p 26379 
> SENTINEL RESET mymaster
(wait 30 seconds)

$ src/redis-cli -h 192.168.1.155 -p 26379 
> SENTINEL RESET mymaster

> sentinel slaves mymaster
    2) "192.168.1.155:6379"
   31) "master-link-status"
   32) "ok"

    2) "192.168.1.154:6379"       
   31) "master-link-status"
   32) "ok"

-> so that seems fine.
192.168.1.155:26379> sentinel failover mymaster
OK
-> finally!
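
The same rolling reset can be scripted instead of typed by hand. This is a minimal sketch under stated assumptions: the Sentinels accept unauthenticated connections on port 26379, the master is named mymaster as in the transcript above, and the function names are hypothetical. It speaks the Redis wire protocol over a plain socket, so it needs no client library.

```python
import socket
import time


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def send_command(host, port, *args, timeout=5.0):
    """Send one command and return the first line of the reply, without CRLF."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_command(*args))
        reply = conn.makefile("rb").readline()
    return reply.rstrip(b"\r\n").decode()


def rolling_reset(sentinels, master_name, wait_seconds=30,
                  send=send_command, sleep=time.sleep):
    """Send SENTINEL RESET to each Sentinel in turn, pausing between them.

    `send` and `sleep` are injectable so the loop can be tried without
    touching a real Sentinel.
    """
    replies = []
    for index, (host, port) in enumerate(sentinels):
        replies.append(send(host, port, "SENTINEL", "RESET", master_name))
        if index < len(sentinels) - 1:
            # Pause before the next Sentinel, as the answer above recommends.
            sleep(wait_seconds)
    return replies


# Example call, using the Sentinel addresses from the transcript above:
# rolling_reset([("192.168.1.153", 26379),
#                ("192.168.1.154", 26379),
#                ("192.168.1.155", 26379)], "mymaster")
```

Each entry in the returned list is the raw first reply line from one Sentinel. Afterwards, sentinel slaves mymaster can be run as shown above to confirm that master-link-status is ok before attempting the failover.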
