Redis master failover on OpenShift



I have installed the http://rediscart-claytondev.rhcloud.com/build/manifest/redis-2.8 cartridge and scaled it to 3 gears. REDIS_SENTINEL_QUORUM is set to 2 on each of them. The sentinels start fine after I changed ~/redis/bin/control from:

erb conf/redis-sentinel.conf.erb | redis-server conf - --sentinel

to:

erb conf/redis-sentinel.conf.erb > conf/redis-sentinel.conf
redis-server conf/redis-sentinel.conf --sentinel

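Now, after restarting the cartridge, everything looks fine until I kill the master. The slaves just sit there counting the seconds since they last saw it. The log on one of them says: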

[42612] 25 Feb 14:49:36.548 # Sentinel runid is 88269647396c4fcd07e8a1e6030eb01a7b8adcb3
[42612] 25 Feb 14:49:36.548 # +monitor master 54edcc79ca2895e4a300021f 127.1.1.45 38846 quorum 2
[42612] 25 Feb 14:49:36.548 # +monitor master 54edac43ca2895e4a30001e5 127.1.1.45 38961 quorum 2
[42612] 25 Feb 14:49:36.548 # +monitor master 54edac2cca2895e4a30001cd 127.1.1.46 38821 quorum 2
[42612] 25 Feb 14:49:36.550 * +slave slave 127.1.1.45:16379 127.1.1.45 16379 @ 54edac2cca2895e4a30001cd 127.1.1.46 38821
[42605] 25 Feb 14:49:36.603 * MASTER <-> SLAVE sync: receiving 18 bytes from master
[42605] 25 Feb 14:49:36.603 * MASTER <-> SLAVE sync: Flushing old data
[42605] 25 Feb 14:49:36.603 * MASTER <-> SLAVE sync: Loading DB in memory
[42605] 25 Feb 14:49:36.603 * MASTER <-> SLAVE sync: Finished with success
[42612] 25 Feb 14:49:37.700 * +sentinel sentinel 127.1.1.45:26379 127.1.1.45 26379 @ 54edac2cca2895e4a30001cd 127.1.1.46 38821
[42612] 25 Feb 14:49:37.748 * +sentinel sentinel 127.1.1.46:26379 127.1.1.46 26379 @ 54edac2cca2895e4a30001cd 127.1.1.46 38821
[42612] 25 Feb 14:49:46.632 # +sdown slave 127.1.1.45:16379 127.1.1.45 16379 @ 54edac2cca2895e4a30001cd 127.1.1.46 38821
[42612] 25 Feb 14:49:47.735 # +sdown sentinel 127.1.1.45:26379 127.1.1.45 26379 @ 54edac2cca2895e4a30001cd 127.1.1.46 38821
[42612] 25 Feb 14:49:47.793 # +sdown sentinel 127.1.1.46:26379 127.1.1.46 26379 @ 54edac2cca2895e4a30001cd 127.1.1.46 38821
[42612] 25 Feb 14:50:06.596 # +sdown master 54edcc79ca2895e4a300021f 127.1.1.45 38846
[42612] 25 Feb 14:50:06.596 # +sdown master 54edac43ca2895e4a30001e5 127.1.1.45 38961
[42605] 25 Feb 14:51:17.914 # Connection with master lost.
[42605] 25 Feb 14:51:17.914 * Caching the disconnected master state.
[42605] 25 Feb 14:51:18.649 * Connecting to MASTER 54edac2cca2895e4a30001cd-redis.ose.dr.myriadpayments.co.uk:38821
[42605] 25 Feb 14:51:18.650 * MASTER <-> SLAVE sync started
[42605] 25 Feb 14:51:18.650 # Error condition on socket for SYNC: Connection refused

Update:

As requested, here are the configs (with all comments removed):

Done with:
erb redis.conf.erb | grep -vE "(^[#]|^$)" > redis.conf && erb redis-sentinel.conf.erb | grep -vE "(^[#]|^$)" > redis-sentinel.conf

REDIS 54edcc79ca2895e4a300021f

daemonize yes
pidfile /var/lib/openshift/54edcc79ca2895e4a300021f/redis//pid/redis.pid
port 16379
bind 127.2.69.129
timeout 0
tcp-keepalive 0
loglevel notice
logfile /var/lib/openshift/54edcc79ca2895e4a300021f/redis//logs/redis.log
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/openshift/54edcc79ca2895e4a300021f/app-root/data//.redis/dbs/
slaveof 54edac2cca2895e4a30001cd-redis.openshift.zu 38821
masterauth ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5
slave-serve-stale-data yes
slave-read-only yes
repl-disable-tcp-nodelay no
slave-priority 100
requirepass ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5
appendonly no
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes

REDIS 54edac2cca2895e4a30001cd

daemonize yes
pidfile /var/lib/openshift/54edac2cca2895e4a30001cd/redis//pid/redis.pid
port 16379
bind 127.2.67.1
timeout 0
tcp-keepalive 0
loglevel notice
logfile /var/lib/openshift/54edac2cca2895e4a30001cd/redis//logs/redis.log
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/openshift/54edac2cca2895e4a30001cd/app-root/data//.redis/dbs/
slave-serve-stale-data yes
slave-read-only yes
repl-disable-tcp-nodelay no
slave-priority 100
requirepass ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5
appendonly no
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes

REDIS 54edac43ca2895e4a30001e5

daemonize yes
pidfile /var/lib/openshift/54edac43ca2895e4a30001e5/redis//pid/redis.pid
port 16379
bind 127.2.81.1
timeout 0
tcp-keepalive 0
loglevel notice
logfile /var/lib/openshift/54edac43ca2895e4a30001e5/redis//logs/redis.log
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/openshift/54edac43ca2895e4a30001e5/app-root/data//.redis/dbs/
slaveof 54edac2cca2895e4a30001cd-redis.openshift.zu 38821
masterauth ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5
slave-serve-stale-data yes
slave-read-only yes
repl-disable-tcp-nodelay no
slave-priority 100
requirepass ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5
appendonly no
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes

SENTINEL 54edcc79ca2895e4a300021f

pidfile /var/lib/openshift/54edcc79ca2895e4a300021f/redis//pid/redis-sentinel.pid
daemonize yes
logfile /var/lib/openshift/54edcc79ca2895e4a300021f/redis//logs/redis.log
bind 127.2.69.130
port 26379
sentinel monitor 54edac2cca2895e4a30001cd 54edac2cca2895e4a30001cd-redis.openshift.zu 38821 2
sentinel auth-pass 54edac2cca2895e4a30001cd ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5
sentinel down-after-milliseconds 54edac2cca2895e4a30001cd 10000
sentinel parallel-syncs 54edac2cca2895e4a30001cd 1
sentinel failover-timeout 54edac2cca2895e4a30001cd 30000
sentinel monitor 54edac43ca2895e4a30001e5 54edac43ca2895e4a30001e5-redis.openshift.zu 38961 2
sentinel auth-pass 54edac43ca2895e4a30001e5 ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5
sentinel down-after-milliseconds 54edac43ca2895e4a30001e5 10000
sentinel parallel-syncs 54edac43ca2895e4a30001e5 1
sentinel failover-timeout 54edac43ca2895e4a30001e5 30000
sentinel monitor 54edcc79ca2895e4a300021f 54edcc79ca2895e4a300021f-redis.openshift.zu 38846 2
sentinel auth-pass 54edcc79ca2895e4a300021f ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5
sentinel down-after-milliseconds 54edcc79ca2895e4a300021f 10000
sentinel parallel-syncs 54edcc79ca2895e4a300021f 1
sentinel failover-timeout 54edcc79ca2895e4a300021f 30000

SENTINEL 54edac2cca2895e4a30001cd

pidfile /var/lib/openshift/54edac2cca2895e4a30001cd/redis//pid/redis-sentinel.pid
daemonize yes
logfile /var/lib/openshift/54edac2cca2895e4a30001cd/redis//logs/redis.log
bind 127.2.67.2
port 26379
sentinel monitor 54edac2cca2895e4a30001cd 54edac2cca2895e4a30001cd-redis.openshift.zu 38821 2
sentinel auth-pass 54edac2cca2895e4a30001cd ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5
sentinel down-after-milliseconds 54edac2cca2895e4a30001cd 10000
sentinel parallel-syncs 54edac2cca2895e4a30001cd 1
sentinel failover-timeout 54edac2cca2895e4a30001cd 30000

SENTINEL 54edac43ca2895e4a30001e5

pidfile /var/lib/openshift/54edac43ca2895e4a30001e5/redis//pid/redis-sentinel.pid
daemonize yes
logfile /var/lib/openshift/54edac43ca2895e4a30001e5/redis//logs/redis.log
bind 127.2.81.2
port 26379
sentinel monitor 54edac2cca2895e4a30001cd 54edac2cca2895e4a30001cd-redis.openshift.zu 38821 2
sentinel auth-pass 54edac2cca2895e4a30001cd ZTNiMGM0NDI5OGZjMWMxNDlhZmJmNGM4OTk2ZmI5
sentinel down-after-milliseconds 54edac2cca2895e4a30001cd 10000
sentinel parallel-syncs 54edac2cca2895e4a30001cd 1
sentinel failover-timeout 54edac2cca2895e4a30001cd 30000

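Looking at your configuration, your sentinels and your Redis instances use the same names. They also share a log file, which makes the output confusing to read. That is not what you want.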
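Both overlaps can be spotted mechanically. The following is a minimal sketch, not part of the cartridge: the helper names are made up for illustration, and the two config strings are short excerpts copied from the configs in the question.

```python
def directive_values(conf_text, name):
    """Return the argument string of every `name` directive in a Redis-style config."""
    values = []
    for line in conf_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == name:
            values.append(parts[1].strip())
    return values


def monitored_names(sentinel_conf):
    """Names given to monitored masters via `sentinel monitor <name> ...`."""
    names = []
    for value in directive_values(sentinel_conf, "sentinel"):
        tokens = value.split()
        if len(tokens) >= 2 and tokens[0] == "monitor":
            names.append(tokens[1])
    return names


def shared_logfiles(redis_conf, sentinel_conf):
    """Log file paths that appear in both configs."""
    return sorted(set(directive_values(redis_conf, "logfile"))
                  & set(directive_values(sentinel_conf, "logfile")))


# Excerpts from the REDIS and SENTINEL configs of gear 54edcc79ca2895e4a300021f.
redis_conf = """\
port 16379
logfile /var/lib/openshift/54edcc79ca2895e4a300021f/redis//logs/redis.log
"""
sentinel_conf = """\
port 26379
logfile /var/lib/openshift/54edcc79ca2895e4a300021f/redis//logs/redis.log
sentinel monitor 54edac2cca2895e4a30001cd 54edac2cca2895e4a30001cd-redis.openshift.zu 38821 2
"""

print("shared log files:", shared_logfiles(redis_conf, sentinel_conf))
print("monitored names:", monitored_names(sentinel_conf))
```

A non-empty first list means Redis and Sentinel write to the same file. The second list shows that the monitored name is a gear ID rather than a name for the master/slave group.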

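What you want is:

3 sentinel instances with unique names.

One Redis master plus one Redis slave (together these form a "pod"; give it a unique name that identifies the combination, not any one node in it).

You then add the pod to your sentinel configuration and set its password. Assuming you name the pod "pod1", your configuration would look like this: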

sentinel monitor pod1 <master-ip> <master-port> 2
sentinel auth-pass pod1 <the-master-auth-pass-and-requirepass-setting>
sentinel down-after-milliseconds pod1 10000
sentinel parallel-syncs pod1 1
sentinel failover-timeout pod1 30000
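If you end up templating this for several pods, the stanza is easy to generate. Here is a small sketch under stated assumptions: `sentinel_stanza` is a hypothetical helper, and the IP address, port, and password in the usage line are placeholders, not values from your setup. Note that every per-master directive, including `auth-pass`, takes the pod name as its first argument.

```python
def sentinel_stanza(pod, master_ip, master_port, quorum, password,
                    down_after_ms=10000, parallel_syncs=1,
                    failover_timeout_ms=30000):
    """Build the Sentinel config lines for one monitored pod.

    The defaults mirror the timing values used in the question's configs.
    """
    lines = [
        f"sentinel monitor {pod} {master_ip} {master_port} {quorum}",
        f"sentinel auth-pass {pod} {password}",
        f"sentinel down-after-milliseconds {pod} {down_after_ms}",
        f"sentinel parallel-syncs {pod} {parallel_syncs}",
        f"sentinel failover-timeout {pod} {failover_timeout_ms}",
    ]
    return "\n".join(lines)


# Placeholder address and password, for illustration only.
print(sentinel_stanza("pod1", "192.0.2.10", 6379, 2, "example-password"))
```

Calling it once per pod and concatenating the results gives one sentinel config that manages several pods.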

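While you can run one sentinel constellation per pod, it is more efficient to use a single constellation to manage multiple pods. Likewise, you do not want the sentinels running on the same hosts as the Redis instances they monitor; otherwise you may lose them exactly when you need them most.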
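To make the co-location risk concrete, here is a rough sketch that counts the sentinels left after one host dies. The host names and layouts are invented for illustration. The check relies on two Sentinel rules: the quorum must agree that the master is down, and a majority of all known sentinels must authorize the failover.

```python
def sentinels_left(placement, failed_host):
    """Count the sentinels still running after `failed_host` goes down."""
    return sum(roles.count("sentinel")
               for host, roles in placement.items() if host != failed_host)


def can_fail_over(placement, failed_host, quorum):
    """True if the surviving sentinels can both detect and authorize a failover."""
    total = sum(roles.count("sentinel") for roles in placement.values())
    alive = sentinels_left(placement, failed_host)
    majority = total // 2 + 1
    # Quorum: enough sentinels to agree the master is down.
    # Majority: enough sentinels to elect a leader and authorize the failover.
    return alive >= quorum and alive >= majority


# Hypothetical layout 1: the sentinels share hosts with the Redis nodes.
co_located = {
    "host-a": ["master", "sentinel"],
    "host-b": ["slave", "sentinel"],
}

# Hypothetical layout 2: the sentinels run on their own hosts.
separated = {
    "host-a": ["master"],
    "host-b": ["slave"],
    "host-c": ["sentinel"],
    "host-d": ["sentinel"],
    "host-e": ["sentinel"],
}

print(can_fail_over(co_located, "host-a", quorum=2))
print(can_fail_over(separated, "host-a", quorum=2))
```

In the first layout, losing the master's host also removes one of the two sentinels, so the quorum of 2 can no longer be reached. In the second, all three sentinels survive the loss of the master.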

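Unfortunately, I am not familiar with OpenShift, so I do not know what you have to do to get it to configure things this way. Still, knowing what the configuration should look like should help you confirm whether it is correct.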
