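Basic Concepts
Slot
Redis Cluster partitions the key space into 16384 slots. A key is mapped to a slot as follows (CRC16 is a cyclic redundancy check that turns a string into a 16-bit number):
slot = CRC16(key) mod 16384
Because keys are spread across slots, multi-key operations (scan/mget/keys…) are not supported when the keys involved live in different slots. Redis Cluster therefore provides hash tags, which guarantee that several keys are assigned to the same slot so that multi-key operations work on them. Hash tags can easily cause data skew, so use them with care. To use one, put {hash_tag} in front of the key name, e.g. {foo}student. Only the text inside the braces is hashed.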
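The mapping above can be sketched in a few lines of Python. This is a minimal sketch, assuming the CRC16 variant is CRC-16/XMODEM (polynomial 0x1021, initial value 0) as described in the Redis Cluster specification. The names crc16 and key_slot are illustrative and do not belong to any Redis client library.

```python
TOTAL_SLOTS = 16384


def crc16(data: bytes) -> int:
    """CRC-16/XMODEM: polynomial 0x1021, initial value 0, no bit reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the slot of a key, hashing only the hash tag if one is present."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # An empty tag such as "{}" is ignored and the whole key is hashed.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return crc16(data) % TOTAL_SLOTS


print(key_slot("foo"))           # 12182
print(key_slot("{foo}student"))  # 12182, same slot because only "foo" is hashed
```

Keys that share a tag always land in the same slot, which is also why a heavily used tag concentrates data on one node.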
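Gossip Protocol
Redis Cluster uses a gossip protocol to synchronize state across the cluster and to drive key cluster functions such as elections and automatic failover.
The gossip protocol carries several message types (Redis Cluster uses port redis_port+10000 for node-to-node communication):
- meet: a node in the cluster sends meet to a new node to invite it into the cluster; the new node then starts communicating with the other nodes.
- ping: every node frequently sends ping to other nodes; each ping carries the sender's own state plus the cluster metadata it maintains, so nodes exchange metadata through pings.
- pong: the reply to ping and meet; it carries the sender's state and other information, and can also be used to broadcast information and updates.
- fail: once a node decides that another node has failed, it sends fail to the other nodes to tell them that the given node is down.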
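Since each node talks to its peers on redis_port+10000, a firewall has to allow both the client port and the node-to-node port. A small sketch of that arithmetic follows. The function names are hypothetical and the generated rule text simply follows the common iptables form, so treat it as an illustration rather than a ready-made firewall script.

```python
from typing import List

BUS_PORT_OFFSET = 10000  # node-to-node port = redis_port + 10000


def bus_port(redis_port: int) -> int:
    """Return the node-to-node port that belongs to a client port."""
    port = redis_port + BUS_PORT_OFFSET
    if not 0 < redis_port < port <= 65535:
        raise ValueError(f"port {redis_port} leaves no room for a bus port")
    return port


def iptables_rules(first_port: int, last_port: int) -> List[str]:
    """Build ACCEPT rules for a client port range and its bus port range."""
    ranges = [(first_port, last_port), (bus_port(first_port), bus_port(last_port))]
    return [
        "iptables -A INPUT -p tcp -m state --state NEW -m tcp "
        f"--dport {low}:{high} -j ACCEPT"
        for low, high in ranges
    ]


for rule in iptables_rules(8000, 8006):
    print(rule)
```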
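Failover Mechanism
Failover is the fault-tolerance mechanism of Redis Cluster. It comes in two forms:
- Automatic failover: cluster availability is restored automatically
- Manual failover: an operator restores cluster availability by hand
Failure detection:
- A node that does not answer PING within node timeout is marked PFAIL
- The PFAIL mark spreads through gossip
- When a majority of master nodes have marked the node PFAIL, its state is changed to FAIL and the change is broadcast
Automatic failover process:
- A slave detects that its master is FAIL
- The slave increments its recorded currentEpoch by 1 and broadcasts a Failover Request (FAILOVER_AUTH_REQUEST) message
- Every node receives the broadcast, but only masters may respond; each master checks that the request is valid and, if so, replies with FAILOVER_AUTH_ACK
- The slave collects the FAILOVER_AUTH_ACK messages and promotes itself to master once a majority of masters agree
- After the promotion succeeds, it notifies all nodes through PONG messages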
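The majority rules in the two lists above can be sketched as follows. This is a simplified model, assuming that only masters vote and that a strict majority (more than half) is required. The function names are illustrative, and a real node tracks far more state (epochs, timeouts, replication offsets) than this sketch does.

```python
from typing import Iterable


def majority(master_count: int) -> int:
    """Smallest number of masters that forms a strict majority."""
    return master_count // 2 + 1


def is_fail(pfail_reports: Iterable[str], master_count: int) -> bool:
    """PFAIL becomes FAIL once a majority of distinct masters report it."""
    return len(set(pfail_reports)) >= majority(master_count)


def wins_election(acks: Iterable[str], master_count: int) -> bool:
    """A slave is promoted once a majority of masters sent FAILOVER_AUTH_ACK."""
    return len(set(acks)) >= majority(master_count)


# Three masters, one of them down: the two survivors are still a majority.
print(is_fail({"m2", "m3"}, master_count=3))        # True
print(wins_election({"m2", "m3"}, master_count=3))  # True
# Two of three masters down: one vote is not enough for automatic failover.
print(wins_election({"m3"}, master_count=3))        # False
```

The last case is the reason a cluster that loses more than half of its masters cannot recover on its own.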
Installation
System parameter tuning
vm.overcommit_memory=1
Dependencies
yum -y install zlib zlib-devel openssl openssl-devel gcc gcc-c++
ruby
wget -O ruby-2.4.4.tar.gz 'https://cache.ruby-lang.org/pub/ruby/2.4/ruby-2.4.4.tar.gz' --no-check-certificate
ruby -v
rubygems
wget -O rubygems-2.7.6.tgz 'https://rubygems.org/rubygems/rubygems-2.7.6.tgz' --no-check-certificate
gem -v
redis.gem
redis-4.x.x.gem has a bug: scaling out or in with reshard|rebalance does not work and fails with "Syntax error ,try CLIENT (LIST|KILL|GETNAME|SETNAME|PAUSE|REPLY)".
With redis-3.x.x.gem, reshard|rebalance can be called, but that version does not support passwords, so it is not suitable for clusters with password authentication.
4.x.x version
wget -O redis-4.0.2.gem 'https://rubygems.org/downloads/redis-4.0.2.gem' --no-check-certificate
gem install -l redis-4.0.2.gem
gem list redis
*** LOCAL GEMS ***
redis (4.0.2)
3.x.x version
# gem uninstall redis --version 4.0.2
wget -O redis-3.3.3.gem 'https://rubygems.org/downloads/redis-3.3.3.gem' --no-check-certificate
gem install -l redis-3.3.3.gem
tcl
wget -O tcl868-src.zip 'https://jaist.dl.sourceforge.net/project/tcl/Tcl/8.6.8/tcl868-src.zip' --no-check-certificate
redis-4.0.11
wget -O redis-4.0.11.tar.gz 'http://download.redis.io/releases/redis-4.0.11.tar.gz'
Configuration
su - redis -c "mkdir -p /usr/local/redis-cluster/conf /data/logs/redis/ /data/redis/{8000,8001,8002}"
# /usr/local/redis-cluster/redis.conf
# /usr/local/redis-cluster/conf/8000.conf
# /usr/local/redis-cluster/conf/8001.conf
Cluster
Start the Redis instances
su - redis -c "/usr/local/redis-cluster/bin/redis-server /usr/local/redis-cluster/conf/8000.conf"
Create the Redis cluster
iptables
iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 8000:8006 -j ACCEPT
iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 18000:18006 -j ACCEPT
Set the password
# Edit the :password entry in
/usr/local/ruby/lib/ruby/gems/2.4.0/gems/redis-4.0.2/lib/redis/client.rb
#/usr/local/ruby/lib/ruby/gems/2.4.0/gems/redis-3.3.3/lib/redis/client.rb
DEFAULTS = {
  :url => lambda { ENV["REDIS_URL"] },
  :scheme => "redis",
  :host => "127.0.0.1",
  :port => 6379,
  :path => nil,
  :timeout => 5.0,
  :password => nil, # change this value
  :db => 0,
  :driver => nil,
  :id => nil,
  :tcp_keepalive => 0,
  :reconnect_attempts => 1,
  :inherit_socket => false
}
Create the cluster
Assign master/slave roles automatically
redis-trib.rb create --replicas 1 192.168.1.180:8000 192.168.1.180:8001 192.168.1.180:8002 192.168.1.181:8003 192.168.1.181:8004 192.168.1.181:8005
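To see what --replicas 1 means for the six nodes above: the node count is divided by replicas + 1 to get the number of masters, and the 16384 slots are split roughly evenly among them. The sketch below is an illustration under assumptions. The function names are hypothetical, and the rounding rule was chosen so that three masters yield the ranges 0-5460, 5461-10922 and 10923-16383 that appear in the sample redis-trib.rb output later in this section. It is not the tool's actual code.

```python
from typing import List, Tuple

TOTAL_SLOTS = 16384


def master_count(node_count: int, replicas: int) -> int:
    """Number of masters when every master gets `replicas` slaves."""
    return node_count // (replicas + 1)


def split_slots(masters: int) -> List[Tuple[int, int]]:
    """Split slots 0..16383 into `masters` contiguous, near-equal ranges."""
    if masters < 1:
        raise ValueError("at least one master is required")
    per_master = TOTAL_SLOTS / masters
    ranges = []
    first, cursor = 0, 0.0
    for i in range(masters):
        # Round half up; the last master always takes the remainder.
        last = int(cursor + per_master - 1 + 0.5)
        if i == masters - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


masters = master_count(node_count=6, replicas=1)
print(masters)               # 3
print(split_slots(masters))  # [(0, 5460), (5461, 10922), (10923, 16383)]
```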
Assign master/slave roles manually
Create the masters
redis-trib.rb create 192.168.1.180:8000 192.168.1.180:8001 192.168.1.180:8002
>>> Creating cluster
>>> Performing hash slots allocation on 3 nodes...
Using 3 masters:
192.168.1.180:8000
192.168.1.180:8001
192.168.1.180:8002
M: 673e32925e0a6f9beefac8aeaad8a397758c5e47 192.168.1.180:8000
slots:0-5460 (5461 slots) master
M: 605950ea5c2214f50d5f3dddd87c80f4e7d1b631 192.168.1.180:8001
slots:5461-10922 (5462 slots) master
M: a923e10183ef356bcadc1566503be4ab1ea1adb6 192.168.1.180:8002
slots:10923-16383 (5461 slots) master
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join.
>>> Performing Cluster Check (using node 192.168.1.180:8000)
M: 673e32925e0a6f9beefac8aeaad8a397758c5e47 192.168.1.180:8000
slots:0-5460 (5461 slots) master
0 additional replica(s)
M: 605950ea5c2214f50d5f3dddd87c80f4e7d1b631 192.168.1.180:8001
slots:5461-10922 (5462 slots) master
0 additional replica(s)
M: a923e10183ef356bcadc1566503be4ab1ea1adb6 192.168.1.180:8002
slots:10923-16383 (5461 slots) master
0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Create the slaves
# Add a slave: join 192.168.1.181:8003 to the cluster of 192.168.1.180:8000 as a slave of the master with the given <node_id>
redis-trib.rb add-node --slave --master-id 673e32925e0a6f9beefac8aeaad8a397758c5e47 192.168.1.181:8003 192.168.1.180:8000
>>> Adding node 192.168.1.181:8003 to cluster 192.168.1.180:8000
>>> Performing Cluster Check (using node 192.168.1.180:8000)
M: 673e32925e0a6f9beefac8aeaad8a397758c5e47 192.168.1.180:8000
slots:0-5460 (5461 slots) master
0 additional replica(s)
M: 605950ea5c2214f50d5f3dddd87c80f4e7d1b631 192.168.1.180:8001
slots:5461-10922 (5462 slots) master
0 additional replica(s)
M: a923e10183ef356bcadc1566503be4ab1ea1adb6 192.168.1.180:8002
slots:10923-16383 (5461 slots) master
0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 192.168.1.181:8003 to make it join the cluster.
Waiting for the cluster to join...
>>> Configure node as replica of 192.168.1.180:8000.
[OK] New node added correctly.
redis-trib.rb add-node --slave --master-id 605950ea5c2214f50d5f3dddd87c80f4e7d1b631 192.168.1.181:8004 192.168.1.180:8000
redis-trib.rb add-node --slave --master-id a923e10183ef356bcadc1566503be4ab1ea1adb6 192.168.1.181:8005 192.168.1.180:8000
View cluster information
redis-trib.rb check 192.168.1.180:8000
>>> Performing Cluster Check (using node 192.168.1.180:8000)
M: 673e32925e0a6f9beefac8aeaad8a397758c5e47 192.168.1.180:8000
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: dea85bf87e560b6a5074f60965b8ad334bfeb5e8 192.168.1.181:8004
slots: (0 slots) slave
replicates 605950ea5c2214f50d5f3dddd87c80f4e7d1b631
M: a923e10183ef356bcadc1566503be4ab1ea1adb6 192.168.1.180:8002
slots:10923-16383 (5461 slots) master
1 additional replica(s)
M: 605950ea5c2214f50d5f3dddd87c80f4e7d1b631 192.168.1.180:8001
slots:5461-10922 (5462 slots) master
1 additional replica(s)
S: 6f12a4c4e1c60f435f68fbce1b72dc60ac73de83 192.168.1.181:8005
slots: (0 slots) slave
replicates a923e10183ef356bcadc1566503be4ab1ea1adb6
S: 74bfaa76306dd6bc59e559d012203ceed2a8ab24 192.168.1.181:8003
slots: (0 slots) slave
replicates 673e32925e0a6f9beefac8aeaad8a397758c5e47
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
192.168.1.180:8000> CLUSTER NODES
dea85bf87e560b6a5074f60965b8ad334bfeb5e8 192.168.1.181:8004@18004 slave 605950ea5c2214f50d5f3dddd87c80f4e7d1b631 0 1541155161000 2 connected
673e32925e0a6f9beefac8aeaad8a397758c5e47 192.168.1.180:8000@18000 myself,master - 0 1541155160000 1 connected 0-5460
a923e10183ef356bcadc1566503be4ab1ea1adb6 192.168.1.180:8002@18002 master - 0 1541155162973 3 connected 10923-16383
605950ea5c2214f50d5f3dddd87c80f4e7d1b631 192.168.1.180:8001@18001 master - 0 1541155160000 2 connected 5461-10922
6f12a4c4e1c60f435f68fbce1b72dc60ac73de83 192.168.1.181:8005@18005 slave a923e10183ef356bcadc1566503be4ab1ea1adb6 0 1541155162000 3 connected
74bfaa76306dd6bc59e559d012203ceed2a8ab24 192.168.1.181:8003@18003 slave 673e32925e0a6f9beefac8aeaad8a397758c5e47 0 1541155162000 1 connected
192.168.1.180:8000> CLUSTER INFO
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:3
cluster_my_epoch:1
cluster_stats_messages_ping_sent:1756
cluster_stats_messages_pong_sent:1713
cluster_stats_messages_sent:3469
cluster_stats_messages_ping_received:1708
cluster_stats_messages_pong_received:1756
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:3469
192.168.1.180:8000> CLUSTER SLOTS
1) 1) (integer) 0
   2) (integer) 5460
   3) 1) "192.168.1.180"
      2) (integer) 8000
      3) "673e32925e0a6f9beefac8aeaad8a397758c5e47"
   4) 1) "192.168.1.181"
      2) (integer) 8003
      3) "74bfaa76306dd6bc59e559d012203ceed2a8ab24"
2) 1) (integer) 10923
   2) (integer) 16383
   3) 1) "192.168.1.180"
      2) (integer) 8002
      3) "a923e10183ef356bcadc1566503be4ab1ea1adb6"
   4) 1) "192.168.1.181"
      2) (integer) 8005
      3) "6f12a4c4e1c60f435f68fbce1b72dc60ac73de83"
3) 1) (integer) 5461
   2) (integer) 10922
   3) 1) "192.168.1.180"
      2) (integer) 8001
      3) "605950ea5c2214f50d5f3dddd87c80f4e7d1b631"
   4) 1) "192.168.1.181"
      2) (integer) 8004
      3) "dea85bf87e560b6a5074f60965b8ad334bfeb5e8"
redis-trib.rb info 192.168.1.180:8000
192.168.1.180:8000 (673e3292...) -> 1 keys | 5461 slots | 1 slaves.
192.168.1.180:8002 (a923e101...) -> 1 keys | 5461 slots | 1 slaves.
192.168.1.180:8001 (605950ea...) -> 1 keys | 5462 slots | 1 slaves.
[OK] 3 keys in 3 masters.
0.00 keys per slot on average.
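Each CLUSTER NODES line shown above has the same space-separated layout: node id, address (ip:port@bus_port), flags, master id (or -), ping-sent, pong-received, config epoch, link state, then any slot ranges. A minimal parser sketch follows. The ClusterNode class and the parse_cluster_nodes function are hypothetical names, the sketch assumes the Redis 4.x address format with @bus_port, and slot entries in migrating/importing bracket notation are simply skipped.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ClusterNode:
    node_id: str
    host: str
    port: int
    bus_port: int
    flags: List[str]
    master_id: Optional[str]
    config_epoch: int
    connected: bool
    slots: List[Tuple[int, int]] = field(default_factory=list)


def parse_cluster_nodes(text: str) -> List[ClusterNode]:
    """Parse CLUSTER NODES output into ClusterNode records."""
    nodes = []
    for line in text.strip().splitlines():
        parts = line.split()
        node_id, addr, flags, master = parts[:4]
        epoch, link = parts[6], parts[7]
        hostport, _, bus = addr.partition("@")
        host, _, port = hostport.rpartition(":")
        slots = []
        for item in parts[8:]:
            if item.startswith("["):  # migrating/importing entry, skipped here
                continue
            start, _, end = item.partition("-")
            slots.append((int(start), int(end or start)))
        nodes.append(ClusterNode(
            node_id, host, int(port), int(bus), flags.split(","),
            None if master == "-" else master,
            int(epoch), link == "connected", slots))
    return nodes


sample = """\
673e32925e0a6f9beefac8aeaad8a397758c5e47 192.168.1.180:8000@18000 myself,master - 0 1541155160000 1 connected 0-5460
74bfaa76306dd6bc59e559d012203ceed2a8ab24 192.168.1.181:8003@18003 slave 673e32925e0a6f9beefac8aeaad8a397758c5e47 0 1541155162000 1 connected
"""
for node in parse_cluster_nodes(sample):
    role = "master" if "master" in node.flags else "slave"
    print(node.port, role, node.slots)
```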
Common Administration Commands
cluster cmd
Native Redis Cluster command set
//cluster
CLUSTER INFO  Print information about the cluster.
CLUSTER NODES  List all nodes currently known to the cluster, together with their details.
//node
CLUSTER MEET <ip> <port>  Add the node at ip and port to the cluster, making it a member.
CLUSTER FORGET <node_id>  Remove the node identified by node_id from the cluster.
CLUSTER REPLICATE <node_id>  Make the current node a slave of the master identified by node_id.
CLUSTER SAVECONFIG  Save the node's configuration file to disk.
//slot
CLUSTER ADDSLOTS <slot> [slot ...]  Assign one or more slots to the current node.
CLUSTER DELSLOTS <slot> [slot ...]  Remove the assignment of one or more slots from the current node.
CLUSTER FLUSHSLOTS  Remove all slots assigned to the current node, leaving it with no slots.
CLUSTER SETSLOT <slot> NODE <node_id>  Assign slot to the node identified by node_id.
CLUSTER SETSLOT <slot> MIGRATING <node_id>  Migrate slot from this node to the node identified by node_id.
CLUSTER SETSLOT <slot> IMPORTING <node_id>  Import slot into this node from the node identified by node_id.
CLUSTER SETSLOT <slot> STABLE  Cancel the import or migration of slot.
//key
CLUSTER KEYSLOT <key>  Compute the slot that key belongs to.
CLUSTER COUNTKEYSINSLOT <slot>  Return the number of key-value pairs currently in slot.
CLUSTER GETKEYSINSLOT <slot> <count>  Return up to count keys from slot.
//added
CLUSTER SLAVES node-id  Return the list of slaves of a master node.
Change the master of a slave
# Log in to the slave to be changed
redis-cli -c -h 192.168.1.181 -p 8006
# Use the cluster command to change its master
192.168.1.181:8006> CLUSTER REPLICATE <new_master_id>
Manually scale in a master (migrate slots)
# Check the cluster state
> CLUSTER NODES
74bfaa76306dd6bc59e559d012203ceed2a8ab24 192.168.1.181:8003@18003 slave a923e10183ef356bcadc1566503be4ab1ea1adb6 0 1542794664779 3 connected
d976e1cc897744d5e5c2a0f754662c9d2a7cc077 192.168.1.181:8006@18006 myself,master - 0 1542794660000 12 connected 5461-5558 10923-10950
a923e10183ef356bcadc1566503be4ab1ea1adb6 192.168.1.180:8002@18002 master - 0 1542794663778 3 connected 10951-16383
dea85bf87e560b6a5074f60965b8ad334bfeb5e8 192.168.1.181:8004@18004 slave 6f12a4c4e1c60f435f68fbce1b72dc60ac73de83 0 1542794661000 11 connected
605950ea5c2214f50d5f3dddd87c80f4e7d1b631 192.168.1.180:8001@18001 master - 0 1542794664000 10 connected 5559-10922
673e32925e0a6f9beefac8aeaad8a397758c5e47 192.168.1.180:8000@18000 slave 605950ea5c2214f50d5f3dddd87c80f4e7d1b631 0 1542794664000 10 connected
6f12a4c4e1c60f435f68fbce1b72dc60ac73de83 192.168.1.181:8005@18005 master - 0 1542794665780 11 connected 0-5460
Migrate the slots of 8006 to 8001
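# Migration steps
> On the target node, run cluster setslot <slot> IMPORTING <source node ID> to name the slot to migrate and the source node.
> On the source node, run cluster setslot <slot> MIGRATING <node ID> to name the slot to migrate and the target node.
> On the source node, run cluster getkeysinslot to get the list of keys in the slot.
> On the source node, run migrate for each key; the command moves the key to the target node synchronously.
> On the source node, repeat cluster getkeysinslot until the key list of the slot is empty.
> On both the source node and the target node, run cluster setslot <slot> NODE <target node ID> to finish the migration.
# Migrate the slots with a script
migrate_slot.sh 192.168.1.181 8006 192.168.1.180 8001 5461 5558 "<redis_password>"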
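The migration steps above can be sketched as an ordered command plan. This is only an illustration under assumptions: the function name slot_migration_plan is hypothetical, the keys are passed in rather than fetched with cluster getkeysinslot, the 5000 ms MIGRATE timeout and destination db 0 are arbitrary choices, and nothing here talks to a real server. The contents of migrate_slot.sh are not shown in this article, so this is not that script.

```python
from typing import List, Tuple


def slot_migration_plan(slot: int, source_id: str, target_id: str,
                        target_host: str, target_port: int,
                        keys: List[str]) -> List[Tuple[str, str]]:
    """Return (node, command) pairs in the order they should be run."""
    plan = [
        ("target", f"CLUSTER SETSLOT {slot} IMPORTING {source_id}"),
        ("source", f"CLUSTER SETSLOT {slot} MIGRATING {target_id}"),
    ]
    if keys:
        # MIGRATE with an empty key argument plus KEYS moves several keys at once.
        # Keys containing spaces would need quoting, which this sketch ignores.
        key_list = " ".join(keys)
        plan.append(("source",
                     f'MIGRATE {target_host} {target_port} "" 0 5000 KEYS {key_list}'))
    plan.append(("source", f"CLUSTER SETSLOT {slot} NODE {target_id}"))
    plan.append(("target", f"CLUSTER SETSLOT {slot} NODE {target_id}"))
    return plan


# Slot 5461 from 8006 (source) to 8001 (target), with two placeholder keys.
steps = slot_migration_plan(
    5461,
    "d976e1cc897744d5e5c2a0f754662c9d2a7cc077",
    "605950ea5c2214f50d5f3dddd87c80f4e7d1b631",
    "192.168.1.180", 8001, ["key1", "key2"])
for node, command in steps:
    print(f"[{node}] {command}")
```

A real run would repeat the getkeysinslot and migrate pair until the slot is empty before issuing the final two commands.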
redis-trib.rb cmd
Command set of the official Redis Cluster management tool
Create a cluster
redis-trib.rb create
Create master nodes only
redis-trib.rb create <master1 ip:port> <master2 ip:port> <master3 ip:port>
e.g. redis-trib.rb create 192.168.1.180:8000 192.168.1.180:8001 192.168.1.180:8002
Assign master/slave nodes automatically
redis-trib.rb create --replicas 1 <redis1 ip:port> <redis2 ip:port> <redis3 ip:port> <redis4 ip:port> <redis5 ip:port> <redis6 ip:port>
e.g. redis-trib.rb create --replicas 1 192.168.1.180:8000 192.168.1.180:8001 192.168.1.180:8002 192.168.1.181:8003 192.168.1.181:8004 192.168.1.181:8005
Check the cluster
redis-trib.rb check
redis-trib.rb check <redis ip:port>
e.g. redis-trib.rb check 192.168.1.180:8000
View cluster information
redis-trib.rb info
redis-trib.rb info <redis ip:port>
e.g. redis-trib.rb info 192.168.1.180:8000
Repair the cluster
redis-trib.rb fix
redis-trib.rb fix <redis ip:port>
e.g. redis-trib.rb fix 192.168.1.180:8000
Add a node
redis-trib.rb add-node
Add a master node
redis-trib.rb add-node <new master ip:port> <one of cluster ip:port>
e.g. redis-trib.rb add-node 192.168.1.181:8006 192.168.1.180:8000
Add a slave node
# Add <new slave ip:port> to the cluster as a slave of the node <master node id>
redis-trib.rb add-node --slave --master-id <master node id> <new slave ip:port> <one of cluster ip:port>
e.g. redis-trib.rb add-node --slave --master-id 673e32925e0a6f9beefac8aeaad8a397758c5e47 192.168.1.181:8007 192.168.1.180:8000
Delete a node
redis-trib.rb del-node
Only nodes with no slots assigned can be deleted (a slave or an empty master).
redis-trib.rb del-node <one of cluster ip:port> <delete node id>
e.g. redis-trib.rb del-node 192.168.1.180:8000 6f12a4c4e1c60f435f68fbce1b72dc60ac73de83
Migrate slots online
redis-trib.rb reshard
redis-4.x.x.gem has a bug: scaling out or in with reshard|rebalance does not work and fails with "Syntax error ,try CLIENT (LIST|KILL|GETNAME|SETNAME|PAUSE|REPLY)".
With redis-3.x.x.gem, reshard|rebalance can be called, but that version does not support passwords, so it is not suitable for clusters with password authentication.
# Interactive
redis-trib.rb reshard <one of cluster ip:port>
# Non-interactive
redis-trib.rb reshard --from <all|node1_id,node2_id,node3_id> --to <dest node_id> --slots <numbers of slots> <one of cluster ip:port>
e.g. redis-trib.rb reshard --from all --to 673e32925e0a6f9beefac8aeaad8a397758c5e47 --slots 5461 192.168.1.180:8000
e.g. redis-trib.rb reshard --from 605950ea5c2214f50d5f3dddd87c80f4e7d1b631,a923e10183ef356bcadc1566503be4ab1ea1adb6 --to 673e32925e0a6f9beefac8aeaad8a397758c5e47 --slots 5461 192.168.1.180:8000
Rebalance slots
redis-trib.rb rebalance
redis-4.x.x.gem has a bug: scaling out or in with reshard|rebalance does not work and fails with "Syntax error ,try CLIENT (LIST|KILL|GETNAME|SETNAME|PAUSE|REPLY)".
With redis-3.x.x.gem, reshard|rebalance can be called, but that version does not support passwords, so it is not suitable for clusters with password authentication.
# Simple rebalance across all nodes
redis-trib.rb rebalance 192.168.1.180:8000
# Advanced rebalance
redis-trib.rb rebalance --threshold 1 --weight b31e3a2e=5 --weight 60b8e3a1=5 --use-empty-masters --simulate 192.168.1.180:8000
Set the heartbeat timeout between nodes
redis-trib.rb set-timeout
redis-trib.rb set-timeout <one of cluster ip:port> <timeout>
e.g. redis-trib.rb set-timeout 192.168.1.180:8000 30000
Run a command on every node in the cluster
redis-trib.rb call
redis-trib.rb call <one of cluster ip:port> <redis command>
e.g. redis-trib.rb call 192.168.1.180:8000 get key
e.g. redis-trib.rb call 192.168.1.180:8000 rconfig rewrite
redis-trib.rb call 10.10.10.171:7004 rconfig set masterauth "bincentTO*TOredis"
redis-trib.rb call 10.10.10.171:7004 rconfig set requirepass "bincentTO*TOredis"
redis-trib.rb call 10.10.10.171:7004 rconfig rewrite
Import data
redis-trib.rb import
redis-trib.rb import --from <source ip:port> <one of cluster ip:port>
# --copy keeps the keys on the old Redis instance
# --replace overwrites keys of the same name in the cluster; without it, such keys are not imported
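Redis Tools
redis-faina: analyzes hot keys and top commands. Note that this script relies on the monitor command for its analysis.
redis-rdb-tools: performs a full big-key analysis from an RDB file. Redis itself can also look for big keys with redis-cli --bigkeys.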
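Redis Cluster Tests
Simulated scenario | As expected? | Business impact | Conclusion |
---|---|---|---|
1 slave down | Yes (automatic migration took 15s) | No business impact | When a master is left without a slave and a spare slave exists, the spare is switched over automatically, so cluster functionality is unaffected. The 15s is determined by the cluster-node-timeout parameter |
2 slaves down | Yes | No business impact | A master without a slave does not affect cluster functionality, and a slave going down does not affect cluster functionality |
3 slaves down | Yes | No business impact | A master without a slave does not affect cluster functionality, and a slave going down does not affect cluster functionality |
1 master down | Yes (a slave was promoted to master automatically in 15s) | Business affected for 15s | While the slave is being promoted (15s), the slots of the failed master are unavailable and every operation on them is affected. Because cluster-require-full-coverage=no is set, the 2 surviving masters keep serving normally. The 15s is determined by the cluster-node-timeout parameter |
2 masters down | Recovered with FAILOVER FORCE / TAKEOVER | Business impacted | More than half of the masters are down, so the cluster enters the fail state, slaves are not promoted automatically, and the whole cluster is unavailable. Do not deploy several masters on one machine |
3 masters down | Yes | Business impacted | All masters are down, the cluster enters the fail state, and slaves are not promoted automatically. The whole cluster is unavailable |
1 master/slave pair down | Yes | Business affected | When a master and its slave go down together, the slots of that master cannot be used. Because cluster-require-full-coverage=no is set, the surviving masters keep serving normally. Do not deploy a master and its slave on the same machine |
Scale out/in by 1 master | Yes | No business impact | Migrating slots does not affect reads or writes on the cluster. Because the official tool redis-trib.rb has a bug, scaling requires migrating slots manually, which is very error-prone. Avoid migrating slots purely by hand; if a migration is unavoidable, use a script |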
Standardization Conventions
- Common configuration file path: /usr/local/redis-cluster/redis.conf
- Instance configuration file path: /usr/local/redis-cluster/conf
- Configuration file naming: {redis_port}.conf, named after the Redis port (e.g., /usr/local/redis-cluster/conf/8000.conf)
- Data directory: /data/redis/{redis_port}, a folder named after the Redis port (e.g., dir "/data/redis/8000")
- Log directory: /data/logs/redis/
- Log file naming: redis_{redis_port}.log, named after the Redis port (e.g., logfile "/data/logs/redis/redis_8000.log")
- Port range: redis_port > 20000
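The naming conventions above can be turned into a small generator for per-instance configuration files. This is a sketch, assuming each instance file includes the common redis.conf and sets only port-specific values. The render_instance_conf name, the nodes-{port}.conf cluster file name, and the selection of directives are illustrative assumptions, since the article does not show the real file contents. The cluster-node-timeout of 15000 ms and cluster-require-full-coverage no are taken from the test section.

```python
COMMON_CONF = "/usr/local/redis-cluster/redis.conf"
CONF_DIR = "/usr/local/redis-cluster/conf"


def conf_path(port: int) -> str:
    """Path of the instance configuration file, named after the port."""
    return f"{CONF_DIR}/{port}.conf"


def render_instance_conf(port: int) -> str:
    """Render a hypothetical per-instance config that follows the conventions."""
    lines = [
        f"include {COMMON_CONF}",
        f"port {port}",
        f'dir "/data/redis/{port}"',
        f'logfile "/data/logs/redis/redis_{port}.log"',
        "cluster-enabled yes",
        f"cluster-config-file nodes-{port}.conf",  # assumed file name
        "cluster-node-timeout 15000",
        "cluster-require-full-coverage no",
    ]
    return "\n".join(lines) + "\n"


for port in (8000, 8001, 8002):
    print(f"# {conf_path(port)}")
    print(render_instance_conf(port))
```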
Appendix
# Slot migration script