10. Ceph Basics - RGW High Availability

2022-10-31

Reposted from: https://mp.weixin.qq.com/s?__biz=MzI1MDgwNzQ1MQ==&mid=2247485316&idx=1&sn=d3a6be417aa39bb03d3fc8bf61527a2a&chksm=e9fdd270de8a5b66940ffe07e58535acfa88ebb8342d52510eefee8eba30dbecaa225122de25&cur_album_id=1600845417376776197&scene=189#wechat_redirect

Scaling Out RGW

1. Check Cluster Status

[root@ceph-node01 ~]# ceph -s
  cluster:
    id:     cc10b0cb-476f-420c-b1d6-e48c1dc929af
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-node01,ceph-node02,ceph-node03 (age 7h)
    mgr: ceph-node01(active, since 7d), standbys: ceph-node03, ceph-node02
    osd: 7 osds: 7 up (since 7h), 7 in (since 44h)
    rgw: 1 daemon active (ceph-node01)

  task status:

  data:
    pools:   7 pools, 224 pgs
    objects: 1.14k objects, 2.9 GiB
    usage:   16 GiB used, 784 GiB / 800 GiB avail
    pgs:     224 active+clean

[root@ceph-node01 ~]#

As the ceph -s output shows, only one radosgw daemon is currently running.
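When scaling out, the active gateway count can be pulled straight from the status output. A minimal sketch that parses a captured sample of the rgw line (the field position is an assumption about this release's output format; on a live cluster you would pipe ceph -s in directly):

```shell
# Sample `rgw:` line captured from `ceph -s`; field 2 is the daemon count.
status_line='rgw: 1 daemon active (ceph-node01)'
count=$(printf '%s\n' "$status_line" | awk '/rgw:/ {print $2}')
echo "active rgw daemons: $count"
```

After the expansion below succeeds, the same extraction should report 2.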

2. Add an RGW Gateway

[root@ceph-node01 ceph-deploy]# ceph-deploy rgw create ceph-node02
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy rgw create ceph-node02
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] rgw : [('ceph-node02', 'rgw.ceph-node02')]
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet
...
[root@ceph-node01 ceph-deploy]#

After this deployment, the new gateway listens on port 7480 by default; next we modify the configuration to change that.

3. Update the Configuration File

[root@ceph-node01 ceph-deploy]# cat ceph.conf
...
[client.rgw.ceph-node01]
rgw_frontends = "civetweb port=80"
[client.rgw.ceph-node02]
rgw_frontends = "civetweb port=80"
...
[root@ceph-node01 ceph-deploy]#

4. Push the Configuration File

[root@ceph-node01 ceph-deploy]# ceph-deploy --overwrite-conf config push ceph-node01 ceph-node02 ceph-node03

5. Restart the radosgw Service

[root@ceph-node02 ~]# systemctl restart ceph-radosgw.target

Note that you need to SSH to ceph-node02 and restart the service there.
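With more than a couple of gateways, restarting node by node gets tedious. A sketch of a loop that prints the per-node restart commands (dry run; the node names are the ones used in this cluster, and you would drop the echo to actually execute over passwordless SSH):

```shell
# Dry run: print the restart command for every RGW node.
# Remove `echo` to execute for real (requires passwordless SSH).
for node in ceph-node01 ceph-node02; do
    echo ssh "$node" systemctl restart ceph-radosgw.target
done
```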

RGW High Availability

Architecture Diagram
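The original diagram is not reproduced in this repost; the topology it depicts, reconstructed from the addresses and ports used later in this article, is roughly:

```
              client (S3 / Swift)
                      |
              VIP 100.73.18.253:80
           (keepalived, VRRP VRID 54)
            /                      \
   ceph-node01                  ceph-node02
   100.73.18.152                100.73.18.153
   keepalived MASTER            keepalived BACKUP
   haproxy :80                  haproxy :80
        |     round robin to both    |
   radosgw :81                  radosgw :81
```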

Server Planning

Because this is a test cluster with a limited number of servers, keepalived and haproxy both share ceph-node01 and ceph-node02. With extra machines available, they could run on dedicated hosts instead.

Changing the radosgw Port

# 1. Modify the configuration
[root@ceph-node01 ceph-deploy]# cat ceph.conf
...
[client.rgw.ceph-node01]
rgw_frontends = "civetweb port=81"
[client.rgw.ceph-node02]
rgw_frontends = "civetweb port=81"
[osd]
osd crush update on start = false

# 2. Push the configuration
[root@ceph-node01 ceph-deploy]# ceph-deploy --overwrite-conf config push ceph-node01 ceph-node02 ceph-node03
...
[root@ceph-node01 ceph-deploy]#

# 3. Restart the rgw service on ceph-node01 and check the listening port
[root@ceph-node01 ceph-deploy]# systemctl restart ceph-radosgw.target
[root@ceph-node01 ceph-deploy]# netstat -antp | grep 81 | grep radosgw
tcp 0 0 0.0.0.0:81 0.0.0.0:* LISTEN 85707/radosgw
[root@ceph-node01 ceph-deploy]# ssh ceph-node02
Last login: Sat Oct 17 06:28:17 2020 from 100.73.18.152

# 4. Restart the rgw service on ceph-node02 and check the listening port
[root@ceph-node02 ~]# systemctl restart ceph-radosgw.target
[root@ceph-node02 ~]# netstat -antp | grep radosgw | grep 81
tcp 0 0 0.0.0.0:81 0.0.0.0:* LISTEN 11222/radosgw
[root@ceph-node02 ~]#
[root@ceph-node02 ~]# exit
logout
Connection to ceph-node02 closed.
[root@ceph-node01 ceph-deploy]#

Deploying keepalived

On ceph-node01:

# 1. Install on ceph-node01
[root@ceph-node01 ~]# yum -y install keepalived

# 2. Configuration file
[root@ceph-node01 keepalived]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_script haproxy_check {
    script "killall -0 haproxy"
    interval 2
    weight -2
}

vrrp_instance RADOSGW {
    state MASTER
    interface eth0
    virtual_router_id 54
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        100.73.18.253/24
    }
    track_script {
        haproxy_check
    }
}
[root@ceph-node01 keepalived]#

# 3. Start
[root@ceph-node01 keepalived]# systemctl start keepalived && systemctl enable keepalived

On ceph-node02:

# 1. Install on ceph-node02
[root@ceph-node02 ~]# yum -y install keepalived

# 2. Configuration file
[root@ceph-node02 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_script haproxy_check {
    script "killall -0 haproxy"
    interval 2
    weight -2
}

vrrp_instance RADOSGW {
    state BACKUP
    interface eth0
    virtual_router_id 54
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        100.73.18.253/24
    }
    track_script {
        haproxy_check
    }
}
[root@ceph-node02 ~]#

# 3. Start
[root@ceph-node02 keepalived]# systemctl start keepalived && systemctl enable keepalived

The only differences between the two keepalived nodes are the state and the priority value; everything else is identical.
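Two mechanics in this configuration are worth unpacking. killall -0 (like kill -0) delivers no signal at all; its exit status merely reports whether a haproxy process exists. And when the check fails, keepalived adds the script's weight to the advertised priority: 100 + (-2) = 98, dropping the MASTER below the BACKUP's 99. A standalone sketch of both, using the current shell as the probed process:

```shell
# Signal 0 probes for process existence without delivering anything;
# the exit status is 0 only if the process is alive.
if kill -0 $$; then
    echo "check passed: process alive"
fi

# Effective priority after a failed check: base priority plus the
# (negative) script weight from vrrp_script haproxy_check.
master_priority=100
check_weight=-2
echo "effective priority: $((master_priority + check_weight))"
```

Since 98 is below the BACKUP's 99, ceph-node02 wins the next VRRP election and takes the VIP, exactly as the failover logs later in this article show.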

A Problem Encountered

Oct 19 09:34:30 ceph-node01 Keepalived_vrrp[87655]: VRRP_Instance(RADOSGW) ignoring received advertisment...
Oct 19 09:34:31 ceph-node01 Keepalived_vrrp[87655]: (RADOSGW): ip address associated with VRID 51 not present in MASTER advert : 100.73.18.253
Oct 19 09:34:31 ceph-node01 Keepalived_vrrp[87655]: bogus VRRP packet received on eth0 !!!
Oct 19 09:34:31 ceph-node01 Keepalived_vrrp[87655]: VRRP_Instance(RADOSGW) ignoring received advertisment...
Oct 19 09:34:32 ceph-node01 Keepalived_vrrp[87655]: (RADOSGW): ip address associated with VRID 51 not present in MASTER advert : 100.73.18.253
Oct 19 09:34:32 ceph-node01 Keepalived_vrrp[87655]: bogus VRRP packet received on eth0 !!!
Oct 19 09:34:32 ceph-node01 Keepalived_vrrp[87655]: VRRP_Instance(RADOSGW) ignoring received advertisment...
Oct 19 09:34:33 ceph-node01 Keepalived_vrrp[87655]: (RADOSGW): ip address associated with VRID 51 not present in MASTER advert : 100.73.18.253
Oct 19 09:34:33 ceph-node01 Keepalived_vrrp[87655]: bogus VRRP packet received on eth0 !!!

Cause: the BACKUP node of this keepalived cluster received advertisements carrying an IP address that does not match its own virtual IP. This usually means another keepalived cluster on the same network segment is using the same virtual_router_id. Change virtual_router_id to an unused value in the 1-255 range to avoid the conflict; here, changing it from 51 to 54 resolved the issue.
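Before settling on a virtual_router_id, you can sniff the segment for VRRP advertisements already in flight (for example with tcpdump -nn -i eth0 proto 112) and then pick an unused VRID. A sketch of the selection step, with the observed VRID list hard-coded as an assumption:

```shell
# VRIDs already seen on this network segment (assumed sample,
# including the conflicting 51).
in_use="51 1 2"

# Valid VRIDs are 1-255; print the first one not already taken.
for vrid in $(seq 1 255); do
    case " $in_use " in
        *" $vrid "*) continue ;;
        *) echo "free VRID: $vrid"; break ;;
    esac
done
```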

Checking the VIP

[root@ceph-node01 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 06:bb:14:00:0a:8c brd ff:ff:ff:ff:ff:ff
inet 100.73.18.152/24 brd 100.73.18.255 scope global eth0
valid_lft forever preferred_lft forever
inet 100.73.18.253/24 scope global secondary eth0
valid_lft forever preferred_lft forever
[root@ceph-node01 ~]# ping 100.73.18.253 -c 1
PING 100.73.18.253 (100.73.18.253) 56(84) bytes of data.
^C
--- 100.73.18.253 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
[root@ceph-node01 ~]#

Flushing the Filter Rules

[root@ceph-node01 ~]# iptables -t filter -L -n
Chain INPUT (policy ACCEPT)
target prot opt source destination
DROP all -- 0.0.0.0/0 100.73.18.253

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
[root@ceph-node01 ~]#
[root@ceph-node01 ~]# iptables -t filter -F
[root@ceph-node01 ~]# iptables -t filter -L -n
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
[root@ceph-node01 ~]#
[root@ceph-node01 ~]# ping 100.73.18.253 -c 2
PING 100.73.18.253 (100.73.18.253) 56(84) bytes of data.
64 bytes from 100.73.18.253: icmp_seq=1 ttl=64 time=0.018 ms
64 bytes from 100.73.18.253: icmp_seq=2 ttl=64 time=0.022 ms

--- 100.73.18.253 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.018/0.020/0.022/0.002 ms
[root@ceph-node01 ~]#

Deploying haproxy

# 1. Install haproxy on both ceph-node01 and ceph-node02
[root@ceph-node01 haproxy]# yum -y install haproxy

# 2. Configuration file
[root@ceph-node01 keepalived]# cat /etc/haproxy/haproxy.cfg
global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option                  http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

frontend http_web *:80
    mode http
    default_backend radosgw

backend radosgw
    balance roundrobin
    mode http
    server ceph-node01 100.73.18.152:81
    server ceph-node02 100.73.18.153:81
[root@ceph-node01 keepalived]#

# 3. Start
[root@ceph-node01 haproxy]# systemctl enable haproxy && systemctl start haproxy
Created symlink from /etc/systemd/system/multi-user.target.wants/haproxy.service to /usr/lib/systemd/system/haproxy.service.

# 4. Start on ceph-node02
[root@ceph-node01 haproxy]# scp /etc/haproxy/haproxy.cfg ceph-node02:/etc/haproxy/
haproxy.cfg 100% 992 3.3MB/s 00:00
[root@ceph-node01 haproxy]# ssh ceph-node02
Last login: Mon Oct 19 09:22:57 2020 from 100.73.18.152
[root@ceph-node02 ~]# systemctl enable haproxy && systemctl start haproxy
Created symlink from /etc/systemd/system/multi-user.target.wants/haproxy.service to /usr/lib/systemd/system/haproxy.service.
[root@ceph-node02 ~]#
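Note that the backend above has no health checks, so haproxy will keep sending requests to a radosgw that has died. A possible hardening of the backend, with check parameters that are my assumption rather than part of the original article:

```
backend radosgw
    balance roundrobin
    mode http
    server ceph-node01 100.73.18.152:81 check inter 2s fall 3 rise 2
    server ceph-node02 100.73.18.153:81 check inter 2s fall 3 rise 2
```

check enables TCP health probes every 2 seconds; a server is marked down after 3 consecutive failures and brought back after 2 successes.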

Verification

[root@ceph-node01 keepalived]# curl http://100.73.18.152/
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>[root@ceph-node01 keepalived]#
[root@ceph-node01 keepalived]#
[root@ceph-node01 keepalived]# curl http://100.73.18.153/
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>[root@ceph-node01 keepalived]#
[root@ceph-node01 keepalived]#
[root@ceph-node01 keepalived]#

s3 Client Configuration

[root@ceph-node01 ~]# cat /root/.s3cfg
...
host_base = 100.73.18.253:80
host_bucket = 100.73.18.253:80/%(bucket)s
...
[root@ceph-node01 ~]#

Verifying with the s3 Client

[root@ceph-node01 ~]# s3cmd mb s3://test-1
Bucket 's3://test-1/' created
[root@ceph-node01 ~]# s3cmd ls
...
2020-10-19 13:54 s3://test-1
[root@ceph-node01 ~]#

swift Client

# 1. Modify and apply the configuration
[root@ceph-node01 ~]# cat /etc/profile
...
export ST_AUTH=http://100.73.18.253/auth
export ST_USER=ceph-s3-user:swift
export ST_KEY=0M1GdRTvMSU3fToOxEVXrBjItKLBKtu8xhn3DcEE
[root@ceph-node01 ~]# source /etc/profile

# 2. Use the client
[root@ceph-node01 ~]# swift post test-2
[root@ceph-node01 ~]# swift list
...
test-1
test-2
[root@ceph-node01 ~]#

Verifying High Availability

# 1. Simulate a haproxy failure on ceph-node01 (the MASTER)
[root@ceph-node01 ~]# systemctl stop haproxy
[root@ceph-node01 ~]# tail -f /var/log/messages
Oct 19 10:23:58 ceph-node01 Keepalived_vrrp[89533]: /usr/bin/killall -0 haproxy exited with status 1
Oct 19 10:23:58 ceph-node01 Keepalived_vrrp[89533]: VRRP_Script(haproxy_check) failed
Oct 19 10:23:58 ceph-node01 Keepalived_vrrp[89533]: VRRP_Instance(RADOSGW) Changing effective priority from 100 to 98
Oct 19 10:23:59 ceph-node01 Keepalived_vrrp[89533]: VRRP_Instance(RADOSGW) Received advert with higher priority 99, ours 98
Oct 19 10:23:59 ceph-node01 Keepalived_vrrp[89533]: VRRP_Instance(RADOSGW) Entering BACKUP STATE
Oct 19 10:23:59 ceph-node01 Keepalived_vrrp[89533]: VRRP_Instance(RADOSGW) removing protocol VIPs.
Oct 19 10:23:59 ceph-node01 Keepalived_vrrp[89533]: VRRP_Instance(RADOSGW) removing protocol iptable drop rule
Oct 19 10:24:00 ceph-node01 ntpd[5618]: Deleting interface #9 eth0, 100.73.18.253#123, interface stats: received=0, sent=0, dropped=0, active_time=180 secs
Oct 19 10:24:00 ceph-node01 Keepalived_vrrp[89533]: /usr/bin/killall -0 haproxy exited with status 1
Oct 19 10:24:02 ceph-node01 Keepalived_vrrp[89533]: /usr/bin/killall -0 haproxy exited with status 1
Oct 19 10:24:04 ceph-node01 Keepalived_vrrp[89533]: /usr/bin/killall -0 haproxy exited with status 1
Oct 19 10:24:06 ceph-node01 Keepalived_vrrp[89533]: /usr/bin/killall -0 haproxy exited with status 1
Oct 19 10:24:08 ceph-node01 Keepalived_vrrp[89533]: /usr/bin/killall -0 haproxy exited with status 1
^C
[root@ceph-node01 ~]#

# 2. Log in to ceph-node02 and check
[root@ceph-node02 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 06:f6:14:00:0a:8d brd ff:ff:ff:ff:ff:ff
inet 100.73.18.153/24 brd 100.73.18.255 scope global eth0
valid_lft forever preferred_lft forever
inet 100.73.18.253/24 scope global secondary eth0
valid_lft forever preferred_lft forever
[root@ceph-node02 ~]# iptables -t filter -L -n
Chain INPUT (policy ACCEPT)
target prot opt source destination
DROP all -- 0.0.0.0/0 100.73.18.253

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
[root@ceph-node02 ~]# iptables -t filter -F
[root@ceph-node02 ~]# ping 100.73.18.253
PING 100.73.18.253 (100.73.18.253) 56(84) bytes of data.
64 bytes from 100.73.18.253: icmp_seq=1 ttl=64 time=0.015 ms
^C
--- 100.73.18.253 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.015/0.015/0.015/0.000 ms
[root@ceph-node02 ~]#

Preventing the Automatic DROP Rule

[root@ceph-node01 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
vrrp_garp_interval 0
vrrp_gna_interval 0
}
...
[root@ceph-node01 ~]#

After keepalived starts, it automatically adds an iptables DROP rule. This comes from the vrrp_strict parameter in the global configuration, which strictly enforces the VRRP protocol and does not support unicast mode. Remove or comment out this parameter in global_defs on both nodes, and keepalived will no longer add the DROP rule by default.
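After editing, it is worth confirming that no node still carries the parameter before restarting keepalived. A small sketch that greps a sample of the edited global_defs (on a real node you would grep /etc/keepalived/keepalived.conf instead of the inline sample):

```shell
# Edited global_defs sample; grep -q exits non-zero when the
# vrrp_strict keyword is absent.
conf='router_id LVS_DEVEL
vrrp_skip_check_adv_addr
vrrp_garp_interval 0
vrrp_gna_interval 0'

if printf '%s\n' "$conf" | grep -q '^vrrp_strict$'; then
    echo "vrrp_strict still present"
else
    echo "vrrp_strict removed"
fi
```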
