In many production environments Ceph does not run inside containers; instead, dedicated servers are set aside to form a Ceph cluster that runs nothing but Ceph.
This walkthrough uses ceph-deploy to quickly build a Ceph cluster and create block storage, a file system, and object storage.
Installing Ceph 15.x on CentOS 7 is problematic, so 14.x is installed here.
Environment preparation
Cluster information
Ceph-deploy: 2.0.1
Ceph: Nautilus (14.2.22)
System: CentOS Linux release 7.9.2009
IP | Hostname | Services | Extra disks (OSD) |
---|---|---|---|
192.168.77.41 | ceph01 | mon, mgr, osd | 2 × 16 GB data disks |
192.168.77.42 | ceph02 | mon, mgr, osd | 2 × 16 GB data disks |
192.168.77.43 | ceph03 | mon, mgr, osd | 2 × 16 GB data disks |
If your environment allows, a dedicated ceph-admin node can host components such as mon, mgr, and mds, with the OSDs on separate nodes; this makes the cluster easier to manage.
Disable the firewall and SELinux
systemctl stop firewalld
systemctl disable firewalld
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat
iptables -P FORWARD ACCEPT
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
Configure hosts file resolution
cat >> /etc/hosts <<EOF
192.168.77.41 ceph01
192.168.77.42 ceph02
192.168.77.43 ceph03
EOF
Set the hostname (run on each node with its own name; ceph01 is shown)
hostnamectl set-hostname ceph01
bash
Configure passwordless SSH login
yum install -y sshpass
ssh-keygen -f /root/.ssh/id_rsa -P ''
export HOSTS="ceph01 ceph02 ceph03"
export SSHPASS=123456 # the root password
for HOST in $HOSTS;do sshpass -e ssh-copy-id -o StrictHostKeyChecking=no $HOST;done
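To confirm key-based login works on every node, a quick optional check (it reuses the HOSTS variable exported above):
for HOST in $HOSTS;do ssh -o BatchMode=yes $HOST hostname;done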
Time synchronization
yum install -y ntp
Deploy the NTP server (on ceph01)
# Comment out the default CentOS servers and add the Aliyun NTP servers
vim /etc/ntp.conf
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server ntp1.aliyun.com iburst
server ntp2.aliyun.com iburst
server ntp3.aliyun.com iburst
systemctl restart ntpd
systemctl enable ntpd
timedatectl set-timezone Asia/Shanghai
timedatectl set-local-rtc 0
On ceph02 and ceph03, point the NTP server at ceph01's IP
vim /etc/ntp.conf
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server 192.168.77.41 iburst
systemctl restart ntpd
systemctl enable ntpd
[root@ceph01 ~]# ntpq -pn
remote refid st t when poll reach delay offset jitter
==============================================================================
120.25.115.20 .INIT. 16 u - 64 0 0.000 0.000 0.000
*203.107.6.88 100.107.25.114 2 u 38 64 1 44.618 -11.494 0.282
[root@ceph03 ~]# ntpq -pn
remote refid st t when poll reach delay offset jitter
==============================================================================
192.168.77.41 203.107.6.88 3 u 1 64 1 0.327 7.468 0.180
Configure faster yum mirrors
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
yum clean all
yum makecache
Ceph deployment
Configure the Ceph yum repository
cat > /etc/yum.repos.d/ceph.repo << EOF
[noarch]
name=noarch
baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch/
enabled=1
gpgcheck=0
[x86_64]
name=x86_64
baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/x86_64/
enabled=1
gpgcheck=0
EOF
Point ceph-deploy at the versioned repository
export CEPH_DEPLOY_REPO_URL=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7
export CEPH_DEPLOY_GPG_URL=https://mirrors.aliyun.com/ceph/keys/release.asc
Install required packages
yum install -y wget curl net-tools bash-completion python-setuptools
Install the ceph-deploy tool
On the ceph01 deploy node, install the dependencies and the ceph-deploy tool
yum install -y ceph-deploy
Initialize the Mon configuration
Start with a single-node monitor for now
mkdir /etc/ceph
cd /etc/ceph
ceph-deploy new --public-network 192.168.77.0/24 ceph01
#ceph-deploy new --public-network 192.168.77.0/24 ceph0{1,2,3}
# --cluster-network: network used for internal cluster traffic
# --public-network: network the cluster exposes to clients
Files generated after the command runs
[root@ceph01 ceph]# ls -l
total 12
-rw-r--r-- 1 root root 231 Apr 9 13:29 ceph.conf # Ceph configuration file
-rw-r--r-- 1 root root 2992 Apr 9 13:29 ceph-deploy-ceph.log # ceph-deploy log
-rw------- 1 root root 73 Apr 9 13:29 ceph.mon.keyring # keyring used for authentication
Add configuration parameters
Allow a small amount of clock drift between Ceph nodes
echo "mon clock drift allowed = 2" >> /etc/ceph/ceph.conf
echo "mon clock drift warn backoff = 30" >> /etc/ceph/ceph.conf
Install Ceph on all nodes
export CEPH_DEPLOY_REPO_URL=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7
export CEPH_DEPLOY_GPG_URL=https://mirrors.aliyun.com/ceph/keys/release.asc
ceph-deploy install --release nautilus ceph01 ceph02 ceph03
Initialize the Mon node
ceph-deploy mon create-initial
[root@ceph01 ceph]# ls -l
total 224
-rw------- 1 root root 113 Apr 12 13:43 ceph.bootstrap-mds.keyring
-rw------- 1 root root 113 Apr 12 13:43 ceph.bootstrap-mgr.keyring
-rw------- 1 root root 113 Apr 12 13:43 ceph.bootstrap-osd.keyring
-rw------- 1 root root 113 Apr 12 13:43 ceph.bootstrap-rgw.keyring
-rw------- 1 root root 151 Apr 12 13:43 ceph.client.admin.keyring
-rw-r--r-- 1 root root 292 Apr 12 13:43 ceph.conf
-rw-r--r-- 1 root root 157631 Apr 12 13:43 ceph-deploy-ceph.log
-rw------- 1 root root 73 Apr 9 13:53 ceph.mon.keyring
-rw-r--r-- 1 root root 92 Jun 30 2021 rbdmap
Push the generated configuration and admin keyring to all nodes
ceph-deploy admin ceph01 ceph02 ceph03
Disable the insecure global_id reclaim mode
ceph config set mon auth_allow_insecure_global_id_reclaim false
You can now see one mon daemon running
[root@ceph01 ceph]# ceph -s
cluster:
id: 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
health: HEALTH_OK
services:
mon: 1 daemons, quorum ceph01 (age 2m) # one mon so far
mgr: no daemons active # mgr not created yet
osd: 0 osds: 0 up, 0 in # OSDs not created yet
data: # no pools created yet
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
Create the Mgr
For now only ceph01 runs the manager daemon (standbys are added later in the Mgr HA section)
ceph-deploy mgr create ceph01
You can see the newly added mgr running
[root@ceph01 ceph]# ceph -s
cluster:
id: 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph01 (age 6m)
mgr: ceph01(active, since 35s) # the newly added mgr, state: active
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
Deploy the OSDs
sdb and sdc on each node are used as OSD disks
[root@ceph01 ~]# fdisk -l
Disk /dev/sda: 36.5 GB, 36507222016 bytes, 71303168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000eb3ad
Device Boot Start End Blocks Id System
/dev/sda1 * 2048 1050623 524288 83 Linux
/dev/sda2 1050624 71303167 35126272 83 Linux
Disk /dev/sdb: 17.2 GB, 17179869184 bytes, 33554432 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/sdc: 17.2 GB, 17179869184 bytes, 33554432 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Zap (wipe) the disks (shown for /dev/sdb; run the same for /dev/sdc if it holds old data)
ceph-deploy disk zap ceph01 /dev/sdb
ceph-deploy disk zap ceph02 /dev/sdb
ceph-deploy disk zap ceph03 /dev/sdb
[root@ceph01 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 34G 0 disk
├─sda1 8:1 0 512M 0 part /boot
└─sda2 8:2 0 33.5G 0 part /
sdb 8:16 0 16G 0 disk
sdc 8:32 0 16G 0 disk
sr0 11:0 1 1024M 0 rom
Create the OSDs
cd /etc/ceph
ceph-deploy osd create ceph01 --data /dev/sdb
ceph-deploy osd create ceph03 --data /dev/sdb
ceph-deploy osd create ceph02 --data /dev/sdb
ceph-deploy osd create ceph01 --data /dev/sdc
ceph-deploy osd create ceph02 --data /dev/sdc
ceph-deploy osd create ceph03 --data /dev/sdc
health: HEALTH_OK shows the cluster is healthy and in sync
[root@ceph01 ceph]# ceph -s
cluster:
id: 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
health: HEALTH_OK
services:
mon: 1 daemons, quorum ceph01 (age 26h)
mgr: ceph01(active, since 26h)
osd: 6 osds: 6 up (since 7s), 6 in (since 7s)
task status:
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 6.0 GiB used, 90 GiB / 96 GiB avail
pgs:
[root@ceph01 ceph]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.09357 root default
-3 0.03119 host ceph01
0 hdd 0.01559 osd.0 up 1.00000 1.00000
3 hdd 0.01559 osd.3 up 1.00000 1.00000
-5 0.03119 host ceph02
1 hdd 0.01559 osd.1 up 1.00000 1.00000
4 hdd 0.01559 osd.4 up 1.00000 1.00000
-7 0.03119 host ceph03
2 hdd 0.01559 osd.2 up 1.00000 1.00000
5 hdd 0.01559 osd.5 up 1.00000 1.00000
All six OSDs are now deployed.
Sync the configuration
If you need to modify ceph.conf at any point, change it on ceph01 only and push it to the other nodes with the command below
ceph-deploy --overwrite-conf config push ceph01 ceph02 ceph03
Cluster configuration
Deploy a highly available Mon cluster
The monitors store the cluster's metadata (including the OSD map), so they also need to be highly available. An odd number of monitor nodes is recommended; three are used here.
Add more mon nodes
Once three monitor nodes are present, they hold an election automatically and become highly available on their own
cd /etc/ceph
ceph-deploy mon add ceph02 --address 192.168.77.42
ceph-deploy mon add ceph03 --address 192.168.77.43
[root@ceph01 ceph]# ceph -s
cluster:
id: 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph01,ceph02,ceph03 (age 24s)
mgr: ceph01(active, since 26h)
osd: 6 osds: 6 up (since 11m), 6 in (since 11m)
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 6.0 GiB used, 90 GiB / 96 GiB avail
pgs:
Check the monitor election results and the cluster's health
[root@ceph01 ceph]# ceph quorum_status --format json-pretty
{
"election_epoch": 12,
"quorum": [
0,
1,
2
],
"quorum_names": [
"ceph01",
"ceph02",
"ceph03"
],
"quorum_leader_name": "ceph01", # 当前leader节点
"quorum_age": 79,
"monmap": {
"epoch": 3, # monitor节点数量
"fsid": "32029ead-afed-49e0-b0d8-a7f3eb56e7c5",
"modified": "2024-04-13 16:13:12.488751",
"created": "2024-04-12 13:43:12.316352",
"min_mon_release": 14,
"min_mon_release_name": "nautilus",
"features": {
"persistent": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus"
],
"optional": []
},
"mons": [
{
"rank": 0,
"name": "ceph01",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "192.168.77.41:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "192.168.77.41:6789",
"nonce": 0
}
]
},
"addr": "192.168.77.41:6789/0",
"public_addr": "192.168.77.41:6789/0"
},
{
"rank": 1,
"name": "ceph02",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "192.168.77.42:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "192.168.77.42:6789",
"nonce": 0
}
]
},
"addr": "192.168.77.42:6789/0",
"public_addr": "192.168.77.42:6789/0"
},
{
"rank": 2,
"name": "ceph03",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "192.168.77.43:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "192.168.77.43:6789",
"nonce": 0
}
]
},
"addr": "192.168.77.43:6789/0",
"public_addr": "192.168.77.43:6789/0"
}
]
}
}
Use ceph mon dump to see more detailed monitor information
[root@ceph01 ceph]# ceph mon dump
epoch 3
fsid 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
last_changed 2024-04-13 16:13:12.488751
created 2024-04-12 13:43:12.316352
min_mon_release 14 (nautilus)
0: [v2:192.168.77.41:3300/0,v1:192.168.77.41:6789/0] mon.ceph01
1: [v2:192.168.77.42:3300/0,v1:192.168.77.42:6789/0] mon.ceph02
2: [v2:192.168.77.43:3300/0,v1:192.168.77.43:6789/0] mon.ceph03
dumped monmap epoch 3
Deploy Mgr high availability
The main job of ceph-mgr at present is to expose cluster metrics to the outside world.
Only one mgr node is active at a time; the rest stay in standby. A standby only takes over, and switches to active, when the current active node fails.
ceph-deploy mgr create ceph02 ceph03
[root@ceph01 ceph]# ceph -s
cluster:
id: 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph01,ceph02,ceph03 (age 6m)
mgr: ceph01(active, since 26h), standbys: ceph02, ceph03
osd: 6 osds: 6 up (since 17m), 6 in (since 17m)
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 6.0 GiB used, 90 GiB / 96 GiB avail
pgs:
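To watch the failover behaviour yourself, you can optionally force the active mgr to step down and let a standby take over; this is just a sanity check, not a required step:
ceph mgr fail ceph01   # mark the current active mgr as failed
ceph -s                # the mgr line should now show ceph02 or ceph03 as active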
RBD block storage
Block storage is the storage type used in storage area networks: data is stored as blocks inside volumes, and the volumes are attached to nodes. It gives applications large capacity together with high reliability and performance.
RBD is the Ceph Block Device protocol. Besides reliability and performance, RBD supports full and incremental snapshots, thin provisioning, copy-on-write clones, and in-memory caching.
Ceph RBD currently supports images up to 16 EB. Images can be mapped directly as disks to bare-metal hosts, virtual machines, or other machines; KVM and Xen fully support RBD, and vendors such as VMware also support it. A short example of creating and mapping an image is sketched below, after the pool is created.
Create a resource pool
ceph osd pool create sunday 64 64
# sunday is the pool name
# 64 placement groups (pg_num and pgp_num should be kept equal)
# 64 for pgp_num
# no replica count specified, so the default of three replicas applies
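A common rule of thumb (an approximation, not a hard rule) is total PGs ≈ (number of OSDs × 100) / replica size, rounded to a power of two and split across all pools; with 6 OSDs and size 3 that is about 200, so 64 or 128 PGs for a single pool is reasonable. If you later need more, raise pg_num and pgp_num together, for example:
ceph osd pool set sunday pg_num 128
ceph osd pool set sunday pgp_num 128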
List pools
[root@ceph01 ceph]# ceph osd lspools
1 sunday
Check the replica count of the sunday pool
The default is 3 replicas, which keeps the data highly available
[root@ceph01 ceph]# ceph osd pool get sunday size
size: 3
The replica count can also be changed as needed
[root@ceph01 ceph]# ceph osd pool set sunday size 2
set pool 1 size to 2
[root@ceph01 ceph]# ceph osd pool get sunday size
size: 2
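With the sunday pool in place, a minimal sketch of creating and mapping an RBD image (the image name disk01, its size, and the mount point are illustrative; mapping needs the rbd kernel module on the client):
rbd pool init sunday
rbd create sunday/disk01 --size 10G
rbd info sunday/disk01
# on older kernels you may first need: rbd feature disable sunday/disk01 object-map fast-diff deep-flatten
rbd map sunday/disk01            # prints a device name such as /dev/rbd0
mkfs.xfs /dev/rbd0 && mount /dev/rbd0 /mnt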
2.10 Create an OSD with a journal partition (the /dev/vdc1 and /dev/vdd devices below come from a different disk layout and are only examples)
2.10.1 Create the journal partition
pvcreate /dev/vdc1
vgcreate ceph-osd0-journal /dev/vdc1
lvcreate -l 100%FREE --name log ceph-osd0-journal
2.10.2 Create the OSD and attach the journal partition
ceph-deploy --overwrite-conf osd create ceph01 --filestore --fs-type xfs --data /dev/vdd --journal ceph-osd0-journal/log
2.11 Deploy the MDS component
ceph-deploy mds create ceph0{1,2,3}
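With the MDS daemons running, a CephFS file system can be created on top of two pools. A minimal sketch (the pool names, fs name, and PG counts are illustrative values for this small test cluster):
ceph osd pool create cephfs_data 32 32
ceph osd pool create cephfs_metadata 16 16
ceph fs new sundayfs cephfs_metadata cephfs_data
ceph fs status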
2.12 Deploy the mgr Dashboard component
2.12.1 Add the mgr role (already present on all three nodes if you followed the Mgr HA section)
ceph-deploy mgr create ceph0{1,2,3}
2.12.2 Enable the dashboard module
ceph mgr module enable dashboard
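If the enable command complains that no mgr daemon supports the dashboard module, note that on Nautilus the dashboard ships as a separate package; installing it on each mgr node first should fix this (package name assumed to be the standard el7 one):
yum install -y ceph-mgr-dashboard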
2.12.3 Create a self-signed certificate
ceph dashboard create-self-signed-cert
2.12.4 Create the web login username and password (on newer Nautilus point releases the password must be supplied from a file with -i rather than inline)
ceph dashboard set-login-credentials user-name password
2.12.5 Check how the service is accessed
ceph mgr services
2.12.6 Change the dashboard's default port from 8443 to 30443
ceph config set mgr mgr/dashboard/server_port 30443
systemctl restart ceph-mgr.target
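After the restart, re-check the exposed services to confirm the dashboard now listens on port 30443 (the URL shown depends on which mgr is currently active):
ceph mgr services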