In many production environments Ceph does not run inside containers; instead, dedicated servers are set aside to form a Ceph cluster that runs nothing but Ceph.
This walkthrough uses ceph-deploy to quickly build a Ceph cluster and create block storage, a file system, and object storage.
Installing Ceph 15.x on CentOS 7 is problematic, so 14.x is installed here.
Environment preparation
Cluster information
Ceph-deploy: 2.0.1
Ceph: Nautilus (14.2.22)
System: CentOS Linux release 7.9.2009
IP | Hostname | Services | Extra disks (OSD) |
---|---|---|---|
192.168.77.41 | ceph01 | mon, mgr, osd | 2 × 16 GB data disks |
192.168.77.42 | ceph02 | mon, mgr, osd | 2 × 16 GB data disks |
192.168.77.43 | ceph03 | mon, mgr, osd | 2 × 16 GB data disks |
If your environment allows, a dedicated ceph-admin node can host components such as mon, mgr, and mds, with the OSDs on separate nodes; this makes the cluster easier to manage.
Disable the firewall and SELinux
systemctl stop firewalld
systemctl disable firewalld
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat
iptables -P FORWARD ACCEPT
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
Configure hosts file resolution
cat >> /etc/hosts <<EOF
192.168.77.41 ceph01
192.168.77.42 ceph02
192.168.77.43 ceph03
EOF
Set the hostname (run on each node with its own name; ceph01 is shown)
hostnamectl set-hostname ceph01
bash
Configure passwordless SSH login
yum install -y sshpass
ssh-keygen -f /root/.ssh/id_rsa -P ''
export HOSTS="ceph01 ceph02 ceph03"
export SSHPASS=123456 # the root password
for HOST in $HOSTS;do sshpass -e ssh-copy-id -o StrictHostKeyChecking=no $HOST;done
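To confirm key-based login works on every node, a quick optional check (it reuses the HOSTS variable exported above):
for HOST in $HOSTS;do ssh -o BatchMode=yes $HOST hostname;done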
Time synchronization
yum install -y ntp
Deploy the NTP server (on ceph01)
# Comment out the default CentOS servers and add the Aliyun NTP servers
vim /etc/ntp.conf
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server ntp1.aliyun.com iburst
server ntp2.aliyun.com iburst
server ntp3.aliyun.com iburst
systemctl restart ntpd
systemctl enable ntpd
timedatectl set-timezone Asia/Shanghai
timedatectl set-local-rtc 0
On ceph02 and ceph03, point the NTP server at ceph01's IP
vim /etc/ntp.conf
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server 192.168.77.41 iburst
systemctl restart ntpd
systemctl enable ntpd
[root@ceph01 ~]# ntpq -pn
remote refid st t when poll reach delay offset jitter
==============================================================================
120.25.115.20 .INIT. 16 u - 64 0 0.000 0.000 0.000
*203.107.6.88 100.107.25.114 2 u 38 64 1 44.618 -11.494 0.282
[root@ceph03 ~]# ntpq -pn
remote refid st t when poll reach delay offset jitter
==============================================================================
192.168.77.41 203.107.6.88 3 u 1 64 1 0.327 7.468 0.180
Configure faster yum mirrors
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
yum clean all
yum makecache
Ceph deployment
Configure the Ceph yum repository
cat > /etc/yum.repos.d/ceph.repo << EOF
[noarch]
name=noarch
baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch/
enabled=1
gpgcheck=0
[x86_64]
name=x86_64
baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/x86_64/
enabled=1
gpgcheck=0
EOF
Point ceph-deploy at the versioned repository
export CEPH_DEPLOY_REPO_URL=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7
export CEPH_DEPLOY_GPG_URL=https://mirrors.aliyun.com/ceph/keys/release.asc
Install required packages
yum install -y wget curl net-tools bash-completion python-setuptools
Install the ceph-deploy tool
On the ceph01 deploy node, install the dependencies and the ceph-deploy tool
yum install -y ceph-deploy
Initialize the Mon configuration
Start with a single-node monitor for now
mkdir /etc/ceph
cd /etc/ceph
ceph-deploy new --public-network 192.168.77.0/24 ceph01
#ceph-deploy new --public-network 192.168.77.0/24 ceph0{1,2,3}
# --cluster-network: network used for internal cluster traffic
# --public-network: network the cluster exposes to clients
Files generated after the command runs
[root@ceph01 ceph]# ls -l
total 12
-rw-r--r-- 1 root root 231 Apr 9 13:29 ceph.conf # Ceph configuration file
-rw-r--r-- 1 root root 2992 Apr 9 13:29 ceph-deploy-ceph.log # ceph-deploy log
-rw------- 1 root root 73 Apr 9 13:29 ceph.mon.keyring # keyring used for authentication
Add configuration parameters
Allow a small amount of clock drift between Ceph nodes
echo "mon clock drift allowed = 2" >> /etc/ceph/ceph.conf
echo "mon clock drift warn backoff = 30" >> /etc/ceph/ceph.conf
Install Ceph on all nodes
export CEPH_DEPLOY_REPO_URL=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7
export CEPH_DEPLOY_GPG_URL=https://mirrors.aliyun.com/ceph/keys/release.asc
ceph-deploy install --release nautilus ceph01 ceph02 ceph03
Initialize the Mon node
ceph-deploy mon create-initial
[root@ceph01 ceph]# ls -l
total 224
-rw------- 1 root root 113 Apr 12 13:43 ceph.bootstrap-mds.keyring
-rw------- 1 root root 113 Apr 12 13:43 ceph.bootstrap-mgr.keyring
-rw------- 1 root root 113 Apr 12 13:43 ceph.bootstrap-osd.keyring
-rw------- 1 root root 113 Apr 12 13:43 ceph.bootstrap-rgw.keyring
-rw------- 1 root root 151 Apr 12 13:43 ceph.client.admin.keyring
-rw-r--r-- 1 root root 292 Apr 12 13:43 ceph.conf
-rw-r--r-- 1 root root 157631 Apr 12 13:43 ceph-deploy-ceph.log
-rw------- 1 root root 73 Apr 9 13:53 ceph.mon.keyring
-rw-r--r-- 1 root root 92 Jun 30 2021 rbdmap
Push the generated configuration and admin keyring to all nodes
ceph-deploy admin ceph01 ceph02 ceph03
Disable the insecure global_id reclaim mode
ceph config set mon auth_allow_insecure_global_id_reclaim false
You can now see one mon daemon running
[root@ceph01 ceph]# ceph -s
cluster:
id: 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
health: HEALTH_OK
services:
mon: 1 daemons, quorum ceph01 (age 2m) # one mon so far
mgr: no daemons active # mgr not created yet
osd: 0 osds: 0 up, 0 in # OSDs not created yet
data: # no pools created yet
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
Create the Mgr
For now only ceph01 runs the manager daemon (standbys are added later in the Mgr HA section)
ceph-deploy mgr create ceph01
You can see the newly added mgr running
[root@ceph01 ceph]# ceph -s
cluster:
id: 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph01 (age 6m)
mgr: ceph01(active, since 35s) # the newly added mgr, state: active
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
Deploy the OSDs
sdb and sdc on each node are used as OSD disks
[root@ceph01 ~]# fdisk -l
Disk /dev/sda: 36.5 GB, 36507222016 bytes, 71303168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000eb3ad
Device Boot Start End Blocks Id System
/dev/sda1 * 2048 1050623 524288 83 Linux
/dev/sda2 1050624 71303167 35126272 83 Linux
Disk /dev/sdb: 17.2 GB, 17179869184 bytes, 33554432 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/sdc: 17.2 GB, 17179869184 bytes, 33554432 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Zap (wipe) the disks (shown for /dev/sdb; run the same for /dev/sdc if it holds old data)
ceph-deploy disk zap ceph01 /dev/sdb
ceph-deploy disk zap ceph02 /dev/sdb
ceph-deploy disk zap ceph03 /dev/sdb
[root@ceph01 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 34G 0 disk
├─sda1 8:1 0 512M 0 part /boot
└─sda2 8:2 0 33.5G 0 part /
sdb 8:16 0 16G 0 disk
sdc 8:32 0 16G 0 disk
sr0 11:0 1 1024M 0 rom
Create the OSDs
cd /etc/ceph
ceph-deploy osd create ceph01 --data /dev/sdb
ceph-deploy osd create ceph03 --data /dev/sdb
ceph-deploy osd create ceph02 --data /dev/sdb
ceph-deploy osd create ceph01 --data /dev/sdc
ceph-deploy osd create ceph02 --data /dev/sdc
ceph-deploy osd create ceph03 --data /dev/sdc
health: HEALTH_OK shows the cluster is healthy and in sync
[root@ceph01 ceph]# ceph -s
cluster:
id: 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
health: HEALTH_OK
services:
mon: 1 daemons, quorum ceph01 (age 26h)
mgr: ceph01(active, since 26h)
osd: 6 osds: 6 up (since 7s), 6 in (since 7s)
task status:
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 6.0 GiB used, 90 GiB / 96 GiB avail
pgs:
[root@ceph01 ceph]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.09357 root default
-3 0.03119 host ceph01
0 hdd 0.01559 osd.0 up 1.00000 1.00000
3 hdd 0.01559 osd.3 up 1.00000 1.00000
-5 0.03119 host ceph02
1 hdd 0.01559 osd.1 up 1.00000 1.00000
4 hdd 0.01559 osd.4 up 1.00000 1.00000
-7 0.03119 host ceph03
2 hdd 0.01559 osd.2 up 1.00000 1.00000
5 hdd 0.01559 osd.5 up 1.00000 1.00000
All six OSDs are now deployed.
Sync the configuration
If you need to modify ceph.conf at any point, change it on ceph01 only and push it to the other nodes with the command below
ceph-deploy --overwrite-conf config push ceph01 ceph02 ceph03
Cluster configuration
Deploy a highly available Mon cluster
The monitors store the cluster's metadata (including the OSD map), so they also need to be highly available. An odd number of monitor nodes is recommended; three are used here.
Add more mon nodes
Once three monitor nodes are present, they hold an election automatically and become highly available on their own
cd /etc/ceph
ceph-deploy mon add ceph02 --address 192.168.77.42
ceph-deploy mon add ceph03 --address 192.168.77.43
[root@ceph01 ceph]# ceph -s
cluster:
id: 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph01,ceph02,ceph03 (age 24s)
mgr: ceph01(active, since 26h)
osd: 6 osds: 6 up (since 11m), 6 in (since 11m)
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 6.0 GiB used, 90 GiB / 96 GiB avail
pgs:
Check the monitor election results and the cluster's health
[root@ceph01 ceph]# ceph quorum_status --format json-pretty
{
"election_epoch": 12,
"quorum": [
0,
1,
2
],
"quorum_names": [
"ceph01",
"ceph02",
"ceph03"
],
"quorum_leader_name": "ceph01", # 当前leader节点
"quorum_age": 79,
"monmap": {
"epoch": 3, # monitor节点数量
"fsid": "32029ead-afed-49e0-b0d8-a7f3eb56e7c5",
"modified": "2024-04-13 16:13:12.488751",
"created": "2024-04-12 13:43:12.316352",
"min_mon_release": 14,
"min_mon_release_name": "nautilus",
"features": {
"persistent": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus"
],
"optional": []
},
"mons": [
{
"rank": 0,
"name": "ceph01",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "192.168.77.41:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "192.168.77.41:6789",
"nonce": 0
}
]
},
"addr": "192.168.77.41:6789/0",
"public_addr": "192.168.77.41:6789/0"
},
{
"rank": 1,
"name": "ceph02",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "192.168.77.42:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "192.168.77.42:6789",
"nonce": 0
}
]
},
"addr": "192.168.77.42:6789/0",
"public_addr": "192.168.77.42:6789/0"
},
{
"rank": 2,
"name": "ceph03",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "192.168.77.43:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "192.168.77.43:6789",
"nonce": 0
}
]
},
"addr": "192.168.77.43:6789/0",
"public_addr": "192.168.77.43:6789/0"
}
]
}
}
Use ceph mon dump to see more detailed monitor information
[root@ceph01 ceph]# ceph mon dump
epoch 3
fsid 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
last_changed 2024-04-13 16:13:12.488751
created 2024-04-12 13:43:12.316352
min_mon_release 14 (nautilus)
0: [v2:192.168.77.41:3300/0,v1:192.168.77.41:6789/0] mon.ceph01
1: [v2:192.168.77.42:3300/0,v1:192.168.77.42:6789/0] mon.ceph02
2: [v2:192.168.77.43:3300/0,v1:192.168.77.43:6789/0] mon.ceph03
dumped monmap epoch 3
Deploy Mgr high availability
The main job of ceph-mgr at present is to expose cluster metrics to the outside world.
Only one mgr node is active at a time; the rest stay in standby. A standby only takes over, and switches to active, when the current active node fails.
ceph-deploy mgr create ceph02 ceph03
[root@ceph01 ceph]# ceph -s
cluster:
id: 32029ead-afed-49e0-b0d8-a7f3eb56e7c5
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph01,ceph02,ceph03 (age 6m)
mgr: ceph01(active, since 26h), standbys: ceph02, ceph03
osd: 6 osds: 6 up (since 17m), 6 in (since 17m)
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 6.0 GiB used, 90 GiB / 96 GiB avail
pgs:
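To watch the failover behaviour yourself, you can optionally force the active mgr to step down and let a standby take over; this is just a sanity check, not a required step:
ceph mgr fail ceph01   # mark the current active mgr as failed
ceph -s                # the mgr line should now show ceph02 or ceph03 as active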
RBD block storage
Block storage is the storage type used in storage area networks: data is stored as blocks inside volumes, and the volumes are attached to nodes. It gives applications large capacity together with high reliability and performance.
RBD is the Ceph Block Device protocol. Besides reliability and performance, RBD supports full and incremental snapshots, thin provisioning, copy-on-write clones, and in-memory caching.
Ceph RBD currently supports images up to 16 EB. Images can be mapped directly as disks to bare-metal hosts, virtual machines, or other machines; KVM and Xen fully support RBD, and vendors such as VMware also support it. A short example of creating and mapping an image is sketched below, after the pool is created.
Create a resource pool
ceph osd pool create sunday 64 64
# sunday is the pool name
# 64 placement groups (pg_num and pgp_num should be kept equal)
# 64 for pgp_num
# no replica count specified, so the default of three replicas applies
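A common rule of thumb (an approximation, not a hard rule) is total PGs ≈ (number of OSDs × 100) / replica size, rounded to a power of two and split across all pools; with 6 OSDs and size 3 that is about 200, so 64 or 128 PGs for a single pool is reasonable. If you later need more, raise pg_num and pgp_num together, for example:
ceph osd pool set sunday pg_num 128
ceph osd pool set sunday pgp_num 128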
List pools
[root@ceph01 ceph]# ceph osd lspools
1 sunday
Check the replica count of the sunday pool
The default is 3 replicas, which keeps the data highly available
[root@ceph01 ceph]# ceph osd pool get sunday size
size: 3
The replica count can also be changed as needed
[root@ceph01 ceph]# ceph osd pool set sunday size 2
set pool 1 size to 2
[root@ceph01 ceph]# ceph osd pool get sunday size
size: 2
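With the sunday pool in place, a minimal sketch of creating and mapping an RBD image (the image name disk01, its size, and the mount point are illustrative; mapping needs the rbd kernel module on the client):
rbd pool init sunday
rbd create sunday/disk01 --size 10G
rbd info sunday/disk01
# on older kernels you may first need: rbd feature disable sunday/disk01 object-map fast-diff deep-flatten
rbd map sunday/disk01            # prints a device name such as /dev/rbd0
mkfs.xfs /dev/rbd0 && mount /dev/rbd0 /mnt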
2.10 Create an OSD with a journal partition (the /dev/vdc1 and /dev/vdd devices below come from a different disk layout and are only examples)
2.10.1 Create the journal partition
pvcreate /dev/vdc1
vgcreate ceph-osd0-journal /dev/vdc1
lvcreate -l 100%FREE --name log ceph-osd0-journal
2.10.2 Create the OSD and attach the journal partition
ceph-deploy --overwrite-conf osd create ceph01 --filestore --fs-type xfs --data /dev/vdd --journal ceph-osd0-journal/log
2.11 Deploy the MDS component
ceph-deploy mds create ceph0{1,2,3}
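With the MDS daemons running, a CephFS file system can be created on top of two pools. A minimal sketch (the pool names, fs name, and PG counts are illustrative values for this small test cluster):
ceph osd pool create cephfs_data 32 32
ceph osd pool create cephfs_metadata 16 16
ceph fs new sundayfs cephfs_metadata cephfs_data
ceph fs status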
2.12 Deploy the mgr Dashboard component
2.12.1 Add the mgr role (already present on all three nodes if you followed the Mgr HA section)
ceph-deploy mgr create ceph0{1,2,3}
2.12.2 Enable the dashboard module
ceph mgr module enable dashboard
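If the enable command complains that no mgr daemon supports the dashboard module, note that on Nautilus the dashboard ships as a separate package; installing it on each mgr node first should fix this (package name assumed to be the standard el7 one):
yum install -y ceph-mgr-dashboard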
2.12.3 Create a self-signed certificate
ceph dashboard create-self-signed-cert
2.12.4 Create the web login username and password (on newer Nautilus point releases the password must be supplied from a file with -i rather than inline)
ceph dashboard set-login-credentials user-name password
2.12.5 Check how the service is accessed
ceph mgr services
2.12.6 Change the dashboard's default port from 8443 to 30443
ceph config set mgr mgr/dashboard/server_port 30443
systemctl restart ceph-mgr.target
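After the restart, re-check the exposed services to confirm the dashboard now listens on port 30443 (the URL shown depends on which mgr is currently active):
ceph mgr services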