February 21, 2017

Common Ceph Operations and Troubleshooting

This post covers common Ceph operations and how to handle frequently encountered problems.

Common operations

Check the size of each pool

ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    4189G     3870G         318G          7.61
POOLS:
    NAME        ID     USED       %USED     MAX AVAIL     OBJECTS
    rbd         0      61634M      1.44         1241G       12108
    volumes     1           0         0         1241G           2
    images      2      10782M      0.25         1241G        1358
    backups     3           0         0         1241G           0
    vms         4           0         0         1241G           2
    cache       5      10549M      0.25         1241G        1357

Check cluster health and status

ceph health
ceph status

Other status commands

ceph osd stat
ceph osd dump
ceph osd tree
ceph mon dump
ceph quorum_status
ceph mds stat
ceph mds dump

List pools

ceph osd lspools

List the images in a pool

rbd ls images -l
NAME                                       SIZE PARENT FMT PROT LOCK
01111dea-70fc-42fc-bd3c-143acfc98512      3072M          2
01111dea-70fc-42fc-bd3c-143acfc98512@snap 3072M          2 yes
1df83732-d521-440a-a7e7-d98067b7233b       589M          2
1df83732-d521-440a-a7e7-d98067b7233b@snap  589M          2 yes
b14bf911-c281-4e3d-bcc2-da58b02c3400       502M          2
b14bf911-c281-4e3d-bcc2-da58b02c3400@snap  502M          2 yes
ea8266ad-3eb4-4679-ab01-db61768dd753      6619M          2
ea8266ad-3eb4-4679-ab01-db61768dd753@snap 6619M          2 yes

Show detailed information about an image

rbd -p images info 01111dea-70fc-42fc-bd3c-143acfc98512
rbd image '01111dea-70fc-42fc-bd3c-143acfc98512':
    size 3072 MB in 384 objects
    order 23 (8192 kB objects)
    block_name_prefix: rbd_data.81062b813386
    format: 2
    features: layering
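
The numbers in this output are related: an image is split into objects of 2^order bytes, so the object count reported by rbd info is the image size divided by the object size. A minimal sketch of that arithmetic in shell follows; the function name rbd_object_count is made up for illustration and is not part of the rbd CLI.

```shell
# Hypothetical helper (not an rbd subcommand): compute how many objects an
# image of SIZE_MB megabytes spans when each object is 2^ORDER bytes.
rbd_object_count() {
    local size_mb="$1" order="$2"
    local size_bytes=$(( size_mb * 1024 * 1024 ))
    local object_bytes=$(( 1 << order ))
    # Round up so a partially filled last object is still counted.
    echo $(( (size_bytes + object_bytes - 1) / object_bytes ))
}

rbd_object_count 3072 23   # the image above: 3072 MB, order 23; prints 384
```

This is the maximum number of objects. RBD images are thin-provisioned, so counting the rbd_data.* objects that actually exist in the pool with rados can give a lower number.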

Count the objects of an image

rados -p vms ls |grep rbd_data.11e86f017fe7 | wc -l

List snapshots

rbd -p images snap ls 01111dea-70fc-42fc-bd3c-143acfc98512
SNAPID NAME    SIZE
    39 snap 3072 MB

Delete snapshots (two ways)

  1. Delete all snapshots of an image
    rbd -p images snap purge 01111dea-70fc-42fc-bd3c-143acfc98512

  2. Delete a single snapshot
    rbd -p images snap rm 01111dea-70fc-42fc-bd3c-143acfc98512@snap
    
Test authentication

ceph -n client.glance --keyring ceph.client.glance.keyring

Test image creation

rbd create --size 1024 xsltest -p images -n client.glance --keyring ceph.client.glance.keyring
    
See which client is using an image

rados -p rbd listwatchers myrbd.rbd
watcher=10.2.0.131:0/1013964 client.34453 cookie=1

rbd info myrbd
rbd image 'myrbd':
    size 1024 TB in 268435456 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.82072ae8944a
    format: 2
    features: layering

rados -p rbd listwatchers rbd_header.82072ae8944a
watcher=10.2.0.131:0/1013964 client.34453 cookie=1
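
For a format 2 image, the header object that clients watch is named rbd_header.<id>, where <id> is the suffix of the block_name_prefix (rbd_data.<id>) reported by rbd info, as the output above shows. A small sketch of that mapping follows; header_object_for_prefix is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: map a format 2 block_name_prefix (rbd_data.<id>)
# to the name of the image's header object (rbd_header.<id>).
header_object_for_prefix() {
    local prefix="$1"
    case "$prefix" in
        rbd_data.*) echo "rbd_header.${prefix#rbd_data.}" ;;
        *) echo "unexpected prefix: $prefix" >&2; return 1 ;;
    esac
}

header_object_for_prefix rbd_data.82072ae8944a   # prints rbd_header.82072ae8944a
```

The result can be passed straight to rados, for example: rados -p rbd listwatchers "$(header_object_for_prefix rbd_data.82072ae8944a)".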

Debug mode

Append --debug-rbd 20 --debug-ms 1 to the command:

rbd -p images info --debug-rbd 20 --debug-ms 1 01111dea-70fc-42fc-bd3c-143acfc98512

Common problems

device busy

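Deleting an image sometimes fails with a device busy error.

Before deleting an image, delete its snapshots first. If deleting a snapshot also fails with device busy, try unprotecting the snapshot first: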

rbd snap unprotect --debug-rbd 20 --debug-ms 1 $snapname

If that does not work either, the snapshot may have children (clones). List the children and delete them first:

rbd children volumes/volume-9f441318-3169-40f2-9c99-bc4297834810@snapshot-bdbfa276-190c-4be4-98c5-29bb31ed8d4d
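
The order described above (remove the children, unprotect the snapshot, delete the snapshot, then delete the image) can be sketched as a dry-run helper that only prints the rbd commands it would run. The function name, its argument layout, and the child name vms/example_disk are illustrative assumptions; the child image specs are expected to be supplied by hand, for example copied from the output of rbd children.

```shell
# Hypothetical dry-run helper: print (do not execute) the commands needed to
# remove a protected snapshot and then its parent image.
# Usage: print_snap_cleanup POOL/IMAGE@SNAP [CHILD_POOL/CHILD_IMAGE ...]
print_snap_cleanup() {
    local snap_spec="$1"
    shift
    local image_spec="${snap_spec%@*}"   # strip "@SNAP" to get POOL/IMAGE
    local child
    for child in "$@"; do
        echo "rbd rm $child"             # clones must be gone before unprotect succeeds
    done
    echo "rbd snap unprotect $snap_spec"
    echo "rbd snap rm $snap_spec"
    echo "rbd rm $image_spec"
}

# vms/example_disk is a made-up child image used only for illustration.
print_snap_cleanup images/01111dea-70fc-42fc-bd3c-143acfc98512@snap vms/example_disk
```

Review the printed commands before running any of them: deleting a child destroys its data. If a child has to be kept, rbd flatten detaches it from the parent snapshot instead of deleting it.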

If the deletion fails with an error like this:

2015-10-15 18:56:52.467396 7ff1eaaef880 -1 librbd: image has watchers - not removing
Removing image: 0% complete...failed.
rbd: error: image still has watchers
This means the image is still open or the client using it crashed. Try again after closing/unmapping it or waiting 30s for the crashed client to timeout.
Removing image: 100% complete...done.

use rados to look up the image's watchers:

rbd -p vms info ce5616a8-9bde-43a9-a227-dfca3c0d52b7_disk
rbd image 'ce5616a8-9bde-43a9-a227-dfca3c0d52b7_disk':
    size 40960 MB in 5120 objects
    order 23 (8192 kB objects)
    block_name_prefix: rbd_data.32010637b4f12
    format: 2
    features: layering
    parent: images/38d83516-f6c3-44ff-b447-b874e00b052b@snap
    overlap: 2048 MB

rados -p vms listwatchers rbd_header.32010637b4f12
watcher=10.1.1.1:0/1050778 client.204885 cookie=1
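
Each watcher line starts with watcher=<ip>: followed by further address details. When an image has several watchers, it can help to reduce the output to the client IP addresses. A minimal sketch using sed follows; watcher_ips is a hypothetical helper, it assumes IPv4 addresses, and the here-document stands in for real rados output.

```shell
# Hypothetical helper: read `rados listwatchers` output on stdin and print
# the unique client IP addresses (the part between "watcher=" and the first ":").
# Assumes IPv4 addresses; lines that do not start with "watcher=" are ignored.
watcher_ips() {
    sed -n 's/^watcher=\([^:]*\):.*/\1/p' | sort -u
}

# Sample input copied from the output above. On a live cluster, pipe
# `rados -p vms listwatchers rbd_header.32010637b4f12` into watcher_ips instead.
watcher_ips <<'EOF'
watcher=10.1.1.1:0/1050778 client.204885 cookie=1
EOF
```

For the sample line this prints 10.1.1.1, the host to investigate.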

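The output above shows that 10.1.1.1 still holds an open connection to the image. Find the process on that host and make it close the connection.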

ps aux | grep 'vms/dc139f2b-22b1-4c15-b236-ae38a7fad329_disk'

Permalink: https://www.opsdev.cn/post/ceph-common-issue.html

-- EOF --
