In my last post I showed you how to monitor your Proxmox Ceph storage with CheckMK. Let’s have a look.
In my home lab I have a 256 GB Ceph pool CEPHPOOL1. As long as it’s empty it looks like this in CheckMK:

The total size displayed is only 243 GB, probably due to Ceph overhead, reserves or whatever. We will get a warning when used space exceeds 194 GB (80 %) and a critical when it exceeds 218 GB (90 %). That's OK for the moment; we will come back to it later.
Then I moved five virtual disks to CEPHPOOL1. If you do that: don't forget to tick “Delete source”! This is not the default, although one would think that “move” implies deletion of the source. If you forget it, you will end up with a dead copy¹ of the disk in the source storage.
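If you prefer the CLI over the GUI, the move (including source deletion) looks roughly like this; 101 and scsi0 are placeholders for your VMID and disk, see man qm for the details of your PVE version:
# qm move_disk 101 scsi0 CEPHPOOL1 --delete 1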
Now let’s take a look at how the graph for CEPHPOOL1 developed:

Overbooking and trim
The first interesting observation is that we see some kind of saw-tooth pattern. That is because after migrating the disks of each VM to Ceph I ran fstrim -a in that VM. This discards blocks which are not in use from all mounted filesystems. “Discard” here means “give unused blocks back to the underlying storage”. For example, the first VM I migrated seems to have a 192 GB drive allocated but actually only uses 64 GB of that space; by running fstrim I “gave back” the difference of 128 GB to Ceph. After the second VM I automated this task (see below).
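Two commands are handy inside the guest here: lsblk -D shows whether the virtual disk advertises discard support at all (non-zero DISC-GRAN/DISC-MAX columns), and fstrim with -v added reports how much space each filesystem gave back. Run as root in the guest:
# lsblk -D
# fstrim -av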
There are a few things to consider when doing that:
- This is “overbooking”: you can allocate more space than you actually have in your storage. There is a big “but”: if a VM decides to actually use its remaining space, the Ceph pool can become 100% full and therefore switch to read-only, bringing all VMs that use this pool to a halt. Don't forget that and closely monitor used and free space! I advise you to lower the warning and critical thresholds in CheckMK to 60/70% for Ceph pools!
- Make sure that all your virtual disks run with “discard=on” and “ssd=on” ticked (the latter can only be configured when “Advanced” is ticked). “discard” makes it possible to give unused blocks back, and “ssd” nudges your guest OS into actually using trim (Windows won't trim unless it “sees” an SSD). Unfortunately you need to shut down the VM to change these settings. The CLI equivalents for these settings are sketched after this list.
- The fact that migrating virtual disks takes up all the unused space inside the VM is a limitation of QEMU live migration. To automate trimming after a storage migration, make sure that all your VMs have “Run guest-trim after a disk move or VM migration” ticked under Options > QEMU Guest Agent. You should then see “qemu-ga[????]: info: guest-fstrim called” in /var/log/messages after each migration.
- In Linux guests run “systemctl enable fstrim.timer” once after installation (works for RHEL, Debian and their heirs). It will run fstrim once a week. fstrim only “sees” mounted filesystems, so don't leave unpartitioned space, unmounted partitions or LVM volume groups with free space lying around. Partition, format and mount them, e.g. as “/mnt/dummy” or “/mnt/reserve”.
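For reference, here is roughly what those settings look like on the Proxmox CLI. 101, scsi0 and the volume name vm-101-disk-0 are placeholders; check qm config 101 for the real values before copying anything. Set discard and SSD emulation on a disk (as said above, this only becomes active after the VM has been shut down and started again):
# qm set 101 --scsi0 CEPHPOOL1:vm-101-disk-0,discard=on,ssd=1
Enable the guest agent including guest-trim after disk moves and migrations:
# qm set 101 --agent enabled=1,fstrim_cloned_disks=1
And inside a Linux guest, enable the weekly trim timer:
# systemctl enable --now fstrim.timer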
CheckMK bug: wrong “total size”
The second interesting observation is that the total size (the green line) moved upwards. I’m adamant that I did not actually add any new storage while moving virtual disks to Ceph. 🙂
This is actually a bug in CheckMK. It uses what it gets from ceph df detail --format json:
# ceph df detail --format json | jq '.pools[] | select(.name=="CEPHPOOL1")'
{
  "name": "CEPHPOOL1",
  "id": 3,
  "stats": {
    "stored": 70027208056,
    ...
    "bytes_used": 210081734761,
    ...
    "max_avail": 189614047232,
    ...
  }
}
Now go to your CheckMK server, cd to /opt/omd/versions/default/lib/python3/cmk/base/plugins/agent_based and take a look at ceph_df_json_section.py. You will find:
mps.extend(
    [
        (
            pool["name"],
            str_to_mebibyte(pool["stats"]["max_avail"])
            + str_to_mebibyte(pool["stats"]["bytes_used"]),
            str_to_mebibyte(pool["stats"]["max_avail"]),
            0,
        )
        for pool in ceph_df["pools"]
    ]
)
return mps
mps gets extended with a tuple per pool containing the pool name, the size of the pool, the free space in the pool and the value 0. But the size of the pool is calculated as max_avail + bytes_used. That is 189614047232 + 210081734761, or 176.59 GB + 195.65 GB = 372.25 GB, almost exactly the 372.26 GB displayed in the graph. 176 GB (max_avail) is the free space in the pool. But 195 GB (bytes_used) is the space allocated (or “overbooked”) to virtual storage, not the actual space used.
You might say that this is cosmetic, but the problem behind it is that the warning and critical thresholds are calculated from this (wrong) total size and go through the roof: they are now at 297 and 335 GB, far more than the 241 GB we actually have. That is asking for trouble!
When we instead calculate max_avail + stored we get 189614047232 + 70027208056 = 176.59 GB + 65.22 GB = 241.81 GB, the total space we started with. And the total space should stay the same until we add some storage to the pool.
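You can verify both sums directly against the live numbers; assuming jq is available on the node, something along these lines (the key names in the output are just labels I made up):
# ceph df detail --format json | jq '.pools[] | select(.name=="CEPHPOOL1").stats | {checkmk_total: (.max_avail + .bytes_used), expected_total: (.max_avail + .stored)}'
The first value (in bytes) corresponds to the inflated ~372 GB total that CheckMK currently graphs, the second to the ~242 GB the pool really has.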
So, make a copy of ceph_df_json_section.py, then edit it and change bytes_used to stored. Save it, wait a few minutes and look at the graph again.
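If you want to script it, the change boils down to something like this sketch (file name and path as described above; keep the backup, and remember that a CheckMK update will ship the unpatched plugin again):
# cd /opt/omd/versions/default/lib/python3/cmk/base/plugins/agent_based
# cp ceph_df_json_section.py ceph_df_json_section.py.orig
# sed -i 's/pool\["stats"\]\["bytes_used"\]/pool["stats"]["stored"]/' ceph_df_json_section.py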

See? Fixed! And the thresholds are back to “useful”.
What to do with the bug?
Actually, this is a known bug in CheckMK 2.3.0p18 (current as of October 2024) and a pull request exists to fix it. Alas, they have decided not to implement the trivial fix (“our resources are quite limited”) and instead point to a new Ceph statistics plugin by Robert Sander from Heinlein Support, who has contributed a lot to CheckMK; see this page.
I will test that alternate Ceph plugin and report back here a bit later.
Update: I did take a look at the alternate plugin. Read my review.
¹ To get rid of a dead disk copy: run qm rescan --vmid VMID, then you will find it as “Unused Disk” in the VM's hardware tab and can delete it from there. ↩︎
