Since I installed CernVM-FS at our institution to handle how we distribute software to our workstations and laptops, things have gone so smoothly that I am now very rusty with its administration… But I think it is about time I started to worry about the disk space taken by the repositories on both the Stratum-0 and the Stratum-1s.
First, I would like to see how much space is used at present. Is there any cvmfs utility to report some stats on this, or is it down to using the system tools to check the space in the relevant directories?
Second, when I distribute a new version of the repository I always do it like: cvmfs_server publish -a "$tag" -m "$message" sie_u22.iac.es, but now we have 131 named snapshots and I would like to do some cleaning up. I know I can do cvmfs_server tag -r , but this would be a bit painful to do one by one. Is there a way to just say “keep the latest 50 tags”? And once I do that in the Stratum-0 machine, to reclaim space in Stratum-1s, do I have to do anything manually or deletion would happen automatically?
In case it is relevant, our configuration file reads:
First, I would like to see how much space is used at present. Is there any cvmfs utility to report some stats on this, or is it down to using the system tools to check the space in the relevant directories?
No, there’s no cvmfs utility to check the disk space. All the data that makes up the repository resides in a single location (defined by CVMFS_UPSTREAM_STORAGE, usually /srv/cvmfs/ if it’s on a local disk, or an S3 bucket), so it should be easy to check with system tools: either du -h -d 1 /srv/cvmfs/sie_u22.iac.es/ or s3cmd du <bucket>/
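As a minimal sketch of the system-tools approach for a local-disk backend, the function below prints the size of each top-level directory under a storage root, smallest first. The name repo_usage is made up for illustration, and /srv/cvmfs is only the usual default mentioned above; it does not cover S3 backends.

```shell
#!/bin/sh
# Sketch: report disk usage per repository under a local CVMFS storage root.
# repo_usage is a hypothetical helper name; /srv/cvmfs is an assumed default
# and may differ on your Stratum-0 / Stratum-1.
repo_usage() {
    root="${1:-/srv/cvmfs}"
    for dir in "$root"/*/; do
        # Skip the literal pattern if the glob matched nothing.
        [ -d "$dir" ] || continue
        # -s: one total per directory, -k: sizes in KiB.
        du -sk "$dir"
    done | sort -n
}
```

Calling repo_usage with no argument inspects /srv/cvmfs; pass another path if CVMFS_UPSTREAM_STORAGE points elsewhere.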
Is there a way to just say “keep the latest 50 tags”?
Not really. That exists only for auto-generated tags: if you set CVMFS_AUTO_TAG=true it will generate tags like generic-2025-08-27T12:44:46Z for every publication. You can then set CVMFS_AUTO_TAG_TIMESPAN="1 month ago" to clean up autotags older than a month (the cleanup runs when doing new publications). For named snapshots, it should be possible to parse the output of cvmfs_server tag -l and do similar deletions in a script. The autotag cleanup essentially does the same thing. Let me open an issue to add an option to the server tools eventually.
And once I do that in the Stratum-0 machine, to reclaim space in Stratum-1s, do I have to do anything manually or deletion would happen automatically?
Nothing is deleted from cvmfs repository backend stores by default, even when you delete files in a publication. You have to run the garbage collection (cvmfs_server gc), both on the stratum-0 and stratum-1 (it’s usually done in a cron job). I see you have gc disabled:
CVMFS_GARBAGE_COLLECTION=false
so first you need to change that to true, and then do one (empty) publication for the setting to take effect.
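A minimal sketch of the first step, assuming the setting lives in a plain KEY=value file such as /etc/cvmfs/repositories.d/<repo>/server.conf. The helper name enable_gc is made up for illustration; it rewrites the file in place, so keep a backup.

```shell
#!/bin/sh
# Sketch: set CVMFS_GARBAGE_COLLECTION=true in a server.conf-style file.
# enable_gc is a hypothetical helper; the path argument is whichever file
# holds the setting on your Stratum-0 (assumed, not taken from the thread).
enable_gc() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    if grep -q '^CVMFS_GARBAGE_COLLECTION=' "$conf"; then
        # Replace the existing value, whatever it was.
        sed 's/^CVMFS_GARBAGE_COLLECTION=.*/CVMFS_GARBAGE_COLLECTION=true/' "$conf" > "$tmp"
    else
        # Setting not present: append it on its own line.
        cat "$conf" > "$tmp"
        printf '\nCVMFS_GARBAGE_COLLECTION=true\n' >> "$tmp"
    fi
    # Write back through the original file to keep its owner and mode.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

After that, an empty publication (cvmfs_server transaction sie_u22.iac.es followed by cvmfs_server publish sie_u22.iac.es) makes the setting take effect, and cvmfs_server gc can then be run on the stratum-0 and the stratum-1s.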
I was actually doing this a few minutes ago, and my one-liner to remove the oldest 50 named snapshots ended up being a bit ugly, but it did the job:
for tag in `sudo cvmfs_server tag -lx sie_u22.iac.es | tac | head -n 50 | awk '{print $1}' | tr '\n' ' '` ; do sudo cvmfs_server tag -f -r $tag sie_u22.iac.es ; done
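For the original “keep the latest 50 tags” wording, a variant of the one-liner above could look like the sketch below. It assumes, as that one-liner implies, that cvmfs_server tag -lx prints one tag per line, newest first, with the tag name in the first column. The function names are made up for illustration, and the script only prints the removal commands rather than running them.

```shell
#!/bin/sh
# Sketch: print the tags that fall outside the newest N.
# Assumes stdin is a tag listing with one tag per line, newest first, and the
# tag name in the first whitespace-separated column (an assumption about the
# `cvmfs_server tag -lx` output). tags_beyond_newest is a hypothetical name.
tags_beyond_newest() {
    keep="$1"
    awk -v keep="$keep" 'NR > keep { print $1 }'
}

# Dry run: print one removal command per surplus tag instead of executing it.
# prune_tags is a hypothetical name as well.
prune_tags() {
    repo="$1"
    keep="$2"
    tags_beyond_newest "$keep" | while read -r tag; do
        echo cvmfs_server tag -f -r "$tag" "$repo"
    done
}
```

For example, sudo cvmfs_server tag -lx sie_u22.iac.es | prune_tags sie_u22.iac.es 50 prints the commands for everything beyond the newest 50. Check the list for any tag you want to keep before running the commands.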
Thanks a lot, Valentin. That was really useful. I cleaned up the list of named snapshots, and for the moment I realized that I can probably live without garbage collection: there are very few deletions in the repository and it is mostly adding new packages, so I don’t think I would gain much from it. The lazy in me tells me to postpone this for the time being…