Programmatically figure out if a client can properly access a CernVM-FS repository?

Hello,
I’m trying to find a way to monitor the health of our CernVM-FS clients, and I attempted a couple of alternatives, but none seem perfect so far:

  1. I tried looking in the repository server’s web server logs for accesses to the .cvmfspublished file. But as far as I can see, the /cvmfs mountpoint gets unmounted on some clients (even if I put it in the client’s /etc/fstab file?), or at least some clients don’t request this file for a while when nothing accesses /cvmfs, so some clients with a valid configuration never appear in the access logs. In any case, this would only show me the clients that connect to the server successfully, and I’m more interested in the clients that fail to connect for some reason.

  2. So I tried using cvmfs_config probe, but to my surprise the probe still reports OK even when I hide the web servers behind the firewall, although the client is obviously no longer able to access the repository. Is this expected behaviour, or am I misunderstanding what the probe is meant for?

In any case, what would be the recommended way to find out whether a client can properly access the repository? I want to put the check in a script that runs regularly and collects the results somewhere, so that I can monitor the health of all our CernVM-FS clients and, in case of issues, find the affected clients before their users realize that they are not getting the latest versions of the repository software.

Thanks,
Angel de Vicente

Hi Angel,

yes, there are various solutions to monitoring out there. If you are using autofs, it is expected that CVMFS will be unmounted after it has not been accessed for a certain time, and so it may not connect to any server until the next time it is mounted.

cvmfs_config probe actually only reads the configuration and does a filesystem access to the repository (ls and df), so if the repository is mounted already it also won’t talk to the server.

What you probably want to use is cvmfs_talk -i <repository name> internal affairs - here we expose a bunch of counters that can be monitored. In particular download.n_retries and the failover counters may be interesting for you. There is built-in support for using these counters with InfluxDB, see
Client Telemetry Aggregators — CernVM-FS 2.11.3 documentation
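
A minimal sketch of how these counters could be read from a monitoring script. It assumes the command prints one counter per line in a pipe-separated name|value|description form, so check the output on your own client first. The function names are hypothetical, and the repository name has to be supplied by the caller.

```python
import subprocess


def parse_counters(text):
    """Parse counter lines into a dict of name -> int.

    Assumption: each counter is printed as 'name|value|description'.
    Lines that do not fit (headers, free text) are skipped.
    """
    counters = {}
    for line in text.splitlines():
        parts = line.split("|")
        if len(parts) < 2:
            continue
        name, value = parts[0].strip(), parts[1].strip()
        try:
            counters[name] = int(value)
        except ValueError:
            continue  # header row or non-numeric value
    return counters


def read_counters(repo):
    """Run 'cvmfs_talk -i <repo> internal affairs' and parse its output.

    Raises CalledProcessError if the command fails, for example when the
    repository is not mounted.
    """
    result = subprocess.run(
        ["cvmfs_talk", "-i", repo, "internal", "affairs"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_counters(result.stdout)
```

Since counters like these are normally cumulative, a script would probably store the value of download.n_retries from the previous run and alert on an increase rather than on the absolute number.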

There is also this collectd plugin (although I think Steve was planning to move to Prometheus): GitHub - cvmfs/collectd-cvmfs: Collectd Plugin to Monitor CvmFS Clients

To monitor the connectivity between client and server, it’s also possible to just curl .cvmfspublished from the server periodically on the client machine and monitor that - the client does the same thing basically.
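
A rough sketch of such a periodic check, written in Python rather than with curl. It assumes the manifest is served at <server base>/<repository name>/.cvmfspublished. The host and repository names in the usage note are placeholders, and the function names are hypothetical.

```python
import urllib.request


def manifest_url(server_base, repo):
    """Build the manifest URL.

    Assumption: the server exposes <base>/<repo>/.cvmfspublished.
    """
    return f"{server_base.rstrip('/')}/{repo}/.cvmfspublished"


def can_fetch_manifest(url, timeout=10, opener=urllib.request.urlopen):
    """Return (ok, detail); ok is True only for HTTP 200 with a non-empty body.

    'opener' is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
            body = resp.read()
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False, f"request failed: {exc}"
    if status != 200:
        return False, f"unexpected HTTP status {status}"
    if not body:
        return False, "empty manifest"
    return True, f"fetched {len(body)} bytes"
```

For example, can_fetch_manifest(manifest_url("http://stratum1.example.org/cvmfs", "repo.example.org")) could be run from cron on each client and the result reported centrally. Note that this request does not necessarily take the same proxy path as the CernVM-FS client, so a success here says the server is reachable from the machine, not that the client's own proxy configuration works.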

Maybe it’d be interesting for you to join the next CVMFS coordination meeting; the site admins and operators who attend may be able to give you more tips! GitHub - cvmfs/collectd-cvmfs: Collectd Plugin to Monitor CvmFS Clients

Thanks for the tips. I will check those links.

In the meantime I had put something together myself using the “CernVM-FS Nagios Check for Clients”. I’m not using Nagios, but I wrote a small script that runs regularly: if a client is online, it runs the Nagios check, which seems sufficient to find out whether the client can connect to the server, and sends the result to a central database on the server. That way I get a clear picture of which clients are connecting fine and which have issues (hopefully none).

When is the CVMFS coordination meeting happening? (I believe you sent me the wrong link by mistake?)

Thanks

Right! Here’s the correct link: CernVM-FS Coordination Meeting (10 June 2024) · Indico

OK, yes, there are different solutions to monitoring, that seems perfectly fine.