CVMFS Use Cases

kevin · May 5, 2021, 5:46pm

I’m in the process of building a new ~500 node HPC cluster, and considering using CVMFS for software distribution. I’m trying to figure out what pieces I need to effectively support the cluster. Are there any documented use cases out there for people that have done this before?
I’m guessing I need a stratum 0 server, a couple of stratum 1 servers, and a couple of proxy servers, all local to the cluster. (For a cluster of this size, do I even need to bother with proxy servers? It seems like my clients could just connect directly to the stratum 1s.)
Also, the docs recommend running a proxy server on the same machine as the stratum 1 server, but it’s not clear to me why this is necessary.

Any guidance would be greatly appreciated!

Thanks,
Kevin

–
Kevin Hildebrand
HPC Architect
University of Maryland, College Park
Division of IT

jakob · May 6, 2021, 8:41am

Hi Kevin,

Welcome to CVMFS!

For use cases and experience in HPC, perhaps @bedroge can provide some additional insights.

What I would suggest is starting with only a few components and scaling up if it turns out to be necessary. For 500 nodes, I would suggest starting with a stratum 0, a single stratum 1, and two proxy servers. Having both a stratum 0 and a stratum 1 separates reading (stratum 1) from publishing (stratum 0). The two proxy servers allow for taking one of them out at any time for maintenance. I think in terms of load that should work just fine. You might install a second stratum 1 server if HA read access to the repository is a concern. If you have powerful physical nodes as stratum 1 web servers, you can certainly try skipping the proxies altogether. In this case, I’d set up at least two stratum 1s. (The advantage of proxy servers is that they are easier to scale up than stratum 1s because they are just caches.)
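
As a rough illustration of the two-proxy setup, here is a minimal Python sketch that builds the client-side `CVMFS_HTTP_PROXY` value. In that setting, `|` load-balances between proxies within one group and `;` separates fallback groups. The hostnames and port are hypothetical placeholders, and the helper `http_proxy_value` is just an illustrative name, not part of CVMFS.

```python
def http_proxy_value(proxies, allow_direct=False):
    """Build a CVMFS_HTTP_PROXY string.

    Proxies joined with '|' form one load-balanced group; a trailing
    ';DIRECT' group lets clients fall back to contacting the
    stratum 1 directly if every proxy in the group fails.
    """
    if not proxies:
        raise ValueError("at least one proxy URL is required")
    value = "|".join(proxies)
    if allow_direct:
        value += ";DIRECT"
    return value


# Hypothetical site-local proxies (assumption, not from the thread).
PROXIES = [
    "http://squid1.cluster.example:3128",
    "http://squid2.cluster.example:3128",
]

if __name__ == "__main__":
    # Line suitable for a client file such as /etc/cvmfs/default.local
    print(f'CVMFS_HTTP_PROXY="{http_proxy_value(PROXIES)}"')
```

With both proxies in one group, either one can be taken down for maintenance while clients keep using the other. Whether to add the `DIRECT` fallback depends on whether the stratum 1 could absorb the full client load.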

In your case, you would not need an additional proxy server on the stratum 1. This additional proxy server is the configuration recommended for large, regional stratum 1s serving the Worldwide LHC Computing Grid. For a single site, a standard Apache web server as the stratum 1 should work just fine.

Besides the number of client nodes, it would also make sense to consider the following:

The repository and domain name(s): once created, the names cannot be changed. It would make sense to pick a single domain name for your site and then different repository names if you have different “software librarians” (like we do with atlas.cern.ch, cms.cern.ch, etc.).
Relatedly, it makes sense to have a single set of master keys for the domain, which are used to sign the whitelists of all the repositories. There is more information on key management in the technical documentation.
Key and client configuration management is especially important if you plan at some point to make the repository available to the outside world, or if you have a larger number of users who would need to publish software in CVMFS. If you know it’s only going to be a single repository, maintained by a few administrators for a single site, you can just use the keys automatically created during repository creation and use your standard configuration management system to add the repository configuration on the clients under /etc/cvmfs/config.d/.
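
For the simple single-repository case, the file dropped into /etc/cvmfs/config.d/ can be quite small. The Python sketch below renders one, assuming a hypothetical repository name `software.hpc.example.org`, a hypothetical stratum 1 host, and that the public key created with the repository has been copied to the clients under /etc/cvmfs/keys/. `CVMFS_SERVER_URL` and `CVMFS_PUBLIC_KEY` are client parameters; the function name and everything else here is illustrative only.

```python
def render_repo_config(fqrn, stratum1_hosts, key_path=None):
    """Render the body of /etc/cvmfs/config.d/<fqrn>.conf.

    fqrn            fully qualified repository name
    stratum1_hosts  stratum 1 hostnames, in order of preference
    key_path        public key location on the client (assumed default below)
    """
    if not stratum1_hosts:
        raise ValueError("at least one stratum 1 host is required")
    if key_path is None:
        # Assumption: the key is deployed under the repository's name.
        key_path = f"/etc/cvmfs/keys/{fqrn}.pub"
    # '@fqrn@' is expanded by the client to the repository name;
    # multiple server URLs are separated by ';'.
    urls = ";".join(f"http://{host}/cvmfs/@fqrn@" for host in stratum1_hosts)
    return (
        f'CVMFS_SERVER_URL="{urls}"\n'
        f'CVMFS_PUBLIC_KEY="{key_path}"\n'
    )


if __name__ == "__main__":
    # Hypothetical names; substitute the ones chosen for your site.
    print(render_repo_config("software.hpc.example.org",
                             ["stratum1.cluster.example"]))
```

A configuration management system could template this file per repository. The proxy setting would normally live in a site-wide file rather than in the per-repository one.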

Cheers,
Jakob