I’ve been working on setting up a stratum0 with an S3 backend. I’ve gotten through that and started work on getting a client mounting, but the client is unable to find the .cvmfspublished file. I can confirm the httpd on the stratum0 is not serving the file, but I do see it when I browse the repo in my S3 bucket. I’m curious - are there extra client hints needed to tell the client about the S3 bucket?
(catalog) Initialize catalog [05-29-2025 13:55:18 PDT]
(cache) unable to read local checksum [05-29-2025 13:55:18 PDT]
(download) escaped http://cvmfs.[redacted]/cvmfs/cvmfs.[redacted]/.cvmfspublished to http://cvmfs.[redacted]/cvmfs/cvmfs.[redacted]/.cvmfspublished [05-29-2025 13:55:18 PDT]
(curl) {header/out} GET http://cvmfs.[redacted]/cvmfs/cvmfs.[redacted]/.cvmfspublished HTTP/1.1
Host: cvmfs.[redacted]
Accept: */*
Proxy-Connection: Keep-Alive
Connection: Keep-Alive
User-Agent: cvmfs Fuse 2.11.5 [05-29-2025 13:55:18 PDT]
(curl) {header/in} HTTP/1.1 404 Not Found [05-29-2025 13:55:18 PDT]
(download) http status error code: HTTP/1.1 404 Not Found
[404] [05-29-2025 13:55:18 PDT]
(curl) {info} Failed writing header [05-29-2025 13:55:18 PDT]
No, for the client the storage backend does not matter, because the data is always served over HTTP. That does not seem to work for your bucket, though - you can easily check this on the command line with curl:
curl -o - http://cvmfs.[redacted]/cvmfs/cvmfs.[redacted]/.cvmfspublished
The .cvmfspublished file needs to be accessible like this - if this gives a 404 you need to change the s3 configuration.
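For a quicker read than dumping the file, the same check can print just the HTTP status code. This is a sketch; the hostnames `cvmfs.example.org` and `repo.example.org` are placeholders for the redacted names in this thread:

```shell
# Sketch of the check above as a small helper.

# Compose the manifest URL the client requests on mount.
published_url() {
    host="$1"; fqrn="$2"
    echo "http://${host}/cvmfs/${fqrn}/.cvmfspublished"
}

# Print only the HTTP status code: 200 means the web entry point works,
# 404 reproduces the client failure in the log above.
check_manifest() {
    curl -s -o /dev/null -w '%{http_code}\n' "$(published_url "$1" "$2")"
}

# Usage:
#   check_manifest cvmfs.example.org repo.example.org
```

The `-w '%{http_code}'` write-out avoids having to eyeball an error body or headers.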
It does indeed 404. We have an s3.conf and a repositories.d/[server]/server.conf pointing to the s3 w/ CVMFS_STRATUM0 and CVMFS_UPSTREAM_STORAGE.
The repo successfully works with a transaction/publish update (we’ve confirmed this populates the changes to the upstream S3 bucket). If the transaction/publish step works, are there further bits of plumbing that still need to be done server-side?
The transaction/publish will indeed use the S3 credentials, so that can work without the bucket being accessible over HTTP. My guess would be that the problem is with the ACL, which needs to be set to allow public read access; see: Creating a Repository (Stratum 0) — CernVM-FS 2.12.6 documentation
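If the S3 service accepts AWS-style bucket policies (many S3-compatible stores do, but check your deployment), public read for the repository objects can be granted with a policy along these lines. This is a sketch; the bucket name `cvmfs-bucket` and the endpoint in the comment are placeholders:

```shell
# Write a public-read bucket policy to a temporary file.
cat > /tmp/cvmfs-public-read.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "CvmfsPublicRead",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::cvmfs-bucket/*"
  }]
}
EOF

# Apply it, e.g. with the AWS CLI against your endpoint (placeholder URL):
#   aws --endpoint-url https://s3.example.org s3api put-bucket-policy \
#       --bucket cvmfs-bucket --policy file:///tmp/cvmfs-public-read.json
```

Granting only `s3:GetObject` keeps listing and writing private while letting anonymous HTTP GETs fetch objects such as .cvmfspublished.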
I had considered the ACL but a curl to the S3 url for that .cvmfspublished in the bucket is perfectly fine w/o credentials.
Is there any logging on the server side that might reveal where the disconnect is? The normal httpd access and error logs just show what’s already clear (that the GET is 404’ing).
You need public read access to the bucket over HTTP; the server side won’t help, as this seems to be an issue in the S3 configuration.
a curl to the S3 url for that .cvmfspublished in the bucket is perfectly fine w/o credentials.
Couldn’t you use that url then?
In any case I think it’d help if you could share your server.conf.
# Created by cvmfs_server.
CVMFS_CREATOR_VERSION=143
CVMFS_REPOSITORY_NAME=cvmfs.[redacted]
CVMFS_REPOSITORY_TYPE=stratum0
CVMFS_USER=root
CVMFS_UNION_DIR=/cvmfs/cvmfs.[redacted]
CVMFS_SPOOL_DIR=/var/spool/cvmfs/cvmfs.[redacted]
CVMFS_STRATUM0=https://vasts3.[redacted]/cvmfs.[redacted]/cvmfs.[redacted]
CVMFS_UPSTREAM_STORAGE=S3,/var/spool/cvmfs/cvmfs.[redacted]/tmp,cvmfs.[redacted]@/etc/cvmfs/s3.conf
CVMFS_USE_FILE_CHUNKING=true
CVMFS_MIN_CHUNK_SIZE=4194304
CVMFS_AVG_CHUNK_SIZE=8388608
CVMFS_MAX_CHUNK_SIZE=16777216
CVMFS_UNION_FS_TYPE=overlayfs
CVMFS_HASH_ALGORITHM=shake128
CVMFS_COMPRESSION_ALGORITHM=none
CVMFS_FILE_MBYTE_LIMIT=10240
CVMFS_EXTERNAL_DATA=false
CVMFS_AUTO_TAG=true
CVMFS_AUTO_TAG_TIMESPAN="8 weeks ago"
CVMFS_GARBAGE_COLLECTION=false
CVMFS_AUTO_REPAIR_MOUNTPOINT=true
CVMFS_AUTOCATALOGS=true
CVMFS_ASYNC_SCRATCH_CLEANUP=true
CVMFS_PRINT_STATISTICS=false
CVMFS_UPLOAD_STATS_DB=false
CVMFS_UPLOAD_STATS_PLOTS=false
CVMFS_IGNORE_XDIR_HARDLINKS=true
That should be http and not https in CVMFS_STRATUM0. It’s possible to use https, but it requires extra configuration.
Is there any documentation you know of for that extra configuration? Security folks would come at me with pitchforks if I used straight http (even if it’s completely mundane data).
There’s unfortunately not much documentation for that, and I don’t have much experience with it personally either, but I think you just need to add the following to your /etc/cvmfs/s3.conf:
CVMFS_S3_USE_HTTPS=true
CVMFS_USE_SSL_SYSTEM_CA=true
We do have a test for it here: cvmfs/test/src/684-https_s3/main at devel · cvmfs/cvmfs · GitHub
but it is slightly complicated as it uses self-signed certificates.
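For context, a full /etc/cvmfs/s3.conf with HTTPS enabled would then look roughly like this. This is a sketch using the standard CVMFS S3 parameters; all values are placeholders for the redacted ones in this thread:

```
CVMFS_S3_HOST=vasts3.example.org
CVMFS_S3_PORT=443
CVMFS_S3_BUCKET=cvmfs-bucket
CVMFS_S3_ACCESS_KEY=<access key>
CVMFS_S3_SECRET_KEY=<secret key>
CVMFS_S3_DNS_BUCKETS=false
CVMFS_S3_USE_HTTPS=true
CVMFS_USE_SSL_SYSTEM_CA=true
```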
We already had the USE_HTTPS setting, and the SYSTEM_CA one seems to have no impact on the client comms. Is there a way to turn on any sort of logging on the stratum0 here?
It feels like an httpd vhost isn’t configured or something…