Unable to create a new repo with data stored in S3-compatible storage

Hi,

I am trying to create a testing repo with data stored in an S3-compatible storage system, following this documentation:

https://cvmfs.readthedocs.io/en/latest/cpt-repo.html?#s3-compatible-storage-systems

I have created the bucket, and I can manually copy a file into it and list it.

[root@stratum0 ~]# cat creds
[default]
access_key = 1KWS********************
secret_key = 6Pq*********************
host_base = s3.echo.stfc.ac.uk
host_bucket = s3.echo.stfc.ac.uk/%(bucket)s

[root@stratum0 ~]# s3cmd --config creds mb s3://cvmfs_test_s3_egi_eu
Bucket 's3://cvmfs_test_s3_egi_eu/' created

[root@stratum0 ~]# s3cmd --config creds put /tmp/test s3://cvmfs_test_s3_egi_eu
upload: '/tmp/test' -> 's3://cvmfs_test_s3_egi_eu/test'  [1 of 1]
 23 of 23   100% in    0s   180.08 B/s  done

[root@stratum0 ~]# s3cmd --config creds ls s3://cvmfs_test_s3_egi_eu
2021-06-23 11:24           23  s3://cvmfs_test_s3_egi_eu/test

However, the command to create the repo is failing.

[root@stratum0 ~]# cat s3.conf
CVMFS_S3_ACCESS_KEY=1KWS***********
CVMFS_S3_SECRET_KEY=6Pq************
CVMFS_S3_HOST=s3.echo.stfc.ac.uk
CVMFS_S3_BUCKET=cvmfs_test_s3_egi_eu
CVMFS_S3_USE_HTTPS=true
CVMFS_S3_DNS_BUCKETS=false

[root@stratum0 ~]# cvmfs_server mkfs -s s3.conf -w https://cvmfs_test_s3_egi_eu.s3.echo.stfc.ac.uk -g -m -o cvmfs test_s3.egi.eu
Warning: CernVM-FS filesystems using overlayfs may not enforce hard link semantics during publishing.
Creating Configuration Files... done
Creating CernVM-FS Master Key and Self-Signed Certificate... done
Creating CernVM-FS Server Infrastructure... done
Signing 30 day whitelist with master key... Upload job for 'test_s3.egi.eu/.cvmfswhitelist' failed. (error code: 5 - S3: host connection problem)
failed to upload /var/spool/cvmfs/test_s3.egi.eu/tmp/whitelist.test_s3.egi.eu

I have tried both http and https, with the same result.

Has anybody experienced the same issue and been able to share a solution?

Thanks a lot in advance.
Cheers,
Jose

Hi Jose,

For the time being, you’d need to use plain HTTP, without the S. The CVMFS_S3_USE_HTTPS option currently has no effect, and the URL should start with http://. We have a PR to add HTTPS support for maintaining an S3 repository, but it is not yet merged or released.

I think the problem comes from the DNS-style bucket naming. According to the configuration, your S3 setup does not use the bucket name as part of the host name (CVMFS_S3_DNS_BUCKETS=false). In this case, you can try the option

-w http://s3.echo.stfc.ac.uk/cvmfs_test_s3_egi_eu
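To make the difference between the two addressing modes concrete, here is a small sketch (not from the CVMFS code; host and bucket names taken from this thread) of the URL each setting leads to:

```shell
#!/bin/sh
# Sketch of the two S3 addressing styles (host/bucket names from this thread).
HOST=s3.echo.stfc.ac.uk
BUCKET=cvmfs_test_s3_egi_eu

# Path style (CVMFS_S3_DNS_BUCKETS=false): the bucket is a path component.
path_style="http://${HOST}/${BUCKET}"

# Virtual-hosted (DNS) style (CVMFS_S3_DNS_BUCKETS=true): the bucket is a
# hostname label. Underscores are not valid in DNS hostnames, so a bucket
# named like this one is unlikely to work in this style.
dns_style="http://${BUCKET}.${HOST}"

echo "$path_style"
echo "$dns_style"
```

With CVMFS_S3_DNS_BUCKETS=false, the -w URL passed to cvmfs_server should match the path-style form.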

Cheers,
Jakob

Unfortunately, still failing:

[root@stratum0 ~]# cat s3.conf
CVMFS_S3_ACCESS_KEY=*****
CVMFS_S3_SECRET_KEY=*******
CVMFS_S3_HOST=s3.echo.stfc.ac.uk
CVMFS_S3_BUCKET=cvmfs_test_s3_egi_eu
CVMFS_S3_DNS_BUCKETS=true
#CVMFS_S3_USE_HTTPS=true

[root@stratum0 ~]# cvmfs_server mkfs -s s3.conf -w http://s3.echo.stfc.ac.uk/cvmfs_test_s3_egi_eu -g -m -o cvmfs test_s3.egi.eu
Warning: CernVM-FS filesystems using overlayfs may not enforce hard link semantics during publishing.
Creating Configuration Files... done
Creating CernVM-FS Master Key and Self-Signed Certificate... done
Creating CernVM-FS Server Infrastructure... done
Signing 30 day whitelist with master key... Upload job for 'cvmfs_test_s3_egi_eu/test_s3.egi.eu/.cvmfswhitelist' failed. (error code: 5 - S3: host connection problem)
failed to upload /var/spool/cvmfs/test_s3.egi.eu/tmp/whitelist.test_s3.egi.eu

Hi Jakob,
some extra info.

First, I have added these 2 lines to /etc/cvmfs/default.conf

CVMFS_SERVER_DEBUG=3
CVMFS_DEBUGLOG=/tmp/cvmfs.log

But they didn’t make any difference. BTW, is CVMFS_DEBUGLOG only for the clients?

Second, following a recommendation from our storage sysadmin, I am explicitly trying the hostname of one of the gateways.
Our S3-like storage area sits behind a bunch of gateways, so we are trying one directly to see whether anything shows up in its logs.
Here are the 2 attempts:

[root@stratum0 ~]# cat s3.conf
CVMFS_S3_ACCESS_KEY=***
CVMFS_S3_SECRET_KEY=***
CVMFS_S3_HOST=ceph-gw1.gridpp.rl.ac.uk
CVMFS_S3_BUCKET=cvmfs_test_s3_egi_eu
CVMFS_S3_DNS_BUCKETS=true

[root@stratum0 ~]# cvmfs_server mkfs -s s3.conf -w http://ceph-gw1.gridpp.ac.uk/cvmfs_test_s3_egi_eu -g -m -o cvmfs test_s3.egi.eu
Warning: CernVM-FS filesystems using overlayfs may not enforce hard link semantics during publishing.
Creating Configuration Files... done
Creating CernVM-FS Master Key and Self-Signed Certificate... done
Creating CernVM-FS Server Infrastructure... done
Signing 30 day whitelist with master key... Error: DNS resolve failed for address 'cvmfs_test_s3_egi_eu.ceph-gw1.gridpp.rl.ac.uk'.
cvmfs_swissknife: /home/sftnight/jenkins/workspace/CvmfsFullBuildDocker/CVMFS_BUILD_ARCH/docker-x86_64/CVMFS_BUILD_PLATFORM/cc7/build/BUILD/cvmfs-2.7.4/cvmfs/s3fanout.cc:708: int s3fanout::S3FanoutManager::InitializeDnsSettings(CURL*, std::string) const: Assertion `dnse != __null' failed.
/usr/bin/cvmfs_server: line 197: 31992 Aborted                 (core dumped) $(__swissknife_cmd) $@


[root@stratum0 ~]# cat s3.conf
CVMFS_S3_ACCESS_KEY=***
CVMFS_S3_SECRET_KEY=***
CVMFS_S3_HOST=ceph-gw1.gridpp.rl.ac.uk
CVMFS_S3_BUCKET=cvmfs_test_s3_egi_eu
CVMFS_S3_DNS_BUCKETS=false

[root@stratum0 ~]# cvmfs_server mkfs -s s3.conf -w http://cvmfs_test_s3_egi_eu.ceph-gw1.gridpp.ac.uk -g -m -o cvmfs test_s3.egi.eu
Warning: CernVM-FS filesystems using overlayfs may not enforce hard link semantics during publishing.
Creating Configuration Files... done
Creating CernVM-FS Master Key and Self-Signed Certificate... done
Creating CernVM-FS Server Infrastructure... done
Signing 30 day whitelist with master key... Upload job for 'test_s3.egi.eu/.cvmfswhitelist' failed. (error code: 5 - S3: host connection problem)
failed to upload /var/spool/cvmfs/test_s3.egi.eu/tmp/whitelist.test_s3.egi.eu

The host ceph-gw1.gridpp.ac.uk is reachable. I can get an IP with nslookup, and ping works.
cvmfs_test_s3_egi_eu.ceph-gw1.gridpp.rl.ac.uk is not found by nslookup, as expected, but responds to ping.
Nothing in the logs on the gateway.

Same behavior with both http and https.

Anything else you would suggest I try?

Cheers,
Jose

Hi Jose,

The /etc/cvmfs/default.conf file is only picked up by the regular client. The client that is used as a read-only layer for publishing has its own configuration in /etc/cvmfs/repositories.d/$reponame/client.conf. As for CVMFS_SERVER_DEBUG, this is a configuration option for the command-line interface; you use it directly on the command line, like this:

CVMFS_SERVER_DEBUG=3 cvmfs_server mkfs ...

I’m not quite sure I understand your last point: if cvmfs_test_s3_egi_eu.ceph-gw1.gridpp.rl.ac.uk is not found by nslookup, how can ping work? I would expect that the name cvmfs_test_s3_egi_eu.ceph-gw1.gridpp.rl.ac.uk resolves to ceph-gw1.gridpp.rl.ac.uk.
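One way to disentangle this is to query the system resolver directly instead of relying on ping, which may be answered via a search domain or a wildcard DNS record; a sketch, assuming getent is available (as on most Linux hosts):

```shell
#!/bin/sh
# Check whether the DNS-style bucket name actually resolves (name from this
# thread). getent consults the system resolver, including search domains.
name=cvmfs_test_s3_egi_eu.ceph-gw1.gridpp.rl.ac.uk
if getent hosts "$name" >/dev/null 2>&1; then
  resolution=resolves
else
  resolution=does_not_resolve
fi
echo "$name: $resolution"
```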

Cheers,
Jakob

We do now have /etc/cvmfs/server.local, which is included by cvmfs_server. So if someone wanted to enable CVMFS_SERVER_DEBUG=3 by default, they could set it there.
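For example (a config sketch; the file path and variable name are the ones mentioned above):

```shell
# /etc/cvmfs/server.local -- included by every cvmfs_server invocation
CVMFS_SERVER_DEBUG=3
```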

Dave

OK. I have just tried it that way, prepending CVMFS_SERVER_DEBUG to the command line, but I am still not getting more verbose output:

[root@stratum0 ~]# cat s3.conf 
CVMFS_S3_ACCESS_KEY=***
CVMFS_S3_SECRET_KEY=***
CVMFS_S3_HOST=s3.echo.stfc.ac.uk
CVMFS_S3_BUCKET=cvmfs_test_s3_egi_eu
CVMFS_S3_DNS_BUCKETS=true

[root@stratum0 ~]# CVMFS_SERVER_DEBUG=3 cvmfs_server mkfs -s s3.conf -w http://cvmfs_test_s3_egi_eu.s3.echo.stfc.ac.uk -g -m -o cvmfs test_s3.egi.eu
Warning: CernVM-FS filesystems using overlayfs may not enforce hard link semantics during publishing.
Creating Configuration Files... done
Creating CernVM-FS Master Key and Self-Signed Certificate... done
Creating CernVM-FS Server Infrastructure... done
Signing 30 day whitelist with master key... Upload job for 'test_s3.egi.eu/.cvmfswhitelist' failed. (error code: 5 - S3: host connection problem)
failed to upload /var/spool/cvmfs/test_s3.egi.eu/tmp/whitelist.test_s3.egi.eu

[root@stratum0 ~]# cat s3.conf 
CVMFS_S3_ACCESS_KEY=***
CVMFS_S3_SECRET_KEY=***
CVMFS_S3_HOST=s3.echo.stfc.ac.uk
CVMFS_S3_BUCKET=cvmfs_test_s3_egi_eu
CVMFS_S3_DNS_BUCKETS=false

[root@stratum0 ~]# CVMFS_SERVER_DEBUG=3 cvmfs_server mkfs -s s3.conf -w http://s3.echo.stfc.ac.uk/cvmfs_test_s3_egi_eu -g -m -o cvmfs test_s3.egi.eu
Warning: CernVM-FS filesystems using overlayfs may not enforce hard link semantics during publishing.
Creating Configuration Files... done
Creating CernVM-FS Master Key and Self-Signed Certificate... done
Creating CernVM-FS Server Infrastructure... done
Signing 30 day whitelist with master key... Upload job for 'test_s3.egi.eu/.cvmfswhitelist' failed. (error code: 5 - S3: host connection problem)
failed to upload /var/spool/cvmfs/test_s3.egi.eu/tmp/whitelist.test_s3.egi.eu

Could it be a problem with the protocol? s3cmd can copy a file from the Stratum-0 host to the S3 one, but there is no ssh (and therefore no scp) connectivity…

Could it be that your S3 endpoint only accepts HTTPS connections? Can you check whether s3cmd works with HTTP only?
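A quick way to check is to probe the endpoint with curl (a sketch; hostname taken from this thread — any HTTP response at all, even an error status like 403, means the plain-HTTP connection itself works):

```shell
#!/bin/sh
# Probe whether the S3 endpoint accepts plain HTTP (hostname from this thread).
# No -f flag: an HTTP error status still proves the connection works; only a
# connect failure or timeout takes the else branch.
if curl -sS -o /dev/null --max-time 10 http://s3.echo.stfc.ac.uk/ 2>/dev/null; then
  http_status=reachable
else
  http_status=unreachable
fi
echo "plain HTTP: $http_status"
```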

Hi Jakob,

indeed, that seems to be the case. With the --no-ssl option, s3cmd fails.
And I gather from your first post that HTTPS is not supported yet, right?

Thanks a lot,
Jose

Unfortunately that’s still true for the moment. But we might get to full S3/HTTPS support relatively soon; a PR is already open.

As a follow-up on this: the PR meanwhile got merged and we have nightly builds that should work with HTTPS S3 endpoints.

Thanks !!
I will give it a try as soon as I can.

Hmm

Am I missing something? I am trying on a testbed from scratch, so I wonder if I forgot some step building it:

[root@stratum0 ~]# rpm -qa | grep cvmfs
cvmfs-2.9.0-0.2942.8ef7f289787a4c00git.el7.x86_64
cvmfs-config-none-1.0-2.noarch
cvmfs-server-2.9.0-0.2942.8ef7f289787a4c00git.el7.x86_64

[root@stratum0 ~]# cat s3.conf
CVMFS_S3_ACCESS_KEY=1KW*****
CVMFS_S3_SECRET_KEY=6Pq*****
CVMFS_S3_HOST=s3.echo.stfc.ac.uk
CVMFS_S3_BUCKET=cvmfs_test_s3_egi_eu
CVMFS_S3_USE_HTTPS=true
CVMFS_S3_DNS_BUCKETS=false

[root@stratum0 ~]# CVMFS_SERVER_DEBUG=3 cvmfs_server mkfs -s s3.conf -w https://s3.echo.stfc.ac.uk/cvmfs_test_s3_egi_eu -g -m -o cvmfs test_s3.egi.eu
Creating Configuration Files... done
Creating CernVM-FS Master Key and Self-Signed Certificate... done
Creating CernVM-FS Server Infrastructure... done
Signing 30 day whitelist with master key... unexpected curl error (60) while trying to upload test_s3.egi.eu/.cvmfswhitelist: SSL certificate problem: self signed certificate in certificate chain
Upload job for 'test_s3.egi.eu/.cvmfswhitelist' failed. (error code: 9 - no text)
failed to upload /var/spool/cvmfs/test_s3.egi.eu/tmp/whitelist.test_s3.egi.eu

Is the problem on the Stratum-0 host itself, or with the communication with the remote S3 server?

Thanks,
Jose

That looks like a problem with the remote site. The S3 server needs to use a proper certificate that is ultimately signed by a CA recognized by the client. You can check by fetching the file at hand (.cvmfswhitelist) with curl from the command line; that should fail as well.

Actually, the problem may be in the testbed. It is missing a host certificate. That’s probably it.

Well, the problem is not the lack of a host certificate. I have installed one on the testbed and am still getting the same error.

Just in case, I decided to try on the production Stratum-0, as that host is supposed to be fully set.
I have unpacked the new RPM into a temporary directory and set the PATH and LD_LIBRARY_PATH environment variables to point to that temp dir (and changed the hardcoded paths in cvmfs_server accordingly). I am getting the same error as well.

Our storage sysadmin thinks everything is fine on the S3 side, and that the line

unexpected curl error (60) while trying to upload test_s3.egi.eu/.cvmfswhitelist: SSL certificate problem: self signed certificate in certificate chain

suggests that CVMFS is using a self-signed certificate on the fly, and that is causing the failures. Is that possible?

Thanks,
Jose

Is there an example of a curl command I could try from the command line to check whether the problem is on the host and not in the CVMFS code?

You can try directly with the S3 root folder (it doesn’t matter if you get an HTTP error; we just want to check whether the SSL connection works), like this:

curl -I https://s3.echo.stfc.ac.uk/

This actually works from my machine. Hm, cvmfs is supposed to check the standard system paths for the root CAs. Perhaps I can reproduce it from here.
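To see which chain the endpoint actually presents (and whether a self-signed certificate appears in it), one can inspect it with openssl; a sketch, assuming openssl is installed and the host is reachable:

```shell
#!/bin/sh
# Dump subject/issuer pairs of the certificate chain presented by the S3
# endpoint (hostname from this thread). A self-signed certificate shows up
# as an entry whose issuer matches its own subject.
chain=$(openssl s_client -connect s3.echo.stfc.ac.uk:443 -showcerts </dev/null 2>/dev/null \
        | grep -E 'subject=|issuer=| [si]:' || true)
if [ -n "$chain" ]; then
  chain_check=retrieved
  printf '%s\n' "$chain"
else
  chain_check=no_connection
  echo "could not retrieve the certificate chain"
fi
```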