Stratum 0 - Setup

Hello,
I ran into a filesystem corruption (“Job for var-spool-cvmfs-...-rdonly.mount failed”).
I deleted the filesystem to restart from scratch.
I updated cvmfs and tried multiple times to recreate the repository and sync our data, without success.
It now fails every time it reaches a certain amount of data. We want to share IMAS data (~2 TB in 17M small files).
If I do a full rsync, cvmfs crashes at some point. I’m now doing the rsync by subfolders (transaction/rsync/publish), but at some point it also crashes during a publish.

Here is the setup:

CentOS Linux release 8.4.2105
cvmfs-libs-2.11.2-1.el8.x86_64
cvmfs-config-default-2.1-1.noarch
cvmfs-2.11.2-1.el8.x86_64
cvmfs-server-2.11.2-1.el8.x86_64
NAME                        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                           8:0    0 93.2G  0 disk
├─sda1                        8:1    0    1G  0 part /boot
└─sda2                        8:2    0 92.2G  0 part
  ├─cl_io--ls--cvmfs01-root 253:0    0 55.7G  0 lvm  /
  └─cl_io--ls--cvmfs01-swap 253:1    0  9.3G  0 lvm  [SWAP]
sdb                           8:16   0  3.7T  0 disk
├─sdb1                        8:17   0  700G  0 part /var/spool/cvmfs
└─sdb2                        8:18   0    3T  0 part /cvmfs
sdc                           8:32   0  3.7T  0 disk
└─sdc1                        8:33   0  3.7T  0 part /srv/cvmfs

Default configuration as generated by cvmfs_server mkfs imas.iter.org:

/etc/cvmfs/repositories.d/imas.iter.org/server.conf
CVMFS_CREATOR_VERSION=143
CVMFS_REPOSITORY_NAME=imas.iter.org
CVMFS_REPOSITORY_TYPE=stratum0
CVMFS_USER=root
CVMFS_UNION_DIR=/cvmfs/imas.iter.org
CVMFS_SPOOL_DIR=/var/spool/cvmfs/imas.iter.org
CVMFS_STRATUM0=http://localhost/cvmfs/imas.iter.org
CVMFS_UPSTREAM_STORAGE=local,/srv/cvmfs/imas.iter.org/data/txn,/srv/cvmfs/imas.iter.org
CVMFS_USE_FILE_CHUNKING=true
CVMFS_MIN_CHUNK_SIZE=4194304
CVMFS_AVG_CHUNK_SIZE=8388608
CVMFS_MAX_CHUNK_SIZE=16777216
CVMFS_UNION_FS_TYPE=overlayfs
CVMFS_HASH_ALGORITHM=sha1
CVMFS_COMPRESSION_ALGORITHM=default
CVMFS_EXTERNAL_DATA=false
CVMFS_AUTO_TAG=true
CVMFS_AUTO_TAG_TIMESPAN=""
CVMFS_GARBAGE_COLLECTION=false
CVMFS_AUTO_REPAIR_MOUNTPOINT=true
CVMFS_AUTOCATALOGS=true
CVMFS_AUTOCATALOGS_MAX_WEIGHT=500000
CVMFS_ASYNC_SCRATCH_CLEANUP=true
CVMFS_PRINT_STATISTICS=false
CVMFS_UPLOAD_STATS_DB=false
CVMFS_UPLOAD_STATS_PLOTS=false
CVMFS_IGNORE_XDIR_HARDLINKS=true

Before restarting again from scratch, is there any obvious problem? I’ve tried different partition configurations, and also tried with CVMFS_AUTOCATALOGS true & false…

I am contacting you because data synchronization takes hours before each crash.
Thank you in advance for your help.
Regards

Just FYI, here are some crash outputs.

Hi Fred, before that it may be worthwhile to rerun the failing publication with the environment variable

CVMFS_SERVER_DEBUG=3

That may yield some more information.
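
A minimal sketch of how such a run could be captured, assuming the variable is read from the environment of the cvmfs_server call. The helper name run_debug and the log file path are only illustrative, not part of cvmfs:

```shell
# Run a command with CVMFS_SERVER_DEBUG=3 set for that command only,
# showing its output and keeping a copy in a log file.
# Usage: run_debug LOGFILE COMMAND [ARGS...]
# Note: the return status is the one of tee, not of the command.
run_debug() {
  debug_log="$1"; shift
  env CVMFS_SERVER_DEBUG=3 "$@" 2>&1 | tee "$debug_log"
}

# Example (not executed here):
#   run_debug /root/publish-debug.log cvmfs_server publish imas.iter.org
```

Keeping the log in a file makes it easier to attach the output of a publish that only fails after hours.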

/srv/cvmfs is on a local hard disk, right? We’ve recently seen issues when network file systems were used for that.
Best,
Valentin

Thank you for your quick reply!
Yes, all are local disks. Data is copied from an NFS share.

I just ran “cvmfs_server rmfs imas.iter.org”, removed all files and rebooted, so I should now have a clean server:

NAME                        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                           8:0    0 93.2G  0 disk
├─sda1                        8:1    0    1G  0 part /boot
└─sda2                        8:2    0 92.2G  0 part
  ├─cl_io--ls--cvmfs01-root 253:0    0 55.7G  0 lvm  /
  └─cl_io--ls--cvmfs01-swap 253:1    0  9.3G  0 lvm  [SWAP]
sdb                           8:16   0  3.7T  0 disk
├─sdb1                        8:17   0  700G  0 part /var/spool/cvmfs
└─sdb2                        8:18   0    3T  0 part /cvmfs
sdc                           8:32   0  3.7T  0 disk
└─sdc1                        8:33   0  3.7T  0 part /srv/cvmfs

df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       700G  5.0G  695G   1% /var/spool/cvmfs
/dev/sdb2       3.0T   22G  3.0T   1% /cvmfs
/dev/sdc1       3.7T   26G  3.7T   1% /srv/cvmfs

Usually, to recreate the FS, I just do this before starting to add files:

cvmfs_server mkfs imas.iter.org
/bin/cp -rp keys/* /etc/cvmfs/keys/
/usr/bin/cvmfs_server resign -p imas.iter.org
/usr/bin/cvmfs_server resign imas.iter.org
cvmfs_server transaction imas.iter.org
rm -rf /cvmfs/imas.iter.org/new_repository
mkdir -p /cvmfs/imas.iter.org/shared/imasdb
touch /cvmfs/imas.iter.org/shared/imasdb/test.txt
cvmfs_server publish imas.iter.org

cvmfs_server transaction imas.iter.org
rsync /mnt/nfs/* /cvmfs/imas.iter.org/
cvmfs_server publish imas.iter.org

Should I continue like that, or am I missing something?

Hi Fred

If I see correctly, you are trying to copy all the data from NFS (the 2 TB?) and publish it in a single transaction?
If that is the case, please try making multiple smaller transactions and see if that works.

Your /var/spool/cvmfs is only 700 GB. During publication it is used as a temporary staging area, where the data sits uncompressed. So a single publication of 2 TB will not work.
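
As a rough sketch of what “multiple smaller transactions” could look like: one transaction per top-level subdirectory of the source. The function name publish_in_batches is made up for illustration, the paths are taken from the posts above, and the -f flag of abort is assumed to skip the confirmation prompt. Files lying directly in the source directory are not handled here, and each subdirectory still has to fit into the spool area on its own.

```shell
# Publish SRC into REPO, one top-level subdirectory per transaction.
# Usage: publish_in_batches /mnt/nfs imas.iter.org
publish_in_batches() {
  src="$1"; repo="$2"
  for dir in "$src"/*/; do
    [ -d "$dir" ] || continue              # glob matched no directory
    name=$(basename "$dir")
    cvmfs_server transaction "$repo" || return 1
    # -a recurses and preserves permissions and timestamps
    if rsync -a "$dir" "/cvmfs/$repo/$name/"; then
      cvmfs_server publish "$repo" || return 1
    else
      # drop the half-copied batch instead of publishing it
      cvmfs_server abort -f "$repo"
      return 1
    fi
  done
}
```

Stopping at the first failure keeps the repository at the last successfully published batch, so a later run only has to redo the remaining subdirectories.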

Let me know if that works!

Cheers
Laura

Hello,
I tried to split the rsync by subfolders (level 5 below the root).
I can try to split more for sure.

But before restarting that, is there any tuning to do or do you have any advice?

Regards.

I recommend checking the size of each subfolder with du -hs to make sure it’s well below 700GB. You can also watch the space remaining in /var/spool/cvmfs as you do the rsyncs.
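
A small sketch of that check, using only du and df. The helper names over_budget and free_kb are hypothetical, and halving the free space in the usage line is an arbitrary safety margin, not a cvmfs rule:

```shell
# Print every directory whose size exceeds LIMIT_KB.
# Usage: over_budget LIMIT_KB DIR...
over_budget() {
  limit_kb="$1"; shift
  for dir in "$@"; do
    size_kb=$(du -sk "$dir" | awk '{print $1}')
    if [ "$size_kb" -gt "$limit_kb" ]; then
      echo "$dir ${size_kb}K"
    fi
  done
  return 0
}

# Free space in KB on the filesystem that holds the given path.
free_kb() {
  df -Pk "$1" | awk 'NR == 2 {print $4}'
}

# Example (not executed here): list subfolders larger than half of the
# space currently free in the spool area.
#   over_budget $(( $(free_kb /var/spool/cvmfs) / 2 )) /mnt/nfs/*/
```

Any subfolder listed by such a check would be a candidate for splitting further before it goes into a transaction.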

Also, I question the need for a /cvmfs partition of 3T. The only thing in /cvmfs should be mountpoints so that could stay in the root filesystem.

You could instead put more space in /var/spool/cvmfs but it’s probably not a good idea to do such large transactions anyway, so 700G should be fine.

Okay, that will help me a lot!
I hadn’t understood it like that :sweat_smile:

I’ll make the modification and start again. I’ll get back to you ASAP.

(update done to v2.11.3-1)

Hello,

Started from scratch then:

[root@io-ls-cvmfs01:~]# cvmfs_server publish imas.iter.org
imas.iter.org is in a transaction but /cvmfs/imas.iter.org is not mounted read/write
Repository imas.iter.org is in a transaction and cannot be repaired.
--> Run `cvmfs_server abort imas.iter.org` to revert and repair.
[root@io-ls-cvmfs01:~]# cvmfs_server transaction imas.iter.org
imas.iter.org is in a transaction but /cvmfs/imas.iter.org is not mounted read/write
Repository imas.iter.org is in a transaction and cannot be repaired.
--> Run `cvmfs_server abort $name` to revert and repair.
(unexpected termination) cannot establish writable mountpoint

Stacktrace:
/lib64/libcvmfs_server.so.2.11.3(+0x79c96) [0x7f95ae48ac96]
/lib64/libcvmfs_server.so.2.11.3(+0x80b72) [0x7f95ae491b72]
/lib64/libcvmfs_server.so.2.11.3(_ZN7publish9Publisher16TransactionRetryEv+0x58) [0x7f95ae49a47c]
/lib64/libcvmfs_server.so.2.11.3(_ZN7publish9Publisher11TransactionEv+0xe) [0x7f95ae48b214]
/usr/bin/cvmfs_publish() [0x4396c9]
/usr/bin/cvmfs_publish() [0x40f785]
/lib64/libc.so.6(__libc_start_main+0xf3) [0x7f95abfeb493]
/usr/bin/cvmfs_publish() [0x40f98e]



/dev/sdb1                                                  3.7T  338G  3.4T  10% /var/spool/cvmfs
/dev/sdc1                                                  3.7T  273G  3.4T   8% /srv/cvmfs
imas.iter.org                                              4.0G  3.8G  130M  97% /var/spool/cvmfs/imas.iter.org/rdonly
overlay_imas.iter.org                                      3.7T  338G  3.4T  10% /cvmfs/imas.iter.org
NAME                        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                           8:0    0 93.2G  0 disk
├─sda1                        8:1    0    1G  0 part /boot
└─sda2                        8:2    0 92.2G  0 part
  ├─cl_io--ls--cvmfs01-root 253:0    0 55.7G  0 lvm  /
  └─cl_io--ls--cvmfs01-swap 253:1    0  9.3G  0 lvm  [SWAP]
sdb                           8:16   0  3.7T  0 disk
└─sdb1                        8:17   0  3.7T  0 part /var/spool/cvmfs
sdc                           8:32   0  3.7T  0 disk
└─sdc1                        8:33   0  3.7T  0 part /srv/cvmfs

:smiling_face_with_tear:

Hi Fred,

Can you abort the transaction? It was not clear whether you tried that. Please also retry with the environment variable CVMFS_SERVER_DEBUG=3 set.
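
Since the error above complains that /cvmfs/imas.iter.org is not mounted read/write, a quick look at the mount table can confirm that state before aborting. This is only a sketch: the helper mount_mode is hypothetical, and it assumes a Linux-style mount table such as /proc/mounts, where the fourth field holds the mount options.

```shell
# Print "rw", "ro" or "unmounted" for a mount point.
# $1: mount point
# $2: mount table to read (optional, defaults to /proc/mounts)
mount_mode() {
  awk -v mp="$1" '
    $2 == mp {
      n = split($4, opts, ",")
      for (i = 1; i <= n; i++)
        if (opts[i] == "rw" || opts[i] == "ro") mode = opts[i]
    }
    END { print (mode == "" ? "unmounted" : mode) }
  ' "${2:-/proc/mounts}"
}

# Example (not executed here):
#   mount_mode /cvmfs/imas.iter.org
```

If this prints ro or unmounted while the repository is still in a transaction, it matches the error message, and cvmfs_server abort imas.iter.org is the repair that message suggests.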
Cheers,
Valentin

Hello Valentin,
No, I had left it as is in case it was needed for debugging.
I just tried, and the abort worked.

Is there anything to do before trying another transaction/sync/publish?
(I’ve set CVMFS_SERVER_DEBUG=3)
Thanks a lot,
Fred

Ok! No, I think you’ll just want to break up the transactions a bit, and start with small ones.