Cvmfs_receiver crashes

Dear all

We have a CVMFS infrastructure using S3 as the storage backend.

We set up the gateway on the stratum0 node and a publisher on another instance.
Both the gw/stratum0 node and the publisher run CVMFS 2.10-1 on Ubuntu 20.04.
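
For context, the repository is wired to the gateway roughly as follows; this is reconstructed from our configuration, so treat it as a sketch rather than the exact commands we ran.

On the gateway/stratum0 node, /etc/cvmfs/gateway/repo.json lists the repository:

{
  "version": 2,
  "repos": ["dbgsgara.infn.it"]
}

On the publisher, the repository was created against the gateway, something like (the -o owner is from memory):

cvmfs_server mkfs \
  -w https://stor.cloud.infn.it/v1/AUTH_79322ee743c74382ad9de8b1d895616a/cvmfs/dbgsgara.infn.it \
  -u gw,/srv/cvmfs/dbgsgara.infn.it/data/txn,http://cvmfs.wp6.cloud.infn.it:4929/api/v1 \
  -k /etc/cvmfs/keys/dbgsgara.infn.it -o root dbgsgara.infn.it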

It happens quite often that the receiver dies.

Below is an example of a failed transaction as seen from the publisher node [*].
On the stratum0/gw node I see:

Apr 04 22:12:25 cvmfs-s0-datacloud cvmfs_receiver[2604965]: failed to read-ahead /var/spool/cvmfs/dbgsgara.infn.it/tmp/receiver/commit_processor.RwXexo/catalog.s9MkNr (22)
Apr 04 22:12:25 cvmfs-s0-datacloud cvmfs_receiver[2604965]: failed to read-ahead /var/spool/cvmfs/dbgsgara.infn.it/tmp/receiver/commit_processor.Vu707p/catalog.CfzVzp (22)
Apr 04 22:12:25 cvmfs-s0-datacloud cvmfs_receiver[2604965]: failed to read-ahead /var/spool/cvmfs/dbgsgara.infn.it/tmp/receiver/commit_processor.ji04ip/catalog.eGzZzq (22)
Apr 04 22:12:27 cvmfs-s0-datacloud cvmfs_gateway[2604603]: {"level":"info","component":"actions","req_id":"e3849102-d85a-477c-b093-a79b8feda2d1","req_dt":2557.963993,"action":"commit_lease","outcom>
Apr 04 22:12:27 cvmfs-s0-datacloud cvmfs_gateway[2604603]: {"level":"info","component":"http","req_id":"e3849102-d85a-477c-b093-a79b8feda2d1","req_dt":2559.490295,"message":"request processed"}
Apr 04 22:12:27 cvmfs-s0-datacloud cvmfs_gateway[2604603]: {"level":"error","component":"worker_pool","req_id":"e3849102-d85a-477c-b093-a79b8feda2d1","req_dt":2557.968525,"worker_id":1,"message":"e>
Apr 04 22:12:27 cvmfs-s0-datacloud cvmfs_receiver[2604969]: --
Signal: 6, errno: 2, version: 2.10.1, PID: 2604965
Executable path: /usr/bin/cvmfs_receiver

Thread 22 (Thread 0x7fb5647f8700 (LWP 2604992)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x55e47ce74740) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x55e47ce746f0, cond=0x55e47ce74718) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x55e47ce74718, mutex=mutex@entry=0x55e47ce746f0) at pthread_cond_wait.c:647


and a /var/log/cvmfs_receiver/stacktrace file is produced.

Any hints on how to debug this issue?
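
If it would help, I can also try to capture a full core dump of the receiver on the next crash; I assume something along these lines would do it (the systemd unit name is my guess):

# allow core dumps for the gateway and the receiver processes it spawns
systemctl edit cvmfs-gateway      # add: [Service]  LimitCORE=infinity
systemctl restart cvmfs-gateway

# after the next crash, with systemd-coredump installed:
coredumpctl list cvmfs_receiver
coredumpctl gdb cvmfs_receiver    # then: thread apply all bt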

Thanks. Massimo

[*]

root@vnode-0:~# cvmfs_server transaction dbgsgara.infn.it
Gateway reply: ok
root@vnode-0:~# rm /cvmfs/dbgsgara.infn.it/*
root@vnode-0:~# cp /etc/t* /cvmfs/dbgsgara.infn.it
cp: -r not specified; omitting directory ‘/etc/terminfo’
cp: -r not specified; omitting directory ‘/etc/tmpfiles.d’
root@vnode-0:~# cvmfs_server publish dbgsgara.infn.it
Using auto tag 'generic-2023-04-04T20:12:24Z'
Processing changes...
Waiting for upload of files before committing...
Committing file catalogs...
Note: Catalog at / gets defragmented (42.86% free pages)... done
Wait for all uploads to finish
Exporting repository manifest
Lease end request - error reply: {"reason":"worker 'commit' call failed: possible that the receiver crashed: EOF","status":"error"}
SessionContext: could not commit session. Aborting.
[ERROR] Failed to commit transaction.
Statistics stored at: /var/spool/cvmfs/dbgsgara.infn.it/stats.db
Synchronization failed

Executed Command:
cvmfs_swissknife sync -u /cvmfs/dbgsgara.infn.it -s /var/spool/cvmfs/dbgsgara.infn.it/scratch/current -c /var/spool/cvmfs/dbgsgara.infn.it/rdonly -t /var/spool/cvmfs/dbgsgara.infn.it/tmp -b ae5183c40a53d06a6758b922f1cd94a36535d02f -r gw,/srv/cvmfs/dbgsgara.infn.it/data/txn,http://cvmfs.wp6.cloud.infn.it:4929/api/v1 -w https://stor.cloud.infn.it/v1/AUTH_79322ee743c74382ad9de8b1d895616a/cvmfs/dbgsgara.infn.it -o /var/spool/cvmfs/dbgsgara.infn.it/tmp/manifest -e sha1 -Z default -C /etc/cvmfs/repositories.d/dbgsgara.infn.it/trusted_certs -N dbgsgara.infn.it -K /etc/cvmfs/keys/dbgsgara.infn.it.pub -L -D generic-2023-04-04T20:12:24Z -H /etc/cvmfs/keys/dbgsgara.infn.it.gw -P /var/spool/cvmfs/dbgsgara.infn.it/session_token -f overlayfs -p -l 4194304 -a 8388608 -h 16777216 -i

The stacktrace file is available at CERNBox

Hi Massimo,

We'll take a look here as well; thanks for the stacktrace! Was this the result of ingesting a tarball on the publisher?
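
(By ingesting a tarball I mean a publish done via "cvmfs_server ingest" rather than a normal transaction/publish, i.e. something like

cvmfs_server ingest --tar_file files.tar.gz --base_dir some/dir/ dbgsgara.infn.it

where the tarball name and target directory are just placeholders.)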
Cheers,
Valentin

Hello Valentin
No: these were “normal” files
Thanks, Massimo