Problems updating client? Jobs failing after upgrade

Hello,

recently, we upgraded the cvmfs client package on our local batch system from version 2.7.2 to 2.9.2.

We run Scientific Linux 7.9: [1]
Installed cvmfs packages before the upgrade: [2]
Installed cvmfs packages after the upgrade: [3]

It turns out we were not using the egi-cvmfs RPM before; this upgrade was the first time it was used.

Right after the upgrade, many jobs started to fail. The time correlation is very suspicious.
Therefore, I have a couple of questions:

  1. After upgrading the cvmfs packages, was a host reboot needed?
  2. Or, at least, remounting the mount-points?
  3. The yum logs show a step where the install process attempted to remove the /cvmfs directory but failed; see [4]. Is that expected?
  4. Is it possible that the local cache somehow got corrupted by installing the new package while it was still in use? (See the quick checks sketched right after this list.)
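
In case it helps, these are the kind of checks I was thinking of running on one of the affected nodes. This is only a sketch, assuming cvmfs_config and cvmfs_fsck are in the PATH and that the cache lives under the default /var/lib/cvmfs (adjust the path to our CVMFS_CACHE_BASE):

# Access every configured repository and list the mounted ones
cvmfs_config probe
cvmfs_config status

# Verify the content of the shared local cache
cvmfs_fsck /var/lib/cvmfs/shared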

Any comment or suggestion is more than appreciated.
Thanks a lot in advance.
Cheers,
Jose

[1]

[root@lcg2400 ~]# uname -srvo
Linux 5.4.186-1.el7.elrepo.x86_64 #1 SMP Fri Mar 18 09:17:21 EDT 2022 GNU/Linux

[root@lcg2400 ~]# cat /etc/redhat-release
Scientific Linux release 7.9 (Nitrogen)

[2]

[root@lcg2400 ~]# rpm -qa | grep cvmfs
cvmfs-2.7.2-1.el7.x86_64
cvmfs-config-egi-2.0-1.el7.centos.noarch
cvmfs-x509-helper-2.2-2.29.obs.el7.x86_64

[3]

[root@lcg2400 ~]# rpm -qa | grep cvmfs
cvmfs-2.9.2-1.el7.x86_64
egi-cvmfs-4-2.18.obs.el7.noarch
cvmfs-config-egi-2.5-1.6.obs.el7.noarch
cvmfs-x509-helper-2.2-2.29.obs.el7.x86_64

[4]

yum history info 269
Loaded plugins: langpacks, pakiti2, post-transaction-actions, priorities, versionlock
Transaction ID : 269
Begin time     : Thu May 12 15:59:23 2022
Begin rpmdb    : 850:11a8390bee20df351ca8ebfc6b200dc3d9e40716
End time       :            16:01:17 2022 (114 seconds)
End rpmdb      : 850:7c4b0000352d46e3b41e19e554f14795eee3e1d9
User           : System <unset>
Return-Code    : Success
Command Line   : -c /etc/yum.conf -y distro-sync
Transaction performed with:
    Installed     rpm-4.11.3-48.el7_9.x86_64                    @sl-7x-x86_64-security
    Installed     yum-3.4.3-168.sl7.noarch                      @sl-7x-x86_64-os
    Installed     yum-metadata-parser-1.1.4-10.el7.x86_64       @anaconda/7.5
    Installed     yum-plugin-versionlock-1.1.31-54.el7_8.noarch @sl-7x-x86_64-os
Packages Altered:
    Updated cvmfs-2.7.2-1.el7.x86_64                 @cvmfs-el7
    Update        2.9.2-1.el7.x86_64                 @EMI-UMD4-EL7_updates_x86_64
    Updated cvmfs-config-egi-2.0-1.el7.centos.noarch @cvmfs
    Update                   2.5-1.6.obs.el7.noarch  @EMI-UMD4-EL7_updates_x86_64
Scriptlet output:
   1 Pausing grid.cern.ch on /cvmfs/grid.cern.ch
   2 Pausing singularity.opensciencegrid.org on /cvmfs/singularity.opensciencegrid.org
   3 Pausing oasis.opensciencegrid.org on /cvmfs/oasis.opensciencegrid.org
 ***ETC ETC ETC***
 357 atlas.cern.ch: Releasing saved inode generation info
 358 atlas.cern.ch: Releasing saved open files table
 359 atlas.cern.ch: Releasing open files counter
 360 atlas.cern.ch: Activating Fuse module
 361 warning: directory /cvmfs: remove failed: Device or resource busy
history info

At first glance I would guess that warning is nothing to worry about, because the /cvmfs directory would need to be immediately created again anyway if it did get removed. I recommend instead focusing on exactly why the jobs have failed.
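
For example, the client logs to syslog, so something along these lines on an affected worker node should show whether the client itself reported errors around the time the jobs failed (atlas.cern.ch is just an example; use whichever repositories the jobs actually access):

# CVMFS client messages end up in syslog
grep -i cvmfs /var/log/messages | tail -n 50

# Per-repository counters, including the number of I/O errors (NIOERR)
cvmfs_config stat -v atlas.cern.ch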

I agree with Dave; I don’t think the warning is related. Your procedure is correct: no reboot or remount is required, because the RPM upgrade uses cvmfs_config reload to live-patch the mountpoints. I think only the logs from the failed jobs can provide more clues.
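
If you want to double-check on a node that the hotpatch went through, something like the following should work (a sketch, assuming cvmfs_talk is installed alongside the client and atlas.cern.ch is one of your mounted repositories):

# Show the Fuse module reloads performed by cvmfs_config reload
cvmfs_talk -i atlas.cern.ch hotpatch history

# Confirm the version of the running client after the upgrade
cvmfs_talk -i atlas.cern.ch version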