Input/output error on single repo file

One of our published repos has an interesting problem: at least one file throws client I/O errors while the rest of the repo loads and reads properly. The file can be listed but not read:
Input/output error
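
For illustration (the path here is a made-up example), listing and stat succeed while reading fails:

ls -l /cvmfs/example.repo.org/some/dir/badfile   # listing works
cat /cvmfs/example.repo.org/some/dir/badfile     # reading fails
cat: /cvmfs/example.repo.org/some/dir/badfile: Input/output error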

I’ve tried many things on the client side, as well as deleting and recreating the same file on the server side. Even after wiping the client cache and restarting autofs, the result does not change, although the autofs status hints at an inode problem:
failed to fetch {file} (hash: 048b3874a5227e99a7f1fe05b7e98e8587738614, error 9 [host returned HTTP error])
failed to open inode: 142631, CAS key 048b3874a5227e99a7f1fe05b7e98e8587738614, error code 2
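
For the record, the cache wipe and remount were done with the standard tooling, along these lines (run as root; details may vary per setup):

cvmfs_config wipecache       # drop the local client cache
systemctl restart autofs     # remount the cvmfs repositories
cvmfs_config probe           # check that all configured repositories mount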

But the inode count, at least, is not a problem:
df -i | grep cvmfs
/dev/mapper/sysvg-cvmfs 8192000 291 8191709 1% /var/cvmfs

I have created a bug report, but I don’t see a way to attach files here, and since this seems to be localized to one or a few individual files, I’m not sure it will be of much help.

Apparently the file cannot be loaded from the server. You can try to load the chunk with curl:

curl http://$CVMFS_STRATUM_HOST/cvmfs/$REPOSITORY_NAME/data/04/8b3874a5227e99a7f1fe05b7e98e8587738614

Perhaps it is not there, i.e. for some reason it didn’t get published. If it is something else, the curl error message should help figure out the cause of the problem.

There is no curl error: curl returns an empty result, and the headers show HTTP/1.1 200 OK. The file shows up and stats fine in directory listings, but can’t be read.
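
To check whether an intermediate cache is serving the empty body, the same fetch can be repeated with the proxy chain bypassed (curl’s --noproxy overrides any proxy environment settings), comparing the Content-Length header in both cases:

curl -sI http://$CVMFS_STRATUM_HOST/cvmfs/$REPOSITORY_NAME/data/04/8b3874a5227e99a7f1fe05b7e98e8587738614
curl -sI --noproxy '*' http://$CVMFS_STRATUM_HOST/cvmfs/$REPOSITORY_NAME/data/04/8b3874a5227e99a7f1fe05b7e98e8587738614

A non-empty direct fetch combined with an empty proxied one would point at the proxy/cache layer.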

If curl returns an empty result, then I think the problem is that this is a zero-byte (i.e. corrupted) file on the server.
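
If you have access to the stratum 0, you can also check the stored object directly on disk; assuming the default local storage backend under /srv/cvmfs:

ls -l /srv/cvmfs/$REPOSITORY_NAME/data/04/8b3874a5227e99a7f1fe05b7e98e8587738614

A zero-byte object there would confirm the corruption.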

I assumed this as well, but I can cat the file both from the source and from a client on the server, just not from remote clients.

The only way I could think of to quickly mitigate this was to delete and recreate the problematic file between publications. I could not find any other files with the same I/O error. I would attribute this to a possible bug in the older client version still running on our WNs (2.7, while the service infrastructure runs 2.10), except for the valid-but-empty curl fetch results through the proxies and caches, so I am still unsure of the cause.
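
For reference, that mitigation follows the usual cvmfs_server transaction/publish workflow; roughly (repository name and paths are placeholders):

cvmfs_server transaction $REPOSITORY_NAME
rm /cvmfs/$REPOSITORY_NAME/some/dir/badfile
cvmfs_server publish $REPOSITORY_NAME
cvmfs_server transaction $REPOSITORY_NAME
cp /path/to/source/badfile /cvmfs/$REPOSITORY_NAME/some/dir/badfile
cvmfs_server publish $REPOSITORY_NAME

Note that CVMFS deduplicates by content hash, so recreating the file with identical content points back at the same CAS object; mainly the catalog entry is refreshed.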

It may have been a corruption in an intermediate (possibly transparent) HTTP cache.
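
If that is the case, one thing to try is requesting the object with a no-cache header, which asks the caches along the path to refetch from the origin (provided they are configured to honor it):

curl -H "Pragma: no-cache" http://$CVMFS_STRATUM_HOST/cvmfs/$REPOSITORY_NAME/data/04/8b3874a5227e99a7f1fe05b7e98e8587738614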