#762835 error exit on dafileserver (segfault)

Package:
openafs-fileserver
Source:
openafs
Description:
AFS distributed filesystem file server
Submitter:
Thomas Otto
Date:
2022-06-20 17:45:05 UTC
Severity:
serious
#762835#5
Date:
2014-09-25 14:25:21 UTC
Hello,

I am using the current (fully updated) Debian stable.

Since last week our dafileserver has been having problems and has exited
quite often.

As a result, some clients hang afterwards.
I hope the core file helps.


$ bos status afs03 -long
Instance dafs, (type is dafs) has core file, currently running normally.
    Auxiliary status is: file server running.
    Process last started at Thu Sep 25 11:10:01 2014 (7 proc starts)
    Last exit at Thu Sep 25 11:10:01 2014
    Last error exit at Thu Sep 25 10:17:11 2014, by file, due to signal 11
    Command 1 is '/usr/lib/openafs/dafileserver -syslog'
    Command 2 is '/usr/lib/openafs/davolserver -syslog'
    Command 3 is '/usr/lib/openafs/salvageserver'
    Command 4 is '/usr/lib/openafs/dasalvager'


Sep 25 10:12:43 afs03 fileserver[2494]: nUsers == 0, but header not on LRU
Sep 25 10:14:24 afs03 fileserver[2494]: FindClient: stillborn client 0000000003F59320(5c5433d4); conn 00007FF63802D110 (host 141.35.29.92:7001) had client 0000000003F59250(5c5433d4)
Sep 25 10:16:41 afs03 fileserver[2494]: CheckHost_r: Probing all interfaces of host 93.128.220.70:7001 failed, code -1
Sep 25 10:17:10 afs03 fileserver[2494]: Scheduling salvage for volume 536890884 on part /vicepa over SALVSYNC
Sep 25 10:17:11 afs03 fileserver[2494]: FSYNC_com:  invalid protocol version (3523477760)
Sep 25 10:17:11 afs03 kernel: [ 2459.331071] dafileserver[2869]: segfault at 9 ip 0000000000000009 sp 00007ff646302eb8 error 14 in dafileserver[400000+ca000]
Sep 25 10:17:11 afs03 davolserver[2495]: SYNC_ask: No response on circuit 'FSSYNC'
Sep 25 10:17:11 afs03 davolserver[2495]: SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server
Sep 25 10:17:11 afs03 bosserver[2472]: dafs:file exited on signal 11 (core dumped)
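
For reference, the kernel segfault line above can be decoded mechanically. The following is only a minimal sketch, assuming x86-64 Linux, where the "error" field is the page-fault error code printed in hexadecimal; decode_segfault and PF_BITS are hypothetical names used for illustration.

```python
import re

# x86 page-fault error code bits: (meaning if set, meaning if clear).
# Assumption: the kernel prints the "error" field in hex, so "14" is 0x14.
PF_BITS = {
    0: ("protection violation", "page not present"),
    1: ("write access", "not a write"),
    2: ("user mode", "kernel mode"),
    3: ("reserved bit set", None),
    4: ("instruction fetch", None),
}

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Parse a kernel 'segfault at ...' line; return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    ip = int(m.group("ip"), 16)
    err = int(m.group("err"), 16)
    flags = []
    for bit, (if_set, if_clear) in PF_BITS.items():
        name = if_set if err & (1 << bit) else if_clear
        if name:
            flags.append(name)
    return {
        "addr": addr,
        "ip": ip,
        "error": err,
        "flags": flags,
        # Fault address equal to ip on an instruction fetch: the CPU tried
        # to execute code at an unmapped address.
        "jumped_to_bad_address": addr == ip and bool(err & 0x10),
    }
```

Under these assumptions, the line above decodes to a user-mode instruction fetch from a non-present page, with the fault address equal to ip (0x9). That pattern would be consistent with a jump through a corrupted pointer, though it does not by itself identify the cause.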


best regards

Thomas Otto

#762835#10
Date:
2014-09-25 20:25:10 UTC
Unfortunately, the core file is not particularly helpful, as the stack
trace for the faulting thread is garbage.

It looks like OPENAFS-SA-2014-002 is fixed in wheezy-backports but not in
wheezy itself.  I have no particular reason to think that that use of
uninitialized memory is responsible for your crash, of course, but I can ask
whether you are willing to run the newer package from -backports.

That the issues started just a week or two ago is rather odd, as I don't
see any changelog entries in any relevant-seeming packages on my VM from
around that time.  Do you have apt logs that might indicate whether a
particular package update was correlated with the onset of the crashes?

Are you in a position to say anything about the usage patterns of your
AFS clients, in case it becomes necessary to try to reproduce the crash
locally?

Thanks,

Ben

#762835#15
Date:
2014-09-26 11:50:59 UTC
Hello,

On 25.09.2014 at 22:25, Benjamin Kaduk wrote:

I have now updated the packages from wheezy-backports.

The daemon has been running fine for the last 4 hours :)

I didn't see anything relevant, but I have attached the log.

Unfortunately I have no idea what causes the problem,
so I can't reproduce it.

I set this machine up as a new openafs-fileserver in order to migrate all
volumes from our other servers, which run on very old hardware. This server
therefore has SAN storage and a large number of volumes (most of them unused).

best regards

Thomas

#762835#24
Date:
2022-06-20 17:35:10 UTC
Doing some housekeeping; this is a very old bug report and is listed as
fixed in all versions of the package supported by Debian, but the actual
bug itself was not closed.

Closing the bug now to tidy up the database.