#597133 gvfsd-dav crashes, resulting in "Message did not receive a reply"

Package:
gvfs
Source:
gvfs
Description:
userspace virtual filesystem - GIO module
Submitter:
Date:
2015-05-19 07:54:08 UTC
Severity:
normal
#597133#5
Date:
2010-09-16 21:06:37 UTC
From:
To:
Hi,

I don't know if I analyzed this issue correctly, so please feel free to
reassign if appropriate.

I have just enabled sharing the ~/Public folder via gnome-user-share on one of
my computers and wanted to access the folder via nautilus from another computer.
The folder showed up as expected in the network view in nautilus, but when I
attempted to open it, an error message appeared within a split second and told
me that dbus had hit a timeout (or similar; I failed to copy it and have
just switched off the other computer, but I'll try to reproduce if necessary).
C'mon, this folder is on another computer which happens to be connected to the
same network, and I think dbus (or gvfs) should be a little bit more patient
when accessing this resource. I don't know which of the two is
responsible for setting timeout thresholds, but I suspect gvfs and thus file
this bug report against it. I believe it should wait at least half a second
before the operation is canceled with a cryptic error message.

 - Fabian

#597133#10
Date:
2010-09-17 07:29:42 UTC
From:
To:
I forgot to mention that I eventually succeeded in opening the shared
folder on the remote computer by rapidly hitting Enter on its icon
(every first key press attempts to open the remote folder and every
second key press closes the dbus error message). During one of
these attempts to open the folder, the response from the server was
apparently quick enough not to trigger the timeout error. Just to
prove it's not a configuration error on my side.

  - Fabian

#597133#15
Date:
2010-09-17 08:35:35 UTC
From:
To:
The error message window in question reads as follows:

  DBus error org.freedesktop.DBus.Error.NoReply: Message did not
  receive a reply (timeout by message bus)

and an error message like the following appears in dmesg output:

  [137392.806931] gvfsd-dav[27593]: segfault at 0 ip b7286d70 sp bfe13e80 error 4 in libc-2.11.2.so[b7214000+13e000]

Please note that access to the webdav directory works fine using
nautilus-connect-server, but not via nautilus network view.

  - Fabian

#597133#20
Date:
2010-09-17 09:41:05 UTC
From:
To:
On Friday, 17 September 2010 at 10:35 +0200, Fabian Greffrath wrote:

Well, this is clearly where the bug lies. Could you install gvfs-dbg and
try to obtain a backtrace?

Thanks,

#597133#25
Date:
2010-09-17 09:58:42 UTC
From:
To:
On 17.09.2010 at 11:41, Josselin Mouette wrote:

I'd love to, but I have no idea how, since the gvfsd-dav process is
just started from inside nautilus when I try to open the share.

  - Fabian

#597133#30
Date:
2010-09-17 10:31:37 UTC
From:
To:
On 17.09.2010 at 11:41, Josselin Mouette wrote:

I was able to reproduce this behaviour on the command line:

  $ gvfs-mount dav+sd://10.0.2.15:59965/
  Error mounting location: DBus error
  org.freedesktop.DBus.Error.NoReply: Message did not receive a reply
  (timeout by message bus)

but then the segfault in dmesg output does not appear. Running
gvfs-mount through gdb results in

  (gdb) run dav+sd://10.0.2.15:59965/
  Starting program: /usr/bin/gvfs-mount dav+sd://10.0.2.15:59965/
  [Thread debugging using libthread_db enabled]
  Error mounting location: DBus error
  org.freedesktop.DBus.Error.NoReply: Message did not receive a reply
  (timeout by message bus)

  Program exited normally.
  (gdb) bt
  No stack.

What now?

#597133#35
Date:
2010-09-17 11:40:08 UTC
From:
To:
On Friday, 17 September 2010 at 12:31 +0200, Fabian Greffrath wrote:

ISTR that the gvfsd-* are spawned by the gvfs daemon. In this case you
should be able to see what’s happening by using gdb on the gvfsd
process.

Cheers,

#597133#40
Date:
2010-09-17 11:45:27 UTC
From:
To:
On 17.09.2010 at 13:40, Josselin Mouette wrote:

No, sorry. If I attach gdb to `pidof gvfsd` and provoke the error in
nautilus, nothing happens in gdb. :(

#597133#45
Date:
2010-09-21 11:26:47 UTC
From:
To:
On 17.09.2010 at 13:40, Josselin Mouette wrote:

Is there anything else I can try? This issue really bites me, as
the same error message occasionally occurs when I want to export my
calendar to my WebDAV space via evolution.

  - Fabian

#597133#50
Date:
2010-10-14 15:11:33 UTC
From:
To:
Hi.

I have encountered this issue too. After several tries I managed to mount the
Public folder over the network.

However, everything works as intended on the other computer, which runs a fresh
Debian testing. There, a dialog pops up after some time and one can choose to
wait longer or abort.

All of the related packages are up to date on both computers.

Damian

#597133#55
Date:
2012-10-23 10:32:48 UTC
From:
To:

#597133#60
Date:
2012-10-23 12:37:17 UTC
From:
To:
reopen 597133
found 597133 1.6.3-1

Ugh, sorry, wrong bug. Reopening.

    S

#597133#72
Date:
2013-05-26 20:51:05 UTC
From:
To:
Hello!
This is still an issue in stable Debian 7.0.
I have three computers at home with Debian 7.0 and want to share "Public"
folders.

Nautilus shows an error "DBus error org.freedesktop.DBus.Error.NoReply: Message
did not receive a reply (timeout by message bus)".

dmesg shows lines like:
[ 6968.600842] gvfsd-dav[6554]: segfault at 0 ip 00007fdfbbdd73e6 sp 00007fff4227d588 error 4 in libc-2.13.so[7fdfbbd58000+180000]

It does not always happen. Approximately every 10th attempt succeeds and the
folder is mounted.
Strangely, when running under valgrind, approximately every 2nd attempt succeeds.

Using valgrind with gvfsd-dav executable like this:

user@debian-1:~$ cat /usr/lib/gvfs/gvfsd-dav
#! /bin/bash
LANG=C valgrind /usr/lib/gvfs/gvfsd-dav.original $* > /home/user/gvfsd-dav.log 2>&1

shows this:

==7080== Memcheck, a memory error detector
==7080== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==7080== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==7080== Command: /usr/lib/gvfs/gvfsd-dav.original --spawner :1.9 /org/gtk/gvfs/exec_spaw/35
==7080==
==7080== Invalid read of size 1
==7080==    at 0x4C2A001: strcmp (mc_replace_strmem.c:711)
==7080==    by 0x8DDA003: avahi_service_resolver_event (in /usr/lib/x86_64-linux-gnu/libavahi-client.so.3.2.9)
==7080==    by 0x8DD5D33: ??? (in /usr/lib/x86_64-linux-gnu/libavahi-client.so.3.2.9)
==7080==    by 0x527D53D: dbus_connection_dispatch (in /lib/x86_64-linux-gnu/libdbus-1.so.3.7.2)
==7080==    by 0x8DDC535: ??? (in /usr/lib/x86_64-linux-gnu/libavahi-client.so.3.2.9)
==7080==    by 0x89C065F: ??? (in /usr/lib/x86_64-linux-gnu/libavahi-glib.so.1.0.2)
==7080==    by 0x5F12354: g_main_context_dispatch (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3200.4)
==7080==    by 0x5F12687: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3200.4)
==7080==    by 0x5F12A81: g_main_loop_run (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3200.4)
==7080==    by 0x40EA2A: daemon_main (daemon-main.c:300)
==7080==    by 0x40718F: main (daemon-main-generic.c:39)
==7080==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==7080==
==7080==
==7080== Process terminating with default action of signal 11 (SIGSEGV)
==7080==  Access not within mapped region at address 0x0
==7080==    at 0x4C2A001: strcmp (mc_replace_strmem.c:711)
==7080==    by 0x8DDA003: avahi_service_resolver_event (in /usr/lib/x86_64-linux-gnu/libavahi-client.so.3.2.9)
==7080==    by 0x8DD5D33: ??? (in /usr/lib/x86_64-linux-gnu/libavahi-client.so.3.2.9)
==7080==    by 0x527D53D: dbus_connection_dispatch (in /lib/x86_64-linux-gnu/libdbus-1.so.3.7.2)
==7080==    by 0x8DDC535: ??? (in /usr/lib/x86_64-linux-gnu/libavahi-client.so.3.2.9)
==7080==    by 0x89C065F: ??? (in /usr/lib/x86_64-linux-gnu/libavahi-glib.so.1.0.2)
==7080==    by 0x5F12354: g_main_context_dispatch (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3200.4)
==7080==    by 0x5F12687: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3200.4)
==7080==    by 0x5F12A81: g_main_loop_run (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3200.4)
==7080==    by 0x40EA2A: daemon_main (daemon-main.c:300)
==7080==    by 0x40718F: main (daemon-main-generic.c:39)
==7080==  If you believe this happened as a result of a stack
==7080==  overflow in your program's main thread (unlikely but
==7080==  possible), you can try to increase the size of the
==7080==  main thread stack using the --main-stacksize= flag.
==7080==  The main thread stack size used in this run was 8388608.
==7080==
==7080== HEAP SUMMARY:
==7080==     in use at exit: 259,651 bytes in 2,095 blocks
==7080==   total heap usage: 3,516 allocs, 1,421 frees, 479,718 bytes allocated
==7080==
==7080== LEAK SUMMARY:
==7080==    definitely lost: 0 bytes in 0 blocks
==7080==    indirectly lost: 0 bytes in 0 blocks
==7080==      possibly lost: 56,133 bytes in 359 blocks
==7080==    still reachable: 203,518 bytes in 1,736 blocks
==7080==         suppressed: 0 bytes in 0 blocks
==7080== Rerun with --leak-check=full to see details of leaked memory
==7080==
==7080== For counts of detected and suppressed errors, rerun with: -v
==7080== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 8 from 6)

and sometimes this:

==6958== Memcheck, a memory error detector
==6958== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==6958== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==6958== Command: /usr/lib/gvfs/gvfsd-dav.original --spawner :1.9 /org/gtk/gvfs/exec_spaw/30
==6958==
gvfsd-dav.original: ../avahi-common/dbus-watch-glue.c:91: request_dispatch: Assertion `dbus_connection_get_dispatch_status(d->connection) == DBUS_DISPATCH_DATA_REMAINS' failed.
==6958==
==6958== HEAP SUMMARY:
==6958==     in use at exit: 259,382 bytes in 2,092 blocks
==6958==   total heap usage: 3,449 allocs, 1,357 frees, 477,948 bytes allocated
==6958==
==6958== LEAK SUMMARY:
==6958==    definitely lost: 0 bytes in 0 blocks
==6958==    indirectly lost: 0 bytes in 0 blocks
==6958==      possibly lost: 56,784 bytes in 361 blocks
==6958==    still reachable: 202,598 bytes in 1,731 blocks
==6958==         suppressed: 0 bytes in 0 blocks
==6958== Rerun with --leak-check=full to see details of leaked memory
==6958==
==6958== For counts of detected and suppressed errors, rerun with: -v
==6958== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 6)

So the behaviour is different each time.

#597133#77
Date:
2014-03-12 00:00:20 UTC
From:
To:
Hey,

Can you still reproduce this issue with a newer version,
such as 1.16.3-2?

thanks
regards
althaser

#597133#82
Date:
2014-03-13 08:55:24 UTC
From:
To:
On Wednesday, 12.03.2014 at 00:00 +0000, althaser wrote:
computers and a different network when I initially reported this bug.

However, I still don't seem to be lucky now. User file sharing is
enabled in gnome-control-center:

$ ps aux | grep apache
greffra+ 20529  0.0  0.0  56372  1816 ?        Ss   09:49   0:00 /usr/sbin/apache2 -f /usr/share/gnome-user-share/dav_user_2.4.conf -C Listen 60325
greffra+ 20530  0.0  0.0  56332  1352 ?        S    09:49   0:00 /usr/sbin/apache2 -f /usr/share/gnome-user-share/dav_user_2.4.conf -C Listen 60325
greffra+ 20531  0.0  0.0  56396   864 ?        S    09:49   0:00 /usr/sbin/apache2 -f /usr/share/gnome-user-share/dav_user_2.4.conf -C Listen 60325

Trying to connect:

$ gvfs-mount dav+sd://localhost:60325
Error mounting location: Message did not receive a reply (timeout by
message bus)

The error message appears instantly:

$ time gvfs-mount dav+sd://localhost:60325
Error mounting location: Message did not receive a reply (timeout by
message bus)

real    0m0.028s
user    0m0.008s
sys     0m0.000s

Packages used:

$ dpkg -l gvfs dbus | awk '/^ii/ {print $2 " " $3}'
dbus 1.8.0-2
gvfs:amd64 1.16.3-2

Hope that helps!

- Fabian

#597133#87
Date:
2014-03-13 10:19:04 UTC
From:
To:
retitle 597133 gvfsd-dav crashes, resulting in "Message did not receive a reply"
thanks

This means that whatever service the reply was expected from (probably
gvfsd-dav?) disconnected from D-Bus without replying. The dbus-daemon's
reasoning is that it can't possibly reply now, so it behaves as though
it had timed out immediately, rather than waiting for the timeout
period to elapse. In practice, that probably means the service crashed.

The error message is misleading; it's a relic of earlier D-Bus versions
where the dbus-daemon would enforce a maximum timeout on method calls.
It can still be configured to enforce a maximum timeout, but the
default is now "wait forever" - resource usage is limited by other
mechanisms, so the arbitrary timeout had no benefit, only costs. I've
opened <https://bugs.freedesktop.org/show_bug.cgi?id=76112> upstream.
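
As a rough way to check the "service crashed" theory, the kernel log can be
searched for gvfsd-dav segfault lines like the ones quoted earlier in this
bug. This is only a sketch: the function name is made up for illustration,
and it assumes the kernel logs segfaults in the format shown in those dmesg
excerpts.

```shell
#!/bin/sh
# Hypothetical helper: print kernel-log lines that record a gvfsd-dav segfault.
# Reads log text on stdin, so it works with dmesg output or a saved log file.
find_gvfsd_dav_segfaults() {
    grep -E 'gvfsd-dav\[[0-9]+\]: segfault'
}
```

Typical use right after a failed mount would be
"dmesg | find_gvfsd_dav_segfaults"; no output means no segfault was logged,
which does not rule out other ways of exiting (such as a failed assertion).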

It might be helpful to rename gvfsd-dav to gvfsd-dav.real and replace
it with a script something like this:

#!/bin/sh
exec > $HOME/gvfsd-dav-$$.log
exec 2>&1
exec gdb -return-child-result -batch -ex run \
  -ex 'thread apply all bt full' -ex kill -ex quit \
  --args /usr/lib/gvfs/gvfsd-dav.real "$@"

Alternatively, capturing core dumps via corekeeper or equivalent might
be helpful.

(People who know more about gvfs, please feel free to suggest how to
start gvfsd-dav under gdb manually - I was hoping it was just D-Bus
service activation, in which case running it from a terminal would be
enough, but that doesn't seem to be the case.)

    S

#597133#94
Date:
2014-03-13 10:46:25 UTC
From:
To:
On Thursday, 13.03.2014 at 10:19 +0000, Simon McVittie wrote:

Well, I have to tell you now that it works if I omit the '+sd' part from
the URI, i.e. "gvfs-mount dav://localhost:58892" works as expected.

Also, during my previous tests, no segfaults were ever reported in
'dmesg' output.

- Fabian

#597133#99
Date:
2014-03-13 11:12:34 UTC
From:
To:
Thanks, that's useful information.

Looking back through the bug, the valgrind log points to errors inside
Avahi libraries, which is related ("+sd" means "with DNS-SD",
implemented using Avahi). I wonder whether something is being done in a
non-thread-safe way? (Avahi uses libdbus, whose behaviour in multiple
threads is harder to reason about than GDBus.)

The script wrapping gdb that I suggested would tell you whether there
are multiple threads, and what they are doing at the time of the crash.

    S

#597133#104
Date:
2014-03-13 11:20:54 UTC
From:
To:
On Thursday, 13.03.2014 at 11:12 +0000, Simon McVittie wrote:

I already tried that wrapper, and in the course of doing so I found a
string like "localhost is not a valid dns-sd triple" or similar in the
log file created in $HOME. Because of that, I omitted the '+sd' part of
the URI and now it works.
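
For illustration, here is a small check of whether the host part of a URI
even has the shape of a DNS-SD service instance name, i.e.
<Instance>.<Service>.<Domain> with a service such as _webdav._tcp (RFC 6763).
This is a sketch under assumptions: the function name is hypothetical, the
example name below is invented, and the exact encoding that gvfs expects for
its "triple" is not stated anywhere in this bug.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the host part of a URI looks like a
# DNS-SD <Instance>.<Service>.<Domain> name, 1 otherwise.
# This is a shape check only; it does not resolve anything.
looks_like_dns_sd_triple() {
    uri=$1
    rest=${uri#*://}   # drop the scheme, if any
    host=${rest%%/*}   # drop any path
    case $host in
        ?*._?*._tcp.?*|?*._?*._udp.?*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With this, "dav+sd://localhost:60325" is rejected, while an invented name
such as "dav+sd://example._webdav._tcp.local" passes the shape check.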

I think there were multiple threads traced in the log file, but the
application did not crash.

Would you like me to apply that gdb wrapper script again to get more
information?

- Fabian

#597133#109
Date:
2014-03-13 11:30:24 UTC
From:
To:
the non-standard dav+sd:// URL (I wonder whether it's documented
anywhere?), the expected result is graceful failure with an error
message, and the actual result is a crash.

Can you still reproduce this failure mode through a GUI like Nautilus?
(If you can, then there might be a second bug: "the GUI uses the wrong
form of dav+sd:// too".)

That would probably be useful - having a backtrace is always better than
having no backtrace.

    S

#597133#114
Date:
2014-03-13 12:05:17 UTC
From:
To:
On Thursday, 13.03.2014 at 11:30 +0000, Simon McVittie wrote:

Erm, no, it doesn't crash anymore.

If I enable user file sharing in g-c-c, the GUI tells me that my shared
folder is now accessible to other computers under the "dav://kff50"
address (kff50 is my hostname). Note that it neither adds the '+sd'
part nor mentions the port that it has opened.

Now I open Nautilus and choose "Connect to Server". If I paste the
address that g-c-c just suggested, Nautilus presents an error message:
"Oops! Something went wrong.
Unhandled error message: HTTP-Fehler: Cannot connect to destination
(kff50)"

gvfs-mount in a terminal does the same:
$ LANG=C gvfs-mount dav://kff50
Error mounting location: HTTP-Fehler: Cannot connect to destination
(kff50)

However, if I add the port opened by apache to the command line, it
works as expected:
$ LANG=C gvfs-mount dav://kff50:50211
$ echo $?
0

Now Nautilus does indeed show "kff50:50211" as a shortcut in the Network
section of its sidebar.
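
Since the GUI does not show the port, one way to find it is to read it from
the apache2 command line, as in the ps output earlier in this bug. A minimal
sketch, assuming gnome-user-share still starts apache2 with "-C Listen PORT"
as shown there (the function name is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: print the port(s) that gnome-user-share's apache2
# instance listens on, given `ps` output on stdin.
user_share_ports() {
    sed -n 's/.*gnome-user-share.*-C Listen \([0-9][0-9]*\).*/\1/p' | sort -u
}
```

Feeding it "ps aux" output, e.g. "ps aux | user_share_ports", prints the port
to append to the dav:// address.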

I can provide one later.

- Fabian