#954197 GFileMonitor can deadlock if not cancelled before last unref

Package:
libglib2.0-0
Source:
glib2.0
Description:
GLib library of C routines
Submitter:
Simon McVittie
Date:
2022-07-13 12:33:04 UTC
Severity:
normal
Tags:
#954197#5
Date:
2020-03-18 08:29:06 UTC
From:
To:
https://buildd.debian.org/status/fetch.php?pkg=gegl&arch=arm64&ver=0.4.22-2&stamp=1584481529&raw=0

It isn't trivial to correlate the test's output with which one it is
(this is a legacy test using unstructured output and no particular testing
framework, and I should have made d/rules cat meson-logs/testlog.txt but
didn't), but I don't think it produced any, which means a fairly trivial
GEGL test didn't make any useful progress on these buildds.

https://gitlab.gnome.org/GNOME/gegl/issues/226 appears to be the same thing
on x86_64. I suspect it might be a race condition or otherwise
intermittent: in the past this test also timed out on armel, but my recent
upload worked fine on armel. Perhaps there is something about the ARM
family of architectures that makes this more likely to happen there.

    smcv

#954197#12
Date:
2020-03-18 09:31:46 UTC
From:
To:
1000 times in a loop. After a while, output stopped.

From the backtrace below it appears that there is a deadlock between
threads 1 and 6, which are both waiting for the same mutex. This might
be a GLib bug.

    smcv

Thread 6 (Thread 0xffff95366ce0 (LWP 1856)):
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0x0000ffff97f6f588 in g_mutex_lock_slowpath (mutex=0xffff97cee280 <g.inotify_lock_lock>) at ../../../glib/gthread-posix.c:1457
#2  0x0000ffff97f700b8 in g_mutex_lock (mutex=<optimized out>) at ../../../glib/gthread-posix.c:1482
#3  0x0000ffff97c4c00c in _ih_sub_cancel (sub=0xaaaac48fad60) at ../../../gio/inotify/inotify-helper.c:121
#4  0x0000ffff97c4c2a8 in g_inotify_file_monitor_cancel (monitor=<optimized out>) at ../../../gio/inotify/ginotifyfilemonitor.c:75
#5  0x0000ffff97b6b474 in g_file_monitor_cancel (monitor=0xaaaac48fd410 [GInotifyFileMonitor]) at ../../../gio/gfilemonitor.c:241
#6  g_file_monitor_cancel (monitor=0xaaaac48fd410 [GInotifyFileMonitor]) at ../../../gio/gfilemonitor.c:235
#7  0x0000ffff97b6b51c in g_file_monitor_dispose (object=0xaaaac48fd410 [GInotifyFileMonitor]) at ../../../gio/gfilemonitor.c:123
#8  0x0000ffff97c409d4 in g_local_file_monitor_dispose (object=0xaaaac48fd410 [GInotifyFileMonitor]) at ../../../gio/glocalfilemonitor.c:814
#9  0x0000ffff97e784f8 in g_object_unref (_object=<optimized out>) at ../../../gobject/gobject.c:3461
#10 g_object_unref (_object=0xaaaac48fd410) at ../../../gobject/gobject.c:3391
#11 0x0000ffff97c40358 in g_file_monitor_source_handle_event (fms=0xaaaac48faa50, event_type=event_type@entry=G_FILE_MONITOR_EVENT_CHANGES_DONE_HINT, child=0xffff84001470 "buf_a.gegl", rename_to=rename_to@entry=0x0, other=other@entry=0x0, event_time=<optimized out>) at ../../../gio/glocalfilemonitor.c:455
#12 0x0000ffff97c4bbe4 in ih_event_callback (event=0xffff84001430, sub=0xaaaac48fad60, file_event=<optimized out>) at ../../../gio/inotify/inotify-helper.c:206
#13 0x0000ffff97c4b13c in ip_event_dispatch (dir_list=dir_list@entry=0xaaaac48f7d80 = {...}, file_list=0x0, event=event@entry=0xffff84001430) at ../../../gio/inotify/inotify-path.c:492
#14 0x0000ffff97c4b2f4 in ip_event_dispatch (event=0xffff84001430, file_list=<optimized out>, dir_list=0xaaaac48f7d80 = {...}) at ../../../gio/inotify/inotify-path.c:547
#15 ip_event_callback (event=0xffff84001430) at ../../../gio/inotify/inotify-path.c:547
#16 ip_event_callback (event=0xffff84001430) at ../../../gio/inotify/inotify-path.c:529
#17 0x0000ffff97c4a4f8 in ik_source_dispatch (source=0xaaaac48fa5f0, func=0xffff97c4b210 <ip_event_callback>, user_data=<optimized out>) at ../../../gio/inotify/inotify-kernel.c:327
#18 0x0000ffff97f2235c in g_main_dispatch (context=0xaaaac48fa6b0) at ../../../glib/gmain.c:3309
#19 g_main_context_dispatch (context=context@entry=0xaaaac48fa6b0) at ../../../glib/gmain.c:3974
#20 0x0000ffff97f22708 in g_main_context_iterate (context=context@entry=0xaaaac48fa6b0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../../../glib/gmain.c:4047
#21 0x0000ffff97f227ac in g_main_context_iteration (context=0xaaaac48fa6b0, may_block=may_block@entry=1) at ../../../glib/gmain.c:4108
#22 0x0000ffff97f227fc in glib_worker_main (data=<optimized out>) at ../../../glib/gmain.c:5996
#23 0x0000ffff97f4bcec in g_thread_proxy (data=0xaaaac48fbd20) at ../../../glib/gthread.c:807
#24 0x0000ffff97a0b998 in start_thread (arg=0xfffffc63b24f) at pthread_create.c:479
#25 0x0000ffff97dc16ac in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 5 (Thread 0xffff95b67ce0 (LWP 1855)):
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0x0000ffff97f701dc in g_cond_wait (cond=cond@entry=0xffff981bf9d0 <queue_cond>, mutex=mutex@entry=0xffff981bf998 <mutex>) at ../../../glib/gthread-posix.c:1540
#2  0x0000ffff98149d70 in gegl_tile_backend_file_writer_thread (ignored=<optimized out>) at ../gegl/buffer/gegl-tile-backend-file-async.c:257
#3  0x0000ffff97f4bcec in g_thread_proxy (data=0xaaaac4865120) at ../../../glib/gthread.c:807
#4  0x0000ffff97a0b998 in start_thread (arg=0xfffffc63b5ff) at pthread_create.c:479
#5  0x0000ffff97dc16ac in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 4 (Thread 0xffff96368ce0 (LWP 1854)):
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0x0000ffff97f701dc in g_cond_wait (cond=cond@entry=0xffff981bd228 <gegl_parallel_distribute_threads+128>, mutex=mutex@entry=0xffff981bd220 <gegl_parallel_distribute_threads+120>) at ../../../glib/gthread-posix.c:1540
#2  0x0000ffff9811d4e0 in gegl_parallel_distribute_thread_func (thread=0xffff981bd218 <gegl_parallel_distribute_threads+112>) at ../gegl/gegl-parallel.c:508
#3  0x0000ffff97f4bcec in g_thread_proxy (data=0xaaaac4864b60) at ../../../glib/gthread.c:807
#4  0x0000ffff97a0b998 in start_thread (arg=0xfffffc63bccf) at pthread_create.c:479
#5  0x0000ffff97dc16ac in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 3 (Thread 0xffff96b69ce0 (LWP 1853)):
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0x0000ffff97f701dc in g_cond_wait (cond=cond@entry=0xffff981bd1f0 <gegl_parallel_distribute_threads+72>, mutex=mutex@entry=0xffff981bd1e8 <gegl_parallel_distribute_threads+64>) at ../../../glib/gthread-posix.c:1540
#2  0x0000ffff9811d4e0 in gegl_parallel_distribute_thread_func (thread=0xffff981bd1e0 <gegl_parallel_distribute_threads+56>) at ../gegl/gegl-parallel.c:508
#3  0x0000ffff97f4bcec in g_thread_proxy (data=0xaaaac4864b00) at ../../../glib/gthread.c:807
#4  0x0000ffff97a0b998 in start_thread (arg=0xfffffc63bccf) at pthread_create.c:479
#5  0x0000ffff97dc16ac in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 2 (Thread 0xffff9736ace0 (LWP 1852)):
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0x0000ffff97f701dc in g_cond_wait (cond=cond@entry=0xffff981bd1b8 <gegl_parallel_distribute_threads+16>, mutex=mutex@entry=0xffff981bd1b0 <gegl_parallel_distribute_threads+8>) at ../../../glib/gthread-posix.c:1540
#2  0x0000ffff9811d4e0 in gegl_parallel_distribute_thread_func (thread=0xffff981bd1a8 <gegl_parallel_distribute_threads>) at ../gegl/gegl-parallel.c:508
#3  0x0000ffff97f4bcec in g_thread_proxy (data=0xaaaac4864aa0) at ../../../glib/gthread.c:807
#4  0x0000ffff97a0b998 in start_thread (arg=0xfffffc63bccf) at pthread_create.c:479
#5  0x0000ffff97dc16ac in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 1 (Thread 0xffff976efbe0 (LWP 1851)):
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0x0000ffff97f6f588 in g_mutex_lock_slowpath (mutex=mutex@entry=0xffff97cee280 <g.inotify_lock_lock>) at ../../../glib/gthread-posix.c:1457
#2  0x0000ffff97f700b8 in g_mutex_lock (mutex=mutex@entry=0xffff97cee280 <g.inotify_lock_lock>) at ../../../glib/gthread-posix.c:1482
#3  0x0000ffff97c4bf14 in _ih_startup () at ../../../gio/inotify/inotify-helper.c:74
#4  0x0000ffff97c4c308 in g_inotify_file_monitor_start (local_monitor=<optimized out>, dirname=0xaaaac48f62c0 "/tmp/test-backend-file-PI3NH0", basename=0xaaaac48fe550 "buf_a.gegl", filename=0x0, source=0xaaaac48fe700) at ../../../gio/inotify/ginotifyfilemonitor.c:61
#5  0x0000ffff97c40e5c in g_local_file_monitor_new_for_path (pathname=0xaaaac48b3bb0 "/tmp/test-backend-file-PI3NH0/buf_a.gegl", is_directory=0, flags=G_FILE_MONITOR_NONE, error=<optimized out>) at ../../../gio/glocalfilemonitor.c:886
#6  0x0000ffff97b5e6dc in g_file_monitor_file (file=0xaaaac48f80c0, flags=flags@entry=G_FILE_MONITOR_NONE, cancellable=cancellable@entry=0x0, error=error@entry=0x0) at ../../../gio/gfile.c:5447
#7  0x0000ffff9814abec in gegl_tile_backend_file_constructed (object=0xaaaac48f00a0 [GeglTileBackendFile]) at ../gegl/buffer/gegl-tile-backend-file-async.c:1170
#8  0x0000ffff97e78d88 in g_object_new_internal (class=0xaaaac48ee750, class@entry=0xaaaac48b5140, params=params@entry=0xfffffc63b928, n_params=n_params@entry=4) at ../../../gobject/gobject.c:1977
#9  0x0000ffff97e7ab8c in g_object_new_valist (object_type=0xaaaac48ee5a0 [GeglTileBackendFile/GeglTileBackend/GeglTileSource], first_property_name=first_property_name@entry=0xffff98177638 "tile-width", var_args=...) at ../../../gobject/gobject.c:2262
#10 0x0000ffff97e7ae14 in g_object_new (object_type=<optimized out>, first_property_name=first_property_name@entry=0xffff98177638 "tile-width") at ../../../gobject/gobject.c:1780
#11 0x0000ffff9813c9e0 in gegl_buffer_constructor (type=<optimized out>, n_params=<optimized out>, params=<optimized out>) at ../gegl/buffer/gegl-buffer.c:549
#12 0x0000ffff97e78b40 in g_object_new_with_custom_constructor (n_params=2, params=0xfffffc63be18, class=0xaaaac48b4400) at ../../../gobject/gobject.c:1855
#13 g_object_new_internal (class=0xaaaac48b4400, class@entry=0xaaaac48724d0, params=params@entry=0xfffffc63be18, n_params=n_params@entry=2) at ../../../gobject/gobject.c:1935
#14 0x0000ffff97e7ab8c in g_object_new_valist (object_type=0xaaaac48b4160 [GeglBuffer/GeglTileHandler/GeglTileSource], first_property_name=<optimized out>, var_args=...) at ../../../gobject/gobject.c:2262
#15 0x0000ffff97e7ae14 in g_object_new (object_type=<optimized out>, first_property_name=<optimized out>) at ../../../gobject/gobject.c:1780
#16 0x0000aaaac3179808 in test_buffer_open () at ../tests/simple/test-backend-file.c:306
#17 main (argc=<optimized out>, argv=<optimized out>) at ../tests/simple/test-backend-file.c:443

#954197#17
Date:
2020-03-18 09:37:15 UTC
From:
To:
Control: reassign -1 libglib2.0-0 2.62.2-3
Control: affects -1 + src:gegl
Control: forwarded -1 https://gitlab.gnome.org/GNOME/glib/issues/1941
Control: tags -1 + upstream

Looks like <https://gitlab.gnome.org/GNOME/glib/issues/1941>. The
workaround, which gegl might be able to use, is to always cancel
file monitors before their last-unref is reached.

    smcv

#954197#34
Date:
2020-03-18 17:56:30 UTC
From:
To:
Control: retitle -1 GFileMonitor can deadlock if not cancelled before last unref
Control: severity -1 normal

I have done that in gegl and it seems to have resolved the FTBFS.

According to GLib developers, best-practice is to cancel file monitors,
close files and sockets, etc. before their last-unref *anyway*, similar
to the advice given for file-like objects in modern Python: this avoids
doing complicated things in the dispose() and finalize() callbacks,
and avoids leaking resources if a reference cycle, a temporary ref
taken while delivering events or a ref taken by a worker thread keeps
the object alive. So this workaround is not just a workaround, it is
also an improvement to gegl.

    smcv

#954197#45
Date:
2022-07-13 12:28:40 UTC
From:
To:
Version: 2.72.3-1

Fixed upstream in 2.72.3 and 2.73.1.

    smcv