#1140806 python3-pycurl: WebSocket test deadlock under libcurl 8.21.0 (lazy PONG + read-only select)

Package:
python3-pycurl
Source:
python3-pycurl
Description:
Python bindings to libcurl (Python 3)
Submitter:
Keng-Yu Lin
Date:
2026-06-26 18:51:02 UTC
Severity:
normal
Tags:
#1140806#5
Date:
2026-06-26 18:49:07 UTC
Dear Maintainer,

While investigating the autopkgtest regression blocking the migration
of vsftpd (3.0.5-1) to testing, I discovered that the newly introduced
WebSocket test case in pycurl consistently deadlocks and fails under
libcurl >= 8.21.0 inside autopkgtest (LXC) environments:

  tests/test_websocket.py::test_default_mode_autopongs_server_ping

Correlation with libcurl 8.21.0:
- PASS: Runs against libcurl 8.20.0 (from testing) pass 100% of the time.
- FAIL: Runs against libcurl 8.21.0 (from unstable, including rc3) fail 100% of the time.

This regression was triggered by the upstream security fix for
CVE-2026-11586 in curl 8.21.0.

Root Cause & Deadlock Mechanism:

1. The PycURL Test Loop
   In tests/test_websocket.py, the helper _recv catches
   BlockingIOError and calls _wait_readable, which polls the
   socket only for readability:

       r, _, _ = select.select([fd], [], [], timeout) # Empty write list

2. curl 8.21.0 "Lazy PONG" (CVE-2026-11586)
   To prevent memory exhaustion, upstream commit 849317ff5c5a5e13f50ec3d0
   removed the immediate flush from ws_enc_add_cntrl() and made
   PONG sending lazy. The PONG frame is now merely buffered in
   ws->pending to be flushed during subsequent I/O.

3. The Deadlock
   In non-blocking mode, when curl_ws_recv() consumes the incoming
   PING frame from the socket receive buffer, no application-layer data
   (TEXT or BINARY) is yet available to return. Consequently,
   curl_ws_recv() naturally returns CURLE_AGAIN (raising
   BlockingIOError in Python) to yield control.

   Crucially, under curl 8.21.0's new "lazy PONG" design, the generated
   PONG frame is only buffered in ws->pending and has not yet been
   flushed to the socket.

   Because the PycURL test loop immediately suspends in a read-only
   select() upon receiving BlockingIOError, the client blocks
   forever waiting for readability. The server, waiting for the PONG,
   sends no further data. Since the client is blocked in select(),
   it never invokes any subsequent libcurl API to drive the write
   queue, leaving the PONG permanently trapped in ws->pending.

4. The Latent Bug in PycURL's I/O Loop
   In non-blocking mode with CONNECT_ONLY, the application has the
   obligation to drive the outbound write queue (either by polling
   for write-readiness or invoking subsequent libcurl APIs to flush
   pending output). The pycurl test helper _wait_readable violates
   this by polling only for readability. The bug was latent until
   libcurl's new lazy PONG design exposed it. Alternatives such as
   CURLWS_NOAUTOPONG (set via CURLOPT_WS_OPTIONS) exist, but in the
   default auto-PONG mode the I/O loop itself must drive pending writes.
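
As an illustration of the write-readiness polling described in item 4,
here is a minimal sketch of a wait helper that adds the socket to the
write set when output is pending. The name wait_for_io and the
want_write flag are hypothetical; this report does not establish how
pycurl would learn that libcurl holds a buffered PONG, nor which libcurl
call would flush it once the socket is writable. The demonstration uses
a local socket pair only, with no libcurl involved.

```python
import select
import socket


def wait_for_io(fd, want_write, timeout):
    """Wait until fd is readable, or writable when output is pending.

    want_write is a hypothetical caller-supplied flag (assumption: the
    caller knows that a frame such as a PONG is still buffered).
    """
    write_fds = [fd] if want_write else []
    readable, writable, _ = select.select([fd], write_fds, [], timeout)
    return bool(readable), bool(writable)


# Demonstration on an idle local socket pair.
a, b = socket.socketpair()
try:
    # Read-only wait: nothing to read, so this sleeps until the timeout.
    idle_read_only = wait_for_io(a.fileno(), False, 0.05)
    # Read/write wait: the socket is writable, so this returns at once.
    idle_read_write = wait_for_io(a.fileno(), True, 0.05)
finally:
    a.close()
    b.close()
```

With the write set populated, the wait returns as soon as the socket can
accept data, giving the caller a chance to re-enter libcurl instead of
sleeping until the peer gives up.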

[Why Increasing Timeout is Ineffective]

Increasing the client-side timeout (even to 60.0s) does not resolve
the deadlock. The mock server's underlying websockets library uses
a default close_timeout of 10 seconds:

    # websockets/asyncio/connection.py
    class Connection:
        def __init__(..., close_timeout: float | None = 10, ...):

About 10 seconds after it starts waiting for the PONG, the server
closes the TCP connection (sending a FIN). The FIN makes the socket
readable and wakes the client's select(); the client then calls
ws_recv(), which immediately fails with CURLE_GOT_NOTHING (52,
"Server returned nothing").
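
The timing argument above can be reproduced with a scaled-down model
that involves neither libcurl nor websockets. In this sketch a thread
stands in for the mock server: it waits close_timeout seconds for a
reply that never arrives, then closes its end. The client sits in a
read-only select() with a much larger timeout. The function name and the
0.2s and 5.0s values are assumptions chosen for illustration (stand-ins
for the 10s and 60s in this report).

```python
import select
import socket
import threading
import time


def simulate_stalled_client(close_timeout, client_timeout):
    """Return (seconds spent in select, bytes read after waking)."""
    client, server = socket.socketpair()
    client.setblocking(False)

    def server_side():
        # Wait for a reply (the PONG) that the client never sends.
        server.settimeout(close_timeout)
        try:
            server.recv(1)
        except socket.timeout:
            pass
        server.close()  # peer close, analogous to the FIN in the trace

    worker = threading.Thread(target=server_side)
    start = time.monotonic()
    worker.start()
    # Read-only wait, like _wait_readable: the write set is empty.
    readable, _, _ = select.select([client], [], [], client_timeout)
    waited = time.monotonic() - start
    data = client.recv(65535) if readable else None
    worker.join()
    client.close()
    return waited, data


waited, data = simulate_stalled_client(close_timeout=0.2, client_timeout=5.0)
```

In this model the client wakes after roughly close_timeout, not
client_timeout, and reads an empty byte string (EOF), which is why a
longer client-side timeout changes nothing.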

Execution trace from debug print statements added to the client-side
execution path (60s timeout):

    === PHASE 3: The Deadlock (PING received, PONG trapped) ===
    [DEBUG CLIENT] ws_recv raised BlockingIOError at 0.000798s
    [DEBUG CLIENT] Entering _wait_readable (remaining=59.493937s) at 0.001020s
    [DEBUG CLIENT] Exited _wait_readable at 10.008510s              # Trapped for exactly 10.0s until server FIN
    [DEBUG CLIENT] Calling ws_recv at 10.008932s
    FAILED (pycurl.error: 52, 'Server returned nothing')

strace of pytest running the websocket test case:

    18:12:26.595986 recvfrom(11, ..., 65535, 0, ...) = -1 EAGAIN  # Read-side yield, socket is empty
    18:12:26.596821 pselect6(12, [11], NULL, NULL, ...) <unfinished ...>  # Read-only select (writefds is NULL)
    ...
    [10.0-second silence: PONG remains trapped in ws->pending, zero outbound writes on fd 11]
    ...
    18:12:36.603624 close(12) <unfinished ...>  # Server close_timeout (10s) expires, sending FIN
    18:12:36.603798 <... pselect6 resumed> = 1 (in [11])  # Woken by TCP FIN
    18:12:36.604703 recvfrom(11, "", 65535, ...) = 0  # Connection closed (EOF), raising CURLE_GOT_NOTHING

[Proposed Temporary Workaround]

Since a complete fix requires upstream architectural refactoring of
the non-blocking I/O loop, marking this test as xfail is proposed
as a temporary workaround to unblock package migrations in Debian.

    diff --git a/tests/test_websocket.py b/tests/test_websocket.py
    index d9268f7..0b764db 100644
    --- a/tests/test_websocket.py
    +++ b/tests/test_websocket.py
    @@ -381,6 +381,7 @@ def test_ws_recv_would_block(wscurl, ws_app):
         assert exc_info.value.errno == errno.EAGAIN


    +@pytest.mark.xfail(run=False, reason="deadlock on non-blocking auto-PONG without write-ready drive (libcurl >= 8.21.0)")
     def test_default_mode_autopongs_server_ping(wscurl, ws_app):
         wscurl.setopt(pycurl.URL, ws_app + "/ping-and-report-pong")
         wscurl.setopt(pycurl.CONNECT_ONLY, 2)
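
If skipping the test unconditionally is considered too broad, the xfail
could be limited to affected libcurl versions. The following is only a
sketch of the version comparison; the helper names are hypothetical and
the 8.21.0 threshold is taken from the PASS/FAIL correlation above. It
compares numeric components, so that for example 8.9.1 sorts before
8.21.0.

```python
def parse_version(text):
    """Turn '8.21.0' or '8.21.0-rc3' into a comparable tuple (8, 21, 0)."""
    core = text.split("-", 1)[0]  # drop a suffix such as '-rc3'
    return tuple(int(part) for part in core.split(".")[:3])


def affected_by_lazy_pong(libcurl_version, first_affected="8.21.0"):
    """True when libcurl_version is at or above the first affected release."""
    return parse_version(libcurl_version) >= parse_version(first_affected)


# Possible use in the test module (assumption: pycurl.version_info()[1]
# holds the libcurl version string):
#
#   @pytest.mark.xfail(affected_by_lazy_pong(pycurl.version_info()[1]),
#                      run=False, reason="...")
```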

Thanks,
Keng-Yu Lin