#1031059 golang-github-pion-sctp: FTBFS randomly (failing tests)

#1031059#5
Date:
2023-02-10 23:43:59 UTC
From:
To:
Dear maintainer:

During a rebuild of all packages in bookworm, your package failed to build:
--------------------------------------------------------------------------------
[...]
  debian/rules binary-indep
dh binary-indep --builddirectory=_build --buildsystem=golang --with=golang
    dh_update_autotools_config -i -O--builddirectory=_build -O--buildsystem=golang
    dh_autoreconf -i -O--builddirectory=_build -O--buildsystem=golang
    dh_auto_configure -i -O--builddirectory=_build -O--buildsystem=golang
    dh_auto_build -i -O--builddirectory=_build -O--buildsystem=golang
	cd _build && go install -trimpath -v -p 1 github.com/pion/sctp github.com/pion/sctp/examples/ping-pong
internal/goarch
internal/unsafeheader
internal/abi
internal/cpu
internal/bytealg
internal/goexperiment
internal/goos
runtime/internal/atomic
runtime/internal/math
runtime/internal/sys
runtime/internal/syscall
runtime
internal/reflectlite
errors
internal/race
sync/atomic
sync
io
unicode
unicode/utf8
bytes
internal/itoa
internal/oserror
syscall
time
context
hash
math/bits
math
strconv
crypto
crypto/internal/subtle
crypto/subtle
reflect
encoding/binary
crypto/cipher
crypto/internal/boring/sig
crypto/internal/boring
crypto/internal/randutil
internal/syscall/unix
sort
internal/fmtsort
internal/poll
internal/safefilepath
internal/syscall/execenv
internal/testlog
path
io/fs
os
fmt
math/rand
strings
math/big
crypto/rand
encoding/hex
log
github.com/pion/logging
github.com/pion/randutil
hash/crc32
vendor/golang.org/x/net/dns/dnsmessage
internal/godebug
internal/nettrace
internal/singleflight
internal/intern
net/netip
runtime/cgo
net
github.com/pion/sctp
github.com/pion/sctp/examples/ping-pong
    dh_auto_test -i -O--builddirectory=_build -O--buildsystem=golang
	cd _build && go test -vet=off -v -p 1 github.com/pion/sctp github.com/pion/sctp/examples/ping-pong
=== RUN   TestAckTimer
=== RUN   TestAckTimer/start_and_close
     ack_timer_test.go:26: ack timed out
     ack_timer_test.go:26: ack timed out
=== RUN   TestAckTimer/start_and_stop
--- PASS: TestAckTimer (1.15s)
     --- PASS: TestAckTimer/start_and_close (0.90s)
     --- PASS: TestAckTimer/start_and_stop (0.25s)
=== RUN   TestAssocStressDuplex
--- PASS: TestAssocStressDuplex (0.62s)
=== RUN   TestAssocReliable
=== RUN   TestAssocReliable/Simple
=== RUN   TestAssocReliable/ordered_reordered
=== RUN   TestAssocReliable/ordered_fragmented_then_defragmented
=== RUN   TestAssocReliable/unordered_fragmented_then_defragmented
=== RUN   TestAssocReliable/unordered_reordered
=== RUN   TestAssocReliable/retransmission
=== RUN   TestAssocReliable/short_buffer
--- PASS: TestAssocReliable (1.22s)
     --- PASS: TestAssocReliable/Simple (0.15s)
     --- PASS: TestAssocReliable/ordered_reordered (0.13s)
     --- PASS: TestAssocReliable/ordered_fragmented_then_defragmented (0.16s)
     --- PASS: TestAssocReliable/unordered_fragmented_then_defragmented (0.16s)
     --- PASS: TestAssocReliable/unordered_reordered (0.16s)
     --- PASS: TestAssocReliable/retransmission (0.32s)
     --- PASS: TestAssocReliable/short_buffer (0.15s)
=== RUN   TestAssocUnreliable
=== RUN   TestAssocUnreliable/Rexmit_ordered_no_fragment
=== RUN   TestAssocUnreliable/Rexmit_ordered_fragments
=== RUN   TestAssocUnreliable/Rexmit_unordered_no_fragment
=== RUN   TestAssocUnreliable/Rexmit_unordered_fragments
=== RUN   TestAssocUnreliable/Timed_ordered
=== RUN   TestAssocUnreliable/Timed_unordered
--- PASS: TestAssocUnreliable (1.02s)
     --- PASS: TestAssocUnreliable/Rexmit_ordered_no_fragment (0.17s)
     --- PASS: TestAssocUnreliable/Rexmit_ordered_fragments (0.18s)
     --- PASS: TestAssocUnreliable/Rexmit_unordered_no_fragment (0.17s)
     --- PASS: TestAssocUnreliable/Rexmit_unordered_fragments (0.17s)
     --- PASS: TestAssocUnreliable/Timed_ordered (0.17s)
     --- PASS: TestAssocUnreliable/Timed_unordered (0.17s)
=== RUN   TestCreateForwardTSN
=== RUN   TestCreateForwardTSN/forward_one_abandoned
=== RUN   TestCreateForwardTSN/forward_two_abandoned_with_the_same_SI
--- PASS: TestCreateForwardTSN (0.00s)
     --- PASS: TestCreateForwardTSN/forward_one_abandoned (0.00s)
     --- PASS: TestCreateForwardTSN/forward_two_abandoned_with_the_same_SI (0.00s)
=== RUN   TestHandleForwardTSN
=== RUN   TestHandleForwardTSN/forward_3_unreceived_chunks
=== RUN   TestHandleForwardTSN/forward_1_for_1_missing
=== RUN   TestHandleForwardTSN/forward_1_for_2_missing
=== RUN   TestHandleForwardTSN/dup_forward_TSN_chunk_should_generate_sack
--- PASS: TestHandleForwardTSN (0.00s)
     --- PASS: TestHandleForwardTSN/forward_3_unreceived_chunks (0.00s)
     --- PASS: TestHandleForwardTSN/forward_1_for_1_missing (0.00s)
     --- PASS: TestHandleForwardTSN/forward_1_for_2_missing (0.00s)
     --- PASS: TestHandleForwardTSN/dup_forward_TSN_chunk_should_generate_sack (0.00s)
=== RUN   TestAssocT1InitTimer
=== RUN   TestAssocT1InitTimer/Retransmission_success
=== RUN   TestAssocT1InitTimer/Retransmission_failure
sctp ERROR: 2023/02/09 23:43:05 [0xc00011ddc0] retransmission failure: T1-init
sctp ERROR: 2023/02/09 23:43:05 [0xc00011dc00] retransmission failure: T1-init
--- PASS: TestAssocT1InitTimer (0.69s)
     --- PASS: TestAssocT1InitTimer/Retransmission_success (0.04s)
     --- PASS: TestAssocT1InitTimer/Retransmission_failure (0.66s)
=== RUN   TestAssocT1CookieTimer
=== RUN   TestAssocT1CookieTimer/Retransmission_success
=== RUN   TestAssocT1CookieTimer/Retransmission_failure
sctp ERROR: 2023/02/09 23:43:05 [0xc0003c6380] retransmission failure: T1-cookie
--- PASS: TestAssocT1CookieTimer (1.69s)
     --- PASS: TestAssocT1CookieTimer/Retransmission_success (0.03s)
     --- PASS: TestAssocT1CookieTimer/Retransmission_failure (1.66s)
=== RUN   TestAssocCreateNewStream
=== RUN   TestAssocCreateNewStream/acceptChSize
--- PASS: TestAssocCreateNewStream (0.00s)
     --- PASS: TestAssocCreateNewStream/acceptChSize (0.00s)
=== RUN   TestAssocT3RtxTimer
=== RUN   TestAssocT3RtxTimer/Retransmission_success
--- PASS: TestAssocT3RtxTimer (0.22s)
     --- PASS: TestAssocT3RtxTimer/Retransmission_success (0.22s)
=== RUN   TestAssocCongestionControl
=== RUN   TestAssocCongestionControl/Fast_retransmission
     association_test.go:1769: nDATAs      : 5
     association_test.go:1770: nSACKs      : 5
     association_test.go:1771: nAckTimeouts: 2
     association_test.go:1772: nFastRetrans: 1
=== RUN   TestAssocCongestionControl/Congestion_Avoidance
goroutine profile: total 10
2 @ 0x43ba36 0x44b7fc 0x5abddc 0x5bfe7e 0x46b8a1
#	0x5abddb	github.com/pion/transport/test.(*bridgeConn).Read+0x11b	/<<PKGBUILDDIR>>/_build/src/github.com/pion/transport/test/bridge.go:73
#	0x5bfe7d	github.com/pion/sctp.(*Association).readLoop+0x15d	/<<PKGBUILDDIR>>/_build/src/github.com/pion/sctp/association.go:509

1 @ 0x430db6 0x465ce5 0x50c535 0x50c34d 0x5092ab 0x5afba5 0x46b8a1
#	0x465ce4	runtime/pprof.runtime_goroutineProfileWithLabels+0x24	/usr/lib/go-1.19/src/runtime/mprof.go:846
#	0x50c534	runtime/pprof.writeRuntimeProfile+0xb4			/usr/lib/go-1.19/src/runtime/pprof/pprof.go:723
#	0x50c34c	runtime/pprof.writeGoroutine+0x4c			/usr/lib/go-1.19/src/runtime/pprof/pprof.go:683
#	0x5092aa	runtime/pprof.(*Profile).WriteTo+0x14a			/usr/lib/go-1.19/src/runtime/pprof/pprof.go:330
#	0x5afba4	github.com/pion/transport/test.TimeOut.func1+0x44	/<<PKGBUILDDIR>>/_build/src/github.com/pion/transport/test/util.go:21

1 @ 0x43a15f 0x46cedb 0x47376e 0x473734 0x5ae328 0x5edbeb 0x4d5b4b 0x46b8a1
#	0x47376d	sync.(*Mutex).Unlock+0x8d					/usr/lib/go-1.19/src/sync/mutex.go:219
#	0x473733	sync.(*RWMutex).Unlock+0x53					/usr/lib/go-1.19/src/sync/rwmutex.go:216
#	0x5ae327	github.com/pion/transport/test.(*Bridge).Tick+0x2c7		/<<PKGBUILDDIR>>/_build/src/github.com/pion/transport/test/bridge.go:414
#	0x5edbea	github.com/pion/sctp.TestAssocCongestionControl.func2+0x96a	/<<PKGBUILDDIR>>/_build/src/github.com/pion/sctp/association_test.go:1814
#	0x4d5b4a	testing.tRunner+0x10a						/usr/lib/go-1.19/src/testing/testing.go:1446

1 @ 0x43ba36 0x40815b 0x407c58 0x4d693a 0x4d888e 0x4d5b4b 0x4d8736 0x4d7219 0x617aea 0x43b672 0x46b8a1
#	0x4d6939	testing.(*T).Run+0x379		/usr/lib/go-1.19/src/testing/testing.go:1494
#	0x4d888d	testing.runTests.func1+0x6d	/usr/lib/go-1.19/src/testing/testing.go:1846
#	0x4d5b4a	testing.tRunner+0x10a		/usr/lib/go-1.19/src/testing/testing.go:1446
#	0x4d8735	testing.runTests+0x455		/usr/lib/go-1.19/src/testing/testing.go:1844
#	0x4d7218	testing.(*M).Run+0x5d8		/usr/lib/go-1.19/src/testing/testing.go:1726
#	0x617ae9	main.main+0x1a9			_testmain.go:193
#	0x43b671	runtime.main+0x211		/usr/lib/go-1.19/src/runtime/proc.go:250

1 @ 0x43ba36 0x40815b 0x407c58 0x4d693a 0x5ec5fe 0x4d5b4b 0x46b8a1
#	0x4d6939	testing.(*T).Run+0x379					/usr/lib/go-1.19/src/testing/testing.go:1494
#	0x5ec5fd	github.com/pion/sctp.TestAssocCongestionControl+0x11d	/<<PKGBUILDDIR>>/_build/src/github.com/pion/sctp/association_test.go:1779
#	0x4d5b4a	testing.tRunner+0x10a					/usr/lib/go-1.19/src/testing/testing.go:1446

1 @ 0x43ba36 0x44b7fc 0x5bd3b4 0x46b8a1
#	0x5bd3b3	github.com/pion/sctp.(*ackTimer).start.func1+0x73	/<<PKGBUILDDIR>>/_build/src/github.com/pion/sctp/ack_timer.go:56

1 @ 0x43ba36 0x44b7fc 0x5c08c6 0x46b8a1
#	0x5c08c5	github.com/pion/sctp.(*Association).writeLoop+0x205	/<<PKGBUILDDIR>>/_build/src/github.com/pion/sctp/association.go:558

1 @ 0x43ba36 0x44b7fc 0x5db84b 0x46b8a1
#	0x5db84a	github.com/pion/sctp.(*rtxTimer).start.func1+0xca	/<<PKGBUILDDIR>>/_build/src/github.com/pion/sctp/rtx_timer.go:151

1 @ 0x43ba36 0x468715 0x5acbcf 0x5abfd5 0x5c0a85 0x46b8a1
#	0x468714	time.Sleep+0x134					/usr/lib/go-1.19/src/runtime/time.go:195
#	0x5acbce	github.com/pion/transport/test.(*Bridge).Push+0x8e	/<<PKGBUILDDIR>>/_build/src/github.com/pion/transport/test/bridge.go:246
#	0x5abfd4	github.com/pion/transport/test.(*bridgeConn).Write+0xd4	/<<PKGBUILDDIR>>/_build/src/github.com/pion/transport/test/bridge.go:97
#	0x5c0a84	github.com/pion/sctp.(*Association).writeLoop+0x3c4	/<<PKGBUILDDIR>>/_build/src/github.com/pion/sctp/association.go:539

panic: timeout

goroutine 789 [running]:
github.com/pion/transport/test.TimeOut.func1()
	/<<PKGBUILDDIR>>/_build/src/github.com/pion/transport/test/util.go:24 +0xa5
created by time.goFunc
	/usr/lib/go-1.19/src/time/sleep.go:176 +0x32
FAIL	github.com/pion/sctp	17.491s
?   	github.com/pion/sctp/examples/ping-pong	[no test files]
FAIL
dh_auto_test: error: cd _build && go test -vet=off -v -p 1 github.com/pion/sctp github.com/pion/sctp/examples/ping-pong returned exit code 1
make: *** [debian/rules:4: binary-indep] Error 25
dpkg-buildpackage: error: debian/rules binary-indep subprocess returned exit status 2
--------------------------------------------------------------------------------

(The above is just how the build ends and not necessarily the most relevant part)

Note: I've used the word "randomly" in the subject, but it's not purely random.
After building several times on systems with 1 and 2 CPUs, I've noticed that
the failure rate on systems with one CPU is a lot higher.

Therefore, to reproduce, please try first setting GRUB_CMDLINE_LINUX="nr_cpus=1"
in /etc/default/grub. If that does not work, then the general offer below
still applies.
--------------------------------------------------------------------- About the archive rebuild: The build was made using virtual machines from Hetzner, with enough memory, enough disk, and either one or two CPUs, using a reduced chroot with only build-essential packages (plus debhelper). If you could not reproduce the bug please contact me privately, as I am willing to provide ssh access to a virtual machine where the bug is fully reproducible. If this is really a bug in one of the build-depends, please use reassign and affects, so that this is still visible in the BTS web page for this package. Thanks.