Dear maintainer(s),

I looked at the results of the autopkgtest of your package. I noticed that it regularly fails. The failures seem related to the host that runs the test. ci-worker13 is a beefy machine [1] and tests seem to fail consistently there, while the other amd64 workers are much more moderate [2] and tests pass there.

Because the unstable-to-testing migration software now blocks on regressions in testing, flaky tests, i.e. tests that flip between passing and failing without changes to the list of installed packages, are causing people unrelated to your package to spend time on these tests.

Don't hesitate to reach out if you need help or more information from our infrastructure.

Paul

[1] https://metal.equinix.com/product/servers/m3-large/
[2] https://aws.amazon.com/ec2/instance-types/m5/

https://ci.debian.net/packages/p/pdns/testing/amd64/

https://ci.debian.net/data/autopkgtest/testing/amd64/p/pdns/41325109/log.gz

268s + service pdns restart
269s Job for pdns.service failed because the control process exited with error code.
269s See "systemctl status pdns.service" and "journalctl -xeu pdns.service" for details.
269s + journalctl _SYSTEMD_UNIT=pdns.service -n 10 --no-pager
269s Dec 25 16:13:20 ci-359-77591125 (s_server)[3766]: pdns.service: Failed to set up IPC namespacing: Resource temporarily unavailable
269s Dec 25 16:13:20 ci-359-77591125 (s_server)[3766]: pdns.service: Failed at step NAMESPACE spawning /usr/sbin/pdns_server: Resource temporarily unavailable
269s Dec 25 16:13:21 ci-359-77591125 (s_server)[3852]: pdns.service: Failed to set up IPC namespacing: Resource temporarily unavailable
269s Dec 25 16:13:21 ci-359-77591125 (s_server)[3852]: pdns.service: Failed at step NAMESPACE spawning /usr/sbin/pdns_server: Resource temporarily unavailable
269s Dec 25 16:13:23 ci-359-77591125 (s_server)[3876]: pdns.service: Failed to set up IPC namespacing: Resource temporarily unavailable
269s Dec 25 16:13:23 ci-359-77591125 (s_server)[3876]: pdns.service: Failed at step NAMESPACE spawning /usr/sbin/pdns_server: Resource temporarily unavailable
269s Dec 25 16:13:24 ci-359-77591125 (s_server)[3886]: pdns.service: Failed to set up IPC namespacing: Resource temporarily unavailable
269s Dec 25 16:13:24 ci-359-77591125 (s_server)[3886]: pdns.service: Failed at step NAMESPACE spawning /usr/sbin/pdns_server: Resource temporarily unavailable
269s Dec 25 16:13:25 ci-359-77591125 (s_server)[3915]: pdns.service: Failed to set up IPC namespacing: Resource temporarily unavailable
269s Dec 25 16:13:25 ci-359-77591125 (s_server)[3915]: pdns.service: Failed at step NAMESPACE spawning /usr/sbin/pdns_server: Resource temporarily unavailable
269s ++ mktemp
269s + TMPFILE=/tmp/tmp.jah1Y5TJIa
269s + trap cleanup EXIT
269s + tee /tmp/tmp.jah1Y5TJIa
269s + sdig 127.0.0.1 53 smoke.pgsql.example.org A
279s Fatal: Timeout waiting for data
279s + grep -c '127\.0\.0\.222' /tmp/tmp.jah1Y5TJIa
279s 0
279s + echo smoke.pgsql.example.org could not be resolved
279s smoke.pgsql.example.org could not be resolved
279s + exit 1
279s + cleanup
It would seem that the host runs out of IPC space? Does it run more tests in parallel than other workers, or so? I wouldn't know what to do about this, it's not really under the control of src:pdns. Chris
Hi, What is IPC space? And when does a host run out of it? As I said, this is one of our most powerful hosts, so I would expect it to run out of things last. Yes, this host (like most of our hosts, but a bit more) runs multiple lxc based debci workers. Well, maybe check for it and fail gracefully? Or, as of a couple of days ago, if qemu VMs don't run out of IPC space, we could always run them in qemu. Paul
https://manpages.debian.org/bookworm/manpages/sysvipc.7.en.html
https://manpages.debian.org/bookworm/manpages/ipc_namespaces.7.en.html

anything special, the limits are probably shared with the whole host. kernel.shmmax and kernel.msgmax are, I think, the limits (but I'm not entirely sure). But how? systemd sets up the IPC namespace. I imagine a fully separated VM would not run out of IPC space, indeed. Chris
Hi,

Can you figure out decent numbers for these? Below I printed the output of lsipc and AFAICT SHMMAX is already pretty big ;) (and the same on all our hosts, which is also true for MSGMAX). On the other hand, $(ipcs -a) doesn't show anything on the host, not even if I let it run in a while-loop (1 second interval) while I schedule the test of pdns.

So, could this be a bug in systemd (which you claim below should be handling this), or is this just not really supported in lxc and do you need a full VM? Because it works elsewhere, it feels more like a bug to me, and it would not be the first instance where code fails to properly handle 64 cores or 256GB of RAM.

Exit with 77 when you detect problems and add the skippable restriction.

I just ran the test in qemu on ci-worker13 and it PASSed.

Paul

root@ci-worker13:~# lsipc
RESOURCE DESCRIPTION                                              LIMIT USED  USE%
MSGMNI   Number of message queues                                 32000    0 0.00%
MSGMAX   Max size of message (bytes)                                 8K    -     -
MSGMNB   Default max size of queue (bytes)                          16K    -     -
SHMMNI   Shared memory segments                                    4096    0 0.00%
SHMALL   Shared memory pages                       18446744073692774399    0 0.00%
SHMMAX   Max size of shared memory segment (bytes)                  16E    -     -
SHMMIN   Min size of shared memory segment (bytes)                   1B    -     -
SEMMNI   Number of semaphore identifiers                          32000    0 0.00%
SEMMNS   Total number of semaphores                          1024000000    0 0.00%
SEMMSL   Max semaphores per semaphore set.                        32000    -     -
SEMOPM   Max number of operations per semop(2)                      500    -     -
SEMVMX   Semaphore max value                                      32767    -     -
Hi, can you confirm two additional things please: 1) does this happen only on the large host? 2) does this also happen with other packages requesting the same settings from systemd, e.g. dnsdist or pdns-recursor? Chris
Hi, https://ci.debian.net/packages/p/pdns/testing/s390x/41650331/ Seems it happens on our s390x host too (which has 10 debci workers running in parallel). https://ci.debian.net/packages/d/dnsdist/ -> Page not found. pdns-recursor seems to be flaky as well on amd64 and all passing tests were on one of the smaller hosts. pdns-recursor passes on s390x though. Paul
For now I've added the exit 77 hack in the pdns tests, but this is quite unsatisfying. I've opened an issue with systemd upstream, maybe someone there has any insight: https://github.com/systemd/systemd/issues/31037 Chris
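The "exit 77" workaround mentioned above could look roughly like the following. This is a minimal sketch, not the actual change made to the pdns tests: the function names `ipc_namespace_failed` and `restart_or_skip` are hypothetical, and detecting the condition by matching the journal message from the failing log is an assumption. The `service` and `journalctl` invocations are the ones visible in the debci log.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the journal text on stdin contains
# the IPC namespacing error seen in the failing debci log.
ipc_namespace_failed() {
    grep -q 'Failed to set up IPC namespacing'
}

# Hypothetical wrapper around the service restart from the test script.
# Returns 0 if the restart worked, 77 if it failed because of the IPC
# namespace problem, and 1 for any other restart failure.
restart_or_skip() {
    unit="$1"
    if service "$unit" restart; then
        return 0
    fi
    if journalctl "_SYSTEMD_UNIT=${unit}.service" -n 10 --no-pager \
        | ipc_namespace_failed; then
        echo "IPC namespaces do not work on this host, skipping" >&2
        return 77
    fi
    return 1
}

# Usage in a test script (commented out so the sketch has no side effects):
# restart_or_skip pdns || exit $?
```

Keeping the detection in its own function makes it possible to check the matching logic against a saved log without restarting any service.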
We believe that the bug you reported is fixed in the latest version of pdns, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is attached.

Thank you for reporting the bug, which will now be closed. If you have further comments please address them to 1059995@bugs.debian.org, and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Chris Hofstaedtler <zeha@debian.org> (supplier of updated pdns package)

(This message was generated automatically at their request; if you believe that there is a problem with it please contact the archive administrators by mailing ftpmaster@ftp-master.debian.org)

Format: 1.8
Date: Sun, 21 Jan 2024 12:11:54 +0100
Source: pdns
Architecture: source
Version: 4.8.3-3
Distribution: unstable
Urgency: medium
Maintainer: pdns packagers <pdns@packages.debian.org>
Changed-By: Chris Hofstaedtler <zeha@debian.org>
Closes: 1059995
Changes:
 pdns (4.8.3-3) unstable; urgency=medium
 .
   * tests: Abort if IPC namespaces do not work (Closes: #1059995)
Checksums-Sha1:
 8c2bccfdaa7d5cd7df2f56f350d4f227bba3a10a 3628 pdns_4.8.3-3.dsc
 d147b0d0266ef6d023bf57bf76b078a0244ebfba 46680 pdns_4.8.3-3.debian.tar.xz
 981ede89dd308b2365578212e54a66e8e869b6a0 23977 pdns_4.8.3-3_arm64.buildinfo
Checksums-Sha256:
 cef2b8f66c6e1d11c8c37c71b37fdd771289f163da38aefc9fa40a452b83054b 3628 pdns_4.8.3-3.dsc
 d8c886849592c63333edea3862a2ae1822b1d48d3c16df95be6882494b1b3ee9 46680 pdns_4.8.3-3.debian.tar.xz
 fb8465f38df8c52a8296e3df14f8d3baa01224cb275b1a555c211650bae06bad 23977 pdns_4.8.3-3_arm64.buildinfo
Files:
 e2d3325bc3c02f459a4c4a34c5c3eace 3628 net optional pdns_4.8.3-3.dsc
 236d73448237fe0b97442acbd08a9212 46680 net optional pdns_4.8.3-3.debian.tar.xz
 09d4ef90c99aa450830547b3c99b0c89 23977 net optional pdns_4.8.3-3_arm64.buildinfo

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEfRrP+tnggGycTNOSXBPW25MFLgMFAmWtO8QACgkQXBPW25MF
LgPSShAAhyMRtpttafOU9s0UUd5hLrPA3+OiUAGOlW3Moagcp2C8xJ66d6AYZm+0
hUjI7AKnbSxz/2fKiWCJ+rYKdrYUOJJYgbR5UsC0pL+cl3g3vfrQYlzxzDyALVYb
N5taaUIExakNWkDjIEpGe5xQo3jma1sojOODPDRutGIQm9A7GMiSNwddkJhdvI9+
DlNiuzLWJMrUamv7Kw2396vYw+omcyk1TufpOsGHb6ki0K8ZCgjmysayArnLywzX
JCsqqFjUBystVQP8+MhcPDBVMJtz+I0aUnH0Jtg3/Sdq4+w21A+iwoOahtWOE4Iv
Bq6QPIi3xHwWojYQXBdDWDe0UlwTljwQhTNPnEAsym2YYck1296YAX8vrGS6wJuW
sTvhSAO1UgJPC5fL6X+DWDoWZ1R9heVnP1k9y3f2eTF/+BljWWMKtAy0ESRgsFfd
3yhYPC9GF3HlVcDXleZ/2ZeOX1YQW8u/hTqCEjTdyrakAXROH+KEzY6A9FAnBDuu
5VWJ/YDSunQbsk9Pfgn+APJjwqfmOtZUuWEZFTILlFfxRqmRZTdAzsxaI/g8P0p8
yu/lZxRPLBFCY6wprWdmgbs7TKMU3hO/Lt8rUD7FMRvj2dJF/2epHLc6oO1/5ujd
LbR7IVsBSONg2UXdItBc45Iz2WCdROqB81KiQgErUauyPb4RO1Q=
=LRsk
-----END PGP SIGNATURE-----
Hi Paul, * Paul Gevers <elbrus@debian.org> [240104 18:14]: Likely, but it is probably in systemd or in lxc or in apparmor or elsewhere. I see this "works", but now the tests fail after one try on the problematic worker and then are never retried. Can this please be fixed? Thanks, Chris
clone 1059995 -1
reopen -1
reassign -1 systemd
found -1 systemd/254.3-1
forwarded -1 https://github.com/systemd/systemd/issues/31037
thanks

Dear systemd Packagers,

Paul Gevers noted that src:pdns's autopkgtests fail every so often on a large amd64 debci worker and on s390x workers. Apparently a similar problem can be seen in src:pdns-recursor's debci runs.

As there is no pdns(-recursor) code running at this point, this seems to be a problem somewhere in the space of systemd <> lxc <> apparmor <> kernel.

I've opened a bug with systemd upstream, unfortunately with very little info as I don't know how to provide additional info from within a debci run. Help with providing additional info would be very welcome.

Thanks,
Chris
Hi zeha, What do you have in mind? I think you need to wait until issue 166 [1] is fixed, which I guess isn't going to happen soon. Paul [1] https://salsa.debian.org/ci-team/debci/-/issues/166
* Paul Gevers <elbrus@debian.org> [240126 22:25]: 166 seems like an option, or auto-retry on a different worker, if that's possible? Chris
Hi, The issue (or at least some issue) seems to be kernel related. Due to issues with the backports kernel on arm64, we had to revert to the bookworm kernel and now pdns fails on arm64 too. On ppc64el and riscv64 the test passes for the last two months, both run a newer kernel (backports or even sid). However, s390x also runs a backports kernel and the issue still exists there. Paul By the way, if you want to use "exit 77" when conditions are not met, you also need to set the skippable restriction on those tests, otherwise the exit code is used like any other.
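The interaction between exit code 77 and the skippable restriction described above can be illustrated with a small model. This is only a sketch of the described behaviour, not autopkgtest's own code: the function name `classify_result` and its arguments are made up for illustration, and other factors that influence a real result (such as output on stderr) are deliberately ignored.

```shell
#!/bin/sh
# Hypothetical model of how a test's exit status is interpreted.
# $1: exit status of the test script
# $2: "yes" if the test declares the skippable restriction, otherwise "no"
classify_result() {
    status="$1"
    skippable="$2"
    if [ "$status" -eq 0 ]; then
        echo PASS
    elif [ "$status" -eq 77 ] && [ "$skippable" = yes ]; then
        echo SKIP
    else
        # Without the restriction, 77 is treated like any other failure.
        echo FAIL
    fi
}

classify_result 77 yes   # prints SKIP
classify_result 77 no    # prints FAIL
classify_result 0 no     # prints PASS
```

In a package's debian/tests/control, the restriction is declared by listing skippable in the Restrictions field of the affected test stanza.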
We believe that the bug you reported is fixed in the latest version of pdns, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is attached.

Thank you for reporting the bug, which will now be closed. If you have further comments please address them to 1059995@bugs.debian.org, and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Chris Hofstaedtler <zeha@debian.org> (supplier of updated pdns package)

(This message was generated automatically at their request; if you believe that there is a problem with it please contact the archive administrators by mailing ftpmaster@ftp-master.debian.org)

Format: 1.8
Date: Mon, 26 Feb 2024 02:41:58 +0100
Source: pdns
Architecture: source
Version: 4.8.3-4
Distribution: unstable
Urgency: medium
Maintainer: pdns packagers <pdns@packages.debian.org>
Changed-By: Chris Hofstaedtler <zeha@debian.org>
Closes: 1059995
Changes:
 pdns (4.8.3-4) unstable; urgency=medium
 .
   * tests: mark tests as skippable for exit code 77 (Closes: #1059995)
Checksums-Sha1:
 e49ae0e2bd3ca08ce5ee23bc6a49237d56da608d 3628 pdns_4.8.3-4.dsc
 2ff861b5ff0f7739dbde4034599d6ea86a7d0a83 46692 pdns_4.8.3-4.debian.tar.xz
 6319296b7bcf4d9b4ee2e91057e305d6bebc3111 24501 pdns_4.8.3-4_arm64.buildinfo
Checksums-Sha256:
 fa328a3df9e85c2069b4d3b1ec39ac5cce1369ed284aa9067b73efdb108d8ed8 3628 pdns_4.8.3-4.dsc
 e7da8f9266178d78ffcf73c88f996ff5d2afcccefbb13bf7ad27356ffe3b9b31 46692 pdns_4.8.3-4.debian.tar.xz
 4ece8c33f3a8f86d8f87143bf21ab70c7a28e7d3680d702e838fb391d87b4c76 24501 pdns_4.8.3-4_arm64.buildinfo
Files:
 6fb6d41166e6081b634b8a127649de22 3628 net optional pdns_4.8.3-4.dsc
 1dea1ba3ff90f0d0f70e8c6ad77e6300 46692 net optional pdns_4.8.3-4.debian.tar.xz
 090c7f71e74e251f4ee06072169acf46 24501 net optional pdns_4.8.3-4_arm64.buildinfo

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEfRrP+tnggGycTNOSXBPW25MFLgMFAmXb7qkACgkQXBPW25MF
LgParBAAi6V3izCpDEOcse3KwI2mAya6ZSLOK16+5vSuffk8KYXKhOnNi9osc5eM
QAUw8F/CvVzLG+SUeUaLUgZTb/mHo1Is/1WoEKMYKVNKUvn//Ze5W32rcUjMrC35
EYAUNWfdEVyrBnP8b08KzEYCagA+Mo816iW02fymEpaGXyYact3YywuEYBWvtOnP
BZ7YZ9WZqACEm+C3Tn4LT4EvdVjCG72iw7h8RXWRwiPY+x7+5qbUnYJ/1lA7cao1
qMtYdJre2miYfSyAkSkuL16dVsGXcyOErYDFbmJvoOppEoUHgM7iZA2Vob2jLA29
hNU4MO9+M9dwZ6TT8FYkEJjf/CsgjxZF9ePMczL9dD6undDLT6idKY+7yaQ9eQiV
cQCWgHGOA4TC6qikqKp138o5+hX4tCR/2IRNPkPCPkl79DIHTiikBLyyLiWy4oQj
gvXzoFArhSf/NTWEK9zdy59Q9T+fRZavvrR+hmMvkUfIoadnUfEP86zPEiLmltcL
NCt8hVBs0D518xdSk/M0WF86SOrdndowQZFpHWmVd4OEMZnoYqFcpPk/uvLU8Sdl
ii/tI5ywtqY5k3MjTyLd9OSBVvzWp1S+fDj++zxt4TBSgtM4tQbAzk/ozWSyGSw8
F5rE+a90NnMqxT7/yluBu2z0cJRHMU2i+y03XGrNr7hnufct0XE=
=/c4f
-----END PGP SIGNATURE-----