#1010608 openldap: Flaky test test063-delta-multiprovider

#1010608#5
Date:
2022-05-05 11:54:14 UTC
From:
To:
https://buildd.debian.org/status/fetch.php?pkg=openldap&arch=amd64&ver=2.5.12%2Bdfsg-1&stamp=1651720566&raw=0
https://tests.reproducible-builds.org/debian/rbuild/unstable/i386/openldap_2.5.11+dfsg-1.rbuild.log.gz

...
running defines.sh
Initializing server configurations...
Starting server 1 on TCP/IP port 9011...
Using ldapsearch to check that server 1 is running...
Using ldapadd for context on server 1...
Starting server 2 on TCP/IP port 9012...
Using ldapsearch to check that server 2 is running...
Starting server 3 on TCP/IP port 9013...
Using ldapsearch to check that server 3 is running...
Starting server 4 on TCP/IP port 9014...
Using ldapsearch to check that server 4 is running...
Using ldapadd to populate server 1...
Waiting 7 seconds for syncrepl to receive changes...
Using ldapsearch to read all the entries from server 1...
Using ldapsearch to read all the entries from server 2...
Using ldapsearch to read all the entries from server 3...
Using ldapsearch to read all the entries from server 4...
Comparing retrieved entries from server 1 and server 2...
Comparing retrieved entries from server 1 and server 3...
Comparing retrieved entries from server 1 and server 4...
Using ldapadd to populate server 2...
Using ldapsearch to read all the entries from server 1...
Using ldapsearch to read all the entries from server 2...
Using ldapsearch to read all the entries from server 3...
Using ldapsearch to read all the entries from server 4...
Comparing retrieved entries from server 1 and server 2...
Comparing retrieved entries from server 1 and server 3...
test failed - server 1 and server 3 databases differ
(exit 1)
make[4]: *** [Makefile:303: mdb-mod] Error 1

#1010608#12
Date:
2022-05-05 14:06:24 UTC
From:
To:
--On Thursday, May 5, 2022 3:54 PM +0300 Adrian Bunk <bunk@debian.org> wrote:

The test suite is heavily timing dependent.  If you're building in a
resource-constrained environment, you'll need to adjust the timers
accordingly.
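
For illustration only, here is a minimal sketch of scaling such a timer on a
slow builder. It assumes the test suite picks up SLEEP1 from the environment
and otherwise defaults to 7, which matches the "Waiting 7 seconds" line in the
log. SLEEP_FACTOR and scaled_sleep are hypothetical names invented for this
sketch, not part of OpenLDAP.

```shell
#!/bin/sh
# scaled_sleep BASE [FACTOR]: print BASE multiplied by FACTOR (default 1).
# Hypothetical helper, not part of the OpenLDAP test suite.
scaled_sleep() {
    base=$1
    factor=${2:-1}
    echo $((base * factor))
}

# Keep any SLEEP1 already set in the environment, else assume 7.
SLEEP1=${SLEEP1-7}

# SLEEP_FACTOR is a made-up knob; 3 is an arbitrary example value.
SLEEP1=$(scaled_sleep "$SLEEP1" "${SLEEP_FACTOR:-3}")
export SLEEP1
echo "SLEEP1=$SLEEP1"
```

A larger fixed delay makes a race less likely but does not remove it, so this
is a mitigation rather than a fix.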

#1010608#17
Date:
2022-05-06 20:04:54 UTC
From:
To:
Hi Adrian,

I'm afraid this link has been superseded by the new upload (which built
successfully & reproducibly). Just to confirm, you're saying that it
failed for the same reason as the amd64 build?

I looked at this script, and I think I see how this part might be
fragile: *if* I'm reading correctly, it waits for server 1 to receive
the changes, but then I think it proceeds with the comparison
immediately, and could fail if server 3 or 4 was slower.

https://git.openldap.org/openldap/openldap/-/blob/master/tests/scripts/test063-delta-multiprovider#L309-359

This is also different from the previous section (lines 264-294) which
waits a flat $SLEEP1 seconds (default: 7) for changes to be synced.
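
One possible direction, sketched here under assumptions and not validated
against the real script, is to retry the check for every server until it
passes or a limit is reached, instead of waiting on server 1 alone. The
wait_for helper below is hypothetical, and the command passed to it stands in
for whatever per-server check the test would use.

```shell
#!/bin/sh
# wait_for ATTEMPTS DELAY COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most ATTEMPTS times, sleeping DELAY
# seconds after each failure. Returns 0 on success, 1 if attempts run out.
# Hypothetical helper, not part of the OpenLDAP test suite.
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Stand-in usage: "true" replaces a real per-server check.
for n in 2 3 4; do
    wait_for 10 1 true || echo "server $n did not catch up"
done
```

The point of the design is that a slow server 3 or 4 gets its own bounded wait
before the comparison runs, while a fast run does not pay the full delay.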

However, I'm not comfortable proposing changes to the script if I can't
validate them. I could really use some help figuring out how to
reproduce this failure. I would need to have just server 3 or 4 affected
by some slowdown, and I'm not sure what kind: CPU, network, or
disk. I guess I'll start by seeing if I can use tc to add latency to
just the specific port...
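
For reference, a rough sketch of the tc idea, using a commonly cited
prio + netem + u32 filter recipe. It only prints the commands, since running
them needs root. The loopback device "lo", the 200ms delay, and the
netem_cmds helper name are assumptions for this sketch; port 9013 is server 3
in the log above. Note that the filter matches the destination port only, so
replies from the server would not be delayed.

```shell
#!/bin/sh
# netem_cmds PORT DELAY [DEV]: print (do not run) tc commands intended to
# delay packets sent to TCP/UDP destination PORT on DEV (default: lo).
# Hypothetical helper; the printed commands require root when executed.
netem_cmds() {
    port=$1
    delay=$2
    dev=${3:-lo}
    # Classful root qdisc so a single band can carry the delay.
    echo "tc qdisc add dev $dev root handle 1: prio"
    # Attach netem with the requested delay to band 3.
    echo "tc qdisc add dev $dev parent 1:3 handle 30: netem delay $delay"
    # Steer packets with the matching destination port into band 3.
    echo "tc filter add dev $dev protocol ip parent 1:0 prio 3 u32 match ip dport $port 0xffff flowid 1:3"
}

# Example: target only server 3.
netem_cmds 9013 200ms
```

The setup can be undone afterwards with "tc qdisc del dev lo root".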

thanks,
Ryan

#1010608#24
Date:
2022-05-06 20:13:02 UTC
From:
To:
Hi Ryan,

this was from the reproducible build log.

It is the same reason, except that it was "server 3" instead of "server 4"
in the "test failed" line.

cu
Adrian

#1010608#29
Date:
2022-12-28 21:32:30 UTC
From:
To:
Hi Ryan,

Then not running the script at all is an improvement over the current
situation. Flaky tests are bad. Until a better solution is found, how
about skipping the test?

Paul

#1010608#34
Date:
2022-12-29 23:02:22 UTC
From:
To:
Not ideal, but yeah, probably an improvement over shipping a flaky test
in stable. Thanks for the reminder, I'll try to upload it soon.

#1010608#39
Date:
2023-01-14 19:01:59 UTC
From:
To:
Control: severity -1 important

I have uploaded -3 with the flaky test disabled. I'm downgrading the
bug, but not closing it right now - I'm still optimistic about finding a
proper solution (likely not for bookworm, though).