#1069048 live-boot fails to DHCP on all NICs with link up

Package:
live-boot
Source:
live-boot
Submitter:
Thomas Goirand
Date:
2025-05-14 14:54:01 UTC
Severity:
normal
Tags:
#1069048#5
Date:
2024-04-15 13:58:31 UTC
From:
To:
Hi,

The current behavior of live-boot is to search 5 times for network
interfaces with the carrier link up. On each run, as soon as one
interface has link up, the script exits, leaving no time for other
NICs to come up and be picked up by a later run.

This only works if:
- one is lucky, or
- only the interfaces with DHCP have an actual Ethernet link.

When more than one interface has the link up, but only one is
connected to a DHCP server, booting can fail, depending on which
card gets the link first.

The attached patch changes the behavior: it makes sure that all cards
with a link that is up are reported in /conf/param.conf before
exiting, so that live-boot will try to get an IP address from
all cards with link up. Each card still has a 15-second timeout
(by default) to get an IP address from DHCP.
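
A minimal sketch of the idea, not the attached patch itself: collect every interface whose carrier reads 1 instead of stopping at the first one. The function name links_up and its optional directory argument are made up for illustration. Unlike the patch, this sketch returns interfaces in glob (alphabetical) order, not in the order in which the links came up.

```shell
#!/bin/sh
# Hypothetical helper: print all interfaces with carrier up, space-separated.
# The optional first argument overrides the sysfs directory (useful for tests).
links_up() {
    sysfs_net="${1:-/sys/class/net}"
    devices=""
    for dir in "$sysfs_net"/*; do
        [ -d "$dir" ] || continue
        iface="${dir##*/}"
        # Skip the loopback interface
        [ "$iface" = "lo" ] && continue
        # Reading carrier fails when the interface is administratively down
        carrier="$(cat "$dir/carrier" 2>/dev/null)" || continue
        if [ "$carrier" = "1" ]; then
            # Append, separating entries with a single space
            devices="${devices:+$devices }$iface"
        fi
    done
    printf '%s\n' "$devices"
}

# Possible use (assumption: the result is what ends up in DEVICES=):
#   DEVICES="$(links_up)"
```

The point is only that the scan keeps going after the first hit, so a NIC without a DHCP server behind it cannot hide the one that has it.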

We've tested this patch in production, in exactly such a failing case
(i.e. our 25 Gbit/s cards were detected first but were not connected
to a DHCP server, while the 1 Gbit/s cards that were supposed to carry
the network boot were never tried by live-boot). This patch fixed
things for us.

Please merge this patch if you feel like it's correct. I also would
like to have it fixed in Stable if possible (once I have the approval
from the team).

Cheers,

Thomas Goirand (zigo)

P.S: If one would like to test it, the easiest way is to build a
Debian live the normal way, then unpack the ramdisk with cpio with
something like this:
zstdcat <path-to-initrd> | cpio -idmv

Then recompress like this:
find . | cpio --create --format='newc' | zstd > <path-to-initrd>

If running an older version of Debian, replacing zstdcat with zcat and
zstd with "gzip -9" also works.

#1069048#10
Date:
2024-04-15 18:33:41 UTC
From:
To:
What is the behaviour if a computer has 5 NICs and all of them are linked
to a DHCP-served network?
Will all of them get a network configuration, or only the first one that
succeeds?

#1069048#15
Date:
2024-04-16 09:43:12 UTC
From:
To:
Hi Narcis,

The current behavior is that live-boot will attempt DHCP from the first
NIC that is detected with link up.

Cheers,

Thomas Goirand (zigo)

#1069048#20
Date:
2024-04-16 10:12:33 UTC
From:
To:
Hi Thomas,

I'm asking about your patch, in the case where no NIC fails:
will 5 linked NICs each get a network configuration from their respective
DHCP-served networks, or only the first one that succeeds?

#1069048#25
Date:
2024-04-16 15:51:05 UTC
From:
To:
Hi,

With my patch, the first NIC that gets an IP address from DHCP will be
used. All NICs will be tried one by one, with the default 15-second
timeout. The order of NICs stays the same as before, as in: the first
NIC that gets a link up will be tried first. So there's no regression
possible.
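
To illustrate the "one by one, first success wins" logic, here is a rough sketch. It is not the live-boot code: try_dhcp is a hypothetical stub standing in for the real DHCP client call, and DHCP_NICS is a made-up variable that simulates which NICs have a DHCP server behind them.

```shell
#!/bin/sh
# Stub for illustration only (hypothetical): "succeeds" for interfaces
# listed in DHCP_NICS and ignores the timeout argument.
try_dhcp() {
    case " $DHCP_NICS " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Try each device in the given order; print the first one that gets a lease
# and stop there. Returns non-zero if none of them succeeds.
first_configured() {
    timeout="$1"
    shift
    for dev in "$@"; do
        if try_dhcp "$dev" "$timeout"; then
            printf '%s\n' "$dev"
            return 0
        fi
    done
    return 1
}

# Example call:
#   DHCP_NICS="eth2" first_configured 15 eth0 eth1 eth2
```

With the default timeout, each NIC that is tried before the working one and gets no answer adds up to 15 seconds of waiting.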

Cheers,

Thomas Goirand (zigo)

#1069048#30
Date:
2024-04-16 18:34:29 UTC
From:
To:
"the first NIC that gets an IP address from DHCP will be used"

You mean the second NIC that gets an IP too, right?
And also the third one, yes?
And if ten NICs get an IP configuration from DHCP, they will all be
configured, yes?

"All NICs will be tried one by one, with the default 15 seconds timeout"

You mean all NICs will be tried even though the first one has already
succeeded, am I right?

#1069048#35
Date:
2024-04-17 10:35:42 UTC
From:
To:
Hi,

If there are 10 NICs with a working dhcpd, only one will be configured
(the first one), so that the live OS can fetch the squashfs. Whether
all 10 NICs end up configured with an IP address depends on what
you put in the live image. By default, I believe all will be configured
when the system is up, but *not* at the squashfs wget phase.

So, what I'm fixing with this patch is just the pre-wget phase, so that
it tries all NICs. When the first one succeeds, the scripts don't
attempt to get DHCP from another NIC at this stage. That is no different
from the previous behavior, though.

I hope you understood and I explained well enough this time! :)

Cheers,

Thomas Goirand (zigo)

#1069048#40
Date:
2024-04-17 11:32:05 UTC
From:
To:
Yes, you explained and I understood this time. Thank you for your patience.

Could you review this patch for the pre-wget phase, so that it considers
a NIC successful when it acquires a default gateway address?

This way, 10 active NICs of which only one is assigned a default
gateway will do the job.

Thank you.

#1069048#45
Date:
2024-04-29 10:12:43 UTC
From:
To:
Hi,

Narcis Garcia <debianbugs@actiu.net> wrote:
 > Could you review this patch for pre-wget phase, so it considers that a
 > NIC succeeds when it acquires default gateway address?

Checking whether a NIC has a default gateway is not the right way
to decide if that NIC should be used. There are some configurations
where it's fine to have *NO* default gateway. This is a
perfectly valid DHCP setup.

The only way to check if it worked is simply what's done right now:
check if dhclient gets an IP address. This part isn't even in the patch
itself. The only thing this patch does is list the cards with
the link up and pass them to the next step (i.e. DHCP), which this patch
doesn't touch (it's written properly already, and works with multiple
network interfaces in the DEVICES= variable in /conf/param.conf).

So there's IMO nothing more to do in this patch.

Cheers,

Thomas Goirand (zigo)

#1069048#50
Date:
2024-05-21 06:00:39 UTC
From:
To:
ping?

If nobody really cares about this bug, would it be ok to NMU the fix to
Unstable, so that I can later backport it to Bookworm?

Cheers,

Thomas Goirand (zigo)

#1069048#53
Date:
2024-11-12 11:44:44 UTC
From:
To:
Hello,

Bug #1069048 in live-boot reported by you has been fixed in the
Git repository and is awaiting an upload. You can see the commit
message below and you can check the diff of the fix at:

https://salsa.debian.org/live-team/live-boot/-/commit/68c46378782d037d426dd67d2ca0ad336c49deea
------------------------------------------------------------------------
d/changelog:
  * Non-maintainer upload.
  * Add fix to get DHCP from all nics, not only the first one seen with link
    up (Closes: #1069048).
------------------------------------------------------------------------

(this message was generated automatically)
-- 
Greetings

https://bugs.debian.org/1069048

#1069048#60
Date:
2024-11-12 12:06:24 UTC
From:
To:
We believe that the bug you reported is fixed in the latest version of
live-boot, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 1069048@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Thomas Goirand <zigo@debian.org> (supplier of updated live-boot package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@ftp-master.debian.org)

Format: 1.8
Date: Tue, 12 Nov 2024 12:42:59 +0100
Source: live-boot
Architecture: source
Version: 1:20240525.1
Distribution: unstable
Urgency: medium
Maintainer: Debian Live Maintainers <debian-live@lists.debian.org>
Changed-By: Thomas Goirand <zigo@debian.org>
Closes: 1069048
Changes:
 live-boot (1:20240525.1) unstable; urgency=medium
 .
   * Non-maintainer upload.
   * Add fix to get DHCP from all nics, not only the first one seen with link
     up (Closes: #1069048).
Checksums-Sha1:
 6a6ef947a36dfb3191999b51cd7db07e86b4633a 1847 live-boot_20240525.1.dsc
 65da6232f0197721d6c1019a12df66c08cf6536b 99468 live-boot_20240525.1.tar.xz
 b50a5bd0e962852078bac8828b85614393cbaf95 6950 live-boot_20240525.1_amd64.buildinfo
Checksums-Sha256:
 37bc9f3c4eacacc849fa6eaf532f40947077a0b1cb08253645e511c99ef014b5 1847 live-boot_20240525.1.dsc
 7df508aac65cf26d2191e28f876592ffb3aa23bd2d6015151812a45ae555f42e 99468 live-boot_20240525.1.tar.xz
 8ea67da99942aaeae5fb93299358d48db94a01584b282683239554929a614821 6950 live-boot_20240525.1_amd64.buildinfo
Files:
 82ecab980e73b2cc0fe156b0d06b4109 1847 misc optional live-boot_20240525.1.dsc
 19b85bb82c085267441b1a30192ebb9e 99468 misc optional live-boot_20240525.1.tar.xz
 265c98a0957af9c68e45df5e8c8daf29 6950 misc optional live-boot_20240525.1_amd64.buildinfo

#1069048#73
Date:
2025-05-14 14:41:29 UTC
From:
To:
Hi,

In our tests, it looks like this merge request:

https://salsa.debian.org/live-team/live-boot/-/merge_requests/50/

fixes the problem with slow interfaces that the patch for this bug
introduced.

Regards
Christoph

#1069048#78
Date:
2025-05-14 14:50:00 UTC
From:
To:
This concerns the bookworm version.

On 14.05.25 at 16:41, Christoph Martin wrote: