#989714 kdump-tools is broken out of the box

Package:
kdump-tools
Source:
kdump-tools
Description:
scripts and tools for automating kdump (Linux crash dumps)
Submitter:
Rich Ercolani
Date:
2024-02-02 07:06:03 UTC
Severity:
important
#989714#5
Date:
2021-06-11 07:43:17 UTC
From:
To:
Dear Maintainer,

(This part also applies to jessie/x86_64 and bullseye/x86_64, in addition to buster/x86_64.)

I installed kdump-tools to take a crash dump, rebooted, verified the crashkernel was configured,
triggered the problem I wanted to examine and a dump...the machine became entirely unresponsive
over ssh or local console (kind of expected) but didn't print any sign it was doing anything
like booting the crashkernel (bad).
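
A minimal sketch of the "verified the crashkernel was configured" step, assuming a Linux system that exposes /proc/cmdline, /sys/kernel/kexec_crash_size and /sys/kernel/kexec_crash_loaded. The helper names are made up for illustration and are not part of kdump-tools. Note that a non-zero reservation and a loaded crash kernel only show that kdump is armed; as this report describes, they do not show that the reservation is large enough for the crash kernel to boot.

```python
from pathlib import Path


def crashkernel_arg(cmdline):
    """Return the value of the last crashkernel= token, or None if absent.

    Assumption: when the parameter is repeated, the last occurrence wins.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            value = token[len("crashkernel="):]
    return value


def read_text_or_none(path):
    """Read a small procfs/sysfs file; return None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def kdump_status():
    """Collect what the running kernel reports about the crash kernel setup."""
    cmdline = read_text_or_none("/proc/cmdline") or ""
    return {
        "crashkernel": crashkernel_arg(cmdline),
        "reserved_bytes": read_text_or_none("/sys/kernel/kexec_crash_size"),
        "crash_kernel_loaded": read_text_or_none("/sys/kernel/kexec_crash_loaded"),
    }


if __name__ == "__main__":
    for key, value in kdump_status().items():
        print(f"{key}: {value}")
```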

I left it for 15 minutes, and nothing changed, so I hard rebooted it, and tried again, same result.

So I tried installing kdump-tools and then using echo 'c' | sudo tee /proc/sysrq-trigger on bullseye,
same outcome. Same on jessie/x86_64 (with manual configuration of crashkernel= in the grub config).

So I booted Ubuntu 20.04/x86_64 and tried this experiment, to make sure my expectations weren't
off-base - nope, works as expected.

So I looted part of the crashkernel= setting from the Ubuntu system (crashkernel=512M-:192M was theirs,
I used 384M-:192M) - no change. Tried 384M-:256M, and it worked. So I tried theirs verbatim, and it
also worked every time.

So maybe we need different defaults on at least x86_64 systems?

(I specify x86_64 because using 512M-:192M breaks crashkernel more on my i386 testbeds.)

- Rich

#989714#10
Date:
2024-02-02 06:53:53 UTC
From:
To:
Hi,

reviving this...

(Rich, sorry for double mail - my initial reply incorrectly replied to
your mail which didn't have 989714@bugs.debian.org anywhere to properly
tag... Hopefully this will get through better)

Rich Ercolani wrote on Fri, Jun 11, 2021 at 03:43:17AM -0400:

Also got bitten by this.
What's quite horrible is that when it happened on the real machine I
wanted to debug there was no sign it was doing anything -- the HDMI
screen setup probably didn't have time to happen on crash kernel to be
able to print anything, so even connecting a screen wouldn't help.

I also misread the 384M-:128M syntax as 384M@128M (the second number being
the offset in the latter case), so I tried to increase the first value, which
obviously had no impact... and then tried in a VM, at which point serial
works and it was clear enough, but the default experience was just
horrible, especially since the system never came back.
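
To make the difference between the two forms concrete, here is a small sketch that classifies a crashkernel= value as either the range form (<start>-[<end>]:<size>, where the range refers to system RAM) or the fixed form (<size>[@<offset>]). It is an illustration only: the function names are hypothetical, only the K/M/G suffixes are handled, and other variants (a trailing @offset on the range form, the ,high and ,low suffixes) are deliberately left out.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert '384M', '4G', '65536' and similar into a byte count."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def describe(value):
    """Classify a crashkernel= value.

    Returns ("range", [(ram_start, ram_end_or_None, reserve_size), ...])
    or      ("fixed", reserve_size, offset_or_None).
    """
    if ":" in value:
        rules = []
        for entry in value.split(","):
            ram_range, size = entry.split(":")
            start, _, end = ram_range.partition("-")
            rules.append((
                parse_size(start),
                parse_size(end) if end else None,  # open-ended range
                parse_size(size),
            ))
        return ("range", rules)
    size, _, offset = value.partition("@")
    return ("fixed", parse_size(size), parse_size(offset) if offset else None)


if __name__ == "__main__":
    print(describe("384M-:128M"))  # range form: the first number is a RAM threshold
    print(describe("384M@128M"))   # fixed form: the first number is the reservation
```

In the range form, raising the first number only changes which machines the rule applies to; the amount reserved is the number after the colon.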

We probably ought to add 'panic=30' (or some arbitrary time) to
KDUMP_CMDLINE_APPEND's default value.
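
As a rough sketch of that idea, the helper below appends panic=<seconds> to a kernel command line fragment unless a panic= setting is already present. The function name is hypothetical, and the example strings in the comments and checks are placeholders rather than the actual KDUMP_CMDLINE_APPEND default shipped by the package.

```python
def with_panic_timeout(append, seconds=30):
    """Return `append` with panic=<seconds> added, unless panic is already set.

    `append` stands for the contents of KDUMP_CMDLINE_APPEND; any value
    passed in here is a placeholder, not the packaged default.
    """
    tokens = append.split()
    if any(t == "panic" or t.startswith("panic=") for t in tokens):
        return append  # respect an existing administrator choice
    return " ".join(tokens + [f"panic={seconds}"])


if __name__ == "__main__":
    print(with_panic_timeout("nr_cpus=1 irqpoll"))
```

With a positive panic= value the kernel reboots that many seconds after a panic, so a crash kernel that itself fails to come up would at least not leave the machine hung indefinitely.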

I haven't tried with less memory, but I'd say we can make use of the
range syntax to provide bigger values when the system has more than a
few GB of RAM at least.
I can spend a bit of time to try in a VM with various values, but
something like crashkernel=512M-4G:192M,4G-64G:256M,64G-:384M is
probably sensible?
(lowest value coming from Ubuntu's settings; I would need to test how much
is needed for a system with 384MB, but I'd be reluctant to take half of
its RAM for crashkernel)
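
For reference, a sketch of how the proposed value would resolve for a few memory sizes, assuming the usual range semantics (start inclusive, end exclusive, first matching range wins). The function names are hypothetical, and the kernel's own idea of "system RAM" can differ somewhat from the nominal installed amount, so treat the boundaries as approximate.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
MiB, GiB = 1 << 20, 1 << 30

# The value proposed in this message.
PROPOSED = "512M-4G:192M,4G-64G:256M,64G-:384M"


def parse_size(text):
    """Convert '192M', '4G' and similar into a byte count."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def reserved_for(spec, system_ram):
    """Return the bytes a range-form crashkernel= spec reserves for system_ram.

    Returns 0 when no range matches, i.e. nothing would be reserved.
    """
    for entry in spec.split(","):
        ram_range, size = entry.split(":")
        start, _, end = ram_range.partition("-")
        low = parse_size(start)
        high = parse_size(end) if end else None  # open-ended upper bound
        if system_ram >= low and (high is None or system_ram < high):
            return parse_size(size)
    return 0


if __name__ == "__main__":
    for ram in (384 * MiB, 2 * GiB, 4 * GiB, 16 * GiB, 128 * GiB):
        reserved = reserved_for(PROPOSED, ram)
        print(f"{ram // MiB:>7} MiB RAM -> {reserved // MiB} MiB reserved")
```

Under these assumptions a 384MB machine falls below the first range and gets no reservation at all, which is the case flagged above as still needing a test.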

(Can't help about i386 though)

Thanks,
--
Dominique Martinet