(git)/home/ben/src/linux-3.2[linux-3.2.y]$ quilt push --fuzz=2
Applying patch x86-add-check-for-number-of-available-vectors-before-cpu-down.patch
patching file arch/x86/include/asm/irq.h
patching file arch/x86/kernel/irq.c
Hunk #1 succeeded at 222 (offset -40 lines).
patching file arch/x86/kernel/smpboot.c
patch: **** malformed patch at line 220: cpu_disable_common();
Patch x86-add-check-for-number-of-available-vectors-before-cpu-down.patch does not apply (enforce with -f)
# Oops, I need to fix diff addresses
(git)/home/ben/src/linux-3.2[linux-3.2.y]$ quilt push --fuzz=2
Applying patch x86-add-check-for-number-of-available-vectors-before-cpu-down.patch
can't find file to patch at input line 104
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|From: Prarit Bhargava <prarit@redhat.com>
|Date: Mon, 13 Jan 2014 06:51:01 -0500
|Subject: x86: Add check for number of available vectors before CPU down
|
|commit da6139e49c7cb0f4251265cb5243b8d220adb48d upstream.
|
|Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791
|
|When a cpu is downed on a system, the irqs on the cpu are assigned to
|other cpus. It is possible, however, that when a cpu is downed there
|aren't enough free vectors on the remaining cpus to account for the
|vectors from the cpu that is being downed.
|
|This results in an interesting "overflow" condition where irqs are
|"assigned" to a CPU but are not handled.
|
|For example, when downing cpus on a 1-64 logical processor system:
|
|<snip>
|[ 232.021745] smpboot: CPU 61 is now offline
|[ 238.480275] smpboot: CPU 62 is now offline
|[ 245.991080] ------------[ cut here ]------------
|[ 245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250()
|[ 246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out
|[ 246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas
|[ 246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14
|[ 246.044451] Hardware name: Intel Corporation S4600LH ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
|[ 246.057371] 0000000000000009 ffff88081fa03d40 ffffffff8164fbf6 ffff88081fa0ee48
|[ 246.065728] ffff88081fa03d90 ffff88081fa03d80 ffffffff81054ecc ffff88081fa13040
|[ 246.074073] 0000000000000000 ffff88200cce0000 0000000000000040 0000000000000000
|[ 246.082430] Call Trace:
|[ 246.085174] <IRQ> [<ffffffff8164fbf6>] dump_stack+0x46/0x58
|[ 246.091633] [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0
|[ 246.098352] [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50
|[ 246.104786] [<ffffffff815710d6>] dev_watchdog+0x246/0x250
|[ 246.110923] [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
|[ 246.119097] [<ffffffff8106092a>] call_timer_fn+0x3a/0x110
|[ 246.125224] [<ffffffff8106280f>] ? update_process_times+0x6f/0x80
|[ 246.132137] [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
|[ 246.140308] [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0
|[ 246.146933] [<ffffffff81059a80>] __do_softirq+0xe0/0x220
|[ 246.152976] [<ffffffff8165fedc>] call_softirq+0x1c/0x30
|[ 246.158920] [<ffffffff810045f5>] do_softirq+0x55/0x90
|[ 246.164670] [<ffffffff81059d35>] irq_exit+0xa5/0xb0
|[ 246.170227] [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60
|[ 246.177324] [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70
|[ 246.184041] <EOI> [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0
|[ 246.191559] [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0
|[ 246.198374] [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200
|[ 246.204900] [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30
|[ 246.210846] [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250
|[ 246.217371] [<ffffffff81646b47>] rest_init+0x77/0x80
|[ 246.223028] [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb
|[ 246.229165] [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e
|[ 246.235787] [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c
|[ 246.242990] [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc
|[ 246.249610] ---[ end trace fb74fdef54d79039 ]---
|[ 246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout
|[ 246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter
|Last login: Mon Nov 11 08:35:14 from 10.18.17.119
|[root@(none) ~]# [ 246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
|[ 249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
|[ 246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
|[ 249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
|
|(last lines keep repeating. ixgbe driver is dead until module reload.)
|
|If the downed cpu has more vectors than are free on the remaining cpus on the
|system, it is possible that some vectors are "orphaned" even though they are
|assigned to a cpu. In this case, since the ixgbe driver had a watchdog, the
|watchdog fired and notified that something was wrong.
|
|This patch adds a function, check_vectors(), to compare the number of vectors
|on the CPU going down and compares it to the number of vectors available on
|the system. If there aren't enough vectors for the CPU to go down, an
|error is returned and propogated back to userspace.
|
|v2: Do not need to look at percpu irqs
|v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest
| Priority Mode
|v4: Additional changes suggested by Gong Chen.
|v5/v6/v7/v8: Updated comment text
|
|Signed-off-by: Prarit Bhargava <prarit@redhat.com>
|Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-prarit@redhat.com
|Reviewed-by: Gong Chen <gong.chen@linux.intel.com>
|Cc: Andi Kleen <ak@linux.intel.com>
|Cc: Michel Lespinasse <walken@google.com>
|Cc: Seiji Aguchi <seiji.aguchi@hds.com>
|Cc: Yang Zhang <yang.z.zhang@Intel.com>
|Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
|Cc: Janet Morgan <janet.morgan@intel.com>
|Cc: Tony Luck <tony.luck@intel.com>
|Cc: Ruiv Wang <ruiv.wang@gmail.com>
|Cc: Gong Chen <gong.chen@linux.intel.com>
|Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|[bwh: Backported to 3.2: adjust context]
|Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|---
|diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
|index 0ea10f27..cb6cfcd 100644
|--- a/arch/x86/include/asm/irq.h
|+++ b/arch/x86/include/asm/irq.h
--------------------------
No file to patch. Skipping patch.
1 out of 1 hunk ignored
can't find file to patch at input line 116
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
|index 22d0687..4207e8d 100644
|--- a/arch/x86/kernel/irq.c
|+++ b/arch/x86/kernel/irq.c
--------------------------
No file to patch. Skipping patch.
1 out of 1 hunk ignored
patching file arch/x86/kernel/smpboot.c
Hunk #1 succeeded at 1278 (offset -34 lines).
Patch x86-add-check-for-number-of-available-vectors-before-cpu-down.patch does not apply (enforce with -f)
# So the first failure resulted in deleting the successfully
# patched files rather than rolling back the patch completely