#742432 Malformed patch results in deletion of already-patched files

#742432#5
Date:
2014-03-23 17:35:53 UTC
From:
To:
(git)/home/ben/src/linux-3.2[linux-3.2.y]$ quilt push --fuzz=2
Applying patch x86-add-check-for-number-of-available-vectors-before-cpu-down.patch
patching file arch/x86/include/asm/irq.h
patching file arch/x86/kernel/irq.c
Hunk #1 succeeded at 222 (offset -40 lines).
patching file arch/x86/kernel/smpboot.c
patch: **** malformed patch at line 220:  	cpu_disable_common();

Patch x86-add-check-for-number-of-available-vectors-before-cpu-down.patch does not apply (enforce with -f)

# Oops, I need to fix diff addresses

(git)/home/ben/src/linux-3.2[linux-3.2.y]$ quilt push --fuzz=2
Applying patch x86-add-check-for-number-of-available-vectors-before-cpu-down.patch
can't find file to patch at input line 104
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|From: Prarit Bhargava <prarit@redhat.com>
|Date: Mon, 13 Jan 2014 06:51:01 -0500
|Subject: x86: Add check for number of available vectors before CPU down
|
|commit da6139e49c7cb0f4251265cb5243b8d220adb48d upstream.
|
|Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791
|
|When a cpu is downed on a system, the irqs on the cpu are assigned to
|other cpus.  It is possible, however, that when a cpu is downed there
|aren't enough free vectors on the remaining cpus to account for the
|vectors from the cpu that is being downed.
|
|This results in an interesting "overflow" condition where irqs are
|"assigned" to a CPU but are not handled.
|
|For example, when downing cpus on a 1-64 logical processor system:
|
|<snip>
|[  232.021745] smpboot: CPU 61 is now offline
|[  238.480275] smpboot: CPU 62 is now offline
|[  245.991080] ------------[ cut here ]------------
|[  245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250()
|[  246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out
|[  246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas
|[  246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14
|[  246.044451] Hardware name: Intel Corporation S4600LH ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
|[  246.057371]  0000000000000009 ffff88081fa03d40 ffffffff8164fbf6 ffff88081fa0ee48
|[  246.065728]  ffff88081fa03d90 ffff88081fa03d80 ffffffff81054ecc ffff88081fa13040
|[  246.074073]  0000000000000000 ffff88200cce0000 0000000000000040 0000000000000000
|[  246.082430] Call Trace:
|[  246.085174]  <IRQ>  [<ffffffff8164fbf6>] dump_stack+0x46/0x58
|[  246.091633]  [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0
|[  246.098352]  [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50
|[  246.104786]  [<ffffffff815710d6>] dev_watchdog+0x246/0x250
|[  246.110923]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
|[  246.119097]  [<ffffffff8106092a>] call_timer_fn+0x3a/0x110
|[  246.125224]  [<ffffffff8106280f>] ? update_process_times+0x6f/0x80
|[  246.132137]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
|[  246.140308]  [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0
|[  246.146933]  [<ffffffff81059a80>] __do_softirq+0xe0/0x220
|[  246.152976]  [<ffffffff8165fedc>] call_softirq+0x1c/0x30
|[  246.158920]  [<ffffffff810045f5>] do_softirq+0x55/0x90
|[  246.164670]  [<ffffffff81059d35>] irq_exit+0xa5/0xb0
|[  246.170227]  [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60
|[  246.177324]  [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70
|[  246.184041]  <EOI>  [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0
|[  246.191559]  [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0
|[  246.198374]  [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200
|[  246.204900]  [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30
|[  246.210846]  [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250
|[  246.217371]  [<ffffffff81646b47>] rest_init+0x77/0x80
|[  246.223028]  [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb
|[  246.229165]  [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e
|[  246.235787]  [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c
|[  246.242990]  [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc
|[  246.249610] ---[ end trace fb74fdef54d79039 ]---
|[  246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout
|[  246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter
|Last login: Mon Nov 11 08:35:14 from 10.18.17.119
|[root@(none) ~]# [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
|[  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
|[  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
|[  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
|
|(last lines keep repeating.  ixgbe driver is dead until module reload.)
|
|If the downed cpu has more vectors than are free on the remaining cpus on the
|system, it is possible that some vectors are "orphaned" even though they are
|assigned to a cpu.  In this case, since the ixgbe driver had a watchdog, the
|watchdog fired and notified that something was wrong.
|
|This patch adds a function, check_vectors(), to compare the number of vectors
|on the CPU going down and compares it to the number of vectors available on
|the system.  If there aren't enough vectors for the CPU to go down, an
|error is returned and propogated back to userspace.
|
|v2: Do not need to look at percpu irqs
|v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest
|    Priority Mode
|v4: Additional changes suggested by Gong Chen.
|v5/v6/v7/v8: Updated comment text
|
|Signed-off-by: Prarit Bhargava <prarit@redhat.com>
|Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-prarit@redhat.com
|Reviewed-by: Gong Chen <gong.chen@linux.intel.com>
|Cc: Andi Kleen <ak@linux.intel.com>
|Cc: Michel Lespinasse <walken@google.com>
|Cc: Seiji Aguchi <seiji.aguchi@hds.com>
|Cc: Yang Zhang <yang.z.zhang@Intel.com>
|Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
|Cc: Janet Morgan <janet.morgan@intel.com>
|Cc: Tony Luck <tony.luck@intel.com>
|Cc: Ruiv Wang <ruiv.wang@gmail.com>
|Cc: Gong Chen <gong.chen@linux.intel.com>
|Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|[bwh: Backported to 3.2: adjust context]
|Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|---
|diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
|index 0ea10f27..cb6cfcd 100644
|--- a/arch/x86/include/asm/irq.h
|+++ b/arch/x86/include/asm/irq.h
--------------------------
No file to patch.  Skipping patch.
1 out of 1 hunk ignored
can't find file to patch at input line 116
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
|index 22d0687..4207e8d 100644
|--- a/arch/x86/kernel/irq.c
|+++ b/arch/x86/kernel/irq.c
--------------------------
No file to patch.  Skipping patch.
1 out of 1 hunk ignored
patching file arch/x86/kernel/smpboot.c
Hunk #1 succeeded at 1278 (offset -34 lines).
Patch x86-add-check-for-number-of-available-vectors-before-cpu-down.patch does not apply (enforce with -f)

# So the first failure resulted in deleting the successfully
# patched files rather than rolling back the patch completely

#742432#10
Date:
2014-03-24 09:05:07 UTC
From:
To:
Control: severity -1 important

It seems to me that important is better suited. There might be data loss
but it's only when the patch is broken and it's usually for files that do
exist in multiple places...

(I'm doing this because I expect some delay until someone looks into this
and I don't think it warrants any release critical status.)

Cheers,

#742432#17
Date:
2014-05-08 15:28:17 UTC
From:
To:
Hello,

I tried to reproduce that bug with the attached test file, but failed.
I guess that it's because my way of breaking the patchfile is not
exactly the one you had. Could you provide more information about it,
please?

If you want to play a bit with the attached file, simply copy it in
the test directory of a quilt source tree and run "make check". I need
a way to reproduce the bug before I can work on it...

Thinking a bit more about this issue and reading the source code, I
think that it could be a bug in patch itself, actually. Restoring the
files when the patch does not apply is not something we do in quilt,
actually. It would be interesting to try applying the patch manually
and see what happens.

Thanks for reporting the issue and sorry for the delay.
Mt.

#742432#22
Date:
2014-05-09 00:53:03 UTC
From:
To:
I'm also failing to construct a patch that demonstrates this bug, but as
I hit it several times before filing the bug, I assume I will hit it
again and I will attach the example files then.

Ben.

#742432#27
Date:
2014-05-11 07:01:04 UTC
From:
To:
Thanks for your help. I'm waiting for your feedback, then.
Mt

#742432#32
Date:
2014-07-30 20:13:27 UTC
From:
To:
I'm attaching a patch and the files to be patched to reproduce this bug.

The patch touches 5 files and the hunk for the 5th is malformed.  The
other 4 files are deleted when you try to apply it with quilt.

I tried simplifying the patch to touch fewer files, leaving a malformed
hunk for the last file, but that made the bug go away.

Ben.