#1137004 trixie-pu: package zfs-linux (pre-approval)

#1137004#5
Date:
2026-05-18 11:48:00 UTC
From:
To:
I'd like to cherry-pick a number of upstream stable bug fixes to
trixie, targeting data consistency issues, crashes, panics and
deadlocks. Since this is the first proposed update of the trixie
version, the debdiff is quite large (~250KB). I'll go through the
patches below so that they are easier to review. Thank you for your
patience.

This update adds 35 cherry-picks from the upstream zfs-2.3-release
branch, on top of the 6 cherry-picks already shipped in 2.3.2-2.
Roughly half are under 20 lines each. The largest individual patches
are 0030/0031/0034 (each 100-235 lines, most of which is tests).

The full zfs-2.3.2..zfs-2.3.7 range contains ~350 commits. Only small
commits that fix real issues were considered. Some test hunks were
dropped to avoid depending on commits that are not strictly required.


Best regards,
Aron
-----------------------------
It might be easier to review through salsa's web interface:
https://salsa.debian.org/zfsonlinux-team/zfs/-/commit/059671d2629716520c47c7c7df2f349945e42713

Notation: each patch line shows the diffstat as [Nf +X/-Y] meaning
N files changed, X insertions, Y deletions. The /-Y part is omitted
when a patch has no deletions (e.g. [2f +9]).
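
For scripted review, the notation can be parsed mechanically. This is
a minimal sketch in Python, assuming the deletion count is simply
left out when a patch removes nothing (as in [2f +9] below). The
name parse_diffstat is made up for illustration.

```python
import re

# Matches "[Nf +X/-Y]" and the short form "[Nf +X]" (no deletions).
DIFFSTAT_RE = re.compile(r"\[(\d+)f \+(\d+)(?:/-(\d+))?\]")


def parse_diffstat(line):
    """Return (files, insertions, deletions) from a patch line, or None."""
    m = DIFFSTAT_RE.search(line)
    if m is None:
        return None
    files, added, removed = m.groups()
    # A missing deletion count is treated as zero deletions.
    return int(files), int(added), int(removed or 0)


print(parse_diffstat("0028  draid: fix data corruption after disk clear     [8f +74/-16]"))
print(parse_diffstat("0022  Synchronize the update of feature refcount      [2f +9]"))
```

Summing the second and third fields over all patch lines gives a
rough size estimate for each category.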

Patches by category:

Data corruption / on-disk consistency:
  0013  Fix off-by-one bug in range tree code           [1f +1/-1]
        Range tree could report false overlaps. One-line fix.  (b9324a1e7)
  0023  BRT: Fix ranges to blocks conversion math       [1f +1/-1]
        Missing parentheses caused memory corruption on vdevs >64TB
        when block cloning was used. One-line fix. (19b9d9397)
  0028  draid: fix data corruption after disk clear     [8f +74/-16]
        Cleared faulted disk + detached spare path could corrupt
        other still-attached dRAID spares; observable via scrub
        cksum errors / data loss in multi-spare scenarios.
        (b8cc4c504; 4 of the 8 files are tests/)
  0029  draid: fix import failure after disks repl.     [1f +4/-2]
        ASIZE-rounding issue meant replacing dRAID disks with a
        slightly smaller disk could prevent pool import. (b5d344cf5)
  0030  draid: allow seq resilver reads from degraded   [6f +162/-35]
        Previous check was too strict and could skip valid replicas
        during sequential resilver, leading to reconstruction
        failures. (ed932ff54; 4 of the 6 files are tests/, including
        a new redundancy_draid_degraded1.ksh)
  0031  draid: fix cksum errors w/ degraded disks       [8f +235/-19]
        With more than nparity disks faulted, only the first nparity
        were marked faulted; spare rebuilds for the others did not
        track properly and later scrubs saw cksum errors.
        (6741f501e; 3 of the 8 files are tests/, including a new
        redundancy_draid_degraded2.ksh)
  0034  Fix read corruption after block clone+truncate  [7f +160/-1]
        copy_file_range over a recent truncate could cause subsequent
        reads to return holes instead of the cloned data. Triggers
        under high I/O (compilation workloads).
        (dceca0d4a; the dbuf.c fix itself is 6/+2/-1; the bulk is a
        new clone_after_trunc.c test binary + ksh test.)
  0035  Prevent range tree corruption race (dnode_sync) [4f +87/-45]
        Race in zfs_range_tree_walk caused stale reads / range-tree
        inconsistency in sync context. (84fbeba11)
  0038  Fix redundant declaration of dsl_pool_t         [1f +5/-6]
        Small cleanup. Pulled in solely to make 0039 build: 0039
        uses 'dp' which upstream had moved to the top of
        vdev_rebuild_thread in this commit. (dbf4e74e5)
  0039  Fix rare cksum errors after rebuild             [2f +10/-1]
        Race in vdev_rebuild_thread re-enabled the metaslab before
        the txg with rebuilt ranges was synced, allowing new
        allocations to interfere. Adds a txg sync wait. (ffdedd441)
  0040  Initialize vr_last_txg for rebuild              [1f +4/-1]
        Companion to 0039: avoid spurious txg_wait_synced on empty
        first metaslab. (c2673ffb7)
  0041  Fix vdev_rebuild_range() tx commit              [1f +3/-1]
        Ordering bug: child zio could be added after txg_sync had
        waited. (b652eb69e)
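
To illustrate the class of bug behind 0013 (this is not the upstream
code): with half-open ranges [start, end), two ranges that merely
touch do not overlap, and an off-by-one in the comparison reports
them as overlapping. A minimal Python sketch, assuming half-open
ranges and using hypothetical function names:

```python
def overlaps(a_start, a_end, b_start, b_end):
    """Correct check for half-open ranges [start, end)."""
    return a_start < b_end and b_start < a_end


def overlaps_off_by_one(a_start, a_end, b_start, b_end):
    """Buggy variant: '<=' treats touching ranges as overlapping."""
    return a_start <= b_end and b_start <= a_end


# [0, 10) and [10, 20) are adjacent, not overlapping.
print(overlaps(0, 10, 10, 20))             # False
print(overlaps_off_by_one(0, 10, 10, 20))  # True (a false overlap)
```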

Crash / UAF / panic:
  0012  Fix null deref in spa_vdev_remove_cancel_sync   [1f +3/-4]
        ms_sm may be NULL; don't dereference it. (64e77fdf3)
  0022  Synchronize the update of feature refcount      [2f +9]
        Concurrent feature_sync() could panic from an unprotected
        refcount update. (8e7a31086)
  0024-0026  HIGHMEM kmap API violation trio
        Three related fixes: ZFS assumed multiple pages can be
        kmap'd at once and ignored required LIFO ordering. Crashes
        and possible memory corruption on 32-bit HIGHMEM systems
        and on x86_64 under PaX KERNSEAL.
        (0dcb88203 [1f +15/-2], 445879656 [1f +8/-8],
         4f77b3013 [1f +6/-4])
  0036  dmu_direct: avoid UAF in dmu_write_direct_done  [1f +7/-1]
        Direct I/O error path dereferenced freed dsa->dsa_tx.
        The fix saves it in a local variable before freeing dsa.
        (a188a58d5)
  0037  Fix 'kernel BUG at mm/usercopy.c'               [1f +10/-3]
        zfs_uiomove() returned wrong errno on short copy, causing
        panic when a cgroup-OOM-killed process was doing ZFS I/O.
        (748d0525e)

Deadlocks / leaks / NULL handling:
  0011  dmu_objset_hold_flags rele on error             [1f +1/-1]
        Reference leak on error path. (25ad9ce69)
  0014  linux/zvol_os: don't try disk ops on alloc fail [1f +4/-2]
        NULL deref of zvo_disk on gendisk alloc failure. (04493ca81)
  0019  Skip dbuf_evict_one for reclaim thread          [6f +46/-1]
        Deadlock when kswapd entered dbuf eviction and tried to take
        a dbuf hash lock already held. (c405a7a35)
  0027  Fix deadlock on dmu_tx_assign from vdev_rebuild [3f +6/-7]
        vdev_rebuild held spa_config_lock as writer while waiting
        for txg, but txg_sync also wanted spa_config_lock; rebuild
        could hang indefinitely. (a97fba427)
  0032  fix memleak in spa_errlog.c                     [1f +1/-1]
        (8e21c8856)
  0033  Fix s_active leak in zfsvfs_hold                [1f +1]
        Permanently leaks the VFS superblock s_active ref, leaving
        the pool unexportable (EBUSY) until reboot. (a9358748c)

Other correctness:
  0007  Fix double spares for failed vdev               [4f +209/-4]
        ZED could attach two spares to one failed vdev when the
        replacement disk also failed during resilver. The 209-line
        figure is mostly a new auto_spare_double.ksh test (~160
        lines); the spa.c fix itself is ~45 lines. (4b014840e)
  0008  Fix race resilver wait vs offline/detach        [1f +8/-5]
        scn_state was cleared before vdev_dtl_reassess, so a
        follow-up offline/detach could fail with "no valid
        replicas". (101edf7ed)
  0009  spa: clear checkpoint info during retry         [1f +1]
        Cherry-pick from main. (1c11d3a54)
  0010  icp: explicit_memset() in gcm_clear_ctx         [1f +2/-2]
        Compiler may elide a plain memset of sensitive crypto
        state before free; harden by always using explicit_memset.
        (a4de1d38d)
  0015  vdev: skip faulting disks pending removal       [1f +4/-1]
        Race where vdev_remove_wanted being set after probe init
        caused a redundant fault+removal. (f292b0f14)
  0016  Set spa_final_txg in spa_unload                 [1f +6]
        Triggered an assertion about ms_defer tree on reboot/
        shutdown after dedup workloads. (bf4baee81)
  0017  zfs_log_write: callback on last itx only        [1f +5/-2]
        Write callback fired once per itx for split writes, making
        cleanup hard. (9c0f5bc18)
  0018  ZED: Fix device type detection and pool iter    [1f +36/-31]
        Hotplug events on partitioned spare devices were
        misidentified as l2arc. (0c928f7a3)
  0020  zvol: Fix blk-mq sync                           [3f +61/-20]
        zvol blk-mq path sent FLUSH and TRIM down the read code
        path instead of write, so sync writes were not actually
        sync. (0bb5950e7; 2 of the 3 files are test updates)
  0021  Fix two infinite loops if dmu_prefetch_max=0    [1f +4/-2]
        User-tunable foot-gun. (81ceee0cf)
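
The failure mode in 0021 is a generic one: a loop that advances by a
tunable amount never terminates when that tunable is 0. Below is a
hedged Python sketch of the pattern and one possible guard. All names
are hypothetical, and the actual upstream fix may be shaped
differently.

```python
def prefetch_chunks(length, max_chunk):
    """Split 'length' bytes into (offset, size) chunks of at most 'max_chunk'."""
    if max_chunk <= 0:
        # Guard: with max_chunk == 0 the loop below would never advance.
        return []
    chunks = []
    offset = 0
    while offset < length:
        step = min(max_chunk, length - offset)
        chunks.append((offset, step))
        offset += step
    return chunks


print(prefetch_chunks(10, 4))  # [(0, 4), (4, 4), (8, 2)]
print(prefetch_chunks(10, 0))  # []
```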