Dear Maintainer,
When a daily expiration run (cron.daily/apt-cacher-ng -> acngtool maint)
aborts -- e.g. on a by-hash validation error -- apt-cacher-ng leaves behind
the per-run volatile-index snapshots it created under
<CacheDir>/_xstore/rsnap/. These snapshots are never cleaned up on the abort
path, so they accumulate, one per run, indefinitely. Once one of the
accumulated snapshots is itself inconsistent, it becomes a new permanent
abort trigger: every subsequent expiration trips over it during "Validating
cache contents" and aborts again. A single transient by-hash hiccup thus
wedges expiration permanently; cache pruning stops cache-wide and the only
escape is a manual "rm -rf _xstore/rsnap".
Environment
-----------
apt-cacher-ng 3.7.4-1+b2 (daemon reports version 3.7.4)
Debian GNU/Linux 12 (bookworm), amd64
ExAbortOnProblems = 1 (compiled default, unset in config)
Cache proxies amd64 repos (security.ubuntu.com, Ubuntu archive) plus arm64
(ports.ubuntu.com), so by-hash index sets rotate frequently and unevenly.
Steps to reproduce
------------------
1. Run apt-cacher-ng caching repos that use Acquire-By-Hash (any current
Debian/Ubuntu mirror), ideally serving more than one client architecture.
2. Let the daily expiration run over many days. Each run writes a fresh
snapshot per volatile index under
<CacheDir>/_xstore/rsnap/.../<dist>/<numeric-fid>.
3. Induce or wait for one by-hash inconsistency (an InRelease referencing a
by-hash entry that was only partially fetched). Expiration aborts:
    There were error(s) processing ports.ubuntu.com/ubuntu-ports/dists/noble-security/<fid>, ignoring...
    ByHash error at ports.ubuntu.com/ubuntu-ports/dists/noble-updates/InRelease
    Validating cache contents...
    Found errors during processing, aborting as requested.
Observed behaviour
------------------
- After the abort, _xstore/rsnap/ retains every run's snapshot. Here a
single dist (ports.ubuntu.com/.../noble-security) had 22 accumulated
snapshot files (one per run over ~2 weeks), all 126 KiB copies of the
same InRelease.
- The damaged snapshot was reported by its logical path (without the
_xstore/rsnap/ prefix), making it look like a live-cache fault when the
file only exists in the snapshot store.
- _exfail_cnt grew an entry per failed run; nothing was pruned cache-wide
for ~2 weeks while the proxy kept serving HTTP 200 (an otherwise silent
failure).
- Setting ExAbortOnProblems = 0 does NOT help: the by-hash validation step
still ends the run with "aborting as requested". Per the documentation
that option governs the index-update (preparation) step, so operators
cannot opt out of the abort for this path.
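The accumulation can be checked with a short helper. This is a minimal sketch, not part of apt-cacher-ng: the function name count_rsnap is made up for illustration, and it assumes snapshot paths contain no newlines. It lists each directory under _xstore/rsnap with its file count, largest first.

```shell
#!/bin/sh
# count_rsnap [CACHE_DIR]
# Hypothetical helper: tally regular files per directory under the
# snapshot store, largest count first.
# Usage: count_rsnap /var/cache/apt-cacher-ng
count_rsnap() {
    cache_dir=${1:-/var/cache/apt-cacher-ng}
    snap_dir="$cache_dir/_xstore/rsnap"
    if [ ! -d "$snap_dir" ]; then
        echo "no snapshot store at $snap_dir" >&2
        return 1
    fi
    # Strip the file name from each path to get its parent directory,
    # then count how often each directory occurs.
    find "$snap_dir" -type f | sed 's|/[^/]*$||' | sort | uniq -c | sort -rn
}
```

A directory whose count keeps rising from day to day would match the behaviour described above.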
Expected behaviour
------------------
1. Snapshot hygiene on abort: snapshots created for a run that aborts
should be cleaned up / not retained, so a failure cannot accumulate
state that guarantees future failures (a bounded or GC'd snapshot store).
2. Self-recovery: a damaged snapshot in _xstore/rsnap should be discarded
and regenerated, not treated as a hard abort clearable only by hand.
Optionally, on a by-hash validation error, expiration could skip and
re-fetch the single inconsistent index rather than aborting the entire run.
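To illustrate what a bounded snapshot store could look like from the operator's side, here is a minimal sketch that keeps only the newest N files in each snapshot directory. It is not how apt-cacher-ng manages snapshots: the function name prune_rsnap and the default of 3 are made up, file names are assumed to contain no newlines, and whether individual snapshots may safely be removed while the daemon is running has not been verified.

```shell
#!/bin/sh
# prune_rsnap SNAP_DIR [KEEP]
# Hypothetical helper: in every directory below SNAP_DIR, keep the KEEP
# most recently modified regular files and delete the older ones.
prune_rsnap() {
    snap_dir=$1
    keep=${2:-3}
    [ -d "$snap_dir" ] || return 0
    find "$snap_dir" -type d | while IFS= read -r dir; do
        # ls -t sorts newest first; -p marks subdirectories with a trailing
        # slash so grep can drop them; tail skips the first KEEP entries.
        ls -tp -- "$dir" | grep -v '/$' | tail -n "+$((keep + 1))" |
        while IFS= read -r name; do
            rm -f -- "$dir/$name"
        done
    done
}
```

For example, prune_rsnap /var/cache/apt-cacher-ng/_xstore/rsnap 3 would leave at most three snapshots per directory.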
Workaround
----------
Clearing the working state and stale volatile index metadata, then
re-running, turns the aborting run into a clean "Done.":
    cd /var/cache/apt-cacher-ng
    rm -rf _xstore/rsnap; rm -f _expending_damaged _exfail_cnt
    find . -type d -path '*/dists/*' -name by-hash -prune -exec rm -rf {} +
    find . -type f -path '*/dists/*' \( -name 'InRelease*' -o -name 'Release' \
        -o -name 'Release.gpg' -o -name 'Release.head' \
        -o -name 'Release.[0-9]*' \) -delete
    /usr/lib/apt-cacher-ng/acngtool maint -c /etc/apt-cacher-ng \
        SocketPath=/run/apt-cacher-ng/socket
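Because the two find commands delete data, it may be worth previewing their matches first. The sketch below reuses the same find expressions with -print in place of the destructive actions. The function name preview_workaround is made up for illustration.

```shell
#!/bin/sh
# preview_workaround [CACHE_DIR]
# Hypothetical helper: print what the workaround's find commands would
# remove, without deleting anything.
preview_workaround() {
    cache_dir=${1:-/var/cache/apt-cacher-ng}
    (
        cd "$cache_dir" || exit 1
        # by-hash directories under any dists/ tree (not descended into)
        find . -type d -path '*/dists/*' -name by-hash -prune -print
        # volatile release metadata under any dists/ tree
        find . -type f -path '*/dists/*' \( -name 'InRelease*' -o -name 'Release' \
            -o -name 'Release.gpg' -o -name 'Release.head' \
            -o -name 'Release.[0-9]*' \) -print
    )
}
```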
Impact
------
Cache expiration silently stops cache-wide after one transient by-hash
error. Disk usage grows without bound until someone notices (our cache held
~1.3 GB of stale metadata, which was pruned on the first clean run:
2.8 GB -> 1.5 GB), and recovery requires manual filesystem surgery.
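Since the proxy keeps answering requests while expiration is wedged, an external check is currently the only way to notice. The sketch below is one possible monitoring hook. The function name check_expiration_state is made up, and it rests on an assumption drawn only from the observation above that _exfail_cnt grew with each failed run: a non-empty _exfail_cnt is taken to mean that at least one recent run failed.

```shell
#!/bin/sh
# check_expiration_state [CACHE_DIR]
# Hypothetical helper: return non-zero when the cache directory shows
# signs of failed expiration runs.
# Assumption: a non-empty _exfail_cnt indicates at least one failed run.
check_expiration_state() {
    cache_dir=${1:-/var/cache/apt-cacher-ng}
    if [ -s "$cache_dir/_exfail_cnt" ]; then
        echo "WARNING: $cache_dir/_exfail_cnt is non-empty; expiration may be failing"
        return 1
    fi
    return 0
}
```

Run from cron or a monitoring agent, a non-zero exit status could raise an alert well before disk usage becomes a problem.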
Thanks for maintaining apt-cacher-ng.
Regards,
Matt Shirel