Dear Maintainer,
I recently discovered that logwatch did not report a systemd service (snapper-timeline.service) that failed repeatedly on my system. Looking through the script
/usr/share/logwatch/scripts/services/systemd
I determined that the script makes incorrect assumptions about what the log entries look like, or at least assumptions that do not hold for a service of type 'simple'.
The comments within the script explain the logic:
and a few lines further down:
The problem is that services of type 'simple' do not produce this log message when they fail, as this example shows:
$ journalctl -u snapper-timeline.service -S 2023-02-16 -U 2023-02-17
-- Journal begins at Fri 2022-12-02 07:02:42 CET, ends at Tue 2023-02-28 01:24:53 CET. --
Feb 16 03:05:29 HomeSrv systemd[1]: Started Timeline of Snapper Snapshots.
Feb 16 03:05:43 HomeSrv systemd-helper[2690]: running timeline for 'archive'.
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: running timeline for 'documents'.
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: running timeline for 'home'.
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: running timeline for 'photos'.
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: IO Error (.snapshots is not a btrfs subvolume).
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: timeline for 'photos' failed.
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: running timeline for 'root'.
Feb 16 03:05:45 HomeSrv systemd[1]: snapper-timeline.service: Main process exited, code=exited, status=1/FAILURE
Feb 16 03:05:45 HomeSrv systemd[1]: snapper-timeline.service: Failed with result 'exit-code'.
Because the log message "Unit {} entered failed state." never appears here, logwatch does not report the failure of the unit (snapper-timeline.service, in this example, is a service of type 'simple'). I would expect logwatch to catch and report such failures. I think the script should be changed to look for messages that actually appear for all unit types when a unit fails, such as the "Failed with result" line shown above.
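To illustrate the kind of matching I have in mind, here is a minimal sketch in Python. It is not a patch for the Perl script: the function name failed_units and both regular expressions are my own assumptions, written only against the message wording quoted in this report, and I have not checked them against other systemd versions.

```python
import re

# Hypothetical patterns based only on the messages quoted in this report;
# they are not taken from the logwatch systemd script.
FAILED_STATE = re.compile(
    r"systemd\[\d+\]: Unit (?P<unit>\S+) entered failed state\.$"
)
FAILED_RESULT = re.compile(
    r"systemd\[\d+\]: (?P<unit>\S+): Failed with result '(?P<result>[^']+)'\.$"
)


def failed_units(lines):
    """Count failure messages per unit, accepting either message form."""
    counts = {}
    for line in lines:
        line = line.rstrip()
        match = FAILED_RESULT.search(line) or FAILED_STATE.search(line)
        if match:
            unit = match.group("unit")
            counts[unit] = counts.get(unit, 0) + 1
    return counts


# The last two journal lines from the example above.
sample = [
    "Feb 16 03:05:45 HomeSrv systemd[1]: snapper-timeline.service: "
    "Main process exited, code=exited, status=1/FAILURE",
    "Feb 16 03:05:45 HomeSrv systemd[1]: snapper-timeline.service: "
    "Failed with result 'exit-code'.",
]

print(failed_units(sample))
```

With the two sample lines this counts one failure for snapper-timeline.service. If a systemd version were to log both message forms for the same failure, this sketch would count it twice, so a real fix would need to take that into account.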
Thanks and regards,
Timo