- Package:
- puppet-agent
- Source:
- puppet-agent
- Submitter:
- Hendrik Jaeger
- Date:
- 2026-03-05 22:49:01 UTC
- Severity:
- normal
Dear Maintainer,

* What led up to the situation?

I was trying to build an exclude list for my backups and went through the contents of my filesystems.

* What was the outcome of this action?

I noticed that there are reports of puppet runs in /var/cache/puppet/reports.

* What outcome did you expect instead?

I expected all data in /var/cache and its subdirectories to be regeneratable and not to contain any information one might want to back up. According to the FHS in https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s05.

Puppet cannot regenerate the report for a specific run. Also, "cache" usually refers to data that will be reused, which is not the case for these reports. /var/log seems a better fit for those.

In my concrete case, it seems suboptimal that these reports are in a directory that I would like to exclude from backups, because it should not contain anything worth backing up anyway: all data in there is supposed to be regeneratable, and these reports clearly are not. Under the "Rationale" this use case is even mentioned explicitly:

The argument has been made on IRC that usually reports are not stored locally anyway, but it seemed implied that the server would also store the reports in a directory named "cache", only outside the FHS, in /opt/puppetlabs/puppet/cache/reports, in the case of a non-Debian installation. I have no puppetserver installation with Debian on hand, so I don’t know how the Debian package would behave.

Another argument has been made that the reports are stored in puppetdb and are thus only stored temporarily as files on disk. IMHO that still wouldn’t make them "cache" data. "Temporary" data maybe, so in that case they should probably go to /var/tmp or /tmp. Or, as https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s14.html mentions:

Both of these arguments are kind of OK for a certain set of circumstances, but not everybody is running a puppetdb or even a puppetserver. I am running puppet standalone, i.e. with `puppet apply`, so the reports will not be transferred to the server and will not be consumed into/by puppetdb.

In any case, treating reports as "cached" data seems quite clearly wrong.

* In the case of standalone puppet (i.e. `puppet apply`), IMHO they are "logs" and should go to /var/log.
* In the case of a puppet-agent (i.e. a puppet client/agent connecting to a puppet server _without_ a puppetdb), they should probably not be saved on the client at all, but if they are, they are also "logs" IMHO and should be treated as mentioned above.
* On the server, they should also be treated like "logs" but not necessarily go to /var/log like machine-local log data. I don’t think I have a concrete sensible suggestion for this case. Maybe /var/lib.
* In the case of a puppetserver with a puppetdb, they should probably not be saved as files on the server at all. If they are not sent directly to the puppetdb from the puppetserver but are consumed later, they are probably "spool" data.
Hello,

I agree that the default of "/var/cache/puppet/reports" is perhaps not ideal. But instead of changing only "reportdir", we might want to change "vardir" from "/var/cache/puppet" to something like "/var/puppet".

I'm not sure that anything puppet puts inside "vardir" can really be qualified as "cache". I think perhaps the only reason it's that way is the naming choices made by upstream a long time ago.
That would break the FHS too, no? Perhaps /var/spool/puppet or /var/lib/puppet instead?
On 2024-09-04 at 10:45, Antoine Beaupré wrote:

I'm not sure if it would be wise to use the same path for both settings... The upstream docs describe the two parameters as follows:

vardir: Where Puppet stores dynamic and growing data.

libdir: An extra search path for Puppet. This is only useful for those files that Puppet will load on demand, and is only guaranteed to work for those cases. In fact, the autoload mechanism is responsible for making sure this directory is in Ruby's search path.

So maybe vardir could be /var/lib/puppet and libdir could be $vardir/ruby?
Hi

On my systems:

```
# puppet config print libdir
/var/cache/puppet/lib
```

But also, given what https://www.puppet.com/docs/puppet/8/dirs_vardir says about it, I think it being under /var/cache is correct: its content is really just cached content that will be recreated from the server, and it is supposed to be managed by the application, not the human. (But also: this is beside the point of this bug report.)

Which page are you looking at? https://www.puppet.com/docs/puppet/8/dirs_vardir says:

What I consider wrong, though, is putting reports there, as they cannot be regenerated, which is what the FHS demands of data put there. And this should probably be forwarded upstream.

(The same may be said about other things, like `bucketdir` and `clientbucketdir`, neither of which can be regenerated. But that should be a separate bug so as not to inflate this one!)

To make a concrete suggestion: set reportdir = $logdir/reports

What do you think?
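For illustration, the suggestion above could be tried locally with a puppet.conf fragment like the following. This is only a minimal sketch: it assumes the setting is placed in the `[main]` section, that the configuration file is the one the Debian package reads (commonly /etc/puppet/puppet.conf, which is an assumption here), and that `$logdir` is interpolated the same way `$vardir` is in the default value of `reportdir`.

```ini
# Sketch only: section and file location are assumptions, not package defaults.
[main]
# Store run reports next to the logs instead of under $vardir.
reportdir = $logdir/reports
```

The effective values can then be checked with `puppet config print reportdir` and `puppet config print logdir`.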
Hello,

On 2024-10-24 at 05:18, Hendrik Jaeger wrote:

directories are also relevant.

... but I'm still hesitant to carry this as an additional patch in Debian: it really should be implemented upstream, since the FHS is not Debian-specific at all.

Plus, if implemented, what do we do with the data in the previous path? Forcibly move it? Ignore it? Prompt the user?

It is worth noting that since the last puppet-agent upload, the reports feature defaults to disabled (no reports are generated).
Hoi

I agree that it should be, and I understand the hesitation, but as long as it isn’t, Debian should make it behave sanely.

If the package has not done anything with it, it shouldn’t start now. Especially because none of the options seem particularly good:

* move it: no, that would be a mess, and for what gain?
* ignore it: if that’s what has been done until now, continue doing that.
* prompt: no, this is IMHO not a good enough reason to force interaction.

IMHO it’s also not a good enough reason for a notification such as a NEWS.Debian entry. People who care about the reports will (or at least should) already be taking care of them and will notice when they are missing or in a new place. Or they will read it in the changelog. People who don’t care: well, they are already taken care of. Also: the reports have been in a directory that has always had the potential of being nuked for whatever reason. Other than mentioning it in the changelog, it should IMHO just be left at that.

Noted. Not sure how I feel about it, though … It’s probably better, since there doesn’t seem to be any handling of old reports anyway, so they would (slowly) fill up disk space if not handled specifically by the admin.

Cheers
henk
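As an illustration of what "handled specifically by the admin" could look like, here is a minimal shell sketch that prunes old report files. It is not something the package provides: the function name `prune_reports`, the 30-day retention in the usage line, and the assumption that reports are stored as `*.yaml` files below the report directory are all hypothetical choices for this example. It relies on the `-delete` action as provided by GNU find.

```shell
#!/bin/sh
# prune_reports DIR DAYS
# Delete report files (assumed to be *.yaml) under DIR whose modification
# time is more than DAYS days in the past. Hypothetical helper, sketch only.
prune_reports() {
    dir=$1
    days=$2
    # Refuse to run if the directory does not exist.
    [ -d "$dir" ] || return 1
    find "$dir" -type f -name '*.yaml' -mtime +"$days" -delete
}

# Example call (commented out so that sourcing this file deletes nothing):
# prune_reports /var/cache/puppet/reports 30
```

Run from a daily cron job or systemd timer, a call like the commented one would keep roughly the last month of reports; the path shown is the current default mentioned in this report.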