Currently this package installs a cron job that runs every ten minutes. This is a VERY bad idea:

- if logrotate(8) runs during those 10 minutes, some log entries will fail to be accounted for by awstats
- it wastes resources parsing the same log files every 10 minutes, especially if they get big
- it makes logcheck(8) spam my inbox every hour due to the cron job failing every 10 minutes

A better solution is to hook the update script onto the logrotate(8) entries for any installed webservers (eg. /etc/logrotate.d/lighttpd,apache2). This solves all of the 3 problems I just mentioned.
Hi Ximin,

Good points! Frequent updates of logfiles have their uses too, however. But not always, and the drawbacks you raise here are valid.

I suggest we a) split the current cron job into infrequent and frequent jobs, b) make the frequent one optional (ideally through debconf), and c) invoke the infrequent job also (or instead?) as a logrotate hook.

How does that sound?

- Jonas
Yeah, that works. Though looking at the current script, there's not really much need to split it up. The main problem is to do the logrotate hook itself - ideally you'd add it directly to the webserver entry rather than a new awstats entry. Is that going to be a pain - editing another package's configuration files? I dunno what infrastructure / policy Debian has for this sort of thing.
Or more accurately: It needs to be implemented in the webserver packages so that this awstats package can hook into them - not doable in awstats alone. - Jonas
logrotate every 10 minutes - that could be the source of trouble, not awstats.

Do you mean it parses the _same_ log entries? No, awstats doesn't do such a stupid thing. Actually, it does an lseek on the file to the last known entry and then begins parsing.

Why exactly does it fail? Did you first try to comment out the crontab entry and fix the source of the failure?

I disagree with the severity. It looks like a very site-specific/workload-specific issue. Your logrotate-based solution could be suggested as an option in README.Debian for specific setups.

True.

How to split a) or c)? It's easy only from the local admin's side. We can make the cron job frequency configurable through debconf. Is that an option?
clone 590074 -1 -2 -3
reassign -1 apache2
reassign -2 lighttpd
reassign -3 nginx
retitle -1 add pre-rotate hook to logrotate script
retitle -2 add pre-rotate hook to logrotate script
retitle -3 add pre-rotate hook to logrotate script
severity -1 wishlist
severity -2 wishlist
severity -3 wishlist
thanks

Please add something like the following snippet to the logrotate script for your package:

    prerotate
        if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
            run-parts /etc/logrotate.d/httpd-prerotate; \
        fi; \
    endscript

(or some suitable directory other than the one suggested; I'm not sure what Debian naming conventions are.)

This would be greatly helpful to log-parsing packages such as awstats, which can then set up hooks to process these logs before they get rotated (see #590074 for an example).

X
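As a rough illustration of what a log-parsing package might drop into that directory, here is a minimal sketch of a hook for awstats. The file name and location (/etc/logrotate.d/httpd-prerotate/awstats) simply follow the directory suggested above and are hypothetical; the update.sh path and the www-data user are the ones used in README.Debian. Note that run-parts, by default, only runs files whose names consist of letters, digits, underscores and hyphens, so the hook should not carry an extension such as `.sh`.

```shell
#!/bin/sh
# Hypothetical hook file: /etc/logrotate.d/httpd-prerotate/awstats
# (the directory name is only the suggestion from this bug report).

# Path taken from README.Debian; overridable for local setups.
UPDATE_SCRIPT=${UPDATE_SCRIPT:-/usr/share/awstats/tools/update.sh}

run_awstats_update() {
    # Do nothing, successfully, when awstats is removed but not purged,
    # so that a leftover hook never gets in the way of log rotation.
    [ -x "$UPDATE_SCRIPT" ] || return 0

    # Run the update as www-data, in the foreground, so that it has
    # finished before logrotate moves the log file away.
    su - -c "$UPDATE_SCRIPT" www-data
}

# The installed hook would end with a call to: run_awstats_update
```

The update is deliberately not backgrounded: the point of a prerotate hook is that parsing completes before the rotation happens.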
No, logrotate isn't running every 10 minutes. I think you misunderstood my point. If logrotate runs between the 10-minute cron runs of awstats, it will rotate away the log entries written since the last 10-minute run, so the next 10-minute run won't be able to see them any more.

What if the file has been truncated or removed by logrotate?

I guess because I haven't written a proper config file yet? Anyway, it's still spamming my syslog *every 10 minutes*. This should at least be an option that's off by default.

logrotate is part of the standard install for Debian webservers (at least apache2 and lighttpd). This is not "site specific".
Looks like a possible language problem:

  during != every
  during ~= in between

If this didn't help, please follow up on Ximin's response instead of mine :-)

I experienced cron spam too when trying to install awstats recently (and was too busy at the time to investigate further - just cursed a bit and uninstalled awstats again). Possibly not a helpful comment - I just want to hint that there might actually be an issue of cron spam in virgin installs of awstats currently.

I must admit that I have lost track of your most recent improvements, but I seem to recall that in the past it made sense for my local scripts to distinguish between heavier monthly/weekly log analysis routines and smaller hourly ones. But perhaps that was because I (for other reasons) analyzed the files from scratch again each month...

Let's first figure out if the current frequent cron job really is heavy on system resources, and only if it is can I try to elaborate more on my ideas here.

I would prefer to keep the cron file as a conffile and instead have the invoked script check a flag in /etc/default/awstats to decide whether it should really run or just quit immediately. But again, let's first resolve whether it really is necessary.

- Jonas
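The flag check suggested here might look something like the sketch below. Only the file location /etc/default/awstats comes from this message; the variable name AWSTATS_ENABLE_CRON and its yes/no values are invented for illustration, and the default is "no" so that a fresh install stays quiet.

```shell
#!/bin/sh
# Sketch of a flag check for the script invoked from cron.
# AWSTATS_ENABLE_CRON is a hypothetical variable name.

DEFAULTS_FILE=${DEFAULTS_FILE:-/etc/default/awstats}

cron_update_enabled() {
    # Use a subshell so that settings sourced from the defaults file
    # do not leak into the calling script.
    (
        AWSTATS_ENABLE_CRON=no          # off unless the admin opts in
        if [ -r "$DEFAULTS_FILE" ]; then
            . "$DEFAULTS_FILE"
        fi
        [ "$AWSTATS_ENABLE_CRON" = "yes" ]
    )
}
```

The cron-invoked script could then start with `cron_update_enabled || exit 0`, which keeps the cron file itself an unmodified conffile.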
That's true. But it's a known issue and your logrotate hint is already documented in README.Debian for this purpose.

Probably ;)

Please consider enabling the EnableLockForUpdate feature. From the README.Debian:

---->8--------
Also consider enabling lock files in /etc/awstats/awstats.conf with EnableLockForUpdate=1 so that only one AWStats update process is running at a time. This will reduce system resources especially if the AWStats update process takes longer than 10 minutes to complete. This solution has some security drawbacks: lockfile with well-known name and writable by www-data user.
----------------->8--------

It's a fresh install, right? I guess we can disable cron jobs by default on a fresh install. As /etc/awstats/awstats.conf is not configured by default,

Yes, but your logrotate settings are very "site specific" and far, far away from the defaults...
yeah.

Where did I suggest that I edited my logrotate scripts? They are unchanged since being installed... Or do you mean the solution I proposed?

AFAICS Debian utility packages normally assume they're going to be used on/by other Debian packages, so it's fine to assume that awstats is being installed for the logs on some local Debian webserver package.

In fact, the solution I described in the cloned bug reports above, won't put any extra effort on awstats maintenance:

- awstats adds some update scripts into <DIR>. job done on the awstats side.
- default logrotate scripts of various webservers call <DIR> when rotating logs. (what I made those cloned reports for)
- if a site admin wants to use non-default log settings, then they'll need to edit their logrotate scripts anyhow.

X
A virgin install must not cause cron spam. If you implicitly acknowledge above that awstats currently does, then yes, we should disable it by default (or figure out something more clever). provided the specific logrotate config of that host? If you are simply guessing, I suggest you state that more clearly, and be kinder about alternative viewpoints here. :-) - Jonas
I'm also experiencing such spamming, and it gets even worse as it runs as www-data, and there's no /etc/aliases redirecting it to a real user by default in exim, it seems. So there's an ever growing mailbox in /var/spool/mail/www-data :-( See #496029 that seems to relate to the aliases problem. Still this needs to be addressed on awstats side too, I guess. Hope this helps. Best regards,
Coming back to the fact that your mail is spammed by the cron job failing: even when awstats is set up correctly I get the same trouble. I did a fresh install of awstats on a box running Debian Lenny, set up awstats.conf files for all my vhosts, and fixed the permissions on the apache log files, but I get `CRON...permission denied` messages in syslog every 10 minutes. It seems that using a file in `/etc/cron.d` with user `www-data` is not allowed. I do not understand why, or where the problem comes from.
severity 590074 +wishlist
thanks

Hello,

I'll lower the severity of this bug report. It's actually a wishlist bug. The points of the original bug report were addressed:

[...] be accounted for by awstats

True (it was noted in README.Debian), but a longer period can introduce bigger holes in the statistics, due to log rotation.

Probably, it's a good idea to introduce the right infrastructure in the logrotate package first. Not all maintainers agree with your suggestion, see http://bugs.debian.org/590097

Remember, it still does not solve the "lost entries" problem completely, and it can introduce side effects if you start update.sh in the background, like

    su -l -c /usr/share/awstats/tools/update.sh www-data &

in the prerotate hook.

[...] if they get big

Just wrong.

[...] every 10 minutes

Why not fix the causes of this failure instead?

Olivier Berger wrote:

See #652665. I'm still not sure if we must abuse the root account with MAILTO=root. AWStats mail does not go into a black hole by default. You can read www-data's mailbox or just add an alias for www-data pointing to root.

Bruno BEAUFILS <bruno@boulgour.com> wrote:

No, it's allowed.

Then you should provide more information (awstats configuration, detailed error messages, etc.) and file another bug report.
I notice some of the issues in this bug relate to the way awstats is run
at log rotation time.
The README.Debian recommends:
"Make sure to run AWStats right _before_ web logs are rotated. For
example, insert the following lines in /etc/logrotate.d/apache2:
    prerotate
      if [ -x /usr/share/awstats/tools/update.sh ]; then
        su - -c /usr/share/awstats/tools/update.sh www-data
      fi
    endscript"
-----------------------------
This means that
a) sharedscripts must be set in logrotate
b) data is likely to be missed if it is logged between the time
update.sh finishes and the rotation of a file for any particular vhost
Why does the README insist on prerotate and not use postrotate?
I've discovered that after rotation, logrotate can give the rotated
filename to the postrotate script, and using nosharedscripts, logrotate
can call awstats multiple times, once for each vhost, just as it
finishes the rotation of that host:
    nosharedscripts
    postrotate
      /path-to-cgi/awstats.pl -LogFile=$1
    endscript
Where it says `$1' in the postrotate script, logrotate actually puts the
rotated filename, e.g.
/var/log/apache2/vhost1/access.log.1
so it will override the normal filename defined in
/etc/awstats/awstats.vhost1.com.conf
The normal conf file will still work normally from cron.
All that is needed is some wrapper script around awstats to select the
correct domain based on the path in $1 and pass the -config option to
awstats too.
Does this address all the issues raised by contributors to this bug report?
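The wrapper described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it takes the vhost to be the name of the directory holding the log (as in /var/log/apache2/vhost1/access.log.1), and it looks for a config named either awstats.<vhost>.conf or awstats.<vhost>.<something>.conf, which covers the vhost1 / awstats.vhost1.com.conf pairing in the example. The default for AWSTATS_PL is a guess standing in for /path-to-cgi/awstats.pl, and the function names are made up.

```shell
#!/bin/sh
# Sketch of a postrotate wrapper; logrotate is expected to pass the
# log path as $1 (with nosharedscripts, once per log file).

AWSTATS_PL=${AWSTATS_PL:-/usr/lib/cgi-bin/awstats.pl}   # assumed location
CONF_DIR=${CONF_DIR:-/etc/awstats}

# /var/log/apache2/vhost1/access.log.1 -> vhost1
vhost_for_log() {
    basename "$(dirname "$1")"
}

# vhost1 -> vhost1.com, if awstats.vhost1.com.conf exists in CONF_DIR.
config_for_vhost() {
    for conf in "$CONF_DIR/awstats.$1.conf" "$CONF_DIR/awstats.$1".*.conf; do
        if [ -f "$conf" ]; then
            name=${conf##*/awstats.}        # strip directory and prefix
            printf '%s\n' "${name%.conf}"   # strip the .conf suffix
            return 0
        fi
    done
    return 1
}

# Update the statistics of one vhost from the given log file.
update_from_log() {
    vhost=$(vhost_for_log "$1")
    config=$(config_for_vhost "$vhost") || {
        echo "no AWStats config found for $1" >&2
        return 1
    }
    "$AWSTATS_PL" -config="$config" -update -LogFile="$1"
}
```

In the postrotate block the wrapper would then be invoked as something like `update_from_log "$1"`, running as a user that may read the rotated log and write the AWStats data files.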
I've contributed a script for fixing this; it is commit c0482b4109176e05 on master. It is based on Sergey's update.sh, but it processes each log file separately just after rotation.
Probably because it's easy.

I'm not sure that reloading Apache for every vhost is a good idea. Why can't this be done once? http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=590074#70

For the next release I've reverted the above commits. I don't like the idea of code duplication. Could you consider adding the needed code to update.sh?