On a fresh potato:

  apt-get install emacs20
  apt-get install apache
  apt-get install htdig

The install asked whether I wanted to calculate the endings database; I said yes. No additional configuration was done.

Now cron mails me the following every day:

  /etc/cron.daily/htdig:
  Can't determine type of file /var/spool/htdig/htdext.7383; content-type: application/msword; URL: http://localhost/doc/emacs20/etc/edt-user.doc
  Can't determine type of file /var/spool/htdig/htdext.7383; content-type: application/msword; URL: http://localhost/doc/emacs20/etc/enriched.doc

A default install should not produce daily error messages emailed to root.

Jeff

PS. I think I see what is happening. Htdig is set to index localhost by default. The default web page has a link to the doc directory, which contains some files that confuse htdig. I don't know if the answer is suppressing errors, tweaking htdig configuration, or tweaking apache configuration.
A possible solution, which has the side benefit of being very simple, is to have htdig ignore anything with the .doc extension. This is consistent with Debian's apache, which is apparently configured to serve that extension as content-type: application/msword. Htdig can't handle application/msword (I think).

Sigh...

Jeff

+++ htdig.conf.orig	Sat Jan 22 19:14:50 2000
@@ -58,7 +58,7 @@
 # exclude_url patterns are matched anywhere.
 #
 bad_extensions:	.wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
-	.doc .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi
+	.jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi
 
 #
 # The string htdig will send in every request to identify the robot.  Change
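
To illustrate the effect of listing .doc in bad_extensions, here is a minimal Python sketch of extension-based URL filtering. It is not htdig's code: the plain suffix match, the function name is_skipped, and the http://localhost/index.html URL are assumptions made for the example, and htdig's real matching rules may differ.

```python
# Illustrative only: a simplified stand-in for extension-based URL filtering.
# This is NOT htdig's implementation; the suffix-matching rule is an assumption.

BAD_EXTENSIONS = [
    ".wav", ".gz", ".z", ".sit", ".au", ".zip", ".tar", ".hqx", ".exe",
    ".com", ".gif", ".doc", ".jpg", ".jpeg", ".aiff", ".class", ".map",
    ".ram", ".tgz", ".bin", ".rpm", ".mpg", ".mov", ".avi",
]


def is_skipped(url, bad_extensions=BAD_EXTENSIONS):
    """Return True if the URL ends with any of the listed extensions."""
    return url.endswith(tuple(bad_extensions))


# The two .doc URLs come from the cron mail; index.html is a hypothetical page.
urls = [
    "http://localhost/doc/emacs20/etc/edt-user.doc",
    "http://localhost/doc/emacs20/etc/enriched.doc",
    "http://localhost/index.html",
]

# Only URLs that survive the filter would be fetched and indexed.
to_index = [u for u in urls if not is_skipped(u)]
print(to_index)
```

With .doc in the list, both files from the cron mail are skipped before any content-type handling is attempted, so no "Can't determine type" message would be produced for them in this simplified model.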