#57282 /usr/share/htdig/parse_doc.pl doesn't work with certain converters

Package:
htdig
Source:
htdig
Description:
web search and indexing system - binaries
Submitter:
Roderich Schupp
Date:
2005-07-18 03:57:56 UTC
Severity:
normal
#57282#5
Date:
2000-02-07 18:15:30 UTC
From:
To:
Other relevant packages:
xpdf-i		0.90-4		for pdftotext
gs-aladdin	5.50-7		for ps2ascii
pstotext	1.8-4		for pstotext

/usr/share/htdig/parse_doc.pl detects at runtime which of
the above PDF/PS-to-text converters is installed and stores
its name in $parser. It then opens a pipe from
the command "$parser $ARGV[0] - |".
However, this only works as expected if $parser is pdftotext.
ps2ascii reads from the file named $ARGV[0] and writes its
output to a file called "-", so the document is not indexed.
pstotext first reads from $ARGV[0] and then tries to read from standard
input. When run from a cron job you might be lucky, as stdin is opened
to /dev/null, but otherwise pstotext waits for input and the indexing stalls.
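
One possible way around this is to pick the argument list per converter
instead of always appending "-". The following is only a sketch in shell,
not the actual parse_doc.pl code: the function name converter_command is
made up for illustration, and it assumes that ps2ascii and pstotext write
to standard output when given just the input file. It prints the command
line rather than running it and does no quoting, so it will not cope with
file names containing spaces.

```shell
# Hypothetical helper (not part of htdig): print the command line that
# should be used to get text on stdout from the given converter.
converter_command() {
    parser=$1   # converter name, as detected by parse_doc.pl
    file=$2     # document to convert

    case "$parser" in
        pdftotext)
            # pdftotext needs an explicit "-" to write to stdout
            printf '%s %s -\n' "$parser" "$file"
            ;;
        ps2ascii|pstotext)
            # assumption: with only an input file, these write to stdout;
            # a trailing "-" would be taken as an output file (ps2ascii)
            # or as a request to also read stdin (pstotext)
            printf '%s %s\n' "$parser" "$file"
            ;;
        *)
            echo "unknown converter: $parser" >&2
            return 1
            ;;
    esac
}
```

For example, "converter_command ps2ascii doc.ps" prints "ps2ascii doc.ps".
Redirecting stdin from /dev/null when running pstotext would additionally
guard against the stall outside of cron.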

Cheers, Roderich