In Swedish, you can "continue" on the previous word stem by starting with a
hyphen, for example:
bilförare, -ägare och -passagerare
("car drivers, owners and passengers"), where "-" indicates that the words
continue "bil" ("car"). "Ägare" and "passagerare" are valid words in
Swedish, which makes ispell suggest that I should replace "-ägare" with
"ägare". This is a bug in ispell, since removing the initial hyphen changes
the meaning of the word.
IIRC, previous versions of ispell did not have this problem (at least I have
not noticed it until more recent versions).
Hi, I think this problem is because the wordlist changed, e.g. which characters count as word constituents. You could check by removing the hyphen from the .aff file and rebuilding the dictionary. I'll try this later myself, though. Reassign it if this is the case. /Micce (maintainer of iswedish) PS How do I add myself to this bug? I remember it is or will be possible...
Try http://people.debian.org/~micce/iswedish_1.4.3_i386.deb, with this we cannot spell ABC-vapen correctly, but maybe we can live with that. The compound word thing also needs work.
With this wordlist I get

$ ispell test.txt
Word 'användar-id' contains illegal characters
Word 'ASCII-text' contains illegal characters
Word 'cd-avbildning' contains illegal characters
Word 'cd-avbildningar' contains illegal characters
Word 'cd-avbildningarna' contains illegal characters
Word 'cd-avbildningsfiler' contains illegal characters
Word 'cd-rom' contains illegal characters
Word 'Debian-cd' contains illegal characters
Word 'Debian-produkt' contains illegal characters
Word 'dpkg-sviten' contains illegal characters
Word 'dpkg-verktyg' contains illegal characters
Word 'e-post' contains illegal characters
Word 'e-postbrev' contains illegal characters
Word 'e-postprogramvara' contains illegal characters
[...]

when I start. I.e., error messages for all the compound words in my personal wordlist.
The hyphen should only be considered part of the word if it is not beginning or ending the word, I think. This is not a perfect solution either, since some partial constructs will not work anyway, but some of them I think are beyond the "easy" algorithms that a spelling checker can employ; they would need much more advanced linguistic analysis.
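To make the rule above concrete, here is a minimal sketch in Python of a tokenizer that treats the hyphen as part of a word only when it has a letter on both sides. This is not ispell code and says nothing about how ispell is implemented; the names WORD and split_words are made up for this illustration.

```python
import re

# A word is a run of letters, optionally followed by more runs of letters
# that are each joined by a single interior hyphen. [^\W\d_] matches any
# Unicode letter, so "ä" and "ö" are handled.
WORD = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")

def split_words(text):
    """Return the words in text; leading and trailing hyphens are not part of a word."""
    return WORD.findall(text)

print(split_words("bilförare, -ägare och -passagerare"))
print(split_words("e-post, cd-rom och ABC-vapen"))
```

With this rule "e-post" and "ABC-vapen" stay whole, while "-ägare" is checked as "ägare" and the hyphen in the text is left alone. As noted, it is not perfect: the checker never verifies that "bil" + "ägare" is a sensible compound.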
peter karlsson writes:
> > Yepp. So how do we fix this? '-' should be a word constituent only
> > sometimes...
>
> The hyphen should only be considered part of the word if it is not beginning
> or ending the word, I think. This is not a perfect solution either, since
> some partial constructs will not work anyway, but some of them I think are
> beyond "easy" algorithmics that can be employed by a spelling checker, they
> would need some much more advanced linguistic analysis.

But AFAIK, there is no way to do this in an ispell dictionary, unfortunately. If there is, please enlighten me, and I will fix it. Maybe it could be done by saying that the word '-' can be combined with any word, both as a prefix and as a suffix? (And also making it a word constituent again.)

/Micce
peter karlsson writes:
> > Try http://people.debian.org/~micce/iswedish_1.4.3_i386.deb, with this
> > we cannot spell ABC-vapen correctly, but maybe we can live with that.
> > The compound word thing also needs work.
>
> With this wordlist I get
>
> $ ispell test.txt
> Word 'användar-id' contains illegal characters
> Word 'ASCII-text' contains illegal characters
> Word 'cd-avbildning' contains illegal characters
> when I start. I.e, error messages for all the compound words in my
> personal wordlist.

Yepp. So how do we fix this? '-' should be a word constituent only sometimes... Please have a look at the source if you like, it's called swedish. I have no suggestion for the moment. Maybe there has to be some change to ispell after all?

/Micce
Mikael Hedin <mikael.hedin@irf.se> writes: For what little it's worth, I've noticed this happens with English wordlists as well -- all hyphenated words get rejected with "contains illegal characters." I've always ignored it, since it's a feature of upstream ispell.
Did you and/or Mikael come up with a successful workaround? I have still not looked into this in ispell -- should I close the bug report, or keep it on my list of things to do some day? Thanks.
No, I don't think we ever did. Please.