On a test file:

$ id3v2 -l Jüñķ\ -\ #nipponsei\ @\ irc.rizon.net\,\ dißilßìóñ.mp3 | grep TIT2 | xxd
0000000: 5449 5432 2028 5469 746c 652f 736f 6e67  TIT2 (Title/song
0000010: 6e61 6d65 2f63 6f6e 7465 6e74 2064 6573  name/content des
0000020: 6372 6970 7469 6f6e 293a 2064 69df 696c  cription): di.il
0000030: dfec f3f1 0a                             .....

The tag in the MP3 file is UTF-8:

00000b0: 697a 6f6e 2e6e 6574 5449 5432 0000 000f  izon.netTIT2....
00000c0: 0000 0364 69c3 9f69 6cc3 9fc3 acc3 b3c3  ...di..il.......
00000d0: b154 4954 3300 0000 0100 0003 544f 5045  .TIT3.......TOPE

And my system uses UTF-8:

$ locale | grep LC_CTYPE
LC_CTYPE="en_US.UTF-8"
I am a character encoding newbie, so you likely know more than I do about the subject. With that said:

* The ID3v2 2.3.0 standard is not the latest (that is 2.4.0), but it happens to be what id3lib (which id3v2 uses) implements.
* The ID3v2 2.3.0 standard only supports ISO-8859-1 and UCS-2. 2.4.0 added support for UTF-16BE and UTF-8.
* I'll have to check whether it's id3v2 or id3lib that lacks support for UTF-8, or whether it does support it but chooses the wrong encoding to convert to.
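To make the byte-level picture concrete, here is a minimal Python sketch that decodes the TIT2 frame exactly as it appears in the hexdump from the report. It is only an illustration, not code from id3v2 or id3lib: parse_text_frame and the ENCODINGS table are names made up for this example, and it assumes the frame size is a plain big-endian integer (for a 15-byte frame the 2.3.0 and 2.4.0 size encodings give the same value).

```python
import struct

# Frame bytes copied from the hexdump in the report (file offsets 0xb8-0xd0).
FRAME = bytes.fromhex(
    "54495432"                      # frame ID "TIT2"
    "0000000f"                      # frame size: 15 bytes
    "0000"                          # frame flags
    "03"                            # text-encoding byte
    "6469c39f696cc39fc3acc3b3c3b1"  # 14 bytes of title text
)

# Text-encoding byte -> Python codec. Values 0 and 1 exist in ID3v2 2.3.0;
# values 2 and 3 were added in 2.4.0.
ENCODINGS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}


def parse_text_frame(frame):
    """Split one text frame into (frame id, encoding byte, decoded text).

    Hypothetical helper for illustration only.
    """
    frame_id, size, _flags = struct.unpack(">4sIH", frame[:10])
    body = frame[10:10 + size]
    encoding = body[0]
    text = body[1:].decode(ENCODINGS[encoding])
    return frame_id.decode("ascii"), encoding, text


frame_id, encoding, title = parse_text_frame(FRAME)
print(frame_id, "encoding byte:", encoding)
print("as UTF-8  :", title.encode("utf-8").hex())
print("as Latin-1:", title.encode("latin-1").hex())
```

The encoding byte is 03, which is UTF-8 and only defined in 2.4.0. The Latin-1 bytes printed on the last line (64 69 df 69 6c df ec f3 f1) are the same ones id3v2 -l emitted above, so the text is evidently decoded correctly and then written out as ISO-8859-1 regardless of the locale.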
Already fixed. The listing path now reads text fields through id3lib's charset-aware accessor and converts them via iconv to the current locale's charset (nl_langinfo(CODESET)), so -l on a UTF-8 locale emits UTF-8 instead of forcing Latin-1.

Fixed by the charset-conversion patch NMU'd in 0.1.12-2.1. I'm closing this bug report.

Martin
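For readers who want to see the idea behind the fix without reading the patch, the conversion step can be sketched in Python roughly as follows. This is not the patch itself, which is C++ and calls iconv. The helper name to_locale_bytes is invented for this example, and Python's codec machinery stands in for iconv. It assumes a POSIX system, since locale.nl_langinfo is not available elsewhere.

```python
import locale

# The title from the report, written with escapes so the source stays ASCII.
TITLE = "di\u00dfil\u00df\u00ec\u00f3\u00f1"


def to_locale_bytes(text, codeset=None):
    """Encode already-decoded tag text in the charset the terminal expects.

    Hypothetical helper: when no codeset is given, ask the C library for the
    locale's charset, which is the same query as nl_langinfo(CODESET) in C.
    """
    if codeset is None:
        locale.setlocale(locale.LC_CTYPE, "")  # adopt the user's environment
        codeset = locale.nl_langinfo(locale.CODESET)
    # Characters the target charset cannot represent become '?'.
    return text.encode(codeset, errors="replace")


# Explicit codesets are passed here so the demo does not depend on the
# machine it runs on.
print(to_locale_bytes(TITLE, "UTF-8").hex())
print(to_locale_bytes(TITLE, "ISO-8859-1").hex())
```

Calling to_locale_bytes(TITLE) with no second argument picks up LC_CTYPE from the environment, so on en_US.UTF-8 it yields the UTF-8 bytes that were expected in the original report.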