When I set a tag, the text given on the command line is written to the
file unmodified. This corrupts the data whenever the system locale is not
Latin-1 (ISO 8859-1). My system uses a UTF-8 locale, so the following
happens:
$ id3v2 --TIT2 "test" 07\ -\ Parties\ in\ München.mp3
Now let's see what a reliable tool (eyeD3) shows:
$ eyeD3 07\ -\ Parties\ in\ München.mp3
[...]
ID3 v2.3:
title: test artist: Superpunk
Now I set the real title of the track, which contains a German umlaut:
$ id3v2 --TIT2 "Parties in München" 07\ -\ Parties\ in\ München.mp3
And this is the eyeD3 output:
$ eyeD3 --debug 07\ -\ Parties\ in\ München.mp3
[snip]
eyeD3 trace> FrameSet: Reading Frame #6
eyeD3 trace> FrameHeader [start byte]: 91 (0x5B)
eyeD3 trace> FrameHeader [id]: TIT2 (0x54495432)
eyeD3 trace> FrameHeader [data size]: 20 (0x14)
eyeD3 trace> FrameHeader [flags]: ta(0) fa(0) ro(0) co(0) en(0) gr(0) un(0) dl(0)
eyeD3 trace> FrameSet: Reading 20 (0x14) bytes of data from byte pos 101 (0x65)
eyeD3 trace> FrameSet: 20 bytes of data read
eyeD3 trace> TextFrame encoding: latin_1
eyeD3 trace> TextFrame text: Parties in München
[snip]
ID3 v2.3:
title: Parties in München artist: Superpunk
The UTF-8 bytes from the terminal were written into the ID3 tag as they
are, which is wrong on two counts:
1. the frame's encoding is declared as latin_1, although the data is UTF-8
2. ID3 v2.3 doesn't support UTF-8 at all; its only Unicode encoding is UTF-16
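The mismatch can be reproduced outside of id3v2. The following is only a
minimal Python sketch, not id3v2's actual code: it assumes the title
arrives as UTF-8 bytes and is later read back as Latin-1, as the declared
encoding tells a reader to do. The function name is made up for
illustration.

```python
def read_as_latin1(text: str) -> str:
    """Encode text as UTF-8 (what a UTF-8 terminal passes on) and decode
    the same bytes as Latin-1 (what the frame's encoding byte claims)."""
    raw = text.encode("utf-8")
    return raw.decode("latin-1")


title = "Parties in M\u00fcnchen"  # \u00fc is the u-umlaut

print(len(title.encode("latin-1")))  # 18 bytes, one per character
print(len(title.encode("utf-8")))    # 19 bytes, the umlaut takes two
# The umlaut comes back as two characters, U+00C3 followed by U+00BC.
print(read_as_latin1(title))
```

The data size of 20 that eyeD3 reports would fit one encoding byte plus
these 19 UTF-8 bytes, assuming no terminator is written after the text.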
To get correctly encoded data into the tags, users have to convert the
text to Latin-1 themselves before passing it to id3v2. And they appear to
be completely stuck if the input contains characters that do not exist in
Latin-1.
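For illustration, here is roughly the behaviour I would expect instead,
as a Python sketch under assumptions: id3v2 itself is not written in
Python, the function name encode_text_frame is hypothetical, and the
input encoding is passed as a parameter (a real tool would presumably
derive it from the locale). It uses Latin-1 when the text fits and falls
back to UTF-16 with a byte order mark otherwise, with the ID3 v2.3
encoding bytes 0x00 and 0x01.

```python
import codecs

LATIN1 = 0x00   # ID3 v2.3 text encoding byte for ISO 8859-1
UNICODE = 0x01  # ID3 v2.3 text encoding byte for Unicode with BOM


def encode_text_frame(raw: bytes, input_encoding: str = "utf-8") -> bytes:
    """Return the body of a text frame: encoding byte plus encoded text.

    raw is the argument as received from the command line, in
    input_encoding (an assumption; normally derived from the locale).
    """
    text = raw.decode(input_encoding)
    try:
        # Prefer Latin-1 whenever every character fits.
        return bytes([LATIN1]) + text.encode("latin-1")
    except UnicodeEncodeError:
        # Otherwise use UTF-16 little endian, prefixed with its BOM.
        return (bytes([UNICODE]) + codecs.BOM_UTF16_LE
                + text.encode("utf-16-le"))


body = encode_text_frame("Parties in M\u00fcnchen".encode("utf-8"))
print(len(body))  # 19: one encoding byte plus 18 Latin-1 bytes
```

With this approach the declared encoding always matches the stored bytes,
and text outside Latin-1 is still representable.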
Regards,
Tino