Lately I’ve been working on a Cocoa MP3 tagger/renamer app: it gathers features from various useful programs, none of which made the cut on its own because none had them all. It was all fun and games until I ran into Unicode weirdness when saving tags [via TagLib].
The scenario is:
- fetch data from the Discogs API into NSString fields of a model;
- convert those strings into C strings [const char*] via UTF8String;
- set them into the TagLib::Tag property of each file.
Everything was fine for plain English releases [English being almost free of diacritical marks], then I stumbled upon an Italian record: data fetching went flawlessly, but when I persisted it to files and checked the Xcode console I met Mr. √® [the MacRoman representation of è, Italian for the third-person singular of to be]. Shouldn’t UTF8String take care of encoding non-ASCII characters?
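One plausible reading of the √® symptom is that the UTF-8 bytes themselves are fine and are merely displayed by something that assumes MacRoman. The sketch below, in plain C++ with hypothetical helper names (macRomanGlyph, readAsMacRoman) and a lookup that deliberately covers only the two bytes involved, shows how the two UTF-8 bytes of è turn into √ and ® under that assumption.

```cpp
#include <string>

// Maps one MacRoman byte to its glyph, returned as UTF-8.
// Illustrative only: just the two high bytes needed here are covered,
// ASCII passes through, and anything else becomes "?".
std::string macRomanGlyph(unsigned char byte) {
    if (byte < 0x80) return std::string(1, static_cast<char>(byte));
    switch (byte) {
        case 0xC3: return "\xE2\x88\x9A"; // U+221A SQUARE ROOT
        case 0xA8: return "\xC2\xAE";     // U+00AE REGISTERED SIGN
        default:   return "?";
    }
}

// Renders a byte sequence the way a MacRoman-only reader would see it.
std::string readAsMacRoman(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) out += macRomanGlyph(b);
    return out;
}
```

Calling readAsMacRoman("\xC3\xA8"), where C3 A8 is the UTF-8 encoding of è (U+00E8), yields the UTF-8 text for √®: one character went in, two unrelated ones come out.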
At first glance I suspected a library issue, but even the minimal NSLog(@"%s", [@"è" UTF8String]) example was broken! I then tried to mess with TagLib parameters, but I had no clue at all; after a googling session, I understood I needed wide characters, which TagLib accepts. Here is how to perform the conversion from NSString:
NSData* asData = [string dataUsingEncoding:kEncoding_wchar_t];
TagLib::wstring ws = TagLib::wstring((wchar_t*)[asData bytes], [asData length] / sizeof(wchar_t));
wstring is a provided implementation of std::wstring [not defined in all systems as stated here], and kEncoding_wchar_t is defined as follows:
#if TARGET_RT_BIG_ENDIAN
const NSStringEncoding kEncoding_wchar_t = CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingUTF32BE);
#else
const NSStringEncoding kEncoding_wchar_t = CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingUTF32LE);
#endif
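To see why the byte order has to match the host before the raw bytes are cast to wide characters, here is a minimal standalone C++ sketch. It is not the app's code: the helper names are hypothetical, and it uses char32_t instead of wchar_t on the assumption that the wide character is 32 bits wide [true on OS X, not on every platform].

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// True when the host stores the least significant byte first.
bool hostIsLittleEndian() {
    const std::uint32_t probe = 1;
    return *reinterpret_cast<const unsigned char*>(&probe) == 1;
}

// Serializes code points as UTF-32 in the requested byte order, without a BOM.
std::vector<unsigned char> encodeUtf32(const std::u32string& text, bool littleEndian) {
    std::vector<unsigned char> bytes;
    for (char32_t cp : text) {
        for (int i = 0; i < 4; ++i) {
            const int shift = littleEndian ? 8 * i : 8 * (3 - i);
            bytes.push_back(static_cast<unsigned char>((cp >> shift) & 0xFF));
        }
    }
    return bytes;
}

// Copies a raw buffer into native 32-bit units, which is what casting
// the NSData bytes to a wide-character pointer amounts to.
std::u32string reinterpretAsNative(const std::vector<unsigned char>& bytes) {
    std::u32string out(bytes.size() / sizeof(char32_t), U'\0');
    if (!out.empty()) {
        std::memcpy(&out[0], bytes.data(), out.size() * sizeof(char32_t));
    }
    return out;
}
```

Encoding è (U+00E8) with the host's own byte order and reinterpreting it gives back 0xE8; encoding it with the opposite order gives 0xE8000000, which is not a valid code point. That is the reason for picking the encoding with a compile-time check.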
Finally I could build the desired string and set it into the files’ tags, which are correctly saved and rendered by the program itself, by QuickLook and by external media players.
One mystery remains: I had to use the TagLib::String::Type::Latin1 encoding flag, and not the expected TagLib::String::Type::UTF8, which threw a “Unicode conversion error” exception: maybe I will ask Stack Overflow later.