[mb-users] Invalid UTF-8 title returned for release

Lukáš Lalinský lalinsky at gmail.com
Thu Oct 11 12:39:12 UTC 2007


On 10/11/07, Andy Hawkins <andy at gently.org.uk> wrote:
> Hi all,
>
> When I retrieve the data for the following release using the client library:
>
> http://musicbrainz.org/show/release/?releaseid=468712
>
> and then try to convert the title from UTF-8 to an encoding I can display on
> the screen, I get the error:
>
> iconv: Invalid or incomplete multibyte or wide character
>
> The raw title is:
>
> 4c (L)
> 69 (i)
> 76 (v)
> 65 (e)
> 20 ( )
> 46 (F)
> 72 (r)
> 6f (o)
> 6d (m)
> 20 ( )
> 47 (G)
> 64 (d)
> 61 (a)
> c5 (Å)
> 84 ( )
> 73 (s)
> 6b (k)

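This is perfectly valid UTF-8. The bytes c5 84 are the two-byte encoding
of a single character, ń (U+0144), not two separate characters.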

>>> print "\x47\x64\x61\xc5\x84\x73\x6b".decode("utf-8")
Gdańsk

> I am attempting the conversion using the following code:
>
> setlocale(LC_ALL,  "" );
> char *Codeset=nl_langinfo(CODESET);

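My guess would be that the codeset nl_langinfo(CODESET) returns for your
locale doesn't support the ń character. In that case the input is fine,
but iconv has nothing to convert ń to, and it reports that with the same
"Invalid or incomplete multibyte or wide character" error.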
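
To illustrate that guess, here is a minimal sketch in Python 3 (the
one-liner above is Python 2 syntax). The codec names are only examples;
I am assuming, not claiming, that your locale's codeset is something like
ISO-8859-1. The helper can_encode is a made-up name for this example.

```python
# Python 3 sketch. The codecs listed are illustrative assumptions,
# not the poster's actual locale codeset.
raw = b"\x47\x64\x61\xc5\x84\x73\x6b"  # last seven bytes of the quoted dump

# Decoding succeeds, so the input itself is valid UTF-8.
title = raw.decode("utf-8")


def can_encode(text, codec):
    """Return True if every character in text exists in codec (hypothetical helper)."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# ISO-8859-2 (Central European) contains U+0144; ISO-8859-1 and ASCII do not.
for codec in ("iso-8859-1", "iso-8859-2", "ascii"):
    print(codec, can_encode(title, codec))

# One possible fallback: substitute a placeholder instead of failing.
print(title.encode("iso-8859-1", errors="replace"))  # b'Gda?sk'
```

If the target codeset really is the problem, the rough iconv counterpart
of errors="replace" would be asking for transliteration, which glibc
supports by appending "//TRANSLIT" to the target name passed to
iconv_open. Whether that is acceptable for display is up to you.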

> char *In=new char[m_UTF8Value.length()+1];
> strcpy(In,m_UTF8Value.c_str());
> size_t InLeft=m_UTF8Value.length();
>
> char *Out=new char[m_UTF8Value.length()*4];
> memset(Out,0,m_UTF8Value.length()*4);
> size_t OutLeft=m_UTF8Value.length()*4;
>
> char *InBuff=In;
> char *OutBuff=Out;
>
> iconv_t Convert=iconv_open(Codeset,"UTF-8");
> if ((iconv_t)-1!=Convert)
> {
>         if ((size_t)-1!=iconv(Convert,&InBuff,&InLeft,&OutBuff,&OutLeft))
>         {
>                 if (OutLeft>=sizeof(char))
>                         *OutBuff='\0';
>                 m_DisplayValue=Out;
>         }
>         else
>                 perror("iconv");
>
>         iconv_close(Convert);
> }
> else
>         perror("iconv_open");
>
> // In and Out come from new[], so they must be released with delete[],
> // not free()
> delete[] In;
> delete[] Out;
>
> This code has worked on all the other albums I have retrieved that include
> non ASCII characters.
>
> Can anyone offer any assistance?
>
> Thanks
>
> Andy

Lukas

