UTF7 - UTF8 folder encoding errors

Thomas Bruederli roundcube at gmail.com
Thu Aug 10 20:26:25 CEST 2006

Eric Liang wrote:
> First of all, thank you for this. It made things better. New comments
> are inline.
> On 8/10/06, *Thomas Bruederli* <roundcube at gmail.com
> <mailto:roundcube at gmail.com>> wrote:
>     IMP uses the PHP integrated IMAP functions and, as far as I could
>     see, they handle the charset conversion internally.
> I don't know - I have just tested IMP and Group-Office lately and they
> both seem to treat folders nicely from a user's point of view. I am not
> that familiar with how they do it.

It uses a PHP extension that RoundCube does not require, which means
that we cannot do the same here.
>     I just committed some changes that should solve these problems. It
>     works well with my mailbox but I only use ISO characters. Please
>     checkout the latest revision and test it with your environment.
> These are the results:
> 1) In folders list the multibyte characters are cut (... is put in the
> middle of the string) incorrectly. This happened before, I just didn't
> mention it to the dev list until now. This means that the PHP function
> should check the length of string as multibyte and cut it as such.
> Currently I have folders that have AAA?...AAAAA (where AAA=multibyte
> chars). The question mark (?) is shown because the second byte of the
> character is cut so it's substituted by ? (therefore I assume that the
> PHP functions do not treat this string as multibyte before
> checking/converting/minimizing length). There are multibyte folder
> names with a real length of 8 chars that are cut and English folders that
> are 10 or more and are not cut. So perhaps this behaviour should be
> examined.

OK, this is correct, some PHP string functions are not multibyte safe.
I'll try to sort that out.
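
To illustrate the truncation problem described above, here is a minimal sketch in Python (not RoundCube's actual PHP code). The function names, the 10-unit limit and the sample folder name are assumptions made for the example. It contrasts cutting by bytes, which can split a multibyte character, with cutting by characters. Note that it counts code points, so combining sequences could still be split.

```python
def abbreviate(name: str, max_chars: int = 16, placeholder: str = "...") -> str:
    """Shorten name to max_chars characters, placing the placeholder in the middle.

    Works on characters (code points), so a multibyte character is never split.
    """
    if len(name) <= max_chars:
        return name
    keep = max(max_chars - len(placeholder), 0)
    head = (keep + 1) // 2
    tail = keep - head
    return name[:head] + placeholder + (name[-tail:] if tail else "")


def abbreviate_bytes(name: str, max_bytes: int = 16, placeholder: str = "...") -> str:
    """Same idea, but cutting the UTF-8 byte string: this is the buggy variant."""
    raw = name.encode("utf-8")
    if len(raw) <= max_bytes:
        return name
    keep = max(max_bytes - len(placeholder), 0)
    head = (keep + 1) // 2
    tail = keep - head
    cut = raw[:head] + placeholder.encode("ascii") + (raw[-tail:] if tail else b"")
    # Bytes left over from a split character cannot be decoded and show up
    # as a replacement character (the "?" seen in the folder list).
    return cut.decode("utf-8", errors="replace")


folder = "αβγδεζηθικλμνξοπρστ"  # 19 characters, 38 bytes in UTF-8
print(abbreviate(folder, 10))        # cut on character boundaries
print(abbreviate_bytes(folder, 10))  # cut on byte boundaries, damages a character
```

The byte-based variant also shortens the name far more than intended, because each of these characters takes two bytes. That would be consistent with an 8-character multibyte name being cut while a longer English name is not.
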
> 2) When I use English GB language the folders work nicely. They used to
> work nicely on the folder list (ie show), now the create/rename folder
> works flawlessly too (big thanks).
> But when I use another language, for example Spanish, things get messed
> up, just like in the past. Let me know if I am allowed to send you
> screenshots (via personal email) to show what happens when charset is
> not ISO-8859-1, or perhaps a login account on such a mailbox.

Yes, please do. I cannot reproduce any different behavior when changing
the language.
> Results (for non English charset):
> Create folder -> "error occured while creating folder" or similar error
> in the translated language
> View folder -> strange charset conversion (or no conversion at all?)
> is shown instead of the normal folder
> Rename folder -> "error occured while creating folder" or similar error
> in the translated language

Do you have the iconv or mbstring modules installed with your PHP?
There's a ticket that complains about a buggy mbstring implementation.
> I have noticed that the erratic folder behavior happens when I use
> specific languages like
> Slovak, Polski, Greek, Espanol, Arabic etc but not on Japanese, Russian,
> English (it works fine on those showing always the correct folder names
> as intended)
> I can only assume it has to do with internal PHP charset conversions
> that only support some charsets and not others? Like what happens with
> html_entity_decode that supports only _some_ charsets:
> http://nl2.php.net/manual/en/function.html-entity-decode.php

Well, it has nothing to do with html entities. The problem occurs when
talking to the IMAP server.
> Finally, I am using PHP 4.4.x branch to test.

As far as I could see in the Squirrelmail code, they use mbstring and if
that is not available, UTF-7 conversion only works for ISO-8859-1. Same
in RoundCube. I have found a C file that converts UTF-8 to UTF-7 and
vice versa but I only have basic knowledge of C and it will take me many
hours to rewrite it in PHP. Anybody out there with good C skills?
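
For anyone who wants to see what such a conversion involves, here is a minimal sketch in Python (not PHP, and not a port of the C file mentioned above). It assumes the folder names use the modified UTF-7 mailbox encoding from RFC 3501, section 5.1.3: printable ASCII stays as is, "&" becomes "&-", and everything else is UTF-16BE in base64 with "," instead of "/" and no padding, wrapped in "&...-". The function names are hypothetical.

```python
import base64


def _encode_run(chars: str) -> str:
    """Encode a run of non-ASCII characters as one '&...-' shift sequence."""
    b64 = base64.b64encode(chars.encode("utf-16-be")).decode("ascii")
    return "&" + b64.rstrip("=").replace("/", ",") + "-"


def encode_mailbox_name(name: str) -> str:
    """Convert a Unicode folder name to modified UTF-7."""
    out, pending = [], []
    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            if pending:  # close the current shift sequence first
                out.append(_encode_run("".join(pending)))
                pending = []
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    if pending:
        out.append(_encode_run("".join(pending)))
    return "".join(out)


def decode_mailbox_name(encoded: str) -> str:
    """Convert a modified UTF-7 folder name back to Unicode."""
    out, i = [], 0
    while i < len(encoded):
        if encoded[i] != "&":
            out.append(encoded[i])
            i += 1
            continue
        end = encoded.index("-", i)  # raises ValueError if unterminated
        chunk = encoded[i + 1:end]
        if not chunk:
            out.append("&")  # "&-" is a literal ampersand
        else:
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)  # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


# Example mailbox name from RFC 3501, section 5.1.3
print(encode_mailbox_name("~peter/mail/台北/日本語"))  # ~peter/mail/&U,BTFw-/&ZeVnLIqe-
print(decode_mailbox_name("Entw&APw-rfe"))            # Entwürfe
```

Because the intermediate form is UTF-16, this covers all scripts alike rather than only ISO-8859-1. A PHP version would need either mbstring/iconv or hand-written UTF-8 to UTF-16 conversion for that step.
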
> Let me know how I can help further to solve this.

Check whether mbstring is installed in your PHP.
> Your support is very much appreciated,
> Eric
