Whenever I go back to the INBOX screen, say from viewing a message or after sending a composed message, the default encoding changes from UTF to ISO-8859-1.
Also, even when the UTF encoding is correctly set, if a message comes in with ISO-8859-1 encoding, it's not converted, and so, it's not viewed correctly.
On Tue, 14 Mar 2006 7:57:59 -0300, Martin Marques martin@bugs.unl.edu.ar wrote:
Whenever I go back to the INBOX screen, say from viewing a message or after sending a composed message, the default encoding changes from UTF to ISO-8859-1.
Sorry, this wasn't totally accurate. This behaviour only happens when sending a composed message. When it gets back to the INBOX it switches encoding.
Also, even when the UTF encoding is correctly set, if a message comes in with ISO-8859-1 encoding, it's not converted, and so, it's not viewed correctly.
This is also true.
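The symptom reported here (a page sent as UTF-8 but interpreted as ISO-8859-1) can be reproduced outside the browser. The following is only a minimal sketch of the effect, not Roundcube code; the function names `misread_as_latin1` and `repair` are made up for illustration.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as ISO-8859-1,
    which is what a browser does when it assumes the wrong charset."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair(garbled: str) -> str:
    """Undo the misreading: recover the original bytes, decode as UTF-8."""
    return garbled.encode("iso-8859-1").decode("utf-8")


original = "Martín Marqués"
garbled = misread_as_latin1(original)

# Each accented letter is two bytes in UTF-8, so it shows up as two
# ISO-8859-1 characters and the garbled string is longer.
print(len(original), len(garbled))
print(repair(garbled) == original)
```

Because every UTF-8 byte maps to some ISO-8859-1 character, the misreading is reversible, which is why the text looks garbled rather than lost.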
Martin Marques wrote:
On Tue, 14 Mar 2006 7:57:59 -0300, Martin Marques martin@bugs.unl.edu.ar wrote:
Whenever I go back to the INBOX screen, say from viewing a message or after sending a composed message, the default encoding changes from UTF to ISO-8859-1.
Sorry, this wasn't totally accurate. This behaviour only happens when sending a composed message. When it gets back to the INBOX it switches encoding.
Hmmh, it seems that the encoding behavior differs between the GET and POST methods. The 0.1beta has the following line in the .htaccess file:
AddDefaultCharset UTF-8
I thought that this would solve the encoding problem in general (at least on Apache servers).
Also, even when the UTF encoding is correctly set, if a message comes in with ISO-8859-1 encoding, it's not converted, and so, it's not viewed correctly.
This is also true.
If a message specifies its charset in the Content-Type header, RC will attempt to convert it to UTF-8. This does not work for HTML messages that have chars encoded with HTML entities. A decoding function handling HTML entities has to be written for that. Anyone?
Regards, Thomas
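The conversion Thomas describes (read the charset from the Content-Type header, then convert the body to UTF-8) can be sketched as follows. This is not Roundcube's implementation, which is written in PHP; it is a Python illustration using the standard `email` package, and the function name `body_as_utf8` and the ISO-8859-1 fallback are assumptions made for the example. It handles single-part messages only.

```python
from email import message_from_bytes


def body_as_utf8(raw_message: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return the body of a single-part message re-encoded as UTF-8.

    The charset comes from the Content-Type header; the fallback is an
    assumption for messages that do not declare one.
    """
    msg = message_from_bytes(raw_message)
    charset = msg.get_content_charset() or fallback
    # decode=True undoes the Content-Transfer-Encoding (e.g. quoted-printable)
    payload = msg.get_payload(decode=True)
    return payload.decode(charset, errors="replace").encode("utf-8")


sample = (
    b"Content-Type: text/plain; charset=ISO-8859-1\n"
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"M=FCnchen\n"
)
print(body_as_utf8(sample))
```

If the header is missing, the result depends entirely on the fallback, which matches the suspicion raised later in the thread that the problem mails carry no charset declaration.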
On tis, 2006-03-14 at 16:05 +0100, Thomas Bruederli wrote:
If a message specifies its charset in the Content-Type header, RC will attempt to convert it to UTF-8. This does not work for HTML messages that have chars encoded with HTML entities. A decoding function handling HTML entities has to be written for that. Anyone?
But aren't HTML entities already charset agnostic?!
Do HTML entities really have to be transformed in any way?
(Sorry that I jump into this discussion without fully reading up on what has been said before, but it just sounds so weird to me.)
/Håkan
Håkan Lindqvist wrote:
On tis, 2006-03-14 at 16:05 +0100, Thomas Bruederli wrote:
If a message specifies its charset in the Content-Type header, RC will attempt to convert it to UTF-8. This does not work for HTML messages that have chars encoded with HTML entities. A decoding function handling HTML entities has to be written for that. Anyone?
But aren't HTML entities already charset agnostic?!
I guess they aren't. An entity like &#252; represents a single-byte char (ASCII 252; "ü" in ISO-8859-1). As far as I know the browser will not display this entity correctly because it expects double-byte characters.
Please correct me if I'm wrong...
Do HTML entities really have to be transformed in any way?
(Sorry that I jump into this discussion without fully reading up on what has been said before, but it just sounds so weird to me.)
/Håkan
~Thomas
On tis, 2006-03-14 at 16:35 +0100, Thomas Bruederli wrote:
I guess they aren't. An entity like &#252; represents a single-byte char (ASCII 252; "ü" in ISO-8859-1). As far as I know the browser will not display this entity correctly because it expects double-byte characters.
Oh! You're probably right about that, I didn't even think about that nasty variant of html entities.
The "correct" way of encoding ü would obviously be &uuml;, which is what I was thinking of.
/Håkan
Håkan Lindqvist wrote:
On tis, 2006-03-14 at 16:35 +0100, Thomas Bruederli wrote:
I guess they aren't. An entity like &#252; represents a single-byte char (ASCII 252; "ü" in ISO-8859-1). As far as I know the browser will not display this entity correctly because it expects double-byte characters.
Oh! You're probably right about that, I didn't even think about that nasty variant of html entities.
The "correct" way of encoding ü would obviously be &uuml;, which is what I was thinking of.
/Håkan
After a short test, I have to correct myself: FF and IE display all html entities correctly (even the single-byte chars). In that case it's not necessary to write a function that will convert those.
But I have no idea why the characters in Martin's mails are not displayed correctly. The only reason I can imagine is that there's no charset specified in the mail/part headers. Could you check that, Martin?
Regards, Thomas
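The test result above is what one would expect: numeric character references such as `&#252;` name a Unicode code point, not a byte in the page's charset, so they resolve to the same character as the named form `&uuml;`. A small Python sketch using the standard library's `html.unescape` illustrates this; it says nothing about how any particular browser is implemented.

```python
import html

numeric = html.unescape("&#252;")  # decimal reference to code point 252
named = html.unescape("&uuml;")    # named entity for the same character

# Both forms resolve to U+00FC, whatever encoding the page is sent in.
print(numeric == named, hex(ord(numeric)))

# The page encoding only matters when the character is turned into bytes.
print(numeric.encode("iso-8859-1"), numeric.encode("utf-8"))
```

The character occupies one byte in ISO-8859-1 and two in UTF-8, but the entity text itself is plain ASCII and survives either encoding unchanged.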
On Tue, 14 Mar 2006 16:05:40 +0100, Thomas Bruederli roundcube@gmail.com wrote:
Hmmh, it seems that the encoding behavior differs between the GET and POST methods. The 0.1beta has the following line in the .htaccess file:
AddDefaultCharset UTF-8
I thought that this would solve the encoding problem in general (at least on Apache servers).
It's been commented out in CVS. I just uncommented it in my .htaccess
martin@bugs:/var/www/html/roundcubemail$ cat .htaccess
# AddDefaultCharset UTF-8
php_flag display_errors Off
php_value upload_max_filesize 2m

<FilesMatch "(.inc|~)$|^_">
Order allow,deny
Deny from all
</FilesMatch>

Order deny,allow
Allow from all
martin@bugs:/var/www/html/roundcubemail$ cvs diff .htaccess
Index: .htaccess
===================================================================
RCS file: /cvsroot/roundcubemail/roundcubemail/.htaccess,v
retrieving revision 1.4
diff -u -r1.4 .htaccess
--- .htaccess 3 Mar 2006 16:34:30 -0000 1.4
+++ .htaccess 14 Mar 2006 18:31:32 -0000
@@ -1,5 +1,5 @@
 # AddDefaultCharset UTF-8
-php_flag display_errors On
+php_flag display_errors Off
 php_value upload_max_filesize 2m

 <FilesMatch "(.inc|~)$|^_">
If a message specifies its charset in the Content-Type header, RC will attempt to convert it to UT
By the way, when trying to reply to this message (like right now) with RC, the message gets cut right here. Very odd.
On Tue, 14 Mar 2006 17:01:51 +0100, Thomas Bruederli roundcube@gmail.com wrote:
After a short test, I have to correct myself: FF and IE display all html entities correctly (even the single-byte chars). In that case it's not necessary to write a function that will convert those.
But I have no idea why the characters in Martin's mails are not displayed correctly. The only reason I can imagine is that there's no charset specified in the mail/part headers. Could you check that, Martin?
The mail is from a very important newspaper, which sends me a daily digital version. I don't have a mail from them right now (I should be receiving one at any moment), but here is another mail with a similar problem. Here are the relevant headers:
Martin Marques wrote:
On Tue, 14 Mar 2006 17:01:51 +0100, Thomas Bruederli roundcube@gmail.com wrote:
After a short test, I have to correct myself: FF and IE display all html entities correctly (even the single-byte chars). In that case it's not necessary to write a function that will convert those.
But I have no idea why the characters in Martin's mails are not displayed correctly. The only reason I can imagine is that there's no charset specified in the mail/part headers. Could you check that, Martin?
<snip>
Rara vez =96y no es asunto de =E9poca o idiosincracias= nacionales=96 los asuntos
I don't know if this is the same problem or even close..
I have noticed the same on one test server I'm using a bit. I had not looked into it too much, and kinda assumed it was a bug or something - something that would be worked out.
When investigating I found that I had this in the httpd.conf file for apache: AddDefaultCharset on
Changing this to: AddDefaultCharset off
- fixed it for me.
I had left a comment in the conf file about something like cross site scripting problem, and that was the reason I added that entry. As I said, this is a testing box, so it has some rather messy config files that are not properly documented...
Perhaps you should check that you haven't done something similar.
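One more thing the quoted-printable fragment in the snipped mail hints at: `=E9` is "é" in ISO-8859-1, but `=96` falls in a range that ISO-8859-1 reserves for control characters. In Windows-1252 that byte is an en dash. The sketch below assumes, without having seen the headers, that the mail was written in Windows-1252; it only shows how the same bytes decode under the two charsets.

```python
import quopri

# Shortened fragment of the quoted line from the mail body
fragment = b"Rara vez =96y no es asunto de =E9poca"

raw = quopri.decodestring(fragment)  # undo the quoted-printable encoding

as_latin1 = raw.decode("iso-8859-1")  # 0x96 becomes an invisible control char
as_cp1252 = raw.decode("cp1252")      # 0x96 becomes an en dash

print(repr(as_latin1))
print(repr(as_cp1252))
```

If the mail declares ISO-8859-1 (or nothing) while actually using Windows-1252, a strict conversion to UTF-8 would carry the control character through, and the dash would not display.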
On Wed, 15 Mar 2006 22:32:28 +0100, Tor Bendiksen tor@tblab.net wrote:
Martin Marques wrote:
When investigating I found that I had this in the httpd.conf file for apache: AddDefaultCharset on
Changing this to: AddDefaultCharset off
- fixed it for me.
My .htaccess has that line commented out, so it falls back to the default, which is off, just like yours.
Any ideas?
On Wed, 15 Mar 2006 19:01:28 -0300, Martin Marques martin@bugs.unl.edu.ar wrote:
On Wed, 15 Mar 2006 22:32:28 +0100, Tor Bendiksen tor@tblab.net wrote:
Martin Marques wrote:
When investigating I found that I had this in the httpd.conf file for apache: AddDefaultCharset on
Changing this to: AddDefaultCharset off
- fixed it for me.
My .htaccess has that line commented out, so it falls back to the default, which is off, just like yours.
Any ideas?
I just reverted back to AddDefaultCharset on
in httpd.conf
And: AddDefaultCharset off
in .htaccess
Still same problem with encoding. You can probably see your sig below is messed up for me.
--
Lic. Martín Marqués   | SELECT 'mmarques' ||
Centro de Telemática  | '@' || 'unl.edu.ar';
Universidad Nacional  | DBA, Programador,
del Litoral           | Administrador