Format of GitHub messages in SVN list


Michael Heydekamp

11 Dec 2012

2:25 p.m.

I sent this message to the SVN list already, but it appears that it didn't arrive there (I also did not receive a rejection via a moderator message).

Since the move from SVN to GitHub, the format of the messages in the SVN list looks a bit strange to me:

1) Roundcube is displaying the attachment icon for each message, although there is no attachment at all in any of the messages.

2) All messages are declared as "multipart/mixed" for no real reason (the second part just contains the signature, which could perfectly well be appended to the first part without creating a second part).

3) The Content-Type of the first part (i.e. the real content of the message) is "text/plain; charset=UTF-8", the Content-Transfer-Encoding "7bit". If CTE is 7bit, shouldn't the charset be "US-ASCII" then? The UTF-8 declaration will probably (and unnecessarily) invoke a decoding routine in most MUAs, although there is nothing to decode.

2) and 3) are minor issues, but 1) (the attachment icon) appears to be a wrong indication.

Not sure if the declaration of the messages is misleading in some way, or if Roundcube's interpretation is wrong? If the latter, this could also affect any other messages.

Cheers,

Michael Heydekamp Co-Admin freexp.de Düsseldorf/Germany


Jeroen van Meeuwen (Kolab Systems)

12 Dec 2012

4:19 a.m.

On 2012-12-11 19:25, Michael Heydekamp wrote:

...

I sent this message to the SVN list already, but it appears that it didn't arrive there (I also did not receive a rejection via a moderator message).

Since the move from SVN to GitHub, the format of the messages in the SVN list looks a bit strange to me:

1) Roundcube is displaying the attachment icon for each message, although there is no attachment at all in any of the messages.

Actually GitHub sends the original message as a mime part, and mailman therefore "attaches" the mailing list footer (Content-Disposition: inline, though).

...

2) All messages are declared as "multipart/mixed" for no real reason (the second part just contains the signature, which could perfectly well be appended to the first part without creating a second part).

Here too, the Content-Type is slightly different - the original GitHub message and the mailman mailing list footer claim to use a different charset.

...

3) The Content-Type of the first part (i.e. the real content of the message) is "text/plain; charset=UTF-8", the Content-Transfer-Encoding "7bit". If CTE is 7bit, shouldn't the charset be "US-ASCII" then? The UTF-8 declaration will probably (and unnecessarily) invoke a decoding routine in most MUAs, although there is nothing to decode.

I suppose there's only nothing to decode (most of the time) because there's no actual utf-8 characters included in the message - but a commit message may contain utf-8 characters, of course.

Kind regards,

Jeroen van Meeuwen

-- Systems Architect, Kolab Systems AG e: vanmeeuwen at kolabsys.com m: +44 74 2516 3817 w: http://www.kolabsys.com pgp: 9342 BF08

Michael Heydekamp

12:14 p.m.

Am 12.12.2012 10:19, schrieb Jeroen van Meeuwen (Kolab Systems):

...

On 2012-12-11 19:25, Michael Heydekamp wrote:

...

...
Since the move from SVN to GitHub, the format of the messages in the SVN list looks a bit strange to me:

1) Roundcube is displaying the attachment icon for each message, although there is no attachment at all in any of the messages.

Actually GitHub sends the original message as a mime part, and mailman therefore "attaches" the mailing list footer (Content-Disposition: inline, though).

Sure, but wasn't that the same procedure before the move to GitHub, when those messages were still coming from trac@roundcube.net?

But anyway: Roundcube apparently already detects that there is no attachment, as it doesn't offer to download any attachments (below the header section). So why does it display the attachment (paper clip) icon in the attachment column then...?

...

...

2) All messages are declared as "multipart/mixed" for no real reason (the second part just contains the signature, which could perfectly well be appended to the first part without creating a second part).

Here too, the Content-Type is slightly different - the original GitHub message and the mailman mailing list footer claim to use a different charset.

Yes, I can see that, but I just don't see the reason why. The list footer is declared as US-ASCII, the first part of the message (apparently blindly) as UTF-8. As US-ASCII is a subset of UTF-8, there is no reason to create a second part - even if the entire message (including the footer) would be declared as UTF-8, it would be perfectly correct.
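The subset claim above can be checked directly. This is only a minimal sketch: the footer and body strings are made-up placeholders, and it says nothing about how Mailman actually decides whether to append the footer or attach it as a separate part.

```python
# Hypothetical sample texts, not taken from any real list message.
footer = "_______________\nList footer (placeholder)\n"
body = "Commit message mentioning Düsseldorf\n"

def same_bytes_in_both(text):
    """True if text encodes to identical bytes as US-ASCII and as UTF-8."""
    try:
        return text.encode("us-ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        # Text contains non-ASCII characters, so it is not pure US-ASCII.
        return False

def merge_as_utf8(first, second):
    """Concatenate two text parts and encode the result as one UTF-8 body."""
    return (first + second).encode("utf-8")

merged = merge_as_utf8(body, footer)
```

Because every US-ASCII byte sequence is also valid UTF-8 with the same meaning, the merged body decodes back to the original text without loss.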

...

...

3) The Content-Type of the first part (i.e. the real content of the message) is "text/plain; charset=UTF-8", the Content-Transfer-Encoding "7bit". If CTE is 7bit, shouldn't the charset be "US-ASCII" then? The UTF-8 declaration will probably (and unnecessarily) invoke a decoding routine in most MUAs, although there is nothing to decode.

I suppose there's only nothing to decode (most of the time) because there's no actual utf-8 characters included in the message - but a commit message may contain utf-8 characters, of course.

Right, it MAY. But if it doesn't, then there is no need to declare UTF-8, IMO.

Anyway, that leads me to a different question in terms of composing messages with Roundcube in general: Is there any chance to check the content of a message upon sending and to declare the "least invasive" charset, rather than blindly declaring UTF-8 no matter what?

I'm using the plugin "sendcharset" to avoid blindly declaring UTF-8 always, but this approach is just as blind (because if I use characters outside ISO-8859-1, they will not be transferred and displayed correctly, but just as "?").

I can see that you are declaring "us-ascii" with your message (how did you do that?) and are using "Roundcube Webmail/0.9-0.15.git0fa54df6.el6.kolab_3.0". Did you already implement a routine that is checking the body of a message and then declaring the least invasive charset?

If so, I'd like to have it. ;) If not, what would happen if you would use just one 8bit character in your message - still declaring "us-ascii"...?

Regards,

Michael Heydekamp Co-Admin freexp.de Düsseldorf/Germany

A.L.E.C

12:34 p.m.

On 12/12/2012 06:14 PM, Michael Heydekamp wrote:

...

Sure, but wasn't that the same procedure before the move to GitHub, when those messages were still coming from trac@roundcube.net?

But anyway: Roundcube apparently already detects that there is no attachment, as it doesn't offer to download any attachments (below the

It's (most likely) because the attachment is marked inline and is plain text.

...

header section). So why does it display the attachment (paper clip) icon in the attachment column then...?

It's just because of multipart/mixed content-type. When the list is displayed we don't know the content of the message.
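The difference between the two views can be sketched with Python's standard `email` package. This is an illustration only, not Roundcube's code: the sample message, the boundary string and both function names are invented for the example.

```python
from email import message_from_string

# Made-up message shaped like the ones discussed: a text part plus an
# inline text footer inside multipart/mixed.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="b1"

--b1
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

Commit message body.
--b1
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline

_______________
List footer
--b1--
"""

def flag_from_headers_only(msg):
    # Cheap check a message list can do: top-level Content-Type only.
    return msg.get_content_type() == "multipart/mixed"

def flag_from_structure(msg):
    # Needs the parsed body: look for parts that are real attachments.
    for part in msg.walk():
        if part.is_multipart():
            continue
        if part.get_content_disposition() == "attachment" or part.get_filename():
            return True
    return False

msg = message_from_string(RAW)
```

Here the header-only check flags the message while the structural check does not, which matches the behaviour described: a paper clip in the list, but no attachment offered in the message view.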

-- Aleksander 'A.L.E.C' Machniak LAN Management System Developer [http://lms.org.pl] Roundcube Webmail Developer [http://roundcube.net] --------------------------------------------------- PGP: 19359DC1 @@ GG: 2275252 @@ WWW: http://alec.pl

Michael Heydekamp

12:54 p.m.

Am 12.12.2012 18:34, schrieb A.L.E.C:

...

On 12/12/2012 06:14 PM, Michael Heydekamp wrote:

...

...
[...] So why does it display the attachment (paper clip) icon in the attachment column then...?

It's just because of multipart/mixed content-type. When the list is displayed we don't know the content of the message.

Right, that does indeed explain it.

So the format of those messages should be optimized...?

Regards,

Michael Heydekamp Co-Admin freexp.de Duesseldorf/Germany

Michael Heydekamp

1:18 p.m.

Am 12.12.2012 18:14, schrieb Michael Heydekamp:

...

Anyway, that leads me to a different question in terms of composing messages with Roundcube in general: Is there any chance to check the content of a message upon sending and to declare the "least invasive" charset, rather than blindly declaring UTF-8 no matter what?

I'm using the plugin "sendcharset" to avoid blindly declaring UTF-8 always, but this approach is just as blind (because if I use characters outside ISO-8859-1, they will not be transferred and displayed correctly, but just as "?").

I can see that you are declaring "us-ascii" with your message (how did you do that?) and are using "Roundcube Webmail/0.9-0.15.git0fa54df6.el6.kolab_3.0". Did you already implement a routine that is checking the body of a message and then declaring the least invasive charset?

Ha, got it!

I just sent a message to this list, deliberately (and unnecessarily) declared as UTF-8, although it did contain 7bit chars only (I replaced the "ü" in my default signature for this list with "ue" to achieve that).

So it has been sent to the list with those headers:

...

Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

Uh, Roundcube apparently detects that no 8bit chars are contained in the message. I didn't know that yet.
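A check of this kind is simple to write. The sketch below is an assumption about what such a routine might look like, not Roundcube's actual implementation; the function name is hypothetical. It applies the 7bit constraints from the MIME specification: no octets above 127, no NUL, and no line longer than 998 octets.

```python
def pick_transfer_encoding(text, charset="utf-8"):
    """Return "7bit" if the encoded text qualifies, else "quoted-printable"."""
    data = text.encode(charset)
    if any(byte >= 0x80 for byte in data):
        return "quoted-printable"      # 8bit octets present
    if 0 in data:
        return "quoted-printable"      # NUL is not allowed in 7bit data
    if any(len(line) > 998 for line in data.splitlines()):
        return "quoted-printable"      # line too long for 7bit
    return "7bit"
```

With the signature spelled "Duesseldorf" the result is 7bit; with "Düsseldorf" it becomes quoted-printable, which is consistent with the headers observed in this thread.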

Looking in the mailing list itself, the message came back with these headers:

...

Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Well, THAT'S amazing! Because it does mean that there must be a routine in Mailman which is able to determine a sort of "least invasive" charset.

If Mailman has such a routine, can't we use it in Roundcube as well?

This message is another test in so far as it will again be declared as UTF-8, but it will still contain an "ü" (i.e. 8bit characters) in the body as well as in the signature.

If Mailman is really clever, it should now change "UTF-8" into "ISO-8859-1". We'll see... (I can only tell after I sent this message.)

Cheers,

Michael Heydekamp Co-Admin freexp.de Düsseldorf/Germany

Benny Pedersen

1:28 p.m.

Michael Heydekamp skrev den 12-12-2012 19:18:

[snip]

...

If Mailman is really clever, it should now change "UTF-8" into "ISO-8859-1". We'll see... (I can only tell after I sent this message.)

if mailman changes content or any headers then it breaks dkim signed msgs

who will report mailman now ?

but yes, on the roundcube side of the problem i agree it's better to use minimal encodings, it could as well be utf-7 imho

Michael Heydekamp

2:27 p.m.

Am 12.12.2012 19:28, schrieb Benny Pedersen:

...

Michael Heydekamp skrev den 12-12-2012 19:18:

[snip]

...
If Mailman is really clever, it should now change "UTF-8" into "ISO-8859-1". We'll see... (I can only tell after I sent this message.)

if mailman changes content or any headers then it breaks dkim signed msgs

who will report mailman now ?

Well, it's not only Mailman. It's nothing new or unusual that the encodings of mails in particular are changed while hopping from server to server, that charset strings are converted from upper to lower case, etc.

I'm not familiar with DKIM signatures, so I can't tell if this is really a problem.
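For readers in the same position, a rough sketch of why re-encoding matters: a DKIM signature carries a hash of the canonicalized message body, so any change to the body bytes yields a different hash. The code below is a simplified approximation of the "simple" body canonicalization (strip trailing empty lines, end with one CRLF), not a DKIM implementation; the function name and sample bodies are invented.

```python
import base64
import hashlib

def simple_body_hash(body: bytes) -> str:
    """Base64 SHA-256 of a body after simplified 'simple' canonicalization."""
    # Drop extra empty lines at the end of the body.
    while body.endswith(b"\r\n\r\n"):
        body = body[:-2]
    # Make sure the body ends with exactly one CRLF.
    if not body.endswith(b"\r\n"):
        body += b"\r\n"
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")

text = "Düsseldorf/Germany".encode("utf-8")

# The same text as it would appear on the wire in two transfer encodings.
qp_body = b"D=C3=BCsseldorf/Germany\r\n"
b64_body = base64.b64encode(text) + b"\r\n"

hash_qp = simple_body_hash(qp_body)
hash_b64 = simple_body_hash(b64_body)
```

The two hashes differ although the decoded text is identical, so a verifier comparing against the sender's original body hash would see a mismatch after such a re-encoding.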

Regards,

Michael Heydekamp Co-Admin freexp.de Düsseldorf/Germany

Benny Pedersen

2:59 p.m.

Michael Heydekamp skrev den 12-12-2012 20:27:

...

Well, it's not only Mailman. It's nothing new or unusual that the encodings of mails in particular are changed while hopping from server to server, that charset strings are converted from upper to lower case, etc.

such servers cannot be used with dkim-signed msgs

...

I'm not familiar with DKIM signatures, so I can't tell if this is really a problem.

that's another problem, or as we say in Denmark, any solution changes the problem

Michael Heydekamp

1:46 p.m.

New subject: Encodings and charsets (was: Format of GitHub messages in SVN list)

Am 12.12.2012 19:18, schrieb Michael Heydekamp:

...

I just sent a message to this list, deliberately (and unnecessarily) declared as UTF-8, although it did contain 7bit chars only (I replaced the "ü" in my default signature for this list with "ue" to achieve that).

So it has been sent to the list with those headers:

...
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

Uh, Roundcube apparently detects that no 8bit chars are contained in the message. I didn't know that yet.

Looking in the mailing list itself, the message came back with these headers:

...
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Well, THAT'S amazing! Because it does mean that there must be a routine in Mailman which is able to determine a sort of "least invasive" charset.

If Mailman has such a routine, can't we use it in Roundcube as well?

This message is another test in so far as it will again be declared as UTF-8, but it will still contain an "ü" (i.e. 8bit characters) in the body as well as in the signature.

If Mailman is really clever, it should now change "UTF-8" into "ISO-8859-1". We'll see... (I can only tell after I sent this message.)

Unfortunately Mailman is not as clever as I was hoping it would be.

The message was sent with these headers:

...

Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

And it came back with these headers:

...

Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64

Charset unchanged, but encoding changed from qp to b64. Hmm, that's not what I hoped to happen.
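To make the qp-versus-b64 point concrete: both transfer encodings are just different wrappers around the same UTF-8 bytes, so the decoded text is unchanged. A small sketch using Python's standard `quopri` and `base64` modules (the sample string and helper name are only for illustration):

```python
import base64
import quopri

# The UTF-8 bytes of a signature line containing an 8bit character.
original = "Düsseldorf/Germany".encode("utf-8")

# Two different transfer encodings of the same bytes.
as_qp = quopri.encodestring(original)
as_b64 = base64.b64encode(original)

def decodes_to_same_bytes(qp_data, b64_data):
    """True if both encoded forms decode to identical raw bytes."""
    return quopri.decodestring(qp_data) == base64.b64decode(b64_data)
```

Quoted-printable keeps mostly-ASCII text readable in the raw source, while base64 does not, which is one reason the switch is visible even though nothing is lost.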

But anyway, Mailman does apparently have some logic which Roundcube could take as a start to declare a least invasive encoding and charset.

For us Western Europeans, a fallforward routine such as US-ASCII -> ISO-8859-1 -> ISO-8859-15 -> Windows-1252 -> UTF-8 (by checking the content of the message, of course) would be just nice.
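Such a fall-forward routine could look roughly like the sketch below. It is an illustration of the idea, not existing Roundcube or Mailman code; the names `WESTERN_CHAIN` and `least_invasive_charset` are invented, and the chain simply follows the order proposed above.

```python
# Candidate charsets in the order proposed above, most restrictive first.
WESTERN_CHAIN = ("us-ascii", "iso-8859-1", "iso-8859-15", "windows-1252", "utf-8")

def least_invasive_charset(text, chain=WESTERN_CHAIN):
    """Return the first charset in the chain that can encode the whole text."""
    for charset in chain:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # some character is not representable, try the next one
        return charset
    # Nothing in the chain fits: fall back to UTF-8, which covers everything.
    return "utf-8"
```

For example, "Duesseldorf" would be declared us-ascii, "Düsseldorf" iso-8859-1, a text containing "€" iso-8859-15, and a text containing "Ł" would fall through to utf-8. A different chain (for instance one starting with ISO-8859-2) could be passed in for other regions.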

In Eastern (but still latin) Europe, different charsets do of course apply.

This would IMO cover more than 90% of the Roundcube users. To avoid making things more complicated than they need to be, all others can/should still declare UTF-8.

These fallforward routines (Western/Eastern Europe) could be selected in the user settings.

Ok, just dreamin'...

Michael Heydekamp Co-Admin freexp.de Düsseldorf/Germany


dev@lists.roundcube.net


participants (4)

A.L.E.C
Benny Pedersen
Jeroen van Meeuwen (Kolab Systems)
Michael Heydekamp