Hi Michael,
On Thu 03/Jan/2013 21:01:41 +0100 Michael Heydekamp wrote:
first of all thanks a lot for your comprehensive response.
You're welcome.
I have to admit that I'm not such an expert in this area as you may believe.
Neither am I :-)
On 03.01.2013 17:41, Alessandro Vesely wrote:
On Mon 31/Dec/2012 17:38:06 +0100 Michael Heydekamp wrote:
From a practical point of view, the still open question for us is: How to (re)build a fixed magic.mgc that at least supports "Return-Path:", "Return-path:" and "return-path:"?
libmagic provides for case-insensitive comparisons. For file-5.12, the latest version, there are "c" and "C" switches that, used together, make the comparison fully case-insensitive. For file-5.04, the version on Debian squeeze, only "c" seems to be provided, hence it would be necessary to rewrite the magic test string in all-lowercase.
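To illustrate why an all-lowercase test string is enough when only "c" is available, here is a rough Python sketch of how I understand the flags to behave: with "c", lowercase characters in the magic string match either case in the file, and with "C", uppercase characters in the magic string do. The function name magic_string_match is made up for this example, and the code is an approximation of the comparison, not libmagic's actual implementation.

```python
def magic_string_match(pattern: str, data: str, flags: str = "") -> bool:
    """Approximate a magic 'string' test with optional c/C flags.

    Hypothetical helper for illustration only; it is not part of libmagic.
    """
    if len(data) < len(pattern):
        return False
    for p, d in zip(pattern, data):
        if p == d:
            continue
        # "c": a lowercase pattern character also matches its uppercase form.
        if "c" in flags and p.islower() and p.upper() == d:
            continue
        # "C": an uppercase pattern character also matches its lowercase form.
        if "C" in flags and p.isupper() and p.lower() == d:
            continue
        return False
    return True


if __name__ == "__main__":
    for first_line in ("Return-Path: <a@example.org>",
                       "Return-path: <a@example.org>",
                       "return-path: <a@example.org>"):
        print(first_line, magic_string_match("return-path:", first_line, "c"))
```

Under that reading, a mixed-case test string such as "Return-Path:" with "c" alone would still miss an all-lowercase header, which is why the lowercase rewrite is needed for file-5.04.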
Here's my first dumb question: What exactly is the connection between the PHP function 'finfo' and the Linux command 'file'? Does 'finfo' internally use the output of the 'file' command?
Since file-4.00 the package features libmagic, and the file command is implemented as a client of that library. PHP uses libmagic as well. See http://en.wikipedia.org/wiki/File_%28command%29
BTW, that spurious text/x-mail mime type has been removed from the source in file-5.10.
I know, I have the sources of file-5.11 on our system as well (see my post to this list of 27.12.2012, 23:53).
But when I run a compiled file-5.11 against an .eml with "Return-path:" in the first line, it produces the same result as file-5.04: "text/x-mail". Stunning...
I submitted a patch to fix that right now: http://bugs.gw.com/view.php?id=226 (I had tried before, via email, but their mailing list seems to dislike attachments.) Maybe it will arrive with file-5.13?
The resulting behavior is equivalent to using MAGIC_NO_CHECK_TOKENS. PHP doesn't define such a constant, but it is defined as 0x100000 in src/magic.h. What happens if you run, say,
php -r '$f=new finfo(FILEINFO_MIME|0x100000); print $f->file("YOUR.eml")."\n";'
Although I don't understand exactly what I am (and this command is) doing, the result is:
"text/plain; charset=us-ascii"
That's equivalent to issuing "file --exclude tokens". That way you rely on magic files only, smoothing the transition through file-5.10.
So not "text/x-mail" anymore, but also still not "message/rfc822".
If I replace file-5.04 with file-5.11 and run the command above, I get the same "text/plain" result.
Yes, investigating a text/plain file further, in order to determine if it is of type message/rfc822, text/rfc822-headers, or text/news, would seem safer than dealing with text/x-mail.
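As a sketch of what "investigating further" could look like, the following Python checks whether a text/plain file starts with a block of header-like lines. The names looks_like_rfc822 and TYPICAL_HEADERS are made up for this example, and the header list is only an illustrative guess; it does not tell message/rfc822 apart from text/rfc822-headers or text/news, which would need additional checks.

```python
import re

# Illustrative assumption: header names that commonly appear in mail messages.
TYPICAL_HEADERS = {"return-path", "received", "from", "to",
                   "date", "subject", "message-id"}

# A field name is printable ASCII without space or colon, followed by a colon.
FIELD = re.compile(r"^([!-9;-~]+)[ \t]*:")


def looks_like_rfc822(text: str) -> bool:
    """Heuristic: leading lines are header fields, at least one is typical."""
    seen = set()
    for line in text.splitlines():
        if line == "":
            break  # a blank line ends the header block
        if line[0] in " \t":
            continue  # folded continuation of the previous header
        match = FIELD.match(line)
        if match is None:
            return False  # not a header line, so not a message
        seen.add(match.group(1).lower())  # compare names case-insensitively
    return bool(seen & TYPICAL_HEADERS)


if __name__ == "__main__":
    sample = "Return-path: <a@example.org>\r\nSubject: test\r\n\r\nbody\r\n"
    print(looks_like_rfc822(sample))
```

Because the field names are lowercased before the lookup, "Return-Path:", "Return-path:" and "return-path:" are all treated alike.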
(Although I have file-5.04 too, on debian squeeze, I'm unable to get text/x-mail: I get text/plain for "Return-path:".)
Strange. Probably because of a different magic.mgc file...?
Or just a different .eml.
Plus, finfo is apparently also able to load a plain text file instead of a compiled magic.mgc, but it doesn't load every magic text file that I can find on our server. I will have to dig a little deeper into that issue, unless somebody else can help with more specific information.
What PHP version does roundcube require?
According to http://www.roundcube.net/about: PHP Version 5.2.1 or greater
Hm... that leaves too wide a choice of possible magic.mgc formats.
According to the Notes at http://php.net/manual/en/function.finfo-open.php, PHP >= 5.3.11 and >= 5.4.1 upgraded to libmagic 5.
Well, we're running 5.3.3-7+squeeze14 with Suhosin-Patch (cli), AFAICS.
Same here. Obviously, it's easier to target a specific version, especially if recent.
However, slight changes in the format of the magic file are possible even between minor versions.
Yeah, but I have no idea how to compile a magic.mgc at all (which text file I have to use as a basis, where to get the latest one, what exactly to do with it, etc.).
That should be "file -C -m magic". If you build from sources, you can just issue "make" after altering any magic/Magdir/*.
Ciao Ale