[RCD] junk processing flow needs two Junk folders, not just one
rq at akl.lt
Sun Feb 26 14:32:24 CET 2012
2012.02.26 00:26, Brian J. Murrell wrote:
> But that does not recognize that there are two types of junk: the mail
> that the mail system determined is spam (let's call this tagged spam)
> and wants to quarantine for the user to sift through for false
> positives. The other type of "Junk" is the spam that the mail system
> did not determine for the user (let's call this untagged spam) and that
> the user wants to tell the mail system is spam so that it can learn.
> So the user needs two folders for these different types of messages, for
> a couple of reasons. The first reason is that it's a waste of the user's
> time to put the untagged spam into the same folder that is meant to be
> the folder that the user sifts through to find falsely tagged spam.
> Secondly, the user does need a folder to put untagged spam so that the
> mail system has somewhere it can go get messages that the user wants it
> to use to learn about what spam is. And this folder shouldn't be the same
> folder that the tagged spam has gotten put into since we don't want/need
> the mail system to learn from messages it's already tagged as spam.
It's probably a matter of personal taste, but I feel OK with the mail
system simply not accepting messages that are clearly spam and
prepending "***SPAM***" to the subjects of messages that look very much
like spam, but could possibly be legitimate. None of that requires a
separate folder. On the other hand, if you don't delete positives, there
are two options:
1) your filter is suspicious and marks most spam as spam. In this case,
there probably aren't that many messages for the user to mark as spam
manually, so even if you put them all in one place, the increase of work
to find that important false positive won't be noticeable.
2) the filter is relaxed and the user marks more spam manually than the
system does automatically. In this case, false positives are soooo
unlikely that it's not even worth considering.
And even if you don't fall into one of those categories, there's still
another argument not to bother about two folders: the search field. The
user can always use it to look for that particular ham message in the
spam folder. All she has to know is (a part of) the sender address or
As for the second reason (learning), I think you could easily teach the
system not to learn from messages that already have "X-Spam-Flag: Yes"
or a similar header that it has set itself. In any case, with your
setup, additional measures are needed to prevent the system from learning
from the same message again: it has to either delete the messages
it has learned from, or move them to another (third?) folder, or keep
track of them, or add a header to them and look for it next time. Either
way, it's troublesome.
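A minimal sketch of the header check described above, in Python. The
function name should_learn and the two sample messages are hypothetical;
only the X-Spam-Flag header comes from this thread. The comparison is
case-insensitive because both "Yes" and "YES" appear in this discussion.

```python
from email import message_from_bytes


def should_learn(raw_message: bytes) -> bool:
    """Return True if the filter has NOT already flagged this message."""
    msg = message_from_bytes(raw_message)
    # Header lookup is case-insensitive; a missing header yields "".
    flag = str(msg.get("X-Spam-Flag", ""))
    return flag.strip().lower() != "yes"


# Hypothetical sample messages, for illustration only.
tagged = b"Subject: ***SPAM*** cheap pills\r\nX-Spam-Flag: YES\r\n\r\nbody\r\n"
untagged = b"Subject: cheap pills\r\n\r\nbody\r\n"
```

With a check like this, a learning job could skip anything the filter
tagged itself, even if tagged and untagged spam share one folder.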
Add to that the fact that what one user considers spam might be ham for
another (and vice versa), and such learning becomes even more complicated.
There is an alternative to learning from a folder of untagged spam. The
Mark as Junk plugin allows you to specify commands to run with a message to
teach SpamAssassin about (non-)spam. Why not use these?
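As a rough illustration of the kind of command meant here, the sketch
below builds an sa-learn invocation for a single saved message. This is
an assumption about what such a command could look like, not the
plugin's actual configuration: the helper names learn_command and learn
are hypothetical, and how the plugin hands the message to the command is
not shown in this thread.

```python
import subprocess
from typing import List


def learn_command(kind: str, message_path: str) -> List[str]:
    """Build an sa-learn command line for one message file."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    # sa-learn accepts --spam or --ham followed by the message file.
    return ["sa-learn", "--" + kind, message_path]


def learn(kind: str, message_path: str) -> int:
    """Run sa-learn (must be installed) and return its exit status."""
    return subprocess.run(learn_command(kind, message_path)).returncode
```

Running the command at the moment the user marks a message means no
second folder has to be scanned later.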
> On my system here those two folders are "Junk" and "spam" (respectively).
No offence, but to me, this distinction in names looks rather lame. If I
saw those two folders next to each other, my first idea would be that
there is/was a glitch/misconfiguration somewhere which resulted in this
situation. On the other hand, "Junk (auto)" and "Junk (manual)" would be
> Mail that has the X-Spam-Flag header set to "YES" is
> put into "Junk" (and does not need to be used to learn about spam from)
> and messages that are in the user's INBOX that are actually spam should
> be moved to "spam". A process on the mail system goes through the
> "spam" folders of all of the users and pushes those messages through the
> spam-learning process.
> Am I going about this all wrong? Does anyone else see the need for two
> different folders (three if you bring the "ham" into the discussion) for
> spam processing?
Again, I think it's a matter of personal taste and perspective, but to
me, your proposed setup looks more confusing than useful.
List info: http://lists.roundcube.net/dev/