Filtering Unsolicited E-mail
Information Systems and Technology
University of Waterloo

29-September-2000

Synopsis

Many users are burdened with unsolicited E-mail -- messages that they did not ask for, do not want and do not appreciate. Some of the unsolicited mail is actually quite offensive. Those users may want to consider filtering their mail. Good tools exist on Unix to sort, file and even toss mail as it arrives, and that can help to deal with unsolicited messages. The mail filtering tools discussed here allow the user to selectively file or discard messages. The rules for doing so are entirely under the control of the user. With a little effort you can easily deal with the burden of unsolicited mail. You can also file your messages into folders intelligently, which can help to organize your reading.

Some mail packages (notably Outlook Express and Netscape Communicator) have tools for filtering messages as they are picked up from the mail server. I'm not going to discuss those tools. The filtering discussed here is the automatic filtering of mail messages at the server as they arrive. The mail server will sort and file things for you even when you're off on holidays or busy elsewhere.

Example:

One of the most popular Unix mail filters is procmail(1). Another filter, not so popular but very effective, is slocal(1). Both are typically not part of the vendor distribution -- they're freely available additions to the system. Since procmail is the most popular I'll only discuss that filter (you'll find a reference below to an earlier paper on using slocal). Configuring your Unix account for procmail filtering is a two stage process:

  1. The user needs to forward their mail through a filter. Usually one forwards mail to another person, but there's nothing wrong with forwarding to a filter that intelligently manages the mail which arrives. I'm forwarding my mail to procmail:
    [8:44am ist] which procmail
    /.software/local/.admin/bins/bin/procmail
    [8:44am ist] more .forward
    |/.software/local/.admin/bins/bin/procmail
    
    The "which" command tells me where procmail is found on my Unix system. At Waterloo procmail is typically found as above. At other sites it may be found elsewhere. The ".forward" file instructs the mail delivery system to pipe my mail to the procmail program.

  2. And the user needs to specify the mail filtering rules -- ie. to specify how the filter should act. For the procmail mail filter you need to specify the rules in a .procmailrc file. The procmail manual pages (which you should read if you need more information) call these recipes.
    [8:48am ist] ls -l .procmailrc
       8 -r--r-----   1 reggers  none        1037 Sep 19 08:59 .procmailrc
    [8:48am ist] more .procmailrc
    # From http://www.uwasa.fi/~ts/info/proctips.html
    #      Timo's procmail tips and recipes
    
    SHELL=/usr/bin/sh               # Check this, is it really?
    MAILDIR=${HOME}/mail            # Check this, is it really?
    LOGFILE=${HOME}/.procmail.log   # Watch out, this grows!!
    
    # Anything to/from a uwaterloo.ca address is probably OK
    
    :0:
    * ^TO_.*uwaterloo\.ca
    ${DEFAULT}
    
    :0:
    * ^From:.*uwaterloo\.ca
    ${DEFAULT}
    
    # Otherwise, it's unsolicited and I don't want it
    
    :0:
    Unsolicited
    

The filtering rules shown (which you might choose to implement) work this way -- if the message is From:, To:, or Cc: a uwaterloo.ca address then file it in the "DEFAULT" folder (ie. your "INBOX" where your mail has always arrived). Otherwise, file it in an "Unsolicited" folder. That's an adequate spam filter for most users. But see the BEWARE section below for some cautionary advice -- it may not be adequate for all users.

My Mail Filter:

I use a procmail filter similar to the above to deal with my mail and to file the unsolicited spam. I'm a member of several mailing lists (here and elsewhere) so mail often arrives with something other than uwaterloo.ca addresses in the headers. I have mail accounts at other sites and mail for those mailboxes is forwarded here. Again, that mail won't have uwaterloo.ca addresses in the headers. The example shown above isn't adequate for me -- much of my important mail would be filed into the Unsolicited folder.

My mail tool at work is Netscape (on a Unix platform) while at home I use Outlook Express (on a Microsoft station). Both are configured to access the IMAP mail service on the ist.uwaterloo.ca server where I have lots of mail folders. Filing unsolicited mail in a folder means I can deal with it at my leisure -- it doesn't clutter my INBOX folder.

First I'll show you what I have, then I'll explain what it means and how it works. This won't be a comprehensive tutorial but will be enough to get you going. This is a procmail filter I have used:

[8:54am ist] more .procmailrc
# From http://www.uwasa.fi/~ts/info/proctips.html
#      Timo's procmail tips and recipes
# $Id: 2000-09-27.html,v 1.2 2000/09/27 13:09:28 reggers Exp $

SHELL=/usr/bin/sh               # Check this, is it really?
MAILDIR=${HOME}/mail            # Check this, is it really?
LOGFILE=${HOME}/.procmail.log   # Watch out, this grows!!

# Some of the mailing lists I belong to. If any of the TO headers (To:,
# Cc:, etc.) match one of them then file in my INBOX.

:0:
* ^TO_.*cert
${DEFAULT}

:0:
* ^TO_.*secure-sol
${DEFAULT}

:0:
* ^TO_.*MICROSOFT_SECURITY
${DEFAULT}

:0:
* ^TO_.*efc-talk
${DEFAULT}

:0:
* ^TO_.*unisog
${DEFAULT}

# Anything to me is probably OK

:0:
* ^TO_.*reggers
${DEFAULT}

# Anything to/from a uwaterloo.ca address is probably OK

:0:
* ^TO_.*uwaterloo\.ca
${DEFAULT}

:0:
* ^From:.*uwaterloo\.ca
${DEFAULT}

# Otherwise, it's unsolicited and I don't want it

:0:
Unsolicited

It should be clear that lines which begin with a `#' are commentary which procmail ignores. At the very top I've defined some configuration values for the SHELL (verify where the Bourne shell is found on your system), the MAILDIR (that's where mail folders are kept) and the LOGFILE (that's a transaction log of procmail actions). There are lots of configuration variables that you can set -- that even includes the DEFAULT folder. I've not found a need to set any more than I've shown.
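
As a hedged illustration of that configuration block, here is a minimal sketch with a few more settings. VERBOSE, LOGABSTRACT and UMASK are standard procmail variables described in procmailrc(5); the values shown, and the commented-out DEFAULT path, are assumptions for illustration and not the settings used on the ist.uwaterloo.ca server.

```procmail
# Sketch only: optional settings placed near the top of .procmailrc.
# The values are illustrative assumptions; check procmailrc(5) on your system.

MAILDIR=${HOME}/mail            # where folders named in recipes are created
LOGFILE=${HOME}/.procmail.log   # transaction log; trim it now and then
VERBOSE=off                     # set to "on" for extra diagnostics while testing
LOGABSTRACT=all                 # log a From/Subject/folder summary per delivery
UMASK=077                       # new folders readable by the owner only

# Hypothetical: uncomment to send "everything else" somewhere other than
# the system mailbox. Leave it alone if your INBOX already works.
#DEFAULT=${MAILDIR}/Inbox
```

Turning VERBOSE on makes the log grow much faster, so it is probably best switched back off once the filter behaves as expected.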

The remainder of the file is a sequence of rules that define how mail ought to be processed. Each rule begins with a `:0:' line (the second colon asks procmail to lock the folder while it writes to it), and the rules are applied one after the other until one is found whose conditions match. The last rule is the simplest -- if all else fails, put it in the Unsolicited folder. The other rules are simple conditions -- if some pattern matches then file the message in some folder.

The rules I've shown so far are pretty much all you need to know. If you want to get fancy procmail will let you be very fancy -- but I'd encourage you to stick with simple rules as above.
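
To make the shape of a rule concrete, the annotated sketch below labels the three parts of one recipe. The list name `hypothetical-list` and the folder `Lists` are made-up names for illustration. The comments sit above the recipe because procmail does not allow comments on condition lines.

```procmail
# Part 1: the ":0" line starts a recipe. The second colon asks procmail to
#         lock the destination folder while it writes the message.
# Part 2: lines starting with "*" are conditions, regular expressions that
#         are matched (ignoring case) against the message header.
# Part 3: the last line is the action, here a folder name relative to MAILDIR.
#
# "hypothetical-list" and "Lists" are placeholder names, not real ones.

:0:
* ^TO_.*hypothetical-list
Lists
```

A recipe with no condition lines at all, like the final Unsolicited rule, always matches, which is why it belongs at the very end of the file.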

Other Filtering Tricks:

The references below will lead you to sites with comprehensive information well beyond what I've shown you so far (eg. see Timo Salmi). Nevertheless I'll give you some teasers to hint, if only a little, at what you can do with procmail.

  • Suppose that I've been filtering as above and have satisfied myself that everything that I've been filing into my Unsolicited folder is really junk and that I can therefore safely toss it instead of filing it. If I wanted to do that I'd change my last rule to:
    # Otherwise, it's unsolicited and I don't want it
    
    :0:
    /dev/null
    
    It's that simple to avoid unsolicited E-mail entirely -- file it in /dev/null (that's the Unix sink hole you can toss garbage at).

  • The examples so far involve a decision about where to file the message based on an address field (TO_ and From:) in the message header. You're not limited to just those headers. A common trick is to file "Urgent" messages based on the Subject: header:
    :0:
    * ^Subject:.*urgent
    Urgent
    
    It should be clear that you can filter your messages on any header and file the message into whatever folders you like. I might decide to file all messages from the various security mailing lists into a "Security" folder that I can browse at my leisure and keep my default folder (my INBOX) for mail that I ought to respond to quickly.

  • The pattern matching within procmail has all the power of Unix regular expressions. The mailing list rules I gave above can also be expressed in a single rule:
    # Some of the mailing lists I belong to. If any of the TO_ headers (To:,
    # Cc:, etc.) match one of them then file in my INBOX.
    
    :0:
    * ^TO_.*(cert|secure-sol|MICROSOFT_SECURITY|efc-talk|unisog)
    ${DEFAULT}
    
    That rule says -- if any recipient header has some string (that's what `.*' means) followed by the word `cert' or the word `secure-sol' and so on, then file it in the default folder. The vertical bar separates possibilities, the round brackets group them. The TO_ expression can be understood, roughly, as a short form for `(To:|Cc:|Resent-to:|Resent-Cc:)'. If you know about Unix regular expressions you may have recognized the hat (`^') as a notation for "the beginning of line".

  • Procmail rules can be stacked as in:
    :0:
    * ^From:.*rwwatt@
    * ^Subject:.*Job Interviews
    Sensitive
    
    That rule says -- if the sender is Roger Watt (`rwwatt' is his userid) and the message is about "Job Interviews" then file it in my "Sensitive" folder. Stacking conditions means each condition must apply (ie. a logical and operation).

  • Procmail rules can be negated as in:
    :0:
    * ! ^From:.*rwwatt@
    * ^Subject:.*Job Interviews
    ${DEFAULT}
    
    That rule says -- if the sender is not Roger Watt and the message is about "Job Interviews" then the message should be filed in my default folder (ie. my INBOX). The exclamation mark (`!') is a meta notation for the logical not operation.
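
Putting two of the tricks above together, the following sketch files the security mailing lists into a "Security" folder with a single alternation. Which of the lists count as security lists is an assumption made for illustration; adjust the names to suit. Since the first rule that matches delivers the message, this recipe would have to appear ahead of any rule that files the same lists in the default folder.

```procmail
# Sketch only: file the security mailing lists into one folder.
# The choice of lists is an assumption made for illustration.
# Place this ahead of any rule that would file the same mail in ${DEFAULT}.

:0:
* ^TO_.*(cert|secure-sol|MICROSOFT_SECURITY|unisog)
Security
```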

I've only given the reader a very brief glimpse at procmail filters. Hopefully it's enough that you can deal with the unsolicited E-mail that comes your way, and perhaps it has given you some ideas about how you might manage your mail.

Beware -- Cautionary Tales:

Mail filtering can be a good idea; it can also get things really mucked up if you're not careful. The following observations are to caution those who are overly cavalier:
  1. If you filter your mail into folders then you need tools that understand folders -- Outlook Express and Netscape Communicator are IMAP clients that understand folders on the IMAP mail server. Make sure you configure them as IMAP clients, not as POP clients (the POP mail protocol only understands your INBOX). You need to be comfortable with manipulating folders before trying to automatically file things in folders.

  2. Any mail filtering you do is your responsibility -- use at your own risk. If you do any filtering test your filter! Don't assume it works -- it's easy to get things wrong. The LOGFILE is a good place to look when things don't make sense.

  3. Neither the procmail mail filter (discussed here) nor the slocal(1) filter (which I've mentioned in passing) is a vendor supported product. They're both freely available tools which we happen to have installed. We have no extra documentation beyond the manual pages and very few are using either tool. The web is a valuable resource though and I've given some links below to sites which should help.

  4. If you're trying to use these filters on Unix systems at other sites (especially those outside Waterloo) make sure your system has the filters before trying to use them. Further, note well the location of the filters at your site. The filing system /.software/local/.admin/bins/bin is very much a localism at Waterloo and the .forward file shown will not apply at other sites.

  5. If you choose to destroy mail which you've filtered as spam (eg. by filing it in /dev/null) it will be gone and cannot be recovered! That can have very nasty consequences if your filter isn't capturing the mail you want. For that reason I recommend that, at least for a while, you file unsolicited mail in a folder and monitor it closely. If it turns out that all you've got there is spam, then change your filtering rules to toss the spam.

  6. If you're a member of any mailing lists -- especially mailing lists at other sites -- you need to be very careful about your filtering rules. You need to put in rules that will capture that mail. Use my procmail rules as a starting point.

  7. If you have a mail account on some other system (eg. say you're on sabbatical and only visiting) and forward your mail here then make sure you capture all your addresses!

  8. It's better to be liberal in your filtering -- at least to start. Eg. you'll note that I've filtered on uwaterloo.ca addresses in the header rather than looking for reggers@ist.uwaterloo.ca. Accepting all uwaterloo.ca addresses necessarily captures my address, it also captures a lot of local mailing lists I'm on. Likewise, filtering on addresses which contain the unisog fragment captures mail for unisog@sans.org and similar addresses should that mailing list move.

  9. Finally, while it might sound appealing to automatically send back nasty notes to the spammers, and this can be easily done with a filter, it is not a good idea! Most of the spam you receive will have a forged return address so replying to that will just result in bounced mail that lands back in your mailbox -- there is no such person as 45934@aol.com even though you just got some spam from him. Worse yet, some of the spam you receive might have a mailing list as the return address. Replying to that makes an existing problem even worse -- you'll be a spammer!
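
One cautious way to act on points 2 and 5 is to keep a copy of everything while a new filter is being tested. The sketch below relies on procmail's `c` flag, which delivers a copy and then carries on with the remaining rules. The folder name `Backup` is an arbitrary choice, and the folder will keep growing until the recipe is removed or the folder is pruned.

```procmail
# Sketch only: put this recipe ahead of all the others while testing.
# The "c" flag files a copy in the Backup folder (a made-up name) and
# lets the original message continue on to the rules that follow.

:0 c:
Backup
```

If a message turns up in the wrong folder, or vanishes into /dev/null, the copy in Backup is still there to recover it from.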

Having said all that, mail filtering can be very helpful. I've shown you what I've done for my mail. It works well for me and will work well for you if you are cautious in developing your personal mail filter.

See Also:

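There's lots more good information on the web; Timo's procmail tips and recipes (http://www.uwasa.fi/~ts/info/proctips.html), cited in the filters above, is one place to start.

Any questions or concerns about this documentation should be addressed to the author.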

(by) Reg Quinton, Information Systems and Technology
29-September-2000