

1.0 Document id

1.1 General

$Id: pm-tips.txt,v 2.5 2002/02/03 22:22:42 jaalto Exp $
$Keywords: procmail, sendmail, formail, mail, UBE, UCE, spam, filter $
$URL: http://pm-doc.sourceforge.net/ $
$Contactid: jari.aalto@poboxes.com $
$UrlLinksLastChecked: 2002-02-02 $

This is a procmail tips page: a collection of procmail recipes, instructions, and howtos. The document also contains URL pointers to the procmail mailing list and to sites that fight Internet UBE. Procmail is a powerful mail handling tool, and a lot of space here is devoted to discussing UBE (aka spam) and its nature. You will also find many other subjects that concern Internet mail in general: mail headers, MIME, and RFCs. Another part of this document is dedicated to Emacs and the Emacs plug-in package Gnus.el, simply because Emacs is the best tool you can use for reading mail and news. Nowadays Emacs is also available on the Windows platform. This is not to say that the existing Unix mail/news programs elm(1), mutt(1), pine(1), or slrn(1) are bad; they are just limited in power compared to Emacs and usually tied to the Unix platform. Finally, to your blessing or curse (smile), the author happens to know Emacs quite well. The tips are compiled from the procmail discussion list, from comp.mail.misc, and from the author's own experiences with procmail.

This document does not intend to teach you the basics of procmail; you should already be familiar with the procmail man pages. The procmail manual pages exist primarily on Unix/Linux platforms. If you are using the Windows operating system, see Cygwin at http://www.cygwin.com/

You may want to read Nancy's and Era's procmail FAQ pages before this page. Era's link page in particular contains an excellent collection of useful procmail links and pointers to Unix programs that deal with mail. If you find errors or things to improve in this document, please send mail to this document's maintainer.

If a URL mentioned here is no longer alive, you may still be able to find it with a WWW search engine such as http://www.google.com/

1.2 What is Procmail?

[FAQ] Procmail is a mail processing utility, which can help you filter your mail, sort incoming mail according to sender, Subject line, length of message, keywords in the message, etc., implement an ftp-by-mail server, and much more. Procmail is also a complete drop-in replacement for your MDA. (If this doesn't mean anything to you, you may not want to know.) Procmail runs under Unix. See Infinite Ink's Mail Filtering and Robots page for information about related utilities for various other platforms, and competing Unix programs, too (there aren't that many of either).

1.3 Abbreviations and thanks

People and documents, abbreviations referred to, and tokens used are listed in no particular order.

[stephen] Stephen R. van den Berg, author of Procmail. Last heard from on the procmail mailing list in 1997-08, using the address srb@cuci.nl. Later, in 1998, due to his regular work and lack of time, he nominated Philip Guenther to head Procmail development.

[aaron] Aaron Schrab aaron+procmail@schrab.com
[alan] Alan K. Stebbens alan.stebbens@openwave.com
[dan] Daniel Smith J.Daniel.Smith@WriteMe.com
[david] David W. Tamkin dattier@mcs.net
[ed] Edward J. Sabol sabol@alderaan.gsfc.nasa.gov
[elijah] Eli the Bearded process@qz.little-neck.ny.us
[hal] Hal Wine hal@dtor.com
[jari] Jari Aalto jari.aalto@poboxes.com
[philip] Philip Guenther guenther@gac.edu
[richard] Richard Kabel rkabel@sequent.com
[sean] Sean B. Straw PSE-L@mail.professional.org
[timothy] Timothy J Luoma luomat+procmail@luomat.peak.org
[walter] Walter Dnes waltdnes@interlog.com

[FAQ] Procmail FAQ era@iki.fi
[manual] Quote from some procmail manual page
[maintainer] As of 2000-09 the maintainer is [jari]
#broken-link Link does not exist any more. A replacement is needed

  • PM-L, Procmail mailing list <http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail> and SmartList <http://MailMan.RWTH-Aachen.DE/mailman/listinfo/smartlist>. See also http://www.procmail.org/era/lists.html
  • FAQ-L, FAQ maintainers mailing list http://www.landfield.com/FAQ-maintainers/FAQ-server/ http://lists.consensus.com/scripts/lyris.pl?visit=FAQ-maintainers http://www.qucis.queensu.ca/FAQs/FAQaid/
  • DING-L, Emacs Gnus mail/newsreader mailing list (ding). http://www.gnus.org/
  • <<text>> Text that has been rephrased or modified and does not exist in the original source.

A big thank you goes to all these people:

  • 1999-06-16 Mark Seiden mis@seiden.com did an enormous amount of work proofreading v1.74. He sent a massive 105k patch with many editorial corrections. My wholehearted thanks to you, Mark.
  • 1999-01-08 Steven Alexander stevena@teleport.com thought that a small perl script would help me fix spelling mistakes more easily. The script has been a much better corrector than I am myself. Thank you. (Being a perl programmer myself, I should have thought of this already, smile)
  • 1999 Guido.Van.Hoecke@se.bel.alcatel.be took 1.48 and sent a huge 55k patch to correct many English language typos. Thank you very much, Guido.
  • 1998-10-28 Richard Kabel rkabel@sequent.com sent a massive patch to correct the language and provided excellent improvement comments. Thank you, Richard, for spending the time on it.
  • 1998 Era Eriksson proofread v1.12 and sent numerous corrections.
  • Karl E. Vogel vogelke@c17mis.region2.wpafb.af.mil sent numerous new anti-spam links to be added to the document.
  • 1998 John Gianni jjg@cadence.com sent some nice recipes: one is now in the procmail module list, and I have added the other ideas to this tips file.
  • 1998 Tim Potter tpot@zip.com.au had a spare moment with v1.27 and sent a lot of spelling corrections. Thank you.

1.4 Version information

Here is a version and file size log of the text file, which gives you some idea of how the document has evolved.
      v2.5    2002-02-01  608  Spelling checked with Emacs ispell
      v2.2    2002-01-28  608  URL links checked and updated
      v2.0    2001-08-09  608  http://pm-doc.sourceforge.net/ opened.
      v1.77   1999-12-27  603  Netscape spam filters added
      v1.76   1999-10-01  602  Mark Seiden's patch applied. Now under CVS.
      v1.74   1999-04-26  599  document moved to www.procmail.org
      v1.72   1999-04-21  597  Links corrected
      v1.71   1999-03-29  597  Ricochet -- Perl script to fight UBE
      v1.70   1999-02-26  592  procmail's Y2K compliance
      v1.69   1999-02-23  590  RFC and using MIME in Usenet postings
      v1.68   1998-01-29  587  Added "Lua" language pointer
      v1.67   1998-01-07  579  Eli's procmail recipes in module section
      v1.66   1998-12-14  578  Philip took care of bugs/patches listing
      v1.64   1998-11-26  602  More Richard's comments integrated
      v1.63   1998-10-30  595  Richard's english correction patch
      v1.60   1998-10-21  591  UMASK, .forward if procmail already is LDA
      v1.58   1998-10-12  583  SmartList and other MLM software discussed
      v1.57   1998-10-06  575  PLUS addr. Convert HTML body to text
      v1.55   1998-08-29  565  Fetching fields with formail -x
      v1.53   1998-08-24  554  Procmail doesn't pass 8bit characters
      v1.52   1998-08-24  553  Flag c forking study, procmail wish list
      v1.51   1998-08-18  541  Small changes. MIME notes
      v1.49   1998-08-10  529  Guido.Van.Hoeck's 55k patch applied
      v1.46   1998-06-24  526  Added live urls to procmail archive
      v1.45   1998-06-23  521  All recipes checked by eye. Many fixes.
      v1.44   1998-06-19  516  Detecting mailing lists with pm-jalist.rc
      v1.41   1998-06-17  510  How to disable recipe quickly with
      v1.36   1998-04-03  493  Includerc rewritten, plus addressing
      v1.34   1998-04-02  488  ORing and supreme scoring added
      v1.32   1998-03-23  471  All recipes checked (by eye)
      v1.31   1998-03-10  469  Better ordering: ORing rules discussed
      v1.29   1998-01-30  429  "regexp" section rewrite.
      v1.24   1997-12-30  415  up till 1996-12 is now included
      v1.17   1997-12-09  343  up till archive 1996-07 now included
      v1.14   1997-11-25  260
      v1.13   1997-11-08  218  Era's correction suggestions.
      v1.10   1997-10-13  181  archive file 1995-10's tips included
      v1.9    1997-10-11  142
      v1.8    1997-10-01  127
      v1.6    1997-09-18  94
      v1.5    1997-09-16  76
      v1.05   1997-09-14  53
      v1.01   1997-09-13  46 (k)

1.5 Document layout and maintenance

In order to be able to maintain this documentation on every possible platform, the base version of this document is kept in text format, which is easily accessible and requires no special editors or learning a markup language like LaTeX, Texinfo, or Linux DocBook SGML. Granted, some other base format may be more suitable for multiple output formats (like PostScript or Emacs info), but in today's world plain text and generated HTML hopefully suffice for all needs. Perl and Emacs are also cross-platform tools (Windows, Unix, ...) and easily installed, so getting to work is hopefully no obstacle. The tools that help maintain this document include (not required!):

The text version of this file was converted into HTML with the following command. You need Perl 5.004 or newer to run the t2HTML.pl script. The --Out option generates the file pm-tips.html in the current directory. Please also familiarize yourself with GNU RCS ident(1), if you have it available. It is important that you mark interesting text for these tools so that someone can get an overview of your supplied files.

      % perl5.004 -S t2HTML.pl                                        \
        --HTML-frame                                                  \
        --title   "Procmail tips page"                                \
        --author  "Jari Aalto"                                        \
        --email   jari.aalto@poboxes.com                              \
        --meta-keywords "procmail, sendmail, mail, filter, FAQ, ube"  \
        --meta-description "Procmail tips page"                       \
        --base      http://pm-doc.sourceforge.net/                    \
        --document  http://pm-doc.sourceforge.net/                    \
        --url       http://pm-doc.sourceforge.net/                    \
        --HTML-body "LANG=en"                                         \
        --Out                                                         \
        pm-tips.txt

1.5.1 Sending improvements

Because I'm not a native English speaker, I regret any typos in the document. If you have 5-10 minutes to spare and find a spelling mistake or a misuse of English verbs, please go ahead and send a patch to the maintainer of this page. The preferred way to send corrections to this document is as diff(1) output. Here is how to make corrections and send them forward. The diff option -u is only available in GNU diff; please send a -u diff if possible. If you don't have the -u option, use the -c option:

      %   cp pm-tips.txt pm-tips.txt.orig

      ... load the pm-tips.txt to your text editor
      ... edit the file and save
      ... Generate the difference (a patch(1) compatible file)

      %   diff -bwu pm-tips.txt.orig pm-tips.txt > pm-tips.txt.patch

      ... Send the content of pm-tips.txt.patch by mail to the document maintainer.

1.6 About presented recipes

The recipes presented here are collected from the net and the procmail archives. The recipes have been kept as close to the originals as possible, but the ideas have been generalized when necessary. If some recipe doesn't work as announced, please a) send a note to the [maintainer], or b) send mail to the procmail mailing list and ask how to correct it. Sometimes a simple dot (.) has been used in regular expressions where the right, pedantic way would have been to use an escaped dot. An unescaped dot matches any character, so if you want to be very strict, you should use the escaped dot where applicable.
      # free hand version     # pedantic version
      :0                      :0
      * match.this.site       * match\.this\.site

Procmail also accepts assignments without quotes, like this:

      var = value
      num = 1
      dir = /var/mail

But in this document a strict style has been adopted, where literal strings are assigned with double quotes:

      var = "value"

That's because the procmail code checker (Emacs package tinyprocmail.el) then won't warn about a missing dollar sign, which might very well have been forgotten. The Emacs package font-lock.el, a syntax highlighting assistant, also displays double-quoted strings in color.

      #   If you do this...

      var = value

      #   then you might have made a typo. It is in fact not clear
      #   what was intended:

      var = "value"   # Did you mean: literal assignment?
      var = $value    # Did you mean: variable assignment?

Recipe flags are also not stuck together, because the visual distinction between :0 and the flags is a valuable one. The reasoning for which flags are kept together, and in which order, is explained later in detail.

      # Erm, all stuck       # This may be visually more clear
      :0ABDc:                :0 A BDc:

1.7 Variables used in recipes

These are part of the procmail module pm-javar.rc and are used in recipes.
      #   Pure newline; typical usage if you want to write
      #   something directly to procmail's active logfile:
      #
      #       LOG = "$NL message $NL"

      NL = "
      "

Refer to the "improving Space-Tab syndrome" section for more details.

      WSPC    = "     "               # whitespace: space + tab

      SPC     = "[$WSPC]"             # Regexp: space + tab
      SPCL    = "($SPC|$)"            # whitespace + linefeed: spc/tab/nl
      NSPC    = "[^$WSPC]"            # negation

      s       = $SPC                  # shortname: like perl -- \s
      d       = "[0-9]"               # A digit -- Perl \d
      w       = "[0-9a-z_A-Z]"        # A word character -- Perl \w
      W       = "[^0-9a-z_A-Z]"       # A non-word character -- Perl \W
      a       = "[a-zA-Z]"            # An alphabetic character

Writing recipes is now a little easier, and they may look clearer, at least to people accustomed to reading Perl regular expression short names:

      :0
      *$ Header-Name:$s+$d+$s+$d      # Matches "Header-Name: 11 12"
      {
          # Matched "whitespace" + "digit" + "whitespace" + "digit"
          # Do  something
      }
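The same idea of building a pattern from short named fragments can be tried in the shell. This is only a sketch under assumptions: it uses grep -E as a stand-in for procmail's egrep-style matching, and the shell variables s and d merely mimic the procmail variables of the same name.

```shell
# Shell stand-ins for the procmail variables above (assumption: grep -E
# is close enough to procmail's egrep-style matching for this pattern).
s='[[:blank:]]'   # space or tab, like $SPC
d='[0-9]'         # a digit, like $d

pattern="Header-Name:$s+$d+$s+$d"

# matches LINE: succeeds if LINE contains the pattern.
matches() {
    printf '%s\n' "$1" | grep -E -q -e "$pattern"
}

if matches "Header-Name: 11 12"; then r1=yes; else r1=no; fi
if matches "Header-Name: eleven 12"; then r2=yes; else r2=no; fi
echo "$r1 $r2"
```

The first header matches and the second does not, because a digit must follow the first run of whitespace.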

SUPREME = 9876543210 is the highest score value, which causes procmail to bail out. [david] Actually the maximum is 2147483647, but 9876543210 is easier to remember/type and will function just as well.

PMSRC = Procmail module source code directory: the location where *.rc files reside. It can be anywhere you want, usually $HOME/pm or $HOME/procmail/lib. Here you can keep the procmail files, log files and includerc scripts. Another commonly used synonym is PMDIR.

SPOOL = Directory where procmail delivers the categorized messages. For example, mailing lists:

      list.procmail, list.lynx-users, list.emacs, list.elm

and work mail:

      work.announcements, work.lab, work.doc, work.customer

and your private messages:

      mail.Usenet, mail.private, mail.default, mail.perl

and unimportant messages:

      junk.daemon, junk.cron, junk.ube

If you read the procmail-delivered files directly, this directory is usually $HOME/Mail or $HOME/mail. If you use some other software that reads these files as mail spool files (like Emacs Gnus), then this directory is typically ~/Mail/spool or similar.

MY_XLOOP = Used to prevent re-sending messages that have already been handled. Typically $LOGNAME@$HOST, but this can be any user-chosen string. Make it unique to your address. In this document the definition is:

      MY_XLOOP = "X-Loop: $LOGNAME@$HOST"

SENDMAIL = Program to deliver composed mail. Usually the standard Unix sendmail(1), but it must be given some switches. See the man page for more. We use the following definition in scripts:

      SENDMAIL = "sendmail -oi -t"

NICE = In a Unix environment you can lower the scheduling priority with nice(1). If you are conscious of how many external processes you launch for each piece of mail, it would be polite to lower the priority of such processes. You may see in this document that external processes are called with NICE enabled:

      :0 w                # Same as "nice -10 script.pl"
      | $NICE script.pl

IS functions = Functions to test file or directory attributes. E.g. IS_EXIST is defined as "test -e" and so on. The definitions of the IS functions are system-dependent. E.g. on Irix the "-e" option is not recognized and the nearest equivalent is "test -r". All IS functions are defined in the pm-javar.rc module.
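How such a system-dependent definition might be chosen can be sketched in plain shell. The probing logic below is an assumption made for illustration and is not taken from pm-javar.rc; only the name IS_EXIST and the two test(1) options come from the text above.

```shell
# Pick a file-existence test that this system's test(1) understands.
# IS_EXIST mirrors the name used above; the probing logic is an assumption.
if test -e / 2>/dev/null; then
    IS_EXIST="test -e"
else
    IS_EXIST="test -r"      # nearest equivalent where -e is missing
fi

tmp=$(mktemp)
if $IS_EXIST "$tmp"; then present=yes; else present=no; fi
rm -f "$tmp"
if $IS_EXIST "$tmp"; then gone=no; else gone=yes; fi

echo "$IS_EXIST: present=$present gone=$gone"
```

Note that "test -r" is only an approximation: it fails for files that exist but are not readable.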

1.8 About "useless use of cat award"

Randal Schwartz, a well-known Perl programmer and Perl book author, started giving awards for the "useless use of cat command" whenever someone wrote examples without the "<" token. Like this:
      % cat file.name.this | wc -l

Instead, he writes, the call should have been written like this, which saves the pipe (never mind that wc can read the file directly; this is an example).

      % wc -l < file.name.this

[Paul David Fardy pdf@morgan.ucs.mun.ca] There is weight in the pipeline, but the true cost is in process startup. Try running wc 100 times on /etc/motd or on this message. My tests show the useless use of cat doubles the real and processing time (real, user, and system time are each roughly doubled):

      $ cat > /tmp/randall <<'EOF'
      [[ -n $COUNT ]] || COUNT=100
      typeset -i i=1
      while (( i < $COUNT )); do
              < /etc/motd wc;
              (( i = i + 1 ))
      done > /dev/null
      EOF

      $ cat > /tmp/useless <<'EOF'
      [[ -n $COUNT ]] || COUNT=100
      typeset -i i=1
      while (( i < $COUNT )); do
              cat /etc/motd | wc;
              (( i = i + 1 ))
      done > /dev/null
      EOF

      $ set -x
      $ export COUNT0
      $ time ksh /tmp/randall
      $ time ksh /tmp/useless

This becomes important, for example, when you decide to filter all your mail with procmail, looking for virus signatures for example. I might well decide to look only at the first 3 or 4 kilobytes. It's not the size of messages (most are small anyway) but the number of messages that causes a problem. Do you want to double the processing cost of all your mail? I'm looking at a system-wide filter for all my users' mail. I'm considering Sendmail's mail filter versus procmail filtering. I'll likely be using a bit of both. And given that all of the filtering is really just getting in the way of legitimate traffic, it'd really piss me off if I naively doubled the cost.


2.0 Procmail pointers

2.1 Where is procmail developed

Philip Guenther guenther@gac.edu is currently taking care of and coordinating procmail bug fixes. Please send any procmail bugs to the mailing list or to bug@procmail.org. The development mailing list runs SmartList at procmail-dev@procmail.org. Newest Procmail code:
      http://www.procmail.org/
      ftp://ftp.procmail.org/

Manual pages

      http://www.voicenet.com/~dfma/intro.html

2.2 About procmail's Y2K compliance

Please consult Philip Guenther guenther@gac.edu for more up-to-date details. Philip is currently the Procmail maintainer.

[1998-09-23 Bennett Todd bet@mordor.net in Message-Id: <19980923164230.C30594@fcmc.com>] Well, from a simple grep over the sources, it looks like there may be a Y2038 problem in the autoconf test code: unsigned otimet = time(). And another, possibly less likely to express itself, in formail.c: unsigned long h1 = time(). Those could express themselves when 32-bit signed time_t wraps; long before then the time_t define should have been changed to something that is bigger, even if it's "long long". The above type-mixes may fail to profit from a suitably redefined time_t, and so may overflow in 2038.

I don't see any Y2K problems, though. And mail headers use four-digit years pretty consistently, so that should all be cool. This estimation doesn't constitute an in-depth Y2K audit of procmail, but the source code to procmail is ... kinda dense for in-depth auditing.

[1998-09-25 Bennett Todd Message-Id: <19980925093902.B12428@fcmc.com>] As I see it there are at least three measures that a whole mail system, taken in aggregate, could use for Y2K checking. First, capture a vast cross-section of traffic and make sure no mail software is using 2-digit years. I don't recall having seen any, but it's still worth checking. Second, generate a load of traffic with 2000 and 2001 dates and shove it through all the channels. And third, run all the systems end-to-end with their system clocks rolling over the millennium.

2.3 Procmail resources

Procmail is discussed in the Usenet newsgroup comp.mail.misc.

Procmail archive
ftp://ftp.informatik.rwth-aachen.de:/pub/packages/procmail/ Articles from the procmail mailing list: covers 1994-08 to 1995-05 (a .gz file, ~2Meg when uncompressed)

The latest articles, covering 1995-10 to the present day, can be found here, hosted by Achim Bohnet ach@mpe.mpg.de. The WWW page has nice search capabilities. http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/ http://www.rosat.mpe-garching.mpg.de/~ach/exmh/archive/procmail/

Era's Procmail FAQ
http://www.iki.fi/~era/procmail/mini-FAQ.html http://www.dcs.ed.ac.uk/~procmail/FAQ/ [mirror] Also available by mail; the ITEM can be: links.html, mini-FAQ.html, procmail-FAQ

      To: era+pr@iki.fi
      Subject: send ITEM
      Subject: send ITEM

Era's Procmail Link collections
http://www.iki.fi/~era/procmail/links.html ...A page full of good links to the world of procmail

Professor Timo Salmi's Procmail page -- EXCELLENT
http://www.uwasa.fi/~ts/info/proctips.html

Joe Gross's short Procmail tutorial
http://www.procmail.net/ jgross@stimpy.net ...Using procmail and a feature of ph you can set up your own mailing list without needing root on your own machine.

Google's procmail pointers
http://directory.google.com/Top/Computers/Software/Internet/Clients/Mail/Unix/Procmail/

Eli on Procmail
1998-12-08 Eli the Bearded *@qz.to announced in comp.mail.misc that he had made his procmail modules available at http://www.FAQs.org/FAQs/mail/addressing/ You may find interesting procmail code there, but the modules themselves are not general purpose plug-in modules that you could use right away. Some of the functionality included:

      Inline decoding of MIME text attachments        (rc.mime-decode)
      Cleansing of obscure "Re:" formats in subject   (rc.pre-list)
      Nifty autoresponder                             (rc.qz-2)
      Sophisticated duplicate mail catching           (rc.dupes)
      Example of using my mail bouncer                (rc.lists-out)
      Detection of some classes of autoreplies        (rc.daemon)
      Various junk mail filtering                     (rc.filter)
      Daily log files                                 (rc.vars)

2.4 Procmail mode for Emacs

If you use Emacs, see the Procmail programming mode tinypm.el at <http://tiny-tools.sourceforge.net/>. It can be used to syntax check procmail recipes. Here is an example of its output:
      *** 1997-11-24 22:13 (pm.lint) 3.11pre7 tinypm.el 1.80
      cd /users/jaalto/junk/
      pm.lint:010: Warning, no right hand variable found. ([$`']
      pm.lint:055: Pedantic, flag orer style is not standard `hW:'
      pm.lint:060: Warning, message dropped to folder, you need lock.
      pm.lint:062: Warning, recipe with "|" may need `w' flag.
      pm.lint:073: Warning, Formail used but no `f' flag found.

2.5 Procmail module library project

2.5.1 Where to get the modules

Procmail module library
Hosted at the sourceforge CVS server and open for anyone to participate. Visit http://pm-lib.sourceforge.net/

Alan's procmail modules
Send subject "send procmail library" to Alan Stebbens alan.stebbens@software.com or alan.stebbens@openwave.com

Concordia scripts
http://alcor.concordia.ca/topics/email/auto/procmail/ ...We provide sample sets of recipes to get you started. The great thing about the concordia scripts is the fact that they are designed to run from a central location and be called from a procmailrc installed in the user's ~/home directory.
webdoc@alcor.concordia.ca

"David's" David Hunt dh@west.net ...My .procmailrc and .forward files can be viewed at http://www.west.net/~dh/homedir/pmdir/

2.5.2 Terminology

subroutine = A piece of code that gets something in INPUT and responds with OUTPUT. A subroutine is not message-specific.

recipe = A piece of code that is somewhat self-contained: it reads something from the message or does something according to matches in the message. A recipe may be message-specific.

2.5.3 Foreword to using modules

In the module listing, some of the modules are recipes and some can be considered subroutines. Let's take the address exploder module that was discussed a while ago. First, visualise the following familiar programming language pseudo code:

(ret-val1, ret-val2 ...) = Function( arg1, arg2, arg3 ...)

A function may return multiple values, and multiple arguments can be passed to it. Clear so far. Let's show how this applies to procmail modules:

      RC_FUNCTION  = $PMSRC/pm-xxx.rc # name the subroutine/module
      RC_FUNCTION2 = ...

      INPUT       = "value"           # Set the arg1 for module
      INCLUDERC   = $RC_FUNCTION      # Call Function( $arg1 )

      :0                              # Examine function ret val
      * ERROR ?? yes
      ...
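For readers who know shell better than procmail, the calling convention resembles sourcing a script that communicates through global variables. The following is a shell analogy only, not procmail code: the temporary file, its contents, and the variable OUTPUT are hypothetical, while INPUT, ERROR, and RC_FUNCTION echo the names used above.

```shell
# Shell analogy only; procmail itself uses INCLUDERC, not ".".
# The file name and its contents are hypothetical.
RC_FUNCTION=$(mktemp)
cat > "$RC_FUNCTION" <<'EOF'
# "Subroutine": reads INPUT, sets OUTPUT and ERROR
ERROR="no"
OUTPUT=""
case "$INPUT" in
    *@*) OUTPUT=${INPUT#*@} ;;   # part after the first "@"
    *)   ERROR="yes" ;;
esac
EOF

INPUT="foo@some.com"          # set the argument
. "$RC_FUNCTION"              # "call" the subroutine
first="$ERROR:$OUTPUT"

INPUT="not-an-address"        # a value the subroutine cannot handle
. "$RC_FUNCTION"
second="$ERROR:$OUTPUT"

rm -f "$RC_FUNCTION"
echo "$first / $second"
```

As with procmail modules, the caller has to know from the subroutine's documentation which variables to set beforehand and which ones to inspect afterwards.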

This should be pretty clear too. You just have to look into the subroutine/module that you intend to use to find out which arguments it wants you to set (INPUT) before calling it. The documentation also tells you what values are returned, e.g. one of them was ERROR.

If it were a recipe/module, the call would be almost the same, but instead of returning values, the recipe/module most likely does something to your message or writes something to the data files etc. A recipe/module is much higher level, because it may call multiple subroutine/modules. The distinction between the subroutine and recipe module types is not crystal clear, but I hope the above clarifies the Procmail module/subroutine/recipe concept a bit.

2.5.4 Header file modules

These are like #include .h files in C: they define common variables, but do not contain actual code.

  • pm-javar.rc -- Defines standard variables: SPC WSPC NSPC SPCL and perl styled \s \d \D \w \W and \a \A (alphabetic characters only)
  • headers.rc -- From Alan's procmail-lib. Defines standard regexps and macros: address, from, to, cc, list_precedence

2.5.5 General modules

  • pm-jafrom.rc -- Derive the FROM field without calling formail unnecessarily. If all else fails, use formail.
  • get-from.rc -- From Alan's procmail-lib. Get the "best" From address. Sets FROM and FRIENDLY, the latter being the "friendly" user name sans address.
  • pm-jaaddr.rc -- Subroutine to extract various mail components from INPUT. Like address=foo@some.com, net=com, account=foo...
  • pm-jastore.rc -- Subroutine for general mailbox delivery. Define MBOX as the folder where to drop the message and this subroutine will store it appropriately. Supports single mboxes, ".gz" mbox files, directory files and MH folders with rcvstore.

2.5.6 Low-level Date and time handling

For these, you get the date string from somewhere, then feed it to some of these subroutines:

  • pm-jatime.rc -- a low-level subroutine. Parse time "hh:mm:ss" from variable INPUT
  • pm-jadate1.rc -- a low-level subroutine. Parse date "Tue, 31 Dec 1997 19:32:57" from variable INPUT
  • pm-jadate2.rc -- a low-level subroutine. Parse ISO standard date "1997-11-01 19:32:57" from variable INPUT
  • pm-jadate3.rc -- a low-level subroutine. Parse date "Tue Nov 25 19:32:57" from variable INPUT
  • pm-jadate4.rc -- Call shell command "date" once to construct RFC "Tue, 31 Dec 1997 19:32:57" and parse the YY MM HH and other values. You usually use this subroutine if you can't get the date anywhere else.
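To give a feel for what such a subroutine does with INPUT, here is a minimal shell sketch that splits the ISO style date into its fields. It is not the code of pm-jadate2.rc; the output variable names are hypothetical, and only shell parameter expansion is used, so no external process is started.

```shell
INPUT="1997-11-01 19:32:57"

# Split "YYYY-MM-DD hh:mm:ss" into fields; variable names are hypothetical.
date_part=${INPUT% *}       # text before the space
time_part=${INPUT#* }       # text after the space

YYYY=${date_part%%-*}
rest=${date_part#*-}
MM=${rest%%-*}
DD=${rest#*-}

hh=${time_part%%:*}
rest=${time_part#*:}
mm=${rest%%:*}
ss=${rest#*:}

echo "$YYYY $MM $DD $hh $mm $ss"
```

Avoiding external commands matters here for the same reason given in the "useless use of cat" discussion: the work is repeated for every incoming message.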

2.5.7 Higher-level Date and time handling

You use these recipes to get the date directly from the message:

  • pm-jadate.rc -- higher-level recipe. Read the date from the message's headers: From_, Received, or call shell date if none succeeds.
  • date.rc -- higher-level recipe. From Alan's procmail-lib: parse the date from the headers Resent-Date:, Date, and From

2.5.8 Forwarding and account modules

  • pm-japop3.rc -- Pop3 movemail implemented with procmail. You can send a "pop3" request to move your messages from account X to account Y. Each message is sent separately. This recipe listens to "pop3" requests.
  • pm-jafwd.rc -- Control forwarding remotely. You can change the forward address with a "control message" or turn forwarding on/off with a "control message"
  • pm-japing.rc -- Send a short reply when the subject contains the word "ping" to show that the account is up and the mail address is valid.
  • correct-addr.rc -- From Alan's procmail-lib. To help forward mail from an OLD address to a NEW address, and do some mailing list mail management. This recipe file is intended to make it easy for users to forward their mail from their old address to a new address, and, at the same time, educate their correspondents about it by CC'ing them with the mail.

2.5.9 Vacation modules

  • pm-javac.rc -- A framework for your vacation replies. This recipe will handle the vacation cache and compose an initial reply, which you only need to fill in (like putting the vacation message in the body).
  • ackmail.rc -- From Alan's procmail-lib. procmail rc to acknowledge mail (with either a vacation message, or an acknowledgment)

2.5.10 Message-id based modules

  • pm-jadup.rc -- Handle duplicate messages by Message-Id. Store duplicate messages in a separate folder.
  • dupcheck.rc -- From Alan's procmail-lib. If the current mail has a "Message-Id:" header, run the mail through "formail -D", causing duplicate messages to be dropped. Can use an MD5 hash in the cache.
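The principle behind Message-Id based duplicate detection can be illustrated with a small shell sketch. This is an illustration only: it keeps a plain text cache with one id per line, which is not how formail -D stores its cache, and the cache file, the helper seen_before, and the sample ids are all hypothetical.

```shell
# Illustration only: a plain-text cache of Message-Ids. This is not how
# formail -D stores its cache; file name and helper are hypothetical.
CACHE=$(mktemp)

# seen_before ID: returns 0 if ID is already cached, else records it.
seen_before() {
    if grep -q -x -F -e "$1" "$CACHE"; then
        return 0
    fi
    printf '%s\n' "$1" >> "$CACHE"
    return 1
}

result=""
for id in "<1@example.com>" "<2@example.com>" "<1@example.com>"; do
    if seen_before "$id"; then
        result="$result dup"
    else
        result="$result new"
    fi
done

rm -f "$CACHE"
echo "$result"
```

A real cache also needs a size limit and locking, since several messages may be delivered at the same time; those concerns are left out of this sketch.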

2.5.11 Cron modules

  • pm-jacron.rc -- A framework for your daily cron tasks. This recipe contains all the needed checks to ensure that your includerc is called whenever the day changes. (The day change is detected from the messages you receive.) Your own cron includerc is run once a day.
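The "run once when the day changes" check can be sketched in shell with a stamp file that remembers the date of the last run. This is a sketch under assumptions and not the logic of pm-jacron.rc: the stamp file, the function daily_task, and the simulated dates are hypothetical. In real use the argument would come from something like date(1).

```shell
# Hypothetical stamp file holding the date of the last daily run.
STAMP=$(mktemp)
runs=0

# daily_task TODAY: run the daily job only if TODAY differs from the stamp.
daily_task() {
    if [ "$(cat "$STAMP")" != "$1" ]; then
        printf '%s\n' "$1" > "$STAMP"
        runs=$((runs + 1))       # the real daily work would go here
    fi
}

# Simulate messages arriving on two different days.
daily_task 1998-06-17
daily_task 1998-06-17
daily_task 1998-06-18

rm -f "$STAMP"
echo "daily job ran $runs times"
```

Because the check is triggered by incoming mail, the daily job does not run on a day when no message arrives.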

2.5.12 Backup modules

  • pm-jabup.rc -- Save messages to a backup directory and keep only N messages per day. Idea by John Gianni, packaged by Jari. Note: the implementation always calls the shell for each message you receive, so using this module is not recommended if you get many messages per day. Instead, use the cron module to clean the backup directory only once a day, and not every time a message arrives.

2.5.13 Confirmation modules

  • pm-jacookie.rc -- Handle cookie (unique id) confirmations. Also known as Procmail authentication service (PAS). This simple procmail module accepts messages only from users who have returned a "cookie" key. You can use this to protect your mailing list from false "subscribe" messages or from getting mail from unknown people, typically spammers who won't send the cookie back to you to "validate" themselves. Uses subroutine pm-jacookie1.rc, which generates the unique cookie; CRC 32 by default.
  • See also Michelle's confirmation module for SmartList

2.5.14 File Servers

  • pm-jasrv.rc -- A Mime Procmail file server (MPFS). It contains all the instructions and supports several MIME encoding types: text/plain and gzip. The keyword SEND is configurable. You can set up as many file servers as you need for different directories by changing the SEND keyword. MPFS supports a password for file access.
  • commands.rc -- From Alan's procmail-lib. Checks for commands in the subject line. Handles commands (send|get) [help|info|procmail info|procmail lib|procmailrc] and a few others.

2.5.15 Mime modules

  • pm-jamime.rc -- Subroutine to read MIME headers and put the mime version, boundary string and content-type information into variables.
  • pm-jamime-decode.rc -- Recipe to decode quoted-printable or base64 encoding in the body.
  • pm-jamime-kill.rc -- Recipe for attachment killing: wipes out the extra mime cruft leaving only the plain text. Applications for killing: ms-tnef attachment (MS Explorer 7k), HTML attachments (Netscape, MS Express), vcard (Netscape), PCX attachment (Lotus Notes).
  • pm-jamime-save.rc -- Recipe for saving a simple file attachment. When you receive ONE file attachment in a message, this recipe can save it in a separate directory. The content is also decoded (base64, qp) while saving.

2.5.16 Filtering message body or headers

  • pm-jadaemon.rc -- Handle DAEMON messages by changing the subject to reflect a) the error reason, b) to whom the message was originally sent and c) what the original subject was. Store the DAEMON messages in a separate folder.
  • pm-jasubject.rc -- Standardize Subject "Re[32]: FW: Sv: message" or any other derivative to the de facto "Re: message".
  • pm-janetmind.rc -- Reformat http://minder.netmind.com/ messages. The default 4k message is shortened to a few important lines.

2.5.17 Miscellaneous modules

  • pm-jaempty.rc -- Check if the message body is empty (nothing relevant). Sets variable BODY_EMPTY to "yes" or "no" accordingly.
  • pm-janslookup.rc -- Run nslookup on a given address. If you compose the return address with "formail -rt -x To:" you can verify that the domain is registered before sending a reply. Uses a cache for domains already looked up.
  • guess-mua.rc -- Guess the Mail User Agent and set MUA: MH,PINE,MAIL

2.5.18 Mailing list modules

  • Microlist, a small mailing list module by david hunt dh@west.net ...This version contains vars set for my environment and needs, and requires resetting of those vars before use. Its exact function and use will remain a mystery until I get a readme file written for it. If anyone wants to use it, I suggest you write to me first. If anyone has any suggestions or criticisms (no matter how harsh) please write http://www.west.net/~dh/homedir/microlist/microlist4.3
  • pm-jalist.rc -- Subroutine to extract the mailing list name from a message. Do you need to add a new recipe to your .procmailrc every time you subscribe to a new mailing list? If you do, take a look at this module, which examines the message and defines variable LIST to hold the mailing list name. You can use it directly to save the messages adaptively to the correct folders. No more hand work and manual storing of mailing list messages.

2.6 Procmail code to filter UBE

Sysadms remember: Spam filtering is much more efficiently done in the MTA, especially if you are just looking at From and To lines. For example, you can set up a rule in Exim that blocks \d.*@aol\.com (that is, any aol.com local part that begins with a digit). AOL guarantees that none of their addresses begin with a digit. Exim rejects such bogus addresses at the SMTP level before the message is received.

Daniel's spam filter
1997-09-13 Daniel Smith DanS@bristol.com sent an excellent spam filter called spamc.rc. It used some nice heuristics and filters from various people, including [david] and [philip]. Later Dan made substantial changes to it and the new version is available from ftp://ftp.bristol.nl/pub/users/DanS/spamcheck

pm-jaube.rc Jari's ube filter (compiled from others)
After Daniel Smith posted his spam recipes to the procmail mailing list, Jari investigated them and compiled other recipes into a general purpose UBE module that needs no special setup and can be installed via a simple INCLUDERC. No additional ube-list files are used; all UBE detection happens using procmail rules. The module is included in kit pm-code.zip.

Catherine A. Hampton's Spambouncer
http://www.spambouncer.org/ ...The attached set of procmail recipes/filters, which I call The Spam Bouncer, are for users who are sick of spam (unsolicited junk mail) and want to filter it out of their mail as easily as possible. These recipes can be used as shared recipes for a whole system, or by an individual for their own mailbox only.

Protect yourself from spam: A practical guide to procmail
http://www.sun.com/sunworldonline/swol-12-1997/swol-12-spam.html ...take you, step by step, through everything you need to know in order to enlist the aid of a Unix host in filtering unwanted e-mail traffic. This page is an excellent way to get started with procmail, filtering with simple recipes and storing messages to folders. Recommended for newcomers to Procmail.

Junkfilter
http://www.pobox.com/~gsutter/junkfilter/ ...Junkfilter is a user-configurable procmail-based filter system for electronic mail. Recipes include checks for forged headers, key words, common spam domains, relay servers and many others.

SpamDunk
http://www3.sympatico.ca/walter.dnes/email/spamdunk/ ...This webpage shows a commented example of a working .procmailrc file that works for me. I have tried to make things as generic as possible, but there are no guarantees that it will work for anyone else.


3.0 Dry run testing

3.1 What is dry run testing

It means that you call your procmail test script directly with a sample test mail
      % procmail $HOME/pm/pm-test.rc < $HOME/tmp/test-mail.txt

The script pm-test.rc has the procmail recipe you're testing or improving. The test-mail.txt is any valid mail message containing the headers and body. You can make one with any text editor, e.g. vi, pico or emacs on your Unix system. Here's a simple test mail skeleton:

      From: me@here.com
      To: me@here.com (self test)
      X-info: I'm just testing

      BODY OF MESSAGE SEPARATED BY EMPTY LINE
      txt txt txt txt txt txt txt txt txt txt
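
As a minimal sketch, the same skeleton can also be written from the shell with a here-document. The directory and file names below are made up for illustration; the only thing that really matters is the empty line between the headers and the body.

```shell
# Hypothetical location for the sample mail; adjust to taste.
testdir=${TMPDIR:-/tmp}/pm-dry-run
testmail=$testdir/test-mail.txt

mkdir -p "$testdir"

# The quoted 'EOF' keeps the shell from expanding anything in the message.
cat > "$testmail" <<'EOF'
From: me@here.com
To: me@here.com (self test)
X-info: I'm just testing

BODY OF MESSAGE SEPARATED BY EMPTY LINE
txt txt txt txt txt txt txt txt txt txt
EOF

# Count the header lines, i.e. the lines before the first empty line.
header_lines=$(sed '/^$/q' "$testmail" | grep -c ':')
echo "wrote $testmail with $header_lines header lines"
```

The resulting file can be fed to procmail on standard input in the same way as test-mail.txt in the call shown earlier.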

Remember that you can define environment variables as well in the dry run call. Here's an example where procmail just executes the script and does nothing fancy.

      % procmail VERBOSE=on DEFAULT=/dev/null \
          ~/pm/pm-test.rc < ~/txt/test-mail.txt

Suppose the script prints something to log files, but you'd instead like to get it all dumped to the screen. No problem: first find out your tty value by calling tty at the shell prompt and pass that on the command line. Here LOGFILE is pointed at the terminal so that the output of "LOG=" commands ends up there, and MYTEST_LOG overrides a script statement such as "MYTEST_LOG=${MYTEST_LOG:-$HOME/pm/pm-test.log}"

      #  `tty' tells what to fill in /dev/..

      % procmail VERBOSE=on DEFAULT=/dev/null \
          LOGFILE=/dev/pts/0 MYTEST_LOG=/dev/pts/0 \
          ~/pm/pm-test.rc < ~/txt/test-mail.txt

3.2 Why the From field is not okay after dry run

It now says "From foo@bar Mon Sep 8 14:38:06 1997"

[philip] Don't worry about this. It's a side-effect of running the message through formail after having generated any auto-reply -- the auto-reply generated by "formail -rt" doesn't have a "From " header (it's pointless for outgoing messages), so the second formail adds one, not knowing that it'll just be ignored by sendmail later (well, sendmail will extract the date from it, but that's ignorable). You only see it because you're saving to a folder instead of mailing it.

3.3 Getting default value of a procmail variable

[david] There's always this way to learn a variable's initial value (note the strong quotes), which Stephen uses to get procmail's value for $SENDMAIL in the scripts that build SmartList:
      procmail LOG='$PATH' DEFAULT=/dev/null /dev/null < /dev/null

Since LOGFILE hasn't been defined, $PATH will be printed to the screen. One caution: if there are any variables in the definition of $PATH (such as $HOME), they'll be expanded in the output.


4.0 Things to remember

4.1 Get the newest procmail

A lot of trouble surfaces only because you have an old procmail version. Be sure to have the latest. If you're serious about using procmail, keep nagging your sysadm or ISP until the latest version is installed, and don't give up. Here is a command to check your procmail version number:
      % procmail -v

4.2 Csh's tilde is not supported

Real csh or Emacs freaks have grown accustomed to using tilde (~) everywhere, but must drop that habit now. Procmail doesn't support it; just use $HOME. When you write procmail recipes, think sh not csh. This mind set will automatically get your brain tuned to the right programming habits.
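
A minimal sketch of what that means in an rcfile; the directory and file names are only examples:

```
#   Wrong: the tilde is not expanded in a procmail assignment
#   MAILDIR = ~/Mail

#   Right: spell it out with $HOME
MAILDIR     = $HOME/Mail
LOGFILE     = $MAILDIR/procmail.log
```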

4.3 Be sure to write the recipe starting right

The recipe starts with :0 or just with : but the latter is somewhat dangerous and easy to miss. Beware of writing it as 0: because that happens easily. The Procmail code checker, Lint, also requires that you use the :0 recipe start convention.

[philip] Always put a zero after the colon that begins the recipe. In the first versions of procmail, you would put the number of conditions, with a default of 1. That was annoying, and the computer can do the counting easier, so Stephen made it so that a count of 0 indicates that the conditions are all the lines beginning with a *. The default is one, unless one of the a, A, e, or E flags is given, in which case the default is zero. ALWAYS START A RECIPE WITH :0.

4.4 Always set SHELL

[FAQ] If your login shell is a C shell (csh or tcsh), avoid havoc: as a precaution, always put the following at the top of your .procmailrc.
      SHELL = /bin/sh

4.4.1 If system has no /bin/sh and you're forced to use csh/tcsh

[kuhlmav@elec.canterbury.ac.nz] Csh and tcsh execute the .cshrc first, THEN if, and only if, it is the login shell (not a sub shell) it executes the .login, which should contain basic important system settings like stty commands. Likewise, bash and ksh users are taught to define and export PATH in .profile, so our per-shell startup files would not have clobbered the PATH set in .procmailrc the way your .cshrc did.

[philip] ...I have been told by other sysadmins that there are systems on which csh was hacked to source the .login before the .cshrc. For various reasons I suspect these to be systems based on older versions of BSD (say, 2.3 BSD).

As for tcsh, the order in which the .login and .cshrc are sourced is a compile-time option which defaults to the .cshrc (or .tcshrc) before the .login. There may be some wackos out there who change the default in memory of the system(s) that they were raised on. I suggest electroshock as the proper treatment.

...done sys admin on Crays, Convexes, Suns, SGIs, Decs, PC running BSDI, Linux and Free BSD, and I have never run into a system where the .cshrc is sourced AFTER the .login. If someone goes to the trouble to change the order, I would love to know a valid reason for it.

4.4.2 Procmail won't work well with SHELL set to csh derivative

[1998-08-17 PM-L kuhlmav@elec.canterbury.ac.nz Volker Kuhlmann] ...The blame lies with procmail and its documentation. Obviously, procmail is programmed with the assumption that the login shell is a sh derivative. This assumption is a) not very nice, and b) not stated in the otherwise very good documentation. Of course a user can set SHELL to tcsh. If then procmail is too stupid to hack it, it ought to say so clearly, and the above-mentioned questions of people using tcsh will disappear from this list. One could also be nice and point out pitfall (3) mentioned above in the procmail docs. It is customary to have terminal configuration in .login. If it is shifted to .cshrc it should be properly surrounded by if .. endif. Perhaps it is not customary to configure the terminal in bashrc (where else then? - only a rhetorical question), but that is no reason to blame it on tcsh.

My .cshrc only setenvs the environment when it is a login shell (shell level 1). Obviously procmail runs a login shell. As I said earlier, there are good reasons for setting a full PATH independently of whether the shell is interactive or not. So, when procmail executes programs with SHELL=tcsh, PATH is set to the tcsh defaults. That may or may not be desirable, depending on the individual case. No problem with that and avoidable (run tcsh with -f). Nice if it was in the procmail docs.

But then, the PATH getting clobbered is not the point here (just a side-effect I didn't realize until 2 people pointed it out).

4.5 Check and set PATH

[jari] It is very likely that the default PATH environment variable that your .procmailrc sees is not enough. To play safe, so that all the needed binaries can be found when escaping to the shell in procmailrc, set the PATH variable as the very first statement.
Here is one example that I use for HP-UX 9, HP-UX 10 and SunOS. You can add paths that don't exist; that way you can use the same .procmailrc on multiple servers (on HP and SUN as I do)
      PATH        = $HOME/bin:\
      /usr/contrib/bin:\
      /bin:/usr/bin:/usr/lib:/usr/ucb:/usr/sbin:\
      /usr/local/bin:/opt/local/bin:\
      /vol/bin:/vol/lib:/vol/local/bin:${PATH}

[Richard] It is dangerous to have many directories in the PATH, especially if you do not control the content of any of them. A sysadmin could put a newer, incompatible version of a program you rely upon in one of them and cause difficult-to-diagnose problems. It may make more sense to link the binaries you need into your own ~/bin directory and include just that in your PATH.

[jari] In principle I agree with Richard's advice, but in practice the newer version of the program seldom breaks the procmail code you have written. It depends on your "threat level": be more cautious and use Richard's advice; alternatively trust the system and adapt to (rare) changes. Your call.

4.6 Keep the log on all the time

It's best that you put these variables at the very start of your .procmailrc. When you start using procmail, you also want to know all the time what's happening there and why your recipes didn't work as expected. The answer to almost all your questions can be found in the log file. As the log file will grow to be quite big, remember to set up a cron job to keep it at a moderate size.
      LOGFILE     = $PMSRC/pm.log
      LOGABSTRACT = "all"
      VERBOSE     = "on"
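
The cron job mentioned above could be as simple as the following sketch. The function name trim_log, the scratch file and the number of lines kept are assumptions made for illustration. Note that a log entry written by procmail while the file is being rewritten may get lost, which is usually acceptable for a log file.

```shell
# Keep only the last N lines of a log file (hypothetical helper).
trim_log () {
    log=$1
    keep=${2:-5000}

    [ -f "$log" ] || return 0          # nothing to trim

    # Write the tail to a temporary file first, then copy it back in
    # place so that the log file keeps its owner and permissions.
    tail -n "$keep" "$log" > "$log.tmp" &&
    cat "$log.tmp" > "$log" &&
    rm -f "$log.tmp"
}

# Demonstration on a scratch file with 50 lines
demo=${TMPDIR:-/tmp}/pm-demo.log
awk 'BEGIN { for (i = 1; i <= 50; i++) print "line " i }' > "$demo"
trim_log "$demo" 10
```

From cron, the function would be called with the real log file name, i.e. whatever LOGFILE points to in your .procmailrc.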

4.7 Never add a trailing slash for directories

[philip] Drop the trailing slash: it'll choke if you ever end up on Apollo's DomainOS where double slashes are network references. If the directory has a trailing slash, it will choke on most OSes (they treat it like "/.").
      DIR         = /full/path/to/www/directory/    # Wait...
      FILE        = $DIR/file                       # Ouch !

4.8 Remember what term DELIVERED means

[alan] When procmail delivers a piece of mail, whether to a file or a pipe-command, if the write succeeds, then the mail is considered to have been delivered, and processing stops with that recipe file. Here is the relevant text from the man page:

...There are two kinds of recipes: delivering and non-delivering recipes. If a delivering recipe is found to match, procmail considers the mail (you guessed it) delivered and will cease processing the rcfile after having successfully executed the action line of the recipe. If a non-delivering recipe is found to match, processing of the rcfile will continue after the action line of this recipe has been executed.
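
A small sketch of what this means in practice; the folder names and the X-Seen header are made up. The c flag (see procmailrc(5)) makes a recipe work on a copy of the message, so processing continues even though the recipe itself delivers.

```
#   Delivering recipe with the c flag: a copy is saved and
#   processing continues with the next recipe.
:0 c:
all-mail.mbox

#   Delivering recipe without c: if it matches and the write
#   succeeds, the mail is DELIVERED and nothing below is run.
:0:
* ^Subject:.*test
test.mbox

#   Non-delivering recipe (f flag: filter, here the header only);
#   processing always continues after it.
:0 fhw
| formail -A "X-Seen: yes"
```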

4.9 Beware putting comment in wrong place

You like commenting a lot, sticking comments everywhere possible? Yes, I do that too, and got into trouble because one is not that free to comment code in procmail. Pay attention to the following example
      :0          # comment, nice tune...
      * condition # OUCH, Ouch, ouch. This comment must not be here!!
          #         Hm, Old procmail versions don't understand this
          #         Are you sure you want to put comments inside
          #         Condition line?
      * condition
      {               # comment ok
                      # comment ok
          :0          # comment ok
          /dev/null   # comment ok
      }               # comment ok

So, the place to watch is the condition line. Some later procmail versions promised to correct this misfeature, but it never came true. No procmail exists yet that allows putting comments on the same line with a condition clause.

4.10 Brace placement

Be careful with your braces and remember that old procmail versions aren't as forgiving as the newest version. Below you see the classical "Test OK condition first, and if that fails then do something else". See the side comments, which point out the mistakes.
      :0
      * condition
                          # No space allowed here!
      {}                  # Wrong, at least _one_ empty space
      :0 E
      {do_something }     # Again mistake, must have surrounding spaces
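
For comparison, here is a sketch of how the same construct might be written so that older procmail versions accept it too. The assignment inside the second block is only a placeholder.

```
:0
* condition
{ }                         # one space between the braces

:0 E
{ RESULT = "no match" }     # space after { and before }
```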

4.11 Local lockfile usage

Lock files are only needed when procmail is doing something that should be serialized, i.e., when only one process at a time should be doing it.

This generally means that any time you write to a file, you should have a local lock, preferably based on the name of the file being written to. Forwarding actions ('!'), and 99% of all filters don't need lock files. However, if a filter action writes to a file while filtering, then you may need a lock. Procmail always does kernel locking when it writes mail to files via simple file actions. So even if you forgot the lock colon, procmail tries to play safe if kernel locking has been compiled in.

Beware misplacing the lock colon(:)

       :0: a      # Ouch! Wrong unless you want a lock file named a
       :0 a:      # Okay.

Note that in delivering recipes where you manually write the content, you must use a local lock file with the > token, because procmail can't determine the lock by itself. It can only determine the lock file from the >> token. [stephen] However, putting a lock file on a recipe like this is, of course, utterly useless. So you might as well omit the locking entirely.

      #   Save last body of message to file mail.body

      :0 b:  mail.body$LOCKEXT
      | cat > mail.body

  • If the command line in the procmail rcfile contains ">>", a name for the local lock file will be implicit, and the second colon alone is enough.
  • If the command doesn't write to a file, or doesn't write to the same file as anything else (including a matching letter that makes procmail run the same command) that might run at the same time, the local lock file is unnecessary.
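
As a sketch of the first point: with ">>" in the action line the trailing colon is enough, and procmail derives the lock file name from the destination file. The file name mail.body is only an example.

```
#   Append the body of each message to mail.body.
#   The lock file name is taken implicitly from the ">>" target.

:0 b:
| cat >> mail.body
```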

[philip] Watch this too. A nesting block that does not launch a clone cannot take a local lock file on the recipe that starts the braces. A nesting block that does launch a clone can. (see the error)

      :0: file$LOCKEXT
      {
          #  error: "procmail: Extraneous local lock file ignored"
          #  - This lock file will be ignored
          #  - If the recipes inside the braces try to use file.lck
          #    as  a lock file, then you'll have a deadlock situation.

          :0 :
          /tmp/tmp.mbx
      }

Let me also explain why the w is so important. Notice that the two recipes here are equivalent. The W here is implicit. NOTE: this is only true on the recipe that opens a nested block. On a recipe with a program, forward, or delivery action, W is different from w, which in turn is different from having neither.

      :0 c: file$LOCKEXT      :0 Wc: file$LOCKEXT
      { ... }                 { ... }

To quote the comment in source code, "try and protect the user from his blissful ignorance". The parent will always wait for the cloned child to exit when a lock file is involved. The only question is whether or not it should be logged. If you want failure of the cloned child to be logged, then you should use the w flag, like this:

      :0 wc: file$LOCKEXT
      { ... }

A local lockfile can be used to lock a clone; the parent procmail will remove it when the clone exits (thus it serves as a global lock file for the clone). If the braced block does not launch a clone, asking for a local lock file generates an error.

4.12 Global lockfile

[david] If you want to block everything while the recipe runs, even during the conditions, use a global lock. For example, in this construct the formail which updates the message-id cache file must be protected with a global lock file.
      MID_CACHE_LEN   = 8192
      MID_CACHE_FILE  = $PMSRC/msgid.cache
      MID_CACHE_LOCK  = $PMSRC/msgid.cache$LOCKEXT

      LOCKFILE        = $MID_CACHE_LOCK

      :0
      * ^Message-ID:
      * ? $FORMAIL -D $MID_CACHE_LEN $MID_CACHE_FILE
      {
              LOG = "dupecheck: discarded $MESSAGEID from $FROM $NL"

              :0                  # no lockfile !
              $DUPLICATE_MBOX
      }

      LOCKFILE                    # kill variable

You cannot use local lockfile as below:

      :0 : $MID_CACHE_FILE$LOCKEXT
      *   ^Message-ID:
      * ? $FORMAIL -D $MID_CACHE_LEN $MID_CACHE_FILE

because the local lock file named on the flag line will be created only if the conditions have matched and the action is attempted.

One more note: watch carefully that there is no : lock when delivering to DUPLICATE_MBOX, because the outer global lock file already prevents all other procmail instances from executing this part of the recipe.

4.13 Gee, where do I put all those ! * $ ??

Ahem. I can't tell you exactly what to do or how to write your own procmail recipes, but I can tell how I'm writing them. Here is my condition line token order:
      * $ ! ? BH VAR ?? test

That won't say much unless I give you something to compare with. Here is one perfectly valid rule, but not my style

      :0
      *$ ^Subject:.*$VAR
      *! ^From:.*some
      *B ! ?? match-the-string-in-body
      *$? $IS_EXIST $FILE
      *VARIABLE ?? set

I prefer lining up things in the condition lines. The first column is reserved for the dollar sign, the second for the not operator and so on. The important thing is that I can see at a glance if I have set the variable expansion dollar in the line (leftmost).

      :0
      *$       ^Subject:.*$VAR
      *  !      ^From:.*some
      *  ! B ?? match-the-string-in-body
      *$ ?      $IS_EXIST $FILE
      *         VARIABLE ?? set

4.14 Sending automatic reply, use X-loop header

Do not send an automatic reply without checking the "! ^FROM_DAEMON" condition, and always include an X-Loop header and check its existence to prevent mail loops
      :0
      *    conditions-for-auto-reply
      *$ ! ^$MY_XLOOP
      *  ! ^FROM_DAEMON
      | $FORMAIL -A "$MY_XLOOP" ...other-headers...
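
A fuller sketch of such an auto-reply, modelled on the auto-reply examples in procmailex(5). The address in MY_XLOOP, the "ping" subject test and the reply text are placeholders, and formail is called by name here instead of through a $FORMAIL variable.

```
#   Hypothetical loop marker; use your own address.
MY_XLOOP = "X-Loop: me@here.com"

#   h = feed only the header to the pipe
#   c = work on a copy, the original is delivered as usual
:0 h c
*    ^Subject:.*ping
*$ ! ^$MY_XLOOP
*  ! ^FROM_DAEMON
| (formail -rt -A "$MY_XLOOP"; echo "Account is up.") | $SENDMAIL -oi -t
```

Because the reply itself carries the X-Loop header, a reply that bounces back to this account fails the second condition and is not answered again.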

4.15 Avoid extra shell layer (check command for SHELLMETAS)

[dan] It is very important to study your shell command calls and try to save the overhead of the extra layer of shell. It may be extra work once when you write your rcfile but it saves effort on each piece of arriving mail. When procmail sees a character from SHELLMETAS, it runs
      # Default SHELLMETAS: &|<>~;?*[
      # Default $SHELLFLAGS: -c

      % $SHELL $SHELLFLAGS "command -opts args"

instead of

      % command -opts args

That is because procmail's ability to invoke other programs does not include filename globbing ([, *, ?), backgrounding (&), piping (|), succession (;), nor conditional succession (&&, ||). If it sees any of those characters (before expanding variables), it hands the job over to a shell.

Sometimes those characters appear in arguments to a command without having their shell meta meaning and procmail really could invoke the command directly without the shell. You can see the distinction in a verbose log file: if procmail runs the command itself, it logs

      Executing "command,-opts,args"

with a comma between each positional parameter, but if it calls a shell, the original spacing from the rcfile appears unchanged in the logfile:

      Executing "command -opts args"

So, if you know you won't be needing shell expansion, wrap your shell calls with this:

      savedMetas  = $SHELLMETAS
      SHELLMETAS    # Kill variable

      ..command that does not need shell expansion features..

      SHELLMETAS  = $savedMetas

4.16 Think what shell commands you use

For every message, procmail launches the processes you have put into your .procmailrc. If you haven't paid attention to optimization before, now is the time to take a magnifying glass and check every recipe and the processes in them. When you write your private shell scripts, the performance hit is not so important, but for mail delivery, the matter is totally different. First, let's see some programs and sizes. The following is from HP-UX 10, where the binaries seem to include debug and symbol table code.
      131072 Aug 21  1996 /usr/bin/awk
      196608 Oct  1  1996 /usr/bin/sort
      245760 Jun 10  1996 /usr/bin/grep
      262144 Jun 10  1996 /usr/bin/sed
      303552 Dec  7  1995 /usr/local/bin/gawk
      544768 Jun 10  1996 /usr/contrib/bin/perl       [perl 4.36]
      822232 Aug 25 13:58 /opt/local/bin/perl5.00401

              text    data     bss
      awk:    72727 + 51316 +  15317   = 139360
      sort:  173225 + 18496 + 183076   = 374797
      sed:   237248 + 16992 +  56252   = 310492
      grep:  221591 + 16176 +  53816   = 291583
      perl4: 502220 + 36044 +  65632   = 603896
      perl5: 633812 + 69612 +   2385   = 705809
      gawk:  160018 +  5264 +   7168   = 172450

The binary sizes above are not the typical case; these are from another system

           4 Sep 28 14:25 /usr/local/bin/awk -> gawk
       32768 Nov 16  1996 /usr/bin/grep
       49152 Nov 16  1996 /usr/bin/sed
      114688 Oct 20  1996 /usr/local/contrib/gnu/bin/grep
      155648 Nov 16  1996 /usr/bin/awk
      155648 Nov 16  1996 /usr/bin/nawk
      221184 Nov 16  1996 /usr/bin/gawk
      311296 Jan 27  1997 /usr/local/bin/gawk
      958464 Nov  2 16:34 /usr/local/contrib/bin/perl
      1196032 Sep 14  1996 /usr/local/bin/perl

Stan Ryckman stanr@sunspot.tiac.net wants you to know that:

Comparing byte sizes on disk means nothing here... these things may or may not have been stripped. Any symbol tables included in the byte counts you see above won't affect process start-up time. The size command will give a better handle on what will be needed in starting a process. The three segments may each have their own overhead, though, and the relative contributions of those segments to startup time may well be system-dependent.

Hm. Can we draw some conclusion? Not anything definitive, but at least something:

  • While sed(1) and grep(1) may be bigger than awk(1) on some systems, this is an exception. They are usually much smaller and fast to use.
  • Complex commands that would require many processes to be chained together, like `grep -v | grep | sed', can usually be accomplished with one awk(1) call. If you don't know the language, ask somewhere how to do it with awk(1); it is quite similar to perl(1).
  • Try to use standard awk(1). gawk(1) and nawk(1) are bigger and may not be found on all systems.
  • Avoid perl(1) at all costs; it's many times (6) bigger than awk(1). Perl is slow to start up, due to the intermediate compilation process at startup, and hogs oodles of memory.
  • Remember that if procmail is running on a dedicated mail host, it probably doesn't even have any goodies installed, just the boring standard versions, which may not even be the same as what you see on the current host.
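
To make the second point concrete, here is a sketch that runs a three-process pipeline and a single awk(1) call over the same sample text. The patterns, the function names and the sample lines are invented for the demonstration.

```shell
# Three processes: drop comment lines, keep lines mentioning "error",
# then upcase the first "error" on each remaining line.
filter_pipeline () {
    grep -v '^#' | grep 'error' | sed 's/error/ERROR/'
}

# The same job in one awk process.
filter_awk () {
    awk '!/^#/ && /error/ { sub(/error/, "ERROR"); print }'
}

sample='# error in a comment line
disk error on sda
all fine here
error: error twice'

out_pipeline=$(printf '%s\n' "$sample" | filter_pipeline)
out_awk=$(printf '%s\n' "$sample" | filter_awk)

printf '%s\n' "$out_awk"
```

Both variants should print the same two lines; in a recipe that runs for every incoming message, the awk version saves two process launches each time.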

Here are some more programs. Don't even think of extracting fields with grep or awk, like "grep Subject", because formail is much smaller and more optimized for tasks like that. Better yet, many times you can do it all with procmail's regexp matches.

      37007 Sep  5 15:53 /usr/local/bin/formail   # 3.11pre7
      28672 Jun 10  1996 /usr/bin/tr
      20480 Jun 10  1996 /usr/bin/tail
      20480 Jun 10  1996 /usr/bin/cat
      20480 Sep 26  1996 /usr/bin/expr
      16384 Jun 10  1996 /usr/bin/head
      16384 Jun 10  1996 /usr/bin/cut
      16384 Jun 10  1996 /usr/bin/date
      16384 Jun 10  1996 /usr/bin/uniq
      16384 Jun 10  1996 /usr/bin/wc
      12288 Jun 10  1996 /usr/bin/echo
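
A sketch of the procmail-only alternative, using the \/ match operator and the MATCH variable described in procmailrc(5). The variable name SUBJECT is an arbitrary choice.

```
#   Instead of launching a process, as in
#
#       SUBJECT = `formail -xSubject:`
#
#   let procmail's own regexp engine pick up the field:

:0
* ^Subject:[ ]*\/[^ ].*
{ SUBJECT = "$MATCH" }
```

Everything matched to the right of \/ ends up in MATCH, so no external program is started at all.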

4.17 Using absolute paths when calling a shell program

Shell programmers know that if an absolute path is used for calling the executable, the shell doesn't have to search through a long list of directories in $PATH. This may speed up shell scripts remarkably. The best way to use such an optimization is to define variables for those programs.

Should you use such optimization in your procmail code? That is a two-fold question. Examine how many shell calls you use. Do you use grep or formail a lot? Then you could optimize these calls. To be portable, define variables for executables:

      #  perhaps defined in separate INCLUDERC
      #
      #   INCLUDERC = $PMSRC/pm-mydefaults.rc

      FORMAIL     = /usr/local/bin/formail
      GREP        = /bin/grep
      DATE        = /bin/date

      :0 fhw
      | $FORMAIL -rt

When you port your .procmailrc to a different environment which has different paths, you could use this recipe in addition to the one just mentioned above:

      FORMAIL     = ...as above

      :0
      * HOST ?? second-host
      {
          #   In this host the paths are different. Reset.

          FORMAIL     = "formail"
          GREP        = "grep"
          DATE        = "date"
      }

4.18 Disabling a recipe temporarily

If you have a recipe that you would like to disable for a while, there is an easy way. Just add the "false" condition line before any other conditions. The "!" also nicely visually flags that "this recipe is NOT used".

      #  This recipe stops at "!" and doesn't get past it.

      :0
      * !
      * condition
      * condition
      {
          ...
      }

4.19 Keep message backup, no matter what

It's good to have a safety measure in your .procmailrc. Although you are an expert and have checked your recipes 10 times, there is still a chance that something breaks. One morning, when you browse your BIFF reminder log, you notice: "Hm, there is that interesting message but it was not filed, where is it?". And when you go to study the procmail logs (you do keep the log going all the time), it hits you: "Gosh; a mistake in my script! Message was fed to malicious pipe and I had that i flag there... sniff". And you greatly regret you didn't back up the message in the first place.

So, before your procmail does anything to your message, put the message into some folder which is regularly expired. Emacs Gnus can expire mailboxes, but one could also use cron(1) to do the cleaning. After that, you can relax knowing your mail is safe.

      #   Your incoming messages are stored here, filtered by procmail

      SPOOL      = $HOME/Mail/spool

      #   Backup storage
      #
      #   - This could be directory too. In that case you could use
      #     cron job to expire old messages at regular intervals
      #   - For once a day expiration, see procmail module list
      #     and pm-jacron.rc

      BUP_SPOOL  = $SPOOL/junk.bup.spool

      :0 c:
      $BUP_SPOOL

Naturally you can filter out mailing list messages from the backup, because losing one or two (hundred) of them may not be that serious. Maybe you could use two backup spools, one for mailing lists and the other for your non-list messages.

      :0 c:
      * ! mailing-list1|mailing-list2
      $BUP_SPOOL

If you have the date variables set up as described below, you could also create a backup folder per day:

      BUP_SPOOL    = $SPOOL/junk.bup.$YYYY-$MM-$DD.spool

This makes it very easy to delete backups that are older than a given number of days, either manually or through a cron job.
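
If you have not set up separate date variables, a minimal sketch of the per-day backup could ask date(1) for the whole stamp in one call. The variable name TODAY is an assumption for illustration, and the sketch assumes a date command that understands the %Y, %m and %d formats.

      SPOOL      = $HOME/Mail/spool

      #   One fork per message; the stamp looks like 2002-02-03

      TODAY      = `date "+%Y-%m-%d"`
      BUP_SPOOL  = $SPOOL/junk.bup.$TODAY.spool

      :0 c:
      $BUP_SPOOL

The price is one extra shell call per message, so on a busy account the date variables derived by procmail's own matching may be the cheaper choice.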

4.20 Order of the procmail recipes

When you start writing a lot of procmail recipes, you soon find out that it matters a great deal in which order you put your recipes. When each group of recipes starts growing too big, it's good practice to move each group to a separate includerc file. Here is one recommended order in which your calls appear in the main $HOME/.procmailrc:

  • backup important messages
  • cron-subroutine
  • handle duplicate messages
  • handle DAEMON MESSAGES
  • handle plus addressed message (RFC plus or sendmail plus addresses)
  • handle server requests (file server, ping responder...)
  • drop MAILING LIST messages
  • send possible vacation replies only after all above
  • apply kill file
  • detect mime and format or modify the message body
  • save private messages
  • and last: FILTER UBE.

The backup, cron and duplicate handling go naturally to the beginning of your .procmailrc. Next comes a grey area where Daemon, plus handling and server messages can be put.

Mailing lists should be handled as early as possible, but after the server messages, because you want your services handled first.

Do not send vacation replies before you have handled mailing lists, to prevent annoying vacation replies to mailing lists.

After that you are left with "known" private messages and those of unknown origin. A kill file (to block based on sender) for rapid spammers, who send you one or several messages per day, may need to be checked before checking other messages.

Last but not least: Put your UBE checkers at the end to avoid mishits of valid mail. DO NOT SEND AUTOMATIC COMPLAINT BACK, or you'll get grey hairs when the autoresponder sends its complaint to a valid source. You don't want to answer back with "My apologies, the script had an error, it won't happen again." to all the valid hate mail that is now addressed to you.

Drop the UBE to a folder, manually select the messages that need actions and send a message to the postmasters in the Received chain explaining that their mail relay has been hijacked.
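
The order above could be sketched as a skeleton .procmailrc like the following. All the rc file names are hypothetical and stand for whatever groups of recipes you have; PMSRC is assumed to point to the directory where you keep them.

      PMSRC     = $HOME/pm

      #   Backup, cron and duplicates first
      INCLUDERC = $PMSRC/rc-backup.rc
      INCLUDERC = $PMSRC/rc-cron.rc
      INCLUDERC = $PMSRC/rc-duplicates.rc

      #   The grey area: daemon, plus addresses, server requests
      INCLUDERC = $PMSRC/rc-daemon.rc
      INCLUDERC = $PMSRC/rc-plus.rc
      INCLUDERC = $PMSRC/rc-server.rc

      #   Mailing lists before any vacation reply
      INCLUDERC = $PMSRC/rc-lists.rc
      INCLUDERC = $PMSRC/rc-vacation.rc

      #   Kill file, mime handling, private mail
      INCLUDERC = $PMSRC/rc-kill.rc
      INCLUDERC = $PMSRC/rc-mime.rc
      INCLUDERC = $PMSRC/rc-private.rc

      #   And last: UBE filters
      INCLUDERC = $PMSRC/rc-ube.rc

Keeping one group per file also makes it easy to comment out a single INCLUDERC line while you debug that group.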


5.0 Procmail flags

5.1 The order of the flags

The order of the flags does not matter in practice, but here is one stylistic suggestion. The idea here is that the most important flags are put to the left, like giving priority 1 to aAeE, which affect the recipe immediately. Priority 2 is given to flag f, which tells if a recipe filters something. Also (h)eader and (b)ody should immediately follow f; this is considered priority 3. In the middle there are other flags, and the last flag is c, which ends the recipe or allows it to continue. In addition, according to [david]: "...I'm quite sure that putting anything other than the opening colon and the number to the left of AaEe will cause an error."

      :0 aAeE HBD fhb wWir c: LOCKFILE
         |    |   |   |    |
         |    |   |   |    (c)ontinue or (c)lone flag last.
         |    |   |   (w)ait and other flags
         |    |   (f)ilter flag and to filter what: (h)ead or (b)ody
         |    (H)eader and (B)ody match, possibly case sensitive (D)
         The `process' flags first. (A)nd or (E)lse recipe

You can write the flags side by side

      :0Afhw:$MYLOCK$LOCKEXT

Or, as suggested, leave flags in their own slot for more distinctive separation. Note that $LOCKEXT must be next to $MYLOCK, because it contains the string ".lock".

      :0 A fhw: $MYLOCK$LOCKEXT

5.2 Flag w and recipe with |

[alan] If the filter program exits with a 0 status (0 == okay), then procmail will replace the original input body with the output of the filter program. If the filter program exits with anything but zero, procmail will report an "error" to the log, and "recover" the input (not filter it).

[david] I am very sure that that's the case only if you have the w or W flag on the filtering recipe. Without w or W, procmail won't care about a bad exit status from the filter and will replace the filtered portion with whatever standard output the filter produced. It may still report an error to the log but it won't recover the previous text. This, for example, will destroy the body of a message, even without i:

      :0 fb
      | false

With this, however, procmail will recover the = original body:

      :0 fbW      # same results even if we add `i'
      | false

[stephen] No, not on all occasions. Procmail will not care about the exit code here. However, if procmail detects a write error, it will recover (because of the missing i flag). Procmail will only detect a write error in such a case if the mail is long enough and does not fit in the pipe buffer that's in the kernel (typically 10KB).

5.3 Flag w, lock file and recipe with |

[manual] In order to make sure the lock file is not removed until the pipe has finished, you have to specify option w; otherwise the lock file would be removed as soon as the pipe has accepted the mail. So if you see anything that looks like ">" or ">>" in your recipe, that should immediately ring your bells. Check at once that you have included the w flag and the lock file colon (:).

      :0 hwc: headc$LOCKEXT
      * !^FROM_MAILER
      | uncompress headc.Z; cat >> headc; compress headc

5.4 Flag f and w together

The w tells Procmail to hang around and wait for the script to finish. Hm, wouldn't you think this ought to be implied by the f flag already? Not so.

[david] Of course the f flag is enough to make procmail wait for the filter to finish, but the w means something more: to wait to learn the exit code of the filtering command. If sed fails with a syntax error and gives no output, without W or w procmail would happily accept the null output as the results of the filter and go on reading recipes for the now body-less message. On the other hand, with W or w procmail will respond to sed's non-zero exit code by recovering the unfiltered text.
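
A minimal sketch of what [david] describes follows; the sed expression itself is only an illustration. With w present, a sed that dies with an error leaves the original body in place instead of an empty one.

      #   Filter the body; recover the original body if sed fails

      :0 fbw
      | sed -e 's/colour/color/g'

Use W instead of w if you also want procmail to keep quiet about the failure in the log.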

5.5 Flags h and b

[david] hb is the default; you need to use h only when you don't want b or vice versa. You can think of it this way: h means "lose the body" and b means "lose the header," but the two together cancel each other out.

[philip] hb (feeding whole message) is the default for actions. You need to specify h without b if you want the action applied only to the head. H is the default for conditions. You need to specify HB or BH if you want to test a condition against the entire message.
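
To make the difference between the action flags (h, b) and the condition flags (H, B) concrete, here is a small sketch. The folder names and the search strings are made up for the example.

      #   Condition flag B: grep the body only
      :0 B:
      * unsubscribe me
      requests.spool

      #   Condition flags HB: grep the header and the body
      :0 HB:
      * make money fast
      junk.spool

      #   Action flag h: save a copy of the header only, then go on
      :0 hc:
      * ^Subject:.*test
      headers.spool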

5.6 Flag h and sinking to /dev/null

When you drop something to /dev/null, use the h flag so that procmail does not unnecessarily try to feed the whole message there.

      :0 h
      * condition
      /dev/null

[philip] Procmail knows that it shouldn't create a local lock on /dev/null and that it shouldn't kernel lock /dev/null, and it knows to write it "raw" (no "From " escaping or appended newline). This means that procmail simply opens /dev/null, does its write with one system call, and closes it.

I'm not sure if adding the h flag makes a real difference on modern UNIX kernels. I suppose it depends on how optimized the write() data is and in particular, whether a user-space to kernel-space copy is required, or whether it's delayed. If it's delayed then the code for handling /dev/null would presumably not do it, and the size of the write wouldn't actually matter.

5.7 Flag i and pipe flag f

Flag i is useless in mailbox deliveries.

[FAQ] The following will work some of the time, when the message is short enough, but that's a coincidence. With a longer message, though, Unix starts paying attention to what is happening, because it will have to buffer some of the data, and then when the buffered data is never read, an error occurs. The error is passed back to Procmail, and Procmail tries to be nice and give you back your original message as it was before this malicious program truncated it. Never mind that in this case you wanted to truncate the data. Anyway, the fix is easy: Just add an i flag to the recipe (:0 fbwi instead of :0 fbw) to make Procmail ignore the error.

      :0 fbw
      * condition
      | malicious-pipe

[dan] Here's why the i flag is needed (courtesy of Stephen): You told procmail to filter the entire mail (header and body), so it does and it attempts to write out header and body to the filter. Then procmail notices that not the entire body is being consumed. Procmail, being rather paranoid when it comes to delivery of mail, assumes something went wrong and considers this a failure of the filter.

      :0 fbwi
      | head -2

5.8 Flag r

[philip] Procmail automatically turns on the r (raw mode) flag for deliveries to /dev/null, so there's no need to do it yourself.

      :0 r        # you can leave out the `r'
      * condition
      /dev/null

[david] You can use the r flag (for raw mode) on every recipe where you do not want a From_ line added. I'm assuming that there isn't one already there; the r flag keeps procmail from making sure that there is a From_ line at the top and a blank line at the bottom, but it will not make procmail remove them if they are already present. Also, be careful to use the -f option on all calls to formail so that formail won't add a From_ line.

Someone who didn't need From_ lines -- I forget who -- found it annoying to put r onto every recipe and altered the source to prevent procmail from adding From_ lines at all, ever. I think a better idea would be a procmailrc Boolean to enable or disable them for all recipes without affecting other users. (Then perhaps we'd need a reverse r flag to undo raw mode for one recipe at a time?)

5.9 Flag c's background

...Interesting. My vision of c is to think of CONTINUE with message processing afterwards even if conditions matched.

[david] Precisely: when you have braces, thinking "continue" instead of "copy" or "clone" can get you into trouble.

Early versions of procmail, before braces and before cloning, called the c flag "continue" in their documentation; I think it is still called that in the source.

When Stephen introduced braces (but not cloning at this point), it was of course implicit that an action line of "{" was non-delivering, and a c was extraneous. People put c's there because they wanted procmail to continue to the recipes inside the braces on a match, and procmail brushed it off with an "extraneous c-flag" warning. No harm done.

When Stephen introduced cloning, though, I was rather upset that he was giving double duty to c instead of introducing something new like C for it, especially because people who absolutely wanted no clone but intended the recipes inside the braces to run in the same invocation of procmail as everything else were mistakenly putting c's on their braces to make sure procmail would "continue". People would (and did) get double deliveries.

Roman Czyborra, though, said that if you consider c to stand for "copy", that covers both uses of c: provide a copy to a simple recipe or, if there are braces, to a clone procmail that will handle the recipes inside the braces. Stephen agreed and changed the documentation accordingly.

Longtime users of procmail and people who read old docs may still think of it as "continue", but since the introduction of clones, that is not a good way to look at it. "Copy" is much safer.

5.10 Flag c before nested block forks a child

[alan] The combination of a nested block and the c flag causes procmail to fork a child process for the nested block, while the parent skips over it and continues on. The child process doesn't necessarily stop unless a delivering recipe (without the c flag) action succeeds.
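
A small sketch of that behavior follows. The address, the X-Copy header and the folder name are invented for the example.

      :0 c
      * ^From:.*boss@example\.com
      {
          #   The forked child works on its own copy of the message

          :0 fhw
          | formail -A "X-Copy: yes"

          #   Delivering recipe without c: the child stops here

          :0 :
          boss-copy.spool
      }

      #   The parent continues here with the unmodified message

Without the c on the opening recipe there would be no child: the same procmail would enter the block, the header change would stick, and the delivery inside the block would end all processing.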

5.11 Flag c and understanding possible forking penalty

... I run shell commands that need not be serialized, so instead of doing it the standard way:

      :0 hic                  # nbr.1 / standard way
      | command

I assume I can avoid the extra fork caused by (c)lone flag altogether by using these. Any difference between these two?

      :0                      # nbr.2 / alternative
      * ? command
      { }                     # ...No-op, Procmail syntax requires this

      dummy = `command`       # nbr.3 / alternative

[philip] There is a misunderstanding here. Let me clarify:

Procmail only forks a full-blown clone on a recipe with the 'c' flag whose action is a nested block.

If it's a simple mailbox deliver, pipe, or forward action then procmail does not fork a 'clone' (for pipe and forward actions procmail does have to fork, but only so it can execute the action). nbr.1 and nbr.2 take the same number of forks to execute. They also take the same effective number of writes (in case you're concerned about that). The latter also requires that procmail wait for the command to finish. nbr.3 is worse than the above two, as procmail has to not only wait for the command to complete but also save the output into the named variable.

5.12 Flags before nested block

Given the following recipe, let's examine the flag part:

      :0 $FLAGS
      {
          do-something
      }

[david] HB AaEe and D affect the conditions and thus are meaningful when the action is to open a brace. HB and D would be meaningless, of course, on any unconditional recipe, but they should not cause error messages. Generally, flags that affect actions are invalid there, and bhfi and r always are, but the others are partial exceptions: if you are using c to launch a clone, then w W and a local lock file can be meaningful. If there is no c, then w W and a local lock file are invalid at the opening of a braced block.

5.13 Flags aAeE tutorial

[david] AaEe are mutually exclusive and no more than one should ever appear on a single recipe. [philip] Actually, this is not true. e does not work with E or a (and procmail gives a warning if you try), and A is redundant if a is given, but at least some of the other combinations make sense and work.

  • A = try this recipe if the conditions succeeded on the most recent recipe at that nesting level that did not itself have an A nor an a
  • a = same as A, but moreover the action must have succeeded on the most recently tried recipe at that nesting level
  • e = almost like A: try this recipe if the conditions matched but the action failed on the most recently tried (not skipped) recipe at this nesting level. Loosely speaking, e is the opposite of a. e only looks backwards past E recipes that were skipped because of their E. It doesn't care whether a previous recipe had an A or a flag.
  • E = try this recipe if the conditions have failed on the most recent recipe at that nesting level that did not have an E and on every recipe at that level since then that did have an E; essentially the opposite of A

These mnemonics might help:

  • A: if you did the recipe at the start of the chain, try this one (A)lso
  • a: if the last action at that nesting level was (a)ccomplished
  • e: if the last action at that nesting level (e)rred
  • E: (E)lse because the conditions down the chain so far have not matched. Or "try this recipe unless the last tried recipe matched".

      #   [philip] demonstrates `e'

      :0 :            # match, but action fails
      /etc/hosts/foo


          :0 A        # no match
          * -1^0
          /dev/null

      :0 e # this is skipped because the last tried recipe didn't match
      {
          ...whatever
      }
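
For the most common use of E, an if / else-if / else chain, a sketch could look like this. The variable PRIORITY and the Subject patterns are invented for the example.

      :0
      * ^Subject:.*urgent
      {
          PRIORITY = "high"
      }

      #   Tried only if the recipe above did not match
      :0 E
      * ^Subject:.*fyi
      {
          PRIORITY = "low"
      }

      #   Tried only if neither of the above matched
      :0 E
      {
          PRIORITY = "normal"
      }

Exactly one of the three blocks runs for any message, which follows from the definition of E given above.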

How they interact with one another when used consecutively has not been fully tested to my knowledge. Consider this:

      :0
      * conditions
      non-delivering-action1

          :0 a
          action2

      :0 e
      action3

Is action3 done if action2 failed or if action1 failed (or perhaps in both situations)? [philip] Action3 is only done if action2 failed.

If the answer is action2, does this work to get action3 done if action1 failed? I think it does, but does it also run action3 if the conditions didn't match on the first recipe? [philip] Yes, and yes.

      :0             #   [david]
      * conditions
      non-delivering action1

          :0a
          action2

      :0E
      action3

[philip] If that's not what you want, combine some flags:

      :0
      * conditions
      non-delivering action1

          :0 Ae
          action3

      :0 a
      action2

If the conditions match, action1 will be executed. action3 will then execute if action1 failed, otherwise action2 will be executed [if action1 succeeded].

[david] I know what this structure does because I use it:

      :0
      * conditions
      non-delivering action1
          :0A
          action2

      :0E
      non-delivering action3
          :0A
          action4

If the conditions match, action1 and action2 are performed and action4 is not (of course action3 is not either), even if action2 is non-delivering; if they fail, action3 and action4 are performed. The A on the fourth recipe refers back to the third and no farther. But I don't know about this:

      :0
      * conditions
      non-delivering action1
          :0A
          * more conditions
          action2

      :0E
      non-delivering action3
          :0A
          action4

Now, suppose the conditions on the first recipe match but those on the second recipe do not match. Would the third recipe (and thus the fourth one) be attempted? I would expect so. [philip] Yes. The last tried recipe didn't match, therefore the E flag will be triggered.

If that isn't what you want, you can prevent it this way:

      :0
      * conditions
      {
          :0
          non-delivering-action1

          :0
          * more-conditions
          action2
      }

      :0 E # ignores mismatch inside braces, looks only at same level
      non-delivering action3

      :0 A
      action4

If that is what you want, you can be positive this way:

      # if action2 is non-delivering or vulnerable to error that
      # would cause fall-through

      DID2         # Kill variable

      :0
      * conditions
      non-delivering-action1

          :0 A
          action3

      :0
      * ! DID2 ?? (.)
      non-delivering-action3

          :0 A
          action4

      # if action2 is delivering and sure to succeed
      :0
      * conditions
      non-delivering-action1

          :0 A
          * more-conditions
          action2

      :0
      non-delivering-action3

          :0 A
          action4

[philip] For those who are interested, I'll note that there are only 3 combinations of the a, A, e, and E flags that aren't either illegal or redundant. They are Ae, aE, and AE. I've shown a use for Ae up above. Here's an example of AE:

      :0
      * condition1
      non-delivering action1

          :0 A
          * condition2
          non-delivering action2

      :0 AE
      action3

action3 will only be executed if condition1 matched but condition2 didn't match. Without the A flag, action3 would be executed if either of them failed. This can also be done with a instead of A with analogous results.

Procmail's "flow-control" flags may not be particularly easy to describe in straight terms (and this can all be made more complicated by throwing in a more varied mix of delivering vs non-delivering recipes), but I've found that it usually does what I expect it to do, and when it doesn't or I'm in doubt or I want to be particularly clear, I can always fall back to doing it explicitly via nesting blocks. Pick your poison...


6.0 Matching and regexps (regular expressions)

6.1 Philosophy of abstraction in regexps

Here are two ways to view or write regexps. Make up your own mind.

People who are in favor of writing pure native regexps in the recipes:

      [    ]<[    ]*("([^"\]|\\.)*"|[-!#-'*+/-9=?A-Z^-~]+)...  # "

Whereas someone who immediately wants to abstract things says (this is from philip's great Message-Id matching recipe):

      dq = '"'                                # (literal) double-quote
      bw = "\\"                               # (literal) backwhack
      atom       = "[-!#-'*+/-9=?A-Z^-~]+"
      word       = "($atom|$dq([^$dq\]|$bw.)*$dq)"
      local_part = "$word($s\.$s$word)*"

      $s<$s$local_part...                     # ignore comment here

...abstraction: It makes code clearer when you break it into manageable parts, which possibly surfaces reusable parts. It also makes things look simpler, and enables even novices to understand what's going on there. After we're not connected to the net anymore, others could possibly understand it too. So, naturally we can't agree with any of the previously mentioned arguments presented for keeping regexp "in pure native format".

6.2 Matches are not case-sensitive

Okay, okay; if you read the manual you knew that already. But sometimes someone with years of experience with Unix may take it for granted that procmail would be case-sensitive as the rest of the Unix tools are. Use the D flag to turn on case-sensitivity.
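
For instance, a sketch that catches only shouted subjects; the folder name is made up for the example.

      #   Without D this would also match "urgent" and "Urgent"

      :0 D:
      * ^Subject:.*URGENT
      shouting.spool

Note that D makes the whole condition case-sensitive, so the header name must then be written exactly as it appears in the message.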

6.3 Procmail uses multi line matches

Procmail uses multi line matches by default. This means that ^ and $ match a newline, even in the middle of a regexp. Now that you know this, you can easily interpret e.g. $[^>] as: `a newline followed by a line not starting with a >'.

If you put a '$' after the '\/' match token then procmail will include the matched newline if there's one there. Solution? Don't put a dollar sign there unless you really want a newline; use a period, which matches everything but a newline:

      :0 B
      * ^Search-string: \/.+

6.4 Headers are unfolded before matching

If you have a header that continues on separate lines, you don't have to worry about the line feeds. Procmail silently unfolds the header onto one line before matching it:

      Received: from unknown (HELO Desktop01) (208.11.179.72) by
          palm.bythehand.net with SMTP; 4 Dec 1997 23:29:09 -0000

      :0                          # note, match on continuation line
      * ^Received:.*bythehand\.
      {
          # Do something
      }

6.5 Improving Space-Tab syndrome

Procmail doesn't know about standard escape codes like \t and \n or [\0x00-\0x133]:

      #  Not what you think       # You have to write: space + tab
      [ \t]                       [   ]

But using the space+tab is not very readable and it's a very error prone construct. Here is a suggestion to use variables to improve the readability:

      WSPC   = "    "         # whitespace = space + tab
      SPC    = "[$WSPC]"      # regexp whitespace, the short name
                              # SPC was chosen because you use this
                              # a lot in condition lines.
      NSPC   = "[^$WSPC]"     # negation of whitespace

      :0
      *$ var ?? $NSPC
      {
          #   match anything except space and tab
      }

      :0
      *$ ! var ?? ($SPC|$)
      {
          #   match anything except space and tab and newline
      }

But you cannot use newline inside brackets.

      WSPCL  = "   "'         # Whitespace with line feed
      '

      #   Won't work although WSPCL definition is correct.

      *$ var ?? [$WSPCL]

Instead use variable syntax:

      SPCNL = "($SPC|$)"      # space + tab + newline

If you absolutely need a range of characters, see if you have an echo command in your system to define variables like this:

      NUL_CHAR        = `echo \\00`
      DEL_CHAR        = `echo \\0177`
      REGEXP_NON_7BIT = "[^$NUL_CHAR-$DEL_CHAR]"

6.6 Handling exclamation character

[philip] You do need the first backslash, to keep procmail from taking the exclamation mark as a request to invert the sense of the match. For example, these two conditions are equivalent:

      * ! 200^1 foo
      *   200^1 ! foo

Therefore, a leading '!' must either be backslashed, enclosed in either parens or brackets (I suspect that parens would be more efficient), or prefaced with an empty pair of parens. I would recommend writing the condition with one of these:

      * 200^1 \!!!!
      * 200^1 ()!!!!
      * 200^1 (!!!!)

6.7 Rules for generating a character class

In a "character class" (things between "[" and "]"), metacharacters don't need to be escaped. Well, a backslash is an exception. E.g. [$[^\\] would match any one of the literal characters dollar, opening bracket, caret, and backslash.

  • To match "]" or ")" use [])]
  • To match "[" or "(" use [[(]
  • To include a literal ^, it must not be first
  • To include a literal -, it must be first, last or written \-
  • To include a literal \, you must use \\
  • To include a literal ], it must be first
  • To include a literal [ ( ) or $, just use it anywhere

[elijah] If you are inverting a character class, "first" means just after the ^. So the character class that contains everything but ] ^ and - must look like this:

      [^]^-]

[david] What if I want a literal $ inside brackets? A $ inside brackets, unless it begins a variable name and the "$" modifier is on, always means a literal dollar sign. It cannot mean a newline if it appears inside brackets. A good way to keep it exempt from "$" interpretation is to put it last inside the brackets (unless one also needs to include a literal hyphen and one can't put the hyphen first; then you'll need to escape the dollar sign with a backslash and put the hyphen last -- well, you could alternatively escape the hyphen, I guess), because procmail knows that "$]" cannot possibly be a reference to a variable.

General guideline:

  • ($) always matches a newline, with or without "$" interpretation;
  • [$] always matches a dollar sign, with or w/o "$" interpretation;
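
The guideline and the class rules above can be sketched with a few conditions. The folder names and the search strings are made up for the example.

      #   [$] is a literal dollar sign: Subject with an amount like $100
      :0 :
      * ^Subject:.*[$][0-9]+
      offers.spool

      #   ($) is a newline: a body line that ends with "Regards,"
      :0 B:
      * Regards,($)
      signed.spool

      #   ] first, ^ not first, - last: Subject contains ] or ^ or -
      :0 :
      * ^Subject:.*[]^-]
      odd-subjects.spool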

6.8 Matching space at the end of condition

[david] If you need to have a tab or space at the end of a condition line you can use these:

      * rest of string .*
      * rest of string[ ]
      * (rest of string )
      * rest of string ()
      * rest of string( )         # This may be the best

[philip] From my looking at the source, the last two should be equal in efficiency, and except for a trace difference in regcomp time, should match at the same speed as a solitary trailing blank. The character class version [ ] will be slower. Of course, I suspect that neither you nor your sysadmin will ever notice the difference in speed, and given that 99% of all systems are I/O bound and not CPU bound, the system is incredibly unlikely to notice either. I can't complain though, as I also go to various extremes to seek out every last bit of possible performance. Ah well. The first one would be slower yet, though perhaps no slower than the bracket form.

6.9 Beware leading backslash

I am trying to come up with a procmail recipe that among other things should have the condition 'body does not contain a particular word'. Here is what I tried:

      * ! B ?? \<word\>

[david] You have fallen into the leading backslash problem. If the first character of a regexp is a backslash, procmail takes it as "end of leading whitespace" and strips it. What you coded means "a less-than sign, then the word, then any non-word character." (It also prevents the less-than sign from being taken as a size operator.) Unless the non-word character immediately to the left of the word was a less-than sign, that regexp would fail (and thus the condition would pass). Try this:

      * ! B ?? ()\<word\>

This would work too:

      * ! B ?? \\<word\>

but in a casual reading it would look like "literal backslash, less-than sign, the word, word boundary character," so we on the list generally recommend the empty parentheses.

Do note that the difference in meaning of \< and \> in procmail (where they must match a non-word character) from their meaning in perl and egrep (where they match the zero-width transition into and out of a word respectively) does not come into play here. Because procmail's \< and \> can match newlines (both real and putative), it rarely is a factor. It's a problem only when a single character has to serve both as the ending boundary of one word and also the opening boundary of another. Well, it's also a problem when you have one as the last character to the right of \/, but that's easily solved.

6.10 Correct use of TO Macro

  • TO is not a normal regular expression; it is a special procmail expression that is designed to catch any destination specification. For details, see the miscellaneous section of the procmailrc(5) man pages.
  • Prefer TO_ instead of TO if you have a new procmail. TO_ is better because TO used to be too loose.
  • Please remember to write ^TO, with the anchor in it.
  • Do not put a space between the caret (^) and the word TO in ^TO.
  • Do not put a space between the ^TO and the text that you are matching on; it must be ^TOtext. If this bothers you, you can use ^TO()text instead to get better separation of text.
  • Both letters in TO must be capitalized.
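
As an illustration of the rules above, a minimal sketch of a list-filing recipe might look like this. The list address and the folder name are made-up placeholders, and the sketch assumes a procmail version that provides the ^TO_ macro.

```procmail
# Anchor included, both letters capitalized, no space between
# the macro and the address text (address is hypothetical).
:0:
* ^TO_procmail-users@lists\.example\.org
procmail-list.mbox
```

On an older procmail without ^TO_, the same recipe would presumably have to fall back to the looser ^TO.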

6.11 Procmail's regexp engine

[philip] procmail's regexp engine has no special optimization for anchoring against the beginning of the line. Most programs that have such an optimization have it because they need the line distinction for other reasons (for example, grep by default prints the entire line containing a match). Procmail has no such other reason, so it treats newline like any other plain character in the regexp. There should be no speed difference as long as procmail can say: "the first character I see must be a 'foo'". Note that case insensitivity is handled by making everything lowercase, so a letter being first doesn't bring in the spectre of character-classes or anything like that.

> recipe may have just changed the size of the head, procmail
> cannot keep a byte-count pointer nor a line-count pointer to
> where the body begins but must scan through the head to find the
> blank line at the neck before it begins a body search.

Procmail does this when it reads in the head, not when it goes to search the body, so that cost can't be avoided. Let me repeat: searching the body is no slower than searching the header, if we forget the minimal impact of the size of these two.

6.12 Procmail and egrep differences

[By david]

  • ^ and $ are non-zero-width and anchor to real or putative newlines (rather than to the zero-width start and end of a line);
  • An initial ^^ or a final ^^ anchors to the opening or closing putative newline respectively;
  • ^ and $ in the middle of a procmailrc regexp match to an embedded newline (and must be escaped to match to a caret or a dollar sign);
  • \< and \> are non-zero-width and match to a character that wouldn't be in a word (or to a real or putative newline) [rather than to the zero-width transition into or out of a word]; each always matches one non-word character. As a consequence, .*\< right after the colon of a header field fails when there is no whitespace after the colon. Such a header is rather pathological but still perfectly compliant with RFC822. For this reason, you should use (.*\<)? instead of just .*\< after the colon that terminates a header field name:

      ^Subject:.*\<humor\>        # Wrong
      ^Subject:(.*\<)?humor\>     # Right, notice ?

  • *, ?, and + in the absence of \/ are stingy rather than greedy, and that generally won't matter, but in the presence of \/ they are stingy to the left of \/ and greedy to the right of \/, while in most applications the leftmost wildcard on a line is the greediest and greed decreases from left to right.

6.13 Understanding procmail's minimal matching (stingy vs. greedy)

...I want to have a procmail recipe that will save certain mail to folders where the folder name (always a number) is specified in the subject.

      :0 :
      * ^Subject: *\/[0-9]*
      $HOME/Mail/$MATCH

[philip]...and this won't quite work. For a subject with a space after the colon, the '*' on the left hand side will be matched minimally (zero times), and then the stuff on the right hand side will be matched maximally, but starting at the space still, which will match nothing. This is a case where procmail's minimal matching can cause massive confusion and frustration. The solution is usually the following:

      FORCE THE RIGHT HAND SIDE TO MATCH AT LEAST ONE CHARACTER

By changing the recipe to:

      :0 :
      * ^Subject: *\/[0-9]+
      $HOME/folders/$MATCH

it'll work, because then the left hand side will have to match all the way up to the first digit (but not the digit itself). If you follow the rule in caps then you'll almost always be able to ignore procmail's weirdness in this area.

[david] And examine how procmail matches "Subject: Keywords 9999"

      * ^Subject:.*Keywords.*\/[0-9]*

      procmail: Match on "^Subject:.*Keywords.*\/[0-9]*"
      procmail: Matched ""

The right side was as greedy as it could be; the problem is that we seem to expect greed on the left as well. MATCH is set to null, contrary to our expectation. It is not a bug but rather a frequently misunderstood effect of the way extraction is advertised to operate.

Remember that only the right side is greedy; the left side is stingy, and left-side stinginess takes precedence over right-side greed.

Extraction is implemented this way: the entire expression, left and right, is pinned to the shortest possible match; then the division mark is placed and the right side is repinned to the longest possible match starting at the division. The tricky part is to remember that the division is marked during the stingy stage.

If the expression is

      ^Subject:.*Keywords.*\/[0-9]*

and the text is

      <newline>Subject:<space>Keywords<space>9999<newline>

then the shortest possible match to the entirety is

      <newline>Subject:<space>Keywords

because ".*" and "[0-9]*" both match to null. Then the division mark is placed on the space after "Keywords" and procmail looks for the longest possible match to [0-9]* starting with that space. That, again, is null, so MATCH is set to null.

We see that it works as expected if the regexp is changed to this:

      ^Subject:.*Keywords.*\/[0-9]+

That is a whole other ball of wax. Now the shortest match to the entirety is

      <newline>Subject:<space>Keywords<space>9

and the division mark is placed at the 9. Then procmail refigures the longest match to the right side starting at the division mark and sets MATCH=9999. However here

      ^Subject:.*Keywords\/.*[0-9]*

the second ".*" would have reached not just up to the digits but through them to the end of the line. MATCH would contain the whole rest of the line, all of it matched to ".*", plus a null match to "[0-9]*".

[for the curious reader]

Given line

      Subject: Keywords 9999

the second of the following conditions, which differs from the first only by the inserted extraction marker, would not match and would not set $MATCH:

      ^Subject: Keywords *9999        # matches ok
      ^Subject: Keywords *\/9999      # won't !

because the left side would be matched to "<newline>Subject: Keywords" and the immediately following text, " 9999", did not match the right side. It would actually make the condition fail and keep the recipe from executing. It took a lot of circuitous coding to allow for not knowing in advance exactly how many spaces there would be before the digits.

Call it counterintuitive, but it's not a bug. General advice: always make sure that the right side cannot match null or that the last element of the left side cannot match null. Or in other words: force the right-hand side of the \/ to match at least one character.
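
To see the general advice at work, here is a minimal sketch that pulls a ticket number out of the subject. The word "ticket", the TICKET variable and the log text are invented for illustration; only the shape of the regexp matters.

```procmail
# The right side [0-9]+ cannot match null, so the stingy left side
# is forced to reach all the way up to the first digit.
:0
* ^Subject:.*ticket.*\/[0-9]+
{
    TICKET = $MATCH

    LOG = "ticket number: $TICKET
"
}
```

For a subject such as "Re: ticket #4711 reopened", the division mark lands on the 4 and the greedy right side should then extend MATCH to 4711, following the two-stage pinning described above.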

6.14 Explaining \/ and ()\/

MATCH strips all leading blank lines in 3.11pre7.

[david] \/ with nothing to the left of it means "one foreslash". To start a condition with the extraction operator, use ()\/ or \\/; the latter looks counterintuitively like "literal backslash and literal foreslash" (as it would mean if it appeared farther along in the regexp), so most of us prefer the former.

      *$ var ?? $s+\/$d+      # ok, \/ in the middle
      *$ var ?? \/$d+         # Wrong, when \/ is at the beginning
      *$ var ?? ()\/$d+       # Now ok, () at the beginning

6.15 Explaining ^^ and ^

[philip] Procmail doesn't think in lines when it matches; instead it concatenates all lines together and then runs the regexp engine. This may be a bit surprising, but consider the following, where we want to file away any message that is likely an HTML advertisement:

      #   Body consists entirely of HTML code
      #   something which'll match any message which has "<HTML>"
      #   in the body

      :0 B:
      *$ $s*<HTML>
      HTML.mbox

The condition test is applied to the entire body. If you want to limit it to match only against the beginning of the body, you have to say so using the ^^ token, as you discovered. A simple line anchor (^ or $) just says that there must be a newline (or the beginning or end of the area being searched) at that particular point in the text being matched. Notice the leading anchors below.

      #   trap spam where the *very* first line of the body started with
      #   <HTML>

      :0 B:
      *$ ^^$s*<HTML>
      HTML.mbox

What, exactly, does "Anchor the expression at the very start of the search area..." (i.e. the ^^) mean?

[dan] Technically, an opening ^^ anchors to the putative newline that procmail sees before the first character of the search area (and a closing ^^ anchors to the putative newline that procmail sees after the end of the search area). When the search area is B, that is a point equivalent to the second of the two adjacent newlines that enclose the empty line that marks the end of the head.

The reason I'm bringing that up is this: if there are multiple empty or blank lines between the head and the body, ^^ will mark the start of the second of those lines, not the start of the first line of the body that contains some text.

So if you want to test whether <pattern> is the first printing text in the body, even if it is not necessarily flush left on the very first line, you might need a condition like the following, where SPCNL stands for the alternation space/pipe/tab/pipe/dollar.

      *$ B ?? ^^$SPCNL*<pattern>

6.16 ANDing traditionally

Erm, you knew this already if you read the man pages. Stacking condition lines one after another does the AND operation, where all of the conditions must be present:

      * condition1
      * condition2

6.17 ORing traditionally

Here is a simple OR case. There are some cases where it's impossible to OR conditions with this style. [philip] knows more about those cases.
      *  condition1|condition2

Likewise, two exit code tests can often be ORed like this

      * ? command1 || command2

But there are many situations where two tests cannot be ORed by combining them into one condition:

  • a regexp search of one area ORed with a regexp search of a different area
  • a positive regexp search [i.e., for a match to its pattern] ORed with a negative regexp search [i.e., for the absence of any match to its pattern]
  • an exit code condition ORed with a regexp search condition
  • an exit code condition seeking success ORed with an exit code condition seeking failure
  • a size test ORed with anything else (even another size test)

How can I make OR conditions that all use the SAME action? I want to be able to test for a number of variants on certain requests, all in one block.

[hal] Yes, this can be easily done

      CASE = ""

      :0
      * case 1 tests
      {
          CASE = 1
      }
      :0 E
      * case 2 tests
      {
          CASE = 2
      }

      :0
      * ! CASE ?? ^^^^
      {
          # real work, perhaps with explicit tests on CASE
      }

Case study: Finding text from header and body

[david] In addition to the standard ways of coding OR, here's a special one for searching the subject and the body for a given word in either:

      * HB ?? ^^(.+$)*(Subject:(.*[^a-z0-9])?|$(.*\<)*)remove\>

If the string doesn't have to be preceded by a word border, it gets a little simpler:

      * HB ?? ^^(.+$)*(Subject:.*|$(.|$))*string

6.18 ORing and score recipe

Once any of the conditions match, the score gets a positive value and the recipe succeeds. Idea by Erik Selke selke@tcimet.net

[era comments] ...allegedly the scoring system is going to cost you more than plain old regex matching. Floating-point math and all that, even if you use extremely simple scoring. Thus, it would probably be slightly more efficient to do it the De Morgan way.

      * 1^0 condition1
      * 1^0 condition2

We can now write the previous case study (HB ORing traditionally) with scores. I was tempted to write it like this, when [david] told me the following.

      * 1^0 H ?? match-it
      * 1^0 B ?? match-it

[david] That will work, but it isn't the best way to do ORing, because if a match is found to the first condition procmail still takes the trouble to test the second one. Better, use the supremum score on each condition:

      SUPREME = 9876543210

      *$ $SUPREME^0 first_condition_to_be_ORed
      *$ $SUPREME^0 second_condition_to_be_ORed
      * ... etc. ...
      *$ $SUPREME^0 last_condition_to_be_ORed

Upon reaching the supreme score, procmail will skip all remaining weighted conditions on the recipe, deeming them matched. Since all conditions on this recipe are weighted, once procmail finds one matched condition it will skip the rest and execute the action.
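
Put together as a complete recipe, the supremum idea might look like the sketch below. The sender address, subject keyword and folder name are placeholders invented for this example.

```procmail
SUPREME = 9876543210

# Any one of the three conditions is enough to file the message.
# The dot is written as [.] to keep backslashes out of the *$ lines.
:0:
*$ $SUPREME^0 ^From:.*newsletter@example[.]com
*$ $SUPREME^0 ^Subject:.*newsletter
*$ $SUPREME^0 ^List-Id:.*newsletter[.]example[.]com
newsletter.mbox
```

The order of the conditions is a design choice: putting the test that matches most often first should let procmail skip the others for most messages.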

6.19 ORing by using De Morgan rules

[Tim Pickett tbp@cs.monash.edu.au] I thought I'd point out that there are a few ways to do a logical OR of conditions. Someone posted a solution here that involved using procmail's scoring system, but I figured you could do it without scoring by taking advantage of De Morgan's rule:

      a or b      is same as   not(not a and not b)

or mathematically:

      a || b <=> !( !a && !b )

Here's a way to do ORing

      :0
      * ! condition1
      * ! condition2
      { }             # official procmail no-op. MUST LEAVE SPACE
      :0 E
      action_on_condition1_or_condition2
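
With the placeholders filled in, the same pattern could be sketched as follows. The address, the keyword and the folder name are hypothetical.

```procmail
# Neither from the boss NOR marked urgent: do nothing here.
:0
* ! ^From:.*boss@example\.com
* ! ^Subject:.*urgent
{ }

# E = run only when the recipe above did not match,
# i.e. when the mail is from the boss OR marked urgent.
:0 E:
urgent.mbox
```

Note that the E flag refers to the immediately preceding recipe, so nothing else should be placed between the two.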


7.0 Variables

7.1 Setting and unsetting variables

You have already set variables with the "=" syntax. Variable names are case sensitive: var is different from VAR.

      VAR = /var/tmp  # directory
      VAR = "this"    # literal
      VAR = 1
      VAR = $FOO      # another.
      VAR = "$VAR at" # combined with previous value

Unsetting a variable is done like this

      VAR             # kill variable.
      VAR=            # same, but with old style
      VAR = ""        # Variable is said to be "null" now

And you can put multiple assignments on the same line, although this is not recommended:

      VAR=1  VAR=2  VAR=3

Examine the following, which are all equivalent. The back ticks will not require a shell in the absence of any SHELLMETAS, so none of these will spawn a shell.

          #   case1: We don't care if file exists this time...

          VAR = `cat file`

          #   case2: The use of {} is considered "modern"

          :0
          * condition
          {
              VAR = `cat file`
          }

          #   case3: oldish, and procmail specific and errors have
          #   been reported if you use this construct.
          #   Note: There must be no space in "VAR=|"

          :0
          * condition
          VAR=| cat file

7.2 Variable initialization and sh syntax

Procmail borrows some sh syntax for variable initialization. Note that sh's ${var:=default} and ${var=defaultvalue} syntaxes are not available in a procmail rcfile.

  • VAR1 = ${VAR2:-value} sets VAR1 to VAR2 if VAR2 is set and non-null, and sets VAR1 to default "value" otherwise
  • VAR1 = ${VAR2-value} sets VAR1 to VAR2 if VAR2 is set, and sets VAR1 to default "value" otherwise
  • VAR1 = ${VAR2:+value} sets VAR1 to "value" if VAR2 is set and non-null, and assigns nothing (leaving VAR1 empty) otherwise
  • VAR1 = ${VAR2+value} sets VAR1 to "value" if VAR2 is set, and assigns nothing (leaving VAR1 empty) otherwise

And here are the classic usage examples

      VAR = ${VAR:-"yes"}     # set VAR to default value "yes"
      VAR = ${VAR+"yes"}      # If VAR is set (even to null), set "yes"

Ever wondered if this calls `date` in all cases?

      VAR = ${VAR:-`date`}

No, procmail is smart enough to skip calling date if VAR already had a value. It doesn't evaluate the whole line. Below you see what each initialising operator does. Study it carefully:

      VAR = ""                # Define variable
      VAR = ${VAR:-"value1"}  # VAR = "value1"
      VAR = ""
      VAR = ${VAR-"value2"}   # VAR = ""

      VAR = ""
      VAR = ${VAR:+"value3"}  # VAR = ""
      VAR = ""
      VAR = ${VAR+"value4"}   # VAR = "value4"

      # Note these:
      VAR = "val"
      VAR = ${VAR:+"value3"}  # VAR = "value3"
      VAR = "val"
      VAR = ${VAR+"value4"}   # VAR = "value4"


      VAR                     # kill the variable
      VAR = ${VAR:-"value1"}  # VAR = "value1"
      VAR
      VAR = ${VAR-"value2"}   # VAR = "value2"

      VAR
      VAR = ${VAR:+"value3"}  # nothing is assigned
      VAR
      VAR = ${VAR+"value4"}   # nothing is assigned
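
A typical practical use of the :- form is to let a value given on the procmail command line, or set earlier in the rcfile, win over a built-in default. In this small sketch SPAMFOLDER is a made-up variable name, not a procmail built-in, and the condition is only a stand-in.

```procmail
# Keep SPAMFOLDER if it is already set and non-null,
# otherwise fall back to "spam.mbox".
SPAMFOLDER = ${SPAMFOLDER:-"spam.mbox"}

:0:
* ^X-Spam-Flag: yes
$SPAMFOLDER
```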

And if you want to choose from several initial values, you might use the recipe below instead of the standard var = ${var:-"value"}.

      :0
      * VAR ?? ^^^^
      {
          #   no value (or was empty), set default value here based on
          #   some guesses

          VAR = "base-default"

          :0
          * condition
          {
              VAR = "another-default"
          }

          ...more conditions..
      }

You could also use an equivalent, but less readable, condition line in the previous recipe:

      *$ ${VAR:+!}

It works, because if the variable contains a value the line expands to

       * !

where "!" is the procmail "false" operation. One more way to do the same would be to require at least one character to be present. You could also use the regexp (.), which would require at least one character to be present, but you might not like matching pure spaces.

      * ! VAR ?? [a-z]

7.3 Testing variables

If possible, perform positive tests rather than negative ones. A negative test looks like this:

      * ! TEST_FLAG ?? yes

Written as a positive test, this would be:

      *  TEST_FLAG ?? no

Using literal strings like "yes" and "no" may show more clearly what is going on than a traditional "!" negation of a test. Note that the following fails if the variable is unset or null.

      * variable ?? (.)

That is why it would be better to test:

      *$ variable ?? $NSPC

Or

      * variable ?? (.|$)

to require that variable contain at least one character. But neither is a way to check whether a variable is set or not, because each treats a null variable the same as an unset one. This is the best way to check whether a variable is set or not:

      *$ ! ${VAR+!}

[gsutter@pobox.com] Here is yet another way to test if a variable is set and, if it isn't, set it to a default value.

      :0
      *$ ! VAR^0
      {
          VAR = "value"
      }

7.4 What does $\VAR mean?

[era and david] In procmail 3.11, $\VAR will escape regexp metacharacters. It should produce a suitably backslash-escaped expression for procmail's own use. In addition, $\VAR will always begin with leading empty parentheses.

You can't pass the $\VAR construct to shell programs, because of those leading parentheses. Here's a recipe to standardize the regexp. You can pass SAFE_REGEXP to an external program like sed.

      PROCMAIL_REGEXP = "$\VAR"

      :0
      * PROCMAIL_REGEXP ?? ^^\(\)\/.*
      {
          SAFE_REGEXP = "$MATCH"
      }

[era] Note that this is slightly inexact; Procmail will backslash-escape according to Procmail's needs, not sed's. For example, Procmail doesn't think braces are magic (although that would be nice to have in Procmail as well) whereas many modern variants of sed do.
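
Inside procmail itself, the usual place for $\VAR is a condition that must match a variable's content literally. The following is a minimal sketch; the SENDER value and the folder name are invented for the example.

```procmail
SENDER = "first.last+lists@example.com"

# $\SENDER expands to the address with the dots and the plus sign
# backslash-escaped, so they are matched literally instead of
# acting as regexp operators.
:0:
*$ ^From:.*$\SENDER
from-sender.mbox
```

Without the backslash form, the "+" in the address would be read as a repetition operator and the condition would presumably never match this sender.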

7.5 Common pitfalls when using variables

Procmail is picky and forgives nothing. Here are some of the favorite mistakes one can make:

      $EMAIL  = "foo@site.com"      # Done Perl lately? Remove that $

      # Erm, this is ok, but many procmail recipe writers want to
      # take extra precautions and include the regexps in parentheses.
      # So, maybe (yabba|dabba|doo) would be more safe

      REGEXP  = "yabba|dabba|doo"

      *  Subject:.*$REGEXP  # Hey, you need the "*$ Subject..."

      *$  $REGEXP ?? hello  # surely you meant '* REGEXP ?? hello'

7.6 Quoting: Using single or double quotes

Pay attention to this:

      VAR = "you"
      NEW = 'hey "$VAR"'  # won't expand $VAR; you get the literal text
      NEW = "hey '$VAR'"  # expands to: hey 'you'

You can even combine separate words together

      VAR = "1 ""and"" 2" # same as "1 and 2"

Don't let these many quotes disturb you, just count the beginning and ending quotes. Superfluous here, but you may need some similar construct somewhere else.

      VAR = '1 '"'"'and'"'"' 2'  # same as: 1 'and' 2

[david] Beware forgetting quotes, like when you'd do

      SENDMAILFLAGS = -oQ/var/mqueue.incoming -odq

Procmail translates ! into | "$SENDMAIL" "$SENDMAILFLAGS", as the procmailrc(5) man page warns us. By the rules of sh quoting, that means that the shell sees only the first switch:

      % sendmail -oQ/var/mqueue.incoming

My suggestion: since you need a soft space inside $SENDMAILFLAGS, use the quotes when you define $SENDMAILFLAGS but do this instead of using the ! operator for forwarding:

      SENDMAILFLAGS = "-oQ/var/mqueue.incoming -odq"
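
A forwarding recipe built on that definition could be sketched as below. This is an assumption about what the suggestion amounts to, not a quotation from the list; the recipient address is a placeholder.

```procmail
SENDMAILFLAGS = "-oQ/var/mqueue.incoming -odq"

# $SENDMAILFLAGS is deliberately left unquoted here so that
# the two switches are passed as two separate arguments.
:0
* conditions
| $SENDMAIL $SENDMAILFLAGS someuser@example.com
```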

[Walter Haidinger walter.haidinger@gmx.net] Here's yet another approach: deliver messages from procmail directly to mailboxes in all those users' homes. No sendmail involved, much lower loads.

      :0:
      * <condition>
      /var/spool/mail/someuser

[philip] Assuming that "someuser" is an actual user in the password file (I haven't been following this thread, so maybe that isn't true here), then the following is probably better:

Walter Haidinger comments on this recipe: I'm happy to announce that this works really well. No harm is done to the system-load anymore. What a relief!

      :0 w
      * conditions
      |procmail -d someuser

That lets procmail's very tricky "screenmailbox()" routine take care of bogus mailboxes in a secure fashion.

Is that as safe as forwarding? Does another sendmail delivering to /var/spool/mail/someuser use the same locking mechanism and notice that mailbox is already locked? I don't want to risk a corrupt mailbox.

[philip] Sendmail only delivers directly to files through aliases that say things like:

          whatever: /some/local/file

Under normal circumstances, sendmail calls the local mailer to actually store mail in a file, and since that's procmail (right?), there shouldn't be a problem. Also, sendmail 8 does kernel-level locking when it delivers directly.

7.7 Quoting: Passing values to an external program

Remember to include the double quotes when you send variables' values to shell programs. Below you see a mistake: because the content of SUBJECT is not quoted, it is split into several words and is not available as a whole from the perl variable $ARGV[0].

      :0                          # Use procmail match feature
      * ^Subject:\/.*
      {
          SUBJECT = "$MATCH"
      }

      :0
      * condition
      | perl-script $SUBJECT      # mistake; use "$SUBJECT"

There is also another way. If your script can access environment variables (almost all programs can), then you do not need to pass the variables on the command line. Above, the SUBJECT is already in the environment and in Perl you can get it with:

      $SUBJECT = $ENV{SUBJECT};

Next, do you know what is the difference between these two recipes?

      :0
      | "command arg1 arg2 arg3"

      :0
      | command "arg1" "arg2" "arg3"

You guessed it. The first one quotes the entire command and does not do the right thing; the latter is correct, although whether the quotes are needed depends on the content of the argN values. Anyway, play safe and always add quotes.

Sometimes you need trickier quoting to get single quotes around the arg. Pay attention to this, because this may be the reason why your grep command doesn't seem to succeed as you expect.

      #  If $GREP "$arg" doesn't seem to work

      * ? $GREP "'"$arg"'" $DATABASE

7.8 Passing values from an external program

External programs cannot set procmail variables directly. Programs must write the values to external files, and procmail then reads the values from these files. Capturing only one value is easy:

      var = `command`      # capture STDOUT

But if a program modifies the body and exports some status information, it is trickier. We assume here that the script is controlled by you and that you have added an --export-status option which causes the program to print information to a separate file.

      LOCKFILE    = $HOME/.run$LOCKEXT  # protect external file writing
      valueFile   = $HOME/tmp/values

      #   modify body, and export status values to external file: one
      #   value in every line
      #
      #       VALUE1
      #       VALUE2
      #       VALUE3

      :0 fb
      | $NICE script.pl --export-status $valueFile

      values = `cat $valueFile`

      # Derive values from each line

      :0                              # line 1
      *$ values ?? ^^\/[^$NL]+
      {
          var1 = $MATCH
      }

      :0                              # line 2
      *$ values ?? ^^.*$\/[^$NL]+
      {
          var2 = $MATCH
      }

      :0                              # line 3
      *$ values ?? ^^.*$.*$\/[^$NL]+
      {
          var3 = $MATCH
      }

      LOCKFILE    # Release lock

[richard] Alternatively write valueFile from your rc or external program with lines like

      PARAM1="value for param 1"
      PARAM2="value for param 2"
      PARAM3="value for param 3"

and read it with

      INCLUDERC = $valueFile

Now there is no need to worry about synchronizing the read with the lines, or about adding new parameters, since each is labeled in valueFile.
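
The whole round trip with INCLUDERC might then be sketched as follows. As before, script.pl and its --export-status option are the hypothetical script from the earlier example, assumed here to write PARAM1="..." style lines.

```procmail
LOCKFILE  = $HOME/.run$LOCKEXT      # protect external file writing
valueFile = $HOME/tmp/values

# Filter the body; the script also writes its status values,
# one assignment per line, to $valueFile.
:0 fb
| script.pl --export-status $valueFile

INCLUDERC = $valueFile              # PARAM1, PARAM2, ... are set now

LOCKFILE                            # release lock
```

Because the included file is read as ordinary rcfile text, the script must be trusted to write nothing but assignments into it.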

7.9 Incrementing a variable by a value N

[dan, phil and Richard] Here's a recipe for incrementing a variable by a value N. If $VAR is not a number, we get an error. Note that if $VAR + $N is not greater than 0, this recipe will not change the value of VAR if the assignment happens inside braces. You must place the assignment after the closing curly brace.

      :0
      *$ $VAR ^0
      *$ $N   ^0
      { }             # procmail no-op
      VAR = $=
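
With concrete numbers the recipe reads as below; COUNT and N are names chosen only for this illustration.

```procmail
COUNT = 3
N     = 2

:0
*$ $COUNT ^0
*$ $N     ^0
{ }                 # procmail no-op
COUNT = $=          # the score 3 + 2, so COUNT becomes 5
```

Running it with VERBOSE=on should show the two Score: lines in the log, which is a convenient way to check the arithmetic.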

7.10 Comparing values

It's too expensive to call the shell's test function to do [-lt|-eq|-gt] because you can do the same with procmail. The do-something below is run if SCORE < MAXIMUM. The recipe simply subtracts SCORE from MAXIMUM and determines if the result is positive.

      :0
      *$ -$SCORE   ^0
      *$  $MAXIMUM ^0
      {
          .. do-something
      }

[idea by era] it's getting slightly cumbersome if it's between MIN and MAX:

      :0
      *$   $SCORE ^0
      *$  -$MIN   ^0
      {
          dummy               # no-op, just for the LOG

          :0
          *$ -$SCORE  ^0
          *$  $MAX    ^0
          {
              suitable
          }
      }

E.g. when the values are MIN=1, MAX=5, SCORE=4:

      procmail: Assigning "SCORE=4"
      procmail: Score:       4       4 ""
      procmail: Score:      -1       3 ""
      procmail: Assigning "dummy"
      procmail: Score:      -4      -4 ""
      procmail: Score:       5       1 ""
      procmail: Assigning "suitable"

7.11 Strings: How many characters are there in a given string?

      :0
      *  1^1 VAR ?? .
      { }
      LENGTH = $=
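
As a worked case of the counting recipe, assume VAR holds the eight-letter word "procmail". The weight 1^1 adds 1 for every match of ".", that is, once per character.

```procmail
VAR = "procmail"

:0
*  1^1 VAR ?? .
{ }
LENGTH = $=         # 8, one point per character
```

Since "." never matches a newline, any newlines inside the variable are not counted.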

7.12 Strings: How to strip trailing newline.

Suppose you have used a regexp which left a newline ($) in the MATCH. If you wonder why the recipe works, remind yourself that the regexp operator "." never matches a newline.

      :0
      * VAR ?? ^^\/.+
      {
          VAR = $MATCH
      }

7.13 Strings: deriving the last N characters of a string.

      #   1998-06-23 PM-L [walter] Note the use of
      #   the $ sign below to anchor to end-of-string...
      #
      #   For last 2 characters use * VAR ?? ()\/..$
      #   For last 5 characters use * VAR ?? ()\/.....$

      :0                      # Last character
      * VAR ?? ()\/.$
      {
          TAIL = $MATCH
      }

7.14 Strings: Getting partial matches from a string.

[dan] Getting a match to the right is quite easy with procmail's match operator.

      VAR = "1234567890"

      :0
      * VAR ?? ()\/3.*
      {
          result = $MATCH                         # now 34567890
      }

but deleting 2 characters from the end is nearly impossible without forking an outside process. The cheapest might be expr because it doesn't need a shell to pipe echo to it (as sed would and I believe perl would):

      #   by resetting the shellmetas, this will only call
      #   `expr'. If we hadn't fiddled with shellmetas,
      #   this would have called two processes: sh + expr

      saved       =   $SHELLMETAS
      SHELLMETAS
      result      =   `expr "$VAR" : '\(.*\)..'`  # now 12345678
      SHELLMETAS  =   $saved

ksh or bash could do it as well:

      #   semicolon to force invoking a shell, actually
      #   first question mark will force a shell already.

      saved       = $SHELL
      SHELL       = /bin/sh
      result      = `echo ${VAR%??} ;`
      SHELL       = $saved

Now, if you know that the last two characters will be "90", that's different. Of course, this totally screws up if the third-to-last character is a 9.

      :0
      * VAR   ?? ()\/.*[^0]
      * MATCH ?? ()\/.*[^9]
      {
          result = $MATCH                         # now 12345678
      }

[jari] Comments: If a shell must be used, then awk is a good tool for simple string manipulation. Its startup time is faster than perl's, whose overhead is due to internal compilation. awk also consumes fewer resources overall than perl. The following will only work if VAR is a string of one continuous block of characters. (ARGV[1] can be used)

      saved       =   $SHELLMETAS
      SHELLMETAS

      VAR = ` awk 'BEGIN{ v = ARGV[1];                                \
              print substr(v,1,length(v)-2); exit }'                  \
              "$VAR"                                                  \
            `

      SHELLMETAS = $saved

This version requires some file, any file, so that we get awk started. In the previous code all the work was done in the BEGIN block and no file was ever opened.

      saved       =   $SHELLMETAS
      SHELLMETAS

      VAR = ` awk '{print substr(v,1,length(v)-2); exit }'            \
              v="$VAR" /etc/passwd                                    \
            `

      SHELLMETAS = $saved

[dan] comments awk: = expr is sure to be a smaller binary than awk for = procmail to=20 fork, and it needs much less command-line code to do this job. Note = also that=20 one still has to diddle with SHELLMETAS to avoid a shell, because the = awk code=20 contains brackets; thus it doesn't replace all.=20

There is also a way to remove words from the end of a string by procmail means, if the words are separated by the same separator. Let's use the word this-mailing-list-request, which we would like to shorten to this-mailing-list. [david] presented the recipe 1998-06-16 in PM-L.

      VAR = "this-mailing-list-request"

      #   1) Check that VAR ends in one of these words
      #   2) Get everything up to the last dash and store it in MATCH
      #   3) Read MATCH, but exclude the last dash "-"

      :0
      * VAR   ?? -(owner|request|help)^^
      * VAR   ?? ^^\/.*-
      * MATCH ?? ^^\/.*[^-]
      {
          VAR = $MATCH
      }
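For comparison, and for testing the expected result outside procmail, the same three steps can be sketched in a POSIX shell. The function name strip_list_suffix is an assumption made for this example.

```shell
# strip_list_suffix: hypothetical helper mirroring the recipe above.
# Only values ending in -owner, -request or -help are shortened;
# ${1%-*} then drops the last dash and everything after it.
strip_list_suffix() {
    case "$1" in
        *-owner|*-request|*-help) printf '%s\n' "${1%-*}" ;;
        *)                        printf '%s\n' "$1" ;;
    esac
}

strip_list_suffix "this-mailing-list-request"
```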

7.15 Strings: Procmail string manipulation example

[1998-06-23 PM-L walter] ... Now we get to apply these formulas to strip the last character off a string. It gets a bit ugly for special cases. I've deliberately chosen a worst-case scenario.

      VAR         = "Testing 012301230111"
      RC_APPEND   = $PMSRC/pm-myappend.rc

      :0
      * VAR ?? ()\/.$
      {
          TAIL = $MATCH           # last character of VAR "1"

          # Get the longest match that does not end in the TAIL character

          :0
          *$ VAR ?? ()\/.*[^$TAIL]
          {
              HEAD = $MATCH      # now "Testing 012301230"

              #   if the last two or more characters in VAR are
              #   identical, they all get chopped, oops

              :0
              * -1^0
              *  1^1 VAR  ?? (.)
              * -1^1 HEAD ?? (.)
              {
                  dummy     = "tooshort"
                  INCLUDERC = $RC_APPEND
              }
          }
      }

      result = $HEAD              # "Testing 01230123011"

      # ........................................ pm-myappend.rc
      #   LENGTH(HEAD) plus 1 SHOULD equal LENGTH(VAR). That is
      #   not the case when the last 2 (or more) ending
      #   characters are identical. In that case, call appendrc
      #   recursively to stick back an appropriate number of
      #   TAIL characters.

      :0
      * -1^0
      *  1^1 VAR  ?? (.)
      * -1^1 HEAD ?? (.)
      {
          HEAD      = "$HEAD$TAIL"
          INCLUDERC = $RC_APPEND
      }

7.16 How to raise a flag if the message was filed

      FILED = !       # ! is procmail "false"

      :0 c:           # We process the message more
      * condition
      foo

          :0 a
          {
              FILED   # Kill variable
          }

      ...

      :0              # Stop if previous cases filed the message
      *$ $FILED
      {
          HOST = "_done_"
      }

Or alternatively: procmail automatically sets LASTFOLDER if it delivers the message to a mailbox.

      LASTFOLDER      # kill variable

      :0 c:
      * condition
      foo

      :0 c:
      * condition
      bar

      ... et cetera ...

      :0
      * ! LASTFOLDER ?? ^^^^      # Or   ${LASTFOLDER+!}!
      {
          HOST = "_done_"         # Force procmail to stop
      }

7.17 Dollar sign in condition lines.

#todo, check this recipe

This doesn't seem to work for me...

      * ^TO()$\foo@bar.com

[david] An unescaped dollar sign later in the line represents a newline, so what you have there is searching for the following:

  1. An expression that matches the expansion of the ^TO token (which is anchored to the start of a line by its definition), followed by
  2. A newline, followed at the start of the next line by
  3. "foo@bar" [the backslash escapes the f, which didn't need escaping], followed by
  4. any character that is not a newline (the period is unescaped), and finally
  5. "com".

Try this instead:

      *$ ^TO()$\foo@bar\.com

#todo: the dollar seems exactly the same in the above two examples: are you sure that this is correct?

In fact, to avoid matches to things like foo@bar.community.edu, you might want to do it this way:

      *$ ^TO()$\foo@bar\.com\>

7.18 Finding mysterious foo variable

I have my fellow worker's procmail code and he uses a variable FOO that I can't find in his code anywhere. It's not a shell variable either, because it's literal. Where does it come from?

Your procmail runs /etc/procmailrc when it starts, please check that. It may already define some common variables for all users.

7.19 Storing code to variable

One way to run complex code in a procmail recipe is first to store it in a variable. Idea by [era]. You could do this in a separate shell script too. The following example reads URLs from the body of a message: the URLs have been put on separate lines and some special Subject is used to trigger the dumping of the HTML pages:

      #   Code by [era]
      #
      COMMAND='while read url; do
          case "$url" in
            *://*)
              lynx -traversal -realm -crawl -number_links "$url" |
              $SENDMAIL $LOGNAME
              ;;
          esac
      done'


      #  Notice the trailing semicolon after `eval' !
      :0 bw
      * ^Subject: xxxxx
      | eval "$COMMAND" ;

If you want to run the code inside a nested block, then look carefully: there are double quotes around the command in back ticks. If you leave the double quotes out, then each word in SH_CMD would be interpreted separately:

      SH_CMD = 'echo "$VAR" >> $HOME/test.tmp'

      :0
      * condition
      {
          #   condition satisfied; run the given shell command
          #   and do something more.

          dummy = `"$SH_CMD"`

          ..rest of the code..
      }

A similar construct works for message echoing too:

      MESSAGE='Thank you so much for your message.
      Unfortunately, the volume of mail I receive .... (blah blah blah).
      If your matter is urgent, try calling +358-50-524-0965.
      '

      :0 hw
      * ! ^X-Loop: moo$
      | ($FORMAIL -rt -A "$MY_XLOOP"; echo "$MESSAGE") | $SENDMAIL

7.20 Getting headers into a variable.

[david] Here are several ways to get the entire header into a variable:

      HEADER = `$FORMAIL -X ""` # The space after the X is vital.
      HEADER = `sed /^$/q` # also writable as   HEADER=`sed /./!q`

      :0 h
      HEADER=|cat -

will save the entire header into one variable. It has to be smaller than $LINEBUF, though. This way might work as well, and will require no outside processes if it does:

      :0
      * ^^\/(.+$)*$
      {
          HEADER = $MATCH
      }
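The sed variant is easy to try on a canned message before trusting it in a recipe. The sketch below is an illustration only: the sample message and the function names sample_message and extract_header are made up for the test.

```shell
# sample_message: hypothetical test input; the addresses are placeholders.
sample_message() {
    printf '%s\n' \
        'From: alice@example.com' \
        'Subject: test' \
        '' \
        'Body line' \
        'From: not-a-header@example.com'
}

# sed prints every line up to and including the first empty one, then quits,
# so a line in the body that looks like a header is never reached.
extract_header() {
    sed '/^$/q'
}

sample_message | extract_header
```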

7.21 Converting value to lowercase

If you know that a word belongs to a set of choices, you can do this inside procmail:

      LIST = ":word1:word2:word3:word4"   # Colon to separate words
      WORD = "WORD1"

      :0
      *$ LIST ?? :\/$WORD
      {
          WORD = $MATCH
      }

But if you don't know the word or string beforehand, then this is the generalized way: [idea by era and david]

      :0 D
      * WORD ?? [A-Z]
      {
          WORD = `echo "$WORD" | tr A-Z a-z`
      }


8.0 Suggestions and miscellaneous

8.1 Speeding up procmail

  • Use absolute paths to take the burden of searching for the binary along PATH away from the shell: use the $FORMAIL variable abstraction.

          FORMAIL = "/usr/local/bin/formail"

          :0 fhw
          | $FORMAIL -I "X-My-Header: value"

  • Multiple echo commands that spread over many lines can be converted to a single echo command if the \n escape is supported. You usually see these in auto responders

          echo "........."; \
          echo "........."; \
          echo ".........";

          -->

          echo ".........\n" \
               ".........\n" \
               ".........\n";

  • You can avoid multiple and possibly expensive FROM_DAEMON tests by caching the result at the top of your .procmailrc. You can now use the variable $from_daemon like the big brother FROM_DAEMON. The same idea can be applied to the FROM_MAILER regexp. If you have pm-javar.rc, it already defines the variables $from_daemon and $from_mailer exactly like here:

          from_daemon = "!"

          :0
          * ^FROM_DAEMON
          {
              from_daemon = "!!"  # double !! means "OK"
          }

          :0
          *$ ! $from_daemon
          {
              ..do-it..
          }

  • Count the back ticks and you know how many shell calls procmail has to launch. See if you can minimize them and use some procmail code instead.
  • ^TO and other macros are expensive, see if you can use a simple Header:.*\<match-it\> instead. Well, it's not clear if this gives you much speed advantage.
  • Don't call "$FORMAIL -xHeader:" every time you need a header value; consider whether it suffices to use the match operator \/.
  • You can minimize the calls to only one formail if you add many headers along the way: see the formail usage tips in this document.
  • Searching the body is expensive, simply because it contains more text. There isn't much to do about this, because you use B anyway when you need it.
  • See if you can move some tasks to your .cron file. procmailrc is not meant for those purposes. Instead of calculating daily values every time in procmail, let cron do that at 04:00 or 21:00. Don't run cron at midnight if you can avoid it, because everybody else is running their cron jobs at the same time. If a "logical" date change time can be used (when you arrive at work, when you leave work), use it in cron jobs.
  • [philip] Setting LINEBUF permanently to a big value slows procmail down.
  • Remove all calls to perl and use programs that are nicer to the system (if you just call command line perl, there is probably an equivalent alternative with awk, tr, sed, cut).
  • Examine each shell command and see if you do need SHELLMETAS. If you can set SHELLMETAS to empty, this saves calling "sh" for each invocation of the external command.
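The echo bullet in the list above depends on whether your echo understands the \n escape. A hedged alternative is printf, which reuses its format string for every argument, so one call prints several lines on any POSIX shell. The reply text and the function name reply_text are placeholders.

```shell
# reply_text: hypothetical auto responder text.
# One printf call replaces three echo calls; the format '%s\n' is applied
# to each argument in turn, so every argument ends up on its own line.
reply_text() {
    printf '%s\n' \
        "Thank you for your message." \
        "I am away at the moment." \
        "I will answer when I return."
}

reply_text
```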

8.2 See the procmail installation's examples

Did you remember to look at the examples that come with procmail? If not, it's time to give them a chance to educate you. Here is one possible directory where you could take a look. Ask your sysadm if you can't find the directory.

      % ls /usr/local/lib/procmail-3.11pre7/examples/

Or if you're really anxious to get going on your own, try this. The directory /opt/local is for HP-UX 10 machines and the forward file contains an example of how to define your .forward for procmail.

      % find /opt/local/ -name "forward" -print

If the find succeeded and found the file, then you know where the procmail installation directory is.

8.3 Printing statistics of your incoming mail

If you keep the procmail log running, it will record to which folder each message was filed. There is a program, mailstat, which can process the procmail.log file and print a nice summary of it. If you generate the summary at midnight and clear the log, you get a pretty nice per day/per folder traffic analysis.

      # -k keeps the log file intact
      # -m merges all error messages into a single line

      % mailstat -km procmail.log

8.4 Storing UBE mailboxes outside of quota

I want to store spam outside my disk quota. Problem: if I tell procmail to deliver to, say, /tmp/spam.box, it does so just fine (according to the log). Unfortunately, it delivers to /tmp on the mail host, which I cannot access. spam.box doesn't appear in the /tmp directory of the shell machine when procmail is invoked for incoming mail.

[philip] Under the most likely configuration of sendmail in this situation, it is impossible to have procmail invoked by sendmail on the shell machine: sendmail is probably set to just forward all mail to the designated mail delivery machine.

There are other options: you could temporarily store the mail in your account, then have a cron job on the shell machine that reprocesses the message. That would probably be more efficient than having each message trigger an rsh to the shell machine. If you actually get enough spam that it's pushing against your quota, then the rsh is too expensive -- use a cron job that invokes something like:

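      cd your-maildir     &&
      lockfile spam.lock  &&
      test -s spam        &&
      {
          #   procmail acts as a reliable copy command here:
          #   /dev/null is the (empty) rc file, DEFAULT the target
          procmail DEFAULT=/tmp/spam.box /dev/null < spam && \
              rm -f spam spam.lock || \
          rm -f spam.lock;
      }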

WARNING: the above assumes the following:

  • everything in your-maildir/spam is spam and belongs in /tmp/spam.box
  • no further filtering of the messages is necessary: they just need to be moved (it actually treats everything in your-maildir/spam as a single message and uses procmail as a reliable copy command, thus the DEFAULT assignment and the use of /dev/null as an empty procmailrc)
  • /tmp/spam.box is not a directory

If either of the latter two conditions isn't true OR IF THEY MIGHT CHANGE, then you should use formail -s to break the messages apart and invoke procmail on each one separately.

[era] Many sites cross-mount directories for various reasons. /tmp is always local but /var/tmp might be cross-mounted between the login host and the mail host; another one to try is /scratch -- and if all else fails, ask your admin to set up an NFS share for this purpose.

8.5 Using first 5-30 lines from the message

[era] The regexp to grab a few lines (or all of them, if there are fewer than fifty) is not going to be very pretty, but it saves launching an extra process.

      :0 B
      * $ ^^$SPCNL*\/$NSPC.*$(.*$)?(.*$)? ... etc, the rest of the lines
      {
          toplines =  $MATCH
      }

The skipping of whitespace at the beginning of the message is of course not necessary. You should probably set LINEBUF reasonably high if you grab many lines, say 30: 80*30 = 2400 bytes; probably setting it to 8192 or 16384 is a good idea, depending on how much you want to match. The above gets ugly quickly, so an external command may be simpler:

      #  Here N=30; use sed ${N}q if you don't have head

      :0 Bi
      {
          toplines = `head -$N`
      }

      :0 a
      * toplines ?? pattern
      {
          ...do-it
      }
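The head and sed forms mentioned in the comment can be compared on generated input. This is a sketch with made-up helper names (number_lines, top_lines). Note also that head -$N is the historical spelling; POSIX documents head -n N.

```shell
# number_lines COUNT: hypothetical generator, prints "line 1" .. "line COUNT".
number_lines() {
    i=1
    while [ "$i" -le "$1" ]; do
        printf 'line %s\n' "$i"
        i=$((i + 1))
    done
}

# top_lines N: print the first N lines of stdin, like "head -n N".
# sed quits after line N, so the rest of the input is never processed.
top_lines() {
    sed "${1}q"
}

number_lines 50 | top_lines 5
```

If the input is shorter than N lines, everything is printed, which matches the "or all of them" behaviour described for the regexp version.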

8.6 Using cat or echo in scripts?

I have seen a lot of examples that use 'echo', i.e.,

      :0
      * condition
      | echo "first line of message"  \
             "second ..."             \
             "et cetera"

I started out with spam.rc from "ariel" which got me into the habit of

      :0
      * condition
      | cat file_containing_message

although I note that spam.rc did have one recipe using the echo method. What are the reasons for choosing each method over the other?

Here is a comparison. Choose the one you think is best for you:

  • Echos don't have a dependency on an external file: everything is contained in the .procmailrc file. Echos keep all the relevant stuff in one file. Cats make you maintain multiple files. That's the main reason I lean toward echos; you may have accounts on several machines. It is easier to be able to copy just one generic .procmailrc between them without having to copy a bunch of messages also. Mostly, though, there's no real difference between the two methods.
  • Echo is easier to use with variables.
  • Echo starts many processes, cat only starts one, but this is not always true: in most current Bourne shell implementations, echo is a built-in. This holds true with tcsh too.
  • The main problem I see with the use of cat is "what happens when you forget the file or destroy it?". I suggest that you at least test that the file is readable before catting it.
  • [richard] An argument against echo is that it is not well standardized, and different versions may exist on the same machine. Some recognize -n, some don't; some recognize embedded metacharacters, some don't. This is an argument in favor of print. Print, however, is not a built-in on all systems. The comment on built-ins is pertinent to situations when a shell is spawned. When procmail handles the call directly, it will always look for a stand-alone executable. I guess echo may be better, as long as we are aware of any differences in behavior between built-in and stand-alone versions.
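The advice in the list above to test that the file is readable before catting it could look like the following sketch. The helper name send_text, the file path and the fallback sentence are all assumptions made for the example.

```shell
# send_text FILE FALLBACK: hypothetical helper.
# Print FILE if it is readable, otherwise print the FALLBACK string,
# so a missing or destroyed reply file never produces an empty message.
send_text() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        printf '%s\n' "$2"
    fi
}

send_text "/nonexistent/reply.txt" "Automatic reply: message received."
```

This keeps the convenience of a separate message file while retaining a built-in text, which is the main strength of the echo method.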

8.7 How to run an extra shell command as a side effect?

[jari] I was once wondering what would be the wisest way to send messages to my daily "biff" log file about the events that happened during my .procmailrc execution. This is how [david] commented on my ideas:

      # case 1: print to BiffLog

      dummy = `echo "message: $FROM $SUBJECT" >> $biff`

[david] Problems: you get no locking on the destination file, and unless you put it inside braces you have to run it on every message unconditionally. (Also procmail tries to feed the whole message to a command that won't read it, but the remedies for that don't help very much.)

      # case 2: We consume a delivering recipe and therefore have to use
      #        the `c' flag.

      :0 whic:
      | echo "message: $FROM $SUBJECT" >> $biff

Here it locks the destination file and you can add conditions to it, so it's probably the best. If the head or the body is less than one bufferful, you can limit the unnecessarily written data with h or b, but I think that in most OSes a partial buffer and a full one are the same amount of effort.

      # case 3: We use side effect of "?" here. Cool, but this
      # doesn't do $biff file locking thus message order may
      # not be what you expect.

      :0
      *  condition
      *  ? echo message: $FROM $SUBJECT >> $biff
      { }         # procmail no-op

We have conditions possible, but there is no locking on the destination file. I'd go with method #2 or a variation thereof:

      :0 hic:                 #   we don't necessarily need `w'
      * condition
      | echo message: $FROM $SUBJECT >> $biff


      :0 hi:                  #   Or you could use this
      * condition
      dummy=| echo message: $FROM $SUBJECT >> $biff

[jari] Now that [david] has explained how the various ways differ from each other, I present the recipe where I used case 3. When I was dropping a message to a folder, I wanted to send a message to my biff log too. The idea is that the drop-conditions have already matched and then we run an extra command by using the side effect of the "?" token. As far as the recipe is concerned, the "?" is a no-op. The pedantic way would have been to add a LOCKFILE around the recipe, but imagine 50 similar recipes like this...and you understand why the LOCKFILE was left out. It's only necessary if you worry about sequential writing to the biff file.

      :0 :
      * drop-condition
      * ? echo message: $FROM $SUBJECT >> $biff
      $MBOX

8.8 Forcing "ok" return status from shell script

...the "?" trick only allows running some additional shell commands (the true command always succeeds) while the conditions above have already determined that the drop will take place. And you can always make the condition succeed if a misbehaving shell script always returns a failure exit code.

      * ? misbehaving-shell-script || true

[david] If the script always returns a failure code, just do this:

      * ! ? misbehaving-shell-script

The more complex case is a script that can return either success or failure but you don't care which; if the drop conditions passed, you want to run the action line. echo can also fail if the process lacks permission or opportunity to write to stdout. A more reliable choice is true(1); its purpose in life is to do nothing but exit with status 0.

The command : is a shell built-in which always returns true status. Not exactly more readable than true(1), "|| :" will save the invocation of true (unless true is built into $SHELL), but procmail will still run a shell. On the other hand, as long as the command itself has no characters from SHELLMETAS, a weight of 1^1 and no "|| anything" will avoid the shell process as well.

However, there is yet a better way to make sure that a failure by the script doesn't make procmail abort the recipe:

      :0 flags
      * other conditions
      * 1^1 ? shell-script
      action

Regardless of the exit status of the script, the condition will score 1 and not interfere with procmail's decision about the action line of the recipe. Weighted exit code conditions behave like this (see the procmailsc(5) man page):

      * w^x ? command

scores w on success or x on failure.

      * w^x ! ? command

scores the same as this:

      * w^x  pattern_that_appears_in_the_search_area_$?_times
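The "|| true" and "|| :" idioms discussed in this section can be checked in a plain shell before they go into a condition line. The sketch below is an illustration only; the helper name status_of is made up and false stands in for the misbehaving script.

```shell
# status_of SNIPPET: hypothetical helper.
# Run the snippet in a child shell and print the exit status it ended with.
status_of() {
    if sh -c "$1" >/dev/null 2>&1; then
        printf '0\n'
    else
        printf '%s\n' "$?"
    fi
}

status_of 'false'            # the failing command alone: non-zero
status_of 'false || true'    # forced success with true
status_of 'false || :'       # forced success with the ":" built-in
```

Both forced variants report 0, which is what the "?" condition needs in order to succeed.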

8.9 Make your own .procmailrc available to others

There is never too much to learn about procmail and the best source is the rc files that people have written. Remember to comment your .procmailrc file well before you make it available. Below is a recipe for sending your .procmailrc upon request. If you want to send anything more than one or two files (many times you want to make other files available too), then please do not use this code but a general file server module.

      :0
      * ! ^Subject:.Re:
      *   ^Subject:.*send.*procmailrc
      * ! ^FROM_DAEMON
      {
          :0 fhw:
          | $FORMAIL -rt                                              \
                     -A "Precedence: junk"                            \
                     -I "Subject: Requested .procmailrc"              \
                     -I "$MY_XLOOP"

          :0 a hwic
          | ( cat - $HOME/.procmailrc ) | $SENDMAIL

          :0              # trash the "Send procmailrc" request
          /dev/null
      }

8.10 Using dates efficiently

Note: See the module list, where you will find date and time parsing modules. You can also parse the date from the first Received or From_ header if it is the same each time in your system. That would be orders of magnitude faster and decreases your system load if you receive a lot of mail.

Calling date in your procmail script many times is not a good idea. Use MATCH as much as possible to be efficient in procmail, like below where we call date only once. If you are not in the same time zone as your server, and you want an accurate report of the date, you might amend the invocation to the following:

      date = `TZ="KDT9:30KST10:00,64/5:00,303/20:00";date "+%Y %m %d"`

The basic recipe is here

      # By [richard] add %H:%M%S if you want these as well

      :0
      * date ?? ^^()\/....
      {
          YYYY = $MATCH
      }

      :0
      * date ?? ^^..\/..
      {
          YY = $MATCH
      }

      :0
      * date ?? ^^.....\/..
      {
          MM = $MATCH
      }

      :0
      * date ?? ()\/..^^
      {
          DD = $MATCH
      }

      TODAY   = "$YYYY-$MM-$DD" # ISO std date: like 1997-12-01
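The same "call date once, then pick the pieces" idea can be sketched in a shell for comparison. The function name iso_date is hypothetical; it takes the output of a single date "+%Y %m %d" call and only rearranges it.

```shell
# iso_date "YYYY MM DD": hypothetical helper.
# Splits the argument on whitespace and joins the three fields with dashes,
# so date itself has to run only once.
iso_date() {
    set -- $1            # unquoted on purpose: word splitting does the work
    printf '%s-%s-%s\n' "$1" "$2" "$3"
}

iso_date "1997 12 01"
iso_date "$(date '+%Y %m %d')"
```

The first call prints 1997-12-01; the second prints today's date in the same layout.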

8.11 Keep simple header log

Here is a simple strategy: record everything that comes in and record everything that happened to that message. See how brief info is constantly recorded to the BIFF folder. You can now check the BIFF log every day to see if the messages were sunk to the right folders. Remember to add a BIFF rule to every recipe, so that the sink message [sunk-somewhere] is recorded after the incoming message headers.

I use this one-liner log in my Emacs window which is updated by a live-mode process all the time (see the Emacs tools section later). It gives a nice overview of the mail messages that I'm receiving: it's my biff(1) equivalent in Emacs.

      # this requires that HH and MM have been set up before,
      # see pm-jadate.rc

      NOW     = "$HH:$MM"            # the time only
      TODAY   = "$YY-$MM-$DD $NOW"   # ISO 8601: date and time

      NULL    = $SPOOL/junk.null.spool    # /dev/null is dangerous
      BIFF    = $PMSRC/pm-biff.log

      # or if you prefer a log per day (easy for cleanup):
      # BIFF   = $PMSRC/pm-biff.log.$YYYY$MM$DD

      # .............................................. headers ...

      # DON'T USE THESE: they call shell
      #
      # FROM    = `$FORMAIL -zxFrom:`
      # SUBJECT = `$FORMAIL -zxSubject:`

      :0                    # Use procmail match feature
      * ^From:\/.*
      {
          FROM = "$MATCH"
      }

      :0                    # Use procmail match feature
      * ^Subject:\/.*
      {
          SUBJECT = "$MATCH"
      }

      # ............................................. incoming ...
      #  record log of incoming mail

      # or if you use a biff file per day, you could have:
      # echo "$NOW $FROM $SUBJECT" >> $BIFF

      :0 hwic:
      |  echo "$TODAY $FROM $SUBJECT" >> $BIFF

      # ......................................... null recipe ...
      # spam-like addresses - let friends@planetall.com fall through

      :0 :
      * From:.*(remove|delete|free|friend@)
      * ? echo "  [null-AddrReject]" >> $BIFF
      $NULL

8.12 Gzipping messages

[Sean B. Straw PSE-L@mail.professional.org] On the recipe delivery line where you'd normally be tossing it into a folder do this instead:

      :0 c:
      |gzip -9fc >> $MAILDIR/mail.mbox.gz

This will compress each message as it comes in (and since most are TEXT, it does a fine job - MIME, OTOH is one of the best ways to mailbomb someone since it doesn't compress well - but the indirect bombing via mailing lists doesn't do this), reducing the disk space required, usually dramatically. Done in conjunction with something like the following at the end of your .procmailrc, you could have a header file you could quickly rummage through looking for valid messages to add to a procmail recipe, then run:

      gzip -d -c mail.mbox.gz | formail -s procmail -m recipe.rc

(note that if the recipe delivers into the mail.mbox.gz file on any condition, then you should look to MOVE the file before running this process, and use the moved version. In fact, this would be a good idea anyway, as newly delivered mail may appear at the end of the gzip file while you're doing this - and since your ultimate goal is to be able to eliminate junk, you'll want to know that after you've processed a gzipped mail file, you can delete it without accidentally whacking new mail).

      :0
      * LASTFOLDER ?? ^^^^
      {
          # Save the message in case we need to retrieve it.

          :0 c:
          |gzip -9fc >> $MAILDIR/mail.mbox.gz

          # copy headers for easy browsing - including being able to
          # identify lists you're being subscribed to.

          :0 h:
          header.log
      }
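Appending one compressed message after another works because gzip treats a file made of several concatenated compressed members as one stream when decompressing. A small sketch to convince yourself, assuming gzip and mktemp are installed; the function name demo_gzip_append and the two messages are made up.

```shell
# demo_gzip_append: hypothetical demonstration.
# Two messages are compressed separately and appended to the same file,
# the way the recipe does it; one gzip -dc call then returns both in order.
demo_gzip_append() {
    box=$(mktemp) || return 1
    printf 'From alice\nfirst message\n\n' | gzip -9fc >> "$box"
    printf 'From bob\nsecond message\n\n'  | gzip -9fc >> "$box"
    gzip -dc "$box"
    rm -f "$box"
}

demo_gzip_append
```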

8.13 Emergency stop for your .procmailrc

[jari] If I have bad luck while I am testing a new recipe, it may run in a loop and it may continuously send me mail messages. I then have to quickly recall .procmailrc and start disabling my individual "control" recipe files. Yet I figure, in situations like this where every second is important, there must be a better way.

[alan] This is quite easy already; put this at the top of your procmailrc:

      #   instead of leading dot file, you may prefer
      #   stopFile = $HOME/procmailrc.stop which shows up in default ls.
      #   On the other hand you can do ls ~/.procmail* to see both...

      stopFile = $HOME/.procmailrc.stop

      :0
      *$ $IS_EXIST $stopFile
      {
          EXITCODE = $EX_TEMPFAIL # Means: retry later; requeue
          HOST     =3D "_stopped_by_external_request_"
      }

Then, when testing your procmailrc and disaster happens, you can simply do the following to disable your procmailrc filtering.

      % touch $HOME/.procmailrc.stop

[richard] This is also a candidate recipe for including in an INCLUDERC. Combining the two ideas, we have a file procmailrc.stop which contains the recipe and is included near the top of .procmailrc. When you don't want it, mv it to procmailrc.go. Procmail complains about missing INCLUDERCs, but it does not complain about them if they exist and are empty. That is another reason not to use dotted file names, and a reason to use cp instead of mv.


9.0 Scoring

9.1 Using scores by an example

First make all the needed matches and let the SCORE value be set. Examine the score after the final value has been calculated. The condition lines say:

      # Idea by 26 Sep 97 Stephane Bortzmeyer bortzmeyer@pasteur.fr

      :0
      *     -250 ^0
      * ^Subject:\/.+$
      *       50 ^1    MATCH ?? [!]
      *       50 ^1    MATCH ?? [$]
      *      100 ^1    MATCH ?? ()\<(free|sex|opportunity|money|great)\>
      *     -250 ^0   ^Subject: *(Fwd|Fw|re):
      * B ?? 100 ^0    ()!!!
      { }             # official procmail no-op

      SCORE = $=      # Score has been calculated

      :0 fhw
      | $FORMAIL -i "X-Spam-Score: scored $SCORE"


      :0:             # If score had positive value, sink message
      *$ $SCORE^0
      junk.spam.mbox

Given the following subject:

      "Great opportunity for free sex; no money required!!!!"

procmail scores it this way: ! was found 4 times (200/weight 50), "free|sex..." regexp matched 4 times (400/weight 100).

               condition score    Total sum so far
                          ----    ----------------
      procmail: Score:    -250    -250 ""
      procmail: Score:     200     -50 "[!]"
      procmail: Score:       0     -50 "[$]"
      procmail: Score:     400     350 "^Subject:.*\<free|sex|...>"
      procmail: Score:       0     350 "^Subject: *(Fwd|Fw|re):"
      procmail: Score:       0     350 ! ""
      procmail: Assigning "SCORE=350"
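The running total in the log can be recomputed by hand. The sketch below only redoes the arithmetic from the figures quoted above (4 matches at weight 50, 4 matches at weight 100, starting from -250); the function name score_total is made up.

```shell
# score_total: hypothetical helper that repeats the arithmetic of the log.
score_total() {
    base=-250               # starting penalty from "-250 ^0"
    bangs=$((4 * 50))       # "!" found 4 times, weight 50
    dollars=$((0 * 50))     # "$" not found
    words=$((4 * 100))      # word regexp counted 4 times, weight 100
    printf 'SCORE=%s\n' "$((base + bangs + dollars + words))"
}

score_total
```

This prints SCORE=350, the value procmail assigns in the last log line.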

[david] Some notes on possible regexps and their differences:

      * 100^1 ^Subject:.*\<(free|sex|opportunity|money|great)\>

That condition says to score 100 for every subject line that contains any of those five words ... not to score 100 for every one of those words in the subject, but 100 for every subject line that contains any of those words. So it will never score more than 100 unless there are multiple subject lines. You see, it offers five alternative regexps:

      ^Subject:.*\<free\>
      ^Subject:.*\<sex\>
      ^Subject:.*\<opportunity\>
      ^Subject:.*\<money\>
      ^Subject:.*\<great\>

Offhand, I think the regexp below would score 400: 100 for "Subject.*free" and 100 for "sex" etc. Of course, the score might be higher if other lines in the header included the strings "sex", "opportunity", "money", or "great<word border>", but appearances of "<word border>free" outside the subject wouldn't be counted.

      * 100^1 ^Subject:.*\<free|sex|opportunity|money|great\>

      [translates to]

      ^Subject:.*\<free
      sex
      opportunity
      money
      great\>

And this one would score 400 too. How? MATCH would contain the whole subject and there would be non-overlapping matches to " great ", " opportunity ", and " free ". If we got rid of either or both of the word-border marks, it would score 500.

      Subject: Great opportunity for free sex; no money required!!!!
      * 100^1 MATCH ?? ()\<(free|sex|money|opportunity|great)\>

9.2 Brief Score tutorial

#todo: test

[elijah] If you're serious about using scores, please spend a minute reading this short example.

      VERBOSE = "yes"

      :0
      *  1^1 foo
      * -2^2 bar
      { }
      a = $=

      :0
      *  1^1 foo
      * -2^2 bar
      {
          :0 f
          | echo Whee: fun ; cat -
      }
      b = $=

      :0
      *  1^1 foo
      * -2^2 bar
      {
          whee = "fun"
      }
      c = $=

      :0 h
      /dev/null

Then, if you send this message:

      From foo Fooof
      To: bar
      Subject foobar

      body-something-here

The log file will tell you what happened.

      procmail: [20175] Fri Sep 26 10:25:23 1997
      procmail: Score:       3       3 "foo"
      procmail: Score:      -6      -3 "bar"
      procmail: Assigning "a=-3"
      procmail: Score:       3       3 "foo"
      procmail: Score:      -6      -3 "bar"
      procmail: Assigning "b=0"
      procmail: Score:       3       3 "foo"
      procmail: Score:      -6      -3 "bar"
      procmail: Assigning "c=-3"
      procmail: Assigning "LASTFOLDER=/dev/null"
      procmail: Opening "/dev/null"
      From foo Fooof
        Folder: /dev/null 46
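
The numbers in the log follow from the w^x rule: the first match adds w, the second w*x, the third w*x*x, and so on. A minimal Python sketch of that arithmetic, assuming the header of the sample message above and case-insensitive matching (the function names are made up for this illustration):

```python
import re

def weighted_score(weight, exponent, matches):
    """Sum weight * exponent**k for k = 0 .. matches-1 (the w^x rule)."""
    return sum(weight * exponent ** k for k in range(matches))

def count(pattern, text):
    # procmail matches case-insensitively by default
    return len(re.findall(pattern, text, re.IGNORECASE))

# Header of the sample message; the body contains neither word.
header = "From foo Fooof\nTo: bar\nSubject foobar\n"

foo = weighted_score(1, 1, count("foo", header))    # 1 + 1 + 1
bar = weighted_score(-2, 2, count("bar", header))   # -2 + -4
total = foo + bar

print(foo, bar, total)   # 3 -6 -3
```

The first column of each "Score:" log line is the contribution of that condition (3 and -6), the second is the running total (3, then -3).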

9.3 Score's scope

If you have a delivering recipe and the score is positive, the action lines are executed. If the score is less than or equal to 0, they are skipped. Either way the $= information is lost at the next recipe definition, even if that recipe is never executed. Study the following example:

      :0
      * 10^0
      {
          dummy = "Score for condition xxxx was: $= $NL"

          :0
          {
              dummy = "Next recipe, Score no longer available: $= $NL"
          }
      }

      #   Won't work.  $= is getting set back to 0 outside of
      #   the delivering recipe.

      dummy = "Score outside of all recipes: $= $NL"

Here is an interesting anomaly which [richard] discovered. It is presented here only as a curiosity. DO NOT USE IT IN YOUR RECIPES. (This is not "clean programming", but a hack.)

[david] If you want to save the score for later use (even if it is zero or negative):

      :0
      * 10^0
      { }                     # procmail no-op

      SCORE = $=

      :0 A
      action_if_positive

If other recipes that clobber the references for the A flag intervene, this will work:

      :0
      * 10^0
      { }                     # procmail no-op

      SCORE = $=

      ... more stuff ...

      :0
      *$ $SCORE^0
      action_if_positive

9.4 Counting length of a string

Supposing VAR contains some text, we can count the characters by using a dot to match every character and increasing the score for every match.

      :0
      * 1^1 VAR ?? .
      { }

      LENGTH = $=

9.5 Counting lines in a message (Adding Lines: header)

[1995-10-03 PM-L Idea by David Karr dkarr@nmo.gtegsc.com] [david] later corrected 1998-01-02: For one thing, the second condition always counts one too many (the final newline plus the closing putative newline create the extra match); second, after making that correction, an empty body would score zero and leave the variable undefined.

      :0
      * 1^1 .
      * 1^1 ^.*$
      * -1^0
      { }
      lines = $=

      :0 fhw
      * ! ^Lines:
      | $FORMAIL -a "Lines: $lines"

The reason we used it at all was that size conditions worked only on the entire text regardless of H or B or HB flags at the top of the recipe. Nowadays we can do this and get the accurate figure in one condition:

      # leave `B ??' out to measure the entire message
      :0
      * 1^1 B ?? > 1
      { }
      size = $=

If you want to be silly about it (as some of us very often do),

      :0
      * -1^1 B ?? > -1
      { }
      size = $=

gives the same result, and as long as the search area is non-empty, so do these, which are even sillier:

      :0
      * 1^-1 B ?? < 1
      { }
      size = $=

      :0
      * -1^-1 B ?? < -1
      { }
      size = $=
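
Why the four variants agree can be checked with a little arithmetic. The sketch below assumes the weighted size formulas as I read them in procmailsc(5): w * (M/limit)^x for a ">" condition and w * (limit/M)^x for a "<" condition, where M is the size of the search area. The function name and the sample size are invented for the illustration.

```python
from fractions import Fraction

def size_score(weight, exponent, op, limit, size):
    """Score of a weighted size condition `w^x > limit` or `w^x < limit`."""
    if op == ">":
        ratio = Fraction(size, limit)
    else:
        ratio = Fraction(limit, size)   # undefined for an empty search area
    return weight * ratio ** exponent

M = 1234  # hypothetical body size in bytes

scores = [
    size_score(1, 1, ">", 1, M),      #  1^1  > 1
    size_score(-1, 1, ">", -1, M),    # -1^1  > -1
    size_score(1, -1, "<", 1, M),     #  1^-1 < 1
    size_score(-1, -1, "<", -1, M),   # -1^-1 < -1
]

print([int(s) for s in scores])   # [1234, 1234, 1234, 1234]
```

The two "<" forms divide by M, which is why they only work when the search area is non-empty.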

[Karr] This recipe counts the bytes in the message. You could use it as a Content-Length replacement, but prefer using the next recipe. The first score counts every character, and the second score adds one for every line (that is, the newlines are added).

      :0 HB                       # use B to measure body only
      *    1^1 .
      *    1^1 ^.*$
      {
          textsize = $=

          :0 fhw
          * ! ^Content-length:
          | $FORMAIL -a "Content-length: $textsize"
      }

9.6 Determining if body is longer than header

      :0
      *  1^1 B ?? > 1
      * -1^1 H ?? > 1
      {
          ..body was longer
      }

9.7 Matching last Received header

[david] Here is a way to use scores to hit the bottommost Received header.

      :0
      * $ 1^1 ^Received:.*by$s+\/.*
      action
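
The trick relies on the scored condition being tried against every Received line, so that MATCH ends up holding the text from the last one. A rough Python sketch of that idea, with an invented header and \s standing in for the $s variable:

```python
import re

# Hypothetical, already unfolded header lines (newest hop first).
header = (
    "Received: from relay.example by mx2.example with SMTP\n"
    "Received: from origin.example by mx1.example with SMTP\n"
    "Subject: test\n"
)

# Stand-in for  ^Received:.*by$s+\/.*  ; the group plays the role of MATCH.
pattern = re.compile(r"^Received:.*by\s+(.*)", re.IGNORECASE | re.MULTILINE)

match_value = None
hits = 0
for m in pattern.finditer(header):
    hits += 1                 # the 1^1 weight adds one per match
    match_value = m.group(1)  # each later match overwrites the earlier one

print(hits, match_value)      # 2 mx1.example with SMTP
```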

9.8 How to add Content-Length header

We use procmail for local delivery, and would like to get it to generate the Content-Length header if one doesn't exist. SunOS mailtool at least gets confused and merges messages together if there is no message body.

[stephen] All you need to do is: a) Make sure that procmail is started without the -Y flag. b) Either, in your sendmail.cf, insert:

      H?l?Content-Length: 0000000000

Or (slightly less efficient), insert the following recipe in your /etc/procmailrc file and procmail will take care of any necessary magic.

      :0 hfw
      * !^Content-Length:
      | /usr/bin/formail -a "Content-Length: 0000000000"

9.9 Testing message size or number of lines

Size conditions ignore H and B on the flag line and always work on HB unless another search area is specified on the condition's own line. To test only the body,

      :0                      # Note: this is in BYTES
      * $ B ?? < $NBR
      {
          ...whatever when fewer bytes
      }

This syntax would obey a B flag on the flag line:

      :0 B                    # Note: this is in LINES
      * -1^1 .
      * -1^1 ^.*$
      *$ $NBR^0
      {
          ...whatever when fewer lines
      }

9.10 Counting commas with recursive includerc

[jari] Foreword: David and Phil really are experts with procmail, so let this section serve as an example of "what on Earth is a recursive procmailrc and how is it used?". I would not personally use a recursive includerc, simply because I would not trade away clarity: I find the awk version below easier to understand and maintain. split just explodes the input at each comma and print returns how many elements were exploded into array a. The performance hit is no bigger than that of the forked procmail binaries in the recursive version.

      :0
      * ^CC:\/.*
      {
          field       = $MATCH

          saved       = $SHELLMETAS
          SHELLMETAS
          commaCount  = `echo $field | awk '{print split($0,a,",")}' `
          SHELLMETAS  = $saved
      }
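
One detail worth noticing: awk's split() returns the number of elements, which is one more than the number of commas. A small Python sketch of the difference, using an invented Cc: field:

```python
# Hypothetical Cc: field content, as it might land in $MATCH.
cc_field = " alice@example.com, bob@example.com, carol@example.com"

elements = len(cc_field.split(","))   # what awk's split($0, a, ",") returns
commas = cc_field.count(",")          # the number of commas themselves

print(elements, commas)   # 3 2
```

So despite its name, commaCount in the recipe above holds the number of comma-separated elements.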

See the recursive RC implementation at <URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/cgi-bin/w3glimpse2HTML/procmail/1997-08/msg00073.html?50#mfs>

[richard] Here is a recipe that needs no recursion. MAX_RECIP is set to 9, but you may prefer some other value. This counts each comma, even though a comma is also allowed inside an address. Some folks sum Resent-xx or non-Resent-xx headers. I sum all.

      :0
      *             1^1 ^(resent|apparently-)?(to|b?cc):\/.*
      *             1^1 MATCH ??,
      * $ -$MAX_RECIP^0
      {
          :0
          * $         $=^0
          * $ $MAX_RECIP^0
          {
              RESULT = "Count of commas is $="
          }
      }


10.0 Formail usage

10.1 Fetching fields with formail -x

If you're new to procmail, your first thought for reading a header's content from the message might be to call:

      SUBJECT = `$FORMAIL -xSubject:`

That's not good. DON'T DO THAT. You just created an expensive shell subprocess where procmail calls formail and feeds the full message to it. We can do the same with minimum effort:

      :0
      * ^Subject:\/.*
      {
          SUBJECT = $MATCH
      }

No shell subprocess is called. This is much faster and consumes fewer resources, although it may need more typing. Use it and your sysadm will be happy with your well-behaved procmail recipes that don't load the CPU unnecessarily. The equivalent with formail might be more secure, because it contains a full RFC-compliant parser. The traditional way of deriving the address with formail is:

      FROM = `$FORMAIL -rtzxTo:`

But you can still make this more efficient. Here is one example where you actually want to use the "old" =| style variable assignment; make sure there are no extra spaces:

      :0 hw
      FROM=|$FORMAIL -rtzxTo:

That way only the header gets fed into formail, whereas the previous backtick version fed it the whole message. Another benefit is that you can then check the return code of formail with an a or A recipe after this one.

10.2 Always use formail's -rt switch

[Philip] As of version 3.14 you should now usually leave out the -t. To quote the formail manpage:

By default, when generating an auto-reply header procmail selects the envelope sender from the input message. This is correct for vacation messages and other automatic replies regarding the routing or delivery of the original message. If the sender is expecting a reply or the reply is being generated in response to the contents of the original message then the -t option should be used.

10.2.1 For procmail versions prior to 3.14

[FAQ] -r breaks RFC822, so always use -rt if you don't know what this means. Perhaps you should always use it anyway.

[david] There is a formail -r[t] rank bar graph in the source code of 3.11pre4. It might be easier to follow as a top-to-bottom listing (and again, Tom Zeltwanger appears to be using one of the older versions where From_ was mistakenly over-promoted). These are the rankings in version 3.11pre4:

      formail -r:                     formail -rt:

      Resent-Reply-To:                Resent-Reply-To:
      Resent-Sender:                  Resent-From:
      Resent-From:                    Resent-Sender:
      Return-Receipt-To:              Reply-To:
      Errors-To:                      From:
      Reply-To:                       Sender:
      Sender:                         Return-Receipt-To:
      From_                           Errors-To:
      Return-Path:                    Return-Path:
      Path:                           From_
      From:                           Path:

[Stephane Bortzmeyer bortzmeyer@pasteur.fr] Always use -rt and never -r, because such precedence (Sender over From) is an important violation of RFC 822. There is one canonical order, described in the RFC, and nothing else should be used, like fuzzy ranking or, worse, reordering. This is a serious problem with formail.

The proper order is:

      Reply-To, else From, else Sender, else <error>

And, how would you deal with resent mail? I.e. Resent-Reply-To, Resent-From, and Resent-Sender?

It treats Resent-X as X ("Whenever the string Resent- begins a field name, the field has the same semantics as a field whose name does not have the prefix."). So you have to choose an order between them; the RFC does not specify it.

[david] I think that the idea is that -r is intended to determine the origination address, not the place to reply; -rt is for determining the place to send replies. For addressing a response, yes, -rt will invert the header in a way more in line with the rules; for figuring out the origination point,

      formail -r -zxTo:

might be better than

      formail -rt -zxTo:

And here's an additional problem: formail -rD always uses the -r precedences; you can't make it use the -rt precedences and the -D cache checking function at the same time.

4.4.4. AUTOMATIC USE OF FROM / SENDER / REPLY-TO (RFC 822 excerpt)

For systems which automatically generate address lists for replies to messages, the following recommendations are made:

  • The Sender field mailbox should be sent notices of any problems in transport or delivery of the original messages. If there is no Sender field, then the From field mailbox should be used.
  • The Sender field mailbox should NEVER be used automatically, in a recipient's reply message.
  • If the Reply-To field exists, then the reply should go to the addresses indicated in that field and not to the address(es) indicated in the From field.
  • If there is a "From" field, but no Reply-To field, the reply should be sent to the address(es) indicated in the From field.

Sometimes, a recipient may actually wish to communicate with the person that initiated the message transfer. In such cases, it is reasonable to use the Sender address.

This recommendation is intended only for automated use of originator-fields and is not intended to suggest that replies may not also be sent to other recipients of messages. It is up to the respective mail-handling programs to decide what additional facilities will be provided.

10.3 Using -rt and rewriting the From address

Sendmail adds the From header which points to your account. But in some cases you may wish to rewrite the From.

  • You respond to a spammer and you want to hide your address to some extent. (The headers will still be there, but at least hitting r in most MUAs picks up the From.)
  • You want to rewrite From to show your virtual address me@forever-lasting-address.com instead.
  • You are currently in some other account, but you want to send a message to some Net service (e.g. a mailing list) that expects to see the same address you used when you first subscribed.

You could also use Reply-To to signify where you want further responses to go, but that doesn't hide your true From address. And there are still MUAs that don't obey Reply-To. Whatever reason you have to rewrite the From header, here is the command.

      :0 fhw
      | $FORMAIL -rt -I "From: me@forever-lasting-address.com"

10.4 Formail -rt and Resent-From header

Here is something that made me scratch my head a lot. Let's first examine a scenario which explains how the mail travels.

      account --> virtual-address --> Local-address

In this chain I was sending a message from my University account to my official work address; the virtual-address delivers the mail to the right local domain. There is only one problem with this picture. When I generated a response from Local-address with formail -rt, the generated address pointed back to virtual-address, which pointed back to Local-address of course. A loop was ready; I never got a route back to the original address account.

What was happening here was that the mail server that handled the virtual-address didn't forward the message, but instead resent the message. In this process a set of new headers were generated:

      Resent-From: <virtual-address>
      X-From-Line: <account>
      Received: from <the virtual-address mailserver>
      Resent-Message-Id: <199710151903.WAA28670@virtual-address>
      Resent-Date: <date>
      Resent-To: <local-address>
      Received: ...<account domain>
      Message-Id: <199710151904.WAA05050@account-domain>
      From: <account-domain>

And now when the formail -rt command was used, it picked up the Resent-From address as the destination where the message should be returned. Surprising, but according to procmail, 100% correct: Resent-From has higher priority than From.

The Resent-* headers are considered informative, and should never be used when automatically generating a response. The problem here is the middleman: it should not resend a message, but rather forward it. So I put this into my .procmailrc to handle the broken middleman at our site.

      #   Remove that misleading Resent-From if it was added by our
      #   "middleman"

      :0 fhw
      * Resent-From: <our-domain>
      | $FORMAIL -IResent-From:

[edward] adds to this: As you know, formail -rt is for composing a response to the address from which an e-mail was sent. Let's say you are on vacation and have set up a procmail recipe to auto-respond to all e-mail you receive. Furthermore, let's say Joe sends me an e-mail and I re-send it to you. If you wanted to respond to the sender of the e-mail that you received, would you e-mail me or Joe? You had better e-mail me because I was the one who sent it to you. Joe may not even know you. Imagine if you did send your response to Joe. It would probably cause him considerable confusion as to why you are sending him e-mail informing him that you are on vacation.

formail -rt uses a heuristic algorithm to determine who it should respond to, based on the presence of various headers and their contents. If you look at the formail.c source code, you'll see a graphical representation of this algorithm. It will also explain the difference between the results of -r and -rt.

Resent-Reply-To has the highest relative importance/reliability of all header fields. Next is Resent-From and Resent-Sender, followed by Reply-To, From, Sender, et al.

10.5 Quoting the message

Use formail -rtk

10.6 Without quoting the message

Use formail -rkb or formail -rkt -p ''

10.7 How to include headers and body to the reply message

Collin Park collin@cup.hp.com suggests that you can grab the content to a variable and then use the variable HDRS as you wish. This code doesn't modify the original header, it just puts a few fields in $HDRS.

      :0hi
      HDRS=|formail -rt "-IPrecedence: junk"        \
                    "-AX-Loop: lewst@corp.com"      \
                    "-IFrom: postmaster@corp.com"

[david] ...It does require that the entire head fit into sed's hold space, but it almost always will; exceptions are cases where the sender messed around and added a bunch of uninformative (and usually self-congratulatory) additional headers or when the message got caught in a loop for a while but finally escaped before being bounced for too many hops.

      :0 fhw r
      | sed -e H -e '$ G'

      :0 fhw
      | $FORMAIL -rt; ... now generate reply ...

10.8 Adding text to the beginning of message

We don't actually filter anything here. It's just a trick to reprint the headers and add some text after them: the text appears at the beginning of the body.

      :0 fhw
      | cat - ; echo "This text comes after the headers."

10.9 Adding text to the end of message

      :0 fb
      | cat -; echo "added text after body"

10.10 Adding text before quoted message

If you are generating an auto-reply message where you want to place the notification at the beginning of the body, followed by the quoted original message, here is a recipe for it. Replace condition with whatever should trigger the reply.

      :0
      * condition
      {
          :0 fhb
          | $FORMAIL -rtk -p '>'   \
            -I "From: me@here.com" \
            -I "$MY_XLOOP"

          :0 fhw
          | cat -; echo "added message at the start of body"
      }

12.10 How to truncate headers (save filing space)

[Idea by Rodger Anderson <rodger@hpbs2245.boi.hp.com>] As a last recipe, if you're tight on space, you could remove extraneous headers. But make sure you want to do that, because headers may contain useful information about URLs and other things like mail server addresses. Some people keep signature information in a separate X-header (say: X-My-Info) instead of at the bottom of the message so that it won't bother people and disturb reply quoting.

      #   Strip header to bare minimum
      #   If this is MIME multipart, then skip recipe

      :0 fhw
      * ! multipart
      |   $FORMAIL -k                                                 \
          -X Date:                                                    \
          -X Subject:                                                 \
          -X Message-Id:                                              \
          -X From                                                     \
          -X To:                                                      \
          -X Cc:                                                      \
          -X Reply-To:                                                \
          -X Mime-Version:                                            \
          -X Content-type:

      :0 :
      mail.default.mbox

[david] comments on the final recipe:

  • You should keep the Reply-To header if there is one. If the sender wanted replies directed to a different address than that in the From header, you are losing that information and, when you respond, writing to the wrong place.
  • You ought to keep To and Cc so that you can tell when you read your mail who else was sent it. If your mail user agent has a group-reply or reply-all function, keeping To and Cc will allow that feature to continue working. Otherwise you are cheating yourself out of it.
  • '-X From' is enough to keep both the From_ line and the From header. You don't need to specify -X From: again after it. (To keep From_ without From: you need to say -X "From " or something similar, with a quoted space.)
  • All mail is going to have a line (usually two) beginning 'From'.

Another slightly different approach is to kill the headers that take the most space. If you're not interested in tracking down the original sender of a possible UBE message, then you can remove the Received headers. You may want to fill out the condition line so that only your work or campus messages are trimmed, and let other messages retain their full headers.

      :0  fhw
      *   possible-condition-to-handle-only-certain-messages
      |   $FORMAIL -I Received:

10.11 Adding extra headers from file

[stephen] Notice that the obvious solution won't do here:

      :0 fhw
      * condition
      | $FORMAIL -rt | cat - $HOME/newHeaders

The problem here is that there will be a newline in the middle, which causes the header to be shortened (procmail determines the new header/body boundary after having processed each filter). Use the following instead.

      :0 fhw
      * condition
      | $FORMAIL -rt -X "" ; cat $HOME/pm-newHeaders.txt ; echo

[david] If $HOME/newHeaders ends in a blank line, you don't need the "; echo". Under some circumstances procmail puts back the blank separating line if it gets lost, but I'm not sure exactly what those are, and you have a SHELLMETAS character in there already (the first semicolon), so a shell is forked anyway.

But this is my favorite way (it assumes that formail -r will never generate a continuation line for From:); if you use it, make sure that the newHeaders file does NOT contain a trailing blank line:

      :0 fhw
      * whatever
      | $FORMAIL -rtn

        :0 A fhw
        | sed "/^From:/r $HOME/newHeaders"

10.12 Splitting digest

[Idea by David Hunt] One interesting idea is to handle digests automatically as single messages by calling procmail recursively. First call formail to split the mail when header fields are contained in the body, calling procmail again as the output program of formail. The insertion of X-Loop makes it possible to reuse .procmailrc for the separate messages.

      #   If it looks like more than one mail, send to formail for
      #   splitting, then send back to procmail for sorting again.

      :0 B
      *  ^From [-_+.@a-z0-9]+  (Sun|Mon|Tue|Wed|Thu|Fri|Sat)
      *  ^From:
      *  ^TO
      *$ ! H ?? ^$MY_XLOOP
      | $FORMAIL -A "$MY_XLOOP" -m4s procmail

10.13 Mailbox: Splitting to individual files

[david] To split some old mail archives into individual files while stripping unimportant header fields, use the following. The keys are to use procmail's -p option, to strong-quote $FILENO in the setting of DEFAULT, and to use /dev/null or a known empty file as the rcfile.

      % setenv FILENO 0000
      % formail -kXDate: -XFrom: -XTo: -XSubject: -XIn-Reply-To:      \
          -XX-Mailer +1ds                                             \
          procmail -p DEFAULT=`pwd`/'$FILENO.txt'                     \
          /dev/null < inputfile

10.14 Mailbox: Extracting all From addresses from mailbox

The -ns causes formail to split the mailbox and feed each mail separately to the next process.

      % cat mailbox | formail -ns formail -xFrom: | sort -u

10.15 Mailbox: Applying procmail recipe on whole mailbox

      % cat mailbox | formail -ns procmail pm-experiments.rc

10.16 Mailbox: run series of commands for each mail (split mailbox)

...Maybe the heat has melted my brain, but I can't seem to get formail to perform a series of commands on each mail that it has split from a folder. Here's an example of a simple debugging attempt: I've tried parentheses, putting the commands into a shell function, and other flailings too numerous to remember, all to naught.

      % formail -s addr=`formail -XFrom: | formail -r | formail -zx To`;\
          echo "$addr" >>output

It appears that formail doesn't use the shell when executing the command specified when splitting. No SHELLMETAS here. Given that, the secret is to fire up the shell explicitly yourself to do the piping:

      % formail -s sh -c 'formail -XFrom: | formail -rzxTo:' >> output

Note that you only need two formails in the pipe, not three, as the -r flag works correctly when combined with other flags.

...To me, a large mailbox would consist of about 10,000 messages per month (that's about what I get). That would mean that my mailbox would contain 60,000 messages in 6 months. I sure as heck wouldn't want to skim through it all or even try to load it up in an MUA.

[1998-08-27 Bennett Todd bet@mordor.net] I also deal with monster volumes of mail. I've switched over entirely to Maildir in all my mail handling; the only place I still see mboxes is in the save folders of my netnews reading (using slrn) and whenever I want to process them I either convert them into Maildir (e.g. for archival) or simply split them into multiple messages. Splitting into multiple messages turns out to be preposterously easy; using GNU csplit:

[richard] The csplit invocation shown here will catch occurrences of ^From embedded in the message body if your MUA hasn't escaped them with a >. Some MUAs use Content-Length headers and don't escape ^From. Procmail supports this. Be cautious if you choose to use this simple split.

      csplit -n4 - '/^From /' '{*}'

That will create an empty xx0000 which I delete, and leave the messages in files named xx0001, xx0002, etc. If you have more than 9999 messages in a folder then go -n6, or -n9, or whatever. Once they're split it's really easy to use shell tools to bundle messages into batches, file them into categories, etc.
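
The caveat about unescaped From lines is easy to reproduce. The Python sketch below mimics a plain split at every line that starts with "From " (it is not csplit itself, and the sample mbox is invented):

```python
def split_on_from(text):
    """Split mbox text at every line that starts with 'From '."""
    messages, current = [], []
    for line in text.splitlines(keepends=True):
        if line.startswith("From ") and current:
            messages.append("".join(current))
            current = []
        current.append(line)
    if current:
        messages.append("".join(current))
    return messages

# Hypothetical two-message mbox; the second body has an unescaped "From " line.
mbox = (
    "From alice Mon Jan  1 00:00:00 2001\n"
    "Subject: one\n"
    "\n"
    "first body\n"
    "From bob Tue Jan  2 00:00:00 2001\n"
    "Subject: two\n"
    "\n"
    "From now on this line is body text\n"
)

parts = split_on_from(mbox)
print(len(parts))   # 3, although the mbox holds only 2 messages
```

The third "message" is just the tail of the second body, which is exactly the failure a naive split produces when From lines are not escaped.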

If you are archiving all mail traffic forever (which I do) then another dandy tool to add to the mix is glimpse http://glimpse.cs.arizona.edu/ It takes a while to build the index, but that's a fine job to run out of cron at night. Once the index is built it's a pleasingly quick way to root through big archives of messages.

10.17 Option -D and cache

[Bob Weissman b_weissm@kla.com] and [stephen] These files are self-limiting. The number after the -D is the size in bytes above which the older entries will be removed. E.g., my .procmailrc has

      :0 Wh:  .msgid.cache$LOCKEXT
      |$FORMAIL -Y -D 12288 .msgid.cache

And the file never exceeds 12288 bytes by very much. Though formail indeed exceeds this size by as much as the length of one message-ID, the file size should never grow significantly beyond that, even if used indefinitely. The file is in binary format, each entry terminated by a single null byte, with an occasional (significant placeholder) double null.

[philip] The format of the cache is initially as follows:

      entry\0entry\0entry\0\0

When the file size grows to equal to or greater than the size specified on the command line, formail starts over at the beginning, using a double-null to mark where it stopped. However, entries after the double-null, except for the partially overwritten one, are still valid and checked, so that the file is then in the format:

      entry\0entry\0entry\0\0partial-entry\0entry\0entry\0\0

New entries will be written after the first double-null, so that it implements a circular cache. Check out lines 319-322 of formail.c
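
The layout can be read back with a few lines of Python. This is only a sketch of the format described above, not formail's own code; the function names and the sample cache image are invented.

```python
def cache_entries(raw):
    """Return the non-empty, null-terminated entries of a cache image."""
    return [entry for entry in raw.split(b"\0") if entry]

def seen_before(raw, message_id):
    """True if message_id is stored as a complete entry in the cache."""
    return message_id in cache_entries(raw)

# Hypothetical cache after one wrap-around:
# two new entries, the double null, a partially overwritten entry, two old ones.
raw = b"<c@x>\0<d@x>\0\0tial@x>\0<a@x>\0<b@x>\0\0"

print(cache_entries(raw))
print(seen_before(raw, b"<a@x>"), seen_before(raw, b"<new@x>"))   # True False
```

The leftover fragment "tial@x>" is still listed, but it can no longer equal a complete Message-ID, so it does no harm in a duplicate check.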

10.18 Option -D and message-id in the body

Some of my messages contain the original Message-ID in the body of the letter and not the header. Is there an option for formail to overcome this problem?

[david] This is strictly untested; I don't know where in the body the Message-IDs appear, but if they're at the top of the body, this might help:

      :0 hW        # Message-Id: in the head,
      *$ ^Message-Id:.*$NSPC
      | $FORMAIL -D $cache_size $cache_name

      :0 E BbW    # If not, but there's one in the body, try the body.
      *$ ^Message-Id:.*$NSPC
      | $FORMAIL -D $cache_size $cache_name

You might want to copy a Message-Id from the body to the head in any case (if there's none already in the head) just to have it in the right place, so we could do that first and then formail -D will work normally. This form will run formail twice if the Message-Id header is in the body instead of the head, but it will look for Message-Id on any line of the body, not just at the top:

      :0 fhw
      *$ ! H   ?? ^Message-Id:.*$NSPC
      *$   B   ?? ^\/Message-Id:.*$NSPC
      | $FORMAIL -A "$MATCH"

      :0 hW
      | $FORMAIL -D $cache_size $cache_name

10.19 Reducing formail calls (conditionally adding fields)

#todo: url

Suppose you want to add fields to the message when some condition is met:

      :0              # compose initial reply
      | $FORMAIL -rt

      :0
      * condition1
      | $FORMAIL -A "X-Header1: value1"

      :0
      * condition2
      | $FORMAIL -A "X-Header2: value2"

Hm, we have three processes called here; can we minimize the calls? Yes, this is an idea from [philip] and [david]. Notice that there is only ONE process needed.

      :0
      * condition1
      {
          hdr1 = "-AX-Header1:value"
      }

      :0
      * condition2
      {
          hdr2 = "-AX-Header2: value"
      }

      :0 fhw
      | $FORMAIL -rt  ${hdr1+"$hdr1"} ${hdr2+"$hdr2"}
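
The same idea, collecting the optional arguments first and starting the program once, can be sketched in Python. The function and the header values are illustrative only; the list stands for the argument vector that a single formail call would receive.

```python
def formail_argv(condition1, condition2):
    """Build one argument list instead of running formail three times."""
    argv = ["formail", "-rt"]
    if condition1:
        argv.append("-AX-Header1: value1")   # like ${hdr1+"$hdr1"}
    if condition2:
        argv.append("-AX-Header2: value2")   # like ${hdr2+"$hdr2"}
    return argv

print(formail_argv(True, False))
# ['formail', '-rt', '-AX-Header1: value1']
```

An unset variable contributes nothing to the command line, just as ${hdr1+"$hdr1"} expands to nothing when hdr1 is not set.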

And if you want to stack all headers into only one variable, it is a bit of extra work. Below we use short variable names only because of the line space: the calls fit on one line.

  • field = all fields stacked into one string.
  • nl = continuation newline terminator of the previous field

The recipe says: if field has a previous value, set nl to the newline separator; then concatenate the previous contents of field with the possible newline and the new header field.

      field       # kill variable
      :0
      {
          nl
          nl     = ${field+"$NL"}
          field  = "$field${nl}X-Header1: value"
      }

      :0
      {
          nl
          nl     = ${field+"$NL"}
          field  = "$field${nl}X-Header2: value"
      }


      :0 fhw                          # If we have something in *field*
      * ! field ?? ^^^^
      | $FORMAIL ${field+-A"$field"}

The above recipe was the most general one: each recipe determined by itself whether field existed previously or not. But if you know that field is already set, you can write a simpler recipe:

      :0          # We know field has a value before our module
      {
          field = "$field${NL}X-Header1: value"
      }
      }

10.20 Formail -A -a options

You can't use option -A with -a or -I if the header name is the same. In the example below you try to keep only one definition of X-1, but the first -A isn't seen when -a is applied.

      formail -A "X-1: 1" -a "X-1: 2"
      -->
      X-1: 1
      X-1: 2

Whereas separate pipes give you the desired results.

      formail -A "X-1: 1" | formail -a "X-1: 2"
      -->
      X-1: 1

      formail -A "X-1: 1" | formail -I "X-1: 2"
      -->
      X-1: 2
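
To make the separate-pipe results easier to follow, here is a small Python model of the three options as the formail man page describes them: -A appends always, -a appends only when no such field exists, -I removes existing fields of that name and then appends. It is a simplified sketch with made-up function names, it works on a plain list of header strings, and it does not try to reproduce what happens inside one single formail invocation.

```python
def field_name(field):
    """Return the lower-cased header name of a 'Name: value' string."""
    return field.split(":", 1)[0].strip().lower()


def has_field(headers, field):
    """True if a header with the same name is already present."""
    return any(field_name(h) == field_name(field) for h in headers)


def opt_A(headers, field):
    """-A: append the field in any case."""
    return headers + [field]


def opt_a(headers, field):
    """-a: append the field only if no field of that name exists yet."""
    return headers if has_field(headers, field) else headers + [field]


def opt_I(headers, field):
    """-I: drop existing fields of that name, then append the new one."""
    kept = [h for h in headers if field_name(h) != field_name(field)]
    return kept + [field]


# Each nested call models one formail process in the pipeline.
print(opt_a(opt_A([], "X-1: 1"), "X-1: 2"))   # ['X-1: 1']
print(opt_I(opt_A([], "X-1: 1"), "X-1: 2"))   # ['X-1: 2']
```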

10.21 Formail -e -s options

[david] I had a file of alternating From and Date lines and wanted to convert it into an mbox.

      formail -dem2 -s < input > mailbox

should have done it, right? Nope; formail -s took it all as one message, even with -m1. When I edited in blank lines, the command worked. My first reaction was that the -e option wasn't working as advertised and that the blank lines were necessary after all.

Then I realized the real problem: there was no interruption in the succession of valid header lines in the input for anything that could look like a body. I could have put something other than blank lines between each pair of header fields and then -e would have done its job, but as long as every additional line looked like a valid RFC822 header field, even if its name was the same as one that had appeared earlier, formail -s assumed that it was still the same message's head.


11.0 Saving mailing list messages

11.1 Using subroutine pm-jalist.rc to detect mailing lists

Because I didn't have sendmail plus addressing capabilities (explained in the next section), I wrote the module pm-jalist.rc. It is included in pm-code.zip.

The subroutine tries to detect and derive the mailing list name directly from the message. Many mailing list daemons (ezmlm, SmartList, listserv, majordomo) use standardized headers from which the list name can be picked. After this subroutine has been applied to a message, the variable LIST contains the mailing list name. You no longer have to manually insert separate recipes for each new mailing list you subscribe to, because this subroutine adaptively finds new mailing lists.

Once the mailing list name has been grabbed, you can easily "map" or convert the name to any suitable folder name before saving it:

      LIST            LIST name   Description of mailing list
      (as grabbed)    you want
      --------------------------------------------------------------
      jde             java.jde    Java Development Env
      java            java.prog   Java programming
      FLAMENCO        flamenco    Flamenco music
      tango-l         tango       Argentine Tango dancing
      tm-en-help      tm-en       Emacs TM mime package mailing list
      w3-beta         w3          Emacs WWW mailing list

You then convert the grabbed LIST to a new folder name with a conversion table:

      JA_LIST_CONVERSION = "\
      jde       java.jde,\
      java      java.prog,\
      FLAMENCO  flamenco,\
      "

And to detect all mailing lists, you only need one recipe, like below:

      INCLUDERC = $PMSRC/pm-jalist.rc

      :0 :                          # if list name was grabbed
      * ! LIST ?? ^^^^
      $LIST_SPOOL_DIR/list.$LIST
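
How pm-jalist.rc reads the conversion table is not shown here, so the Python below is only a guess at the idea: split the "name folder," entries into a lookup table and fall back to the grabbed name when no entry exists. The function names and the fallback rule are assumptions made for this example.

```python
def parse_conversion(table):
    """Turn 'name folder,name folder,' into a dict."""
    mapping = {}
    for entry in table.split(","):
        parts = entry.split()
        if len(parts) == 2:          # skips the empty trailing entry
            mapping[parts[0]] = parts[1]
    return mapping


def folder_for(list_name, mapping):
    """Return the mapped folder, or the grabbed name when unmapped."""
    return mapping.get(list_name, list_name)


TABLE = "jde java.jde,java java.prog,FLAMENCO flamenco,"
mapping = parse_conversion(TABLE)
print(folder_for("jde", mapping))       # java.jde
print(folder_for("tango-l", mapping))   # tango-l
```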

11.2 Using plus addressing foo+bar@address.com

If you have a recent enough (8.8.8+) sendmail, please ask your sysadm to activate plus addressing. Procmail gets bar in $1 automatically.

http://www.FAQs.org/FAQs/mail/addressing/

[Bennett Todd bet@mordor.net] The PLUS feature has also been implemented in qmail and Postfix (nee VMailer). By default qmail uses "-" rather than "+", but it can be configured to use different rules; Postfix doesn't come with either enabled, but its example main.cf has a commented-out line to enable "+"-based support.

[Roy S. Rapoport rsr@macromedia.com] Plus addressing is implemented using sendmail (well, I'm sure the other MTAs can also do it, but my experience is with sendmail). The last few releases of sendmail (8.8.6, 8.8.7, 8.8.8) all seem to automatically default to allowing it. Basically, for any address of the form foo+baz, sendmail ignores the +baz part and just delivers it to foo.

If you want the easiest method to handle mailing list mail, then subscribe to each list by using a dedicated plus address:

      login+list.procmail@example.com
      login+list.debian@example.com
      login+list.linux@example.com

When you receive a message from any of these mailing lists to your login account, the list.procmail part is already in variable $1 and the recipe to sink all mailing lists to their individual folders is very simple:

      #   Note: The $1 contains value only _IF_ procmail
      #   is invoked with option -m or -a (with an argument).
      #   Be sure procmail is invoked with that option either from
      #   the LDA or ~/.forward.
      #
      #   $1 is pseudo variable and it can't be used in condition line,
      #   so we copy the value to ARG.

      ARG = $1

      :0 :
      * ARG ?? list
      $ARG
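
What the mail system does with such an address can be pictured with a few lines of Python. This is only a sketch of the splitting rule described above (everything between the first "+" and the "@" is the extra part); the function name is made up and no real MTA is involved.

```python
def split_plus_address(address):
    """Split 'login+detail@host' into (login, detail, host).

    detail is an empty string when the address has no plus part.
    """
    local, _, host = address.rpartition("@")
    login, _, detail = local.partition("+")
    return login, detail, host


print(split_plus_address("login+list.procmail@example.com"))
```

The middle element is what ends up in $1 in the recipe above.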

[david] Here is what I have configured in sendmail.cf to support plus addressing:

      Mprocmail, P=/usr/bin/procmail, F=DFMmShu,                      \
                      S=11/31, R=21/31,                               \
                      T=DNS/RFC822/X-Unix,                            \
                      A=procmail -m $h $f $u

Well, this is the definition of the procmail mailer, not the local mailer. Furthermore, there's more to plus-addressing support than the definition of the local mailer. Ruleset 0 or 5 needs to be set up to move everything after the + into the 'host' variable ($h). Unless you have a strong understanding of sendmail rule sets and rewriting rules, you should not attempt to add plus-addressing to your sendmail.cf, but instead just install the latest version of sendmail and use the m4 sendmail.cf generation tools with a .mc file that contains:

      FEATURE(local_procmail, `/usr/local/bin/procmail')

plus whatever else your site requires.

      ...Ok, I corrected it. Well, here's what that looks like. I did
      look into the part about Ruleset 5 while trying it on
      originally. But all I could do was make sure that the
      plus-addressing section was there.

      Mlocal, P=/usr/bin/procmail, \
                      F=lsDFMAw5:/|@qSPfhn9, S=10/30, R=20/40,
                      T=DNS/RFC822/X-Unix,
                      A=procmail -Y -a $h -d $u
      Mprog, P=/bin/sh, F=lsDFMoqeu9, S=10/30, R=20/40, D=$z:/,
                      T=X-Unix,
                      A=sh -c $u

11.3 Using RFC comment trick for additional information

Recall from [rfc1036] that the preferred Usenet mail address formats are the following:

        From: login@example.com
        From: login@example.com (First Surname)
        From: First Surname <login@example.com>

I invented this idea after reading Eli's excellent FAQ about mail addressing. Please read it (especially section 19) before you continue in order to understand what I'm going to present.

I have an account which does not support plus addressing and I was kinda jealous of everyone who could use this neat sendmail addressing scheme. Plus addressing makes it so much easier to deal with mailing list messages.

But as it turns out, we can simulate plus addressing to some extent with a pure RFC compliant address. We exploit the RFC comment syntax, where a comment is any text inside parentheses. According to Eli's paper, comments should be preserved during transit. They may not appear in the exact place where originally put, but that shouldn't be a problem. So, we send out a message with the following From or Reply-To line:

      first.surname@domain (First Surname+list.procmail)

Now, when someone replies to you, the MUA usually copies that address as is and you can read the PLUS information at the receiving end and drop the mail to the appropriate folder: mail.procmail.

[About subscribing to mailing lists with RFC comment-plus address]

It's very unfortunate that when you subscribe to lists, the comment is not preserved when you're added to the list database. Only the address part is preserved. I even put the comment inside angles to fool the program into picking up everything between the angles.

      first.surname(+list.procmail)@example.com

But I had no luck. They have too good RFC parsers, which throw away and clean comments like this. E.g. procmail based mailing lists, like the famous SmartList, use formail to derive the return address and formail does not preserve comments. The above gets truncated to

      first.surname@example.com

Also, many mailing lists send out messages as Bcc, so your address is not even available anywhere in the headers, and neither is this nice RFC comment. Ah well, but this RFC comment trick works very well in private communication: virtually all MUAs copy the whole contents of a From or Reply-To header to the To header, preserving comments, and you get the benefit of plus addressing. Here is procmail code to demonstrate reading the PLUS information from an RFC comment-plus field:

      RC_EMAIL = $PMSRC/pm-jaaddr.rc      # Address explode module

      :0
      *$ ^To:\/.*
      {
          INPUT       = $MATCH
          INCLUDERC   = $RC_EMAIL         # Explore grabbed To address

          #  If COMMENT_PLUS was defined, module found "+"
          #  address which contained, say, "mail.procmail".
          #  Save it to folder.

          :0 :
          * $COMMENT_PLUS ?? [a-z]
          $COMMENT_PLUS
      }
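
The address explode module is not reproduced in this document. As a rough stand-in, the Python sketch below shows one way to pull the plus part out of a parenthesized comment with a regular expression. The names COMMENT_PLUS_RE and comment_plus are invented for this example and this is not how pm-jaaddr.rc itself works; nested comments are not handled.

```python
import re

# Text inside parentheses is an RFC 822 comment; take what follows the "+".
COMMENT_PLUS_RE = re.compile(r"\(([^()+]*)\+([^()]+)\)")


def comment_plus(header_value):
    """Return the plus part found in a comment, or None."""
    match = COMMENT_PLUS_RE.search(header_value)
    return match.group(2) if match else None


print(comment_plus("first.surname@example.com (First Surname+list.procmail)"))
```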

Pretty simple. And you can put anything inside the RFC comment and do whatever you want with these plus addresses. NOTE: there are no guarantees that the RFC comment is preserved every time. Well, the standard RFC822 says it must be passed untouched, but I'd say it is kept in 90% of the cases where mail is delivered from one server to another.

Example: if you discuss in Usenet groups, you could use the addresses

      first.surname@example.com (First Surname+Usenet.default)
      first.surname@example.com (First Surname+Usenet.games)
      first.surname@example.com (First Surname+Usenet.emacs)
      first.surname@example.com (First Surname+Usenet.linux)

11.4 Simple mailing list handling

[Peter S Galbraith galbraith@mixing.qc.dfo.ca] I have used this in the past (by simply looking at the spool file and seeing the From_ line of the message):

      :0 :
      * ^From debian
      list.debian.mbox

      :0 :
      * ^From procmail
      list.procmail.mbox

Now, I collect specific high-volume mailing lists (like Debian) into their own spool files like above, and let other recipes catch all other mailing lists (like procmail and fvwm) into a single spool file with later rules:

      :0 :                                    # Majordomo lists
      * ^Sender: owner-\/[-a-zA-Z0-9_.]*
      list.$MATCH.mbox


      :0 :
      * ^X-Mailing-List: <\/[-a-zA-Z0-9_.]*   # SmartList lists
      list.$MATCH.mbox

So Debian mailing list mail goes to Debian, procmail and fvwm mail go to mail lists, and mail addressed to me yet CC'ed to a list goes to my main spool file.

11.5 Archiving according to TO

The traditional way to detect and save mailing list messages is:
      :0 :
      * ^TO()procmail
      list.procmail

      [and so on...]

The following code will save the message to folders list.foo, list.bar, list.procmail when the name is in the TO address.

      #   generalised version
      #   By dattier@wwa.com (David W. Tamkin)
      #   cases desired for foldernames

      LISTS = "(foo|bar|procmail)"

      :0:
      * $ ^TO_()\/$LISTS
      * $ LISTS ?? ()\/$\MATCH
      list.$MATCH

11.6 Using Return-Path to detect mailing lists

[philip] For most mailing lists, a more accurate way to determine whether a message came from the list is to examine the Return-Path:, From_ or Resent-From: header. This catches messages from the list, regardless of whether they were To: the list, Cc: the list, or even Bcc: the list, something which doesn't show in the message at all.

For instance, I refile messages from the procmail mailing list using the following recipe:

      :0
      * ^Return-Path: +<procmail-request@informatik
      ~/Lists/procmail/.

There's one tricky thing to note: if someone sends a message to both me and the list (say, responding to a message I sent to the list), then the copy that got to me through the list will end up in my procmail folder, while the copy that went directly won't. I like this behavior, but some people, possibly yourself, may prefer it if both messages end up re-filed. If so, your best bet is to combine the above with matching against the To: and Cc: headers via the ^TO_ token:

      :0
      * ^Return-Path: +<procmail-request@informatik|\
      ^TO_()procmail@informatik
      ~/Lists/procmail/.

(If you have a version of procmail before 3.11pre4, then you'll need to use "^TOprocmail" instead of "^TO_procmail".) If you're subscribed to many mailing lists, here is one general recipe.

Notice: you don't want to include \< in the recipe like ^TO_\<\/$LISTS, because the ^TO_ token contains something similar to \< but better, so that the \< can only cause problems. A trailing \> is not a bad idea, though because it's not a zero-width assertion but rather an actual character class, you have to strip it from the match.

      LISTS  = "(foo-list|bar-list)"

      #   1) to get the match
      #   2) rematch sans the trailing \>
      #   3) Note: preserves capitalization of the string

      :0
      * $ ^TO_()\/$LISTS\>
      * $ MATCH ?? \/$LISTS
      * $ LISTS ?? ()\/$\MATCH
      {
          M = $MATCH
          <action>
      }

[Era] gives this example to describe what happens above:

      VAR  = "MOO"
      what = "(moo|bar|baz)"

      :0                              # Search what from VAR
      * $ VAR ?? ()\/$what
      {
          #  Now, what was it that really matched? There were several
          #  choices: moo,bar,baz
          #  Beware: $MATCH must not contain regexp characters

          :0
          * $ what ?? ()\/$MATCH
          { }                         # no-op

          # Fine, New MATCH contains moo
      }
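
The double match is easier to see in a language with explicit match objects. The Python sketch below mirrors the two steps: first match the value case-insensitively against the alternatives, then search for the matched text inside the pattern string itself to recover the spelling used there. The function name is made up, and re.escape stands in for the regexp quoting that the procmail version needs.

```python
import re


def canonical_match(value, alternatives):
    """Return the spelling from `alternatives` that matches `value`.

    `alternatives` is a regexp such as "(moo|bar|baz)".
    """
    first = re.search(alternatives, value, re.IGNORECASE)
    if not first:
        return None
    # Second pass: look for the matched text inside the pattern string.
    second = re.search(re.escape(first.group(0)), alternatives, re.IGNORECASE)
    return second.group(0) if second else None


print(canonical_match("MOO", "(moo|bar|baz)"))   # moo
```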


12.0 Procmail, MIME and HTML

12.1 Mime Bibliography

List of annoying things that various MIME implementations do.
...The result is a sort of style guide for implementors of things that generate MIME. Feel free to send comments or contributions. http://www.cs.utk.edu/~moore/mime-style.html

12.2 Mime notes

<URL:http://www.xray.mpe.mpg.de/mailing-lists/procmail/1998-07/msg00248.html>

[1998-07-28 PM-L Brett Glass brett@lariat.org] MIME filename buffer overflow bug described at

      http://www.sjmercury.com/business/microsoft/docs/security0728.htm

This bug is particularly insidious because it can be exploited by spamming software and could impact millions of users in a very short time.

Use procmail to plug the hole at the mail server by truncating the excessively long file names in the MIME headers back to (say) 64 characters. All that's required is to recognize the header below and make sure that <verylongname> is chopped to a reasonable size.

      Content-Disposition: attachment; filename="<verylongname>"
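
The chopping itself could be done by any small filter called from a recipe. The Python sketch below shows just the text transformation, under simplifying assumptions: the filename is quoted and sits on one header line. The names MAX_NAME, FILENAME_RE and truncate_filename are invented for this example, and folded or unquoted parameters are not handled.

```python
import re

MAX_NAME = 64  # the limit suggested in the text above

FILENAME_RE = re.compile(r'(filename=")([^"]*)(")', re.IGNORECASE)


def truncate_filename(header_line, limit=MAX_NAME):
    """Shorten an over-long quoted filename parameter to `limit` characters."""
    def shorten(match):
        return match.group(1) + match.group(2)[:limit] + match.group(3)
    return FILENAME_RE.sub(shorten, header_line)


line = 'Content-Disposition: attachment; filename="' + "A" * 300 + '"'
print(len(truncate_filename(line)))
```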

[era] I believe that the problem isn't really that the filename is over the allowed length for some platform (Macintoshes allow something like 27 characters if memory serves) but a bug in how some particular mail clients allocate memory for the file name string (but I am really only speculating here).

...So far Eudora, Netscape Mail, Outlook Express, and mutt (Unix) have all been found to have buffer overflow problems. (mutt-0.93.2i and up are fixed. A patch for 0.93.1 is available.)

12.3 Software to deal with mime or HTML

See also the nearest Perl CPAN module site, http://www.perl.org/, and CPAN/modules/by-module/MIME/

There's also the Unix program munpack to explode a MIME message into separate files.

[MIME aware mail agents in Unix]

See mutt, which can handle HTML mail. (Pointer to Mutt mentioned below)

All Emacs mail agents can handle MIME if you install some of the mime handling packages: TM, SEMI, rmime.el. See http://www.bmrc.berkeley.edu/~trey/emacs/mime.html

12.4 Mime content type application/ms-tnef

...A member of one of my mailing lists appears to be using Microsoft Mail. His messages to the list are usually accompanied by an encoded attachment like this one: "c:\eudora\users\steven@idma.com\attach\WINMAIL11.DAT" The message headers include the following clause: Content-Type: multipart/mixed; boundary="openmail-part-058c9f3d-00000001" This is driving people crazy. What is causing this and is there any way to make it stop?

Most likely the sender is using Exchange (or Windows Messaging or Outlook97) and sent the messages in Rich Text Format. It puts the RTF message in an attachment called WINMAIL.DAT (application/ms-tnef). But this attachment is useless unless the recipient is also using Exchange.

The sender can turn off the RTF option for messages to you. For more information, see: XCLN: Sending Messages In Rich-Text Format http://support.microsoft.com/support/kb/articles/q136/2/04.asp

12.5 Trapping HTML mime messages

[era] Here's a simple filter to throw out unwanted HTML that is sent by using mime. [jari] This recipe detects if the message is classified as mime text/HTML and junks it to a separate folder. It does not change the message content. If you want to actually remove HTML or other attachments from the message, see pm-jamime-kill.rc in the module list.

      :0:
      *$ ^Content-Type:$s*multipart/(mixed|alternative);\
         $SPCNL*boundary="?\/[^;"]+
      * $ B ?? ^--$\MATCH\$([-a-z]+:.*)*Content-type:$s*text/HTML
      junk.html.mbox

Some more examples can be found in section: 'Explaning ^^ and ^'

12.6 Complaining about HTML messages

[Marek Jedlinski eristic@gryzmak.lodz.pdi.net] This is how I respond to HTML messages. In my noHTML.txt I politely explain why I don't appreciate receiving HTML mail, and ask the sender to resend the message as plain text. What happens in the majority of cases is that the sender resends the same message again ("oh, it bounced, let's try again") and I assume they don't actually read my explanation since they just happily resend the HTML cr*p. It bounces again, at which point they give up... Tough luck, I say ;)

BTW, this recipe is placed after mailing list mail gets sorted. When someone sends HTML mail to a mailing list I read, I just flame them in person.

      TXT_NO_HTML = $HOME/noHTML.txt

      :0 BH
      *  ! ^FROM_DAEMON
      *$ ! ^$XLOOP
      *    ^Content.Type.+multipart.alternative
      *    ^Content.Type.+text.html
      {
              LOG = "$NL --TRASH: multi-part HTML $NL"

              :0
              | ($FORMAIL                                             \
                    -rtk                                              \
                    -A "X-Mailer: Procmail Autoreply"                 \
                    -A "$XLOOP" ;                                     \
                  cat $TXT_NO_HTML                                    \
                  ) | $SENDMAIL
      }

12.7 Converting HTML body to plain text

Note: Older lynx has security holes: http://ciac.llnl.gov/ciac/bulletins/h-82.sHTML http://lynx.browser.org/

The most popular solution to convert an HTML body into plain text is to use lynx. Another, more straightforward method is to use a perl one-liner: it's quicker and easier to use with procmail, but it doesn't pretend to know about the HTML DTD. The recipe below should be taken with a grain of salt: seeing an HTML tag is no guarantee that the body has "only" HTML. A cautious recipe writer also watches for MIME multipart messages. (See pm-jamime.rc to draw some mime characteristics from the message)

This recipe has been written so that you can add more alternative HTML conversion scripts. You may even want to select the appropriate conversion for a message: e.g. perl for unimportant ones.

Note: This is an oversimplified method of checking if the body contains HTML. It would probably be a good idea to check the mime headers which indicate HTML encoding here as well.

      :0 B
      * ()<HTML>
      * ()</HTML>
      {
          conversion = "lynx"     # or select this conditionally

          :0
          * conversion ?? lynx
          {
              # In new lynx version you can read from stdin. If
              # /dev/stdin doesn't exist try /dev/fd/0
              #
              # lynx -dump -force_html -nolist -restrictions=all \
              #   /dev/stdin
              #
              #  Without a global lock on this, you have a chance
              #  that two procmail instances will try to write to
              #  msg.dump

              file = "$HOME/tmp/msg.dump"

              LOCKFILE = $file$LOCKEXT

              :0 fbw
              | cat > $file && lynx -dump $file

              LOCKFILE

          }

          :0 E fbw
          | perl -0777 -pe 's/<[^>]*>//g'

      }
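
For readers who don't speak perl: the one-liner reads the whole body at once and deletes everything that looks like <...>, even when a tag spans several lines. The Python below is an equivalent sketch of that substitution with made-up names; like the perl version it knows nothing about HTML entities, comments or scripts.

```python
import re

TAG_RE = re.compile(r"<[^>]*>")


def strip_tags(text):
    """Remove everything that looks like <...>, as the perl one-liner does."""
    return TAG_RE.sub("", text)


print(strip_tags("<HTML><B>Hello</B>\n<A\nHREF='x'>world</A></HTML>"))
```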

12.8 Getting rid of unwanted mime attachments (HTML, vcard)

Microsoft and Netscape MUAs are conquering the PC world and it's likely that you will receive messages from people that use this software. The unfortunate thing is that you receive the message in mime format:

      HEADERS
      --mime-boundary
      plain text
      --mime-boundary
      Some idiotic HTML (or other type) copy of the text
      --mime-boundary

When you would like to see a traditional message in the format:

      HEADERS
      plain text

Good news. There's a procmail module that addresses this problem. The module can kill any mime attachment and the predefined sets include typical cases:

  • Microsoft Explorer has a bad habit of including a 7k application/ms-tnef attachment at the end of the message.
  • Lotus Notes sends a similar extra attachment.
  • Microsoft Express sends a copy of the message in HTML format in the attachment.
  • Netscape's Mozilla sends a copy of the message in HTML. See example. It also sends annoying vcards.

The module is called pm-jamime-kill.rc and is included in Jari's pm-code.zip. (Note: Procmail module list)
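
The module itself is not reproduced here. Just to illustrate the goal (plain text in, extra copies out), the Python sketch below uses the standard email package to return the first text/plain part of a message. This is not the code or the method of pm-jamime-kill.rc; the function name and the sample message are made up for this example.

```python
import email


def plain_text_only(raw_message):
    """Return the first text/plain part of a MIME message, or None."""
    message = email.message_from_string(raw_message)
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            return part.get_payload()
    return None


RAW = (
    'Content-Type: multipart/alternative; boundary="b1"\n'
    "\n"
    "--b1\n"
    "Content-Type: text/plain\n"
    "\n"
    "plain text\n"
    "--b1\n"
    "Content-Type: text/html\n"
    "\n"
    "<HTML>copy</HTML>\n"
    "--b1--\n"
)

print(plain_text_only(RAW).strip())
```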

12.9 Sending contents of a HTML page in plain text to someone

[timothy] Send a mail with the subject: "GetPage: some.url.here/". And it comes back. Kurt Thams thams@thams.com also pointed out that lynx allows the file:// protocol and since procmail is running as you, this would be a security risk.

      GetFile: ~user/.login

We make the script safe here by forcing "http://$MATCH" and not simply using "$MATCH"

      :0
      *$   ^Subject:$s+GetPage:()\/.*
      *$ ! ^$MY_XLOOP
      |   ($FORMAIL                                                   \
              -rt                                                     \
              -I "Precedence: junk"                                   \
              -I "Subject: Requested page: $MATCH"                    \
              -I "$MY_XLOOP" ;                                        \
           lynx -dump "http://$MATCH" ;                                 \
          )| $SENDMAIL

[era] If all you need is to create a suitable MIME package, there are various MIME command-line utilities such as metasend (which is for interactive use, and so doesn't work very well with Procmail) and mpack you can try. If your needs are simple, you could even read up a bit on the MIME spec and generate the necessary headers and separators yourself (echo Content-Type: multipart/mixed etc etc etc). Conversely, if your needs are complex, get the Perl MIME package from CPAN and cook up your own tool. The MIME FAQ (especially part 6) is a good place to look for info. http://www.FAQs.org/FAQs/by-newsgroup/comp/comp.mail.mime.html

[jari] See the procmail module list at the beginning of this document for procmail based MIME file servers. (Note: Procmail module list)


13.0 Simple recipe examples

13.1 Saving: MH folders -- numbered messages

Hm. This is explained in the procmail man pages, but not very well. There are just one or two occasions where the man page tells how to create individual files instead of concatenating messages to a folder. Notice the /. at the end of the folder name

      :0
      * condition
      dir-folder/.

[manual] When delivering to directories (or to MH folders) you don't need to use lockfiles to prevent several concurrently running procmail programs from messing up.

On a save to a directory, how does procmail determine what to put after $MSGPREFIX to complete the name of the file?

[philip] It's the inode number of the file encoded in base-64 with the set of characters A-Za-z0-9-_, in reverse order. So, for example, the inode numbered 59699 would be encoded as follows:

      59699 = 51 + 64 * ( 36 + 64 * 14 )
      A=0, B=1, ..., N=13, O=14, ..., a=26, ..., k=36, ..., z=51,
      0=52, ...
      --> zkO
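
The arithmetic above can be checked with a few lines of Python. This is a sketch written from the description only, not taken from the procmail source; the names ALPHABET and encode_inode are invented here, and the handling of zero is an assumption.

```python
import string

# Digit values as listed above: A=0 .. Z=25, a=26 .. z=51, 0=52 .. 9=61, then - and _
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"


def encode_inode(number):
    """Encode a non-negative integer in base 64, least significant digit first."""
    if number == 0:
        return ALPHABET[0]           # assumption: zero becomes a single "A"
    digits = []
    while number > 0:
        number, remainder = divmod(number, 64)
        digits.append(ALPHABET[remainder])
    return "".join(digits)


print(encode_inode(59699))   # zkO
```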

13.2 Saving: to monthly folders

      # Use any date method mentioned previously to define variables
      # YYYY YY MM DD. Archive digests monthly

      :0 c:
      * ^From:.*\/mailing-list-digest@some.net
      {
          # Get the "mailing-list-digest" string, do not use following
          #
          #       MBOX = `echo $MATCH | sed -e 's/@.*//' `
          #
          # Because we really don't need those extra shell processes.
          # Procmail can derive the word 10x more efficiently

          :0
          * MATCH ?? ()\/[^@]+
          {
              MBOX = $MATCH
          }

          :0 :
          $YYYY-$MM-$MBOX
      }

13.3 Modifying: Filtering basics

Pay attention to the cat command position in each recipe.
      :0 fbw
      | echo "This is a line of text _before_ the body"; \
        cat -

      :0 fbw
      | cat - ; \
        echo "This is a line of text _after_ the body"

      :0 fbw               # prepend text before the body
      | cat msg.txt -

      :0 fbw               # append text at the end of body
      | cat - msg.txt

      :0 fbwi              # replace the body with text from file
      | cat msg.txt

13.4 Modifying: Squeezing empty lines around message body

[david] Anything that replaces the body is going to require an outside process, even if it's only /bin/echo. In order to trim empty lines from the beginning and the end of the message, you can do this, if the entire body fits into LINEBUF

      :0 B fbw
      * ^^$*\/.(.|$)*.$
      | echo "$MATCH" # trailing extra newline intended

If your version of cat is BSD-ish,

      # SysV's cat has a different meaning for -s and cannot do this

      :0 B fbw
      * $$$
      | cat -s

otherwise, it can be done with a very simple sed filter:

      :0 B fbw
      * ^^($)|$$$
      | sed /./,/^$/!d

Note that cat -s has slightly different results from the others: if there are any empty lines at the top of the body, cat -s will keep one. The echo and sed suggestions will remove all empty lines from the top and, like cat -s, keep one at the bottom.
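
The sed expression is terse, so here is the same line selection spelled out in Python as a sketch (the function name is made up). sed keeps every line that falls inside a range starting at a non-empty line and ending at the next empty line, and deletes the rest. As a side effect, runs of empty lines in the middle of the body are squeezed to one as well.

```python
def squeeze_blank_lines(lines):
    """Keep only lines inside a 'non-empty line .. next empty line' range."""
    kept = []
    in_range = False
    for line in lines:
        if in_range:
            kept.append(line)
            if line == "":       # /^$/ closes the range
                in_range = False
        elif line != "":         # /./ opens a new range
            kept.append(line)
            in_range = True
        # Empty lines outside a range are dropped (the "!d" part).
    return kept


print(squeeze_blank_lines(["", "", "a", "", "", "b", "", ""]))
```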

13.5 Modifying: shuffling headers always to same order

[phil] To sort the headers in the message into a predictable order, you can use the following recipe. The spaces have been eliminated between the -I and its argument in the recipe below. The shell may or may not allow unquoted spaces in the second part of the ${variable:+blah}. For example, under Solaris 2.6, /bin/sh barfs on ${FROM:+-I "From: $FROM"}, while /bin/ksh handles it just fine. I think the POSIX shell standard requires that it be allowed, but, well, will your next system be POSIX compliant?
      :0
      * ()\/^From: +\/.*
      {
          FROM = $MATCH
      }

      :0
      * ()\/^Reply-To: +\/.*
      {
          RT = $MATCH
      }

      :0
      * ()\/^X-Mailer: +\/.*
      {
          XM = $MATCH
      }

      :0
      * ()\/^Message-Id: +\/.*
      {
          MID = $MATCH
      }

      :0
      * ()\/^Date: +\/.*
      {
          DATE = $MATCH
      }

      :0
      * ()\/^To: +\/.*
      {
          TT = $MATCH
      }

      :0
      * ()\/^CC: +\/.*
      {
          CC = $MATCH
      }

      :0
      * ()\/^Subject: +\/.*
      {
          SUBJ = $MATCH
      }

      :0 fh w
      | $FORMAIL                                                      \
          ${XM:+-I"X-Mailer: $XM"}                                    \
          ${TT:+-I"To: $TT"}                                          \
          ${FROM:+-I"From: $FROM"}                                    \
          ${RT:+-I"Reply-to: $RT"}                                    \
          ${CC:+-I"Cc: $CC"}                                          \
          ${MID:+-I"Message-Id: $MID"}                                \
          ${DATE:+-I"Date: $DATE"}                                    \
          ${SUBJ:+-I"Subject: $SUBJ"}
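
The effect of the chained -I options can be pictured as "take these fields out and put them back at the end, in this order". The Python sketch below models that on a list of (name, value) pairs. It is a simplification with made-up names: it keeps every occurrence of a listed field, while the recipe above re-inserts only the one value it captured.

```python
ORDER = ["X-Mailer", "To", "From", "Reply-To", "Cc",
         "Message-Id", "Date", "Subject"]


def reorder(headers, order=ORDER):
    """Move the listed header fields to the end, in the listed order.

    headers is a list of (name, value) pairs; other fields keep their place.
    """
    wanted = [name.lower() for name in order]
    others = [(n, v) for n, v in headers if n.lower() not in wanted]
    moved = []
    for name in wanted:
        moved.extend((n, v) for n, v in headers if n.lower() == name)
    return others + moved


print(reorder([("Subject", "hi"), ("Received", "x"), ("To", "a"), ("From", "b")]))
```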

13.6 Service: Auto answerer to empty messages

[elijah] Here is a piece of code that responds to empty messages.
      :0 B
      * ! ...
      | (echo "From: me@here.com" ;                                   \
        $FORMAIL -r -A"Precedence: junk"                              \
        -A"X-Loop: me@here.com" ;                                     \
        echo "Your blank message was received.\n"                     \
             "Did you mean to say something?\n"                       \
             "\n"                                                     \
             "-- \n"                                                  \
             "My Signature\n"                                         \
             "this has been an automated response\n"                  \
        ) | $SENDMAIL

13.7 Service: File server -- send file as attachments upon request

This section is here only for indexing purposes. The file servers are described in the Procmail module list

13.8 Service: Ping responder

Sometimes I'm on the road and I don't seem to get access to the site where my messages are. The telnet connection fails and standard Unix "ping" plays dead for me. "What's happening at that site?" I wonder. Here is a recipe that I have added to all of my accounts. It sends an immediate reply if at least the mailhost is up and gives some status information.
      :0
      * ^Subject: ping$
      {
          :0 fh
          | $FORMAIL -rt

          #   Remember, Don't send back anything that would be vital to
          #   attacker. It doesn't matter if the `uptime` or other
          #   scripts fail, the reply is sent anyway.

          :0 c    # Record this ping request
          |   ( cat -;                                                \
                echo `uptime`;                                        \
                echo "$HOST User count: " `who | wc -l`;              \
              ) | $SENDMAIL

          :0 :                    # or sink to $DEFAULT
          $PING_SPOOL
      }

13.9 Service: simple vacation with procmail

Don't forget to look into the procmailex(5) man page, which also has a vacation example. The ones presented below may not work for you. Here is a very simple vacation recipe. Whenever the file ~/.vac exists, the vacation program is called. Be sure that you have the ~/.vacation.msg file ready too. Remember that vacation does not save your messages, so we need the c flag here.
      #  Some prefer the non-dotted file which shows up in ls listing

      vacationFlagFile = $HOME/.vac

      :0 wc
      *$ ? $IS_EXIST $vacationFlagFile
      |  vacation $LOGNAME

Some people like to raise a flag in .procmailrc instead of creating a file. If you like the variable approach better, here is the equivalent implementation of the above:

      VACATION = "yes"    # Comment this out when not on vacation

      :0 wc
      * VACATION ?? yes
      | vacation $LOGNAME

[philip] and [era] Since vacation only sends replies and never saves the original messages, one way to do both things is with your .forward file. Substitute "abc" with your login name.

      |/usr/ucb/vacation","exec /usr/local/bin/procmail -f- ||exit 75 #abc

13.10 Service: vacation code example

[By Eric Black eric@Mirador.COM] Here is the procmail part:
      OFFSITE = "my_guest_login@wherever.I.am"

      #  Forward urgent mail to me at my off site address; afterward,
      #  continue processing it as normal. The procmail pattern match
      #  may be case-insensitive, in which case this rule could be
      #  simplified...

      :0 c
      * ^Subject: .*urgent
      | $SENDMAIL $OFFSITE


      #  Use "vacation" to tell other people I'm not here. To enable,
      #  un-comment the next two lines; to disable, comment them out
      #
      #  The -a Identifies another name that can legitimately
      #         appear in the To: line of the mail header instead
      #         of your login name

      :0 wc
      | vacation -a ericb eric

And here is the ~/.vacation.msg file:

      Subject: I'm out of town for a while
      From: eric (via the vacation program)

      I'm out of town until <return-date>.  Your mail regarding
             "$SUBJECT"
      will be read when I return, or possibly at some unknown
      time before then if I get a chance to check for mail.

      If your message must be seen by me before I return,
      you can send it with the word "URGENT" in the subject header.
      Such mail will be automatically forwarded to me so that
      I see it sooner.
      --Eric

13.11 Service: Auto-forwarding

[timothy] I have my .procmailrc set up to forward mail to another (mail only) account. When I am not going to be at that account, I want to turn forwarding off.
      #   look for the file to tell us whether or not to forward mail
      #   if the file exists, forward the mail
      #   or not

      ELSWHERE = "me@elsewhere.com"
      FILE     = "$HOME/.forwardmail"

      :0 c
      *$ ? $IS_EXIST $FILE
      ! $ELSWHERE

      #   if a message arrives from the other account
      #   with the Subject 'forward-off' then rename the
      #   file, effectively turning off forwarding

      :0 hwic
      *$ ^From:.*$ELSWHERE
      *  ^Subject: forward-off
      | $NICE mv -f $FILE $FILE.off

      #   if a message arrives from the other account
      #   with the Subject 'forward-on' then restore the
      #   file, effectively turning forwarding on

      :0 hwic
      *$ ^From:.*$ELSWHERE
      * ^Subject: forward-on
      | $NICE mv -f $FILE.off $FILE

13.12 Service: forward only specific messages

Here is a piece of code that triggers forwarding according to addresses. If you have a lot of this kind of forwarding, you should use a simple awk database which you would grep.
      #   By Jim Hribnak hribnak@nucleus.com
      #   info@domain1.com goes to joe@domain1.com
      #   info@domain2.com goes to fred@domain2.com

      :0
      * ^TO_()info@domain1.com\>
      {
          FORWARDTO = "$FORWARDTO joe@domain1.com"
      }

      :0
      * ^TO_()info@domain2.com\>
      {
          FORWARDTO = "$FORWARDTO fred@domain2.com"
      }

      :0 fhw
      *    FORWARDTO ?? @
      * ! ^$MY_XLOOP
      | $FORMAIL -A "$MY_XLOOP"

        :0 a
        ! $FORWARDTO
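
The "simple awk database" mentioned at the top of this section could look roughly like the sketch below. It is an illustration only and not part of Jim's recipe: the table format (one "list-address forward-address" pair per line) and the helper name lookup_forward are assumptions.

```shell
# lookup_forward TABLE ADDRESS
# Hypothetical helper: print the forwarding address stored for ADDRESS.
# TABLE is assumed to hold one "list-address forward-address" pair per
# line; the comparison ignores case.
lookup_forward() {
    awk -v addr="$2" 'tolower($1) == tolower(addr) { print $2 }' "$1"
}
```

A script wrapping this function could be called in backquotes from the rcfile and its output assigned to FORWARDTO, which would replace the one-block-per-address approach when the list of addresses grows.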

13.13 Service: Making digests

      # By jimo@eskimo.com
      # Add this message to the digest accumulator

      :0 c:
      | $FORMAIL -k -X From: -X Message-Id -X Date -X Subject >> $DIGEST

      # Check size of digest, and send it off if it's big enough

      :0
      * $       -$DIGSIZE     ^0
      * $ `wc -l <$DIGEST`    ^0
      | $NICE send-digest $DIGEST

13.14 Kill: killing advertisement headers and footers

A mailing list that I subscribe to recently began adding a block of "boiler plate" text to the beginning and end of every message that goes through the list (groan). The text is always the same, and is always at the beginning and end of the message.

[david] sed could do both at once, but the problem is that sed never knows when it is N lines from the end if N>0; it knows the last line when it reads it, but when it is looking at the next-to-last line it doesn't know that there is only one more line to come. It does, however, know how many lines of input it has already read.

So I have three suggestions: if you know that the header is X lines long [let's say 5 for this example] and that the first line of the footer contains some string or pattern that will not occur in the significant part of the post,

      :0 fbwi
      * conditions
      | sed -ne 1,5d -e '/pattern/q' -e p

If you recognize the end by the last line that you want to keep instead of the first line that you want to delete, omit the n option and the p instruction:

      | sed -e 1,5d -e '/pattern/q'

Finally, if the only reliable way to spot the footer is by reaching so many lines from the end (because any search pattern might occur in the real text as well), we can score as you've been doing to get the number of the last significant line. Let's say the footer is three lines long; because ^.*$ always counts one line too many (long story), we subtract four instead of three:

      :0 fbwi
      * conditions
      * 1^1 B ?? ^.*$
      * -4^0
      | sed -e 1,5d -e "$="q
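
The first suggestion (fixed-length header, recognizable first footer line) can be tried outside procmail before it goes into a filter recipe. The following is a minimal sketch; the function name strip_boilerplate and the sample text are made up for illustration.

```shell
# strip_boilerplate HEADER_LINES FOOTER_PATTERN
# Reads a message body on stdin, drops the first HEADER_LINES lines and
# everything from the first line matching FOOTER_PATTERN onwards.
# FOOTER_PATTERN must not contain a slash, since it is pasted into /.../.
strip_boilerplate() {
    sed -n -e "1,${1}d" -e "/${2}/q" -e p
}

# Example: two banner lines on top, footer starting at the line "-- footer".
# Prints the two lines "real text" and "more text".
printf 'banner 1\nbanner 2\nreal text\nmore text\n-- footer\nadvert\n' |
    strip_boilerplate 2 '^-- footer'
```

Because of -n, the q command leaves without printing the footer line itself, which is why the explicit p is needed for the lines that are kept.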

13.15 Kill: simple kill file recipe with procmail

Kill files are widely used with news readers to delete uninteresting posts when you enter a newsgroup. A kill file usually contains one single entry per line to match the message content and this can be easily done with procmail. Remember however that for every message procmail forks a process, so before you apply the kill file rules to the messages, be sure your recipes are in this order, so that the kill file rules are applied only to unknown messages:
      SINK MAILING-LISTS
      SINK ANNOUNCEMENTS
      SINK WORK MESSAGES
      OTHER DELIVERIES
      apply kill file rules and UBE recipes to the rest

The recipe will drop the message (i.e. consider it 'delivered') if one of its headers matches a pattern in the kill file.

      :0 hW:  $HOME/.killfile$LOCKEXT
      | egrep -i -f $HOME/.killfile

The reason why there is an explicit lock file is that you must be able to update the kill file while your procmail is running. An example edit script is presented below.

      #!/bin/sh
      # program: killfile.sh
      #
      file=$HOME/.killfile
      lock=$file.lock
      cp $file $file.tmp
      emacs -q $file.tmp      # or use whatever you prefer: vi, pico
      lockfile $lock
      mv $file.tmp $file
      rm -f $lock

13.16 Kill: duplicate messages

[Lars Kellogg-Stedman lars@bu.edu] Put this as the first entry in your .procmailrc and you won't see any duplicates as long as the 8K cache doesn't get full. The duplicates folder is cleaned out weekly via a cron job. While it may be tempting to simply sink duplicates to /dev/null, I have come across broken mail clients that stick the same value in the Message-id header of all outgoing mail.
      :0
      * ^Subject:\/.*
      {
          SUBJECT = $MATCH
      }

      MID_CACHE_LEN   = 8192
      MID_CACHE_FILE  = $PMSRC/msgid.cache
      MID_CACHE_LOCK  = $PMSRC/msgid.cache$LOCKEXT

      LOCKFILE = $MID_CACHE_LOCK

      # IF  the message has a message-id header
      # AND formail -D is successful (exit status=0)
      # THEN
      #   log a message to the procmail log
      #   sink the message

      :0
      *  ^Message-Id:
      * ? $FORMAIL -D $MID_CACHE_LEN $MID_CACHE_FILE
      {
          LOG="dupecheck: discarded message, $SUBJECT $NL"

          :0              # Store duplicates, notice no lock!
          duplicate.mbox
      }

      LOCKFILE            # Release lock by killing variable

And here is a somewhat simpler recipe, a slightly modified version from the [manual]. Procmail notices formail's success and considers the message delivered, but because of the c flag it does not stop processing the rcfile, which lets the message fall through to the safety-copy inbox.

      :0 hWc: $PMSRC/pm-msgid.cache$LOCKEXT
      *  ^Message-id:
      | $FORMAIL -D 8192 $PMSRC/pm-msgid.cache

        :0 a:
        duplicate.mbox
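
To see what the Message-Id cache amounts to, here is a rough shell stand-in for the idea behind formail -D. It is not formail and not how formail is implemented: there is no size cap on the cache and no locking, and the function name check_duplicate is hypothetical.

```shell
# check_duplicate CACHE_FILE
# Reads a message on stdin and prints "duplicate" if its Message-Id is
# already listed in CACHE_FILE, otherwise records the id and prints "new".
# Prints "no-id" when the header has no Message-Id field.
check_duplicate() {
    cache=$1
    # Look only at the header (stop at the first empty line).
    id=$(sed -n -e '/^$/q' -e 's/^[Mm]essage-[Ii][Dd]:[[:space:]]*//p' | head -n 1)
    if [ -z "$id" ]; then
        echo no-id
    elif [ -f "$cache" ] && grep -F -x -q -e "$id" "$cache"; then
        echo duplicate
    else
        printf '%s\n' "$id" >> "$cache"
        echo new
    fi
}
```

In a real rcfile formail -D remains the better choice, since it also trims the cache to the requested length.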

There was a pretty heavy thread around September 1997 about duplicate detection, where some promising stuff was posted. One item you should definitely have in your collection is Eli's hashd <URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/1997-09/msg00160.html>

Matt Saroff also started a thread about duplicates: <URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/1997-05/msg00599.html> where several of the replies are also helpful.

13.17 Kill: spam filter with simple recipes

[Ed McGuire emcguire@i2.com] Seeing several junk mail filters posted recently, varying from the simple to the complex, I thought I would also share my own. I junk whatever comes from my ISP but is not addressed to my domain or to one of the mailing lists I subscribe to.
      #   1.  mail to my domain
      #   2.  NOT addressed to me directly
      #   3.  NOT coming from mailing lists I'm subscribed to.

      :0:
      * ^(received):.*psi\.com
      * ! ^((apparently-)?to|cc):.*(i2|intellection)\.com
      * ! ^(to|cc):.*(pdp-?8-lovers|procmail|sunshine|info-pdp11)
      junk.ube.mbox

[Gordon Matzigkeit gord@m-tech.ab.ca] I have just discovered an effective rule for separating SPAM from the rest of my e-mail. Just substitute your username for gord in the line below:

      # Anything which is not addressed to me is probably SPAM.
      :0:
      * !^TO().*\<gord\>
      junk.ube.mbox

This only works because I handle all mailing list addresses above that point in my .procmailrc (i.e. all traffic that arrives from mailing lists that I am subscribed to goes into other folders). Most SPAMmers seem to do it nowadays by sending mail via mailing lists, rather than creating huge To lists of users.

Many times sysadmins install a list of known addresses that send spam and then they check the incoming mail against the "black list". Keep in mind that some fgrep implementations have a problem with the -w word switch. Note that the recipe below scans the FULL HEADER, so use it with some caution, i.e., be careful what you add to your list of spam domains.

      # by [philip]; egrep would do here too, if it is posix
      # compliant, it may have -f switch that makes it behave
      # like fgrep.
      #
      # Note: option -F would make [ef]grep to search fixed string
      #       instead of regexps.
      #

      BLOCK_FILE  = $HOME/Mail/DeniedNames.lst
      UBE_MBOX    = $HOME/Mail/junk-ube.mbox

      # To filter out the Subject lines, so that mails sent
      # with the subject "Have you received a message from
      # blah-blah@spam" don't get filtered.
      # [era] suggested we use formail
      #
      # Edsel Adap edsel.adap@Canada.Sun.COM agrees there is a
      # likely bug in Solaris 2.5.1 "/usr/bin/fgrep -i" and
      # suggested the use of /usr/xpg4/bin/fgrep instead.
      #
      # edsel.adap@canada.sun.com Sun Microsystems Developer Support
      # Files in /usr/xpg4 are available via the SUNWxcu4 package,
      # which is part of the user, developer, all, or Xall Solaris
      # clusters.
      #
      # Solaris 2.4 doesn't have /usr/xpg4/bin/fgrep :-(, you
      # must use  `tr A-Z a-z' before piping the message to fgrep.

      :0 hw:
      *$ ? $FORMAIL -ISubject: |fgrep -i -f $BLOCK_FILE
      $UBE_MBOX

The file DeniedNames.lst is simply a list of addresses:

      82338201@compuserve.com
      Dwnliner@ix.netcom.com
      Emerald@earthstar.com
      FreeWay@dm1.com
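
The block-list check can be tested from the command line before it is trusted with real mail. The sketch below mimics the condition of the recipe, with the assumption that a plain grep -v stands in for formail -ISubject: (so continued Subject lines are not handled); the function name is_blocked is hypothetical.

```shell
# is_blocked BLOCK_FILE
# Reads a message header on stdin and prints "blocked" if any line other
# than Subject: contains one of the fixed strings listed in BLOCK_FILE
# (case-insensitive), otherwise prints "ok".
# Beware: an empty line in BLOCK_FILE matches every message.
is_blocked() {
    if grep -v -i '^Subject:' | grep -F -i -q -f "$1"; then
        echo blocked
    else
        echo ok
    fi
}
```

Running a few saved headers through it shows quickly whether an entry in the list is too broad.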

13.18 Kill: (un)subscribe messages

I'm getting tired of those pesky (un)subscribe messages that certain "other" mailing lists seem to pass through to the list at large instead of capturing them at the list server, like SmartList does.

[Adam Shostack adam@bwh.harvard.edu] The following do help, although they're often too broad. (I use a .safe rule to cover those cases.) The < 1000 is a useful heuristic. It's rare that unsubscribe messages are long.

      :0 :
      * (Delete|u*n*Sub(s| )*| add | leave | help )
      * < 1000
      junk.misc.mbox

[Rodger Anderson <rodger@hpbs2245.boi.hp.com>] I've been working on a recipe to filter out those pesky s*bscribe and uns*bscribe messages from mailing lists, and I'm posting what I have so far. As an aside, it also filters out very short messages, which I've found are usually some sort of message meant for the list owner/request address.

I give heavy weight to Subjects starting with (un)?s*bscribe, with also pretty heavy weight to Subjects containing either of those words. I then give heavy weight to the body of messages starting with those words, and a lighter weight to lines starting with them. Then multiple occurrences get some weight too, up to a point. Then I count the words in the message against all that.

      :0 B
      *  1^0
      *  30^0 H ?? ^Subject: +(un)?subscribe\>
      *  20^0 H ?? ^Subject:.*\<(un)?subscribe\>
      *$ 20^0   ^^$SPCNL*(un)?subscribe\>
      *$ 10^0    ^$SPC*(un)?subscribe\>
      *  8^.4   \\<(un)?subscribe\>
      * -.4^1  \\<$a+\>
      junk.misc.mbox

[Adam Shostack adam@bwh.harvard.edu] How about looking for sub & unsub, as well as a perennial misspelling 'unsuscribe me'? I also find filtering on add, leave and help to be useful. This may well be the only word on the line. I think it has to do with broken list management packages.

      | :0B
      | *  1^0
      | * 30^0 H ?? ^Subject: +(un)?subscribe\>

      * 20^0 H ?? ^Subject: +(un)?sub?(scribe)?\>

      (The B is often missing, as is the word fragment 'scribe')

      | * 20^0 H ?? ^Subject:.*\<(un)?subscribe\>

      * 20^0 H ?? ^Subject: +(add|leave|help)$

        # fewer points if more words

      * 15^0 H ?? ^Subject: +(add|leave|help)

[david 1998-10-20] You want to match on messages where the first non-blank thing in the body is "unsubscribe" at the end of a line, where there are five lines or fewer in the body?

      :0 B
      *$ ^^$SPCNL*unsubscribe$
      * 7^0
      * -1^1 ^.*$
      junk.misc.mbox

^.*$ always counts one line too many, so a five-line body will be counted as six; that's why we need a prejudice of 7. But if the first non-blank text in the body is "unsubscribe" alone on a line, is a line count really necessary? True posts that include the word will have it in the middle of a sentence, such as the preceding one. What you'll find by specifying a line limit is that unsubscribe requests with long signatures or attachments at the bottom of a previous message will get through.
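
The arithmetic behind the prejudice of 7 can be written out as a small calculation. It takes [david]'s statement that ^.*$ counts one line too many as given; the function name unsub_score is made up for this illustration.

```shell
# unsub_score BODY_LINES
# Score as described above: start at 7, subtract 1 for each line counted,
# where the count is assumed to be one more than the real number of lines.
unsub_score() {
    echo $(( 7 - ($1 + 1) ))
}

unsub_score 5    # 7 - 6 = 1, positive, so the recipe matches
unsub_score 6    # 7 - 7 = 0, not positive, so the recipe does not match
```

A scoring recipe succeeds only when the final score is positive, which is why a five-line body is the largest one that still matches.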

13.19 Time: Once a day cron-like job

[Bill Moseley moseley@netcom.com] If you want to do something only once a day, then you have to store the date somewhere and check against that stored date.
      YYMMDD_FILE = $HOME/.yymmdd
      YYMMDD      = $YY-$MM-$DD

      #   Contains single line of procmail code
      #   YYMMDD_PREV = ..

      INCLUDERC $YYMMDD_FILE

      #   If different date, then enter this block
      #   The echo updates stamp in file.

      :0
      *$ ! YYMMDD ?? ^^$YYMMDD_PREV^^
      *  ? echo "YYMMDD_PREV = $YYMMDD" > $YYMMDD_FILE
      {
          ...do the cron jobs..
      }
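
The same stamp-file idea, stripped of procmail syntax, looks like this in plain shell. It is a sketch only: the function name once_a_day and the stamp format (a bare date instead of a line of procmail code) are assumptions.

```shell
# once_a_day STAMP_FILE [TODAY]
# Prints "run" the first time it is called on a given date and "skip" on
# later calls the same day. TODAY defaults to the current date and exists
# as a parameter only to make the behaviour easy to try out.
once_a_day() {
    stamp=$1
    today=${2:-$(date +%Y-%m-%d)}
    if [ -f "$stamp" ]; then
        prev=$(cat "$stamp")
    else
        prev=
    fi
    if [ "$prev" = "$today" ]; then
        echo skip
    else
        printf '%s\n' "$today" > "$stamp"
        echo run
    fi
}
```

As in the recipe, the stamp is updated in the same step that decides to run, so a second message arriving the same day finds the new date already stored.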

13.20 Time: Running a recipe at a given time

If I put a program into my recipes, it will be executed every time a message arrives. That's a problem, and I'm not allowed to use cron in this account. I'm looking for some sort of condition to check the current time and, if it's outside of the hours 11pm and 7am, then execute the action.

[david] How do your From_ lines look? If they're the traditional kind that sendmail and smail add, they include the local time on your system at receipt. So include a check that the hour is between 07 and 22 inclusive, like this:

      :0 c
      *  ^From .*some-address.* (0[789]|1.|2[012]):[0-5][0-9]:
      |  command

I included the minutes and the colon that separates the minutes from the seconds so that the expression for testing the 07-22 range can match only on the hour.

13.21 Time: Triggering mail and using cron

[david] Put something like the following entries in your personal crontab for your userid (not knowing if your particular cron "cd's" to your home directory first):
      0 23 * * *  touch $HOME/.mail.relay.on
      0 7 * * *   rm -f $HOME/.mail.relay.on

And if your cron doesn't know the HOME variable (that'd be an exception):

      0 23 * * *  /bin/csh -c 'touch ~LOGNAME/.mail.relay.on'
      0 7 * * *   /bin/csh -c 'rm -f ~LOGNAME/.mail.relay.on'

Then, in your .procmailrc do:

      :0 c
      *  ^From.*some-address
      *$ ? $IS_FILE $HOME/.mail.relay.on
      | command

The recipe will run the command only if both the address matches and the file test succeeds. The file test will succeed only between 11pm and 7am.

In all honesty, if the system gives usable From_ lines, I like the following suggestion better. I use it all the time to turn blocks of procmail code on and off at given times or dates, and it works like a charm. It uses many fewer processes and is less likely to get the status wrong if for any reason one of the cron jobs fails to run or doesn't do its job.

This pages only in the daytime

      :0 c
      * ^From .*some-address.* (0[789]|1.|2[012]):[0-5][0-9]:
      | command

This pages at night

      :0 c
      * ^From .*some-address.* (0[0-6]|23):[0-5][0-9]:
      | command
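
The two hour patterns can be checked against sample From_ lines with grep before they go into a recipe. This is a sketch under the assumption that the From_ line carries a traditional "hh:mm:ss" stamp preceded by a space; the function name time_of_day is hypothetical.

```shell
# time_of_day
# Reads a From_ line on stdin and prints "day" for hours 07-22,
# "night" for hours 23-06, and "unknown" if no " hh:mm:" stamp is found.
# The regular expressions are the ones used in the recipes above.
time_of_day() {
    line=$(cat)
    if printf '%s\n' "$line" | grep -E -q ' (0[789]|1.|2[012]):[0-5][0-9]:'; then
        echo day
    elif printf '%s\n' "$line" | grep -E -q ' (0[0-6]|23):[0-5][0-9]:'; then
        echo night
    else
        echo unknown
    fi
}

# A 17:28 arrival is classified as "day".
echo 'From someone@example.com Thu Apr 18 17:28:32 2002' | time_of_day
```

Requiring the minutes and the second colon keeps the hour alternatives from matching the minutes or seconds field by accident.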

13.22 Decoding: Uudecode

[philip] Here is a piece of code that runs uudecode when certain conditions are matched. The magic string here is "begin ...file"; the body is then fed to my_uudecode_program, whatever it does to it.
      :0 b
      *      ^From:.*someone@somewhere\.com
      *      ^Subject: Subject
      * B ?? ^begin 644 file.tar.gz
      | my_uudecode_program

13.23 Decoding: MIME

      #   by Peter Galbraith galbraith@mixing.qc.dfo.ca
      #   MIME filtering of accented characters and split lines.
      #
      :0
      * ^Content-Type: *text/plain
      {
        :0 fbw
        * ^Content-Transfer-Encoding: *quoted-printable
        | mimencode -u -q

          :0 A fhw
          | $FORMAIL -I "Content-Transfer-Encoding: 8bit"

        :0 fbw
        * ^Content-Transfer-Encoding: *base64
        | mimencode -u -b

          :0 A fhw
          | $FORMAIL -I "Content-Transfer-Encoding: 8bit"
      }


      #   1995-10-18 Tim Pickett tbp@cs.monash.edu.au
      #
      #       Decode MIME quoted-printable Content-Transfer-Encoding
      #
      #   Conditions
      #
      #       Mail has a MIME-Version header with a number in it.
      #       Header saying "Content-Transfer-Encoding: quoted-printable"
      #       exists

      :0
      *$ ^MIME-Version:$s*$d*(\.$d*)
      *$ ^Content-Transfer-Encoding:$s*quoted-printable
      {
        :0 fhw     # Remove header
        | $FORMAIL -I"Content-Transfer-Encoding:"

        :0 fbw             # Decode the body.
        | mmencode -u -q
      }

13.24 How to send commands in the message's body

      :0 b
      * ^Subject: ARCHIVE
      | sed -e '/$s*[^a-zA-Z]/,$ d' | sh

13.25 Matching two words on a line, but not one

How does one write a recipe that will do this: put mail in a mailbox if it has a line with two strings (one and two) like:
          one     two

but save mail in an error folder if the line has only the first string, like: one (string two is missing)

[philip] I presume these lines would be located in the body of the message, and that by "space between one and two" you mean "whitespace between one and two". If those assumptions are wrong then you'll need to tweak the following recipes:

      # The 'B' tells procmail to look in the body instead of the header.
      # The second colon tells procmail to lock the mailbox with a
      # local lock file -- if mailbox is a directory then you don't need
      # it. The brackets in the condition contain a space and a tab.

      :0 B:
      *$ one$s*two
      default.mbox

      :0 B:
      * one
      error.mbox

Now, the above will match even if "one" or "two" is part of another word (at the end in the case of "one" and at the beginning in the case of "two"). If you don't want that then you'll need to change the recipes to read:

      :0 B:
      *$ ()\<one$s*two\>
      default.mbox

      :0 B:
      * ()\<one\>
      error.mbox

13.26 How to define personal XX macros?

By macro, I'm referring to procmail's FROM_DAEMON, TO and TO_ that you can use in matches. Here is one way to make one's own macro.

[alan] Define HEADERS to include those headers you care about. Pick one of the definitions below (and remove or comment out the others). Here are three ways to define a user to_ macro:

  1. use only To:
  2. use either To: or Cc:
  3. To:, Cc:, or Apparently-To:
      to_ = '^To:(.*\<)?'
      to_ = '^(To|Cc):(.*\<)?'
      to_ = '^((Apparently-)?To|Cc):(.*\<)?'

And you use it like this

      :0 :
      *$ $to_()foo@bar.com
      address-matched.mbx

[jari] and here are some more examples

      cc_      = "(^((Original-)?(Resent-)?(Cc|Bcc)):(.*[^a-zA-Z])?)"
      from_    = "(^(Apparently-|Resent-)*\
      (From|Reply-To|Sender):(.*\<)?|\
      ^From $NSPC+)"

13.27 How to change subject by body match

Suppose you want to change the mail's subject when there is a match in the body. The desired outcome would be this:
      From: foo@this.is
      Subject: Fault: NNNN in program block YYY    << changed

      Fault: NNNN in program block YYY

Here is the answer

      :0 fhw
      *       ^Subject: NOK case report
      *$ B ?? ^$s*\/Fault: [0-9a-f]+ in program block.*
      | $FORMAIL -I "Subject: $MATCH"

13.28 How to change Subject according to some other header

Suppose you want to change the subject when mail comes to some particular address, or when some other header field matches. Here is one way to do it; we suppose that mail comes to various internal mail addresses. See the HEADERS macro in the previous section.
      # By [alan]
      # Examine headers, create a subject tag if we recognize a list

      TAG = ""

      :0
      * $ ${HEADERS}info@foo.com
      {
          TAG = "info"
      }

      :0E
      * $ ${HEADERS}check@foo.com
      {
          TAG = "check"
      }

      # ...and so on...
      # now, if TAG is set, insert it into the subject

      MATCH       # kill this

      :0 fhw
      * !  TAG ?? ^^^^
      *   ^Subject: *\/[^ ].*
      | $FORMAIL -I "Subject: $TAG - ${MATCH:-<no subject>}"

Or you could use the command line arguments: add the following line to your .forward (alias file syntax)

      foo: "|/usr/local/bin/procmail -m /usr/local/etc/pm-tagit.rc foo"

Then in tagit.rc you would instead say:

      ARG = $1

      :0
      * ARG ?? ^^foo^^
      {
          TAG = "foo@go"
      }

      :0
      * ARG ?? ^^somethingelse^^
      {
          TAG = "somethingelse@go"
      }

This method will work even if someone Bcc:s a message to foo@some.com.

13.29 How to call program with parameters

...now, suppose I want to call a program with the parameter $FOUND, and get the result back in RESULT, how do I do it?

The stdout of the program will be captured and stored in the variable RESULT. Also consider what should happen if there are spaces or tabs in the value of $FOUND. It is probably better off enclosed in quotes.

      #   Make sure FOUND is not empty before it is passed to the program

      :0
      * ! FOUND ?? ^^^^
      {
          RESULT = `program "$FOUND"`
      }
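
Why the quotes around $FOUND matter can be seen in a plain shell. This is only an illustration of word splitting; count_args is a made-up helper and the value of FOUND is an example.

```shell
# count_args ARG...
# Prints how many separate arguments it received.
count_args() {
    echo "$#"
}

FOUND="two words"
count_args $FOUND      # unquoted: the shell splits the value into 2 arguments
count_args "$FOUND"    # quoted: the value arrives as 1 argument
```

A program that expects exactly one parameter would see the second word as an extra argument in the unquoted case.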


14.0 Miscellaneous recipes

14.1 Matching valid Message-Id header

[philip] wrote a fully RFC-compliant matcher. Follow the link

<URL:http://www.xray.mpe.mpg.de/mailing-lists/procmail/1998-03/msg00375.html>

      dq = '"'                                # (literal) double-quote
      bw = "\\"                               # (literal) backwhack
      ws         = "[         ]*"                     # whitespace
      atom       = "[-!#-'*+/-9=?A-Z^-~]+"
      word       = "($atom|$dq([^$dq\]|$bw.)*$dq)"
      local_part = "$word($ws\.$ws$word)*"
      domain     = "(\[$ws([^][\]|$bw.)*$ws\]|$atom($ws\.$ws$atom)*)"

      :0
      * ! $ ^Message-Id:$ws$ws$local_part$ws@$ws$domain$ws
      thats-non-valid-message-id

14.2 Sending two files in a message

If you plan to send multiple files in a message, be sure that every file has an extra blank line at the end so that they can be concatenated with cat. Instead of doing
      (cat THIS; echo " "; cat THAT ) | $SENDMAIL

You do

      (cat THIS THAT ) | $SENDMAIL

But sometimes you don't have control over the files; then you can do this to make sure there is a blank line. Notice that only two processes are used, compared to the first choice.

      (echo '' | cat THIS - THAT ) | $SENDMAIL

[David] And a sed expert would do it this way

      (sed -e '$ !b' -e '/./G' -e "r THIS" THAT ) | $SENDMAIL

  • $: the last line
  • !: everywhere except the range (in this case, everywhere except the last line)
  • b: branch to a label. No label: branch to the end (and, since -n is not in effect, print the pattern space)

Now remember that everywhere except the last line, we've skipped ahead, so the rest of the code will be executed only for the last line of the input.

  • /./: on lines that contain a character (but we get here only for the last line, so on the last line if it contains a character)
  • G: append a newline and the contents of the hold space to the pattern space (the hold space is empty, so basically, if the last line was already empty, do nothing, but if the last line was not empty, append a newline and thus add a blank line after it).
  • r file: After finishing with this run through the sed instructions, read the named file and copy it to the output.

This side of sed comes out only after sed has had a few drinks...
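
Both ways of gluing two files together can be compared side by side. In this sketch the names join_cat, join_sed, FIRST and SECOND are placeholders; note that in the sed form the file given as input comes out first and the file named after r comes out last.

```shell
# join_cat FIRST SECOND
# Always inserts one empty line between the two files.
join_cat() {
    echo '' | cat "$1" - "$2"
}

# join_sed FIRST SECOND
# Adds the empty line only if the last line of FIRST is not already
# empty, then appends SECOND (queued by the r command).
join_sed() {
    sed -e '$!b' -e '/./G' -e "r $2" "$1"
}
```

When FIRST already ends with an empty line, join_cat produces two empty lines in a row while join_sed leaves just the one.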

14.3 Excessive quoting of message

[25 Nov 1997 buck@Compact.COM] I administer a LISTSERV mailing list and our host has asked us to reduce excess quoting of previously posted material. ...Subject: asking if this was excessive quoting. With the weights below, this extra copy will activate at 66% quoted lines of all body lines.

[era] I would definitely tolerate 75% quotes. And in the end, you will of course always have to face the kinds of people who would rather change their quoting style to evade such constraints than quote less. An idealized quote parser should perhaps realize that a non-blank prefix that recurs on a lot of lines is probably a customized quote string.

This will preserve the correspondent's original subject (with a Re: added if it didn't already have one) and thus the template text should indicate the nature of the problem.

I'm not sure what would be appropriate to generate behavior more like I suggest below, any takers? Perhaps no score at all for empty lines, neutralize .signatures (hope sender obeys "-- " convention) and add 10^0.5 for each quoted line and dish out -15^0.3 for non-quoted? (I haven't really explored this -- could be completely up the creek.) [Also, perhaps long runs of quoted material should be penalized harder than quoted snippet -- reply text -- quoted snippet -- reply text alternations?]

      COPY_ADDRESS = "listAdm@foo.com"

      :0
      * ^Sender: <mailing list tag>
      {
          # - quoted lines
          # - non-blank, non-quoted lines
          # - completely blank lines

          :0B
          *$  10^1 ^$s*>
          *$ -15^1 ^$s*[^>$WSPC]
          *$ -15^1 ^$s*$
          {
              # You don't need to repeat the original condition here
              # You also don't really need to extract SENDER
              # Generate a reply with appropriate headers and the
              # body quoted

              :0 fhw
              | $FORMAIL -rtk -A "Bcc: $COPY_ADDRESS"

              # Now "replace" the body with template text + body (In
              # other words, add the template before the quoted body)

              :0 fbw
              | cat $HOME/template.txt -

              # Now send it off to recipients mentioned in generated
              # header

              :0
              ! -t
          }

          # Wasn't excessively quoted; save it
          :0 :
          $SOME_MBOX
      }

14.4 Sending message to pager in chunks

I have a 200 character limit on my pager. But I have wordy contacts who go over that limit. What I would like to do is have a recipe split up messages addressed to my pager into 200 character (max) messages.

<URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/cgi-bin/w3glimpse2HTML/procmail/1997-12/msg00125.html?43#mfs>

[era] This stuff about forwarding to pagers is a recurring topic on this list. I've tried to find a good summary of all the issues but there always seems to be some tiny twist to what people would like to have implemented. As a general comment for future generations, the Procmail part is usually trivial and the problem reduces to writing a good program (shell script or otherwise) for formatting the text precisely the way you want it, and spitting it out in suitable chunks.

Here's something to split up the body of the message into smaller chunks and run a shell command on each chunk. The -s option to fold says to wrap lines only on whitespace if possible.

      #   Create a duplicate of the message to forward to the pager.
      #   This will be reformatted and have most headers stripped off.

      :0 c
      {
          # Construct header with only From: and Subject: retained

          HEADER = `$FORMAIL -XFrom: -XSubject:`

          #   Reformat body as 200-character lines and send each
          #   as a separate message with the preconstructed minimal
          #   header

          :0 bw
          |   tr '\012' ' ' | fold -s -w 200 | while read line; do
              echo -e "$HEADER\n\n$line" | \
              $SENDMAIL pageraddress@wherever.com ; done
      }

If your version of echo doesn't understand \n to mean newline (and/or the -e option to enable this escape processing), you need to tweak this. (You might need to anyway -- this is mostly untested. In my limited testing, I found the messages would arrive in more or less random order. Inserting pauses in the script should help to some extent, but could lead to other problems and is not an ideal solution anyhow.)

I don't know if the header trimming is required; = some pager=20 gateways appear to count the headers as part of the message, while = others=20 don't. Again, for future generations, details like this are relevant = to=20 include when you ask about how to do this.
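The chunking step of the recipe above can be tried on its own, outside procmail. What follows is only a sketch, assuming a POSIX shell with tr(1) and fold(1); the function name split_for_pager and the width argument are made up for this illustration, and nothing is mailed. Note one detail: after tr has turned every newline into a space, the text no longer ends in a newline, and a `while read` loop would then skip the final chunk. The sketch appends a newline to avoid that.

```shell
#!/bin/sh
# split_for_pager: hypothetical helper, not part of procmail.
# Reads text on stdin, joins all lines into one long line, then wraps
# it into chunks of at most $1 characters (default 200), breaking on
# whitespace where possible. One chunk is printed per output line.
split_for_pager() {
    width=${1:-200}
    # The extra echo restores the final newline that tr removed, so
    # that a later "while read" loop also sees the last chunk.
    { tr '\012' ' '; echo; } | fold -s -w "$width"
}

# Small demonstration with a 10 character limit.
printf 'one two three four five six\n' | split_for_pager 10 |
while read -r chunk; do
    echo "chunk: $chunk"
done
```

In a real recipe the body of the loop would be the `echo ... | $SENDMAIL ...` pipeline shown earlier.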

14.5 Playing particular sound when message arrives

[Peter S Galbraith galbraith@mixing.qc.dfo.ca] Here is the command in shell to produce the sound:

      % cat anyfile | /usr/X11R6/bin/auplay /usr/lib/exmh/drip.au

However, it won't work directly in the recipe

      procmail: Executing "/usr/X11R6/bin/auplay /usr/lib/exmh/drip.au"
      Can't connect to audio server

Strange. The command works from the shell if I su to user mail. Anyway, I got it to work by fully specifying the audio server (which is my workstation, where I receive mail):

      AU      = /usr/X11R6/bin/auplay
      TUNE    = /usr/lib/exmh/drip.au

      :0 hwic
      * ^From:.*foo@bar.com
      | cat > /dev/null; $AU -audio tcp/mixing:8000 $TUNE

14.6 Combining multiple Original-Cc and Original-To headers

How can I use procmail/formail to combine the information in these headers into their CORRESPONDING header MINUS the Original-* prefix? Note that I can have multiple Original-Cc: headers and I want all the recipients combined into one Cc: header.

      #   1998-01 by [david]
      #   initialize as unset

      ORIG_TO ORIG_CC

      #   The -c option to formail takes care of headers continued onto
      #   indented lines; the pipe to tr takes care of multiple
      #   Original-To: headers by linking their contents with commas.
      :0
      * ^Original-To:.*[^   ]
      {
          ORIG_TO = `$FORMAIL -zcxOriginal-To: | tr \\12 ,`
      }

      #   Drop trailing comma from tr:
      :0 A
      * ORIG_TO ?? ,^^
      * ORIG_TO ?? ^^\/.*[^,]
      {
          ORIG_TO = $MATCH
      }

      #   Likewise for Original-Cc: lines:

      :0
      * ^Original-Cc:.*[^   ]
      {
          ORIG_CC = `$FORMAIL -zcxOriginal-Cc: | tr \\12 ,`
      }

      :0 A
      * ORIG_CC ?? ,^^
      * ORIG_CC ?? ^^\/.*[^,]
      {
          ORIG_CC = $MATCH
      }
      }

      #   Now, let's install the changes if needed:
      #   with -A instead of -I or -i it should
      #   not clobber existing To: or Cc: information.
      #   -A : Append a custom header field onto the header in any case.

      :0
      * ORIG_TO ?? ^^^^
      * ORIG_CC ?? ^^^^
      { }
      :0 E fhw
      | $FORMAIL                                                      \
        ${ORIG_TO:+-A "To: $ORIG_TO"}                                 \
        ${ORIG_CC:+-A "Cc: $ORIG_CC"}
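The "join with tr, then drop the trailing comma" step used above can be checked in isolation. This is a minimal sketch in plain shell, assuming one address per input line; the function name join_with_commas is invented for the example, and it strips the comma with a shell parameter expansion rather than with the procmail MATCH recipe shown above.

```shell
#!/bin/sh
# join_with_commas: hypothetical helper for illustration only.
# Reads one address per line on stdin and prints them on a single
# line separated by commas.
join_with_commas() {
    joined=$(tr '\012' ',')         # "a\nb\n" becomes "a,b,"
    printf '%s\n' "${joined%,}"     # drop the single trailing comma
}

# Demonstration with two made-up addresses.
printf 'first@example.com\nsecond@example.com\n' | join_with_commas
```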

14.7 Forwarding sensitive messages in encrypted format

[Valdis.Kletnieks@vt.edu] Please note that the standard Unix crypt(1) command is not secure, as it uses a modification of the Enigma engine, which was broken by the Bletchley Park guys (Turing and the rest) back during WWII, using a mechanical relay based computer. As such, it is trivially easy to break using any computer more recent than a Radio Shack TRS-80. Poke around in any of the comp.sources.unix archives, they had a "Crypt Breaker's Workbench" posted well over a decade ago. For similar reasons, I can't recommend single-pass 56-bit DES anymore either. Triple-DES (with an effective 112-bit key) looks safe, as do any of the encryptions provided with PGP.

      #   by [alan]
      #   See if addressed *directly* to me, and ..
      #   ..has not already been forwarded

      KEY             = "TheMagic"
      FORWARD_EMAIL   = "foo@bar.com"

      :0
      * $   ^To:.*$LOGNAME(@|[^0-9a-z]|$)
      * $ ! ^$MY_XLOOP
      {
          # now let's encrypt the body using mimencode

          :0 fbw
          |   echo "MIME-Version: 1.0" ;                              \
              echo "Content-Type: application/crypt" ;                \
              echo "Content-transfer-encoding: base64" ;              \
              echo "" ;                                               \
              crypt $KEY | mimencode -b

          #   Now let's prepare the headers for forwarding the mail,
          #   and mark it so we don't loop

          :0 fhw
          | $FORMAIL   -I"Resent-To: $FORWARD_EMAIL" -I"$MY_XLOOP"

          :0
          ! $FORWARD_EMAIL

      }


15.0 Procmail and PGP

15.1 Decrypt pgp messages automatically

Warning: if you use remailers or anonymous services, you must use different passwords and different user id's to decrypt incoming messages. If you just receive messages encrypted with one key, then this may be useful to you. However, it is generally considered a huge security risk to keep your password carved into your .procmailrc.

      :0 fbw
      * B ?? PGP ENCRYPTED MESSAGE
      | pgp -z "your pass phrase" -f +batch 2>&1

15.2 Getkeys from key server

      # by Adam Shostack adam@bwh.harvard.edu 1996-02
      #
      # This first ruleset protects me from mailbombs from an automated
      # service that I often send incorrect commands to, generating 5mb
      # of reply. It also sorts based on success of the command.
      #
      # swissnet.ai.mit.edu is a fast key server

      :0
      * From bal@swissnet.ai.mit.edu
      {
         :0 h
          * >10000
          /dev/null

          :0 h
          *^Subject:.*no keys match
          /dev/null

         :0 E
         | pgp +batchmode -fka
      }

15.3 Auto grab incoming pgp keys

      #  [Opher Kahn kahn@dg-rtp.dg.com] This first ruleset protects
      #  me from mailbombs from an automated service that I often send
      #  incorrect commands to, generating 5mb of reply. It also sorts
      #  based on success of the command.
      #
      #  swissnet.ai.mit.edu is PGP key server

      :0
      * From bal@swissnet.ai.mit.edu
      {
         :0 h
         * >10000
         /dev/null

         :0 h
         *^Subject:.*no keys match
         /dev/null

         :0 E
         | pgp +batchmode -fka
      }

      #  auto key retrieval
      #
      #  I have an elm alias, pgp, that points to a key server. The log
      #  file gets unset briefly to keep the elm lines out of my log file.

      :0 W
      * B   ?? -----BEGIN PGP
      * H ! ?? ^FROM_DAEMON
      {
          KEYID = `/usr3/adam/bin/sender_unknown`
      }

      LOGFILE=

      #   #todo: We should get rid of the 'elm' dependency here.
      #   #todo: correct this sometime... [jari]
      #

      :0 ahc
      * ! ^X-Loop: Adams autokey retrieval
      | $FORMAIL -a"X-Loop: Adams autokey retrieval" | elm -s"mget $KEYID" pgp


      #!/bin/sh
      #
      #   Script: sender_unknown
      #
      #   unknown returns a keyid, exits 1 if the key is known. $output
      #  is to get the exit status. Otherwise, this would be a one
      #  liner.

      OUTPUT=`pgp -f +VERBOSE=0 +batchmode  -o /dev/null`
      echo $OUTPUT | egrep -s 'not found in file'
      EV=$?
      if [ $EV -eq 0 ]; then
              echo $OUTPUT | awk '{print $6}'
      fi
      exit $EV

      # end of sender_unknown


16.0 Includerc usage

16.1 Using: multiple rc files

...Do INCLUDERC statements function as a kind of "call" which returns control to the "original" rc file if processing falls off the end of the included rc file? Or if processing falls off the end, does mail then get delivered to $DEFAULT and processing stop? Suppose I have these commands

      INCLUDERC = $PMSRC/pm-a.rc
      INCLUDERC = $PMSRC/pm-b.rc
      INCLUDERC = $PMSRC/pm-c.rc

Yes, control is returned to the original file from which the includerc was called. And no, mail does not get delivered to $DEFAULT just because the includerc ends: processing continues until there are no more statements in the top level rc file.

An includerc is nothing more than a slice of the top level rc file.

16.2 Using: You can call rc file conditionally

One interesting way to prevent false hits when filtering UBE is to first check whether the message comes from some valid source. If it does, then it shouldn't be run through the UBE filter, because the filter may filter valid messages out. No UBE filter is completely bullet proof.

Here is an example where the UBE detection is put into use only when the message comes from somewhere that I don't know beforehand (or I have just forgotten to tweak my .procmailrc)

      ME      = "(me@here.is)"
      LISTS   = "(procmail|list-a|list-b)"

      :0                      # Idea by Bill Moseley
      *$ ! ^TO_()$ME
      *$ ! $LISTS
      {
          # Could be UBE or I might be on a unknown distribution list.
          INCLUDERC = $PMSRC/pm-ubecheck.rc
      }

[dan] That would work; common practice, however, is to put recipes for filing mail from lists (and, per Bill's preferences, anything mentioning procmail in the head gets treated the same as mail from this list) first; then the only remaining condition to consider there would be unexpected blind carbons: * ! ^TO_moseley. This method is good if you get much more spam than legitimate mail (including mail from list subscriptions as legitimate) and you want procmail to deal with spam right away. I belong to several very active mailing lists, so I actually receive more pieces of legitimate mail than pieces of spam.

One way to get the best of both worlds is this:

      * $ ! ()\/(^TO_$LOGNAME|procmail|list-(ABC|123|XYZ))

because then, if the regexp matches (and thus the negated condition fails and you don't detour into $PMSRC/pm-ubecheck.rc), MATCH is already set to the name of the mailing list, and you can do further tests by just examining MATCH (or a variable you copy it into) instead of repeating a complete head search. [I prefer to use the variable $LOGNAME rather than hard-coding my name because then others can use the code, and I can use it unchanged on sites where my logname is different, and if my logname is changed my procmailrc will keep up with it.] For example (I've separated the conditions into two lines so that, per Bill's preferences, a mention of procmail in the head will get the message into the Procmail List folder, even if a match to ^TO_$LOGNAME is also present and appears sooner):

      :0
      * ! ()\/(procmail|list-(ABC|123|XYZ))
      * $ ! ^TO_$LOGNAME
      {
          INCLUDERC=$PMSRC/pm-ubecheck.rc
      }

      #   The next recipe has an `E' flag, so it will be examined
      #   only if the preceding one didn't match; thus if $MATCH was
      #   set inside pm-ubecheck.rc, it won't hurt anything here, and a
      #   value for $MATCH set in pm-ubecheck.rc
      #   won't be mistaken for a list name:

      :0 E: # MATCH is non-null only if it matched a list name
      * MATCH ?? (.)
      $MATCH

      #   Remaining recipes will be read only for two types of mail:
      #   those that met $^TO_$LOGNAME but not any expected list
      #   name, and those that went through pm-ubecheck.rc but came out
      #   undelivered.

16.3 Autoloading an rc file

Now that you know that an includerc can be called conditionally, let's discuss "autoloading" of modules. For example, I use the following statement in nearly all my modules to import predefined variables

      :0
      * !  WSPC ?? ( )
      {
          INCLUDERC = $PMSRC/pm-javar.rc
      }

It says: "If variable WSPC does not contain a space, then load the module". If the module has already been loaded by some other rc file, WSPC would exist. If it does not exist yet, then we load the module. This is a classical example of conditionally loading functions or variables into the current module:

      Check if feature is present. No? Then load module.

Justin Lloyd jlloyd@harris.com suggests a general way of caching the included rc files. We use a top-level script that records every module that was included. A module is loaded only if it is not yet included:

      #   pm-xximport.rc

      :0
      * $ ! INCLUDE_CACHE ?? ()\<$RC\>
      {
         #    Module was not there yet, add it to the list
         INCLUDE_CACHE  = "$INCLUDE_CACHE$RC$NL"
         INCLUDERC      = $RC
      }

This is a different approach than the previous one. Instead of checking features, the presence of the module is checked. These are two sides of the same coin and can be used for the same thing. You can pick either solution, but in my opinion:

  • Adding an extra top level INCLUDE_CACHE is extra work. Procmail must open a separate top-level rc file every time with the call
      RC="pm-xxscript.rc"   INCLUDERC=pm-xximport.rc

  • If the feature already existed, you would still have to open the pm-xximport.rc file for every call to find that out. E.g. here pm-xximport.rc is called 3 times no matter whether 1, 2, 3 were already present
      RC="pm-xxscript1.rc"   INCLUDERC=pm-xximport.rc
      RC="pm-xxscript2.rc"   INCLUDERC=pm-xximport.rc
      RC="pm-xxscript3.rc"   INCLUDERC=pm-xximport.rc

  • With a simple feature test, procmail can evaluate the condition in place without the need to open a separate file.
      if no feature present..
          then load

      if no feature present..
          then load

Note however, that both suggestions accomplish the same thing; only the implementation is different. If the typical count of included RC files per module were big enough, I'd use Justin's way. Currently the typical count is only 1-2 (VAR,DATE).

16.4 Making: naming of the rc file

When you write an rc file, think whether or not it could be generalized so that others could use it. I have adopted a style where all procmail files start with the prefix pm, so that I can stack other files in the same directory as well. If I simply named them rc.*, look what happens:

      % ls rc*        # fine, print rc files

but I would like to print all procmail related files and back them up with one command, so this prints all procmail related files:

      % ls pm-*

      --> pm-mytest.rc
          pm-jaube.rc
          pm-tips.txt
          pm-art.txt
          pm-incoming.log
          pm-list.mbox        # the mailing list

I usually use a name like pm-xxSCRIPT-NAME.rc for an rc file, where xx are my initials from first name and surname, like (J)ari (A)alto. These scripts are product versions that can be distributed. I also have private scripts that handle my mailing lists, work messages and so on, and they have the prefix my.

      pm-jascript.rc
      pm-myscript.rc      << private version

and when I download someone else's script I would like to see it named so that it's unique to the person who wrote it.

      pm-jdscript.rc      # John Doe's script.

16.5 Making: Using name space when saving procmail variables

If you're going to write an rc file that works like a subroutine in any other programming language, you must separate it from the world and make it well behaved. A subroutine is traditionally a black box: you call it with arguments and it responds with returned values. You don't need to know what happens in there. And you expect that the subroutine hasn't changed the existing environment, like the procmail variables DEFAULT, LOGFILE etc., when it ends.

So the process diagram of a good RC subroutine is:

                          pm-xxscript1.rc
      call            --> +------------+
      arguments           | black      | --> it may call
                          | box        |     other subroutines
                          |            | <-- pm-xxscript2.rc
      output values   <-- +------------+

Procmail does not have local variables, so you must put the variables in the global name space. Let's see an example where a subroutine uses MAILDIR for chdir purposes.

      MAILDIR_xxscript1   = $MAILDIR              # save
      ...
      MAILDIR             = new location
      ...
      ...at the end of subroutine
      MAILDIR             = $MAILDIR_xxscript1    # restore

Here the original value is saved when the subroutine starts and restored when the subroutine exits. The global name (with the xxscript1 suffix) is unique and should not clash with anyone else's. If pm-xxscript2.rc had also used MAILDIR, the saved value would have been in

      MAILDIR_xxscript2

and the two wouldn't mix up each other's MAILDIR. The general name for a saved variable is therefore:

      PROCMAILVAR_scriptname

This follows the simple "onion" or "stack" model, where a variable's value is saved before changing it and restored at the exit point.

      save-x-1
      set--x-1

          save-x-2
          set--x-2
          ..
          restore-x-2

      restore-x-1
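The same save and restore discipline can be tried out with ordinary shell variables, which are global in the same way. This is only an analogy written in shell, not procmail syntax; the function xxscript1 and the directory /tmp/xxscript1-work are made up for the illustration.

```shell
#!/bin/sh
# Save/restore ("onion") model shown with plain shell variables.
# xxscript1 stands in for an included rc file; all names here are
# hypothetical.
MAILDIR=${MAILDIR:-$HOME/Mail}

xxscript1() {
    MAILDIR_xxscript1=$MAILDIR          # save under a unique name
    MAILDIR=/tmp/xxscript1-work         # change it for our own use
    echo "inside:  MAILDIR=$MAILDIR"
    MAILDIR=$MAILDIR_xxscript1          # restore at the exit point
}

echo "before:  MAILDIR=$MAILDIR"
xxscript1
echo "after:   MAILDIR=$MAILDIR"
```

The caller sees the same MAILDIR before and after the call, which is the whole point of the black box.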

16.6 Making: Public and private variables in rc file

As you learned above, the variables should be put in the RC file's name space. The user interface variables (public) should be all caps and private variables should start with a lowercase letter. Whether you use "theVarStyle" or "the_var_style" is up to you.

      [script pm-xxscript.rc]

      # ........................... public

      XX_SCRIPT_FLAG = ${XX_SCRIPT_FLAG:-"default"}
      XX_SCRIPT_VAR  = ${XX_SCRIPT_VAR:-"default"}

      # ........................... private

      charset = "a-z1-2"
      regexp  = "something-that-matches"

Whether you need to stick the prefix xx_script on the private variables depends on whether you call another includerc which may happen to use the same names as you:

      [pm-xxscript.rc]
      charset = ...           # watch this
      ...
      INCLUDERC = ..          # call another subroutine

          charset = ..        # holy cow, it used same variable

      ..back in the pm-xxscript.rc

      :0
      * $charset              # BOOM, not what you think.

In this case it would be wise a) not to define charset at the top of the file but to move the definition to just before the recipe where it is used or b) make the name unique, e.g. xxScriptCharset.
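The `${VAR:-"default"}` form used for the public variables in this section behaves like the Bourne shell expansion of the same name, so it is easy to experiment with in a shell before putting it into an rc file. The sketch below also shows the related `:+` form. The function name show_expansions and the sample values are invented for the demonstration.

```shell
#!/bin/sh
# show_expansions: hypothetical demo function.
# $1 is the value to test; it may be empty.
show_expansions() {
    value=$1
    # ${value:-fallback} gives "fallback" when value is unset or empty
    echo "default=[${value:-fallback}]"
    # ${value:+text} gives the text only when value is non-empty
    echo "option=[${value:+-A To: $value}]"
}

show_expansions ""
show_expansions "joe"
```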

16.7 The rules of thumb for constructing general purpose rc file

  • Write good documentation at the beginning of the file: how to set up the includerc and what it does. If you don't include docs, people may skip your extraordinarily useful script. Also, remember that the script lives on the Net and passes through many hands long after you have been disconnected.
  • Keep the layout like this: the user interface variables must all be in capital letters. Familiarize yourself with what(1) tags too. Notice the first and last lines: if you keep the format like this, then any universal tool can rip your code from any file (or mail), because it's delimited by "pm-xxScript.rc -- " and "end of pm-xxScript.rc". See Unix what(1) for the first line's syntax.
          # @(#) pm-xxScript.rc -- procmail script for ...
          # DOCS

          USER VARIABLES

          private variables

          CODE

          # end of pm-xxScript.rc

  • Always include a version number or last modification date somewhere. Prefer some version control tool, like RCS, VCS, ClearCase, whatever you have at hand.
  • Use a variable name like dummy in appropriate places to tell what's happening in the code. Remember that the VERBOSE setting isn't much help if you can't tell by looking at the LOG where on earth the code is executing.
          dummy = "start of pm-xxScript.rc"
          ...
          dummy = "Now testing if we have control message XXX"
          :0
          * condition
          {
              dummy = "Now testing if the command is YYY"
              :0
              * condition
              ...
          }
          ...
          dummy = "end of pm-xxScript.rc"

  • If you need the value of some common headers, don't just call formail like this, because the value may already be available before your includerc is called. For example, the user may already have needed the Subject value and stored it in a variable

          [in pm-xxScript.rc]

          XX_SCRIPT_SUBJECT = `$FORMAIL -xSubject:`

          [User may have already read the content to SUBJECT]

          SUBJECT   = `$FORMAIL -xSubject:`
          INCLUDERC = $PMSRC/pm-xxScript.rc

      Your pm-xxScript.rc launches an unnecessary formail call. Instead,
      use the existing SUBJECT.

          [user]
          :0
          * ^Subject:\/.*
          {
              SUBJECT = $MATCH
          }

          ...

          XX_SCRIPT_SUBJECT   = $SUBJECT            # Note this!
          INCLUDERC           = $PMSRC/pm-xxScript.rc

          [ in the pm-xxScript.rc variable definitions  ]

          #   User should initialize the variable
          #   XX_SCRIPT_SUBJECT if he already has read the
          #   subject.

          :0
          * XX_SCRIPT_SUBJECT ?? ^^^^
          * ^subject:\/.*
          {
             XX_SCRIPT_SUBJECT = $MATCH
          }
          ...the rest of the code

  • Add an X-Loop header and test against it if you are sending an automated reply. The X-Loop header prevents responding to a message that has already been responded to.
          :0
          *    condition
          *  ! ^FROM_DAEMON
          *$ ! ^$MY_XLOOP
          {
              # Ok, now we're clear to send an automated reply
          }

16.8 An includerc skeleton

Here is my includerc file skeleton that I use in all my modules. The funny looking ".$" are for the text2HTML Perl filter. The documentation section can be ripped and turned into HTML very easily if you just keep the standard 4 tab column positions and start the description with "File id" and end it with "Change Log". The command to make the HTML is:

      % ripdoc.pl pm-xxscript.rc | t2HTML.pl > pm-xxscript.html

These two perl files are available from my ftp directory.

      # @(#) pm-xxscript.rc -- one line description string here
      # @(#) $Id: pm-tips.txt,v 2.5 2002/02/03 22:22:42 jaalto Exp $
      #
      #   File id
      #
      #       .Copyright (C)  1997-98 Foo Bar
      #       .$Contactid:    foo@bar.com $
      #       .$Created:      YYYY-MM $
      #       .$keywords:     procmail [subroutine|recipe] whatItDoes $
      #
      #       This code is free software in terms of GNU Gen. pub. Lic. v2 or later
      #       You can get newest version by sending mail to maintainer with
      #       subject "send <FILENAME>"
      #
      #   Description
      #
      #       This subroutine Parses <what> from variable INPUT
      #
      #   Required settings
      #
      #       PMSRC must point to source directory of procmail code.
      #       This subroutine will include
      #
      #       o   pm-xxScriptA.rc
      #       o   pm-xxScriptB.rc
      #
      #   Call arguments (variables to set before calling)
      #
      #       o   INPUT, the string from where to parse...
      #       o   VAR1, description, default is ...
      #       o   VAR2, description, default is ...
      #
      #   Returned values
      #
      #       ERROR will have value "yes" if couldn't parse INPUT
      #       OUTPUT will have result after successful parse
      #
      #   Example usage
      #
      #           :0
      #           * condition\/.*
      #           {
      #               INPUT     = $MATCH
      #               INCLUDERC = $PMSRC/pm-xxscript.rc
      #               #  OUTPUT has the result
      #           }
      #
      #   Change Log: (none)

      # ..................................................... &init ...

      dummy       = "init: pm-xxscript.rc start"

      #  Read the standard variable definitions if they are not
      #  yet defined: that's "if WSPC variable does not contain space,
      #  as it should, then global variables haven't been read yet"

      :0
      * !  WSPC ?? ( )
      {
          INCLUDERC = $PMSRC/pm-javar.rc
      }

      # .................................................... &input ...
      # - User configurable variables with reasonable defaults
      # - But parameters like "INPUT" that must be set beforehand
      #   are not mentioned here.

      VAR1    = ${VAR1:-"default1"}
      VAR2    = ${VAR2:-"default2"}

      # .................................................... &do-it ...

      dummy       = "subroutine: pm-xxscript.rc parses now that and that"

      <the code>

      dummy       = "subroutine: pm-xxscript.rc end."

      # end of pm-xxscript.rc


17.0 Mailing list server

Simple Mailing list server

      # by Lars Hecking lhecking@nmrc.ucc.ie
      #

      MAJORDOM = "majordomo-(users|docs|workers)"

      :0 w
      * $ ^(Sender|To|Cc):.*\/$MAJORDOM
      * $  MAJORDOM ?? ()\/$\MATCH
      | $APPNMAIL $LISTS/$MATCH

Here is another, by Brock Rozen brozen@torah.org with ideas from [dan]

      # get the date in RFC822 format for insertion into some messages;
      # the "Resent-Date:" field is copied from the "Date:" field on
      # some systems. RFC1123 says "All mail software SHOULD use 4-digit
      # years in dates..."

      LIST_NAME = "myList"
      LIST_ADDR = "$LIST_NAME foo@bar.com"
      LIST_DATE = `date '+%a, %d %h %Y %H:%M:%S %Z'`
      LIST_ERR  = "$EMAIL"        # my admin address

      #   Sendmail ignores "To:" in the presence of "Resent-To:"
      #

      :0 fhw
      *$ !^X-List: $LIST_NAME
      *$ ^TO()$LIST_NAME
      |  $FORMAIL                                                     \
              -A "X-List: $LIST_NAME"                                 \
              -I "Resent-To: $LIST_ADDR "                             \
              -i "Resent-Date: $LIST_DATE"                            \
              -I "Errors-To: $LIST_ERR"                               \
              -A "Precedence: bulk"                                   \
              -A "X-Loop: $COMSAT"

      :0 a
      ! -oi `cat /var/tmp/src/power-users.list`
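The date string built for LIST_DATE above can be checked from the shell before it goes into the rc file. The variant below is a sketch and not part of the original recipe: it assumes that your date(1) supports %z (a numeric zone such as +0200, which is the usual form in mail headers) and it forces the C locale so that day and month names stay in English. The function name list_date is made up.

```shell
#!/bin/sh
# list_date: hypothetical helper that prints the current time in the
# RFC 822/1123 style used for Date: and Resent-Date: headers.
# Assumes the local date(1) understands %z (numeric time zone).
list_date() {
    LC_ALL=C date '+%a, %d %b %Y %H:%M:%S %z'
}

list_date
```

If your date(1) lacks %z, the original format string with %Z is the fallback.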


18.0 Common troubles

18.1 Procmail modes: normal, delivery, and mail filter.

... a) what recipes procmail goes through if there's no /etc/procmailrc on the system b) how it decides whether an address/local-part is valid or not c) how procmail selects the mailbox to drop the mail

[philip] Delivery mode is invoked using the -d flag. All arguments after the -d are user names. It is usually used by the MTA to deliver mail to users, and indeed, procmail will return failure if it is given an invalid user name. In delivery mode, procmail reads /etc/procmailrc before the user's .procmailrc.

Note: Procmail will work in delivery mode only if it is setuid root, if it is invoked with the ruid of the recipient named in -d, or, under certain OSes where the build routines have determined that it is safe, if the euid is that of the recipient and the egid is the recipient's login group.

Mailfilter mode is invoked using the -m flag. It accepts only one rcfile as an argument -- other arguments are either variable assignments or arguments that are made available to the rcfile itself as $1, $2, etc. If the specified rcfile is located under /etc/procmailrcs/ then procmail will take on the uid of the owner of that file. Otherwise, it will run as the user who invoked it. /etc/procmailrc, which procmail -d reads, is ignored. In mail filter mode, procmail unsets ORGMAIL and DEFAULT to suppress normal delivery -- reaching the end of the rcfile results in the mail bouncing. If the rcfile sets either of them then procmail will attempt delivery to that mailbox if it falls off the end of the rcfile; however, the mailbox will have to be writable by the uid/user that procmail is running as.

Note: Only one rcfile can be named on the command line, but names of other rcfiles can be passed in the positional parameters to be used later in INCLUDERC assignments.

Normal mode is invoked by not using the -m or -d flags. It accepts any number of rcfiles and variable assignments as arguments. Procmail runs as the invoking user in this mode. /etc/procmailrc is ignored.

So, to answer your questions: if procmail reaches the end of the specified rcfile, it bounces the mail (/etc/procmailrc is ignored). Everything is up to the rcfile -- how to determine whether the address is valid and where to put the message if it is.

18.2 Procmail as sendmail Mlocal mail filtering device

...I'm a new sys admin at my company, and I've been trying to set up Procmail as the mail filtering device (still using mail as the Mlocal). I've tried setting up the sendmail.cf to use Procmail as a filter (we want to use the current mailer as the local mailer) with one local procmail rc file. Procmail seems to work just fine if set up as the local mailer, but I'm still having problems setting it as the filter.

[John M Vinopal banshee@abattoir.com answers sendmail.cf]

      R$+ < @ $=a . > $*
          $#procmail $@ /etc/mail/procmailrc $: $1 < @ procmail > $3
      R$+ <@ procmail > $*                      $1 < @ resort.com .> $2

so this sends anything of the form foo@resort.com through procmail and rewrites it as foo@procmail. The procmail script reinjects it, it bypasses the call to procmail, and it is then rewritten back to foo@resort.com.

      /etc/mail/procmailrc:
      :0
      ! -oi -f "$@"

18.3 Procmail doesn't pass 8bit characters

You're mistaken. Procmail does not do that to your mail. Frank Gadegast phade@powerweb.de tells you:

  • procmail wasn't the problem, it was sendmail
  • I commented out this line in sendmail.cf and now I get all the nice German Umlauts.
          # strip message body to 7 bits on input?
          # O SevenBitInput

The problem was that some mails ran through the local mailer procmail and arrived all right (local mail), while all mail from external sources (that dropped into my most used mailbox, where I use a procmail filter) did not arrive all right. This made me think it was procmail, but these mails came from outside and it was sendmail that was to blame.

18.4 My ISP isn't very interested in installing procmail

...I recently requested my ISP to install procmail, and they responded by saying no. Their main reason was they did not wish to incur the traffic from any/all of their subscribers setting up mailing lists.

[Jon Lewis <jlewis@inorganic5.chem.ufl.edu>] Wouldn't you need write access to either /etc/aliases or /etc/procmailrc to set up mailing lists? Tell the ISP that procmail will greatly improve mail delivery and enable all users to filter out junkmail without ever seeing it. If they still refuse, find a better ISP.

18.5 My ISP has systemwide procmailrc; is this a good idea?

[eli] I, for one, do not like my ISPs to put stuff in /etc/procmailrc. There is precious little I will gain from that and plenty of opportunity for them to make mistakes I would not have made. At one ISP I know, people got upset at some sendmail level filtering of mail. One of those upset is a habitual complain-to-spammer-ISP person. He did not want problems seeming to go away if they were really there. Another guy just didn't trust the filtering.

Writing a shell script that will give the user a .procmailrc which includercs a system wide shared procmailrc is the best way to do it. This forces the filtering to be "opt-in".

18.6 Procmail changes mailbox and directory permissions

By Ed McGuire emcguire@i2.com. Before procmail was used:

      > -rw-rw----   1 foo      mail  1127 Sep 11 07:33 foo

After:

      > -rw-------   1 foo      mail  1517 Sep 11 07:34 foo

When the UMASK environment variable is more restrictive than the mode of the mailbox, procmail changes the mode of the mailbox. The default value of UMASK is 077. If you want to preserve the group access to your mailbox, I think you can set UMASK to 007 in the rcfile:

      UMASK = 007

Further note: the above UMASK suggestion in .procmailrc does not work. See the comment by Gjermund Sørseth gjermund@nextel.no:

However, the permissions on DEFAULT are handled before procmail even opens the .procmailrc, so changing the umask there will have no effect on the mailspool.

[Scott J. Kramer sjk@lux.com] It's documented in the MISCELLANEOUS section of the procmail(1) man page:

If /var/mail/$LOGNAME already is a valid mailbox, but has got too loose permissions on it, procmail will correct this. To prevent procmail from doing this make sure the u+x bit is set.

Otherwise, you might notice a syslog message like:

procmail: Enforcing stricter permissions on "/var/mail/sjk"

when it chmod's the file to 600. As you've discovered, this is inconsistent with the SYSV (Solaris 2 anyway) default mailbox protection of 660, gid=6 (mail). I think that's an OS-dependent bug, with the `chmod u+x ...' as the workaround.
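The workaround can be sketched as a one-line helper. The function name is made up for illustration, and the mailbox path in the usage line is only the conventional one quoted from the man page; check where your system keeps its spool.

```shell
#!/bin/sh
# Sketch: set the u+x bit on a mailbox so that, per the man page text
# quoted above, procmail leaves its (looser) permissions alone.
keep_mailbox_mode() {
    # $1 = path to the mailbox file
    chmod u+x "$1"
}

# Usage: keep_mailbox_mode "/var/mail/$LOGNAME"
```

A mailbox that was 660 before ends up as 760, which keeps the group read/write access that /bin/mail expects.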

18.7 Changing mbox permission during compilation to 660

...it appears that mail that procmail delivers back into the spool it is writing out with owner.group user.mail and rights 600. To me this is reasonable. Mail delivered to the spool by /bin/mail is written out owner user, group mail 660.

When procmail delivers mail 600, later attempts at delivery with procmail removed from the .forward file fail: /bin/mail doesn't have permissions (or refuses to use its permissions).

Since we have fickle and unruly users who will be moving their forwards in and out of place this is a problem.

Is the correct solution to force procmail to write 660? If so, how is this done? I assume in the section of config.h just below the warning about only messing with a section if you think you know what you are doing. I don't feel like I know well enough what I'm doing to walk into that territory without some guidance.

[alan] I used to be the manager of the system support in the College of Engineering, at the University of California, Santa Barbara.

We supported about 1500 users from two HP 9000 G30's, using one of them as the centralized mailer. Mail was available via NFS exported /usr/spool/mail to over 200 workstations, of all kinds: SGI, HP, Sun, etc.

We replaced /bin/mail with procmail as the local mailer (Mlocal) because procmail correctly avoided NFS-locking problems, and it supported user-configurable mail filtering, without compromising system security.

In over two years subsequent to the change, we had no loss of mail due to procmail being used as the local mailer. If you wish further comment from the current system managers, send mail to "postmaster@eci.ucsb.edu".

To answer your specific questions:

* you can configure the permissions directly, by changing one of the following defines in config.h:

      /* bit set on mailboxes when mail arrived */
      #define UPDATE_MASK     S_IXOTH
      /* if found set, the permissions on the mailbox will be left untouched */
      #define OVERRIDE_MASK   (S_IXUSR|S_ISUID|S_ISGID|S_ISVTX)
      #define INIT_UMASK      (S_IRWXG|S_IRWXO)       /* == 077 */
      #define GROUPW_UMASK    (INIT_UMASK&~S_IRWXG)   /* == 007 */

We did not find it necessary, however:

  • We did disable all locking except dot-locking, since the kernel locks were the source of the NFS-locking problems. There have continued to be occasional locking problems, but these are "victim"-induced problems caused by using non-supported and discouraged mailers, such as "mailtool" from older Suns. These locking problems have nothing to do with mail delivery; they come from the mail client using kernel advisory locks and then orphaning them or leaving them locked all day long.
  • An alternative to having users use .forward files is to create a file of users who would like to use procmail as their local delivery agent, and use this file to initialize a class variable.

Write a special rule in sendmail.cf which delivers mail using Mprocmail instead of Mlocal when the destination user is in the special procmail user class.

This allows procmail-direct delivery for the users who want it, in spite of management's worries.

I set this up to test procmail delivery on our system before changing Mlocal to use procmail. We placed some "volunteer" users in the procmail class file, and they never had any problems (I was one of them).

18.8 The .forward file must be real file

http://www.math.fu-berlin.de/~guckes/mail/forwarding.html

...I tried to make a softlink to ~/.forward, but then my procmail wouldn't run. When I made a real ~/.forward file, then it worked again. My question is -- why would procmail treat a link to a file any differently than the actual file itself?

      ln -s ~/.procmail/forward ~/.forward

[Werner Reisberger wr@tribe.ping.de] That's not a problem with procmail, this is an MTA issue. For security reasons sendmail will not deliver mail to files which are symlinks.

[david] procmail has restrictions on what permissions it will tolerate on an rcfile. For example (I'm just guessing here) it can tell whether it can read the target file but it cannot tell who might be able to write to it. This prevents a major security hole.

You can make a hard link to the file, since a hard link is completely indistinguishable from the original file. But note: a file hard-linked to two or more names is very distinguishable from a file with only one (hard) link, and procmail, for example, will not deliver to a plain folder that has two or more hard links.

You can also put the real file at ~/.forward and let ~/.procmail/forward be a symlink to it.

[<mikk0022@maroon.tc.umn.edu>] I suppose the reasoning behind procmail's folder policy is that procmail locks the file by name, not inode. Hence it cannot guarantee mutual exclusion for access to a file which has multiple names.

My understanding of the .forward policy is that a symlink need not share the permissions of its target. Therefore somebody's .forward symlink could have proper permissions, while its target could be writable by others. This would allow anybody with the write permissions to execute any program (potentially) from the user's forward file.

Two hard links share the same permissions, so this argument doesn't hold for them.
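To see which case applies to an existing setup, a small check along these lines may help. It is only a sketch: the function name and the three labels it prints are invented for illustration, and whether a given sendmail actually refuses a symlinked .forward depends on its configuration.

```shell
#!/bin/sh
# Sketch: classify a .forward file as symlink, regular file, or missing.
check_forward() {
    # $1 = path to the .forward file
    f="$1"
    if [ -L "$f" ]; then
        echo "symlink"        # the case reported above as not working
    elif [ -f "$f" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}

# Usage: check_forward "$HOME/.forward"
```

The symlink test comes first on purpose: a plain `-f` test follows symlinks and would report a symlink to a regular file as a regular file.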

18.9 Using .forward if procmail already is LDA

[Elie Rosenblum fnord@jurai.net] If you have a .forward, it is used by sendmail to replace a call to the LDA for the user in question. So if you have a .forward that doesn't call procmail, procmail is ...

[david] Elie sent the answer to me with a carbon to the list, but since reading my personal copy my inbox got trashed. As of this writing the list copy hasn't reached me, but the rest of that sentence (as I recall from reading it before it got hosed) was to the effect that procmail is then never invoked at all on your incoming mail; a .forward takes precedence over the LDA. That scenario never occurred to me. Thank you for explaining.

[Philip] Scratch the bit about /etc/procmailrcs/$LOGNAME. You're mixing up procmail -d with procmail -m.

Ah, got it ... after rereading the man page. The part about /etc/procmailrcs really can apply only when procmail is setuid root, so again it's something I've no experience with and never quite followed or retained. So no file in /etc/procmailrcs is ever used implicitly, but /etc/procmailrc can be.

[Philip] $HOME/.forward is handled by sendmail. If you have a forward, then sendmail rewrites attempts to deliver to you into attempts to deliver to the addresses listed in the .forward file.

Or in other words, the .forward takes precedence over the LDA. Thank you both.

18.10 Mail should be put in the mailqueue if write fails

...We want to deliver directly to a user's home directory. But this can of course be temporarily full. Then the mail should not bounce, but instead be put back in the mailqueue and tried again until either it succeeds or sendmail bounces it after 5 days (as usual). The README file says this is my choice (to bounce or not), but I cannot find any place where I can set this. What is the correct place to set this behavior?

[1998-06-24 PM-L phil] The -t flag causes procmail to return EX_TEMPFAIL where it normally would have returned EX_CANTCREAT. If you've made procmail the local delivery agent then you should add -t to the A= define, before the -d flag.

18.11 Qmail: how to make it work with procmail

[1998-11-10 PM-L John Conover conover@inow.com] All you do is install fastforward and dot-forward (they are optional, and are not required). Then cp /var/qmail/boot/proc or /var/qmail/boot/proc+df to /var/qmail/rc.

[1998-11-10 PM-L Greg Boes gboes@ashfordtech.com] From the qmail FAQ (4.4 How do I use procmail with qmail?) Put

      | preline procmail

into ~/.qmail. You'll have to use a full path for procmail unless procmail is in the system's startup PATH. Note that procmail will try to deliver to /var/spool/mail/$USER by default; to change this, see INSTALL.mbox.

18.12 Qmail: Procmail looks file from /var/spool/mail only

...Procmail seems to want to do something in /var/spool/mail. But since I use qmail, I don't have a /var/spool/mail. Is there a way to have procmail not to create temp stuff there?

[philip] Get procmail 3.11pre7 and uncomment and correct for your local setup the MAILSPOOLHOME="/.mail" define in src/authenticate.c. Compile and install. It's relative to the user's home directory. Thus the name MAILSPOOLHOME.

[Ekkehard Knopp <knopp@rz-online.de>] At the qmail home page you can find a patch for procmail-3.11.pre7 called procmail-maildir-patch. If you can't find it, I can send it to you by mail. I have no problems with procmail and qmail. Works well.

18.13 Qmail: patch to procmail 3.11pre7 to work with Maildirs

[Jaye Mathisen mrcpu@cdsnet.net] On the www.qmail.org page is a patch that lets procmail 3.11pre7 work with Maildirs (qmail's NFS-safe delivery format), and not just Mailboxes.

Very useful. Really slows down delivery though. On my test box, just adding procmail to the delivery where all it did was deliver to the default mailbox, and no other rules, whacked my speed test from something like 600,000 messages/day to about 180,000.

Killer. I suspect Procmail's locking of the Maildir 8 ways from Sunday is probably partially to blame.

18.14 AFS: How to use Procmail when HOME is in AFS cell

...I've viewed some of the archived posts concerning AFS and procmail, but each seems to have a different perspective on the subject. Besides the fact that AFS isn't the greatest product in the world, does everyone agree that it is not possible to use procmail when your $HOME lies in an AFS cell? Mail sent locally seems to work with procmail, but mail from users w/o a token or AFS id just gets delivered to /var/spool/mail/someone.

[Christopher Lindsey lindsey@ncsa.uiuc.edu 1998-03-09 PM-L] AFS is awesome! You just have to treat it nicely. :) The only viable solution that we've been able to come up with involves patching the procmail-3.11pre7 sources to "fake" user home directories out of another directory.

For example, my home directory in AFS is

      /afs/ncsa.uiuc.edu/.u1/lindsey/

It is kept as such on the mail server in /etc/passwd as well. However, we have some space set up via NFS in /var/forward with space for each individual user (so /var/forward/lindsey in my case).

The procmail patch intercepts requests for the user's home directory and replaces it with the "fake" directory (the /var/forward one). So for all practical purposes, procmail thinks that my home directory is /var/forward/lindsey, and everything works fine.

18.15 Help, some idiot sent my address to 30 mailing lists

You can make a procmail recipe to junk incoming mail from the lists until you get the unsubscribe messages delivered to cancel your participation. You should complain to the list's maintainer that such a thing was even possible: the mailing list should have sent you a confirmation message with a unique "participate ID number" that you need to send back in order for the subscription to take effect.

      KILL_FILE = $PMSRC/.kill-immediately

      :0
      *$ ? $IS_READABLE $KILL_FILE
      {
          KILL = `cat $KILL_FILE`
      }

      # 1) Make sure KILL has value
      # 2) if match is found from header.
      # 3) /dev/null does not need lockfile

      :0
      *  KILL ?? [a-z]
      *$ $KILL
      /dev/null
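The recipe above expects the kill file to hold one regular expression. One way to produce it, sketched here under the assumption that you keep the unwanted list addresses one per line in a plain text file (the file name in the usage line is hypothetical, as is the function name):

```shell
#!/bin/sh
# Sketch: turn a list of addresses (one per line on stdin) into a single
# regular expression suitable for the kill file read by the recipe above.
make_kill_regex() {
    # Escape dots so they match literally, join the lines with "|",
    # then wrap the alternatives in parentheses.
    sed 's/\./\\./g' | paste -s -d '|' - | sed 's/.*/(&)/'
}

# Usage:
#   make_kill_regex < "$HOME/.unwanted-lists" > "$PMSRC/.kill-immediately"
```

Since the recipe matches the expression against the whole header, any header line mentioning one of the addresses triggers it; tighten the expression if that is too broad for your mail.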

[sean] ...In the long haul, your best bet with dealing with this problem is to stamp out the offender - bring this harassment to the attention of their ISP and get their account closed. Repeat as necessary. Most of the mailing lists should have some record of the submission request. Even if forged, the abuser probably has their IP address in the headers somewhere (and if the person is actively subscribing your friend to so many lists and actually WORKING at covering their tracks, apparently you've REALLY crossed them). Most people who stoop to these immature harassment tactics aren't bright enough to fully cover their tracks.

Another alternative to having to manually deal with unsubs on certain lists is, once you've identified filterable characteristics of the lists, to BOUNCE them. Most semi-intelligent listserv implementations will unsub you if they get repeated bounces. Yes, not nice to the listserv maintainer - but then, if perhaps they'd implement a subscription verification system, it wouldn't have been a problem to begin with.

      :0
      * condition
      {
          # may expose your .forward - but if you're bouncing lists,
          # it probably doesn't matter much.
          EXITCODE = 67

          # save header for examination.
          :0 h:
          bounce.log
      }

You've got a sticky situation. You can't simply ditch all unrecognized mail - you need to be able to review potential refuse first, and take action on anything which doesn't belong (because you certainly don't want to continue getting the unwanted lists till the end of eternity - you should want to unsubscribe from them to simplify your mail).

18.16 Help, Procmail beeps and prints to my console

...when messages get filtered through procmail I get a beep and then the first 10 lines or so are also sent to the console. I get a lot of messages so the beeps, and stuff on my screen, are getting very annoying.

[sean] One or the other should do the trick (or both even): Go to your login file (what it is named depends on the shell you're using), and add:

      biff -n

Or/also, in your .procmailrc add:

      COMSAT = "no"

[manual] has information on the COMSAT variable. It also states (contrary to the reasoning I gave above) that COMSAT defaults to 'no' if you specify an rcfile on the command line (otherwise, it is on by default).

Doing this latter one should keep procmail from generating COMSAT/BIFF notifications, but would still leave your shell capable of receiving them, say, if you only processed certain mail in procmail manually or some such. Personally, I turn biff off AND set COMSAT off. I read my mail when I read my mail, and I check it often enough (with a POP client at that).

18.17 Help, procmail dumps mail to console

...I have installed sendmail and procmail on my linux machine (latest version of slackware). It works ok, but procmail, if run with -d $u, dumps all mail immediately after receiving it on the console with ---- more ----. I don't like it; a beep is ok, but I do not want all the garbage on my screen. Is there a way to tell procmail that I just want the mail in my mailbox (/var/spool/mail/$u)? Thanks for the help!

[Xavier Beaudouin kiwi@oav.net] Check your /etc/inetd.conf for an in.comsat entry, add a '#' at the beginning of the line, save the file and killall -HUP inetd. This should stop this ;-)

18.18 Help, corrupted From_ line in mailbox

[Jeffrey S. Gilton jeffg@castlec.com 1998-02-11 in procmail mailing list "Solved the FFrom problem"]

Thanks to everyone who responded to my questions about a problem where the From line was getting corrupted. Here I tell what the real problem was.

To recap, when our Caldera OpenLinux 1.1 system received multiple mail messages very quickly, some messages would get multiple F's on the From line and then subsequent messages would be missing the F's.

Most responses said that it sounded like a file locking problem. Suggested solutions were to get the latest version of procmail or recompile our version so that it would look at the file locking mechanisms.

The funny thing was that three systems with new installs didn't exhibit the problem.

The file locking recommendation eventually led to the real problem. On a good system I would run our spam script (we spammed ourselves to trigger the problem) and everything would work. Using top I would see multiple instances of procmail running. Looking at the directory where the spool files were, I would see a spool_file.lock file get created and then go away.

Finally, I did the exact same thing on the system that wouldn't work. There I would see the multiple instances of procmail running but no lock file being created. I said to myself "Now that I know what is happening, the question is why."

It turned out to be a permission problem on the spool directory. On the system that worked, the permissions were rwxrwxr-x with the owner being root and the group being mail. On the system that didn't work, the permissions were rwxr-xr-x with the owner and group being root. This meant that procmail, which is run as mail, couldn't write to the directory (and so couldn't create its lock file). We changed the broken system to rwxrwxr-x with owner root and group mail. The problem disappeared.
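A quick way to spot the same condition is to look at the group write bit of the spool directory. The helper below is a sketch with an invented name; it only inspects the mode string and does not verify that the group is actually mail, so check the owner and group shown by ls as well.

```shell
#!/bin/sh
# Sketch: succeed when a directory is group-writable, which a procmail
# running with group "mail" needs in order to create .lock files in it.
spool_group_writable() {
    # $1 = spool directory. In the 10-character mode string printed by
    # ls, the 6th character is the group write bit ("w" when set).
    mode=$(ls -ld "$1" | cut -c1-10)
    case "$mode" in
        d????w????) return 0 ;;
        *)          return 1 ;;
    esac
}

# Usage: spool_group_writable /var/spool/mail || echo "no group write bit"
```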

As I said, the suggestions about lock files were key. They guided our investigation until we found the real problem. I thank everyone who responded.

I've seen other postings about corruption of the From line. Perhaps you have the same problem.

[Christopher B. Smith cbsmith@envise.com] I had the exact same problem with my upgraded OpenLinux system. For the record, if you are running the imapd that comes with it, you should really set the permissions for the directory as follows:

      chmod 1777 /var/mail/spool

I got that feedback from the guy who wrote imapd, and it works very well.

18.19 Directing user's mail to HOME instead of /var/spool/

...I have a need to direct all of a single user's mail to a mailbox in his home directory, to $HOME/mailbox.

      # One possible solution, not perfect

      UHOME       = /tmp_mnt/users
      UHOME_LIST  = "(login1|login2|login3)"

      # Recipe start line with a local lockfile, since this delivers to a file
      :0:
      *$ ^TO\/$UHOME_LIST@
      *   MATCH ?? ()\/[^@]+
      $UHOME/$MATCH

[era] Perhaps preferably use ^TO_ if you have Procmail 3.11pre7 or newer. This is the classical case of using Procmail where you really need the envelope recipient information. The headers are not enough to determine who a message is for. If Procmail is your MDA, you can have this, but I'd still think something involving Sendmail would be more appropriate. For one thing, what if this user would suddenly really want to use Procmail? You can set DEFAULT and ORGMAIL for this one user in /etc/procmailrc to get around that, but the bottom line, as so many times before, is that Procmail might not be the right tool for this.

18.20 NFS mounting /var/mail is a good way to get bad performance

<URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/1998-06/msg00199.html>

      > /var/mail stays at a Solaris 2.5 machine. Cucipop is working
      > at the same machine. It's fine there. But, I want to have
      > more than one machine with cucipop and when I put cucipop at
      > another machines, NFS clients, it is delaying more 30 or 40
      > seconds to close the session.

[1998-06-23 PM-L Brad Knowles brad@colltech.com] NFS mounting /var/mail is a good way to get bad performance, especially when you're doing any NFS writes. Even if you're not doing any NFS writes, just having to deal with local file locking and trying to translate that into NFS file locking is a nightmare (in general, file locking is one of the single biggest problems left with NFS).

      > Procmail is working good on NFS, it finishes quickly. But when
      > cucipop is put on a NFS client, procmails starts to delay too.

Procmail probably isn't writing to NFS, or if it is, it's probably not using the same locking mechanism as cucipop. Unfortunately, each vendor and each program have their own ideas on how to best do that.

[philip] cucipop was written by the author of procmail. Ideally, when you compile cucipop you edit its config.h to use the locking techniques that procmail's autoconf process determined for your system(s). However, even if you didn't do that, cucipop uses the same dotlocking algorithm as procmail.

Also, keep in mind that any POP3 server will have to copy the mailbox in order to work on it, and many of them copy the mailbox to /var/mail/.username (you got it -- creating lots of NFS writes). When they're done, they copy the mailbox back to /var/mail/username (after they copy any new mail messages that have come in to the end of /var/mail/.username and locked then truncated the original /var/mail/username file).

[philip] cucipop doesn't use a temporary file: it keeps it all in memory. On deletes it updates the mailspool in place, which should never lose data, though if the server crashes in the middle of this you can end up with one or more bogus messages.

This is a real nightmare when you start talking about users who select "Leave mail on server" and have multi-megabyte mailboxes.

[philip] Assuming you have enough memory, cucipop should be pretty fast.

I think maybe now you're starting to understand why POP3 really doesn't scale well at all in multi-machine environments (unless you've cooked up a custom mail store that uses a real database back-end, like Oracle Parallel Server, with /bin/mail (or procmail) as a writable interface to this message store and POP3 and/or IMAP as a readable (and writable) interface to this same message store). Then you can let the database vendors deal with the hard data replication and distribution problems.

Otherwise, it's a pain-in-the-ass.

      > Is there another good pop server?

Have you tried QPopper from Qualcomm? It's the single best POP3 server I've ever run across, although I wouldn't put even it in an NFS write environment.

BTW, I used to be the Mail Systems Administrator for GNN (Global Network Navigator), the web site/National ISP co-operative between O'Reilly & Assoc. and AOL. At our peak, we had hundreds of thousands of registered users, of which up to five to six thousand were logged in at any one time, with their MUA set to check their mail every minute.

We had a single primary Mail/POP3 server machine (Dec Alpha 2100 w/ four 250MHz processors, 4GB RAM, 28GB hardware mirrored/striped mail spool), and one warm spare (same CPU/RAM configuration, physically hooked up to the same disks, but through DECsafe ASE not mounting them unless the primary died).

18.21 I can't see the sendmail's response in LOGFILE

...As the man page says, this should've written to my LOGFILE. It didn't. But it DID activate the pipe in the recipe. So what's up here?

      :0 hc
      *$ ? $IS_EXIST $HOME/.vacation
      | LOG=| ($FORMAIL -r; echo $IM_NOT_HERE) | $SENDMAIL -t

[david] The man page says that a variable capture recipe assigns the standard output of the command to the variable. Since you are repiping the output of formail and echo to sendmail, sendmail sucks up the standard output of formail and echo. Sendmail itself does not write to standard output, so the stdout of ( $FORMAIL -r ; echo $IM_NOT_HERE ) | $SENDMAIL -t is nothing.

Thus you're assigning a null string to $LOG, and when procmail writes $LOG to the logfile you can't see a difference.
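The same effect can be reproduced in plain shell, without procmail or sendmail. In this sketch `cat >/dev/null` is only a stand-in for sendmail (an assumption for the demo): like sendmail, it consumes its input and prints nothing.

```shell
#!/bin/sh
# Sketch: a capture sees only what the LAST command of a pipeline writes
# to stdout. Everything echoed on the left is swallowed by the right side.
captured=$( (echo "reply header"; echo "I am not here") | cat >/dev/null )
echo "captured: [$captured]"
```

The brackets in the output come out empty, which is exactly what ends up in $LOG in the recipe above.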

18.22 Compiling procmail and choosing locking scheme

General advice: Everything except dot locking is usually broken.

[stephen, <199607292139.XAA12433@hera.cuci.nl>]. Remove fcntl() and lockf(), only allow flock() (or omit it completely). Kernel locks don't work. But that's all some programs use. Across a networked filesystem, lockf() doesn't work; fcntl() and flock() should, but they don't either because the lockd is buggy. Mailtool uses fcntl() but does it wrong, so that's another problem. The only thing that works on all platforms, all networks, all the time are .lock files.

Makefile refers to:

      #LOCKINGTEST=100 # Uncomment (and change) if you think you know
      #        it better than the autoconf lockingtests.
      #        This will cause the lockingtests to be hotwired.
      #        100     to enable fcntl()
      #        010     to enable lockf()
      #        001     to enable flock()
      #        Or them together to get the desired combination.

config.h refers to:

      /*#define NO_fcntl_LOCK uncomment any of these three if you */
      /*#define NO_lockf_LOCK definitely do not want procmail to make */
      /*#define NO_flock_LOCK use of those kernel-locking methods */
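For readers who have not met dot locking before, here is a rough shell sketch of the idea: the lock is simply a file named after the mailbox plus ".lock", created atomically by hard-linking a unique temporary file to that name. The function names are invented, and this is not procmail's own code; for real use, the lockfile(1) program shipped with procmail does the same job with retries and timeouts.

```shell
#!/bin/sh
# Sketch of dot-locking. ln fails when the target name already exists,
# so only one process at a time can obtain "<mailbox>.lock".
dotlock() {
    # $1 = mailbox path; returns 0 when the lock was obtained
    tmp="$1.$$.tmp"
    : > "$tmp" || return 1
    if ln "$tmp" "$1.lock" 2>/dev/null; then
        rm -f "$tmp"
        return 0
    fi
    rm -f "$tmp"
    return 1
}

dotunlock() {
    # $1 = mailbox path
    rm -f "$1.lock"
}

# Usage:
#   dotlock "$HOME/Mail/inbox" && { work_on_mailbox; dotunlock "$HOME/Mail/inbox"; }
```

Because the lock is an ordinary file, it is visible to every program and every NFS client that agrees on the naming convention, which is why it keeps working where kernel locks do not.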

18.23 Forwarding lot of mail causes heavy load

...There are several forward (e.g. ! walter@localhost) recipes. For every forwarded mail, a distinct sendmail process is created. This leads to a heavy (IMHO unbearable) system load. How can I stop procmail from running a sendmail process for every mail forwarded?

SUMMARY: Look at qmail, it's better than sendmail.

[era 1998-08-15 PM-L] (Blows dust off old underutilized Bat Book/ORA sendmail book) Yeah, setting QueueFactor (q) and QueueLA (x) to suitable values should do what you want. You need to have load-balancing support compiled in, though; according to the Bat Book, sendmail -d3.1 tells whether you have it or not. (Mine just says getla:0 which I would imagine means I have the support but the load average was below the cutoff level.)

AFAIK using load averaging would have the first messages delivered and the rest queued. However, also not being a sendmail guru, I do not know how to empty a sendmail queue for incoming mail only. Moreover, even if I knew how to do this, it would have to be done after procmail finishes.

[Liviu Daia daia@stoilow.imar.ro] Instruct sendmail to queue messages when called from procmail:

      SENDMAILFLAGS="-oi -od d"

then disable the normal sendmail daemon from your = system init=20 scripts, and run it in flush queue mode only, that is, replace

      /usr/sbin/sendmail -bd -q 15m

in your init scripts with

      /usr/sbin/sendmail -q 15m

("15m" is how often the queue will be run (15 minutes). Change it to whatever is appropriate for your purposes). Also make sure to disable forking in your sendmail.cf.

The downside of this approach is that it will also delay the delivery of local messages. Different approach: pipe messages to sendmail instead of using '!' and use the wait flag. Something along the lines of:

      :0 w
      * conditions
      | $SENDMAIL $SENDMAILFLAGS <recipients>

Sendmail will rewrite the From_ header (which you can probably safely ignore), and it will (optionally) add a From: if one doesn't exist, but it won't touch an existing From:. Well, actually it will encode or decode any 8-bit characters in the From: according to the options in sendmail.cf, but it won't change the meaning of the "From:". In fact, that's exactly what procmail does too in the '!' recipes.

Well, I'm actually not sure you can use the 'w' flag without 'f' (the manual doesn't say it, and I'm not too familiar with procmail internals), so if that doesn't work you might also try:

      :0 fw
      * conditions
      | $SENDMAIL $SENDMAILFLAGS <recipients>

      # dummy recipe to stop procmail from delivering an empty message
      :0 a
      /dev/null

18.24 What happens to mail if MDA Procmail fails

...When procmail is the local mailing agent distributing e-mail to a user's $HOME and the target machine is 'down', where does the e-mail go? I was given the impression that the mail would be collected on the 'mailhub' in /usr/mail/BOGUS.xxx (Solaris system). It is not happening and we have the potential of losing mail.

[philip] I assume that by "target machine" you mean the NFS server for the given user's account. Procmail's attempt to read ~/.procmailrc will time out, then when it tries to write to $DEFAULT (which you say is in their home directory) it'll time out (again) and return EX_CANTCREAT to sendmail. Sendmail will then presumably bounce the message.

Now, if sendmail is looking for .forward files in user home directories, then procmail will never be called, as sendmail will try to open the .forward file and consider it a transient error when it times out, causing the message to be queued for a later delivery attempt.

(Note: invoking procmail with the -t flag causes it to return EX_TEMPFAIL instead of EX_CANTCREAT. This would cause the message to be requeued. However, this is not generally recommended.)

18.25 Procmail reads entire 90Mb message into memory

...last week my workstation ground to a halt when procmail received a 90Mb Email message (ran out of memory). The point is, such message sizes are fine by me, as long as the system can handle it. Is there any way I could make procmail only read the headers of that message before scanning /etc/procmailrc or ~/.procmailrc and acting on it? That way it wouldn't need to read the entire message into memory.

...Recently, I modified the sendmail.cf file to pipe messages through procmail before sending them to deliver, so that I can use system-wide procmail recipes for spam filtering. However, yesterday we had a client send a 22 megabyte e-mail message (on purpose, no less) and the system just came to its knees trying to deliver it to the user's mailbox.

[philip] Btw, all the versions of /bin/mail (or mail.local) that I've seen the source for either read the entire message into memory first or use a temp file. Depending on where temp files are located, a 90MB temp file may be just as bad as holding it in memory.

And, no, there isn't. Hacking it in would be non-trivial, mainly because the current code runs with the assumption that the entire message is there, and determining when it actually needs to see the entire body (to do demand loading) would not be easy. Remember that a condition on the size of the message, a la

      :0
      * > 10000000
      /dev/null

would require the body to be read... It really is just better to simply have sendmail enforce the limit. You should be doing it there anyway to cut down on the totally trivial denial-of-service attacks and because it's more efficient.

...I am running procmail ver 3.11pre7 and I keep getting "out of memory as i tried to allocate 8xxxxxx bytes.". I have over 100 meg available swap space so i have a difficult time understanding this. Is this a known error?

Procmail's memory allocation technique appears to be non-optimal for some OS/libc combos, namely those with a poor implementation of the libc function realloc() (FreeBSD has been reported). It's conceivable that the configuration process could be enhanced to detect this system limitation and use a strategy that is more efficient on them. Don't hold your breath.

[ed] There is a patch available that should fix the problem for you. See the messages at <URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/cgi-bin/w3glimpse/procmail?query=Albsmeier&errors=0&case=on&maxfiles=100&maxlines=30>.

18.26 Help, procmail occasionally uses a huge chunk of memory

...we've noticed that occasionally, procmail uses a huge chunk of memory. It's always the same 17MB as reported by the top command. Can anyone enlighten me as to why sometimes procmail creates such a huge footprint and other times doesn't, for the same user with an unchanged .procmailrc file?

[ed] Is your operating system a BSD variant such as FreeBSD or OpenBSD? If so, then the problem is due to a poor implementation of the Standard C Library function realloc() on those platforms. A patch that works around this is available. See the messages at

<URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/cgi-bin/w3glimpse/procmail?query=Albsmeier&errors=0&case=on&maxfiles=100&maxlines=30>

Specifically, the patch is located at

<URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/cgi-bin/w3glimpse2HTML/procmail/1997-10/msg00330.html?63#mfs>.

It's an artifact of procmail's memory management. It reads an entire message into memory before working on it. Fear the system with procmail as the local delivery agent, where people are slagging 100M CAD files around. :-)

18.27 Procmail signaled out of memory in my verbose log

...I notice in my procmail verbose log the following 'transaction':

      procmail: [10239] Sat Jan  9 08:49:02 1999
      procmail: Out of memory
      buffer 0: "formail"
      buffer 1: " formail -A "X-Check: List""
        Folder: **Bounced** 5744
      procmail: Notified comsat: "bhoule@:**Bounced**"

If I act quickly enough when this happens, I can look in spool/mqueue and find a message with a gazillion addresses in the To: line. So it seems that formail is having trouble adding my X-Check header to an already large set of headers.

[philip] No, it's procmail that's unable to allocate enough memory. The buffer dumps indicate that procmail was unable to get enough memory somewhere between parsing the action line and reaching the next recipe -- buffer 0 would not contain the string "formail" if procmail had gotten to another recipe or variable assignment. What's weird is that the message is so small (only 5744 bytes according to procmail). Do you only see this error on this recipe, or at random places in your .procmailrc? If the latter, then I would guess that your mailserver is running out of memory for some other reason and that procmail happens to be an innocent bystander. If the former, then, well, I'm not sure.

The message is never delivered to me. Is there anything I can do so that procmail/formail will act as if it was never there so the incoming message dumps into my inbox rather than returning an error to the mailer? This "*Bounced*" business is not a very helpful action.

Giving procmail the -t flag will cause fatal internal errors that are normally returned as permanent errors to be returned as temporary failures instead. Otherwise there's no way to control that. (Setting EXITCODE won't work because procmail needs to malloc memory to handle TRAP and EXITCODE, and it'll refuse to try that when it was malloc that caused the exit.)

18.28 Variables DEFAULT and ORGMAIL

...According to the man pages, DEFAULT is defined as ORGMAIL ...so if I redefine ORGMAIL, then DEFAULT should change as well, which doesn't help me. Any help on this would be appreciated.

[david] DEFAULT is initially defined as equal to ORGMAIL. Once procmail has started reading /etc/procmailrc (if it is the MDA) or your .procmailrc, you can change the value of either without affecting the other.

In fact, you can even set DEFAULT on the command line when you invoke procmail (I'm not sure about doing that with ORGMAIL, though), and that value will override its normal initial value equal to ORGMAIL.

What if it is possible that dropping to DEFAULT fails due to a full disk? Then you had better have another drop place on another file system. Peek at bdf(1) or df(1) to find out the different mounted file systems.

      # Place this at the end of your .procmailrc and define
      # DEFAULT_SECONDARY

      :0 :
      $DEFAULT

      :0 e:
      $DEFAULT_SECONDARY

If you deliver explicitly to $DEFAULT, procmail treats it like any other save-to-folder recipe, and if the write fails, it continues reading recipes.

...If I had set the "deliver" destination as ORGMAIL rather than DEFAULT, would it have made any difference?

Nope. If you write a recipe for it, procmail just expands the variable and doesn't give a heck if it happens to be the same destination as DEFAULT or ORGMAIL. DEFAULT is special to procmail only when it uses it on its own after falling off the end of the rcfile; ORGMAIL is special only at startup (without -m) and when procmail falls off the end of the rcfile and finds that it cannot save the message to DEFAULT.

In general, if procmail falls off the end of the rcfile, fails to save to DEFAULT, and then fails to save to ORGMAIL, does it revert to the compiled-in value of ORGMAIL?

[philip] Procmail has no fallback beyond the current value of ORGMAIL. If delivery to both DEFAULT and ORGMAIL fails, then procmail gives up and exits with error code 73 (EX_CANTCREAT) or 75 (EX_TEMPFAIL), depending on whether the -t flag was given. Setting EXITCODE would probably override those. The message is logged as "Bounced".
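
How a calling MTA might read those two exit codes can be sketched in shell. The function name describe_delivery_status is made up for this illustration, and the bounce/requeue wording follows the sendmail behaviour described in this chapter; other MTAs may react differently.

```shell
# describe_delivery_status CODE
# Hypothetical helper: translate a procmail exit code into the outcome
# discussed in the text. 73 and 75 are the sysexits.h values named there.
describe_delivery_status() {
    case $1 in
        0)  echo "delivered" ;;
        73) echo "EX_CANTCREAT: permanent failure, the MTA will presumably bounce" ;;
        75) echo "EX_TEMPFAIL: temporary failure, the MTA should requeue" ;;
        *)  echo "exit code $1: not covered by this sketch" ;;
    esac
}
```

For example, after running procmail by hand on a test message, calling describe_delivery_status "$?" shows which of the two cases you hit.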

18.29 When DEFAULT cannot be mailed to

If procmail gets to the end of the rcfile without delivery (or without being directed to another rcfile by an INCLUDERC or HOST assignment), it assumes these:

      :0:
      $DEFAULT

      :0 e:
      $ORGMAIL

That is, it tries to deliver to $DEFAULT and if it can't, it tries $ORGMAIL. If that fails too ("deep, deep trouble" as Stephen says in the man page), it exits without delivery and reports failure to the MTA, which, depending on other factors, will either requeue the letter and try delivering later or will bounce it to the sender.

18.30 Variable DROPPRIVS

...I have procmail invoked from a mailtable for a virtual domain. Presently that runs as root, inherited from sendmail. I'd like to have it run less privileged. I tried chown'ing the rc file to the user I want used and setting "DROPPRIVS=yes". That didn't do it. So I added "LOGNAME=user" and "USER=$LOGNAME" before the DROPPRIVS assignment and that didn't work.

[philip] DROPPRIVS only has an effect inside the /etc/procmailrc used when procmail is running in delivery mode (-d), not when it's running in mail filter mode (-m). USER and LOGNAME have no effect on the working of DROPPRIVS, as procmail is just going to change to the uid/gid of the user specified on the command line after the -d. Your mailtable entry should be specifying the procmail mailer, which runs procmail in mail filter mode.

If the following are true:

  • procmail is running in mail filter mode
  • no assignments were given on the command line
  • the -p flag was not specified
  • the rcfile specified is located under /etc/procmailrcs/ without backwards references ("/../"s)
  • the rcfile is not a directory (duh!)

then procmail will assume the uid and gid of the owner of the rcfile. If the rcfile is actually a symlink, then procmail will assume the uid and gid of the link itself, not the underlying file. If your OS allows anyone to give away ownership of files with chown, then procmail adds the following restriction to those above:

      /etc/procmailrcs must be owned by root and mode 700.

18.31 Variable HOME

[david] Since procmail doesn't understand tilde, you have to use variable HOME instead.

      CONTENT   = `cat ~/file.txt`        # Won't work
      CONTENT   = `cat $HOME/file.txt`    # ok

But accessing another user's home is another story. You could change the SHELL temporarily to a shell that understands the reference, like this:

      SHELL   = /bin/csh
      CONTENT = `cat ~user/file.txt`
      SHELL   = /bin/sh                   # restore original setting

Because the tilde is in $SHELLMETAS, when procmail sees a tilde, it will invoke a shell. It's better to skip the extra process of a shell and use the $HOME variable: put a symlink somewhere under your own home directory that points to the other user's file so that you can use the $HOME variable in your .procmailrc and avoid the shell invocation.

However, there are dangers in this too, because the sysadm may move home directories and your symlinks may go out of date. If you expect such changes and broken links, then you could cache the needed home directories at the time you need them:

      HOME_PHIL   = `ksh -c "echo ~phil"`
      HOME_ED     = `ksh -c "echo ~ed"`

18.32 Variable HOST

[philip] If an assignment to the "HOST" variable occurs where the assigned value doesn't equal the hostname of the machine on which procmail is running, procmail will stop reading the procmailrc, and if there are other procmailrcs specified on the command line, it will start reading them.

[david] It goes back to the early days of procmail, before Stephen thought of INCLUDERC or the "var ?? condition" syntax. When people had to use different code based on which local host machine was processing a particular message, the method was to list a number of rcfiles on procmail's command line. The first one would start out with general code for all messages and all hosts and then have a

      HOST = some.specific.machine

assignment, followed by code for mail delivered on that machine. If the first nine characters of "some.specific.machine" matched the real value of $HOST, procmail would stay in that rcfile; on a mismatch, it would jump to the second rcfile named on the command line.

The second rcfile would probably be for another particular machine, so (unless it first had some universal code for all machines except the first one, or unless there were only two machines where procmail might run) right at the top it would have

      HOST = this.specific.machine

Again, a match for the first nine characters would keep procmail reading this rcfile, but a mismatch would make it jump to the next rcfile.

And so it went. An incorrect HOST assignment (note that "HOST" alone attempts to unset the variable, so it is always an incorrect assignment) in the last rcfile on the command line made procmail drop the message and exit. Since we almost never name more than one rcfile on the command line now, attempting to unset HOST in .procmailrc will have that effect.

I would guess that the only use of this original setup still around is in SmartList, where flist invokes procmail with a number of rcfiles on the command line and uses things like HOST=go.to.the.next.rcfile.now to move from one to the next. Also, procmail's -m facility (which didn't exist back then) is incompatible with using HOST to jump among rcfiles, because it requires naming exactly one rcfile on the command line.

Nowadays we can do something like this to use different rcfiles on different hosts:

      :0
      * HOST ?? ^^\/[^.]+
      {
          INCLUDERC = $HOME/.$MATCH.rc
      }

18.33 Variable LINEBUF

...[manual] Length of the internal line buffers, cannot be set smaller than 128. All lines read from the rcfile should not exceed $LINEBUF characters before and after expansion. If not specified, it defaults to 2048. This limit, of course, does not apply to the mail itself...

Note: Beware of simply setting LINEBUF to a huge value: such an assignment causes procmail to immediately allocate twice that much memory (procmail has two internal buffers of size $LINEBUF).

[philip] Those 160 lines of condition are almost certainly overflowing LINEBUF. You should either a) use one of the innumerable recipes sent to the list demonstrating the use of fgrep; b) break it into multiple recipes; or c) increase LINEBUF. If you modify this list of domains regularly, then you should strongly consider (a), as (b) and (c) just put off it happening again.

LINEBUF only applies to lines from procmailrcs. You generally only have to worry about LINEBUF when you have a variable expansion or command expansion (back quotes) that doesn't have an obvious and reasonable bound on its size. procmail will avoid overrunning its LINEBUF-length buffer when doing command expansions by ignoring the extra output, so you're safe there, as long as data truncation is fine. Variable expansion isn't checked like that, so you can cause procmail to core dump by doing something like:

      :0
      * ^Subject: \/.*
      |some-program $MATCH

then feeding procmail a message with a huge Subject: header field: since no shell meta characters appear in the action, the action line will be expanded and exec()ed by procmail directly instead of by the shell. On the other hand, the following is fine:

      :0
      * ^Subject: \/.*
      |some-program $MATCH ;

The semicolon forces a shell invocation, and the shell should be safe. If your /bin/sh can buffer overrun on variable expansion, then you're in more trouble than you know.

Action lines aren't the only place to watch your variable expansions. Variable assignments and condition lines that have a leading dollar sign also undergo expansion. For example, this isn't safe:

      SUBJECT = `$FORMAIL -x Subject:`
      NEWSUBJ = "Subject: $SUBJECT"

procmail won't buffer overrun in the first line, but a really long subject could cause the second to do so. The following should be safe:

      NEWSUBJ = "Subject: `$FORMAIL -x Subject:`"

but even then only if you're sure the shell is doing the expansion of NEWSUBJ.

Note that matching against the value of a variable (using the "var ??" condition special) is safe no matter what the size of the contents of the variable. The problem is when you interpolate the variable into something else.

Is there any easy way to know the default LINEBUF value for a specific procmail? I'm sure there's a much easier way, but this will work:

      #   Mitsuru Furukawa
      #
      #   (assumes IS_EXIST is already set to a file-existence test command)
      OUT     = $HOME/tmp/linebuf.lst

      :0 wc:  $OUT$LOCKEXT
      *$ ! ?  $IS_EXIST $OUT
      | echo "$LINEBUF" > $OUT

[philip] If you examine the procmailrc manpage, you'll note that it lists fourteen variables (among them DEFAULT but not LINEBUF) whose values are reset in the environment by procmail, plus some additional ones like IFS, ENV, PWD, and PATH which come out of the top of config.h. Following this is a list of all of procmail's magic variables, including those fourteen. The idea is that while procmail has thirty magic variables, only fourteen of them are put into the environment by procmail.

The others may have default values, but they're 'input only': if what you're doing depends on one of the others having a certain value, then you should just go ahead and set it to that value. I know of only two ways to find out what value procmail is using by default: a) check the manpage (the manual pages should show the correct default for the machine), or b) fire up your favorite debugger and hope that no one stripped the procmail binary.

There will be no error message when procmail dumps core, even though the reason is apparently precisely that LINEBUF is being exceeded by too much.

Is there a limit on the length of a single line?

[david] Yes, both before and after variable expansion and command substitution, it must be shorter than LINEBUF characters. The exceptions are (1) comments and (2) commands that are run by a shell rather than directly by procmail. The entire condition must be under LINEBUF characters.

Unfortunately, LINEBUF seems to be a write-only variable; you can change its value but you can't find out its current setting.

18.34 Variable LOG and LOGFILE

If you want to print something to the LOGFILE, you could do it like this:

      LOG = "  This message goes to LOGFILE"
      LOG = " $NL$NL  And this has linefeeds around $NL$NL"

Or like this, which proves to have a nice feature with respect to the VERBOSE setting:

      dummy = "  This message goes to LOGFILE"
      dummy = " $NL$NL  And this has linefeeds around $NL$NL"

You see, if you set VERBOSE="off" then the dummy lines are not printed and recorded to the LOGFILE. LOG messages are always printed, and that's not very nice if you're trying to suppress messages while you call some subroutine:

      saved   = $VERBOSE
      VERBOSE = "off"

      #   Hope this subroutine does not use LOG
      #   Eg. $PMSRC/pm-jaaddr.rc

      INCLUDERC = $RC_ADDR

      VERBOSE = $saved            # restore original value

18.35 Variable TRAP

Here is one example of how to write to the log file. Be sure that you have preset all the variables; this just demonstrates the usage of TRAP. Pay attention to the right use of single and double quotes if you pass the values to the shell, as in this example where the /dev/ is removed from the FOLDER variable's value.

      TRAP = 'echo "
      FROM    $FROM
      TO/CC   $TO / $CC
      SUBJECT $SUBJECT
      FOLDER  $LASTFOLDER
      " | sed -e "s#FOLDER  /dev/#FOLDER  #g"'

And if your MUA expects the file to be touched before it sees new incoming mail, here is a recipe by [david]:

      TRAP = 'touch -m $HOME/Mail/$LASTFOLDER' # with strong quotes

Place it early in your rcfile; then each recipe that saves to a directory can look simply like this, and the trap will take care of the touching:

      :0 flags # no local lockfile needed for save to directory
      * conditions
      directoryname/.

[david] Procmail terminates when it exits ... after final delivery of the message. It doesn't terminate (nor execute TRAP) after delivering a copy to a c recipe [however, a clone does execute TRAP when it terminates, unless you unset TRAP for it]. It doesn't execute the trap after a variable assignment, a variable capture recipe, a filtering recipe, nor any other non-delivering action.

On the other hand, it does execute the trap if you do a quick bail-out by unsetting or missetting $HOST.

[Recipe to record Subject lines on exit]

<<Some Message had doubled Subject lines>> [david] ...this will list all subject lines in the log file upon exit if there are two or more. The earliest would appear twice: once in the trap output and once in the logabstract.

      :0
      * ^Subject:.*$(.+$)*Subject:
      {
          #  If there is already `TRAP' set, combine the
          #  old trap recipe with this

          TRAP = "${TRAP:+$TRAP ; }$FORMAIL -XSubject:"
      }

18.36 Variable UMASK

There is a better way to find out which folders contain new mail if you are using procmail to filter your mail. (This was a hack by one of my friends.) Procmail allows you to set UMASK on the folders. So before doing anything, set UMASK to 076, which means the permissions will be -rw------x on any folder that receives mail. Now, using find -perm -001, you can print the folders which have new mail. The shell script which does this will also have to chmod o-x all these folders.

...How does this work? AFAIK umask only applies to new files created and not to appending to existing files which is what procmail essentially does, right?

[era] Procmail does interpret UMASK this way, so this works, but I don't think it's a particularly good solution. It's actually hinted at in the documentation for UMASK in procmailrc(5). find is a rather heavy program to start up every time you want to look for mail. (Haven't done any timings, though.)

  • I just grep -c '^From ' on my mail folders to see how many messages there are in them. (This is only an approximation, in the case where one or more messages contain unescaped From_ lines.)
  • For a really pedestrian solution, keep all your spool files in their own directory (I think this is a good idea for other reasons as well) and do an ls -lrt on that directory, possibly piped into a sed script to trim off files with time stamps older than, say, 24 hours.
  • If your mail reader will reset permissions on spool files when it gets mail from them, the UMASK trick is a good base for a mail checking script, but I would then only ls -l the spool files and look for files with an x01 permission.
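
The find-based check described at the top of this section could look roughly like the following. This is only a sketch: the function name list_new_mail and passing the folder directory as an argument are assumptions, and it presumes procmail delivers with a umask that permits o+x (for example UMASK=076).

```shell
# list_new_mail DIR
# Print every regular file under DIR that has the "other execute" bit set
# (procmail's new-mail marker), then clear the bit so that the next run
# only reports folders that received mail in the meantime.
list_new_mail() {
    dir=$1
    find "$dir" -type f -perm -001 -print | sort |
    while IFS= read -r folder; do
        printf '%s\n' "$folder"
        chmod o-x "$folder"
    done
}
```

Calling list_new_mail "$HOME/Mail" from a login script or a cron job would be one way to use it; era's point about find being heavy still applies.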

18.37 UMASK and permissions

My mail folder says -rw-r--r-x. Is there a bug in Procmail's umask handling? (see last x bit)

[philip] That's a feature, not a bug! To quote the procmailrc(5) manpage:

UMASK: The name says it all (if it doesn't, then forget about this one :-). Anything assigned to UMASK is taken as an octal number. If not specified, the umask defaults to 077. If the umask permits o+x, all the mailboxes procmail delivers to directly will receive an o+x mode change. This can be used to check if new mail arrived.

Anyhow, normally, under Unix, the create system call will set default permissions of 666 and the umask can only be used to mask off the bits you don't want (and not to e.g. add x bits). Shouldn't Procmail work this way, too, just to be consistent with the rest of the system?

creat() will set the permissions to whatever you want it to, modulo the umask. If the umask is zero, you can set the permissions to 7777, though that would be kind of stupid (and actually, most versions of UNIX won't let you set both the sticky bit and an executable bit unless you're root, for historical reasons). Most programs that call creat() or open(..,O_CREAT,...) give a mode argument of 0666, as they generally don't write out executables. Procmail just happens to call open() with a mode argument of 0667, to be modified by your umask.
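
The arithmetic behind that answer is just "requested mode AND NOT umask". A small shell sketch makes it easy to try other umask values; the function name mailbox_mode is invented for this example, and 0667 is the mode argument quoted above.

```shell
# mailbox_mode UMASK
# Print the permission bits left over when a file is created with mode 0667
# (the value the text attributes to procmail) under the given octal umask.
mailbox_mode() {
    # The extra leading 0 makes the shell read the argument as octal.
    printf '%04o\n' "$(( 0667 & ~0$1 ))"
}

mailbox_mode 077    # prints 0600: the default umask, no marker bit
mailbox_mode 022    # prints 0645: the -rw-r--r-x case from the question
```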

18.38 Performance difference between back tick and "|" recipe

Procmail sends the whole message to stdin whenever it sees back ticks used. If you use a recipe instead, you can add the h flag to feed only the header to the program, and not the whole message. Let's ask an academic question: which one of the choices below is more efficient?

      # Side effect: Do something with shell
      dummy = `echo hi there > some-file.txt`

      :0 hwic
      | echo "hi there" > some-file.txt

Procmail sends the whole message to the first line and only the headers to the second recipe. Answer: It doesn't matter. Either way procmail will make one write system call which will return 0 [bytes written] and off it goes. You should use the first one, because the latter affects the A and E flags later; the first one is clearer overall.

While someone suggested the following, it was rejected because it hurts performance more [stephen]. The cat process is useless and directing to /dev/null does not buy anything.

      :0 hwic
      | cat - /dev/null; echo "hi there" > some-file.txt

18.39 Procmail's temporary file names while writing file out

...Any ideas what might make those .nfs* files? They contain messages which seem to have been successfully processed by procmail in the later parts of the .procmailrc. However, I doubt they'd ever get cleaned up if I didn't discover them.

      /disk3/home/foobar/Mail 119) ls -la backup
      total 22
      drwx------  2 stanr         512 Nov 11 21:00 .
      drwx------  3 stanr        2560 Nov 11 21:11 ..
      -rw-------  1 stanr        3063 Nov  4 03:31 .nfsA0c724.4
      -rw-------  1 stanr        1780 Nov  3 23:00 .nfsA47da4.4
      -rw-------  1 stanr         849 Nov  3 23:22 .nfsA481f4.4
      -rw-------  1 stanr        2293 Nov 11 11:28 .nfsA737d4.4
      -rw-------  1 stanr        2598 Nov 11 20:39 msg.HCJB
      -rw-------  1 stanr        3127 Nov 11 21:00 msg.ICJB
      -rw-------  1 stanr        1884 Nov 11 20:45 msg.KCJB
      /disk3/home/stanr/Mail 120)

[david] procmail uses a temporary name while it is trying to write a file out, which it renames if things go well. I noticed that they all came from a 4h 31m span overnight; perhaps there was some systems work being done on your machine that screwed things up?

      :0 ic
      | cd backup && rm -f dummy `ls -t msg.* .nfs* | sed -e 1,3d`

[aaron] When a file that is being used by a program on an NFS client gets unlinked the NFS server renames it to something like that. It should then actually get unlinked when the file is closed, but it looks like the NFS server never got the close message for those.

[Keith Pyle keith@ibmoto.com] It is a result of using NFS, but the fault lies with the operating system on the NFS client. Keep in mind that NFS is stateless from the perspective of the NFS server. It keeps no information on how any file is being used. So, if a client tells the server to delete the file, the server deletes the file. This is not normally a problem, but many programs use a "trick" of Unix where the program opens a file, unlinks (deletes) it, and then continues to use the file. For all local files, the Unix kernel will not actually delete the file until all processes which have the file open exit. This works very well for temporary files.

If a client tells an NFS server to delete a file, it will delete the file immediately because of the stateless nature of NFS. The server has no way of knowing if any client still has the file open. To avoid this problem, if a client unlinks an open file on an NFS filesystem, the file is renamed to .nfs* where * is a unique value. The NFS client system is supposed to delete the .nfs* file when the process exits. However, there are some versions of Unix which do not do this well (e.g., AIX). If one of these OS's is used, it is common to find .nfs* files in various places. Therefore, it is a good idea for system administrators to periodically purge any .nfs* files over a certain age to eliminate the unsightly buildup in the filesystems.
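
The periodic purge suggested above might be sketched like this. The function name purge_stale_nfs and the seven-day default are assumptions made for the example; the age threshold is the only thing protecting .nfs* files that a live process still has open, so pick it generously.

```shell
# purge_stale_nfs DIR [DAYS]
# Delete .nfs* files under DIR that were last modified more than DAYS days
# ago (default 7), printing each name as it is removed.
purge_stale_nfs() {
    dir=$1
    days=${2:-7}
    find "$dir" -type f -name '.nfs*' -mtime +"$days" -print -exec rm -f {} \;
}
```

Something like purge_stale_nfs /disk3/home 14 run from cron would cover the case shown in the directory listing earlier in this section.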

18.40 Parameter $@

[david] As of version 3.11pre7, procmail does not grok "$*", nor does it grok "$@" outside a pipe or forward action. The only way to get the positional parameters all quoted together into "$*" is something like this:

This doesn't work after all:

      ARGS = `echo "$@"`

Procmail substitutes null for "$@" there. This works, though:

      :0 ir
      ARGS=|echo "$@"

After that you use "$ARGS" instead of "$*".

If you try to set ARGS with ARGS="$@", procmail doesn't substitute for "$@" and makes $ARGS null. If you try ARGS="$*" you get the literal text '$*'.

[philip] Of course, $ARGS differs greatly from $@ in that $ARGS will either be split on whitespace (if unquoted) or one argument (if double-quoted). $@ has the cool property that if double quoted it'll still be split into multiple arguments on the original argument boundaries. Since full-blown mail addresses often have spaces, this distinction should not be casually dismissed. Note that while you might not type in such an address, your MUA's reply builder may.
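
The difference philip describes is easiest to see in plain sh, where "$@" has the same splitting property. The sketch below is a shell illustration only, not procmail code; count_args and show_splitting are names invented for it.

```shell
# count_args: print how many arguments were received.
count_args() { echo "$#"; }

# show_splitting ARG...
# Print three counts: arguments seen via "$@", via a double-quoted flattened
# copy, and via the same copy left unquoted.
show_splitting() {
    ARGS="$*"                           # flatten everything into one string
    quoted_at=$(count_args "$@")        # original boundaries survive
    quoted_args=$(count_args "$ARGS")   # always exactly one argument
    unquoted_args=$(count_args $ARGS)   # re-split on every blank
    echo "$quoted_at $quoted_args $unquoted_args"
}

show_splitting "Alice Liddell <alice@example.com>" "hare@example.com"
# prints: 2 1 4
```

Only the first count matches what the caller actually passed, which is why an address with a display name gets mangled once it has gone through $ARGS.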

18.41 Procmail variables are null terminated (detecting null string)

You can't catch a null in the message. E.g. if you try it like this:

      NUL=`/usr/5bin/echo "\000"`

      :0 HB
      * $ $\NUL
      {
          LOG = "Caught NUL"
      }

[philip] It won't work as expected. The problem is that environment variables (and therefore procmail variables) are null-terminated, and therefore cannot contain a null. The above line creates an empty variable. The solution is to use an inverted character class:

      NUL = `/usr/5bin/echo '[^\001-\377]'`

Note that procmail handles 8-bit characters except for null in procmailrcs, so you can use a literal control-A and octal-377 in your .procmailrc and save an echo and shell invocation right there.
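
The same limitation can be observed from an ordinary POSIX sh or bash prompt, which is a quick way to convince yourself that the first attempt really yields an empty variable. This is a shell-side illustration under that assumption, not a statement about procmail's internals.

```shell
# A command substitution cannot carry a NUL byte into a variable, so the
# variable ends up empty. (bash may print a warning about the ignored byte;
# the redirection on the brace group hides it.)
{ NUL=$(printf '\000'); } 2>/dev/null
echo "length of NUL: ${#NUL}"
```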

18.42 FROM_DAEMON TO and TO_ and case-sensitiveness

[david] ^TO is case-insensitive by default. Stephen once told me something to the effect that tokens like ^TO, ^TO_, ^FROM_DAEMON, and ^FROM_MAILER are always case-insensitive, even if the recipe has the D flag, but I'm not positive that that was what he was saying, and we never pursued it. Certainly they are insensitive to case if there is no D.

[philip] If a regexp contains the ^FROM_DAEMON token, then that entire regexp is treated as case-insensitive. Other conditions in the recipe are not affected by this. The other tokens have no effect on the case-sensitivity. (This is with procmail 3.11pre4)

18.43 TO_ macro deciphered

...What is the essential difference between TO and TO_ ?

[phil 1996-03-21] The difference is that ^TOalias1@site may match something like bobs-alias1@site while ^TO_ won't.

[elijah 1997-09-16] Let's rewrite that in perl /x format (see below). Block (E) holds the definition of the word boundary. The ^TO_ expansion was added in v3.11pre4. With an older version you'll probably have to use just ^TO (no '_'), which should work almost as well.

      /                       # [begin regexp]
       (                      # [Block (A)]
        ^                     # Anchor to start of line
         (                    # [Block (B)]
           (Original-)?       # Optionally precede (C) with "Original-"
            (Resent-)?        # Optionally precede (C) with "Resent-"
                    (         # [Block (C)]
                     To       # "To"
                    |Cc       # or "Cc"
                    |Bcc      # or "Bcc" {very rare in practice}
                    )         # [end (C)]
           | (                # [Block (D)]
              X-Envelope      # Precede "-To" with "X-Envelope"
             |Apparently      # or "Apparently"
                 (-Resent)?   #    with optional "-Resent" appended
             )                # [end (D)]
                -To           # "-To", following block (D)
          )                   # [end (B)]
             :                # ":"
             (                # [Block (E)]
              .*              # any text
              [^-a-zA-Z0-9_.] # any single char other than letters, numbers,
                              # hyphen (-), underscore (_), or period (.)
             )                # [end (E)]
              ?               # Block (E) is optional
       )                      # [end (A)]
      /x                      # [end regexp]
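
The difference between the two macros can be tried outside procmail. The following is only a sketch: grep -E stands in for procmail's own matcher, the ^TO_ expansion is the regexp dissected above, and the older ^TO expansion (ending in [^a-zA-Z]) is written from memory of procmailrc(5), so treat it as an assumption. The function names matches_to and matches_to_ are made up for this illustration, and the address argument is used as a regexp.

```shell
#!/bin/sh
# Hand-expanded macro prefixes; grep -E -i stands in for procmail's matcher.
HDRS='^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):'
TO_OLD="$HDRS"'(.*[^a-zA-Z])?'          # assumed expansion of ^TO
TO_NEW="$HDRS"'(.*[^-a-zA-Z0-9_.])?'    # expansion of ^TO_ shown above

# matches_to HEADERLINE ADDRESS   -> exit 0 if "^TO" + ADDRESS would match
matches_to()  { printf '%s\n' "$1" | grep -E -i -q -e "${TO_OLD}$2"; }

# matches_to_ HEADERLINE ADDRESS  -> exit 0 if "^TO_" + ADDRESS would match
matches_to_() { printf '%s\n' "$1" | grep -E -i -q -e "${TO_NEW}$2"; }

# phil's example: the hyphen counts as a boundary for ^TO but not for ^TO_
matches_to  'To: bobs-alias1@site' 'alias1@site' && echo "^TO  matches bobs-alias1@site"
matches_to_ 'To: bobs-alias1@site' 'alias1@site' || echo "^TO_ rejects bobs-alias1@site"
```

Because block (E) refuses a hyphen, underscore, period or alphanumeric right before the address, ^TO_ only matches when the address starts at a word boundary.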

18.44 TO_ macro and RFC 822

...According to RFC822 the From address can contain almost anything and the valid mail address can be extracted from the line as long as it is enclosed between <...>. Like foo@example.com.

[by Vikas Agnihotri vikas@insight.att.com] Block (E) {see TO_ macro explanation} is there to slurp up that part. The <encapsulation> is not needed, and a case such as:

      From: "jester@fun.house" fool@aol.com

Will confuse a test for "^TO_jester@". Yes, I have seen people do that stuff, apparently not even maliciously. And the following is also valid:

      From: someone@somewhere.com another@one.com

[Elijah continues] it will also confuse the regexp. I don't like the ^TO and ^TO_ macros for most things and typically use stuff like this:

      ^(Resent-)?(To|CC):.*[< ]{address}([ >]|$)

It still can be confused, but the things that will cause problems are fairly rare in practice. You might prefer something like this:

      ^(Resent-)?(To|CC):([^(]+([(].*[)])?)*[, <]{address}([, >]|$)

Which can correctly deal with

      To: (hatter@tea.party) {address}
      To: (fake {address}) bill.the.lizard@the.jury.box
      To: Alice alice@the.croquet.game, "W. Rabbit (late)"
              hare@small.hole, Gentle Reader <{address}>
      To: jabberwocky@vorpal.swords.r.us, duchess@the.croquet.game,
              chesire@no.where, {address}, dinah@meow.org

It will still fail for

      To: (fake <{address}>) mockturtle@tortoise.edu

If someone is malicious enough to send you such mail.

18.45 FROM_DAEMON deciphered

Here is the exploded FROM_DAEMON regexp as of 3.11pre7

      (^(Precedence:.*(junk|bulk|list)
        |To: Multiple recipients of
        |(
          ((Resent-)?(From|Sender)|X-Envelope-From):|>?From )
          ([^>]*[^(.%@a-z0-9])?
          (
              Post(ma?(st(e?r)?|n)|office)
              |(send)?Mail(er)?
              |daemon
              |m(mdf|ajordomo)
              |n?uucp
              |LIST(SERV|proc)
              |NETSERV
              |o(wner|ps)
              |r(e(quest|sponse)|oot)
              |b(ounce|bs\.smtp)
              |echo
              |mirror
              |s(erv(ices?|er)|mtp(error)?|ystem)
              |A(
                 dmin(istrator)?
                 |MMGR
                 |utoanswer
                )
          )
          (  ([^).!:a-z0-9][-_a-z0-9]*)?
             [%@>\t ][^<)]*(\(.*\).*)?
          )?
          $
          ([^>]|$)
        )
      )

[era] explains the last regexps as follows:

      (([^).!:a-z0-9]   End of e-mail address token
        [-_a-z0-9]*     Another alpha token
        )?              ... or maybe not;
       [%@>\t ]         Address separator -- either address@... or
                          <address> or a bare address with whitespace
                          around it
       [^<)]*           Skip as long as we don't run into another
                          bracketed address or end of comment
                          (presumably to prevent this from matching
                          inside parenthesized comments in the first
                          place)
       (\(.*\).*)?      Skip optional parenthesized comments and
                          anything after them if found
      )?                ... or maybe not; maybe we just see an ...
      $                  ... end of line instead
      ([^>]|$)           Uh, I should know what this is supposed to do,
                          but I can't quite remember what it's for. I
                          think it had something to do with continued
                          header lines ... Anyone?

Does ^FROM_MAILER match on the Return-Path: line?

[david 1998-04-29] Apparently not, but it does match on the UNIX From_ line, which usually contains the same address as the Return-Path: header.

Does anyone have an idea how I can use this macro but tell it to ignore the Return-Path line in the header?

There's probably some way within procmail without the extra fork of formail, but this is easy to think of and easy to write:

      :0h
      HEAD_WITHOUT_FROM_=| formail -IReturn-Path: -I'From '

      :0
      * HEAD_WITHOUT_FROM_ ?? ^FROM_MAILER
      action

If you want to consider only the From: header, try this:

      :0
      * ^\/From:.*
      * MATCH ?? ^FROM_MAILER
      action


19.0 Technical matters

19.1 List of exit codes

The right place to look is /usr/include/sysexits.h, but the codes should be pretty much standard. These ones are from HP-UX 10 and the code that you will be using mostly is EX_NOUSER or EX_NOPERM. It tells the sender of UBE to "piss off and delete me from your list; I'm not here"

      EX_OK          0        successful termination
      EX__BASE       64       base value for error messages

      EX_USAGE       64       command line usage error
      EX_DATAERR     65       data format error
      EX_NOINPUT     66       cannot open input
      EX_NOUSER      67       addressee unknown
      EX_NOHOST      68       host name unknown
      EX_UNAVAILABLE 69       service unavailable
      EX_SOFTWARE    70       internal software error
      EX_OSERR       71       system error (e.g., can't fork)
      EX_OSFILE      72       critical OS file missing
      EX_CANTCREAT   73       can't create (user) output file
      EX_IOERR       74       input/output error
      EX_TEMPFAIL    75       temp failure; user is invited to retry
      EX_PROTOCOL    76       remote error in protocol
      EX_NOPERM      77       permission denied
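
When a delivery fails it helps to turn the numeric status back into its name. The following is a minimal sketch using only the values from the table above; the function name sysexit_name is made up for this illustration, and your system's sysexits.h remains the authority.

```shell
#!/bin/sh
# sysexit_name CODE -> print the sysexits.h name for a delivery exit status
sysexit_name() {
    case "$1" in
        0)  echo EX_OK ;;
        64) echo EX_USAGE ;;
        65) echo EX_DATAERR ;;
        66) echo EX_NOINPUT ;;
        67) echo EX_NOUSER ;;
        68) echo EX_NOHOST ;;
        69) echo EX_UNAVAILABLE ;;
        70) echo EX_SOFTWARE ;;
        71) echo EX_OSERR ;;
        72) echo EX_OSFILE ;;
        73) echo EX_CANTCREAT ;;
        74) echo EX_IOERR ;;
        75) echo EX_TEMPFAIL ;;
        76) echo EX_PROTOCOL ;;
        77) echo EX_NOPERM ;;
        *)  echo "unknown($1)" ;;
    esac
}

# Simulate a delivery program that claims the addressee does not exist.
sh -c 'exit 67' || sysexit_name $?    # prints EX_NOUSER
```

In a procmailrc the same status is what the mailer sees after a recipe sets EXITCODE=67.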

I thought that by using the EXITCODE, I would be assured that the mail would be rejected but in fact Sendmail 8.8.7 attempts to deliver the "user unknown" to netcom.com, which is obviously wrong?

[sean] Sendmail accepts the message, then passes it on to Procmail, either as the local delivery agent, or via a .forward file (depending on your system's configuration). Procmail says "gee, gotta lie about not being here" and rejects the message, which is then sent back into the spool and delivered according to who it appeared to come from.

Had SENDMAIL determined the user didn't exist (password file / aliases / virtusertable.txt), then it would have rejected the message right when the remote was doing SMTP RCPT. But the user WAS valid, and so it accepted it.

Another scenario is when you have a mail secondary, and your primary (where the user account and procmail are) is down. Some system goes to deliver mail to you, and resolves to your secondary -- which simply holds mail for your primary -- it hasn't a clue which user is valid and which isn't. Well, when the (E)HELO (the system sending your primary the message) takes place during the SMTP session, the message is coming from your secondary - not from the original sender. At THAT point, if the user didn't exist, I believe sendmail would be issuing an unknown user error to the secondary, which in turn should mail that message back to who it thinks is the sender (I can't check my Bat book from where I'm at - any sendmail pros are welcome to elaborate).

Is there any way at all to get around this (force the rejection at delivery time)? Better yet, is there some sort of check to make sure that the Received domain reasonably matches the From: domain?

You'd need to have a ruleset in your SMTP Daemon (generally Sendmail) to check domains (which WILL fail on many valid messages, BTW) and reject it WHILE the SMTP delivery session (actually, the negotiation) is in progress. By the time Procmail has the message, you've completely accepted the message, and any rejection you might hope to do is bouncing the mail - to the apparent sender.

Such is the problem with forged mail.

I wouldn't suggest this tactic for fighting spam anyway - so much of it is forged, and any bounce you send out simply uses up system resources on your machine and those on the system that was spoofed. Spammers don't REMOVE addresses from their lists (they want the lists to look as big as possible when they go to sell it to someone else) -- some have even taken to GENERATING addresses at domains and sending messages to them with the assumption that somebody will probably have an account by that name ("bill@ joe@ dave@ ...").

Use procmail to trashbin (or otherwise file) all the junk and then manually take action on those which get through.

19.2 List of precedence codes

The priorities most sendmails recognize are the following. The lower the priority, the later the message gets dealt with. A smart vacation program will ignore anything with a list, bulk, or junk priority. --Adam Shostack adam@bwh.harvard.edu

      0    first-class
      -30  list
      -60  bulk
      -100 junk
      100  special-delivery

[dan] You should use bulk when you distribute files via File Server. The value in the Precedence: header says absolutely NOTHING about the contents of the message itself, it merely suggests a priority level to the mail system. From p. 668 of O'Reilly's sendmail book, bulk typically has a value of -200 while junk -100; thus a message with junk will get higher priority than that of bulk (although this can be changed in the sendmail.cf file).

Other than on heavily loaded machines, this value won't matter anyway, since all mail will be quickly processed.

[Stephen] ...Mail sent by a person is usually considered to be more important than autoreplies generated by some daemon. One way to express the lower priority of autoreplies is by adding a "Precedence: junk" field. This allows mail transport agents to make educated decisions about which mail to forward first (in case the mailqueue gets clogged).

Another point concerns other autoreply services, like vacation. They try to make an effort not to accidentally reply to a message generated by another daemon (e.g. yours). One way they detect this is by looking at the Precedence field. If it contains junk, they know this is not something they should respond to.
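
The check such an autoreply program performs can be sketched in a few lines of shell. This is an illustration only, not how vacation itself is implemented: the function name is_low_precedence is made up, and the pattern is the Precedence part of the FROM_DAEMON regexp shown earlier in this document.

```shell
#!/bin/sh
# is_low_precedence: read one message on stdin; exit 0 if its header carries
# Precedence: junk, bulk or list. The body is never looked at.
is_low_precedence() {
    # sed prints lines up to the first empty line (end of header), then quits
    sed '/^$/q' | grep -E -i -q '^Precedence:.*(junk|bulk|list)'
}

# Example with a canned message: an autoresponder should stay quiet here.
printf 'From: list@example.com\nPrecedence: bulk\n\nHello\n' |
    is_low_precedence && echo "no autoreply for this one"
```

A real autoresponder would combine this with the X-Loop and ^FROM_DAEMON tests used elsewhere in this document.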

19.3 Sendmail and -t

Doesn't sendmail -t read the To, Cc, Bcc, etc. headers to find the recipients of the auto response?

      :0h
      * condition
      * !^X-Loop: foo@example\.com
      | ($FORMAIL -rA "X-Loop: foo@example.com" ) | sendmail -oi -t

[david] That's not a problem, because formail -r will not generate any Cc: or Bcc: headers unless you tell formail to add them. The only line where sendmail -t will look for recipients will be the To: line.

19.4 RFC822 Reply-To and formail problem with multiple recipients

[david] formail -r extracts only one return address, even when the Resent-Reply-To: or Reply-To: header contains more than one (and Stephen has told me he plans to leave it that way).

  • Looking for the best address to reply to is a completely different algorithm than looking for the best group of addresses to reply to. Finding a group of addresses involves actually determining that you even are searching for a group and not only for one address. Then finding out the best address for each. It's already a tricky business doing this just for one address.
  • It makes thousands of autoreply recipes vulnerable to mail-storm attacks. Formail tries its best to control the damage even if operated by someone who doesn't know what he is doing. If it were to reply to multiple addresses at times, this damage control is severely undermined.

[dan] I understand these concerns; however RFC822 specifically allows for multiple recipients in a Reply-To: header. Given that, it seems that there should be a straightforward way to deal with this in formail; even worse is that "formail" silently ignores multiple Reply-To: addresses.

For the first point, wouldn't the Reply-To: (or Resent-Reply-To:) header supersede all other addresses and thus greatly simplify the searching? For the second, how about only using multiple (Resent-)Reply-To: addresses if formail's "-t" option is also specified? Or if you are really worried about mail-storms and existing recipes, a new formail option.

19.5 Procmail and IMAP server

[ed] See also ftp://ftp.cac.washington.edu/mail/imap.vs.pop ...This paper is an elaboration on a short note entitled "Comparing Two Approaches to Remote Mailbox Access: IMAP vs. POP", which was written in 1993 and recently updated. The purpose of this paper is to provide more extensive background on message access paradigms and protocols, and then to specifically compare the Internet's Post Office Protocol (POP) and the Internet Message Access Protocol (IMAP) in the context of "online" operation.

...I log in to a set of NFS-ed servers (or more precisely AFS-ed), and my mail comes into another server (not a part of this set) which is running IMAP. So sendmail never delivers mail into /var/mail/$LOGNAME on my login machines, and instead delivers to the IMAP server. Since sendmail never reads my .forward file in the home directory, I figure procmail never gets invoked.

You need a program which will fetch your e-mail from the IMAP server and then feed it to procmail. One such program that can do this is fetchmail. Check out http://locke.ccil.org/~esr/fetchmail/. The bad news is that once you do this, you probably won't be able to use an IMAP client to read your e-mail anymore. But that might be good news if you prefer an MUA that reads mbox files but doesn't grok IMAP.

19.6 Machine which processes mail

...The just-installed procmail does not work and I am assuming that sendmail is trying to run procmail on another machine. Is there any way I could find out the appropriate ARCHITECTURE for that machine?

[era] The following should tell you the name of the machine which processes mail for the machine you're asking about. You can then try to log in to that machine if you have shell access there, which is something you need to have in order to compile Procmail on it.

      nslookup -q=mx machine      # alternatively use host(1) command
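
If several mail exchangers are listed, the one with the lowest preference value is tried first, so that is usually the machine to look at. The following sketch picks it out of host(1) output; it assumes the common "NAME mail is handled by PREF HOST." line format, which may differ between resolver tools, and the function name lowest_mx is made up for this illustration.

```shell
#!/bin/sh
# lowest_mx: read "host -t mx" style output on stdin and print the mail
# exchanger with the lowest preference value.
lowest_mx() {
    awk '/mail is handled by/ { print $(NF-1), $NF }' |
        sort -n | head -n 1 | awk '{ print $2 }'
}

# Canned sample so the sketch runs without network access.
printf '%s\n' \
    'example.com mail is handled by 20 backup.example.com.' \
    'example.com mail is handled by 10 mail.example.com.' | lowest_mx

# Typical use (needs working DNS):  host -t mx machine | lowest_mx
```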

If you don't have nslookup (doh) or don't understand what it says, try adding this to your .forward

      "|uname -a >/full/path/to/home/.uname.out"

i.e. this should be there in addition to what else you do. Otherwise this will lose your mail thoroughly, since it reads the mail but doesn't save it anywhere. You might want to save a copy of all incoming mail to a safety mailbox, too, just in case. Like so:

      /full/path/to/home/safetymailbox
      |"uname -a >/full/path/to/home/.uname.out"
      |"IFS=' '&& exec /usr/local/bin/procmail -Yf- || exit 75"

If you try this, it is very important that the file safetymailbox exists and is writable. (man 5 forward if you have that -- I don't seem to have this manual page on systems with newish versions of sendmail, is that correct?)

Try the uname command (and/or read the manual) to see what you should expect to find in the file .uname.out

19.7 Compiling procmail and MAILSPOOLHOME

...I am compiling 3.11pre7 on a new system and have a couple of questions. I edited the makefile to be the home directory "/home/a/abc" for example. I defined MAILSPOOLHOME as "/mail". The incoming mail is actually stored in "/usr/mail/abc". When I pipe test messages through procmail (using "procmail</usr/mail/abc"), rather than them ending up in my inbox, they end up in a mailbox called "msg.gs.KB". What on earth did I goof up? As I sit here and think about this, should MAILSPOOLHASH be set to 1 instead of 0?

[philip] If incoming mail is supposed to be stored in /usr/mail/loginnamehere, then you should not define MAILSPOOLHOME at all, but rather define MAILSPOOLDIR to "/usr/mail/" and leave MAILSPOOLHASH as 0. Defining MAILSPOOLHOME causes mail to be delivered inside each user's home directory, which does not appear to be what you want. MAILSPOOLHASH causes additional levels of hierarchy to be created in the spool directory, thus avoiding the 'fat slow directory' problem.


20.0 Smartlist

20.1 MLM RFC

Smartlist FAQ
http://www.hartzler.net/smartlist/SmartList-FAQ.html

The header fields used by mailing list managers are specified in rfc2369 "The Use of URLs as Meta-Syntax for Core Mail List Commands and their Transport through Message Header Fields". http://www.cis.ohio-state.edu/htbin/rfc/rfc2369.html

20.2 Other mailing list software

ezmlm

http://www.ezmlm.org/ and http://www.qmail.org/

..SmartList development is dead and in the long term it is difficult to manage and improve. I've moved the vast majority of my lists to ezmlm, which is built upon qmail. They are both packages based in part or in whole on DJB code. Performance is great. Extensibility is terrific. Development is active. Using ezmlm requires that you use qmail as your MTA.

...indeed, this will keep me from using ezmlm, as I use exim (www.exim.org) as my MTA and much prefer it over qmail (which I've also used).

...I use qmail concurrently with Sendmail but it is the qmail 'way' to lure you in with the idea it is a painless drop-in replacement for Sendmail. Before long you realize you need to convert your entire mail server over to qmail. Qmail seems to disdain anything not qmail, Unix itself, in pretty harsh terms.

...exim is the descendant of smail3, and has copious configuration capability... more extensive filtering than sendmail.. and is a drop-in replacement for sendmail (unlike qmail). I found qmail's installation quite different from that to which I had become accustomed.. (e.g. the multiple entries in passwd). I know the linux community uses and likes qmail so I'm sure it's pretty good. qmail is a drop in replacement for sendmail as well

SmartList limitations

[1998-10-08 SL-L cueman@cue.com] ...The trouble with rapid delivery MTA's and SmartList(is not!) is multigram and the bounced mail processing feature and large lists. If you run small lists it's not so bad. But with 30k+ subscriber monthly announcement lists, I have to redirect bouncing mail someplace else. The faster the mail is delivered the faster the bounces come in. The larger the list, the more multigram bogs down and the more procmail processes accumulate and then you get file table overflows and all that stuff from the load being sky high and hundreds of processes hanging around. Also because a lot of mail servers don't behave the way we would like them to and return to us proprietary bounced mail messages that SmartList can't understand, the bounce feature is only able to successfully remove a portion of the offensive mail addresses, requiring the listowner or someone to manually go through it all anyway.

SmartList is not covered by the GNU GPL -- the copyright notice in the docs does say that changes/modifications can be made if they are 'marked'.

Personally, I think the bounced mail removal system needs to be entirely redone if SmartList(is not!) is to be able to look forward to the future enough to merit further development.

Other MLM software

20.3 SmartList code (mailing list implementation with procmail)

Smartlist 'FAQ'
http://www.hartzler.net/smartlist/SmartList-FAQ.html

Mark's Smartlist add-ons
Mark David mcCreary mdm@internet-tools.com ftp://ftp.mail-list.com/

...front-end for Smartlist mailing lists, and allows people to mail to list-on@domain.com and list-off@domain.com, which then creates and sends a properly formatted subscribe/unsubscribe message to the list-request address.

It also handles change of address, switches to/from digest lists, moderated subscriptions, and a few other things.

My experience is that if you are planning on running lots of lists, then eliminating questions/problems from subscribers is of paramount importance, and those procmail recipes may be worth the time to learn/tweak. --Mark

Michelle's (SmartList add-ons) Confirmation cookie
ftp://ftp.fatfree.com/ confirm-1.1.tar.gz ftp://ftp.rahul.net/pub2/artemis/ confirm-1.1.tar.gz To add subscription confirmation to smartlist

The mail-list.com front-end for Smartlist Mailing Lists
ftp://ftp.mail-list.com/ mark david mcCreary <mdm@internet-tools.com> ...This is a front end to Smartlist mailing lists. It provides easy to use addresses for subscribers of the list. It also allows list owners to concatenate many Smartlist commands into one mail message, which is then broken up and sent in as separate messages to Smartlist.

These scripts can accept messages for any mailing list, extract the name of the list, and then format the appropriate commands that are then forwarded on to the standard Smartlist list-request address. These scripts do not directly alter the Smartlist dist file.

A discussion group mailing list is available for support. Please send a blank message to front-end-on@mail-list.com to start the subscription process.

Moderated lists -- Perl script
http://www.mjolner.com/~lbr/moderate/

20.4 Installation trouble: getparams

Does anyone out there know what the error means when it occurs when installing Smartlist? Procmail is already installed on the system (by the sysops)

      make: *** No rule to make target `getparams'

[Hal Wine] Yes, it means that you haven't built procmail yet. Build procmail first, then execute Smartlist's install.sh script. You need to get and untar the procmail sources in your own directory, then get and untar the corresponding Smartlist sources in the same directory tree.

Then build (but don't install) procmail, then install Smartlist using the install.sh script. Smartlist uses and builds files in the Procmail source tree, so that has to be done first

[sysops] don't have the time to mess with getting Smartlist running. Obviously, when I attempt to install Smartlist, it's not finding Procmail. What do I have to do to get the install program to find Procmail?

If the sysops aren't going to install Smartlist, read all the sections in Manual about non-root use of Smartlist (it works fine).

You should make sure that smartlist, when invoked, uses the matching version of procmail. This means either use the version of Smartlist that matches the sysop installed version of procmail, or set up your PATH such that you use the version you built. If you use your own version, make sure it uses the same locking strategies as the "official" version.

20.5 Accepting mail only from users in whitelist(s)

1998-10-08 PM-L dave@magic.geol.ucsb.edu Dave Robbins.

      ML          = /usr/local/lib/aliases
      ACCEPTLIST  = "$ML/mylist.accept $ML/everyone $ML/others"
      FROM        = `$FORMAIL -rtzxTo:`

      :0
      * ? echo "$FROM" | $EGREP -i -f $ACCEPTLIST
      * ? test -r $ACCEPTLIST -a -s $ACCEPTLIST
      {
              :0 HB
              ! `cat $ML/mylist`
      }

      :0
      *  ! ^FROM_DAEMON
      *  ! ^FROM_MAILER
      *$ ! $MY_XLOOP_LIST
      | ($FORMAIL -rtk                                                    \
         -A "$MY_XLOOP_LIST"                                              \
         -A "Precedence: junk";                                           \
      echo "Your post to mylist@magic.geol.ucsb.edu was not successful\n" \
           "because the mailing list is restricted to submissions\n"      \
           "from only certain individuals and groups. Sorry.\n"           \
      ) | $SENDMAIL -oi -t
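
One caveat when ACCEPTLIST holds several file names, as above: grep takes only the first name after -f as its pattern file and treats the rest as files to search, and test -r expects a single file, so the two conditions appear to need one list per invocation. The following is a sketch of a per-file check that could be called from a "* ?" condition instead. It is an illustration, not Dave's method: the function name sender_is_listed is made up, and unlike the egrep version it compares addresses literally (whole line, case-insensitive) rather than as regexps.

```shell
#!/bin/sh
# sender_is_listed ADDRESS FILE...
# Exit 0 if ADDRESS is a whole line (ignoring case) in at least one
# readable, non-empty FILE; exit 1 otherwise.
sender_is_listed() {
    sender=$1
    shift
    for list in "$@"; do
        # skip lists that are missing, unreadable or empty
        [ -r "$list" ] && [ -s "$list" ] || continue
        if grep -i -q -F -x -e "$sender" "$list"; then
            return 0
        fi
    done
    return 1
}

# Typical use, with the variables from the recipe above:
#   sender_is_listed "$FROM" $ML/mylist.accept $ML/everyone $ML/others
```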


21.0 Additional procmail or MUA software

21.1 Comstat to handle multiple mailboxes

ftp://ftp.belwue.de/pub/Unix/xcomsat.tar.gz

21.2 Elm and pgp support (Mutt is the successor to elm.)

Mutt's primary site is ftp://ftp.guug.de/pub/mutt/ with various mirrors outside the US to avoid the crypto distribution problem. If you want elm, see http://www.elm.org/

[Liviu Daia daia@stoilow.imar.ro mentions that] ...Provided that you configure it correctly, it will use lynx to convert HTML attachments to plain text automatically, and display them in its pager. You can reply in plain text to those attachments, and you can also do the same thing with any kind of attachment for which you give it a way to convert to plain text. It's definitely not aimed at the beginner level like Pine, but it's far more powerful too. Also GPL-ed.

21.3 MH sites

New MH
http://www.math.gatech.edu/nmh/


22.0 Additional procmail software for Emacs

22.1 What is Emacs

...first thing I learned on a Unix machine was that vi is a text editor and Emacs is a way of life. --David W. Tamkin dattier@wwa.com

Emacs refers to a programming platform (it's not only a text editor, or a programming editor, but it does almost everything you tell it to do except make your coffee) which can be found on almost any Unix platform. Nowadays Emacs is also available for the PC platform. There are two flavors to choose from: Emacs, maintained by the FSF (Free Software Foundation), and XEmacs, sometimes called "Emacs the next generation", because it has a better graphical user interface (GUI) and an internally advanced OO design (it can highlight on tty, whereas Emacs can't). XEmacs is being maintained by a group of programming wizards.

      See /elisp.html

Emacs add-in packages are written in lisp and the lisp file extension is .el. Inside each package one finds instructions on how to use and how to install the package into Emacs.

22.2 Emacs and procmail mode and Lint

Procmail mode for Emacs (which can also lint procmail recipes) is available. People familiar with C-coding know lint, which is a rigorous code syntax checker. You can read about this Emacs mode at http://tiny-tools.sourceforge.net/

22.3 Emacs and lining up backslashes

Some time ago I wrote a makefile for my Emacs tgz kit and as a side effect I got frustrated with the use of backslashes within the make rules. This backslash problem is universal in almost every programming language (e.g. C/C++ macros), including procmail, where you sometimes use echo a lot,

      :0 h
      * condition
      | ( cat -;       \
          echo "And the body text\n"  \
               "follows here with\n"  \
               "these echoes"; \
          ) | $SENDMAIL

Ouch. That looks bad. Any line up tool anywhere? Yes, get my Emacs tiny-tools.tar.gz and look at the file tinymy.el which defines function timy-backslash-fix-paragraph. Here is a piece of lisp code that you stick to your .emacs to make the key Control-\ run the backslash fix

      (global-set-key "\C-\\"  'my-backslash-default-column)

      (defun my-backslash-default-column (&optional arg)
        "Col 76."
        (interactive "*P")
        (autoload 'timy-backslash-fix-paragraph "tinymy" t t)
        (timy-backslash-fix-paragraph (or arg 76) 'verb)
        )

After that, you just put your cursor inside the paragraph and hit Control-\ to get the following line up effect. The column position is best set near the right margin, but not further than a regular page's maximum column 80.

      :0 h
      * condition
      | ( cat -;                                                      \
          echo "And the body text\n"                                  \
               "follows here with\n"                                  \
               "these echoes";                                        \
          ) | $SENDMAIL
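
For those without Emacs at hand, the same line-up effect can be approximated with a small awk filter. This is a sketch only, not a port of the tinymy.el function; the name align_backslashes is made up for this illustration, and lines longer than the target column simply get their backslash appended directly.

```shell
#!/bin/sh
# align_backslashes [COLUMN]: copy stdin to stdout, moving every trailing
# backslash to COLUMN (default 76). Other lines pass through unchanged.
align_backslashes() {
    awk -v col="${1:-76}" '
        BEGIN { fmt = "%-" (col - 1) "s\\\n" }   # e.g. "%-75s" plus backslash
        /\\$/ {
            sub(/[ \t]*\\$/, "")    # drop the old backslash and its padding
            printf fmt, $0          # pad with spaces, then re-append it
            next
        }
        { print }
    '
}

# Typical use:  align_backslashes 72 < recipe.rc > recipe.aligned
```

Only trailing whitespace before the backslash is touched, so the commands themselves are left as they were.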

Guys, Emacs is available for every platform, even for Windows95 and WindowsNT. So, go ahead and install one if you haven't already. Setting up your personalized Emacs may involve a steep learning curve, but it's well worth the effort :-)

22.4 Emacs and browsing mailbox files

If you use Gnus as your MUA, then you already can browse mailboxes. If you just want to read some arbitrary mailbox without firing up Gnus, then you can use package tinymbx.el, in kit http://tiny-tools.sourceforge.net/ It defines a special mailbox reading minor mode that is activated when you visit a mailbox file. You can copy, file, delete messages or mail the author of the current message. There is no separate summary buffer as in RMAIL, but you move from one message to another with PgUp and PgDown keys.

22.5 Emacs and live-mode.el

http://www.zanshin.com/~bobg/ from bobg@ipost.com Bob Glickstein 1997

...`live-mode' is a minor mode that works like the "tail -f" Unix command. If the file grows (or changes in any other way) on the disk, then the buffer copy is periodically updated to show the new file contents. This makes live-mode ideal for viewing such things as log files. --Bob

You definitely want this if you browse procmail log files. This package updates the log file buffers whenever they change on disk. You can think of it as biff if you record incoming mail to a short $BIFF log.

22.6 Emacs and font-lock.el

Font-lock comes standard in Emacs releases. You can colorize your procmailrc if you use font-lock. Here is some lisp code; put it in your .emacs and reload it with M-x load-file. When you load a file that matches procmailrc or procmail.log, the font-lock attributes for the file get set. Change the regexp if your procmail filenames are different.

      (add-hook 'find-file-hooks 'my-find-file-hooks)

      (defun my-find-file-hooks ()
        (require 'cl)

        ;;   colors are available to Emacs only under X window

        (when (and window-system
                   (fboundp 'font-lock-mode)) ;; make sure this is present
          (cond
           ((string-match "procmailrc" buffer-file-name)
            (setq font-lock-keywords
              (list
                '("#.*"             . font-lock-comment-face)
                '("^[\t ]*:.*"      . font-lock-type-face)
                '("[A-Za-z_]+=.*"   . font-lock-keyword-face)
                '("^\\*.*"          . font-lock-doc-string-face)))

            ;; Turn the fontifying mode on if it's not on already

            (unless font-lock-mode (font-lock-mode 1)))


           ((string-match "procmail.log" buffer-file-name)

            ;;  The strings "" in the procmail log makes font-lock crazy,
            ;;  We kill the String class from the buffer with
            ;;  these statements.
            ;;

            (let ((table (make-syntax-table)))
              (modify-syntax-entry ?\" "_" table) ;; Change "
              (set-syntax-table table))

            (setq font-lock-keywords
              (list
               (cons "Opening "       'font-lock-type-face)
               (cons ".* error .*"    'font-lock-keyword-face)
               (cons "Folder:"        'font-lock-type-face)))
            (unless font-lock-mode (font-lock-mode 1))))))
      ;; End code


23.0 Procmail, Emacs and Gnus

23.1 Gnus pointers

Gnus
http://www.ifi.uio.no/~larsi

Gnus and procmail
http://www.ifi.uio.no/~larsi/www.gnus.org/manual/gnus_6.html#IDX1501

23.2 Why use procmail with Gnus

Gnus has very powerful mail split methods, and one normal reaction against the need for procmail is: "Hey, Gnus does my mail splitting, I don't need procmail". The difference between Gnus and procmail splitting is quite easily explained: you want procmail to preprocess the mail before Gnus ever sees it, and then postprocess the mail with Gnus (read it, move mail from one inbox to another).

Case1: Gnus and regular mailbox, no procmail. Gnus reads directly one huge mailbox where all incoming messages are. When the user starts Gnus, it slurps in the whole mailbox and starts splitting the mail according to its split rules.

      mail -> $MAIL --> fire up Gnus  --> split1.mbx split2.mbx ....

Case2: procmail and Gnus. The mail is always delivered to procmail first. Procmail is free to put the mail anywhere or just let it drop to the user's default inbox, usually pointed to by the environment variable $MAIL.

      mail -> procmail                --> Post processing with Gnus
              [the  ~/Mail/spool]
              --> split1.mbx
              --> split2.mbx
              [The default procmail rule drops to inbox]
              --> $MAIL

You can let Gnus process the messages further, like moving messages from one inbox to another.

Summary

  • If you use procmail, the incoming messages are immediately categorized. The incoming mail is put in the folder of your choice. The mailboxes are there waiting for you all the time. You can use less or more to view them in a hurry.
  • If you don't use procmail and let Gnus do all the splitting, you always see one huge inbox, $MAIL. It will not be split until you fire up Emacs and Gnus. If you're in a hurry, you may not have time to start Emacs & Gnus before reading the important messages. Your only option is to read all messages in $MAIL and try to find the ones that concern, e.g., your work.

So, let procmail drop messages to their inboxes and let Gnus possibly "fine process" these inboxes.

23.3 Setting up gnus for procmail - Basics

Procmail and Gnus communicate with each other very nicely when you use mail backends like nnml, nnmh and nnfolder. See Emacs info Gnus::Node: Select Methods for more.

Here are step by step instructions for reading the mail with the nnml mail backend. We suppose that you have the following definition in your procmailrc so that the incoming mail is delivered to the right directory.

The important point here is that the name of the Gnus nnml group is identical, except for the .spool suffix, to the spool file where procmail writes. So if you write to list.procmail.spool, the group in Gnus is named nnml:list.procmail

      #  .procmailrc excerpt

      PMSRC       = $HOME/pm
      MAILDIR     = $HOME/Mail
      SPOOL       = $MAILDIR/spool
      RC_LIST     = $PMSRC/pm-jalist.rc

      #  The file name must be list.xxxxx.spool in order for
      #  `nnml' to work in Gnus. Define procmail mailing list

      PROCMAIL_SPOOL = $SPOOL/list.procmail.spool

      #  GNUS must have unique message headers, generate one
      #  if it isn't there. By Joe Hildebrand hildjj@fuentez.com

      :0 fhw
      | $FORMAIL -a Message-Id: -a "Subject: (None)"

      #   detect mailing lists and store messages to spool directory

      INCLUDERC = $RC_LIST

      :0 :
      * ! LIST ?? ^^^^
      $SPOOL/list.$LIST.spool

  • Copy the Lisp code below to your ~/.gnus
  • Start Gnus with M-x gnus-no-server (M-x means ESC followed by x). You will see the Group buffer appear.
  • Make the new group with G m list.procmail RET nnml RET. You can read the group as usual and query new mail with the g command.
      (setq
       gnus-secondary-select-methods '((nnml ""))
       ;; See also nnmail-procmail-suffix which is .spool by
       ;; default
       ;;
       nnmail-use-procmail        t
       nnmail-spool-file          'procmail
       nnmail-procmail-directory  "~/Mail/spool/"
       nnmail-delete-incoming     t)

And then I have procmail always deliver to ~/Mail/spool/. If you add more inboxes, create them inside the Gnus Group buffer with G m.
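
If you want to peek at what is waiting before starting Emacs, the naming rule (spool file name minus the .spool suffix equals the nnml group name) can be sketched in a few lines of shell. This is only an illustration and not part of Gnus or procmail: the function name `spool_groups` is made up here, it assumes the spool directory and `.spool` suffix used in this section, and it counts messages by the mbox `From ` separator lines, which can overcount if a message body contains an unescaped line starting with `From `.

```shell
#!/bin/sh
# Hypothetical helper: list each procmail spool file as the Gnus nnml
# group it feeds, together with an approximate message count.
# Assumes mbox-format spool files named <group>.spool.

spool_groups() {
    dir=${1:-$HOME/Mail/spool}
    for file in "$dir"/*.spool; do
        [ -f "$file" ] || continue                    # no spool files at all
        group=$(basename "$file" .spool)              # list.procmail.spool -> list.procmail
        count=$(grep -c '^From ' "$file" || true)     # mbox separator lines
        printf 'nnml:%s %s\n' "$group" "$count"
    done
}
```

Calling `spool_groups ~/Mail/spool` prints one line per spool file, the group name followed by the count.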

23.4 Gnus for procmail - More gnus

Okay, let's continue our journey in Emacs. What you read previously was the minimum you needed to get your Gnus to read procmail delivered files. However, if you're new to Gnus, here are some more tips and basic instructions. The best advice I can give is that you go to each buffer: in Group, press G C-h, and in Summary, press C-h m, and print out the commands that you see listed.

In Group buffer

  • When you press g to get new mail to these groups, the group disappears if there is no mail. If you want the group to be permanently visible, then set
      (setq gnus-permanently-visible-groups "^nnml\\|^nnfolder")

      In emergency, press `L' to list all groups.

  • If you made a mistake and wrote list.procmaill with an extra l accidentally in the group name, use G r to rename the group.

  • Raise or lower the priority of your procmail mail groups with S l. Values 1, 2 or 3 are good. Consider reserving 1 for your primary mail and 2 and 3 for mailing lists.

  • When you exit a group and have read some articles, they won't show up next time you go there. But if you give a prefix argument before entering the group with SPC, Gnus will list all read articles. You give the command like C-u SPC, where C-u is the prefix argument.

Settings

  • You want gnus to tell you everything it does
      (setq gnus-verbose 10)  ;; 0..10

  • You expire articles (get permanently rid of them) with the 'E' command in the Summary buffer. The default expiry time is 7 days. You can define the expiry time in days with
      (setq nnmail-expiry-wait 7)

  • If you read mailing lists, you want automatic expiry when you have read the article. Use the following to set up groups that use this automatic expiration.
      (setq gnus-auto-expirable-newsgroups
          (concat
           "procmail"
           "\\|other-list"
           "\\|and-some-other-list"))

      (setq gnus-uncacheable-groups "^nn\\(virtual\\|m[hlk]\\|db\\)")

23.5 Emacs and Gnus -- Fiddling with spool files

Well, to tell you the truth, managing Gnus is scary at first: you can make a lot of mistakes along the way or otherwise change your mind about group names and so on. It's a tricky task to move mail from one directory to another if you decide to rename the spool file where procmail is putting the filtered mail.

Let's take an example: say you decide to change the spool file name list.procmail.spool to mail.procmail.spool, because you come to think that all your mail groups should have the same prefix "mail." in your Gnus group buffer. You already changed procmail to output to that file, so now you have two files sitting in your spool directory.

      ~/Mail/spool/list.procmail.spool
      ~/Mail/spool/mail.procmail.spool    # make sure this exists

      % rm ~/Mail/spool/list.procmail.spool
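
Before deleting the old spool file you probably want to be sure that no unread mail is left in it. Here is a cautious sketch, under the assumptions that procmail has already been switched to the new file, that both files are plain mbox files, and that nothing else is writing to them at that moment (the sketch does no locking). The name `move_spool` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append leftover mail from the old spool file to the
# new one, then remove the old file. No locking is done here; run it only
# while procmail and Gnus are not touching either file.

move_spool() {
    old=$1
    new=$2
    [ -f "$new" ] || { echo "missing: $new" >&2; return 1; }
    if [ -s "$old" ]; then
        cat "$old" >> "$new" || return 1    # keep the old file if the append fails
    fi
    rm -f "$old"
}
```

For the example above the call would be `move_spool ~/Mail/spool/list.procmail.spool ~/Mail/spool/mail.procmail.spool`. The function refuses to do anything if the new spool file does not exist yet.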

23.6 Gnus and article snippets

[These articles have been collected from the GNUS hypertext archive]

I'm also a bit confused with the proposed solution of having procmail filter incoming mail in a nnmail-procmail-directory instead.

You have Procmail stuff mail in spool files, pre-sorted and filtered. Gnus then picks these up and stuffs the messages in the appropriate groups. Gnus uses movemail to actually move the mail out of the spool, and movemail uses locking that Procmail understands, so there is no danger of mail loss.

Why are nnfolder-directory and nnmail-procmail-directory two different directories if nnmail-procmail-directory will contain the mail boxes that procmail appends to and nnfolder-directory is supposed to be "All the nnfolder mail boxes will be stored under this directory"?

Because Procmail should stuff its mail in different folders, not in the ones that your regular mail is stored in.

Is the idea to have Gnus use nnmail-procmail-directory as a temporary directory that it draws from to process and then deposit nnfolder mailboxes in the nnfolder-directory?

Yep -- Jason L Tibbitts III (tibbs@hpc.uh.edu)


Procmail settings
      (setq nnmail-use-procmail t)
      (setq nnfolder-directory "~/gMail/")
      (setq nnmail-spool-file 'procmail)
      (setq nnmail-procmail-directory "~/incoming/lists/")
      (setq gnus-secondary-select-methods '((nnfolder "")))
      (setq nnmail-procmail-suffix "")

Procmail is adding incoming mail to ~/incoming/lists/listname. The nnfolder groups I subscribed to are named "nnfolder:lists.listname". Gnus does create the ~/gMail/lists directory with a zero length file in this directory for each list, but doesn't move any mail over, and so it thinks I have "No more unread newsgroups".

      (nnmail-get-spool-files)

After much experimentation, I finally got movemail to work. I changed nnfolder-directory to "~/gMail/lists/" and Gnus now moves mail from "~/incoming/lists/" to corresponding groups in "~/gMail/". My problem seems to be solved, but still these workings seem counter-intuitive to me. By what the manual has to say about nnfolder-directory I would think Gnus should build the nnfolder groups in "~/gMail/lists/" instead given my definitions.

I think nnmail expects the spool files to be called "~/incoming/lists.whatever", not "~/incoming/lists/whatever".

      (setq nnmail-procmail-directory "~/incoming/lists/")

I thought you said the groups were called "lists.whatever"? So the spool files were called ~/incoming/lists/lists.whatever.spool, then?

23.7 Emacs GNUS - POP - Procmail

Is it possible to get new mail via POP, run it through procmail (for quick things like trashing junk mail and archiving mailing lists) and then have Gnus do its own mail processing? This is basically what I do now with procmail in my .forward file and all output going into ~/.MailBox for Gnus to find.

[Mark Moll (mmoll@cs.cmu.edu) 08 May 1997] First, let Gnus know that you're using procmail:

      (setq nnmail-use-procmail t
      nnmail-procmail-directory "~/Mail/spool/"
      nnmail-procmail-suffix ""
      nnmail-spool-file 'procmail)

Second, let Gnus pop your mail every 5 minutes and invoke procmail:

      (defun mm-pop-mail () (interactive)
          (call-process "/usr0/mmoll/bin/procinc"))
      (gnus-demon-add-handler 'mm-pop-mail 5 t)
      (gnus-demon-init)

Finally create the following script (called procinc in the previous step):

      #!/bin/sh
      MOVEMAIL=/usr/local/lib/xemacs-19.14/lib-src/movemail
      ORGMAIL=$HOME/.newmail
      $MOVEMAIL kpop://ux2.sp.cs.cmu.edu/mmoll $ORGMAIL
      # this is copied from the procmail (1) man page:
      if cd $HOME &&
      test -s $ORGMAIL &&
      $HOME/bin/lockfile -r0 -l3600 $HOME/.newmail.lock 2>/dev/null
      then
      trap "rm -f $HOME/.newmail.lock" 1 2 3 15
      umask 077
      $HOME/bin/formail -s $HOME/bin/procmail < $ORGMAIL &&
      rm -f $HOME/.newmail.lock
      fi
      rm -f $ORGMAIL
      exit 0

Instead of using a demon you can, of course, also pop your mail manually by pressing g in the Group buffer if you add the following line to your .gnus:

      (add-hook 'gnus-get-new-news-hook 'mm-pop-mail)



From: Markus Dickebohm m.dickebohm@uni-koeln.de 1997-06

Recently I switched to procmail to filter some mails from high volume mailing lists out of my inbox (I don't like my mail notifier to blink every few seconds).

Personal mails and mails from some low volume lists stay in /var/spool/mail/$USER.

I set nnmail-use-procmail and both the personal mails and the procmail-filtered mails are incorporated into Gnus. That's exactly the way I like it.

Today I started Gnus and two new nnml groups showed up. The reason was that the procmail rule produced a file "ding.spool" while the nnml group I used for this list via the nnml-split-method variable was "Ding".

This behavior shows that Gnus doesn't split the procmail filtered mails again. I understand from the manual that the variable nnmail-resplit-incoming is responsible for that. Do I have to set this variable or is it OK to get the procmail rule and nnmail-split-method in sync?

The manual says: "This also means that you probably don't want to set nnmail-split-methods either, which has some, perhaps, unexpected side effects."

This is not what I want, since the remaining mails in /var/spool/mail/$USER should be split further by Gnus. Do I really have to decide to use procmail or nnmail-split-method, or is it justified to get the best from both?

!!

In the Info file, section `Mail & Procmail' (or so), I read:

... If you use procmail, you should set nnmail-keep-last-article to non-`nil', to prevent Gnus from ever expiring the final article in a mail newsgroup. This is quite, quite important.

Why? I thought this was important only if the nnmail-use-procmail variable is nil and the .overview files are updated with a script. When nnmail-use-procmail is t and procmail writes its stuff to the spool files, (ding) knows everything about all its messages.

... being able to reliably deliver mail directly to (ding)'s nnmh directories, for example, using procmail would be very nice...

As already hinted at by Per Abrahamsen this is possible as long as you don't move or copy articles (within) ding into these directories. Just set nnmail-keep-last-article to be true.

But that's an awfully big exception to what would be a rather nice feature. Certainly filing mail into different mail groups is something I do on a regular basis. That's why I am advocating pre- and post-hooks for all modifications to the overview/active information. With that in place, it would be possible to use a locking mechanism to prevent procmail and (ding) from both trying to modify these files at the same time. Then, copying and moving messages between mail groups during procmail deliveries would be 100% reliable.

Unfortunately, there's no simple way to allow moves and copies into groups that have external delivery agents. The pre- and post-hooks stuff will solve the problem of safe overview / active file update. This is only part of the problem for move/copy. If an article has arrived since you last checked for new news, then ding doesn't quite "see" it (as it doesn't "see" new news until you ask it to look). What's needed here is for ding to update its notion of what the last article in the group actually is before doing the move/copy -- i.e., to run a local *-get-new-news (of course, locking via a hook is still required).

Adding this will need a lot of mucking around with the internals, the way things currently stand.

Another approach entirely might be to wait until the stuff that was discussed for IMAP gets added -- where ding asks the backend for all information and doesn't maintain any state in .gnus. It'll be simple then to make the backend check for new mail before actually copying/moving the article -- ding won't have to be fooled as to what the actual article numbers are. You could add something like this right now, but I think it'll really stretch the code some. (cf. gnus-cache.el for the meaning of "stretch" :-).


24.0 RFC, Request for comments

24.1 RFCs and their jurisdiction (munged Addresses)

Try dejanews <power search> Groups: gnu.emacs.gnus Search: RFC

The real implementation of news software doesn't care if the from field is munged or not

[1998-03-25 gnu.emacs.gnus, Marty Fouts fouts@null.net] The point of the argument is: The RFCs don't demand what those who would quote them to suppress munging claim they do. In particular, RFC 1036 is advisory, an attempt to describe how netnews works with NNTP. In the case of header munging, RFC 1036 does not describe the way the software works in the field. There is no reason to cite an advisory RFC that in many ways is incorrect to support an untenable position.

Note: Marty is an IETF USEFOR member and has a good understanding of how the RFCs should be interpreted. See gnu.emacs.gnus 1999-02-08 and thread / Re: "Sender" field/. <URL:http://search.dejanews.com/msgid.xp?MID=%3Cy1ud83pre7w.fsf@acuson.com%3E&format=threaded&maxhits=200>

[1997-11-05 gnu.emacs.gnus, Marty Fouts] No RFC forces the address of the poster to be a reachable address (indeed, Sender: is sometimes user@host without the domain part) -- it only requires such addresses to be syntactically correct. (The RFCs do not require anything. The RFCs related to Usenet are advisory. RFCs describe various things and define a small number of standard protocols; netnews is not an internet standard protocol.)

The bottom line WRT RFCs that are informational is that when there is an ambiguity, or a difference between the RFC and the implementation, the implementation (which is what the RFC was trying to describe in the first place) takes precedence.

As much as y'all want it to be otherwise, the implementation of netnews (i.e. INND, NNTP) doesn't care about whether or not an address can be replied to. It is rumoured that some news posting software checks the validity of an address. Such software is in a tiny minority.

[counter argument 1998-03-25 gnu.emacs.gnus, Jan Vroonhof vroonhof@frege.math.ethz.ch] Now although INND and friends are important parts of the Usenet software bundle, the news READERS are even more important. Now I'll bet 99% of readers, like for instance Gnus, assume the address in the header is the address to be replied to when the user requests to go into a private discussion with the author (i.e. reply instead of followup).

[marty] netnews is a public forum. mail is a private communication medium. Posting in a public forum does not require that I give you access to my private address, just as speaking at a public meeting does not require that I give you my unlisted phone number.

One thing is for certain: munging puts the burden on anyone wishing to send mail to you, by requiring them to decipher the address. Someone may never "reply by mail" to persons using those phony addresses. Anyone who wishes to send a personal mail cannot just hit 'reply'. People who do this accept this, which is why they will watch the newsgroups for followups regularly. If someone eagerly wants to get personal, he can spend the extra minute to decipher the correct address for the person. --Marty

[counter argument, vroonhof] However if you don't want to give me your phone number, why give me a false one? If people with this desire at least put only their name and had no "<address>" part then one could have the news reader say "Reply impossible, no address given".

[Counter argument, unknown] When I was using Pegasus Mail (Win95), it took me about 10 minutes to set up filters that removed over 75% of the spam I received. 10 minutes is too great a burden to you? MY, what a busy person you are.

[timothy] What about the accounts which I do not control (network at work), where I do not have a say over what software is installed? I can say to the sysadmin ``Hey I'd like Pegasus mail installed'' and he nods and mumbles something. He's got 2 years worth of backlog from there not being a real sysadmin around.

[Counter argument, unknown] Furthermore, there are a number of procmail recipes available on the net that can be used with minor adjustments to filter your mail. No heavy-duty Unix skills are required. Just the initiative to take responsibility for your own problems.

I know procmail very well, and spammers are still getting through. You know why? They refuse to follow all the conventions we depend on. And they spam mailing lists, so I have to filter for that as well. I have spent untold hours trying to develop better and better filters with lower numbers of mis-hits. Nothing works as well as not giving more spammers my address.

...You simply prefer to put the problem off on somebody else, rather than take the time to deal with it yourself. Well, that kind of laziness does seem to predominate in the "world of the Internet" these days.

I have spent the time, learning from what others have done and seeking to improve them. You are certain you are right and refuse to think about it anymore.... and that kind of laziness is all over the Internet.

The only ones it wrongly inconveniences are those who need to mail me and have lost my mail address. If you want to followup a Usenet post, do it in Usenet. I'll be back here for followups. I get enough mail, and don't need mail for Usenet threads.

If you would like me to use a real address, please set me up an account with procmail where I can get all my Usenet related messages sent. --Timothy

24.2 Comments about addresses munging

[1998-03-24 gnu.emacs.gnus drwho@No-Spam-see-sig.xnet.com]

...I am well aware that it is bad behavior, as I am well aware that it breaks standards. However, I'm also well aware of the fact that I do not need to have a mail-box filled with spam every time I look at it. Things have quieted down considerably since I started altering my From: line. There's still the occasional one that gets me, though. It's not really such a big deal right now, but after following the net-abuse newsgroups for a while, it has become apparent to me that spammers are trying new tactics to grab mail addresses (msg id's, sender: lines, etc...).

Since I have to download most of my mail from a POP3 account, it takes time that I don't have to wait for all that spam to download. If breaking my headers means getting a few moments peace and freedom from spam, then so be it.

[M. Maxwell drwho@No-Spam-see.sig 1998-03-26 gnu.emacs.gnus]

...Believe me, I don't like having to do this <<header munging>> at all. But it saves me considerable aggravation. I also don't have to download my mail from a POP3 server (my ISP has a shell account), but I prefer to read mail offline simply because I get so much of it with all those mail lists,

And since that's the case, I end up downloading plenty of junk along with the legit mail, after which, my local procmail puts it where it belongs. In other words, not in my inbox. And so I'll do what I have to to foil the spammers (until we get some sort of legislation passed on junk mail). And those that do get past the fouled headers are dealt with accordingly.

24.3 RFC and valid mail address characters

What characters are legal in e-mail addresses? So far, I have uppercase, lowercase, digits, _ - + . @

[elijah] Most any 7bit character. For all practical purposes whitespace (space, tab, newline) is really inadvisable. This post is from a valid address. I also have ones with control characters -- eg <@qz.to> (may not show up right in your newsreader). See RFC822 for the full rules on generating an address, but the quick and dirty thing is any of the "specials" must be quoted to be used.

      See definition of `specials' in RFC
      specials    =  ()<>@,;:\.[] and a double quote

If you don't believe me, there are mail toys to prove this. Best one I know of right now is Tom Phoenix's "fred&barney"@redcat.com address. You can replace the "&" with just about any string I believe. I've tried it with stuff like "fred($)barney"@redcat.com and it seems pretty stable.
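
To make the quoting rule concrete, here is a small shell sketch that decides whether the local part of an address has to be written as a quoted string. It is an illustration only: `needs_quoting` is a made-up name, the character list is the RFC 822 `specials` set plus space and control characters, and the dot is treated as an ordinary separator between words, so malformed dot placement (leading, trailing or doubled dots) is not detected.

```shell
#!/bin/sh
# Hypothetical check: does an address local part need RFC 822 quoting?
# Returns 0 (true) if it contains a special other than ".", a space,
# or a control character; returns 1 otherwise.

needs_quoting() {
    # Bracket expression: ] [ ( ) < > @ , ; : \ " space and control chars
    printf '%s' "$1" | LC_ALL=C grep -q '[][()<>@,;:\" [:cntrl:]]'
}

for local_part in 'jari.aalto' 'fred&barney' 'fred($)barney'; do
    if needs_quoting "$local_part"; then
        echo "$local_part: quote it"
    else
        echo "$local_part: fine as is"
    fi
done
```

Note that "&" is not among the specials, so strictly speaking the quotes around fred&barney are optional, whereas fred($)barney contains parentheses and does need them.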

24.4 RFC and login-name@fqdn

[1998-06-08 Message-ID: wkd8cjekay.fsf@mjf.vip.best.com Marty Fouts Usenet-user@usa.net in gnu.emacs.help. Refer also to summary of the whole thread in 1998-06-11 Message-ID: wk4sxs62ll.fsf@mjf.vip.best.com by Marty Fouts.]
      >>>>> In article x7g1hfu2sf.fsf@gkar.prescienttech.com,
      >>>>> Rich Pieri rich.pieri@prescienttech.com enscribed:

      > -----BEGIN PGP SIGNED MESSAGE-----

      > Marty Fouts writes:

      >> Sort of: system-name is not a hook into gethostbyname. The
      >> /variable/ system-name is set by a builtin defvar to
      >> gethostbyname. system-name returns the value of the /variable/
      >> system-name, and the emacs lisp manual advises setting it if it
      >> is not correct.

      > It still uses gethostbyname() to set the initial value.
      > gethostbyname() is supposed to return an fqdn on a networked
      > host.

So? That the initial value is an FQDN is no indication that the value returned at any time thereafter will be. This is why emacs doesn't use system-name to create mail addresses, but has a separate function. If emacs itself doesn't rely on system-name to generate any mail addresses, why should gnus?

      >>> user@fqdn is the agent responsible for submission of a
      >>> message to the network. user@fqdn is the RFC sender of the
      >>> message. user@fqdn therefore must be made to be a valid
      >>> mailbox.

      >> This is just flat out wrong. There is no such requirement in
      >> any RFC or implied by any combination of RFCs.

      > Premise: Gnus is used interactively. Premise: "user"
      > (user-login-name) is the login name of the person using Gnus.

And that's where you fail first. There is no requirement anywhere in any RFC or combination of RFCs that a login name even exist. Although your premise is true, it is irrelevant to your conclusion, as explained below.

      > Premise: "fqdn" (system-name, self-referential gethostbyname) is
      > the canonical network host name of the machine "user" is using at
      > the time.

And that's where you fail second. There is no requirement anywhere in any RFC or combination of RFCs that the machine "user" is using be exposed as a part of a mailbox. I am /allowed/ to do that, and if I do that I am required to support that mailbox as valid. I am not /required/ to do that.

I've already cited, and will repeat, that a TIP is a good example of such a machine. So is a POP3 client. You are missing some more premises, most notably that user@fqdn is the sender of the message in the sense of any RFC or combination of RFCs.

Most importantly, you are missing some steps in your logic.

Nor will you be able to, because there are no such requirements.

To put this as simply as possible:

You are incorrect to assert that there is any requirement that a system support the mapping from (login-name,FQDN) to a mailbox of the form login-name@FQDN.

Once you understand that this assertion is incorrect, it should be easy to see that all assertions derived from it are incorrect.

24.5 RFCs and messages signature

http://www.chemie.fu-berlin.de/outerspace/netnews/son-of-1036.html

According to universal de facto Net convention, there must be "\n-- \n" before the signature. The extra space in the signature delimiter tells that it is the user's signature and not the Message Digest, which uses the delimiter "\n--\n". There is no RFC that would address this though.

And by the way: it's rude to have a sig longer than 1-3 lines. Better yet, move the repetitive information to the X-headers if your MUA supports modifying the headers.

NOTE: The choice of delimiter is somewhat unfortunate, since it relies on preservation of trailing white space, but it is too well-established to change.

[Paul O. Bartlett pobart@access.digex.net] Eg. When one is writing text, the preferred Un*x editor routinely truncates trailing blanks when writing a file, so that even if there were "-- " in the signature, Pine includes it automatically as part of the editable text, and the editor would simply truncate the blank. The signature delimiter may be "too well-established to change," but it collides with the reality of the tools people use.
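
The convention is easy to act on mechanically. As a hedged illustration (the function name `strip_sig` is invented here), the following shell sketch removes everything from the first line that is exactly "-- " (two dashes and one trailing space) to the end of a message body. A line holding only "--" without the space is left alone, which is exactly where whitespace-trimming editors cause trouble.

```shell
#!/bin/sh
# Hypothetical filter: drop the signature from a message body read on stdin.
# The delimiter is a line that is exactly "-- " (dash dash space).

strip_sig() {
    sed '/^-- $/,$d'
}
```

Usage would be something like `strip_sig < body.txt`, where body.txt is an assumed file holding only the message body, without headers.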

24.6 RFC and using MIME in Usenet newsgroups

[1999-02-12 Marty Fouts gnus@.fogey.com in gnu.emacs.gnus, in Message-id: wklni3b3gl.fsf@Usenet.nospam.fogey.com]

The use of MIME is debatable. The use of MIME in a USENET posting is inexcusable, except for the case covered in draft by:

Insofar as there exist authorities empowered (by common consent or otherwise) to define what is and is not proper in various hierarchies or newsgroups or cooperating subnets, those authorities ought to establish, by means of rules, guidelines, charters or whatever else, the practices considered acceptable within their domains. In particular they ought to establish which of the more exotic content types are likely to be inappropriate. In the absence of such specific guidance, the following default recommendations are offered as an indication of best practice at the present time.

Note that the comment "is inexcusable" is my opinion. The draft, contrary to your apparent understanding, merely gives -guidelines- for how to use mime headers.

If you, or anyone else, feels that the draft replacement for RFC 1036 needs to be worded differently, you are welcome to join the task force and attempt to persuade the members of this. However, a warning is in order: the process has been ongoing for several years, deadlines approach, and this particular issue has been argued in a great deal of detail.

24.7 Some RFC Pointers

http://www.cis.ohio-state.edu/hypertext/information/rfc.html
http://www.nexor.com/public/rfc/index/rfc.html

  • rfc821 SMTP protocol, see also rfc959 FTP protocol standard
  • rfc822 Format of internet messages (formerly called Arpanet). A new draft that is likely to replace 822 is at: ftp://ftp.ietf.org/internet-drafts/draft-ietf-drums-msg-fmt-04.txt
  • rfc1036 Standard for interchange of USENET messages (the news message format: From, Date ...). Check also son-of-1036.html mentioned earlier.
  • rfc1153 Digest message format (1990, Status: EXPERIMENTAL)
  • rfc1738 URL specification, mailto, http, <URL:address>. Consult rfc2396, which supersedes rfc1738: the <URL:...> wrapping has been de-recommended by popular demand. "define a single, generic syntax for all URI". See also rfc2369 "The Use of URLs as Meta-Syntax for Core Mail List Commands"
  • rfc1855 Netiquette Guidelines 1995
  • rfc1991 PGP Message Exchange Formats
  • rfc2076 Common Internet Message Headers
  • rfc2045,6,7 MIME
  • rfc2111 Content-ID and Message-ID Uniform Resource Locators. Also rfc1341
  • rfc2142 Mailbox names for common services, roles and functions

More Details

  • Common Internet Message Headers http://andrew2.andrew.cmu.edu/rfc/rfc2076.html


25.0 Introduction to E-mail Headers

25.1 To find out more about mail (Resources)

All about Email headers
http://www.stopspam.org/email/headers/headers.html ...This document is intended to provide a comprehensive introduction to the behavior of mail headers. It is primarily intended to help victims of unsolicited mail ("mail spam") attempting to determine the real source of the (generally forged) mail that plagues them; it should also help in attempts to understand any other forged mail. It may also be beneficial to readers interested in a general-purpose introduction to mail transfer on the Internet.

[See also RFC pointers in the RFC section]

IMC -- Internet Mail Standards
http://www.imc.org/mail-standards.html

FAQ archive
http://www.FAQs.org/FAQs/

RTFM ftp archive - Read the fine manual
ftp://rtfm.mit.edu/pub/Usenet-by-hierarchy/comp/mail/

Sendmail
ftp://rtfm.mit.edu/pub/Usenet-by-group/comp.mail.misc/sendmail_FAQ

UNIX EMail Software
http://www.FAQs.org/FAQs/mail/setup/Unix/part1/index.html ...This document is intended for system administrators who need to know how to set up their UNIX systems for mail communication with the outside world...UUCP, Addresses, Domain Addresses, FQDN, NIC, MX record, Bang-Paths, Gateways, Routers, Smarthost, MIME, X.400, "The maps", Aliases

Plus addressing
http://www.FAQs.org/FAQs/mail/addressing/

Understanding E-Mail Addresses, DNS, Gateways
http://www.uiuc.edu/uiucnet/3-2-1.html

The Unix MBOX, Berkeley, format
http://www.qmail.org/qmail-manual-HTML/man5/mbox.html ...This format comes to us from the ancient UNIX mail program, V7 /bin/mail...Each message ends with two blank lines

[1998-09-06 PM-L Dallman Ross dman@netcom.com] I would have thought the connection to Berkeley was /usr/ucb/mail (a.k.a. "Mail," with a capital "M"); not /usr/bin/mail (a.k.a. "/bin/mail"). ("UCB" stands for "University of California, Berkeley.") The two are close, though different enough that I get messed up if I try to use /bin/mail for much. But "ancient UNIX mail program"? I use and prefer /usr/ucb/mail whenever I'm in a UNIX shell. Many others do, too. <Yeesh.> (I don't like pine. It feels too GUI.)

Okay, sorry for the digression, but you all were talking about the RFCs and From_ lines. If it's called "Berkeley Mail Format," then I'd infer it comes from Berkeley Mail.

Listspec -- proposed standard from mailing list messages
http://www.within.com/~chandhok/ietf/listid.sHTML http://www.ietf.org/internet-drafts/draft-chandhok-listid-03.txt

Internet mailing lists have evolved into fairly sophisticated forums for group communication and collaboration; however, corresponding changes in the underlying infrastructure have lagged behind. Recent proposals like [LISTSPEC] and now [RFC2369] have expanded the functionality that the MUA can provide by providing more information in each message sent by the mailing list distribution software.

In order to further automate (and make more accurate) the processing a software agent can do, there needs to be some unique identifier to use as an identifier for the mailing list. This identifier can be simply used for filter string matching, or it can be used in more sophisticated systems to uniquely identify messages as belonging to a particular mailing list independent of the particular host delivering the actual messages. This identifier can also act as a key into a database of mailing lists.

Literature

Dr. Bob's Painless Guide to the Internet : & Amazing Things You Can Do With E-Mail by Bob Rankin No Starch Press ISBN: 1886411093 List Price: $12.95

Netiquette by Virginia Shea Paperback 1 Ed edition (May 1994) Albion Books ISBN: 0963702513 Amazon.com Price: $19.95

The Elements of E-Mail Style : Communicate Effectively Via Electronic Mail by David Angell, Brent D. Heslop Addison-Wesley Pub Co (C) ISBN: 0201627094 Paperback - 157 pages (April 1994) List Price: $12.95

All About Internet Mail (Internet Workshop Series, No. 7) by Lee David Jaffe Library Solutions Inst & Pr ISBN: 188220820X Amazon.com Price: $34.00

3 Rs of E-Mail : Risks, Rights and Responsibilities by Diane B. Hartman, Karen S. Nantz Crisp Publications Inc. ISBN: 1560523786 Paperback - 153 pages (June 1996) List Price: $12.95

E-mail Companion; Communicating Effectively Via the Internet and Other Global Networks by John S. Quarterman, Smoot Carl-Mitchell Addison Wesley Pub Co ISBN: 0201406586 Paperback - 318 pages (November 1994) List Price: $19.95

The Internet Message : Closing the Book With Electronic Mail (out of print) by Marshall T. Rose Prentice Hall (Sd) ISBN: 0130929417

Managing Mailing Lists: Majordomo, LISTSERV, Listproc, and SmartList By Alan Schwartz O'Reilly & Assoc. 1st Edition March 1998 ISBN: 1-56592-259-X 298 pages, $29.95 http://www.oreilly.com/catalog/mailing/

sendmail, 2nd Edition By Bryan Costales & Eric Allman O'Reilly & Assoc. 2nd Edition January 1997 ISBN: 1-56592-222-0 1050 pages, $39.95 <http://www.oreilly.com/>

25.2 Lecture by Alan Stebbens

<URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/cgi-bin/w3glimpse2HTML/procmail/1996-08/msg00098.html?69#mfs>

There are two general classes of headers: those generated automatically by the MTA, and those configured and inserted by the MUA, on the user's behalf.

The former, the ones generated by the MTAs, are used mostly for tracking the e-mail, and generally have nothing to do with the content of the mail, much like those bar-code labels FedEx uses to track packages.

The latter, the ones inserted by the MUA or by the user, are just like the shipping label the FedEx customer fills out, i.e., they determine the source, the destination, and describe the content of the mail.

It would be overburdensome for the user to generate all of these MUA headers themselves, so the user's mailer generates many or most of them automatically, typically under configuration control. Of course, the user can always override or replace the automatic MUA headers.

The MTA headers, on the other hand, are almost completely automatic and the user almost never can change them. Only under special circumstances should the MTA headers be inserted or modified by the user.

From the user's perspective, however, the e-mail process seems atomic, so that the distinction of these header classes is lost. Even some systems managers or postmasters fail to appreciate that it is during different stages of the e-mail process that different sets of headers get inserted.

To help clarify this distinction, here's a diagram of the e-mail process and its several stages:

      sender -> MUA -> MTA ->..-> MTA -> MDA ->{maildrop}-> MUA -> reader
      [1]       [2]    [3]        [4]    [5]                [6]

Headers typically provided by "template" by the MUA to the sender, usually during stage [1] (when composing e-mail):

      From:               # who I am
      To:                 # the target
      Cc:                 # people to keep informed, but need not respond
      Bcc:                # secret admirers
      Subject:            # what's the mail about
      Reply-To:           # highest priority return address
      Priority:
      Precedence:
      Resent-To:          # used for redirecting e-mail
      Resent-Cc:
      X-BlahBlah:         # personalized headers

When the sender is done composing, and says "send it" to his/her mailer, some additional headers may get inserted by the MUA at this stage [2]:

      Date:
      Resent-Date:        # if being redirected
      From:               # If not already present
      Sender:             # if a From: is already present
      X-Mailer:           # what MUA composed this message
      Mime-Version:
      Content-Type:       # what kind of stuff is in here
      Content-Transfer-Encoding:
      Content-Length:

When the MTA receives the e-mail from the MUA at stage [3], it may insert additional headers showing the origination of the e-mail:

      From                # if local e-mail, automatic or by -f option
      Date                # If not already present
      Message-Id:         # unique ID for the e-mail; the first MTA
                          # creates this
      Received:           # shows inter-system e-mail tracking info
      Return-Path:        # shows how to get back to the sender

As each MTA hands off the e-mail, additional headers may get added, all as part of the MTA to MTA handoff in stage [3]:

      Received:           # inserted by each MTA

As the final MTA hands the e-mail to a delivery agent (MDA), in stage [4], there are still some more header insertions which may occur:

      Apparently-To:      # added if no To: header exists
      From                # may get added if local e-mail

Some sites insert special rewrite rules and filtering to support virtual domains, and these header changes will occur at stage [5], just before the incoming mail is dropped. Generally, though, no new headers are added, except possibly one to avoid loops:

      X-Loop: $USER@$HOST # inserted to avoid filtering loops

Finally, at stage [6] when the reader views his/her e-mail, most MUAs will apply a filter to the stored mail causing selected headers to be omitted from the display. In a sense, then, this filtering "removes" the headers from the user's view (although no headers are actually removed by the MUA).

The headers typically omitted are those inserted by the MTAs, and those having to do with the transport process and less with the contents.
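
The two header classes described in this lecture can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the grouping of names into `MUA_HEADERS` and `MTA_HEADERS` is copied from the stage lists above and treated as fixed sets (real mailers vary), and the names `classify_headers` and `SAMPLE` are hypothetical.

```python
from email import message_from_string

# Groupings follow the stage lists in the lecture above; treating them as
# two fixed sets is a simplification made for this sketch.
MUA_HEADERS = {
    "from", "to", "cc", "bcc", "subject", "reply-to", "priority",
    "precedence", "resent-to", "resent-cc", "date", "resent-date",
    "sender", "x-mailer", "mime-version", "content-type",
    "content-transfer-encoding", "content-length",
}
MTA_HEADERS = {"message-id", "received", "return-path", "apparently-to"}


def classify_headers(raw_message):
    """Sort the header names of a message into 'mua', 'mta' and 'other'."""
    msg = message_from_string(raw_message)
    buckets = {"mua": [], "mta": [], "other": []}
    for name in msg.keys():
        key = name.lower()
        if key in MTA_HEADERS:
            buckets["mta"].append(name)
        elif key in MUA_HEADERS:
            buckets["mua"].append(name)
        else:
            buckets["other"].append(name)
    return buckets


# Hypothetical sample message used only for demonstration.
SAMPLE = (
    "Return-Path: <alice@example.org>\n"
    "Received: from mail.example.org by mx.example.com; "
    "Mon, 1 Jan 2001 00:00:00 +0000\n"
    "Message-Id: <1@example.org>\n"
    "From: alice@example.org\n"
    "To: bob@example.com\n"
    "Subject: test\n"
    "X-Loop: bob@example.com\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    print(classify_headers(SAMPLE))
```

A display filter of the kind used at stage [6] would, roughly, hide the names that land in the "mta" bucket.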

25.3 Applied to received messages

[alan] So, now that we have a common understanding...

The first "From" is a Unix-mail From_ header (note the space). This is inserted automatically by MTAs, unless one is already present and only then if it seems valid.

The second From: is generated by the MUA (your personal mailer), either by configuration, or by the user. The rewrite rules in sendmail and most filtering programs concern themselves with the From:, To:, Cc:, Reply-To: headers.

I'll assume that if "From smmi" is not "correct", then you must be trying to hide the delivery process, and implementing something of a virtual domain.

In general, it is a bad idea to "correct" the automatic mail headers inserted by the MTAs. This is a different matter than changing addresses to show virtual domains. The From_ header is part of the history of the message, showing how the mail was originated. Similarly, the "Received:" headers should not be messed with. Changing the history of an e-mail message will make it very difficult to diagnose e-mail delivery errors.

That being said, and, since I also believe in the freedom of choice, I will now supply you with "enough rope to hang yourself" :^)

There are two places where you can have the From_ header corrected: just before it gets dropped into the mailbox (for incoming e-mail), or as it gets submitted to the MTA (for outgoing e-mail).

Changing the From_ before it gets dropped is easy. Just use a recipe like this:

      FROM    = `$FORMAIL -zxFrom:`
      DATE    = ...construct the RFC date format

      :0 fhw
      | $FORMAIL -I "From  $FROM $DATE"

The From_ header is created automatically by the MTA (sendmail) when it receives a piece of mail. If the mail is sent through sendmail without using the '-f' option, then sendmail sets the default From_ to that of the current user. If you are not root, or a "trusted user" (see the sendmail man page), then sendmail will ignore the From_ header and either remove it altogether or replace it. Even if you are root, sendmail will replace the From_, if the e-mail is being received locally (as opposed to from the network).

If you wish to change the From_, you must invoke sendmail, as root or a "trusted user", and use the "-f" option. E.g., to set the From_ to match the From: header, use the following recipe, as root:

      :0 h
      FROM=|$FORMAIL -zxFrom:

      :0
      ! -oi -t -f"$FROM"

Please read the man page on sendmail, noting the use of '-f'.
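
The first recipe above leaves the construction of DATE elided. As a rough illustration of what a complete From_ line looks like, here is a Python sketch. It assumes the asctime(3)-style date stamp that mbox files conventionally carry, which may not be what the original author intended; the function name `make_from_line` is hypothetical.

```python
import time


def make_from_line(envelope_sender, when=None):
    """Build an mbox-style From_ line: 'From <sender> <asctime date>'.

    `when` is seconds since the epoch; None means "now".
    The asctime-style stamp is an assumption of this sketch.
    """
    stamp = time.asctime(time.gmtime(when))
    return "From %s %s" % (envelope_sender, stamp)


if __name__ == "__main__":
    print(make_from_line("alice@example.org"))
```

Note the single space after "From" and the absence of a colon; that is what distinguishes the From_ line from the From: header.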

25.4 Bcc lecture by Alan Stebbens

<http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/1996-11/msg00054.html>

Procmail most typically processes incoming mail at a destination site; the BCC formatting (or lack of it) is done on outgoing mail, at the originating site.

For this discussion, let's make distinctions as to the kinds of mail there are: (a) incoming mail, and (b) outgoing mail. Bcc's are inserted into outgoing mail by the user, and the message is then handed to a MUA. The MUA may then handle the BCC's or defer that to the Mail Transport Agent (MTA), such as sendmail. Whichever agent performs the Bcc function, that function is performed in at least three different ways:

  • Many MUAs format outgoing mail without the Bcc: headers, so that the same message header can be sent to all recipients. The Bcc: recipients receive an extra line in the message body, indicating the nature of the mail. The text of the message varies from MUA to MUA; the Rand Mailer, MH, for example, inserts these lines around the original text:
      ------- Blind-Carbon-Copy
      ...
      ------- End of Blind-Carbon-Copy

  • Some MUAs will send the message, separately, to each Bcc: recipient, with the recipient address on the Bcc: header. Each Bcc recipient thus knows that they received the message by way of the Bcc, but does not know who else was a Bcc recipient. All Bcc recipients are private, even to other Bcc recipients. (It would be nice if all MUAs behaved this way).

  • A few MUAs deliver the message without the Bcc:, but also without any special indication; you must guess that it was a Bcc.

The original mail standard RFC822 says this about Bcc:

4.5.3. BCC / RESENT-BCC

This field contains the identity of additional recipients of the message. The contents of this field are not included in copies of the message sent to the primary and secondary recipients. Some systems may choose to include the text of the "Bcc" field only in the author(s)'s copy, while others may also include it in the text sent to all those indicated in the "Bcc" list.

So, procmail would handle Bcc's correctly if the sender's MUA included the Bcc in the header in the first place. But, since procmail is most typically used on incoming mail, it will never have a chance to deal with Bcc: headers.

25.5 Bcc lecture by Philip Guenther

<URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/1996-11/msg00055.html>

The Bcc: header should in general not appear in an incoming message (if procmail is used for processing outgoing mail it may occur there). Most (?) Mail User Agents will send a bcc by just removing the header entirely and putting the address in the envelope recipient list with the other recipients from the To: and Cc: headers. Done this way, the address to which the message was bcc'ed *does not occur in the headers at all*, and you are SOL.

By the time procmail is run (in the standard installation), the envelope is lost, which is the only way you would be able to process Bcc's with any possible regularity. Even that is suspect: if an alias at another site that contains your address is bcc'ed, then the envelope, by the time it reaches your site, will only contain your (local) address.

Furthermore, the whole point of the Bcc: header is that the people who receive the message do not know the entire list of addresses to which the message was sent. If an alias is bcc'ed, it is not clear whether the members of the alias should know that it was the alias that was bcc'ed and not just the individual in question alone.

There MUST be some trace of the BCC destination that travels with the e-mail. Otherwise, how does it know its destination? If I'm right, then couldn't procmail use this to properly handle the message?

[alan]

<URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/1996-11/msg00093.html>

Only the MTA knows the destination address because it is part of the "envelope", the information which is passed on the "RCPT To: some-user" SMTP line. This information is how the MTA knows to deliver the mail, and not by the contents of the headers.

Of course, when invoked properly, many MTAs can read the headers to obtain the addresses needed on subsequent "RCPT" commands in the ensuing SMTP connections. In fact, the Bcc: header can be read along with the rest of the destination headers to obtain the recipient addresses, but the Bcc: will also be removed from the headers.

The address by which an MTA receives a mail is known as the "envelope address", which may be distinct from any headers in the message itself, or, the same as one of them, for directly addressed mail.

With mailing lists, for example, the addressee will never see his/her own address, but will see the mailing list in the To: or Cc: header fields. Even here, when mail is addressed to more than one mailing list, there is no standard for determining the address by which a message is received. There are lots of conventions followed, and heuristics, but no clearly defined standard to indicate the cause of delivery.

You may be able to configure your MTA to pass along the envelope in a new header, or pass it by argument to the local delivery program (which can be procmail). It is then up to the local delivery program to use (or not) the envelope address information.

If you wish to understand the limits of your mail system, you should read RFC822 (mail formatting standards) and RFC821, which describes the original language of SMTP. There are several extensions in progress, but the basic commands of "MAIL", "RCPT", and "DATA" should suffice.
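
The split between envelope and headers can be sketched in Python. This is a minimal illustration of the behaviour described above (the submitting agent reads To:, Cc: and Bcc: to build the RCPT list, then drops the Bcc: header), not the code of any particular MTA or MUA; the function name `envelope_from_headers` is hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses


def envelope_from_headers(raw_message):
    """Return (envelope_recipients, message_text_without_bcc).

    The recipient list is what would be sent as SMTP "RCPT TO:" commands;
    the returned text is what would follow the "DATA" command.
    """
    msg = message_from_string(raw_message)
    fields = (
        msg.get_all("To", [])
        + msg.get_all("Cc", [])
        + msg.get_all("Bcc", [])
    )
    recipients = [addr for _name, addr in getaddresses(fields) if addr]
    del msg["Bcc"]  # no error if the header is absent
    return recipients, msg.as_string()


if __name__ == "__main__":
    raw = (
        "From: alice@example.org\n"
        "To: Bob <bob@example.com>\n"
        "Bcc: dave@example.net\n"
        "Subject: hi\n"
        "\n"
        "body\n"
    )
    rcpts, text = envelope_from_headers(raw)
    print(rcpts)
    print(text)
```

Once the message has been handed over this way, the Bcc address exists only in the envelope, which is why a recipe running at delivery time cannot recover it from the headers.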


26.0 Message headers

26.1 What is correct From address syntax

Case 1 is what is in RFC 822, as I recall. I regard Case 2 as "screen syntax" for inclusion in plain-text message-body contexts. It could be used for the interactive presentation of headers, but I would be inclined to think any tool that doesn't accept the original RFC-822 form is broken.

      1. login@path.to.host (Personal Name)
      2. Personal Name <login@path.to.host>

[1998-05-31 FAQ-L Simon Lyall simon@darkmere.gen.nz] Both forms are legit but the way news and standards documents are going is for the first form to be discouraged. This effectively means that software should accept both forms but only generate the second (this is when the article is first created, not by someone halfway around the world).

The problem with the first form is that stuff in brackets is actually a "comment" rather than the name of the poster. This means that there is no way using the first form to actually say what your name is; it is just that most people say their name in the comment field. They could just as easily say something else. This means that software that displays the comment field as the name is just taking a guess.

The 2nd format puts the name of the poster in a definite place that software can work with and allows you to leave the use of brackets for comments. The current internet draft that will most likely replace RFC 822 on this point is at:

      ftp://ftp.ietf.org/internet-drafts/draft-ietf-drums-msg-fmt-04.txt

The relevant bit is section 3.4, which says:

Note: Some legacy implementations used the simple form where the addr-spec appears without the angle brackets, but included the name of the recipient in parentheses as a comment following the addr-spec. Since the meaning of the information in a comment is unspecified, implementations SHOULD use the full name-addr form of the mailbox if a name of the recipient is being used instead of the legacy form. Also, because some legacy implementations interpret the comment, comments SHOULD NOT generally be used in address fields to avoid confusion.
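
The "accept both forms, generate only the second" advice can be sketched with Python's standard `email.utils` module, whose `parseaddr` reads the legacy comment form as well as the name-addr form. The wrapper name `normalize_from` is hypothetical and only there for illustration.

```python
from email.utils import formataddr, parseaddr


def normalize_from(value):
    """Accept either address form and re-emit it in name-addr form."""
    name, addr = parseaddr(value)
    return formataddr((name, addr))


if __name__ == "__main__":
    print(normalize_from("login@path.to.host (Personal Name)"))
    print(normalize_from("Personal Name <login@path.to.host>"))
```

Keep in mind the caveat from the text: in the legacy form the parenthesised part is only a comment, so treating it as the personal name is a guess, even if a common one.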

26.2 What's that X-UIDL header?

[philip]

  • It's not standardized, and will never be standardized by an RFC. (No X- header can be)
  • Some servers use this to store information for the UIDL command.
  • Some clients apparently store UIDL information in this header in the locally downloaded copy. (Note: the POP3 protocol doesn't let the client modify the message(s) stored on the server.)
  • Some spamming software packages include this header in messages they send to make some POP3 clients that support client side filtering think that they've already filtered the message.
  • Filtering out incoming messages (pre-retrieval via POP3) seems 'fairly' safe, though some legitimate mail may include this header. Using it as a heavy weight (but not enough on its own) in a procmail scoring recipe that detects spam appears to be reasonable.

  • [philip] If a message comes into your mailbox that has the X-UIDL: header, and doesn't have your address in the header, then I would have strong doubts about its legitimacy.
  • [ed] comments: E-mails with X-UIDL: headers are almost definitely spam unless they've been Resent-To: me by someone. Also, valid X-UIDL: headers have 32 hexadecimal digits exactly.

[David] The advisability of trashing all mail with X-UIDL: headers has been discussed on procmail list recently; apparently it's possible for one to appear in legitimate mail.

[Elijah] Yup. Very true. The most likely case would probably be certain types of forwarded mail, including some moderated mailing lists. Fluffy's mod.* list had these until I pointed out the wide-spread file-to-/dev/null problem to Fluffy.
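
The heuristics collected above (X-UIDL: as a heavy but not decisive weight, the 32-hex-digit observation, and the check for your own address in the headers) can be combined into a toy score. The Python sketch below only illustrates the idea; the weights 50 and 25 are arbitrary assumptions, and `xuidl_score` is a hypothetical name, not an existing tool.

```python
import re
from email import message_from_string

# [ed] reports that valid X-UIDL values are exactly 32 hexadecimal digits.
UIDL_RE = re.compile(r"[0-9a-fA-F]{32}")


def xuidl_score(raw_message, my_address):
    """Return a spam score contribution based on the X-UIDL: header.

    0 means the header is absent; the weights are arbitrary assumptions.
    """
    msg = message_from_string(raw_message)
    uidl = msg.get("X-UIDL")
    if uidl is None:
        return 0
    score = 50  # heavy weight, but meant to be combined with other tests
    if UIDL_RE.fullmatch(uidl.strip()) is None:
        score += 25  # malformed value
    visible = " ".join(
        msg.get_all("To", [])
        + msg.get_all("Cc", [])
        + msg.get_all("Resent-To", [])
    )
    if my_address.lower() not in visible.lower():
        score += 25  # my address is nowhere in the recipient headers
    return score


if __name__ == "__main__":
    sample = "X-UIDL: bogus\nTo: friends@example.net\n\nbuy now\n"
    print(xuidl_score(sample, "me@example.com"))
```

As [David] and [Elijah] point out, legitimate forwarded or moderated-list mail may carry the header, so a score like this should feed a larger decision rather than trigger deletion on its own.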

26.3 What is that first From_ header?

[philip] The address on the From_ line is the envelope sender. If the message has a Return-Path: header, then it would probably be easier to use that instead, as then you don't have to deal with the date as found at the end of the From_ header.
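
A small Python sketch of that advice: prefer Return-Path: when it is present, and fall back to picking the address out of the From_ line otherwise. The function names `parse_from_line` and `envelope_sender` are hypothetical, and the From_ parsing assumes the simple "From address date" layout with no spaces inside the address.

```python
from email import message_from_string
from email.utils import parseaddr


def parse_from_line(line):
    """Split an mbox From_ line into (sender, date_text), or return None."""
    if not line.startswith("From "):
        return None
    parts = line[5:].strip().split(None, 1)
    if not parts:
        return None
    sender = parts[0]
    date_text = parts[1] if len(parts) > 1 else ""
    return sender, date_text


def envelope_sender(raw_message):
    """Prefer Return-Path:, fall back to the From_ line, else None."""
    msg = message_from_string(raw_message)
    return_path = msg.get("Return-Path")
    if return_path:
        return parseaddr(return_path)[1]
    parsed = parse_from_line(msg.get_unixfrom() or "")
    return parsed[0] if parsed else None


if __name__ == "__main__":
    raw = "From bob@example.net Thu Jan  1 00:00:00 1970\nTo: x@example.com\n\nbody\n"
    print(envelope_sender(raw))
```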

DON'T CONFUSE THE ENVELOPE WITH THE MESSAGE. The headers in the message are allowed to contain a list of addresses in the To: and Cc: headers that are totally irrelevant to where the message is going. For example, a message from a mailing list may simply say "To: procmail@Informatik.RWTH-Aachen.DE", with no visible sign that "guenther@gac.edu" is an address to which the message is being delivered. That information, where the message is currently in the process of being delivered to, is found ONLY in the envelope.

Okay, where is this precious envelope? In SMTP the envelope consists of the MAIL FROM: and RCPT TO: SMTP commands. However, when a message is given to the local mailer, this information is typically lost. Well, the envelope sender is usually saved nowadays in the Return-Path: header, but the envelope recipient usually only appears in the form of the login name that the local mailer was passed on the command line. This can be used, for example, by /etc/procmailrc scripts that check $LOGNAME to see where the message is set to go.

A problem arises however when people start creating virtual domains. When sendmail does the aliasing (usually by mailertable I believe?), it totally loses the original envelope recipient address in the rewriting. All the addresses get rewritten to the same thing, and sendmail thus has no reason to differentiate them. Having lost their independent identities, the now-same multiple recipients are merged to form one call to the local mailer.

The key point here is that once the envelope recipient is lost by the virtual domain alias, THERE IS NO WAY TO GET IT BACK! You can wave your hands and try faking it, but no one in the virtual domain can ever get onto a mailing list or otherwise receive a piece of mail for which the header doesn't explicitly contain his/her mail address. And furthermore, even doing that faking is extremely difficult to do right. What I show below does NOT correctly handle messages with Resent-* headers. This can result in messages being received by people who shouldn't receive them, possibly violating someone's privacy. Please keep all that in mind if you decide to use it. It handles a goodly percentage of the cases, but it'll bite you badly at some point in the future.

So you may ask, does this mean that virtual domains are hopeless? The answer is no, you just have to be very careful in the sendmail.cf to keep the envelope recipient stashed somewhere long enough that it can be passed as an argument to the local mailer, usually by putting it in the 'host' part of the mailer triple, though with sendmail 8.7.x, putting it into the local part with a '+' would probably be incredibly clean. In the end, it ends up being passed to procmail (standard /bin/mail has no way of handling this, but we already knew that) as another argument (i.e., -a orig-envelope-recip), though with some work it might be possible to do it via a new header, but that's uglier and no more efficient. I don't have the sendmail.cf (or m4 .mc) mods necessary to do this, but if you post to comp.mail.sendmail (after checking the FAQ, I think it might be there) someone may be able to give you further pointers on saving envelope recipients in virtual domain situations.

26.4 Message-Id header

...Are there known problems with "valid" mails with illegal MessageIDs? For some strange reason, some people are sending out mail with bad message id's. That wouldn't be much of a problem, except that our MITS department won't even consider fixing the bad-message-id unless it causes a problem somewhere else.

Why would they not consider fixing it? Their e-mail software/gateway is broken, and needs fixing. That's that. Direct them to RFC 822, sec 4.6.1. http://www.ietf.org/rfc/rfc0822.txt?number=822

[Gerald Oskoboiny gerald@impressive.net] Some of the problems with mail containing a bad message id:

Some people (myself included) run filters to automatically delete incoming e-mail if its message-ID has been seen recently, or if it looks bogus.

Some mailing list software (including Smartlist) does not accept e-mail with a message-ID that has been seen recently. Each message must have a unique message-ID. The best way to ensure that msgids are unique in a global context is to include a fully-qualified domain name after the '@'. In particular, a message-ID like <3.0.5.32.19971208192547.007db100@mailhub > is unacceptable for this reason (even if it didn't have a space at the end.)

Some mail archive software (including some that I wrote) uses message-IDs as a unique identifier for that message in the archive. It may reject messages that appear to be duplicates because they have a message-ID used by other messages. (as my software does.)

[generating message id]

[Stainless Steel Rat ratinox@peorth.gweep.net 1998-03-13 in Emacs Gnus mailing list] ...it is strongly recommended that Message-Id strings be generated by the MUA, rather than the MTA. The reason being that a mail hub could be processing several messages at the same time (multiple CPUs), and so could accidentally generate duplicate Message-Id strings. The MTA should generate Message-Id headers only when the MUA is stupid and fails to do it.
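
Both points, generating IDs with a fully-qualified domain after the '@' and rejecting IDs that were seen recently, can be sketched in Python. `email.utils.make_msgid` is part of the standard library; the class `RecentIds`, its `limit` parameter, and the wrapper `new_message_id` are hypothetical names for this illustration and do not describe how Smartlist or any archive software is implemented.

```python
from collections import OrderedDict
from email.utils import make_msgid


def new_message_id(fqdn):
    """Generate a Message-Id whose right-hand side is the given FQDN."""
    return make_msgid(domain=fqdn)


class RecentIds:
    """Remember the most recent message IDs and flag repeats."""

    def __init__(self, limit=1000):
        self.limit = limit
        self._seen = OrderedDict()

    def is_duplicate(self, message_id):
        key = message_id.strip()
        if key in self._seen:
            return True
        self._seen[key] = True
        if len(self._seen) > self.limit:
            self._seen.popitem(last=False)  # forget the oldest entry
        return False


if __name__ == "__main__":
    cache = RecentIds(limit=2)
    mid = new_message_id("mail.example.org")
    print(mid, cache.is_duplicate(mid), cache.is_duplicate(mid))
```

A bounded cache like this is the same idea as a size-limited ID cache file: old IDs eventually fall out, so only recent duplicates are caught.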

[phil 1998-03-19 PM-L] ... let's do a quick work-up of a 'more complete' regexp to match Message-Ids. I'll take syntax lines from rfc822 with regexps that should match them. For ease of presentation, I'm going to work from the bottom up. Note: any brackets that only contain whitespace should really contain a space and a tab.

      dq         = '"'                        # (literal) double-quote
      bw         = "\\"                       # (literal) backwhack
      ws         = "[         ]*"             # whitespace
      atom       = "[-!#-'*+/-9=?A-Z^-~]+"
      word       = "($atom|$dq([^$dq\]|$bw.)*$dq)"
      local_part = "$word($ws\.$ws$word)*"
      domain     = "(\[$ws([^][\]|$bw.)*$ws\]|$atom($ws\.$ws$atom)*)"

      :0
      *$ ! ^Message-Id:$ws$ws$local_part$ws@$ws$domain$ws
      {
          ...Caught illegal message id
      }

...I did start logging ids that match that condition. It matched two messages so far. One message-id was clearly bogus, but here's the other one (mailing list with 1 msg/week, no spam):

      Message-Id: <199803251729.LAA10847@wuarchive.wustl.edu.>

Is your regexp incomplete wrt trailing dot in the domain part, or is the MUA/MTA broken?

[philip] rfc822 doesn't allow a trailing dot. I just looked at the draft of the new Internet Message Header Standard (the eventual replacement for rfc822) and it doesn't either. Rather, it further restricts the syntax of generated Message-Id headers to disallow comments or folding whitespace from occurring in the message-id itself.

However: before you go tightening that regexp, note that the standard requires that programs that process messages must accept and parse messages that fit the obsolete syntax. This is because old mail messages can hang around for long periods of time in a way that most other internet data formats don't see. The new requirements are on the generation of new messages, not on old messages.
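
The "strict when generating, tolerant when reading" point can be sketched in Python. The pattern below is a deliberately simplified reading of the rfc822 atom rule: it leaves out quoted strings, domain literals, embedded whitespace and comments, and it does not restrict characters to ASCII. The three result labels and the name `check_message_id` are hypothetical.

```python
import re

# An rfc822 atom: one or more characters that are not specials,
# space or control characters (simplified; see the note above).
ATOM = r'[^\x00-\x20\x7f()<>@,;:\\".\[\]]+'
DOT_ATOM = rf"{ATOM}(?:\.{ATOM})*"

STRICT = re.compile(rf"<{DOT_ATOM}@{DOT_ATOM}>")
# Same as STRICT, but puts up with one trailing dot in the domain.
LENIENT = re.compile(rf"<{DOT_ATOM}@{DOT_ATOM}\.?>")


def check_message_id(value):
    """Classify a Message-Id value as 'valid', 'tolerated' or 'invalid'."""
    value = value.strip()
    if STRICT.fullmatch(value):
        return "valid"
    if LENIENT.fullmatch(value):
        return "tolerated"
    return "invalid"


if __name__ == "__main__":
    for mid in (
        "<199803251729.LAA10847@wuarchive.wustl.edu>",
        "<199803251729.LAA10847@wuarchive.wustl.edu.>",
        "<3.0.5.32.19971208192547.007db100@mailhub >",
    ):
        print(mid, check_message_id(mid))
```

A filter built on this would act only on "invalid" and merely log "tolerated", which keeps old or slightly sloppy but legitimate mail flowing.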

[1998-10-22 comp.emacs Toby Speight Toby.Speight@digitivity.com]

It's more usual (and useful) to refer to news articles by Message-ID (that's what Message-IDs are for!). In this case

      <URL:news:uhfwwk9ae.fsf_-_@delivery.ansa.co.uk>

If you are so attached to DejaNews:

      <URL:http://search.dejanews.com/msgid.xp?MID=%3C
      uhfwwk9ae.fsf_-_@delivery.ansa.co.uk%3E&fmt=raw>

(though for some reason this returns text/plain for something which is clearly a message/rfc822). Either of which is an unambiguous URL, not subject to the same time-dependent changes. URLs were designed exactly to remove the need for such descriptions.

26.5 Received header

...Found another interesting pattern, Received headers that are all on one line. Normally a Received: header spans two lines, at least on all the mail I get. This filter locates the single line Received: headers and traps on that:

      :0:
      *Received:\/( ?[^       ])*$
      mail/Spam

[Christopher Lindsey lindsey@ncsa.uiuc.edu] No guarantees here. I just tried it out on some test mailboxes (all known to have valid mail), and it matched like mad. As far as I can tell, there's no requirement in RFC 822 for multiple lines in a Received header.

[Reto Lichtensteiger rali@meitca.com] The one line header vs. multi-line header is config'ed in sendmail: An older cf file (V8.7):

      HReceived: $?sfrom $s $.$?_($?s$|from $.$_) \
          $.by $j ($v/$Z)$?r with $r$. id $i$?u for $u$.; $b

A later (V8.8) one:

      HReceived: $?sfrom $s $.$?_($?s$|from $.$_)
              $.by $j ($v/$Z)$?r with $r$. id $i$?u
              for $u; $|;
              $.$b
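
Before trusting a one-line Received: test, it is worth measuring how your own known-good mail looks, as Christopher did. The Python sketch below counts the physical lines of each Received: header in a message; it relies on the standard `email` parser keeping the folding of header values under its default policy. The function name `received_line_counts` is hypothetical.

```python
from email import message_from_string


def received_line_counts(raw_message):
    """Return the number of physical lines of each Received: header."""
    msg = message_from_string(raw_message)
    return [len(value.splitlines()) for value in msg.get_all("Received", [])]


if __name__ == "__main__":
    raw = (
        "Received: from a.example by b.example; Mon, 1 Jan 2001 00:00:00 +0000\n"
        "Received: from c.example by a.example\n"
        "\tid ABC123; Mon, 1 Jan 2001 00:00:00 +0000\n"
        "Subject: x\n"
        "\n"
        "body\n"
    )
    print(received_line_counts(raw))
```

If a noticeable share of legitimate mail shows a count of 1, the single-line pattern says more about the sending site's sendmail.cf than about spam.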

26.6 Return-Path

...I've created a user (lo_mailer) with a .forward and a procmailrc file to transport incoming mail to the right user.
That is working fine, but the Return-Path: line is set to the local procmail user (lo_mailer) and does not contain the original Return-Path! What can I do to win back the original line? Please help me :)

[david] Normally when you forward mail you should NOT keep the original return path. If the forwarding destination is invalid or unreachable, mail has to be returned to the forwarder, who can fix the forwarding routine, not to the original sender, who can't do anything about it and probably never even heard of the final destination address.

But, though you should change the return path, you do not have to lose the information that the original return path contained. You can safely put that into the body or into another header line. Try this in lo_mailer's .procmailrc:

      :0fwh # if there's a return path, save it as Old-Return-Path:
      * ^Return-Path:.*<.+>
      | formail -iReturn-Path: # lower-case i

      :0Efwh # if there's no return path but there is a From_, use that
      * ^^From[      ]+\/[^  ]+
      | formail -A "Old-Return-Path: <$MATCH>"

      :0Efwh # if there was neither a Return-Path: nor a From_
      | formail -A "Old-Return-Path: unknown"

The first set of brackets in the condition line of the second recipe encloses a space and a tab; the second set encloses caret, space, tab.

On the forwarding leg from lo_mailer to the final recipient, the return path will be to lo_mailer, as it should, but if the final recipient wants to know where it originated, he or she can look at the Old-Return-Path header.
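
To make the fall-through order of the three recipes explicit, here is a small Python sketch of the same decision. It is only an illustration: the function name `old_return_path` is invented, and it returns the value that would end up in Old-Return-Path rather than rewriting the message as formail does.

```python
import re
from email import message_from_string


def old_return_path(raw_message: str) -> str:
    """Pick the Old-Return-Path value in the same order as the recipes."""
    # The parser treats a leading mbox "From " line as the envelope line,
    # not as a header, so the Return-Path lookup still works.
    msg = message_from_string(raw_message)
    return_path = msg.get("Return-Path", "")
    if re.search(r"<.+>", return_path):
        return return_path.strip()          # recipe 1: existing Return-Path
    # Recipe 2: first word after "From" on the From_ line at the very top.
    match = re.match(r"From[ \t]+([^ \t\n]+)", raw_message)
    if match:
        return "<%s>" % match.group(1)
    return "unknown"                        # recipe 3: neither was present
```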

There is one caution here. If lo_mailer is taking mail to a general response address and distributing it to specific people based on subject or body content or just by rotation to balance the workload, fine. But if you have a personal domain and your ISP is routing all mail for any address in your domain to your account on the ISP, and you're depending on procmail to deliver it to the right address in your own domain by reading To: or Cc: headers, that is the wrong approach. The correct recipient will be on the envelope, which is removed from incoming mail before procmail can see it. Your ISP has to do something that lets you know the true envelope recipient or recipients of a message, and others here know a lot more about that than I do (and way, way more than I could tell you without making mistakes).

[1998-11-11 Gnus-L Karl Kleinpaste karl@jprc.com] With regard to the standards for Return-Path, RFC822 observes that it should be a route back to the originator, i.e., it should show relay hops; RFC1123 in turn says that failure notifications should be sent back to the originator with the route information deleted, that is, "If the address is an explicit source route, it SHOULD be stripped down to its final hop." ??? Then what's the point of providing the source route in the first place?

It seems to me that Return-Path's value has become very limited in an environment where source-routed mail is vastly deprecated, and just plain not supported by many. I know that, when I did serious sendmail work years ago, I shot all source routes on sight.

You could very well substitute the use of user-login-name for the "-f" argument in sendmail with the value user-mail-address; the result should give the effect you need, and not create any interoperability problems -- mail will still show a proper way to return to you.

That said, this mailing list's requirement of matching Return-Path is indeed pretty peculiar.

26.7 Errors-To

1) Can somebody confirm that Errors-To: is deprecated? 2) Is there an RFC for this?

[1998-09-15 Liviu Daia daia@stoilow.imar.ro] 1) It is a UUCP thing, and it's indeed deprecated. Here's the relevant quote from sendmail's manual. 2) Probably not, since UUCP-related RFCs haven't been updated in a while.

If errors occur anywhere during processing, this header will cause error messages to go to the listed addresses. This is intended for mailing lists. The Errors-To: header is officially deprecated and will go away in a future release.

26.8 X-Subscription-Info

This is a header that is used by some mailing lists: it contains a mail address for un/subscribe, or a URL with said info. Imagine the reduction in bozo messages asking how to unsubscribe from mailing lists. If your mailing list doesn't have it already, make a suggestion to the list's maintainer.

26.9 Reply-To header

The existence of a Reply-To: means, "IF you reply to me, send it to this address instead of the one in the From: header."

In the case of a mailing list, the list usually is that default mailbox. In that case, a Reply-To header says, "don't send it to the list, send it here instead." Again, it is more a matter of "do what I mean".

ListAdmin: Don't play with Reply-To
http://www.unicom.com/pw/reply-to-harmful.html ... RFC-822 on reply-to is just almost hopeless. The reason people do what they do is more likely because they saw someone else doing that, and imagined it was correct, and copied - perhaps slightly varying things along the way. ...If you use a reasonable mailer, Reply-To munging does not provide any new functionality. It, in fact, decreases functionality. Reply-To munging destroys the reply-to-author capability.

Reply problems
http://www.cs.utk.edu/~moore/reply-problem-list.txt

Mail-Followup-To
ftp://koobera.math.uic.edu/www/proto/replyto.html ...there are useful things that can be done with these headers. For instance -- on mailing lists where everyone that posts is assumed to be subscribed (like this one), the listserv could add a "Mail-Followup-To: ding@gnus.org" header. It can also be used by the sender as a way to signal "I am subscribed to the list; don't Cc me or anybody else".

[Mail-Followup-To problems] Keith Moore moore@cs.utk.edu Wed, 11 Feb 1998 14:20:25 -0500 commented on the nmh list. Keith is the IETF applications area director, and used to chair the DRUMS working group.

Please don't implement support for Mail-Reply-To and Mail-Followup-To in nmh. Not only are they nonstandard, they're a poor fix for the problem.

Reply-To is widely misinterpreted as the replacement for the From field in replies, in such a way that "reply all" goes to Reply-To + To + Cc if Reply-To is present and From + To + Cc if no Reply-To field is present.

RFC 822 has language that appears to support this view. But a careful reading of RFC 822 reveals that this prose does not apply to Reply-To with respect to a "reply all" function, but only with the use of Reply-To in a "reply to author" function.

This leaves us with the situation where the author of a message is unable to specify the complete destination for replies. Even if the author specifies a Reply-To field, if the recipient uses "reply all", addresses from the To and Cc field are still included. This is the behavior implemented by almost every UA in existence, but it's almost always the wrong thing to do.

And RFC 822's examples make it clear that Reply-To is intended as the complete destination for replies, not merely a replacement for the From field.

The right way to fix this is to correctly interpret Reply-To - not as simply the replacement for the From field in replies, but as the reply destination preferred by the author of the subject message. Adding new headers doesn't fix the problem. It only makes the situation more complex.
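
The two readings of Reply-To contrasted above differ only in how the "reply all" recipient list is built. The Python sketch below puts them side by side. It is an illustration only: the function name and the `reply_to_is_complete` switch are invented here, and headers are passed as a plain dict of raw header values.

```python
from email.utils import getaddresses


def reply_all_recipients(headers: dict, reply_to_is_complete: bool) -> list:
    """Return the addresses a "reply all" would use under either reading."""
    def addrs(name):
        value = headers.get(name)
        if not value:
            return []
        return [addr for _, addr in getaddresses([value]) if addr]

    reply_to = addrs("Reply-To")
    if reply_to and reply_to_is_complete:
        # Moore's reading: Reply-To is the complete destination for replies.
        return reply_to
    # Widespread UA behaviour: Reply-To merely replaces From, and the
    # To and Cc addresses are still added.
    recipients = []
    for addr in (reply_to or addrs("From")) + addrs("To") + addrs("Cc"):
        if addr not in recipients:
            recipients.append(addr)
    return recipients
```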

Dan's proposal is intrinsically flawed. It incorrectly assumes that the sender can reasonably anticipate the recipient's needs in replying to the message, and that such needs can reasonably be lumped into either "reply" or "followup". It doesn't solve the real problem, which is that responders need to think about where their replies go. Mail-Followup-To won't decrease the number of messages that go to the wrong place.

Suppose I send out a message inviting people to a meeting, and want "normal" replies (presumably accepting or declining the invitation) to go to my secretary. Should I put my secretary's address in "Mail-Reply-To" or "Mail-Followup-To"?

Say I put it in Mail-Reply-To and a responder wants to send a personal reply to me, perhaps because it's sensitive in nature. So he hits "reply to author" thinking that the message will go to me. Instead, the message goes to my secretary. This is Bad.

Say I put my secretary's address in Mail-Followup-To and a responder wants to send a message to the list of recipients of the original message -- maybe that responder wants to let everyone know about cheap airfares to the meeting. So the responder hits "reply to everyone" thinking that the message will go to everyone. Instead, the message goes to my secretary. This is not as bad as the other case, but it's still not desirable.

So if some responses are neither "personal" nor "group" replies, why not define an extensible reply header that would include not only the address but the category of reply? Something like:

      Labelled-Reply-To: secretary; jeeves@cs.utk.edu
      Labelled-Reply-To: mailing-list; listname@foo.com

It turns out that we already have most of this in RFC 822:

  • The 'phrase' before an address, or a comment, can identify a person by name and/or role. The responder can use this information to decide whether it's reasonable to send a reply to that person. e.g.
      Reply-To: (my secretary) jeeves@cs.utk.edu

  • Similarly, the 'phrase' after a group name can identify a group of recipients, which can also be used by the responder. e.g.
      Reply-To: Secretary: jeeves@cs.utk.edu ;,
           The Gang: a@foo, b@bar, c@zot ;

(Unfortunately, phrases are so widely botched, that they probably aren't usable for this.)
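
For what it's worth, the group syntax in the second example can be read by software. The Python sketch below uses the standard library's header parser to map each group name to its members; the function name `reply_to_groups` is invented for the illustration, and how real-world mailers cope with such a header is a separate question, as the remark above warns.

```python
from email import policy


def reply_to_groups(value: str) -> dict:
    """Map each RFC 822 group name in a Reply-To value to its addresses."""
    # header_factory builds a structured address header from the raw value.
    header = policy.default.header_factory("Reply-To", value)
    return {
        group.display_name: [address.addr_spec for address in group.addresses]
        for group in header.groups
    }


groups = reply_to_groups(
    "Secretary: jeeves@cs.utk.edu ;, The Gang: a@foo, b@bar, c@zot ;"
)
```

A responder's mail client could then offer "Secretary" and "The Gang" as labelled choices instead of one flat recipient list.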

Summary:

  • The way to solve most reply problems is to encourage the responder to actually think about where the message needs to go, and make it easy for him to get the behavior he wants. (It also helps if people use the RFC 822 'phrase' to label their header addresses.)
  • We can build interfaces that help the responder do this without defining any new header fields.
  • Except for a very few cases, Mail-{Reply,Followup}-To doesn't help. It only provides more opportunities for surprising behavior.

Stainless Steel Rat ratinox@peorth.gweep.net 1998-02-12 commented on the Emacs ding mailing list

Not every mail client supports this. Only the badly written ones fail to distinguish between replies and followups.

When you get right down to it, this proposed standard has two goals:

  1. To make broken MUAs act less brokenly. Well, broken MUAs are not going to implement this standard, anyway; good MUAs do not need it as they already make the distinction between replies and followups.
  2. To make broken mailing lists act less brokenly. Administrators of broken mailing lists have decided that they like it that way. They claim that it makes it easier for their lists' subscribers to reply to the list. The subscribers that "need" list-bound Reply-To headers are using broken MUAs. See #1.

This proposed standard will not solve any of the problems it attempts to address. It creates headers that are ignored by bad MUAs and are redundant for good MUAs.

To summarise Keith's statement: From is the originator's mailbox. It is not an 'account'. RFC 822 states that the originator header should contain the correct default reply address.

This is the scenario that the proponents of these headers have proposed, and the flaw the IETF has found with it.

Joe is subscribed to a mailing list that he reads from his "private" mail account. For whatever reason, Joe posts a message to that list from work, so his work mailbox is in the From header. Joe does not want to override where responses go with a Reply-To header, but he wants personal replies to go to his private mail account instead of his work account.

The flaw the IETF found is that Joe is equating his two mailboxes with his private and work accounts. There is no such correspondence as far as RFC 822 is concerned. If Joe is acting in a "private" fashion, the system he is using is irrelevant; his private mailbox belongs in the From header and he should put that mailbox there when he originates the message, regardless of where he physically is when he does so.

26.10 Mail-Copies-To header

[Suggested by Lars, the Author of Emacs Gnus]

...Mail-Copies-To: is a header line used in messages on Usenet to direct copies by mail of followups to posts. http://www.math.fu-berlin.de/~guckes/rfc/mail-copies-to.html

[SL Baur steve@xemacs.org] The Mail-Copies-To: header should control how your mail (and Usenet) client prepares a followup message. It lets the sender of a message control whether courtesy duplicate copies of messages should be sent. There are two forms:

      Mail-Copies-To: never

Do not automatically include the sender of the message being responded to. There are two canonical examples.

      Usenet:
      From: foo@foo.bar
      Newsgroups: comp.emacs.xemacs
      Mail-Copies-To: never

A followup in a conforming client should generate in the response message headers:

      Newsgroups: comp.emacs.xemacs

      Email:
      From: foo@foo.bar
      To: mailing-list@somewhere.com
      Cc: luser@somewhereelse.com
      Mail-Copies-To: never

A followup in a conforming client should generate in the response message headers:

      To: mailing-list@somewhere.com
      Cc: luser@somewhereelse.com

The second form includes a properly formed RFC822 mail address as the parameter:

      Mail-Copies-To: someaddress@somewhere.com

In this case, the sender of the message is specifically requesting that responses to the message not only go to the main forum (either mailing list or Usenet newsgroup), but a duplicate copy should also be sent to someaddress@somewhere.com. There are (again) two canonical examples.

      Usenet:
      From: foo@foo.bar
      Newsgroups: comp.emacs.xemacs
      Mail-Copies-To: foo@foo.bar

A followup in a conforming client should generate in the response message headers:

      Newsgroups: comp.emacs.xemacs
      Cc: foo@foo.bar[1]

      Email:
      From: foo@foo.bar
      To: mailing-list@somewhere.com
      Cc: luser@somewhereelse.com
      Mail-Copies-To: foo@foo.bar

A followup in a conforming client should generate in the response message headers:

      To: mailing-list@somewhere.com
      Cc: luser@somewhereelse.com, foo@foo.bar[2]

There is no requirement that the address in Mail-Copies-To match the From address.

Footnotes:
[1] Or `To: foo@foo.bar'
[2] It is also acceptable to put foo@foo.bar in the To: line.
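
The four canonical examples boil down to one small rule, sketched here in Python. This is only an illustration of the behaviour described above, not code from any newsreader: the function name `followup_headers` is invented, headers are plain dicts, and a missing Mail-Copies-To is treated like `never`, which is this sketch's assumption (the text does not say what a client should do by default).

```python
def followup_headers(original: dict) -> dict:
    """Build followup headers from the original message's headers."""
    response = {}
    for name in ("Newsgroups", "To"):
        if name in original:
            response[name] = original[name]   # the main forum is kept as-is
    cc = [original["Cc"]] if original.get("Cc") else []
    copies = original.get("Mail-Copies-To", "").strip()
    # "never" means no courtesy copy; an address means one is requested.
    # Assumption: an absent header is handled like "never".
    if copies and copies.lower() != "never":
        cc.append(copies)
    if cc:
        response["Cc"] = ", ".join(cc)
    return response
```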

26.11 Mail-Followup-To and Reply-To-Personal headers

[21 Nov 1997, Mutt Development List <mutt-dev@cs.hmc.edu>]

Jacob Palme just today submitted an Internet-Draft describing Mail-Followup-To. Jacob, the Working Group chair Chris Newman and I all regard this as complementary to my own Reply-To-Personal proposal, an early version of which I posted here and which was also submitted as an Internet-Draft just today. In fact had my week been a bit less harried Jacob and I would have issued a joint draft. Within a few days you should be able to view these drafts in the IETF drafts directory on ds.internic.net under the names

draft-ietf-drums-mail-followup-to-00.txt Jacob Palme's draft on the proposed Mail-Followup-To header.

draft-ietf-drums-replyto-personal-00.txt My draft on Personal-Reply-To

26.12 Content-Length header and From_ specification

[1996-05-17 From: Jamie Zawinski jwz@netscape.com comp.mail.headers]

...I'm not saying that the BSD Mailbox format is good. Just that the Content-Length variant of that format is worse.

Ok, so someone took the From_ format, and extended it to not require mangling by adding a length indicator to the format. At first glance, this may sound simple and elegant, but it breaks the world, and one shouldn't encourage its use to spread.

The thing that breaks is taking an existing, widely-implemented format, and adding a requirement that it have a length indicator. This means that any existing software that already thinks it knows how to manipulate that format is going to damage the file (any change to the data will cause the length indicator to be wrong with respect to the new specification but not with respect to the old specification.)

If the content-length-based format was not otherwise-indistinguishable from the ``From '' format, there wouldn't be a problem; the old software would simply fail to work with this new file format, instead of `corrupting' the documents (in quotes, because it's really just a matter of which spec you're following.)

Also, mailboxes are by their nature a textual format; but, the content-length header measures in bytes rather than lines. This means that if you move the file to a system which has a different end-of-line representation (Windows <=> Mac, or Windows <=> Unix) then the content-lengths will suddenly be wrong, because the linebreaks now take two bytes instead of one, or vice versa.

It's impossible for a mail client to look at a file, and tell which of the two formats (From_ or Content-Length) it is in; they are programmatically indistinguishable. The presence of a Content-Length header is not enough, because suppose you were on a system which knew nothing at all about that header, and some incoming message just happened to have that header in it. Then that header would end up in your mailbox (because nobody would have known to remove or recalculate it), and it would possibly be incorrect. (Presume further that the header was not just incorrect, but intentionally malicious...)

Stricter parsing of the ``From '' separator line doesn't help either, because there are many, many variations on what goes in that line (since it was never standardized either); and also, some mail readers include that line verbatim when forwarding messages (Sun's MailTool, for example) so a stricter parser wouldn't help that case at all, because message bodies tend to contain valid matches.

Some mail readers attempt to cope with this by recognizing the case where the Content-Length is not obviously spot-on-target, and then searching forward and backward for the nearest message delimiter; but this is obviously not foolproof, and makes one's parser much more inefficient (requiring arbitrary lookahead and backtracking.)

Conventional wisdom is, ``if you believe the Content-Length header, I've got a bridge to sell you.''
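
Two of the failure modes described above are easy to reproduce. The Python sketch below is a toy illustration with made-up sample data, not a mailbox parser: it shows the byte count drifting when line endings change, and a plain ``From '' splitter miscounting a mailbox whose writer relied on Content-Length and left a body line unmangled.

```python
def content_length(body: bytes) -> int:
    """Byte count, as a Content-Length header would record it."""
    return len(body)


def to_crlf(body: bytes) -> bytes:
    """Convert LF line endings to CRLF, as a text-mode transfer might."""
    return body.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def naive_message_count(mbox_text: str) -> int:
    """Count messages the way a plain "From " splitter would."""
    return sum(1 for line in mbox_text.splitlines() if line.startswith("From "))


body = b"line one\nline two\nline three\n"
recorded = content_length(body)                    # written on the Unix side
drift = content_length(to_crlf(body)) - recorded   # one extra byte per line

# A single message (hypothetical sample) with an unmangled "From " body line.
mbox_text = (
    "From alice@example.org Thu Jan  1 00:00:00 1998\n"
    "Content-Length: 38\n"
    "\n"
    "Hello,\n"
    "From now on I will be on time.\n"
)
messages_seen = naive_message_count(mbox_text)     # the old reader sees two
```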

26.13 Moral about CC copies in Usenet

Sending CC

There has been very heated discussion in the gnu.emacs.gnus newsgroup (e.g. around 1999-03-20) where many people argue for sending CC replies to the person that posted the question to the newsgroup. The benefits of sending a CC have been seen as:

  • The person gets a fast answer.
  • The person may not read the newsgroup regularly and appreciates the private answer.
  • The newsfeed for him may not be very reliable, so the answer may not appear fast in the group (but we don't know this for sure).
  • The newsgroup expiry period may be too fast for him to catch the reply (but we don't know this for sure).

In recent years netnews has changed a lot and many people have started using non-existing mail addresses in order to avoid getting UBE mail. This has annoyed the "CC" senders, because they get bounced mail from these non-existing addresses.

Not sending CC

Usenet is considered a public forum, which does not force anyone to reveal their "real" address if they don't want to. It's the same as a lock on their door. Some people don't want to see uninvited people at their door, and that's why they don't like CC messages either:

  • The CC is superfluous: the answer has already been posted to the newsgroup.
  • A CC won't help following a thread. The person has to visit the newsgroup to see the whole discussion anyway.
  • A CC is subject to mail delivery problems: the person has moved, mail delivery problem (keep trying for N days), transient failure.
  • He always wants to read the newsgroup and doesn't like CC copies filling his mailbox on an expensive ISP account.

A clear munged address

A clearly non-existing mail address that indicates that it is not the real destination is usually considered good manners:

      john.doe@nowhere.net
      b.gates@vatikan
      dummy@no-replies.com

Or partially modified, so that a human mind can "decode" it if direct contact is wanted (but somewhat hard for programs, because there are more creative choices than a program can ever expect to see):

      johnx.you-know-what-todo@not-here.skynet.com
      door.lock.mike@chevanix.com
      nospam.xavier@ube-stop.aol.net

A valid looking address

But an address that looks "real" but is bogus is not a polite way to participate in Usenet. This address would give the impression that the person is really there:

      mike@future-domain.com

The MORAL learned about automatic CC copies is:

An automatic CC is a bad thing. Don't guess people's minds. An open mail address (a real mail address) is not an invitation to visit his door. It is only a hint where the message comes from (valid or not). The only thing we can be sure of is that a clear anti-UBE address is a stop sign: do not send any CC copies.

When people want a CC, they indicate it by saying so in the mail or by adding some header that can hopefully be understood by newsreaders, like Mail-Copies-To or Followup.


27.0 Other interesting code

27.1 Misc mail related pointers

ProcLog
http://synfin.net/aturner/proclog/ Aaron Turner aturner@pobox.com ...ProcLog is a set of Unix shell and Perl 5 scripts that provide an easy to use and fast reporting tool for detailing where mail goes after Procmail sorts it. ProcLog reports on current mail in your Procmail Log file, any new mail that has come in since ProcLog was last run, and the size of your Procmail log file. What makes ProcLog unique is its ability to help you keep track of mail you've read.

Xbuffy -- biff log
http://www.fiction.net/blong/programs/#xbuffy

27.2 Expire mail pointers

Sh expire mail
<URL:http://www.rosat.mpe-garching.mpg.de/mailing-lists/procmail/1996-09/msg00141.html> This script will delete messages older than $AGE days from the mailbox specified on the command line. It requires that you have formail installed on your system, and if formail is in a directory other than /usr/bin, you must change the value of $FORMAIL. wravery@wravery.student.princeton.edu

Gawk expire mail
by Roman Czyborra czyborra@cs.tu-berlin.de ...Using GNU version 2.15 of awk: This filter deletes all messages older than expire days from a Unix mailbox. Sample call: gawk -f expire.awk expire=21 box > new && mv new box [See old procmail archive, BestOfProcmail]

pick -- Auto deleting old messages
[Brian Dockter brian@nds.com] ...Once the messages were in the correct folder, I would suggest using cron and mush. mush is able to manipulate messages on a wide variety of criteria, but works with them once they are already in the folder. Here is a command which would delete all messages that are one week old or more:

      pick -ago -1w | delete

IMHO, with procmail to do the pre-processing, and mush to do the post-processing, I have an unbeatable mail combination.
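
The scripts above are only pointed to, not shown. As a rough idea of what such an expiry pass does, here is a Python sketch built on the standard `mailbox` module. It is not a port of any of the scripts: the function name `expire_mbox` is invented, age is judged by the Date: header (an assumption; the originals may use the From_ line instead), and messages with a missing or unparsable date are kept.

```python
import mailbox
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def expire_mbox(path: str, max_age_days: int, now: datetime = None) -> int:
    """Delete messages older than max_age_days; return how many were removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    removed = 0
    box = mailbox.mbox(path)
    box.lock()                      # keep delivery agents out while rewriting
    try:
        for key, msg in list(box.items()):
            try:
                sent = parsedate_to_datetime(msg["Date"])
            except (TypeError, ValueError):
                continue            # no usable date: keep the message
            if sent.tzinfo is None:
                sent = sent.replace(tzinfo=timezone.utc)  # assumption: UTC
            if sent < cutoff:
                box.remove(key)
                removed += 1
        box.flush()                 # rewrite the mailbox without the old mail
    finally:
        box.unlock()
        box.close()
    return removed
```

Called from cron as `expire_mbox("/path/to/folder", 21)`, this would mirror the `expire=21` sample call above; the path is a placeholder.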

27.3 Usenet News related pointers

"Perl Get News (perl4)" ftp://pittige-tijden.ml.org/pub/procmail/pgnews.gz

27.4 Code: Perl Extract procmail man pages from 3.11pre7.tar.gz

Contact: jari.aalto@poboxes.com for code. Procmail-3.11pre7.tar.gz has some .man files that can't be used right away. You have to run the whole make process before you can get the ready man pages. However, this small perl script takes care of only creating the man pages if that is the only thing you want to grab from the tgz kit.

Be in the procmail tar.gz kit's "man" directory and run this script with command below and the *.1 pages will appear in the directory.

      % pm-man.pl *.man
    

27.5 Code: Sh remove matching lines from file

[era] The name "gred" is rather obscure if you don't know what "grep" stands for. Anyway, this is also really too specialized a script to get such a general-sounding name. Incidentally, I timed this against a Perl one-liner on my /usr/dict/words (which is rather small, though; some 25,000 lines) and found the shell version to be quicker, even with the locking. A good citizen would take care to remove the temp files when done, but since Wotan thought it would be valuable to keep them around for backup, I left implementing that as an exercise. (Hint: they should be cleaned up even if the script is interrupted with ctrl-C.)

      #!/bin/sh
      # gred -- like grep, but remove matching lines
      # syntax: gred regex file
      # locks file while gredding using dotlocking
      #
      case "$#" in
          2) ;;
          *) echo "Syntax: gred regex file" >&2 ; exit 1 ;;
      esac

      LOCK="$2.lock"
      TMP=/tmp/$$.temp

      if lockfile "$LOCK"; then
          mv "$2" "$TMP"
          grep -v "$1" "$TMP" >"$2"
          rm -f "$LOCK"
      fi
      #
      # end of file
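
For comparison, here is one way the cleanup hinted at above could look, sketched in Python rather than sh. It is not era's script: the function takes a Python regular expression instead of a grep pattern, it keeps no backup copy, and the lock is a simple exclusive-create dotlock, which is an assumption and not a drop-in replacement for procmail's lockfile(1) on NFS.

```python
import os
import re
import time


def gred(pattern: str, path: str, retries: int = 8, wait: float = 1.0) -> None:
    """Remove lines matching pattern from path, guarded by a path.lock file."""
    lock = path + ".lock"
    for _ in range(retries):
        try:
            # O_EXCL makes creation fail if someone else holds the lock.
            os.close(os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY))
            break
        except FileExistsError:
            time.sleep(wait)
    else:
        raise TimeoutError("could not acquire " + lock)
    tmp = path + ".tmp"
    try:
        regex = re.compile(pattern)
        with open(path) as src, open(tmp, "w") as dst:
            dst.writelines(line for line in src if not regex.search(line))
        os.replace(tmp, path)       # swap in the filtered copy in one step
    finally:
        # Runs on success, on errors, and on ctrl-C (KeyboardInterrupt).
        for leftover in (tmp, lock):
            if os.path.exists(leftover):
                os.remove(leftover)
```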
    


28.0 UBE in Internet

28.1 Terms used and foreword

[Part of this has been excerpted from the Email Abuse Faq]

UBE = Unsolicited Bulk Email
UCE = (subset of UBE) Unsolicited Commercial Email

Spam = Spam describes a particular kind of Usenet posting (and canned spiced ham), but is now often used to describe many kinds of inappropriate activities, including some email-related events. It is technically incorrect to use "spam" to describe mail abuse, although attempting to correct the practice would amount to tilting at windmills.

Spam = definition by Erik Beckjord. "Some people decide that Spam is anything you decide you want to ban if you can't handle the intellectual load on a list." Remember, not to be confused with real spam, which is unwanted bulk mail.

People are nowadays seeking a cure which will stop or handle UBE. That can be easily done with procmail (under your control) and with sendmail (by your sysadm). In order to select the right strategy against UBE messages, you should read this section and then decide how you will be using your procmail to deal with it.

28.2 UBE strategies

[Excerpted from the Email Abuse Faq]

28.2.1 4g. I asked to be "removed" - guess what? I got another U*E

Not surprisingly, many UBE outfits treat a "remove" request as evidence that the address is "live"; a "remove" request to some bulk mailers will actually guarantee that they will send more to you. For many others, the remove procedure does not work, either by chance or design. At this point perhaps you're starting to get a feel for the type of people with whom you are dealing.

Also, getting removed doesn't keep you from being added the next time they mine for addresses, nor will it get you off other copies of the list that have been sold or traded to others. In summary, there is no evidence of "remove" requests being an effective way to stop UBE.

28.2.2 4h. I asked to be "removed" - guess what? The message bounced

Probably the remove procedure was false. Any remove procedure that tells you to send remove requests to AOL, CompuServe, Prodigy, Hotmail, or Juno is certainly false. The bulk mailers are an unpopular lot; they forge headers, inject messages into open SMTP ports, use temporary accounts, and pull other stunts to avoid the tirade of complaints that follow every mailing.

28.3 UBE and bouncing message back

Has anyone found that bouncing spam does any good at all?

Note: There are several program packages out there that can with a high degree of success (but not 100%) trace back a spam even if some headers are faked. This will not help you against spam houses (which don't care) but will speed up telling the sysadmins about an open relay. Such tools need human interaction to work properly. See pointers to them in this document later.

Examine the messages by hand first and then feed them to an automatic complaint script. See pointers in this document later.

[sean] I had a whole policy message written up that would be sent out to spammers. Nothing but a waste of my resources. Most return paths are either completely bogus, or end up bouncing pretty damn soon after the spam, which just brings you more junk to deal with.

Instead, I choose to send messages occasionally to administrators and upline providers of domains which spew. "Agreement by action" is one of the legal standards I like to use (for "should you continue to send mail to me, that constitutes acceptance of the terms herein").

InterNIC recently (1997-07) removed the root files for .com, .org, and .net (I think) from access at their ftp server. Too many spammers were using them for the purpose of generating mailing lists. Access to the files now requires an assigned FTP account from InterNIC. When I get a domain-style spam, I immediately do a whois to get DNS info on the domain, then grep the root files to obtain a list of domains serviced by the same DNS. If they appear spammy (as spam domains tend to), I add these to a list of domains to filter (egrep) in my primary domain-based ruleset. Works for me, though the list is getting big.
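
The last step, matching a sender against a hand-maintained list of domains, is sketched below in Python. This only illustrates the idea: sean's actual ruleset uses egrep inside procmail and is not shown in the source, and the function name and the sample domain in the usage line are invented.

```python
from email.utils import parseaddr


def sender_domain_blocked(from_header: str, blocked_domains: set) -> bool:
    """True if the sender's domain, or any parent domain, is on the list."""
    _, addr = parseaddr(from_header)
    domain = addr.rpartition("@")[2].lower()
    labels = domain.split(".")
    # For "mail.spam.example" check "mail.spam.example", "spam.example"
    # and "example", so subdomains of a listed domain are caught too.
    return any(".".join(labels[i:]) in blocked_domains
               for i in range(len(labels)))
```

With `blocked = {"spam.example"}`, a From: of `Bulk <offers@mail.spam.example>` is caught while `friend@good.example` passes. Keeping the list in a set keeps lookups cheap even as it grows.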

[Kimmo Jaskari kimmo@alcom.aland.fi] Another good reason is that all those bounces, which get ignored by the spammer/recipient anyway, still take up needless bandwidth on the net. The spam is bad enough for that, bouncing it back with some more stuff added is just plain silly. You become part of the problem rather than the solution. If the bounce even gets to the spammer, the spammer drops it on the floor unseen.

[1998-11-03 PM-L Mark Shaw mshaw@asic.sc.ti.com] Jari: "An autoresponder is a bad idea. You need better heuristics than what procmail can do. The UBE messages really need human inspection before you send them out, otherwise you may have to apologise to a lot of people, e.g. if the complaint was mistakenly sent off to some mailing list or wrong address." Mark: Having originally set up my anti-spam recipes to be autoresponders, I absolutely agree with this. I recall one morning when my strongly-worded no-spam message went out to everyone who sent me mail for several hours..... * shudder *

    28.4 UBE and "I don't mind" attitude

    ...whenever you see a spam you don't want, hit the delete key and move on. Grow up and get a life, folks. The spams just don't bother me. Why the hell does everyone have to go up in arms every time someone sends a spam? Spams are harmless! Spams even sometimes are interesting and/or useful!

    [Responses from thread in procmail mailing list 1995-10 to "FREE 1 yr. Magazine" spam.]

    [Soren Dayton csdayton@midway.uchicago.edu]

    The simplest reason against UBE is that it is rude. It costs some people money to get mail on some commercial services. This is fundamentally different from junk snail mail for this reason, and too much spam can prevent people from getting mail (mailboxes can fill up). So it is both an intrusion into my life and it can conceivably end in me either losing money or losing mail (which is far more important). It is a burden on the receiver far beyond just hitting the delete key.

    [Mark Seiden mis@seiden.com]

    People who are able to monitor the incoming machines of one of the larger online services (like me) can see a sizeable increase in system load average and volume directly resulting from spams. This competition for fixed resources inevitably translates to reduced service for "first class" mail.

    It is impossible to engineer a mail system that can cope with an unlimited amount of abuse. This is in addition to the difficulties of doing so on a fixed price economic model, and the difficulties of keeping up with the successful rapid expansion of the population to be served.

    Even if you, an individual, aren't charged anything per piece of mail, there are costs borne by your service provider per piece of mail, and these are somehow passed on to you. (They've calculated an average across their entire user population to come up with a "monthly cost of Internet mail".)

    Spamsters and bulk mailers are not at all concerned about efficiency. As proof of that, many of them are not even courteous enough to supply a proper return address, so they can prune their lists of undeliverable mail. All they care about is getting their message across without their paying anything whatsoever for that service.

    Watch how this will inevitably translate into increased costs for you, the consumer, unless we change the mechanisms by which bulk mail is delivered as well as putting an appropriate economic model in place.

    [Steve Simmons scs@lokkur.dexter.mi.us]

    If you tolerate spamming, it will only get worse. Spamming has been stopped again and again. Almost without exception, the spammers have been tracked down and, via one means or another, have been convinced to stop spamming.

    Spams are harmless? I've already seen the 'Magazine Sub' message 10 or 12 times. I have a low bandwidth line. If I continue to tolerate spamming, I will pay a very real penalty in performance as tens, then thousands of spammers do it. Not to mention the personal time involved in taking care of the crap.

    Don't think that the time involved is significant? Just wait. My wife and I are fairly generous with our time and money. As a result, we were getting an average of five telephone calls *per night* asking for money for various causes. A year ago, I adopted a new policy -- I will not under any circumstances give money to a caller, and will only consider it upon written solicitation. I ask them to put me on their `do not call list'. If they do *anything else* to continue the conversation, I hang up on them.

    My wife opposed this, and we agreed to disagree -- if they ask for her, they get her. If they ask for me, they get my speech. After a year, she is getting 2-3 calls per night and I'm getting one or two a week.

    My point here is that individual action does get a reaction from the mailers. For them, I copy their internet providers on my complaints and call their Better Business Bureau. It works.

    If one does this politely and consistently, 98% of the spammers will stop. The remaining 2% will discover that they're in a different world from direct mail or telephone solicitation. Their mailboxes will be overloaded with complaints (when it takes a single keystroke to invoke your complaint macro, you're very likely to complain). Then their suppliers' mailboxes will be overloaded with complaints. The free magazine folks, who've been hiding behind false ids and forging mail, will find that they're on the wrong side of the law. I'm considering contacting their local legal officials and urging them to investigate, because it sure looks like fraud to me (read `Consumer Reports' for a similar case by surface mail). Should a few more like this come in, I will contact their legal authorities. We have their fax number; it's all we need to find them.

    [Carl Payne cpayne@optical.fiber.net]

    Um, I don't know about you or anyone else here, but this cutesy, "it's-okay-by-me" spam has been circulated under half a dozen different user names and "domains" on as many mailing lists. It's obvious to me the sender is trying to make people pissed off--how can he possibly think someone will buy that crap, and why does he think it's okay to send 19 and 20K files over a billion groups?

    AFAIC, it has to stop. Now. I'm tired of the spam, I'm tired of the "Who cares" attitude about spam, I'm tired of ISPs letting people spam, I'm tired of the jetwash of spam, and I'm tired of the bleedinghearts that say, "Golly, just ignore it, and it'll go away."

    I've got news for you all: when this method of spamming becomes the preferred method of "marketing" on the internet, and people like us are the bad guys because we're not allowing such litter to fly across the fiber, you will care. You will say something, most probably, "Why didn't we do something about this sooner?"

    The guy in the next cube from you, who's paying a per-message charge through his ISP, is probably going, "Dammit, over three dollars this month on mail I've itemized as being spam." While that doesn't seem like a lot, I revert to my earlier statement: if this becomes the preferred method, his bill (and yours) will go up, and everyone will wonder why it's too out of control to do anything about.

    Spam has the letters m-a-s in it, which, en Espanol, means "more." I say no. Not only no, but hell no. And, I refuse to be told that my thinking is out of line just because I don't want my mailbox flooded. Do something now. Do anything now. But, don't be quiet and listen to anything that sounds like an endorsement of litter.

    [Wolfgang Weisselberg weissel@ph-cip.uni-koeln.de]

    Worse is that it costs a spammer very little to spam, say, 2 million addresses with 5KB:

    • 5 hours unattended time online
    • phone costs
    • a 'free x hours'-CD or a provider looking the other way, i.e. something between $0 and $500 (an expensive provider)

    It costs all recipients:

    • on an average of 5 seconds per UCE to decide that, indeed, it is one: 115.7 DAYS (2777.8 hours) of mailchecking (at $7.5/h that is just $20833 --- excluding all taxes and so on!)
    • 379.5 hours (15.8 days) download time (multiply with your local phone costs and remember that in most places even in-city calls cost by the minute)
    • the same time as online time (multiply by your provider costs)
    • indirect costs (more HDs for the provider (9.5 GB), faster connections for all the spams, more transmission costs (9.5 GB), faster machines, ...)

    I can send you the complete calculation if you like :-)
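    For readers who want to check the arithmetic, here is a small sketch that recomputes the reading time, its cost, and the data volume from the figures quoted above. The 379.5 hour download figure is not recomputed because the line speed behind it is not stated; the function names are only illustrative, and 1 GB is assumed to mean 1024 * 1024 KB.

```python
RECIPIENTS = 2_000_000      # addresses spammed (figure from the text)
SECONDS_PER_UCE = 5         # time to recognise one UCE (figure from the text)
WAGE_PER_HOUR = 7.5         # dollars per hour (figure from the text)
MESSAGE_KB = 5              # message size in KB (figure from the text)

def reading_cost(recipients, seconds_each, wage_per_hour):
    """Return (hours, days, dollars) spent by all recipients together."""
    hours = recipients * seconds_each / 3600
    return hours, hours / 24, hours * wage_per_hour

def total_volume_gb(recipients, message_kb):
    """Total data delivered in GB, assuming 1 GB = 1024 * 1024 KB."""
    return recipients * message_kb / (1024 * 1024)

hours, days, dollars = reading_cost(RECIPIENTS, SECONDS_PER_UCE, WAGE_PER_HOUR)
print(f"{hours:.1f} hours = {days:.1f} days, about ${dollars:.0f}")
print(f"{total_volume_gb(RECIPIENTS, MESSAGE_KB):.1f} GB transferred")
```

    With these inputs the sketch gives 2777.8 hours, 115.7 days, $20833 and 9.5 GB, which agrees with the numbers in the list.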

    Now, if UCE becomes more common ... how many businesses are connected to the Internet? Say that every business spams once every 10 years, and that they are evenly distributed over time.

       Number_of_businesses / 3650 = UCEs initiated per day
          UCEs initiated per day * 2_000_000 (or more)
                      / number of mail addresses
          = UCEs in your mailbox
    

    Guess we are going to need T1's to just get all our mail. And a few 100 secretaries as well. Wave good-bye to usable mail.
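    The estimate above can be written as a tiny function. The input numbers in the example call are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements.

```python
def uce_per_mailbox_per_day(businesses, addresses,
                            recipients_per_run=2_000_000,
                            days_between_runs=3650):
    """Expected UCEs per mailbox per day under the assumptions in the text.

    Each business spams once every `days_between_runs` days (10 years),
    each run reaches `recipients_per_run` addresses, and the runs are
    spread evenly over time and over all mail addresses.
    """
    runs_per_day = businesses / days_between_runs
    return runs_per_day * recipients_per_run / addresses

# Hypothetical inputs: 3.65 million businesses, 100 million mail addresses.
print(uce_per_mailbox_per_day(businesses=3_650_000, addresses=100_000_000))
```

    With those made-up inputs the result is 20 UCEs per mailbox per day.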

    28.5 We need a law against UBE

    Ray Everett-Church ray@everett.org, Attorney/Online Consultant, Co-Founder & Congressional Liaison, Coalition Against Unsolicited Commercial Email; article 1997-12 in rmailer politics mailing list

    In developing what eventually became the Smith Bill, CAUCE discussed this rather extensively among our drafting committee. The bill gives a cause of action against the advertiser, not any of the pathways taken between you and them. This is consistent with the interpretation of the fax law (and many other laws for that matter) wherein the advertiser -- not the advertiser's agent -- is responsible for the act committed.

    As for the single UCE versus bulk issue, the general consensus has been that while a single piece of spam does not do much damage, it is fundamentally no less a cost shift than 10 identical messages, or 100, or 1000, or a million. The only difference is that the costs being shifted are greater and greater. We discussed many cut off points... would 50 spams be acceptable? 25? 10? Would one really well crafted, hand written, heartfelt and personalized spam be permissible? And in the end we felt like we were discussing angels on the heads of pins.

    While virtually nobody's system will crash because of one piece of spam (although George Nemeyer had trouble with three or four pieces as I recall), what is the ultimate difference if you only get one piece from each of 15 different advertisers a day? If one spam is ok, but two are bad, what is the interval... a day, a week? Enforcement depends on knowing when the threshold is crossed.

    So here's a scenario: you receive three spams from what is, unbeknownst to you, the same person (one advertising weightloss pills from WeightLoss Associates at PO Box 1, one for an MLM from MLM Company at PO Box 2, and Bee Pollen from Pollen Partnership at PO Box 3). Each was individually crafted and appeared to be mailed only to you.

    Under the scenario above, if the law permits one spam, will you sue?

    Would you risk suing one or all of them, gambling that they sent the spam to anyone other than you (or whatever the threshold is... 10, 25, 50)? Would you risk suing one or all of them on the chance that they were somehow related? What if there was a chance that you'd find out that the three companies were really different? What if you did sue and found that they were owned by the same person, but were legally organized separate entities and were therefore each entitled to one spam apiece?

    In short... if one spam is permitted, it could make enforcement incredibly cumbersome, difficult and unlikely, and would present spammers with many reasons to violate the law knowing the odds of a suit and successful enforcement are greatly reduced. While bulk spam is really bad on many levels, whether it's parsed out in very small volumes makes little or no difference to the ultimate recipients as far as the diminished utility, cost, and annoyance.

    We need a clear, bright line. And the Smith Bill is that.


    29.0 Anti-UBE pointers

    29.1 NoCEM, CAUCE and others

    The war of spam -- pointers to resources
    http://spam.gunters.org/links.html

    NoCEM
    http://www.cm.org/

    The Coalition Against Unsolicited Commercial Email (CAUCE)
    http://www.cauce.org/FAQ.html ...The Problem: Unsolicited commercial mail, more commonly known as "spam", is a growing problem on the Internet. If you've used the Internet for any length of time, you've probably received solicitations via mail to purchase products or services.

    A Solution: A group of Internet users who are fed up with spam have formed a coalition whose purpose is to amend 47 USC 227, the section of U.S. law that bans "junk faxing", so that it will cover electronic mail as well.

    Spamcop - report bulk mail intrusions here
    http://www.spamcop.net/

    Teergrubing against Spam
    http://www.iks-jena.de/mitarb/lutz/Usenet/teergrube.en.html ...`Teergrubing' It's German and means Tar-Pit. Once you have been stuck you can't get out. ...slow down internet connections in order to stop UBE abuse. Several hundred teergrubes are able to block spamming worldwide without blocking any e-mail. How do I start: If you are the admin of a MX host, install a teergrube.

    Obtuse smtpd for UNIX

    29.1.1 http://www.obtuse.com/smtpd.html

    Main (configurable) features:=20

    • deny unauthorized relay (no more relay rape!)
    • permit selective relay exceptions (e.g. UUCP downstream)
    • regex() filtering [block those spamming dialins!]
    • deny access for no MX, no PTR, etc.
    • defeat % hack
    • support MAPS, ORBS, DUL, IMRSS, etc RBLs plus your local RBL
    • support exception list for domains for which you will accept mail
    • support selective tarpit'ing on refused connections
    • individually configurable rejection messages
    • precedence and override ordering
    • informative log summary scripts

    Lots of good articles about spam
    http://www.sun.com/sunworldonline/swol-12-1997/swol-12-spam.html

    "(anti-spam Law) US Representative Chris Smith's statement on junk e-mail" http://www.sun.com/sunworldonline/swol-08-1997/swol-08-junkemail.html ...considerable variation in the approaches at the federal level, and state legislation varies widely as well. Professor David Sorkin of John Marshall Law School, who summarized and provided links to the major spam-related lawsuits noted above, also provides status summaries and links to state and federal legislation

    Select mail court cases -- Lots of them
    http://www.jmls.edu/cyber/cases/spam.html America Online, Inc. v. Cyber Promotions, Inc., Compuserve Inc. v. Cyber Promotions, Inc., etc.

    29.2 General Filtering pages (more than procmail)

    Nancy McGough nm@noadsplease.ii.com - Mail Filtering FAQ
    http://www.ii.com/internet/robots/procmail/qs/
    http://www.ii.com/internet/FAQs/launchers/mail/filtering-FAQ/

    Information Filtering Resources
    http://www.ee.umd.edu/medlab/filter/ Doug Oard oard@glue.umd.edu ...This page lists all known internet-accessible information filtering resources.

    29.3 Junk mail and spam

    Spam FAQ
    ftp://rtfm.mit.edu/pub/Usenet/alt.spam/ http://www.cs.ruu.nl/wais/HTML/na-dir/net-abuse-FAQ/spam-FAQ.html

    The mail abuse FAQ
    http://members.aol.com/emailFAQ/emailFAQ.html What is UBE, UCE, EMP, MMF, MLM, Spam, it is all explained here.

    Get that spammer -- A VERY GOOD LINK
    http://www.triode.net.au/~forever/gspam.html ...All about Spam; traceroute, netabuse etc. Full of links and docs

    Whois
    http://www.networksolutions.com/cgi-bin/whois/whois/

    Advertising on Usenet: How To Do It, How Not To Do It
    ftp://rtfm.mit.edu/pub/Usenet/advertising/

    Dealing with Junk Email
    http://www.jcrdesign.com/junkemaildeal.html ...What you should do (and not do) when you have been victimized by a junk mailer. This document teaches you how to read headers in order to trace the origin of junk mail, and includes detailed examples to show you how it is done. Headers are designed for computers to read, not people, so they can be a little hard to follow. Therefore, I hereby grant permission to print or electronically save a copy of this page on your local machine for your personal use while tracing junk mail. Please check back for updates and corrections, though.

    • What Not To Do: Stuff that doesn't work
    • What to do: effective techniques, including how to trace junk mail back to its source
    • Stay Calm (take a deep breath...)
    • Stay Mad (don't get discouraged)
    • How to identify the sender and who gives them Internet access
    • Who to complain to, abuse addresses, online services
    • What to say and how to say it, effective complaining

    Practical Tools to Boycott Spam
    http://spam.abuse.net/spam/ ...We have been actively engaged in fighting spam for years. Recent events, including pending court battles, prompt us to present this page to the public. Fight spam to keep the Internet useful for everyone.

    • Filtering mail to your personal account
    • Blocking spam mail for an entire site
    • Blocking Usenet spam for an entire site
    • Blocking IP connectivity from spam sites
    • Other tools and techniques for limiting spam
    • Sample Acceptable Use Policy statements for ISPs

    news.admin.net-abuse.* Homepage
    Timothy M. Skirvin tskirvin@math.uiuc.edu http://www.killfile.org/~tskirvin/nana/

    Preventing relaying in Sendmail
    ...This package adds two independent features to sendmail, access control and relay control. They will be described here simultaneously, but you can elect to include support for only one of them (either one) on your mail server. Access control lets you deny access to the server based on the sender's envelope address or his IP address. Relay control lets you decide who gets to relay mail through your server. ftp://ftp.xyzzy.no/sendmail/access.tar.Z

    Anti-Spam Provisions in Sendmail 8.8
    http://www.sendmail.org/antispam.html http://mail-abuse.org/ http://www.informatik.uni-kiel.de/~ca/email/check.html#check_rcpt

    • Preventing relaying through your SMTP port
    • Refuse mail from selected hosts
    • Restrict mail acceptance from certain users to avoid mailbombing

    [1998-06-15 PM-L walter] Somebody's starting to exploit a hole in sendmail 8.8, where giving a HELO longer than 1024 bytes causes buffer overflow, and all following "Received:" headers are lost. If it's done off a relay, we have no clue who sent it. There may be a more elegant solution, but here's a quick-n-dirty procmail filter for this stunt...

    Blocking Email
    http://www.nepean.uws.edu.au/users/david/pe/blockmail.html

    • Do you or your users receive "junk mail" (aka "spam")?
    • Do you have Sendmail R8.8.5 running at your site?
    • Would you like to block known "junk mail" senders' addresses?

    Now you can - and there's no need to patch any source code, either. Take advantage of Sendmail's check_mail rule, to see if the sender's address is a member of a nominated "class" - drawn from the contents of the named file. Additional information and links:

    • Prospective Addresses/Domains to Block
    • Limiting Unsolicited Commercial Email
    • EFF "Net Abuse and Spamming" Archive
    • [U.S.] Court Lets AOL Block Email
    • Anti-Spam HOWTO
    • Net Abuse FAQ
    • Figuring out Fake Email & Posts
    • Fight Unwanted Email
    • Unsolicited Junk Email - Bad for Business
    • Fight Unsolicited Email and Mailing
    • Yahoo's Junk Email Resources
    • jmfilter
    • Complaints Addresses at U.S. ISPs
    • news.admin.net-abuse.* Homepage
    • Processing Mail With ProcMail
    • Panix's rc.shared ProcMail Configuration
    • ProcMail Workshop
    • Email Self Defence
    • The SPAM-L mailing list
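    The idea behind the "Blocking Email" page above (test the sender's address against a list loaded from a file) can be illustrated outside Sendmail. The sketch below is not Sendmail ruleset syntax and does not reproduce that page's configuration; the file format (one address or domain per line, '#' comments) and all names are assumptions made for the illustration.

```python
def load_blocklist(lines):
    """Parse blocklist lines into a set of lowercase addresses and domains.

    Blank lines and lines starting with '#' are ignored (an assumed format).
    """
    entries = set()
    for line in lines:
        entry = line.strip().lower()
        if entry and not entry.startswith("#"):
            entries.add(entry)
    return entries

def is_blocked(sender, blocklist):
    """True if the full address or its domain part is in the blocklist."""
    sender = sender.strip().strip("<>").lower()
    domain = sender.rpartition("@")[2]
    return sender in blocklist or domain in blocklist

# Hypothetical list contents; a real setup would read them from a file.
BLOCKED = load_blocklist([
    "# known junk senders",
    "spamco.example",
    "bulk@mailer.example",
])
print(is_blocked("<offers@spamco.example>", BLOCKED))
print(is_blocked("friend@goodisp.example", BLOCKED))
```

    Listing a whole domain blocks every sender at it, while listing a single address leaves other users of that domain alone.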

    Preventing relaying in Netscape Messaging Server
    http://www.tsc.com/~bobp/nms-no-relay.html ...discusses anti-spam configurations for Netscape Messaging Server (NMS). These include proper anti-relay config, spam filters, and using blacklists such as MAPS from NMS. I was compelled to compile this page because of the extremely poor Netscape documentation which includes anti-relay configurations that are easily defeated. --Bob Poortinga bobp@tsc.com

    US Federal Trade Commission
    http://www.ftc.gov/ ...staff publicized the Commission's UCE mailbox, "uce@ftc.gov," and invited consumers to forward their UCE to it. Spam complaints: uce@ftc.gov

    Spam Spade Web based tracking tool
    http://www.blighty.com/ ...Figuring out forged headers and verifying IP addresses and whois information.

    Misc
    http://www.junkbusters.com/ http://www.well.com/~jbremson/spam http://www.wolfenet.com/~jhardin/procmail-security.html

    29.4 Comprehensive list of spammers

    Against Spam -- The garbage collecting.
    http://www.spam-archive.org/ To support this archive please forward mail spam to spam-list@toby.han.de. Everybody is invited to bounce Mail-Spam he/she has got to this list. This is a mailing list to distribute actual spam-eMail. All incoming mail will be checked by subject and from/sender-address whether it has already been distributed or not. No discussions in this list. To discuss this list please subscribe to spam-list-d@hiss.org.

    To subscribe to blacklist-update mailing list TO: Majordomo@hiss.han.de BODY: subscribe blacklist-update you@somewhere.com Mail postmaster@spam-archive.org to discuss the blacklist if your name is on it. (maintained by Axel Zinser fifi@sis.han.de) Get the updated blacklist from ftp://ftp.spam-archive.org/spam/blacklist/

    29.5 Misc pointers

    Is there a way to block local users from spamming other sites? Maybe somehow force sendmail to read a rc file that would maybe then grab the from field and see if the user exists on the system or not. Or run it through some sort of filters.

    [philip] You can and should do this purely in sendmail. I ended up crafting a check_from ruleset that verifies that the envelope sender address is either a) not local; b) a local user; or c) a local alias. At the time I did this mainly to force people to configure their Eudora clients so they didn't say "Return Address: yourname@gac.edu" but it also covers the outgoing bogus source address spam case. For those interested in this kinda thing I've (just) put it up for FTP:

          ftp://ftp.gac.edu/pub/guenther/
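    The three-way test philip describes (not local, a local user, or a local alias) is easy to state in ordinary code. The sketch below only mirrors that logic; it is not his sendmail ruleset, and the domain, user and alias names are made up for the example.

```python
def sender_allowed(envelope_sender, local_domains, local_users, local_aliases):
    """Accept a sender that is (a) not local, (b) a local user, or (c) a local alias."""
    address = envelope_sender.strip().strip("<>").lower()
    localpart, at, domain = address.rpartition("@")
    if not at:
        # Unqualified address: treat it as belonging to the local system.
        localpart, domain = address, None
    if domain is not None and domain not in local_domains:
        return True                                  # (a) not a local address
    # (b) a known local user, or (c) a known local alias
    return localpart in local_users or localpart in local_aliases

# Hypothetical site data.
DOMAINS = {"example.edu"}
USERS = {"alice", "bob"}
ALIASES = {"postmaster", "helpdesk"}

print(sender_allowed("alice@example.edu", DOMAINS, USERS, ALIASES))
print(sender_allowed("yourname@example.edu", DOMAINS, USERS, ALIASES))
```

    The second call models the unconfigured mail client case: the address claims to be local but matches neither a user nor an alias, so it is refused.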
    

    IBM's Secure Mailer -- open source
    http://www.postfix.org/

    [1998-12-15 PM-L Matthew McGehrin matthew@reverse.net] The official project is known as 'IBM's Secure Mailer'. The unofficial codename was Vmailer, but they had to rename that to Postfix to agree with the lawyers. I should know, I have been alpha testing this mailer for the past year, and it is so blazing fast, it's amazing. It's faster and simpler to use than sendmail, and also faster and more secure than qmail. It works fine with procmail (look in my headers). Set "mailbox_command=/usr/bin/procmail" in /etc/postfix/main.cf

    [1998-12-15 PM-L Liviu Daia daia@stoilow.imar.ro] It has explicit hooks for both procmail and RBL. In fact it's incredibly easy to setup, I got it compiled and configured (with an actually usable configuration) in about 15 minutes after downloading it. Adding masquerading and a virtual domain took another 2 minutes. :-) You should really give it a try, it's faster than QMail and much faster than sendmail. So far, I'm quite impressed.

    Qmail
    http://pobox.com/~djb/qmail.html http://www.qmail.org/

    Sendmail
    http://www.sendmail.org/

    Fetchmail -- old pop3 replacement
    ftp://ftp.ccil.org/pub/esr/ http://www.ccil.org/~esr/ http://www.tuxedo.org/~esr/fetchmail/

    Maildrop filter utility
    http://freshmeat.net/projects/maildrop/ ...Alternative to procmail

    Lua
    http://www.tecgraf.puc-rio.br/lua/ lua@tecgraf.puc-rio.br [possible replacement for procmail language] ... Lua is a programming language originally designed for extending applications, but also frequently used as a general-purpose, stand-alone language. Lua combines simple procedural syntax (similar to Pascal) with powerful data description constructs based on associative arrays and extensible semantics. Lua is dynamically typed, interpreted from bytecodes, and has automatic memory management with garbage collection, making it ideal for configuration, scripting, and rapid prototyping.

    29.6 Questionable UBE stop services

    IEMMC: Internet E-Mail Marketing Council Formed 1997-03

    The IEMMC was formed to provide an industry wide trade association for the purpose of promoting responsible e-mail marketing, and to establish an industry standard code of procedures and ethics which will internally regulate and govern the commercial e-mail marketing industry....Under this system, all e-mail of a commercial, unsolicited nature must pass through a universal filtration system which will block the sending of any and all commercial e-mail to the address on the list. Bulk e-mailers will be required to join the organization

    Others have commented that:

    ...IEMMC is a joke. You are probably not doing yourself any favors

    ...Don't take that IEMMC seriously! Many people registered with them and got as much spam as before, or even more. After all, Cyberpromo (the operator of IEMMC) knows that the registered addresses will be valid for some time, so they can use and sell this valuable list to other junk mailers.

    Spammer blacklist
    http://www.netchem.com/ ...remove@netchem.com Dear Sir/Madam, Your mail address may be on many spammers' lists. We are compiling a remove list. Forward the original junk to list@netchem.com

    29.7 UBE related newsgroups or mailing lists

    alt.kill.spammers alt.hackers.malicious alt.2600

    [1997-08-13 alt.privacy.anon-server by anonymous poster] Proper etiquette demands you contact their ISP. However, if the ISP is not interested in helping you, you should consider a posting in alt.kill.spammers (or even alt.hackers.malicious or alt.2600) - give as many details as you can about the spammer.

    A certain spam-provider targeted the alt.hackers.malicious newsgroup. Not the most sensible thing to do. The ISP's IPs were found, their MX host was hacked. All their DNS entries were published on alt.2600 (so that everyone could add filters to ignore all mail from this company). Oh yeah, their password file also made it to the group! The ISP then posted a complaint to alt.2600, much to the enjoyment of everyone who took part. That host basically died a horrible death. I'm pretty sure that not many people are going to lose any sleep over this! I might as well mention that the ISP's complaint mentioned that their "freedom" was being abused. hehehe. Most of these postings can be seen in dejanews or altavista archives of Usenet.

    SPAM-L mailing list and Doug Muth's Page
    http://www.claws-and-paws.com/spam-l/ ... "The SPAM-L FAQ" - A FAQ for SPAM-L, an anti-spam mailing list. This FAQ discusses how to join the list and what to post there, AND it also delves into the technical aspects of spam. For instance, the various kinds of forgeries seen in spams are discussed here, along with information on how to recognise them. If you hate spam, this is something worth checking out... "TheGoodsites List" - I maintain this list, which is part of the Spam Boycott, to show which Internet providers out there act responsibly when dealing with spam. If you're looking for an ISP and want to know where they stand on spam, this is the list for you.

    Send a mail message to listserv@peach.ease.lsoft.com with the words "subscribe SPAM-L <First name> <Last name>" in the body of the message (no quotes). If you would like to contact the owner, the convention is the same as with all listserv lists. Just send e-mail to spam-l-request@peach.ease.lsoft.com

    29.8 Software: the net abuse page

    Scott Hazen Mueller scott@zorch.sf-bay.org http://spam.abuse.net/spam/tools/

    29.9 Software: adcomplain -- Perl junk mail report

    billmc@agora.rdrop.com http://www.rdrop.com/users/billmc/adcomplain.html

    Adcomplain runs under Unix, Windows-NT, and Windows-95. Adcomplain is a tool for reporting inappropriate commercial e-mail and Usenet postings, as well as chain letters and "make money fast" postings.

    It automatically analyzes the message, composes an abuse report, and mails the report to the offender's internet service provider. The report is displayed for your approval prior to mailing. Adcomplain can be invoked from the command line or automatically from many news and mail readers.

    #todo: url missing

    [a happy user reports] ...About 95% of all cases can be traced correctly --- unless they come from a known spamhouse; where complaining to them would not do much good anyway. Mailing lists with strange Received-Headers also can present problems in tracing

    29.10 Software: Ricochet -- Perl junk mail report

    http://www.vipul.net/ricochet/ ricochet@vipul.net Vipul Ved Prakash

    MailingList: ricochet-announce-request@vipul.net with subject "subscribe"

    A lot of unsolicited mail goes unreported because tracing the origins of a possibly forged mail and finding the right people to report to is complicated and time-consuming. Ricochet, a smart net agent, automates this process. It traces the names and addresses of the systems where the spam originated from along with the servers that provide domain name resolution services to these systems (in most cases their ISPs). Then it collects/generates a list of mail addresses of tech/billing/admin/abuse contacts of these systems and mails them a complaint and a copy of the spam. Detailed description of its workings can be found in the README file that comes with the package.
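    To give a feel for the first step of such a tool, the sketch below pulls the host names out of the Received: headers of a message and generates role addresses for each domain seen. It is not Ricochet's code: Ricochet is described as also consulting name server information, which is left out here, and the role names, the "last two labels" domain guess and the sample message are all assumptions. Header contents can be forged, so the output is a list of candidates for human review, not a list of culprits.

```python
import re
from email import message_from_string

# Host name claimed in the "from" clause of a Received: header.
HOST_RE = re.compile(r"\bfrom\s+([A-Za-z0-9.-]+\.[A-Za-z]{2,})", re.IGNORECASE)
CONTACT_ROLES = ("abuse", "postmaster")   # assumed role accounts

def relay_hosts(raw_message):
    """Return the host named in the 'from' clause of each Received: header."""
    msg = message_from_string(raw_message)
    hosts = []
    for header in msg.get_all("Received", []):
        match = HOST_RE.search(header)
        if match:
            hosts.append(match.group(1).lower())
    return hosts

def registered_domain(host):
    """Naive guess: keep the last two labels (wrong for names like x.co.uk)."""
    return ".".join(host.split(".")[-2:])

def complaint_addresses(raw_message):
    """Build role addresses for every distinct domain seen in the relay path."""
    domains = []
    for host in relay_hosts(raw_message):
        domain = registered_domain(host)
        if domain not in domains:
            domains.append(domain)
    return [f"{role}@{domain}" for domain in domains for role in CONTACT_ROLES]

# Hypothetical message with two relay hops.
SAMPLE = (
    "Received: from mx.goodisp.example (mx.goodisp.example [192.0.2.1])\n"
    "\tby mail.home.example; Mon, 2 Feb 1998 10:00:00 +0000\n"
    "Received: from dialup7.bulkhost.example ([192.0.2.99])\n"
    "\tby mx.goodisp.example; Mon, 2 Feb 1998 09:59:00 +0000\n"
    "Subject: FREE 1 yr. Magazine\n"
    "\n"
    "body\n"
)
print(complaint_addresses(SAMPLE))
```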

    29.11 Software: yell -- perl

    ftp://ftp.netcom.com/pub/bo/bobmacd/yell (57k) Bob MacDowell <bobmacd+cmhcmm032598@netcom.com>

    yell - auto-responds to "spam" e-mail. Scans for site names, e-mail addresses and Web site names and sends appropriate messages to users, postmasters and Webmasters.

    29.12 Software: RBL lookup tool -- C

    [1997-12-04 PM-L Edward S. Marshall emarshal@logic.net]

    ...rblcheck is a lightweight C program for doing checks against Paul Vixie's Blackhole List. It works well in conjunction with Procmail for filtering unwanted bulk mail (under QMail, for example, you can invoke it with the value of the environment variable TCPREMOTEIP). rblcheck is extremely simple:

          % rblcheck 1.2.3.4

    where 1.2.3.4 is the IP address you want to check.

    This is a quick note to announce the availability of a new tool for using Paul Vixie's RBL blacklist (see http://mail-abuse.org/ for more information about the blacklist itself, if you don't already know). Most tools which use the blacklist block mail on a site-wide basis. For many networks, this treads on both the ideals of the administration, and on the perceived freedoms of the end user.

    Personally, I don't care either way. :-)

    This tool was to fill the need I personally had to reject mail, since one of the systems I receive mail through cannot, for various political reasons, implement the available RBL filters on a site-wide basis.

    rblcheck is a simple tool meant to be used from procmail and other personal filtering systems under UNIX in the absence of a site-wide filter, as an alternative to imposing site-wide restrictions, or as a means of imposing restrictions on systems that cannot support the existing RBL filter patches.

    Simply put: you hand it an IP address, and it determines if the IP is in the RBL filter, providing the caller with a positive or negative response. With the package, a sample procmail recipe is provided, and examples of using it under QMail and Sendmail are given.
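
    The lookup described above can be sketched in a few lines of Python. This is only an illustration of the general DNS blacklist convention (reverse the octets, append the list's zone, resolve the name); it is not rblcheck's source code. The zone name rbl.example.org and the function names are hypothetical placeholders.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Return the DNS name a blacklist client would look up for an IPv4 address."""
    # Validate the input; raises ValueError for anything that is not IPv4.
    addr = ipaddress.IPv4Address(ip)
    # DNS blacklists index addresses with the octets reversed: 1.2.3.4 -> 4.3.2.1
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """True if the query name resolves (listed), False if the lookup fails.

    Note: a network or resolver failure also yields False in this sketch.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # "rbl.example.org" is a placeholder zone, not a real blacklist.
    print(dnsbl_query_name("1.2.3.4", "rbl.example.org"))
```

    A caller would pass the zone of whichever blacklist it actually uses; the sketch treats "no answer" as "not listed", which is how such a check is typically consumed from a mail filter.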

    http://mail-abuse.org/
    http://www.isc.org/bind.html The official home page
    http://www.xnet.com/~emarshal/rblcheck/

    It has only been tested under Linux 2.x and Solaris 2.5.1. Success stories, patches, questions, suggestions, and flames can be directed to me at emarshal@logic.net.

    [PM-L Aaron Schrab aaron+procmail@schrab.com] Here is my rbl setup, but this depends both upon the format of the Received: lines and the way that mail passes through your mail system.

    I currently grab the IP address from the first Received: header inserted by my ISP (I'm a sysadmin at the ISP, so I have a good knowledge of how mail gets passed around internally). Here's the recipe that I use.

          # if there's a Received: header from one of these servers, it's
          # (probably) the right one
    
          BACKUPSERVER    = "([yz]\.mx\.execpc\.com)"
          VIRTSERVER      = "(vm[0-9]+\.mx\.execpc\.com)"
          LOCALSERVER     = "([abc]\.mx\.execpc\.com)"
    
          # Match a header containing:
          #   Received: <anything> [<ip address>]) by <local server>
          # Each condition is weighted $SUPREME^0, so one match is enough
          # (logical OR), assuming SUPREME is a large positive score.
    
          :0
          * $ $SUPREME^0 ^Received:.*\[\/[0-9.]+\]\)$s+by$s+${BACKUPSERVER}
          * $ $SUPREME^0 ^Received:.*\[\/[0-9.]+\]\)$s+by$s+${VIRTSERVER}
          * $ $SUPREME^0 ^Received:.*\[\/[0-9.]+\]\)$s+by$s+${LOCALSERVER}
          {
              IP = $MATCH
    
              # trim it down to just the IP address
    
              :0
              * IP ?? ^^\/[0-9.]+
              {
                  IP = $MATCH
    
                  :0 W
                  * ! ? /home/aarons/bin/rblcheck -q $IP
                  {
                      SPAM = "$SPAM $IP is rbl'd$NL"
                  }
              }
          }
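
    For readers who want to test the header pattern outside procmail, here is a rough Python equivalent of the matching step only. It is a sketch under stated assumptions: the sample header in the usage note is invented, the function names are hypothetical, and the explicit unfolding step stands in for procmail's handling of continued header lines. Unlike the recipe, a capture group returns the bare address directly, so no second trimming pass is needed.

```python
import re

# Same host patterns as BACKUPSERVER, VIRTSERVER and LOCALSERVER in the recipe.
SERVERS = (
    r"(?:[yz]\.mx\.execpc\.com"
    r"|vm[0-9]+\.mx\.execpc\.com"
    r"|[abc]\.mx\.execpc\.com)"
)

# Received: <anything> [<ip address>]) by <local server>
RECEIVED_IP = re.compile(
    r"^Received:.*\[([0-9.]+)\]\)\s+by\s+" + SERVERS,
    re.MULTILINE | re.IGNORECASE,
)


def unfold(headers: str) -> str:
    """Join folded continuation lines so each header sits on one line."""
    return re.sub(r"\r?\n[ \t]+", " ", headers)


def relay_ip(headers: str):
    """Return the bracketed IP of the first matching Received: header, or None."""
    match = RECEIVED_IP.search(unfold(headers))
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "Received: from mail.example.net (mail.example.net [192.0.2.7])\n"
        "\tby a.mx.execpc.com (8.8.8) with ESMTP\n"
        "Subject: test\n"
    )
    print(relay_ip(sample))
```

    The returned address could then be handed to a blacklist check in the same way the recipe hands $IP to rblcheck.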
    

    It seems to be a procmail issue with letting the IP info from sendmail pass through to the rblcheck program. I have not been able to find anyone using rblcheck successfully with procmail as a delivery agent...

    [1998-03-26 PM-L Edward S. Marshall emarshal@logic.net] This is a standard problem; you should be able to change the invocation of procmail the same way as the example (run env, which in turn runs procmail). Make sure that there is a '-p' argument passed to procmail; this preserves the environment you're constructing with env (newer sendmail revisions sanitize the environment for you, so that's not really an issue).

    If you're still having troubles, make sure you're using the latest incarnation of rblcheck, with the latest supplied procmail recipe; earlier revisions had rather insidious bugs.

    [1998-03-26 PM-L Xavier Beaudouin (kiwi) kiwi@oav.net] Also it seems that sendmail 8.9.0Beta3 has builtin rules. I use it with sendmail 8.8.8 and tcpwrapper every day, and about 80% of spam is rejected. Sounds very good. In your /etc/hosts.allow just add the following lines:

          sendmail: ALL: spawn /usr/local/bin/rblcheck -q %a && \
                      exec /usr/sbin/sendmail -bs || /bin/echo \\
            "469 Connection refused. You are in my Black List !!!\r\b\r\n"
            && \
            (safe_finger -l @%h 2>&1 | /bin/mail -s "%d-%h %u" root)
    

    In your /etc/inetd.conf just add this line:

          smtp stream tcp nowait root  /usr/sbin/tcpd  \
               /usr/sbin/sendmail  -bs
    

    And check that your sendmail is not running as a daemon. That's all. Also, if you have a huge queue, you can add a /usr/sbin/sendmail -q to the root crontab... This should help to send some waiting messages. I think we can use this while waiting for the official sendmail 8.9.0, since there is some cf/feature/rbl.m4 there.

    [timothy] ...I think there's a much more efficient way to do this: you can compile sendmail -DTCPWRAPPERS and let it run as a daemon.

    29.13 Software: mapSoN

    Note: You can do exactly the same as below with procmail, using one of the listed procmail modules: pm-jacookie.rc. See the code.

    mapSoN (NoSpam backwards) -- The no spam utility
    http://mapson.gmd.de/ ftp://ftp.gmd.de/gmd/mapson/

    Most spam filtering tools I've seen so far are based on procmail, or a similar tool, and use a list of keywords or addresses to drop unwanted junk mail. While this might be nice to filter mail from known spam domains like "cyberpromo.com", it won't catch faked headers.

    mapSoN must be installed as filter program for your incoming mail, usually by adding an appropriate entry to your $HOME/.forward file. This means that mapSoN will get all your incoming mail and it will decide whether or not to actually deliver it to your mailbox.

    • First of all, a user-defined ruleset is checked against the mail. If any keywords or patterns match, the mail will be dealt with according to your wishes. This is useful to drop some sender's mail completely, or to sort mail into different mail folders.
    • If no rule matches the mail, mapSoN will check whether the mail is a reply to an e-mail you sent, or whether it is a reply to a USENET posting of yours. If it is, the mail will always be delivered.
    • If no signs of a reply-mail can be found, mapSoN will check whether the sender stated in the From: header has sent you mail before. If he has, the mail will pass. If this is the first time you receive an e-mail from this address, though, mapSoN will delay the delivery of the mail and spool it in your home directory. Then it will send a short notice to the address the mail comes from, which may look like this:
          From: Peter Simons simons@petidomo.com
          To: never_mailed@me.before
          Subject: [mapSoN] Request for Confirmation
    
          mapSoN-Confirm-Cookie: <some_weird_cryptographic_cookie>
    

    The person who tried to contact you will then reply to this "request for confirmation", citing the cookie stated in the mail. When your mapSoN receives this confirmation mail, it will deliver the spooled mail into your folder. Furthermore, the address will be added to the database, so that mail from this person will pass directly in future.

    If no confirmation mail arrives within a certain time, mapSoN can either delete the spooled mails, or send them to a special folder, or whatever you prefer.
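
    The confirmation scheme described above can be modelled roughly as follows. This is a minimal in-memory sketch, not mapSoN's implementation: the class and method names are hypothetical, the user-defined ruleset and the expiry of unconfirmed mail are left out, and the reply detection is reduced to a flag supplied by the caller.

```python
import secrets


class ConfirmationGate:
    """Hold mail from unknown senders until they confirm with a cookie."""

    def __init__(self):
        self.known_senders = set()  # addresses that may pass directly
        self.pending = {}           # sender -> cookie awaiting confirmation
        self.spool = {}             # cookie -> (sender, held messages)

    def incoming(self, sender, message, is_reply=False):
        """Return ("deliver", None) or ("hold", cookie) for a new message."""
        if is_reply or sender in self.known_senders:
            return ("deliver", None)
        cookie = self.pending.get(sender)
        if cookie is None:
            # First mail from this address: create a cookie to send back.
            cookie = secrets.token_hex(16)
            self.pending[sender] = cookie
            self.spool[cookie] = (sender, [])
        self.spool[cookie][1].append(message)
        return ("hold", cookie)

    def confirm(self, cookie):
        """Release the spooled mail for a valid cookie and remember the sender."""
        if cookie not in self.spool:
            return []
        sender, messages = self.spool.pop(cookie)
        del self.pending[sender]
        self.known_senders.add(sender)
        return messages


if __name__ == "__main__":
    gate = ConfirmationGate()
    status, cookie = gate.incoming("never_mailed@me.before", "first contact")
    print(status)                # the message is held, a cookie was issued
    print(gate.confirm(cookie))  # the held message is released
```

    A real filter would store the sender database and the spool on disk and would send the cookie by mail; the sketch only shows the decision logic.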

    29.14 Software: spamgard

    [similar to mapSoN] ftp://ftp.netcom.com/pub/wj/wje/release/sg-howto

    ...spamgard(tm) screens unsolicited bulk mail from your e-mail. It does this in a way that you only have to change things if you have a new person from whom you do want to receive mail; you don't have to change things every time a spamster thinks of a new trick to pull, or a new spamster comes along. And spamgard(tm) is designed so that those who aren't in your "Good Guys" list can get mail to you anyway until you put them there. The instructions for them to get mail to you are simple and newbie-tested, but will still keep out bulk mail. If you're on a mailing list you want to be on, there are provisions for accepting all mail from a set of mailing lists that you specify.

    29.15 Software: Spam Be Gone

    Spam Be Gone
    http://www.internz.com/SpamBeGone/ ...uses machine learning and artificial intelligence technologies to examine incoming mail messages and determine their priority... is more than just a Spam filter, it's a general purpose mail message prioritiser. You train the system, telling it which are good, and which are bad messages. As Spam Be Gone! learns it becomes customised for each individual user.

    PM-L W. Wesley Groleau <wwgrol@sparc01.fw.hac.com> comments:

    > They only distribute binaries, and I'm paranoid. Anyone able to
    > convince me it's not really a Trojan Horse to collect addresses of
    > spam-haters or something even worse?

    I did some sleuthing. I am 95% convinced that SpamBeGone is not a front or cover for any spammer(s). To protect the author's privacy, I won't say why I'm convinced or how I got the info. Sorry. If you're paranoid like me, you'll have to do your own sleuthing before you use it.

    I'm also convinced SpamBeGone's theory is sound. I won't judge the implementation until I've used it for a while.

    PM-L R Lindberg & E Winnie rlindber@kendaco.telebyte.com comments:

    I have to agree with the recent comments about Spam Be Gone, I found it tends to be inaccurate. I first set it up about a week ago, followed the directions and trained it on several (15 to 20) messages: one from each list we get, and the remainder from my logs of SPAM messages.

    The first day it missed about half the SPAM, and nailed about 1/3 of the real messages. So I tuned the key-words a bit, trained it on about 100 more SPAMs and trained it on all the good messages it nailed. Since then it has nailed every SPAM received, however the second day it nailed about 20% of the good messages, which I then trained it to like. Since then it has been nailing about 10% of the good messages, despite continual training. I also added every list to the address book, and it still nails posts from this list, and my wife's lace list.

    I even went through my entire log of SPAM and trained it on every one that didn't come out a 5 (bad). Being the kind of person I am, I also checked after I trained it, and found four SPAMs that, despite my training it that they were bad (5), came out as not so bad (4). I don't dare kill 4's as far too much of my mail (like this list) ends up as 4's.

    For me, this program is not ready for prime time. If the comments are correct that it only learns on Subject and From headers, it's not even worth trying, since lists use the To and Cc headers to be identified, and there are several other excellent headers (X-Advertisement comes to mind) that would be assets for killing SPAM.

    29.16 Software: TinyGnus - Emacs Gnus plug-in

    Availability: jari.aalto@poboxes.com http://tiny-tools.sourceforge.net/
    Platform: win32 and Unix Emacs versions.

    TinyGnus is an Emacs Lisp extension package that integrates directly with the Gnus mail/newsreader. It includes simple but effective UBE-fighting hotkeys that make it possible to complain about a bunch of UBE messages at once. In order to use it, you need a permanent Internet connection and the nslookup(1) tool. Features:

    • The user must decide and hand-select which mail is UBE. No software can decide with 100% certainty which mail is UBE, so the responsibility is on the human user.
    • The user selects the messages that are UBE with Gnus select commands, like # (select current message).
    • Hotkey C-c ' u examines the messages' headers and runs nslookup(1) for each Received header to determine the abuse, spam and postmaster addresses to which the complaint should be sent.
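
    The header-examination step in the last item can be illustrated with a small Python sketch. It is not TinyGnus code (which is Emacs Lisp) and makes simplifying assumptions: it takes the host named after "from" in each Received: header at face value, skips the nslookup(1) verification, and addresses the complaint to abuse@ and postmaster@ at that host. The function name and the sample hosts are hypothetical.

```python
import re

# Host name that follows "Received: from" at the start of a header line.
RECEIVED_FROM = re.compile(
    r"^Received:\s+from\s+([A-Za-z0-9.-]+)",
    re.MULTILINE | re.IGNORECASE,
)


def complaint_addresses(headers):
    """Return abuse@ and postmaster@ addresses for each relay host, no duplicates."""
    addresses = []
    for host in RECEIVED_FROM.findall(headers):
        host = host.rstrip(".").lower()
        if "." not in host:
            # Skip unqualified names such as "localhost".
            continue
        for mailbox in ("abuse", "postmaster"):
            address = f"{mailbox}@{host}"
            if address not in addresses:
                addresses.append(address)
    return addresses


if __name__ == "__main__":
    sample = (
        "Received: from relay.example.net (relay.example.net [192.0.2.1])\n"
        "\tby mx.example.org\n"
        "Received: from localhost by relay.example.net\n"
    )
    print(complaint_addresses(sample))
```

    Since Received: headers can be forged, the host names found this way are only candidates; that is why a tool of this kind checks them against DNS before sending anything.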

    Copyright (c) 2002 by Jari Aalto. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at http://www.opencontent.org/). Distribution of the work or derivative of the work for commercial purposes in any form is prohibited unless prior permission is obtained from the copyright holder. (VI.B LICENSE OPTIONS)

    This file has been automatically generated from plain text file with Perl script t2html.pl 2002.0204
    Document author: Jari Aalto
    Contact: <jari.aalto@poboxes.com>
    Html date: 2002-02-04 01:45