Reading Email Headers

All About Email Headers

Introduction

This document is intended to provide a comprehensive introduction to the behavior of email headers. It is primarily intended to help victims of unsolicited email ("email spam") attempting to determine the real source of the (generally forged) email that plagues them; it should also help in attempts to understand any other forged email. It may also be beneficial to readers interested in a general-purpose introduction to mail transfer on the Internet.

Although the document intentionally avoids "how-to-forge" discussions, some of the information contained in it might be turned to that purpose by a sufficiently determined mind. The author explicitly does not endorse malicious or deceptive falsification of email, of course, and any use for such purposes of the information contained in this document is contrary to its purpose.

Because of the nature of the examples in this document, there are several fictitious domain names with associated IP (Internet Protocol) addresses. The chance that some of these domain names may be used at some future time is, inevitably, nonzero. Similarly, all IP addresses used in the examples are unidentified at this writing, but they will undoubtedly be assigned someday. Naturally, nothing in this document is intended to reflect in any way on future users of these domain names or IP addresses.

Where Email Comes From

This section consists of a brief analysis of the life of a piece of email. This background material is important for understanding what the headers are telling you.

Superficially, it appears that email is passed directly from the sender's machine to the recipient's. Normally, this isn't true; a typical piece of email passes through at least four computers during its lifetime.

This happens because most organizations have a dedicated machine to handle mail, called a "mail server"; it's normally not the same machine that users are looking at when they read their mail. In the common case of an ISP whose users dial in from their home computers, the "client" computer is the user's home machine, and the "server" is some machine that belongs to the ISP. When a user sends mail, she normally composes the message on her own computer, then sends it off to her ISP's mail server. At this point her computer is finished with the job, but the mail server still has to deliver the message. It does this by finding the recipient's mail server, talking to that server and delivering the message. The message then sits on that second mail server until the recipient comes along to read his mail, when he retrieves it onto his own computer, normally deleting it from the mail server in the process.

Illustration.

Consider a couple of fictitious users, <rth@bieberdorf.edu> and <tmh@immense-isp.com>. tmh is a dialup user of Immense ISP, Inc., using a mail program called Loris Mail (which, by the way, is also fictitious); rth is a faculty member at the Bieberdorf Institute, with a workstation on his desk which is networked with the Institute's other computers.

If rth wants to send a letter to tmh, he composes it at his workstation (which is called, let's say, alpha.bieberdorf.edu); the composed text is passed from there to the mail server, mail.bieberdorf.edu. (This is the last rth sees of it; further processing is handled by machines with no intervention from him.) The mail server, seeing that it has a message for someone at immense-isp.com, contacts that domain's mail server---called, perhaps, mailhost.immense-isp.com---and delivers the mail to it. Now the message is stored on mailhost.immense-isp.com until tmh dials in from his home computer and checks his mail; at that time, the mail server delivers any waiting mail, including the letter from rth, to that computer.

Illustration.

During all this processing, headers will be added to the message three times: at composition time, by whatever email program rth is using; when that program hands control off to mail.bieberdorf.edu; and at the transfer from Bieberdorf to Immense. (Normally, the dialup node that retrieves the message doesn't add any headers.) Let's watch the evolution of these headers.

As generated by rth's mailer and handed off to mail.bieberdorf.edu:

From: rth@bieberdorf.edu (R.T. Hood)
To: tmh@immense-isp.com
Date: Tue, Mar 18 1997 14:36:14 PST
X-Mailer: Loris v2.32
Subject: Lunch today?

As they are when mail.bieberdorf.edu transmits the message to mailhost.immense-isp.com:

Received: from alpha.bieberdorf.edu (alpha.bieberdorf.edu [124.211.3.11]) by mail.bieberdorf.edu (8.8.5) id 004A21; Tue, Mar 18 1997 14:36:17 -0800 (PST)
From: rth@bieberdorf.edu (R.T. Hood)
To: tmh@immense-isp.com
Date: Tue, Mar 18 1997 14:36:14 PST
Message-Id: <rth031897143614-00000298@mail.bieberdorf.edu>
X-Mailer: Loris v2.32
Subject: Lunch today?

As they are when mailhost.immense-isp.com finishes processing the message and stores it for tmh to retrieve:

Received: from mail.bieberdorf.edu (mail.bieberdorf.edu [124.211.3.78]) by mailhost.immense-isp.com (8.8.5/8.7.2) with ESMTP id LAA20869 for <tmh@immense-isp.com>; Tue, 18 Mar 1997 14:39:24 -0800 (PST)
Received: from alpha.bieberdorf.edu (alpha.bieberdorf.edu [124.211.3.11]) by mail.bieberdorf.edu (8.8.5) id 004A21; Tue, Mar 18 1997 14:36:17 -0800 (PST)
From: rth@bieberdorf.edu (R.T. Hood)
To: tmh@immense-isp.com
Date: Tue, Mar 18 1997 14:36:14 PST
Message-Id: <rth031897143614-00000298@mail.bieberdorf.edu>
X-Mailer: Loris v2.32
Subject: Lunch today?

This last set of headers is the one that tmh sees on the letter when he downloads and reads his mail. Here's a line-by-line analysis of these headers and exactly what each one means.

Received: from mail.bieberdorf.edu
This piece of mail was received from a machine calling itself mail.bieberdorf.edu...
(mail.bieberdorf.edu [124.211.3.78])
...which is really named mail.bieberdorf.edu (i.e., it identified itself correctly---see Section Whatever for more on this) and has the IP address 124.211.3.78.
by mailhost.immense-isp.com (8.8.5/8.7.2)
The machine that did the receiving was mailhost.immense-isp.com; it's running a mail program called sendmail, version 8.8.5/8.7.2 (don't worry about what the version numbers mean unless you already know).
with ESMTP id LAA20869
The receiving machine assigned the ID number LAA20869 to the message. (This is used internally by the machine---it's something an administrator would need to know to look up the message in the machine's log files, but it's not usually meaningful to anyone else.)
for <tmh@immense-isp.com>;
The message was addressed to tmh@immense-isp.com. Note that this header is not related to the To: line (see Section Whatever).
Tue, 18 Mar 1997 14:39:24 -0800 (PST)
This mail transfer happened on Tuesday, March 18, 1997, at 14:39:24 (2:39:24 in the afternoon) Pacific Standard Time (which is 8 hours behind Greenwich Mean Time; hence the "-0800").

Received: from alpha.bieberdorf.edu (alpha.bieberdorf.edu [124.211.3.11]) by mail.bieberdorf.edu (8.8.5) id 004A21; Tue, Mar 18 1997 14:36:17 -0800 (PST)
This line documents the mail handoff from alpha.bieberdorf.edu (rth's workstation) to mail.bieberdorf.edu; this handoff happened at 14:36:17 Pacific Standard Time. The sending machine called itself alpha.bieberdorf.edu; it really is called alpha.bieberdorf.edu, and its IP address is 124.211.3.11. Bieberdorf's mail server is running sendmail version 8.8.5, and it assigned the ID number 004A21 to this letter for internal processing.

From: rth@bieberdorf.edu (R.T. Hood)
The mail was sent by rth@bieberdorf.edu, who gives his real name as R.T. Hood.

To: tmh@immense-isp.com
The letter is addressed to tmh@immense-isp.com.

Date: Tue, Mar 18 1997 14:36:14 PST
The message was composed at 14:36:14 Pacific Standard Time on Tuesday, March 18, 1997.

Message-Id: <rth031897143614-00000298@mail.bieberdorf.edu>
The message has been given this number (by mail.bieberdorf.edu) to identify it. This ID is different from the SMTP and ESMTP ID numbers in the Received: headers because it is attached to this message for life; the other IDs are only associated with specific mail transactions at specific machines, so that one machine's ID number means nothing to another machine. Sometimes (as in this example) the Message-ID has the sender's email address embedded in it; more often it has no intelligible meaning of its own.

X-Mailer: Loris v2.32
The message was sent using a program called Loris, version 2.32.

Subject: Lunch today?
Self-explanatory.
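
The field-by-field reading of a Received: line given above can also be done mechanically. What follows is only a rough sketch in Python: the pattern `RECEIVED_RE` and the function `parse_received` are hypothetical names invented for this illustration, and the pattern is written to fit the layout of the sample lines in this document. Real Received: lines vary a good deal from one mail program to another, so a line that does not fit simply yields no result.

```python
import re

# Pattern for the layout used in this document's examples only; other mail
# programs write Received: lines differently, so treat this as illustrative.
RECEIVED_RE = re.compile(
    r"from\s+(?P<helo>\S+)\s+"                                  # name the sender gave
    r"\((?:(?P<rdns>[^\s\[\]]+)\s+)?\[(?P<ip>[\d.]+)\]\)\s+"    # looked-up name, IP
    r"by\s+(?P<by>\S+)"                                         # receiving machine
    r"(?:\s+\((?P<version>[^)]*)\))?"                           # software version
    r"(?:\s+with\s+(?P<protocol>\S+))?"                         # SMTP or ESMTP
    r"(?:\s+id\s+(?P<id>\S+))?"                                 # local queue ID
    r"(?:\s+for\s+<(?P<recipient>[^>]+)>)?"                     # envelope recipient
    r"\s*;\s*(?P<date>.+)$"                                     # time of the handoff
)


def parse_received(value):
    """Split the value of one Received: header into named parts.

    Returns a dict of parts, or None if the line does not fit the pattern.
    """
    value = " ".join(value.split())  # join continuation lines into one
    match = RECEIVED_RE.match(value)
    return match.groupdict() if match else None


example = (
    "from mail.bieberdorf.edu (mail.bieberdorf.edu [124.211.3.78]) "
    "by mailhost.immense-isp.com (8.8.5/8.7.2) with ESMTP id LAA20869 "
    "for <tmh@immense-isp.com>; Tue, 18 Mar 1997 14:39:24 -0800 (PST)"
)
fields = parse_received(example)
for name, text in fields.items():
    print(f"{name:10} {text}")
```

Parts that a given line does not contain (for instance, the "with" and "for" parts of the line written by mail.bieberdorf.edu) come back as None rather than causing an error.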

Mail Protocols

This section is a little more technical than the others, and focuses on the details of how mail gets from one point to another. You don't need to understand every word, but familiarity with this subject can do a lot to clarify what's happening in strange situations. Since email spammers often intentionally create such strange situations (partly to confuse their victims), the ability to understand those situations can be quite helpful.

To communicate over a network, computers often use "points of entry" called ports; you might think of a port as a channel through which a computer can listen to communications from the network. To listen to many communications at once, a computer needs to have multiple ports; to distinguish them, they're generally numbered. On systems connected to the Internet (or any systems using the same protocols for email), port 25 is of particular importance for the present discussion; that's the port that's used to transmit and receive mail.

Normal Behavior

Let's return to the example of the last section, and specifically to the point where mail.bieberdorf.edu communicates with mailhost.immense-isp.com. What really happens here is that mail.bieberdorf.edu opens a connection to port 25 of mailhost.immense-isp.com, and sends the mail through that connection, along with some administrative data. The commands it uses to do this, and the responses issued by the receiving system, are more or less human-readable; they're commands in a rudimentary language called SMTP, for Simple Mail Transfer Protocol. Someone eavesdropping on the "conversation" between the machines would see something like the following transcript (lines beginning with a three-digit code are responses from mailhost.immense-isp.com; everything else is sent by mail.bieberdorf.edu):

220 mailhost.immense-isp.com ESMTP Sendmail 8.8.5/1.4/8.7.2/1.13; Tue, Mar 18 1997 14:38:58 -0800 (PST)
HELO mail.bieberdorf.edu
250 mailhost.immense-isp.com Hello mail.bieberdorf.edu [124.211.3.78], pleased to meet you
MAIL FROM: rth@bieberdorf.edu
250 rth@bieberdorf.edu... Sender ok
RCPT TO: tmh@immense-isp.com
250 tmh@immense-isp.com... Recipient ok
DATA
354 Enter mail, end with "." on a line by itself
Received: from alpha.bieberdorf.edu (alpha.bieberdorf.edu [124.211.3.11]) by mail.bieberdorf.edu (8.8.5) id 004A21; Tue, Mar 18 1997 14:36:17 -0800 (PST)
From: rth@bieberdorf.edu (R.T. Hood)
To: tmh@immense-isp.com
Date: Tue, Mar 18 1997 14:36:14 PST
Message-Id: <rth031897143614-00000298@mail.bieberdorf.edu>
X-Mailer: Loris v2.32
Subject: Lunch today?

Do you have time to meet for lunch?

--rth
.
250 LAA20869 Message accepted for delivery
QUIT
221 mailhost.immense-isp.com closing connection

This whole transaction depends on five commands which constitute the core of SMTP (there are a few others, but they're peripheral to the actual process of passing mail from one place to another): HELO, MAIL FROM, RCPT TO, DATA, and QUIT.

HELO identifies the sending machine; "HELO mail.bieberdorf.edu" should be read as "Hello, I'm mail.bieberdorf.edu". The sender can lie; nothing, in principle, prevents mail.bieberdorf.edu from saying "Hello, I'm frobozz.xyzzy.gov" (HELO frobozz.xyzzy.gov) or even "Hello, I'm a misconfigured computer" (HELO a misconfigured computer). However, in most circumstances, the receiver has some tools with which to discover this and find out the sending machine's real identity.

MAIL FROM initiates mail processing; it means "I have mail to deliver from so-and-so". The address given turns into the so-called "envelope From" (see Section Whatever); it need not be the same as the sender's own address! This apparent security hole is inevitable (after all, the receiving machine doesn't know anything about who has what username on the sending machine), and in certain circumstances it turns out to be a useful feature.

RCPT TO is dual to MAIL FROM; it specifies the intended recipient of the mail. One piece of mail can be sent to multiple recipients simply by including multiple RCPT TO commands (see the section on mail relaying, which explains how this feature is sometimes abused on insecure systems). The given address turns into the so-called "envelope To" (see Section Whatever); it actually determines who the mail will be delivered to, regardless of what the To: line in the message says.

DATA starts the actual mail entry. Everything entered after a DATA command is considered part of the message; there are no restrictions on its form. Lines at the beginning of the message (before the first blank line) that start with a single word and a colon are considered to be headers by most mail programs. A line consisting only of a period terminates the message.

QUIT terminates the connection.

SMTP is fully defined in RFC 821. Copies of the RFCs are widely available on the Web; this one is well worth reading, as it sheds much light on the intricacies of mail processing.
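
To make the order of the five commands concrete, here is a minimal sketch in Python that builds the lines a sending machine would transmit for one message. The function name `smtp_client_commands` is made up for this illustration, and nothing is actually sent over a network. The sketch also assumes that no line of the message consists of a single period; real SMTP has an escaping rule for that case, which is left out here.

```python
def smtp_client_commands(helo_name, envelope_from, envelope_to, message):
    """Return, in order, the lines a sending machine would transmit.

    envelope_to is a list, since one message may have several recipients.
    Assumption: no line of the message is a lone "." (escaping is omitted).
    """
    lines = [f"HELO {helo_name}", f"MAIL FROM: {envelope_from}"]
    # One RCPT TO command per recipient.
    lines += [f"RCPT TO: {address}" for address in envelope_to]
    lines.append("DATA")
    lines += message.splitlines()  # headers, a blank line, then the body
    lines.append(".")              # a line with only a period ends the message
    lines.append("QUIT")
    return lines


commands = smtp_client_commands(
    "mail.bieberdorf.edu",
    "rth@bieberdorf.edu",
    ["tmh@immense-isp.com"],
    "Subject: Lunch today?\n\nDo you have time to meet for lunch?\n",
)
print("\n".join(commands))
```

Note that the HELO name and both addresses are plain arguments: the function sends whatever it is given, which is exactly why none of them can be taken at face value by the receiver.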

Unusual Scenarios

The scenario above is a little bit oversimplified. The biggest assumption is that the mail servers of the two organizations involved have free access to one another. This was almost always true in the early days of the Internet, and it's still sometimes the case today, but as security has become a greater concern, and as organizations and networks have gotten bigger, sometimes requiring many separate mail servers, it's become more and more unusual.

Firewalls

Many, perhaps most, organizations with computers on the Internet are protected by some kind of firewall. A firewall is just a computer whose primary job is to act as a gatekeeper between an organization's own machines and the great unwashed world of the net (so that, for instance, crackers can't easily connect to a piece of IBM's corporate network and start stealing corporate secrets). From the standpoint of another computer trying to deliver mail to a system behind a firewall, what this means is that you can't talk directly to the system; you have to talk to the firewall.

No surprises here; this just introduces another "hop" in the journey of a piece of email, with the firewall acting as just another machine that passes mail. The picture above might be modified to look like this:

Illustration.

If immense-isp.com had a firewall in place, here's what the headers from our sample piece of email might look like. Notice the first Received: line. (I'm assuming that the firewall machine is named firewall.immense-isp.com; in fact, giving a machine a name like "firewall" is tantamount to inviting every teenage cracker-wannabe in the world to try to break in, so firewalls usually have perfectly ordinary, innocuous names.)

Received: from firewall.immense-isp.com (firewall.immense-isp.com [121.214.13.129]) by mailhost.immense-isp.com (8.8.5/8.7.2) with ESMTP id LAA20869 for <tmh@immense-isp.com>; Tue, 18 Mar 1997 14:40:11 -0800 (PST)
Received: from mail.bieberdorf.edu (mail.bieberdorf.edu [124.211.3.78]) by firewall.immense-isp.com (8.8.3/8.7.1) with ESMTP id LAA20869 for <tmh@immense-isp.com>; Tue, 18 Mar 1997 14:39:24 -0800 (PST)
Received: from alpha.bieberdorf.edu (alpha.bieberdorf.edu [124.211.3.11]) by mail.bieberdorf.edu (8.8.5) id 004A21; Tue, Mar 18 1997 14:36:17 -0800 (PST)
From: rth@bieberdorf.edu (R.T. Hood)
To: tmh@immense-isp.com
Date: Tue, Mar 18 1997 14:36:14 PST
Message-Id: <rth031897143614-00000298@mail.bieberdorf.edu>
X-Mailer: Loris v2.32
Subject: Lunch today?

In similar fashion, if all outgoing mail from bieberdorf.edu were routed through a firewall, there would be another Received: line inserted by that firewall machine. By the same token, there might be machines involved that aren't strictly firewalls, but simply common points for routing---perhaps immense-isp.com maintains machines in many physical locations, with several separate mailservers, and uses a single machine (called, say, mailgate.immense-isp.com) to decide which server incoming mail should be routed to. Hence the following set of headers is a little extreme, but not implausible:

Received: from mailgate.immense-isp.com (mailgate.immense-isp.com [121.214.11.102]) by mailhost3.immense-isp.com (8.8.5/8.7.2) with ESMTP id LAA30141 for <tmh@immense-isp.com>; Tue, 18 Mar 1997 14:41:08 -0800 (PST)
Received: from firewall.immense-isp.com (firewall.immense-isp.com [121.214.13.129]) by mailgate.immense-isp.com (8.8.5/8.7.2) with ESMTP id LAA20869 for <tmh@immense-isp.com>; Tue, 18 Mar 1997 14:40:11 -0800 (PST)
Received: from firewall.bieberdorf.edu (firewall.bieberdorf.edu [124.211.4.13]) by firewall.immense-isp.com (8.8.3/8.7.1) with ESMTP id LAA28874 for <tmh@immense-isp.com>; Tue, 18 Mar 1997 14:39:34 -0800 (PST)
Received: from mail.bieberdorf.edu (mail.bieberdorf.edu [124.211.3.78]) by firewall.bieberdorf.edu (8.8.5) with ESMTP id LAA61271; Tue, 18 Mar 1997 14:39:08 -0800 (PST)
Received: from alpha.bieberdorf.edu (alpha.bieberdorf.edu [124.211.3.11]) by mail.bieberdorf.edu (8.8.5) id 004A21; Tue, Mar 18 1997 14:36:17 -0800 (PST)
From: rth@bieberdorf.edu (R.T. Hood)
To: tmh@immense-isp.com
Date: Tue, Mar 18 1997 14:36:14 PST
Message-Id: <rth031897143614-00000298@mail.bieberdorf.edu>
X-Mailer: Loris v2.32
Subject: Lunch today?

The history of the message can be reconstructed by reading the Received: headers from bottom to top; it went from alpha.bieberdorf.edu to mail.bieberdorf.edu to firewall.bieberdorf.edu to firewall.immense-isp.com to mailgate.immense-isp.com to mailhost3.immense-isp.com, where it waits for tmh to come along and read it.
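
The bottom-to-top reading just described can be sketched in a few lines of Python using the standard `email` package. The function name `received_route` and the shortened sample message are inventions for this illustration, and the sketch assumes honest Received: lines in the "from ... by ..." layout used in this document; it simply lists the names it finds and makes no attempt to detect forgery.

```python
import re
from email import message_from_string


def received_route(raw_message):
    """List the hosts a message passed through, oldest hop first."""
    message = message_from_string(raw_message)
    route = []
    # Each machine adds its line at the top, so reverse to get time order.
    for value in reversed(message.get_all("Received", [])):
        hop = re.search(r"from\s+(\S+).*?\bby\s+(\S+)", value, re.DOTALL)
        if hop is None:
            continue  # a line in some other layout; skipped in this sketch
        sender, receiver = hop.groups()
        if not route:
            route.append(sender)  # the machine where the message started
        route.append(receiver)
    return route


sample = """\
Received: from firewall.immense-isp.com (firewall.immense-isp.com [121.214.13.129])
 by mailhost.immense-isp.com (8.8.5/8.7.2) with ESMTP id LAA20869
 for <tmh@immense-isp.com>; Tue, 18 Mar 1997 14:40:11 -0800 (PST)
Received: from mail.bieberdorf.edu (mail.bieberdorf.edu [124.211.3.78])
 by firewall.immense-isp.com (8.8.3/8.7.1) with ESMTP id LAA20869
 for <tmh@immense-isp.com>; Tue, 18 Mar 1997 14:39:24 -0800 (PST)
Received: from alpha.bieberdorf.edu (alpha.bieberdorf.edu [124.211.3.11])
 by mail.bieberdorf.edu (8.8.5) id 004A21; Tue, Mar 18 1997 14:36:17 -0800 (PST)
From: rth@bieberdorf.edu (R.T. Hood)
To: tmh@immense-isp.com
Subject: Lunch today?

Do you have time to meet for lunch?
"""
print(" -> ".join(received_route(sample)))
```

The sample uses the single-firewall headers shown earlier, with each Received: line wrapped onto continuation lines the way mail programs commonly store them.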

Relaying

Here are some possible headers from a message that had a very different "life cycle" than anything described so far:

Received: from unwilling.intermediary.com (unwilling.intermediary.com [98.134.11.32]) by mail.bieberdorf.edu (8.8.5) id 004B32 for <rth@bieberdorf.edu>; Wed, Jul 30 1997 16:39:50 -0800 (PST)
Received: from turmeric.com ([104.128.23.115]) by unwilling.intermediary.com (8.6.5/8.5.8) with SMTP id LAA12741; Wed, Jul 30 1997 19:36:28 -0500 (EST)
From: Anonymous Spammer <junkmail@turmeric.com>
To: (recipient list suppressed)
Message-Id: <w45qxz23-34ls5@unwilling.intermediary.com>
X-Mailer: Massive Annoyance
Subject: WANT TO MAKE ALOT OF MONEY???

A variety of things in this header might clue the reader in to the fact that this is a piece of electronic junk mail, but the thing to focus on here is the Received: lines. This message originated at turmeric.com, was passed from there to unwilling.intermediary.com, and from there to its final destination at mail.bieberdorf.edu. All well and good; but how did unwilling.intermediary.com get there, since it has nothing to do with either the sender or the recipient?

Understanding the answer requires some knowledge of SMTP. In essence, turmeric.com simply connected to the SMTP port at unwilling.intermediary.com and told it "Send this message to rth@bieberdorf.edu". It did this, probably, in the most direct manner imaginable, by saying RCPT TO: rth@bieberdorf.edu. At that point, unwilling.intermediary.com took over processing the message; since it had been told to send it to a user at some other domain (bieberdorf.edu), it went out and found the mail server for that domain and handed off its mail in the usual manner. This process is known as mail relaying.

Historically, there are good reasons for allowing relaying; on much of the net until about the late 1980s, machines rarely sent mail by talking directly to each other. Rather, they worked out a route for a message to travel, and sent it step by step along that route. It was a cumbersome system (especially since the sender often had to work out the route by hand!). By way of analogy, imagine sending a letter from San Francisco to New York, and having to address the envelope thus:

San Francisco, Sacramento, Reno, Salt Lake City, Rock Springs, Laramie, North Platte, Lincoln, Omaha, Des Moines, Cedar Rapids, Dubuque, Rockford, Chicago, Gary, Elkhart, Fort Wayne, Toledo, Cleveland, Erie, Elmira, Williamsport, Newark, New York City, Greenwich Village, #12 Desolation Row, Apt. #35, R.A. Zimmermann

It's clear why this is a useful addressing model if you're a postal worker---the post office in Gary, Indiana only has to be able to communicate with the adjacent offices in Chicago and Elkhart, rather than having to devote its resources to figuring out how to get something to New York. (It's also clear why this isn't a good idea from the standpoint of the letter-writer, and why email is no longer commonly routed this way!) This is exactly how email was sent; so it was important that one machine be able to give another instructions that said "I have email for rth@bieberdorf.edu, to be sent from you to turmeric.com to galangal.org to asafoetida.com to bieberdorf.edu". Hence relaying.

In modern times, however, relaying is usually used by unethical advertisers as a technique for concealing the source of their messages, deflecting complaints to the (innocent) relay site rather than to the spammers' own ISPs. (It also offloads the work of processing addresses and contacting recipients from the spammers' machines to those of an uninvolved third party; it's widely felt that relaying, especially large-scale relaying, constitutes theft of service for that reason.) The essential point here is to realize that the content of the message was formulated at the sending point---turmeric.com in the example above; the intermediate link, unwilling.intermediary.com, is involved only as an unwilling intermediary. They have no control over the sender, much as the Gary post office has no real influence over someone writing letters in San Francisco. (They do, however, have the power to turn off relaying at their site!)

One more thing to notice in the sample headers: The Message-Id: line was filled in, not by the sending machine (turmeric.com), but by the relayer (unwilling.intermediary.com). This is a common feature of relayed mail; it just reflects the fact that the sending machine didn't supply a Message-Id.

Envelope Headers

The section on SMTP, above, alluded to a distinction between "message" and "envelope" headers. This distinction and some of its consequences are detailed here.

Briefly, the "envelope" headers are actually generated by the machine that receives a message, rather than by the sender. By this definition, Received: headers are envelope headers; however, the term usually refers to the "envelope From" and "envelope To" only.

The envelope From header is the header derived from the information in a MAIL FROM command. For instance, if a sending machine says MAIL FROM: ginger@turmeric.com, the receiving machine will generate an envelope From header that looks like this:

From ginger@turmeric.com

Notice the absence of the colon---"From", not "From:". Frequently, envelope headers don't have colons after them; this convention is not universal, but it is common enough to pay attention to.

Symmetrically, the envelope To is derived from a RCPT TO command. If the sender says RCPT TO: tmh@bieberdorf.edu, then the envelope To is tmh@bieberdorf.edu. There often isn't an actual header containing this information; sometimes it's embedded in the Received: headers.

An important consequence of the existence of envelope information is that the message From: and To: headers are meaningless. The contents of the From: header are provided by the sender; and so, counterintuitively, are the contents of the To: header. Mail is routed only based on the envelope To, never based on the message To: header.

To see this in action, consider an SMTP transaction like this:

HELO galangal.org
250 mail.bieberdorf.edu Hello turmeric.com [104.128.23.115], pleased to meet you
MAIL FROM: forged-address@galangal.org
250 forged-address@galangal.org... Sender ok
RCPT TO: tmh@bieberdorf.edu
250 tmh@bieberdorf.edu... Recipient OK
DATA
354 Enter mail, end with "." on a line by itself
From: another-forged-address@lemongrass.org
To: (your address suppressed for stealth mailing and annoyance)
.
250 OAA08757 Message accepted for delivery

Here are the corresponding headers (excerpted for clarity):

From forged-address@galangal.org
Received: from galangal.org ([104.128.23.115]) by mail.bieberdorf.edu (8.8.5) for <tmh@bieberdorf.edu>...
From: another-forged-address@lemongrass.org
To: (your address suppressed for stealth mailing and annoyance)

Notice that the contents of the envelope From, the message From:, and the message To: are all dictated by the sender, and have no bearing whatsoever on reality! This example illustrates why the From, From:, and To: headers can never be trusted in mail that might be forged; they're simply too easy to falsify.
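
The separation between envelope and message can be made concrete with a toy receiver. The Python sketch below is not how any real mail server is written; the function name `receive` is hypothetical, it handles only the commands discussed in this document, and it takes the sender's lines as a list instead of reading them from a network connection.

```python
def receive(client_lines):
    """Replay the lines sent by a client and return what the receiver learns.

    Returns (envelope_from, envelope_to, headers), where headers holds the
    "Name: value" lines found before the first blank line of the message.
    """
    envelope_from, envelope_to, message, in_data = None, [], [], False
    for line in client_lines:
        if in_data:
            if line == ".":          # a lone period ends the message
                in_data = False
            else:
                message.append(line)
        elif line.upper().startswith("MAIL FROM:"):
            envelope_from = line[len("MAIL FROM:"):].strip()
        elif line.upper().startswith("RCPT TO:"):
            envelope_to.append(line[len("RCPT TO:"):].strip())
        elif line.upper() == "DATA":
            in_data = True
    headers = {}
    for line in message:
        if not line:
            break                    # blank line: headers are over
        name, _, value = line.partition(":")
        headers[name] = value.strip()
    return envelope_from, envelope_to, headers


forged = [
    "HELO galangal.org",
    "MAIL FROM: forged-address@galangal.org",
    "RCPT TO: tmh@bieberdorf.edu",
    "DATA",
    "From: another-forged-address@lemongrass.org",
    "To: (your address suppressed for stealth mailing and annoyance)",
    ".",
    "QUIT",
]
envelope_from, envelope_to, headers = receive(forged)
print("deliver to:", envelope_to)      # taken from RCPT TO
print("To: header:", headers["To"])    # whatever the sender typed
```

The list of delivery addresses is built only from the RCPT TO lines; the To: header is never consulted, so the two can disagree completely without the receiver noticing.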

The Importance of Received: Headers

We've seen already, in the examples above, that the Received: headers provide a detailed log of a message's history, and so make it possible to draw some conclusions about the origin of a piece of email even when other headers have been forged. This section explores some details associated with these singularly important headers, and in particular how to circumvent common forgery techniques.

Unquestionably, the single most valuable forgery protection in the Received: headers is the information logged by the receiving host from the sender. Recall that the sender can lie about its identity (by putting garbage in its HELO command to the receiver); fortunately, modern mail transfer programs are able to detect such false information and correct it.

If, for instance, the machine turmeric.com, whose IP address is 104.128.23.115, sends a message to mail.bieberdorf.edu, but falsely says HELO galangal.org, the resultant Received: line might start like this:

Received: from galangal.org ([104.128.23.115]) by mail.bieberdorf.edu (8.8.5)...

(The rest of the line is omitted for clarity.) Notice that, although the bieberdorf.edu machine doesn't explicitly announce that galangal.org wasn't really the sending machine, it does record the correct IP address of the sender. If someone receiving the mail had reason to think that galangal.org appeared in the headers through the work of a forger, they could look up the IP address 104.128.23.115 (with a tool like the UNIX program nslookup) and find that that address in fact belonged to turmeric.com (not galangal.org). In other words, logging the IP address of the sending machine provides enough information to confirm a suspected forgery.

Many modern mail programs actually automate this process, looking up the name of the sending machine on their own. (The lookup process is called reverse DNS (for Domain Name System)---"reverse" because it reverses the usual process of translating a name to an address for routing purposes.) If mail.bieberdorf.edu were using software that did this, the Received: line would start something like this:

Received: from galangal.org (turmeric.com [104.128.23.115]) by mail.bieberdorf.edu...

Here the forgery is crystal-clear; this line effectively says "turmeric.com, whose address is 104.128.23.115, reported its name as galangal.org". Needless to say, information like this is extremely helpful in identifying and tracking forged email! (For this very reason, spammers try to avoid using relaying machines that report reverse DNS information. Sometimes they even find machines that don't do the kind of IP logging described in the previous paragraph---though there aren't very many of those around on the net any more.)
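
The comparison between the name a sender claimed and the name the receiver looked up can be sketched in Python as follows. The names `FROM_RE`, `check_sender`, and `reverse_lookup` are invented for this illustration, and the pattern assumes the "from NAME (LOOKED-UP-NAME [IP])" layout shown in this document. A mismatch is only a hint worth investigating: a machine can legitimately be known by more than one name.

```python
import re
import socket

# Matches "from CLAIMED (LOOKED-UP [IP])"; the looked-up name is optional
# because some receivers record only the IP address.
FROM_RE = re.compile(
    r"from\s+(?P<claimed>\S+)\s+"
    r"\((?:(?P<looked_up>[^\s\[\]]+)\s+)?\[(?P<ip>[\d.]+)\]\)"
)


def check_sender(received_value):
    """Compare the HELO name with the reverse-DNS name recorded in a line."""
    match = FROM_RE.search(received_value)
    if match is None:
        return None
    info = match.groupdict()
    # Flag only when a looked-up name was recorded and differs from the claim.
    info["mismatch"] = (
        info["looked_up"] is not None
        and info["looked_up"].lower() != info["claimed"].lower()
    )
    return info


def reverse_lookup(ip):
    """Ask DNS for the name registered for an IP address (needs a network)."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None  # no name registered, or the lookup failed


line = "from galangal.org (turmeric.com [104.128.23.115]) by mail.bieberdorf.edu..."
print(check_sender(line))
```

When the receiving machine recorded only the IP address, `reverse_lookup` does the job that nslookup does by hand. It is not called in the example because the addresses in this document are fictitious.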

Another trick used by forgers of email, this one increasingly common, is to add spurious Received: headers before sending the offending mail. This means that the hypothetical email sent from turmeric.com might have Received: lines that looked something like this:

Received: from galangal.org ([104.128.23.115]) by mail.bieberdorf.edu (8.8.5)...
Received: from nowhere by fictitious-site (8.8.3/8.7.2)...
Received: No Information Here, Go Away!

Obviously, the last two lines are complete nonsense, written by the sender and attached to the message before it was sent.

Since the sender has no control over the message once it leaves turmeric.com, and Received: headers are always added at the top, the forged lines have to appear at the bottom of the list. This means that someone reading the lines from top to bottom, tracing the history of the message, can safely throw out anything after the first forged line; even if the Received: lines after that point look plausible, they're guaranteed to be forgeries.

Of course, the sender doesn't have to use obvious garbage; a really devious forger could create a plausible list of Received: lines like this:

Received: from galangal.org ([104.128.23.115]) by mail.bieberdorf.edu (8.8.5)...
Received: from lemongrass.org by galangal.org (8.7.3/8.5.1)...
Received: from graprao.com by lemongrass.org (8.6.4)...

Here the only dead giveaway is the inaccurate IP address for galangal.org in the very first Received: line. The forgery would be still harder to detect if the forger had written in correct IP addresses for lemongrass.org and graprao.com, but the IP mismatch in the first line would still reveal that the message had been forged and "injected" into the network at the site 104.128.23.115 (i.e., turmeric.com). However, most header forgeries are considerably less sophisticated, and the extra Received: lines are obvious garbage.
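
One way to apply the top-to-bottom reading is to believe only the Received: lines written by machines you have reason to trust (your own mail servers, say) and to stop at the first line that is not. The Python sketch below does this; the names `HOP_RE` and `injection_point`, and the idea of passing in a set of trusted host names, are assumptions made for this illustration and are not taken from any particular mail program. It treats the IP address recorded by the lowest believable line as the point where the message entered the trusted machines.

```python
import re

HOP_RE = re.compile(
    r"from\s+(?P<claimed>\S+)\s+"
    r"\((?:(?P<looked_up>[^\s\[\]]+)\s+)?\[(?P<ip>[\d.]+)\]\)"
    r"\s+by\s+(?P<by>\S+)"
)


def injection_point(received_values, trusted_hosts):
    """Return the sender IP recorded by the lowest believable Received: line.

    received_values is ordered top to bottom, as in the message;
    trusted_hosts is a set of lowercase host names (an assumption of this
    sketch: the reader knows which machines handle their own mail).
    """
    last_ip = None
    expected_by = None  # host that the line above says it received from
    for value in received_values:
        hop = HOP_RE.search(value)
        if hop is None:
            break       # not in a recognizable layout: stop believing
        by_host = hop.group("by").lower()
        if by_host not in trusted_hosts:
            break       # written by a machine we know nothing about
        if expected_by is not None and by_host != expected_by:
            break       # uses a trusted name, but the chain does not lead here
        last_ip = hop.group("ip")
        # Only a looked-up name is used to continue the chain; a bare HELO
        # name could have been invented by the sender.
        expected_by = (hop.group("looked_up") or "").lower()
    return last_ip


forged = [
    "from galangal.org ([104.128.23.115]) by mail.bieberdorf.edu (8.8.5)...",
    "from lemongrass.org by galangal.org (8.7.3/8.5.1)...",
    "from graprao.com by lemongrass.org (8.6.4)...",
]
print(injection_point(forged, {"mail.bieberdorf.edu"}))
```

This is deliberately conservative and still not foolproof: a sender who controls the reverse DNS records for their own address could imitate a trusted name, so the result is a lead to follow up and not proof.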

List of Common Headers


© 1997 Ken Lucke - all rights reserved

GEGBwerYvKNLn74bOG7hxJetviBDR74QWnBUeB6auvnz6DNgf/xMe4N8PPItmXLEBPkB6fPrvz6c PbTVVzSjBBNeVPFDEffdJsCC+i0oQG4P7ifbeqfY818DVGCRQgIOgJFFD0AkOJsAW2yR34Mlmoib ivtRqItkYqwmxRM0iLBBF1wgQYSIspGYgY8M9hhkkD+y6GCPPw5J5P+QP97m4kcXkuBBElE4EYYR UJTAY5MoyqZil1t0qWCPJZJJ4pdlNvmll7M9aeEzq0UwBAs3xPCCEDlMsOWZXy4YpokkPujnbGgW GSigfxqa5J8sqtdfhTCutgAEJ8ygQgsgWOBcA9CROWIGJiZ6aIoKivonkKbymaiTj74YzWuwvtZp kbgNuqiQQqIoKKiHGooorj6yWk8rxLZS3m2NgppmqF4uWyafbKrILKjNTttmq/Nogp95DCYLobcS PvnZuJUZgF6a0oHbIrbbSujuu9OJC++89PKXDAZB5Kvvvvz26++/AAcs8MD6YpvtwQhna4olDDfs 8MMQRyzxxBRXbPELxRhnrPHGDQPyFQgAOw== ------=_NextPart_000_0000_01C32AB0.CE88AC20--