=for comment $Id: avenger.pod,v 1.49 2006/02/16 03:15:34 dm Exp $

=head1 NAME

avenger - Mail Avenger

=head1 DESCRIPTION

Mail Avenger is a highly-configurable MTA-independent SMTP (Simple
Mail Transfer Protocol) server designed to let you filter and fight
spam I<before> accepting incoming mail from a client machine.
F<avenger> is the script run on behalf of each user to decide whether
to accept incoming mail.

When a client attempts to send mail to a user on the system, the
avenger SMTP daemon, asmtpd, runs avenger to process the file
F<.avenger/rcpt> in the user's home directory.  That file, a shell
script with access to special functions, determines how the SMTP
server should proceed.  The possible outcomes are:

=over

=item

Provisionally accept the mail, falling back to system-default rules

=item

Accept the mail immediately with no further checks

=item 

Reject the mail immediately

=item

Defer the mail, telling the client to re-send it later

=item

Redirect the processing to another local name.  The name can be
another email address belonging to the current user, or an email
address belonging to the special B<AvengerUser> user.  In the latter
case, avenger will be re-run with a different user ID, and hence can,
for example, employ utilities that maintain state across multiple
users (assuming they all redirect processing the same way).

=item 

Run a "bodytest" rule.  With this outcome, the SMTP transaction
continues on to receive the entire contents of the mail message, after
which a program is run on the contents of the mail message.  That
program can decide, based on the contents, whether to accept, reject,
defer, or silently discard the message.

=back

Mail Avenger should typically be configured to have a B<Separator>
character, allowing each user to maintain multiple email addresses.
With sendmail, B<Separator> is typically C<+>; with qmail it is
typically C<->.  If the separator is C<+>, then any email sent to
B<user+>I<ext>B<@your-host> will be processed by files in B<user>'s
F<.avenger> directory.

Avenger first checks for a file named F<rcpt+>I<ext> in a user's
F<.avenger> directory, then for F<rcpt+default>.  If I<ext> itself
contains the separator character, for example
B<user+>I<ext1>B<+>I<ext2>B<@your-host>, avenger will check first for
F<rcpt+>I<ext1>F<+>I<ext2>, then for F<rcpt+>I<ext1>F<+default>, then
for F<rcpt+default>.  The same algorithm is extended for arbitrarily
many separator characters.  (If separator is C<->, simply replace C<+>
with C<-> throughout the above description, including in the names of
files such as F<rcpt-default>.)
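
The lookup order described above can be sketched in plain shell.  The
helper below, C<rcpt_candidates>, is a hypothetical name used only for
illustration and is not part of Mail Avenger; it assumes a non-empty
extension and simply prints the file names in the order they would be
tried.

```shell
# rcpt_candidates EXT [SEP]
# Print candidate rule file names, most specific first.
# EXT is the address extension (assumed non-empty); SEP defaults to "+".
rcpt_candidates() {
    ext=$1 sep=${2:-+}
    echo "rcpt$sep$ext"
    prefix=$ext
    while :; do
        case "$prefix" in
            *"$sep"*)
                # Drop the last component and substitute "default" for it.
                prefix=${prefix%"$sep"*}
                echo "rcpt$sep$prefix${sep}default"
                ;;
            *)
                break
                ;;
        esac
    done
    echo "rcpt${sep}default"
}

rcpt_candidates ext1+ext2
```

For C<ext1+ext2> this prints F<rcpt+ext1+ext2>, F<rcpt+ext1+default>,
and F<rcpt+default>, one per line.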

If mail is rejected by the recipient checks but the sender address of
a message is local and B<UserMail> is 1 in F<asmtpd.conf> (which is
not the default), then before rejecting mail, avenger will be run on
behalf of the sending user.  In this case, the address will be parsed
as above, but avenger will look for rules in files beginning F<mail>
instead of F<rcpt>.  This mechanism can be used by local users who
want to relay mail through the server from an untrusted IP address.

Using the F<mail> configuration files, each user can, for instance,
configure a F<mail+...> file to accept mail from an IP address he or
she trusts, even if that address is not trusted by all users.
(Alternatively, using tools such as macutil, a user might set up
relaying of mail in which the envelope sender contains a cryptographic
code, checked by the F<mail+...> script.)

Error output of an avenger script F<rcpt+>I<ext> or F<mail+>I<ext> is
redirected to a file called F<log+>I<ext> in the same directory, for
use in debugging.

=head1 AVENGER SYNTAX

Avenger configuration files are simply shell scripts, using the syntax
described in L<sh(1)|sh(1)>.  Each line of the file contains a
variable assignment, command, or function to run.  Scripts can
additionally make use of a number of avenger-specific functions and
variables.  This section describes avenger functions.  The next two
sections describe variables.

=over

=item B<errcheck>

Certain error conditions result in Mail Avenger rejecting mail by
default, unless the message is explicitly accepted through an
B<accept> or successful B<bodytest> check.  These conditions are
indicated by the B<MAIL_ERROR> environment variable described below.
If your script either rejects mail or falls through to the default
behavior, there is often no reason to run tests on a message that will
end up being rejected either way.  B<errcheck> exits immediately with
the default error if the default would be to reject or defer the mail.

=item B<accept> [I<message>]

Immediately accepts the message (without falling back to any default
rules).  If I<message> is supplied, it will be returned to the SMTP
client.  The default message is C<ok>.

=item B<reject> [I<message>]

Reject the mail, with I<message>.  (The default message is C<command
rejected for policy reasons>).

=item B<defer> [I<message>]

Reject the mail with a temporary error code, so that a legitimate mail
client will attempt to re-send it later.  The default for I<message>
is C<temporary error in processing>.

=item B<bodytest> I<command> [I<arg> ...]

Accept the current SMTP C<RCPT> command.  However, once the whole mail
message has been received with the SMTP C<DATA> command, run
I<command> with the message as its standard input.  Depending on the
exit status of I<command>, reply to the client's C<DATA> command with
success, a temporary failure, or a permanent failure.  Exit code 0 means
accept the mail, 100 means reject, 111 means reject with a temporary
error code (i.e., defer the mail).  See the description of B<bodytest>
in the asmtpd/avenger interface description for more information on
B<bodytest> (since this function directly invokes B<bodytest> in
asmtpd).

Error output from I<command> will be redirected to the same log file
as output from the F<rcpt+...> avenger script invoking the B<bodytest>
function.  Standard output of I<command> will be included as a
diagnostic in the bounce message if the exit code defers or rejects the
mail.

Note that I<command> and the arguments passed to B<bodytest> will be
run by the shell.  Thus, it is important not to pass any arguments
that might contain shell metacharacters such as C<E<gt>> and C<$>.
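
One way to guard against this is to check a value before handing it to
B<bodytest>.  The sketch below uses a hypothetical helper,
C<is_shell_safe>, that accepts only a conservative set of characters;
the exact set is an assumption and may be stricter than you need.

```shell
# is_shell_safe STRING
# Succeed only if STRING consists solely of characters that the shell
# does not treat specially.  An empty string is considered safe.
is_shell_safe() {
    case "$1" in
        *[!A-Za-z0-9_.@%+=:,/-]*) return 1 ;;
        *) return 0 ;;
    esac
}

is_shell_safe 'alice@example.com' && echo "safe"
is_shell_safe 'a$(reboot)@example.com' || echo "unsafe"
```

In a rule file, a value such as C<$SENDER> could be tested this way
before it appears in the arguments of a B<bodytest> command.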

=item B<redirect> I<local>

Finish processing, and re-run avenger as if mail were being sent to a
different username I<local> (possibly belonging to the special
B<AvengerUser> user).  See the description of B<redirect> in the
asmtpd/avenger interface description for more information on
B<redirect> (since this function directly invokes B<redirect> in
asmtpd).

=item B<greylist> [I<sender-key>]

This command defers mail the first time mail is received from a
particular sender at a particular IP address.  However, after a
certain interval, B<greylist_delay>, if the client re-sends the mail,
it will be accepted.  Furthermore, from that point on, all mail will
be immediately accepted from that sender and IP address, unless the
sender stops sending mail for a period of B<greylist_ttl2> or more.
If, however, after sending the initial, deferred piece of mail, the
client does not try again within a period of B<greylist_ttl1>, then
any record of the client will be erased, and the next time it tries to
send mail it will be deferred again.

The parameters can be tuned by setting variables in the script.  The
default values are:

    greylist_delay=30m  # Time to wait before allowing message
    greylist_ttl1=5h    # How long to remember first-time senders
    greylist_ttl2=36D   # How long to remember ok senders

B<m> means minutes, B<h> hours, and B<D> days.  For a complete list of
allowed suffixes, see the documentation for L<dbutil(1)|dbutil(1)> (in
particular for the B<--expire> option).

I<sender-key>, if supplied, is used to identify the sender.  The
default value is C<$CLIENT_IP $RECIPIENT $SENDER>.  If, for example,
you wanted to record only the first 24 bits of the IP address and
didn't care about the recipient, you could use the command:

=over

B<greylist "${CLIENT_IP%.*} $SENDER">

=back
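
The C<${CLIENT_IP%.*}> expansion removes the shortest trailing match of
C<.*>, that is, the last octet.  The fragment below shows the resulting
key; the address and sender are example values set by hand here, whereas
in a real rule file asmtpd supplies them.

```shell
CLIENT_IP=192.0.2.45          # example value; normally set by asmtpd
SENDER=alice@example.com      # example value; normally set by asmtpd

key="${CLIENT_IP%.*} $SENDER" # strip the last octet of the address
echo "$key"                   # prints: 192.0.2 alice@example.com
```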

=item B<setvars>

All functions that set a variable by means of an external query to
asmtpd are performed asynchronously.  B<setvars> actually waits for
results and sets the values of those variables.  In this way, a number
of potentially slow requests (such as DNS lookups) can be initiated
concurrently, and their latencies overlapped.  However, one must
remember to call B<setvars>, or else variables that should contain the
results of operations will remain unset.

=item B<dns> I<var> I<type> I<domain-name>

Performs a DNS lookup of I<domain-name> for records of type I<type>,
and assigns the result to variable I<var> when you call B<setvars>.
I<type> must be one of B<a>, B<mx>, B<ptr>, or B<txt> (lower-case
only).

=item B<rbl> [B<-ipf>] I<var> I<domain>

Looks up the current mail sender in a real-time blackhole list (RBL).
I<domain> is the domain name of the RBL (e.g., C<bl.spamcop.net>).  If
the sender is listed, set I<var> to the result of the DNS lookup when
you next call B<setvars>.  B<-i> looks up the sender's IP address (the
default if no options are specified).  B<-p> looks up the sender's
domain name (verified DNS PTR record).  B<-f> looks up the envelope
sender domain name in the RBL.

=item B<spf0> I<var> [I<spf-mechanism> ...]

=item B<spf> I<var> [I<spf-mechanism> ...]

Tests the sender against an arbitrary query formulated in the SPF
language.  This is a powerful way to whitelist or blacklist particular
senders.  For example, suppose you want to accept any mail from
machines in the list maintained by trusted-forwarder.org, accept mail
from any machine name ending C<yahoo.com>, reject any mail from users
in the spamcop RBL, and for other users fall back to the default
system-wide rules.  You might use the following F<rcpt> file:

    spf MYSPF +include:spf.trusted-forwarder.org \
        +ptr:yahoo.com -exists:%{ir}.bl.spamcop.net ?all
    setvars
    case "$MYSPF" in
        pass)
            accept "I like you"
            ;;
        fail)
            reject "I don't like you"
            ;;
        error)
            # Note, could instead fall through to default here
            defer "Temporary DNS error"
            ;;
    esac

Note that commands B<spf0> and B<spf> are synonymous, but B<spf> is
deprecated, because in a later release of Mail Avenger B<spf> will
become synonymous with B<spf1>.

=item B<spf1> I<var> [I<spf-mechanism> ...]

Performs the same tests as the B<spf> directive, but returns the
result strings B<None>, B<Neutral>, B<Pass>, B<Fail>, B<SoftFail>,
B<TempError>, and B<PermError> instead of B<none>, B<neutral>,
B<pass>, B<fail>, B<softfail>, B<error>, and B<unknown>.
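
If a script written for B<spf0> results needs to work with B<spf1>
names, the two vocabularies can be translated.  The sketch below
assumes the names correspond in the order listed above (so B<error>
maps to B<TempError> and B<unknown> to B<PermError>); C<spf0_to_spf1>
is a hypothetical helper, not a Mail Avenger function.

```shell
# spf0_to_spf1 RESULT
# Translate an spf0-style result name to its spf1-style equivalent.
# Fails (returns 1) for a name it does not recognize.
spf0_to_spf1() {
    case "$1" in
        none)     echo None ;;
        neutral)  echo Neutral ;;
        pass)     echo Pass ;;
        fail)     echo Fail ;;
        softfail) echo SoftFail ;;
        error)    echo TempError ;;
        unknown)  echo PermError ;;
        *)        return 1 ;;
    esac
}

spf0_to_spf1 softfail
```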

=back

=head1 AVENGER VARIABLES

These variables are set by the avenger script.  In addition, asmtpd
sets a number of environment variables before running avenger.  These
are documented in the next section, ENVIRONMENT.

=over

=item B<FILEX>

The extension on the file currently being processed.  For example, if
file F<rcpt+ext> is being processed, B<FILEX> will be set to C<+ext>.  Empty
when processing just F<rcpt> (or F<mail>).  May also contain
F<default> when a default rule file for some suffix is being run.

=item B<PREFIX>

=item B<SUFFIX>

Assuming the separator is C<+>, when processing a file
F<rcpt+base+default> or F<mail+base+default>, B<PREFIX> is set to
F<base>, while B<SUFFIX> is set to the portion of the name for which
F<default> was substituted.  When the file does not end with
F<default>, B<SUFFIX> is empty.  When the file is just F<rcpt> with no
extension, both B<PREFIX> and B<SUFFIX> are empty.  When B<SUFFIX>
itself contains a C<+> character, B<SUFFIX1> contains the part of
B<SUFFIX> after the first C<+> character, B<SUFFIX2> contains the part
after the second C<+>, and so on for each C<+> character in B<SUFFIX>.
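
The relationship between B<SUFFIX> and the numbered variables can be
illustrated with ordinary parameter expansion.  This is only a sketch
of the relationship, with a hand-set example value and a hard-coded
C<+> separator; it is not necessarily how avenger computes them.

```shell
SUFFIX='a+b+c'      # example value; normally set by avenger

rest=$SUFFIX
n=0
while :; do
    case "$rest" in
        *+*) ;;         # another separator remains
        *) break ;;
    esac
    rest=${rest#*+}     # drop everything up to the first "+"
    n=$((n + 1))
    eval "SUFFIX$n=\$rest"
done

echo "$SUFFIX1 $SUFFIX2"    # prints: b+c c
```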


=back

=head1 ENVIRONMENT

=over

=item B<AUTH_USER>

If Mail Avenger was compiled with SASL support (which is not the
default, unless you supplied the B<--enable-sasl> argument to
C<configure>), and if the client successfully authenticates to the
server using SASL, then B<AUTH_USER> will be set to the name of the
authenticated user.

=item B<AVENGER_MODE>

Set to C<rcpt> when testing whether a recipient should receive mail.
Set to C<mail> (possibly after an C<rcpt> check fails) when checking
whether to relay mail (possibly on behalf of a local user).

=item B<AVUSER>

The effective local username for which avenger is being run.
Ordinarily, this will be the same as:

=over

=item $USER${PREFIX+$SEPARATOR}$PREFIX\

=item ${SUFFIX+$SEPARATOR}$SUFFIX

=back

However, for special avenger files like F<unknown> and F<default>, it
can contain useful information, because unlike the B<RECIPIENT_LOCAL>
environment variable, B<AVUSER> reflects substitutions from the
Mail Avenger F<domains> and F<aliases> files.

=item B<CLIENT>

This variable contains the name of the client machine, as typically
reported in "Received:" headers.  Its value has the form:

=over

[I<user>B<@>]I<host>

=back

I<user> is the user name for the connection reported by the client, if
the client supports the RFC 1413 identification protocol, otherwise it
is omitted.  I<host> is a verified DNS hostname for the IP, if asmtpd
could find one.  Otherwise, it is simply the numeric IP address.

=item B<CLIENT_COLONSPACE>

Set to C<1> if the client included a space between the colon in the
command C<MAIL FROM:> or C<RCPT TO:> and the subsequent C<E<lt>> that
begins an email address.

=item B<CLIENT_DNSFAIL>

If B<AllowDNSFail> is set to 1 in the F<asmtpd.conf> file and
resolving the client's IP to a hostname returns a temporary error,
then this variable will be set to a description of the error.

=item B<CLIENT_HELO>

Set to the argument the client supplied to the SMTP C<HELO> or C<EHLO>
command.

=item B<CLIENT_IP>

Set to the IP address of the client.

=item B<CLIENT_NAME>

Set to the verified DNS name of the client, if asmtpd can find one.

=item B<CLIENT_NETHOPS>

Set to the number of network hops between the server and the client,
if asmtpd can get the client or its firewall to return an ICMP
destination unreachable (type 3 packet) in response to a UDP probe.
Whether or not this is set will depend on firewall configurations.

=item B<CLIENT_NETPATH>

Set to as many intermediary network hops as asmtpd can determine
between the server and the client.  How close to the client asmtpd can
probe will depend on firewalls.

=item B<CLIENT_PIPELINING>

Set to C<1> if the client wrote data after the SMTP B<HELO> or B<EHLO>
command, before receiving its response.  A correct SMTP client should
not "pipeline" commands until after receiving the result of the
B<HELO> command and verifying that the server accepts pipelined
commands.

=item B<CLIENT_PORT>

The TCP port number of the client.

=item B<CLIENT_POST>

Set to C<1> if the client sent a C<POST> command at some point during
the SMTP session.  C<POST> is not a valid SMTP command; it is an HTTP
command.  However, one technique for sending spam involves exploiting
an open web proxy to "post" an SMTP session to a mail server.  The
initial HTTP headers (including the HTTP post command) simply cause
SMTP syntax errors, while the body of the POST command contains SMTP
commands.  By checking the B<CLIENT_POST> environment variable, you can
reject mail sent in this way.

=item B<CLIENT_REVIP>

The value of B<CLIENT_IP> with the order of the bytes reversed.
Suitable for prepending to C<.in-addr.arpa> or an RBL domain to
perform a DNS lookup based on IP address.
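
For reference, the reversal itself is easy to reproduce in shell.  The
function below is a hypothetical illustration for IPv4 addresses only;
inside a rule file B<CLIENT_REVIP> is already provided, so there is
normally no need to compute it.

```shell
# reverse_ip A.B.C.D
# Print the IPv4 address with its four octets in reverse order.
# The body runs in a subshell so the IFS change does not leak out.
reverse_ip() (
    IFS=.
    set -- $1
    echo "$4.$3.$2.$1"
)

# Build the name one would query in an RBL (example address).
echo "$(reverse_ip 192.0.2.45).bl.spamcop.net"
```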

=item B<CLIENT_SYNFP>

Contains a fingerprint, abstracting the contents of the initial TCP
SYN packet the client sent to establish the TCP connection.  The exact
contents of SYN packets depend on the operating system and version of
the client, and can therefore reveal interesting information about the
type of client connecting to your mail server.  The format of the
fingerprint is:

=over

I<wwww>B<:>I<ttt>B<:>I<D>B<:>I<ss>B<:>I<OOO>

=back

Where the fields are as follows:

=over

=item I<wwww>

the initial TCP window size

=item I<ttt>

the IP ttl of the received packet

=item I<D>

the IP "don't fragment" bit

=item I<ss>

total size of the SYN packet (including IP header)

=item I<OOO>

a comma-separated list of TCP options, as follows:

=over

=item B<N>

NOP option

=item B<W>I<nnn>

window scaling option with value I<nnn>

=item B<M>I<nnn>

maximum segment size value I<nnn>

=item B<S>

Selective ACK OK

=item B<T>

timestamp option

=item B<T0>

timestamp option with value zero

=back

=back
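
A fingerprint in this format can be split into its fields with
parameter expansion.  In the sketch below, the function names, the
C<SYN_*> variable names, and the sample fingerprint are all made up for
illustration; only the field layout comes from the description above.

```shell
# parse_synfp FINGERPRINT
# Split a wwww:ttt:D:ss:OOO fingerprint into SYN_* variables.
parse_synfp() {
    fp=$1
    SYN_WINDOW=${fp%%:*}; fp=${fp#*:}
    SYN_TTL=${fp%%:*};    fp=${fp#*:}
    SYN_DF=${fp%%:*};     fp=${fp#*:}
    SYN_SIZE=${fp%%:*};   fp=${fp#*:}
    SYN_OPTS=$fp          # comma-separated TCP options
}

# has_syn_option OPT
# Succeed if OPT appears as a whole entry in SYN_OPTS.
has_syn_option() {
    case ",$SYN_OPTS," in
        *,"$1",*) return 0 ;;
        *) return 1 ;;
    esac
}

parse_synfp '65535:64:1:60:M1460,N,W0,N,N,T,S'   # made-up sample
has_syn_option S && echo "window=$SYN_WINDOW ttl=$SYN_TTL sack=yes"
```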

=item B<CLIENT_SYNOS>

If asmtpd can guess the client's operating system based on
B<CLIENT_SYNFP>, it will set B<CLIENT_SYNOS> to the value of that
guess.  For example, to greylist mail from Windows machines, you can
run:

   match -q "*Windows*" "$CLIENT_SYNOS" && greylist

=item B<DATA_BYTES>

This variable is not really an avenger variable, as it is only
available in B<bodytest> commands.  It specifies the number of bytes
of message transferred in the SMTP DATA command, but after converting
CR NL sequences to NL.  Roughly speaking, this is how many bytes are in
the message including all headers after the X-Avenger:, SPF-Received,
or Received: header.
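
As an illustration, a B<bodytest> command could use B<DATA_BYTES> to
refuse oversized messages without reading them.  The function name and
the default limit below are assumptions chosen for the example; the
status 100 follows the B<bodytest> convention for rejecting mail.

```shell
# size_check [LIMIT]
# Return 100 (reject) if DATA_BYTES exceeds LIMIT bytes, else 0 (accept).
# Text written to standard output is reported back to the sender.
size_check() {
    limit=${1:-1000000}
    if [ "${DATA_BYTES:-0}" -gt "$limit" ]; then
        echo "message too large ($DATA_BYTES bytes, limit $limit)"
        return 100
    fi
    return 0
}

DATA_BYTES=2500000      # example value; normally set by asmtpd
size_check 1000000 || echo "status $?"
```

A standalone script used as the B<bodytest> command would exit with the
function's status rather than echo it.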

=item B<ETCDIR>

The value of B<EtcDir> from the asmtpd configuration file (or
F<@etcdir@> by default).

=item B<EXT>

When avenger runs on behalf of a user, B<EXT> is set to the part of the
address that determines the suffix of the F<rcpt> or F<mail> file.
For example, suppose B<Separator> is C<-> and the recipient is
B<list-subscribe@>I<host>, where I<host> is not a virtual domain.  If
the B<AliasFile> contains:

    list: user-mylist

Then avenger will be run on behalf of C<user> (because alias expansion
yields B<user-mylist-subscribe>).  B<EXT> will be set to
B<mylist-subscribe>.

Note that B<EXT> is empty when there is no suffix, and that it is
equal to the name of the system file being processed when avenger is
run on a system file.  Like B<RECIPIENT>, this variable is not set for
B<bodytest> commands.

=item B<HOST>

Set to the name of the local host, as specified by the B<HostName>
directive in F<asmtpd.conf>.

=item B<MAIL_ERROR>

This variable is set when the SPF disposition of the sender is
B<fail>, or when asmtpd is unable to send a bounce message to the
sender address.  In either case, Mail Avenger will reject the mail if
the script falls through to the default.

=item B<MYIP>

IP address of local end of SMTP TCP connection.

=item B<MYPORT>

TCP port number of local end of SMTP TCP connection.  Ordinarily this
will be 25.

=item B<RECIPIENT>

The envelope recipient of the message.  Note that this environment
variable is not present for B<bodytest> programs, since such programs
may be run on behalf of multiple users.

=item B<RECIPIENT_HOST>

The domain part of B<RECIPIENT>, folded to lower-case--i.e., I<host>
when B<RECIPIENT> is I<local>B<@>I<host>.  Not present for B<bodytest>
programs, as noted in the description of B<RECIPIENT>.


=item B<RECIPIENT_LOCAL>

The local part of B<RECIPIENT>, folded to lower-case--i.e., I<local>
when B<RECIPIENT> is I<local>B<@>I<host>.  Not present for B<bodytest>
programs, as noted in the description of B<RECIPIENT>.

=item B<SENDER>

The envelope sender of this mail message (i.e., the argument supplied
by the client to the C<MAIL FROM:> SMTP command).

=item B<SENDER_HOST>

The hostname part of B<SENDER>, converted to lower-case (i.e., I<host>
in I<user>B<@>I<host>).

=item B<SENDER_LOCAL>

The local part of B<SENDER>, converted to lower-case (i.e., I<user> in
I<user>B<@>I<host>).

=item B<SENDER_MXES>

A list of DNS MX records for B<SENDER_HOST>, if that hostname has any
MX records.

=item B<SENDER_BOUNCERES>

For non-empty envelope senders, asmtpd attempts to see if it is
possible to deliver bounce messages for the sender.  If not,
B<SENDER_BOUNCERES> is set to a three-digit SMTP error code.  If the
first digit is 4, the error was temporary.  If the first digit is 5,
the error was permanent.  Note that failure to accept bounce messages
is considered a B<MAIL_ERROR> as described above, and will cause mail
to be rejected by default.
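
A script that wants to treat temporary and permanent bounce failures
differently can look at the first digit of the code.  The helper name
C<bounce_class> below is hypothetical, and treating an empty value as
"no failure recorded" is an assumption of this sketch.

```shell
# bounce_class CODE
# Classify a SENDER_BOUNCERES value by its first digit.
bounce_class() {
    case "$1" in
        4[0-9][0-9]) echo temporary ;;
        5[0-9][0-9]) echo permanent ;;
        '')          echo none ;;
        *)           echo unknown ;;
    esac
}

bounce_class 450
bounce_class 550
```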

=item B<SEPARATOR>

The value of B<Separator> from the asmtpd configuration file.  There
is no default (B<SEPARATOR> will not be set if no B<Separator> is
specified in the configuration file).  However, it should be
configured for C<+> with sendmail and C<-> with qmail.

=item B<SPF0>

=item B<SPF>

The result of performing an SPF check on the message.  Will be one of:
B<none>, B<neutral>, B<pass>, B<fail>, B<softfail>, B<error>, or
B<unknown>.  Note that B<SPF0> and B<SPF> are synonymous, but B<SPF>
is deprecated as a future release of Mail Avenger will make B<SPF>
synonymous with B<SPF1>.

=item B<SPF1>

Also the result of performing an SPF check on the message, but returns
different names for the results, to be compatible with newer revisions
of the SPF protocol specification.  The new names are B<None>,
B<Neutral>, B<Pass>, B<Fail>, B<SoftFail>, B<TempError>, and
B<PermError>.

=item B<SPF_EXPL>

The explanation string that goes along with a bad SPF status.

=item B<SSL_CIPHER>

If Mail Avenger has been compiled with support for the STARTTLS
command (using the B<--enable-ssl> option to C<configure>), and the
client is communicating over SSL/TLS, this variable will contain a
textual description of the algorithm.

=item B<SSL_CIPHER_BITS>

=item B<SSL_ALG_BITS>

B<SSL_CIPHER_BITS> contains the number of secret key bits used by the
SSL/TLS ciphers.  B<SSL_ALG_BITS> is the number of bits used by the
algorithm.  For example, if you are using 128-bit RC4 with 88 bits
sent in cleartext, B<SSL_CIPHER_BITS> will only be 40, since that is
the effective security, while B<SSL_ALG_BITS> will be 128.

=item B<SSL_ISSUER>

=item B<SSL_ISSUER_DN>

If the client has successfully authenticated itself using an SSL
certificate, B<SSL_ISSUER> will be set to the certificate signer's
common name, while B<SSL_ISSUER_DN> will be set to a compact
representation of the signer's full distinguished name.  The full
distinguished name is in the form output by the command:

	openssl x509 -noout -issuer -in cert.pem

Note that these variables are mostly useful if the B<SSLCAcert> file you
have given to Mail Avenger contains more than one certificate
authority, or signs other CA certificates.  Mail Avenger will not
accept client certificates if it does not recognize the signer of the
certificate.

=item B<SSL_SUBJECT>

=item B<SSL_SUBJECT_DN>

If the client has successfully authenticated itself using an SSL
certificate, B<SSL_SUBJECT> will be set to the client's common name in
the certificate, while B<SSL_SUBJECT_DN> will be set to a compact
representation of the client's full distinguished name.  The full
distinguished name is in the form output by the command:

	openssl x509 -noout -subject -in cert.pem

=item B<SSL_VERSION>

The version of the SSL/TLS protocol in use.

=item B<UFLINE>

An mbox C<From > line suitable for prepending to the message before
passing the message to a delivery program.  (This is mostly useful for
bodytest rules.)

=item B<USER>

The name of the user under which avenger is running.

=back

=head1 AVENGER/ASMTPD INTERFACE

avenger is just a simple shell script.  You can inspect the file to
see what it is doing.  Most of the interesting operations happen in
either asmtpd, or in external programs spawned from avenger.  This
section documents the interface between asmtpd and avenger.

avenger inherits a unix-domain socket connected to asmtpd on its
standard input and output.  It sends commands to asmtpd over this
socket, and similarly reads replies from it.  In order to avoid mixing
messages to and from asmtpd with the output of other programs you run,
however, the avenger shell script reorganizes its file descriptors so
that all communication to and from asmtpd happens over file descriptor
number 3.

Each command consists of a single line, followed by a newline (except
the B<return> command, which can optionally take multiple lines).
There may or may not be a reply, possibly depending on the outcome of
the command.  Most replies consist of zero or more lines of the form

=over

I<VARIABLE>B<=>I<value>

=back

I<VARIABLE> is typically a variable name that was supplied as part of
the command.  The avenger shell script records results by setting the
environment variable I<VARIABLE> to I<value>, so that it can be
accessed by subsequent lines of the script.

Replies are sent in the order in which the corresponding commands were
received.  However, asmtpd executes requests asynchronously.  Thus,
one can perform several concurrent operations (such as DNS requests or
SPF tests) by simply writing multiple commands to asmtpd before
receiving any of the responses.

The C<.> command is a no-op, but asmtpd echoes the C<.> back to
avenger as the reply.  This allows one to synchronize the avenger
process's state after issuing one or more commands.  For example, one
might issue several DNS lookups to check various RBLs (real-time
blackhole lists), then issue a C<.> command, then wait for replies.
When the C<.> comes back, all previous commands will also have
completed.  The avenger B<setvars> command simply sends a C<.>, then
loops until it reads back the C<.>, setting variables from any
previous commands whose replies it reads in the process.
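
The reply-reading half of that loop might look roughly like the sketch
below.  It is an illustration of the protocol only, not the code in the
avenger script: the function name C<read_replies> is made up, and it
reads from standard input rather than assuming a particular file
descriptor.

```shell
# read_replies
# Read VARIABLE=value lines from standard input until a line
# consisting of "." is seen, assigning each variable as it arrives.
# Returns 1 if input ends before the "." is read.
read_replies() {
    while IFS= read -r line; do
        [ "$line" = . ] && return 0
        case "$line" in
            *=*) ;;
            *) continue ;;              # not a VARIABLE=value line
        esac
        name=${line%%=*}
        case "$name" in
            ''|[0-9]*|*[!A-Za-z0-9_]*) continue ;;   # not a valid name
        esac
        eval "$name=\${line#*=}"
    done
    return 1
}
```

A caller would send its commands and the C<.> first, then run the
function with its input redirected from file descriptor 3.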

The following commands are available:

=over

=item B<.>

The B<.> command is simply echoed back by asmtpd.

=item B<bodytest> I<command>

Ends the current avenger script.  Specifies that asmtpd should receive
the entire body of the message, then run I<command> (under the same
user ID as the current avenger script) with the entire mail message as
its standard input.  asmtpd then replies to the SMTP C<DATA> command
based on the exit status of I<command> as follows:

=over

=item 0

If I<command> exits with status 0, asmtpd will reply to the C<DATA>
command with success (SMTP code 250), and will pass the message to
sendmail (or whatever you have configured as B<Sendmail> in
F<asmtpd.conf>) for delivery.

=item 99

If I<command> exits with status 99, asmtpd will still reply to the
C<DATA> command with a successful 250 reply code, but will not spool
the data.  Either I<command> must have done something with the data,
or the message will be lost.

=item 100 (also 64, 65, 70, 76, 77, 78, 112)

If I<command> exits with status 100 (or any of the above exit
statuses), asmtpd will reject the mail with a hard SMTP error (code
554).  If I<command> wrote output to its standard output, this output
will be passed back to the mail client.  Otherwise, asmtpd will supply
the text "message contents rejected."

=item 111 (or any other exit status)

If I<command> exits with status 111 (or any status not listed above),
the result is the same as exit status 100, except that asmtpd will use
a temporary error code (451) instead of 554.

=item signal

If I<command> exits abnormally because of a signal, asmtpd will also
use 451, but in this case will not pass the program's output back to
the client.  It will instead pass back a description of the problem.

=back
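
The table above can be summarized as a small shell function, which may
be handy when testing a B<bodytest> command by hand.  The function name
and the one-word action labels are invented for this sketch; the codes
are the ones listed above.

```shell
# bodytest_reply STATUS
# Print the SMTP code and action asmtpd would choose for a bodytest
# command that exited normally with STATUS.
bodytest_reply() {
    case "$1" in
        0)  echo "250 spool" ;;
        99) echo "250 discard" ;;
        100|64|65|70|76|77|78|112) echo "554 reject" ;;
        *)  echo "451 defer" ;;
    esac
}

bodytest_reply 0
bodytest_reply 100
bodytest_reply 111
```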

Note that asmtpd can only run one B<bodytest> command per message.  If
there are multiple recipients of a message, all must run the same
B<bodytest> under the same user ID.  If two users wish to run
different B<bodytest> commands, or even run the same command under
different user IDs, asmtpd will defer the second SMTP C<RCPT> command
with the message:

=over

452 send a separate copy of the message to this user

=back

This will cause the mail client to re-send the message later to the
second user.  To avoid forcing clients to send multiple copies of
messages, you can place B<bodytest> commands in system wide files
(such as the F<default> rule file), or use a B<redirect> command to
redirect to the B<AvengerUser>, so that commands for multiple users
can be run under the B<AvengerUser> user ID.

Note that file descriptor 0 inherited by I<command> is opened for both
reading and writing.  Thus, it is possible to modify the message
before it is spooled by the local MTA.  The command
L<edinplace(1)|edinplace(1)> is useful for running messages through
spam filters that annotate messages before spooling them.

=item B<dns-a> I<VARIABLE> I<domain-name>

Requests that asmtpd perform a DNS lookup for A (IPv4 address) records
on I<domain-name>.  If such an A record exists, the reply is a list of
one or more IP addresses:

=over

I<VARIABLE>B<=>I<IP-address> ...

=back

If no such A record exists, the reply is simply:

=over

I<VARIABLE>B<=>

=back

With the standard avenger script, this sets I<VARIABLE> to the empty
string.  If there is a temporary error in DNS name resolution, there
is no reply, and hence with the default avenger script I<VARIABLE>
will remain unset.

When checking such things as RBLs, it is advisable not to reject mail
because of a temporary DNS error.  You can use the shell construct
${I<VARIABLE>-I<default>} to return $I<VARIABLE> when I<VARIABLE> is
set, and I<default> when I<VARIABLE> is not set.  Similarly
${I<VARIABLE>+I<set>} returns I<set> if I<VARIABLE> is set, and the
empty string otherwise.
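
The three states a result variable can be in (unset after a temporary
DNS error, empty when no record exists, non-empty when records were
found) behave as follows under these two constructs.  The variable
name C<RESULT> and the address are example values.

```shell
# describe
# Show what ${RESULT-fallback} and ${RESULT+set} expand to.
describe() {
    echo "[${RESULT-fallback}] [${RESULT+set}]"
}

unset RESULT;      describe   # unset:     [fallback] []
RESULT=;           describe   # empty:     [] [set]
RESULT=127.0.0.2;  describe   # non-empty: [127.0.0.2] [set]
```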

For example, if bad-senders.org contained an RBL of undesirable sender
hosts:

    echo dns-a BADSENDER "$SENDER_HOST".bad-senders.org >&3
    setvars
    test -n "$BADSENDER" && reject "$SENDER_HOST is a bad sender"
    test -z "${BADSENDER+set}" \
        && defer "$SENDER_HOST.bad-senders.org: DNS error"

Note that when using the avenger script, there is already a function
B<rbl> to check RBLs.

=item B<dns-mx> I<VARIABLE> I<domain-name>

Similar to B<dns-a>, but looks up MX records.  A successful reply is
of the form:

=over

I<VARIABLE>B<=>I<priority-1>B<:>I<host-1> [I<priority-2>B<:>I<host-2> ...]

=back

Where I<priority-1> is the MX priority of I<host-1>.  As before, an
empty string indicates no MX records exist, and no reply indicates an
error.
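
A reply in this format is straightforward to pick apart in shell.  The
sketch below selects the host with the numerically lowest priority,
which is the most preferred MX; C<best_mx> is a hypothetical helper and
the host names are examples.

```shell
# best_mx "PRI:HOST [PRI:HOST ...]"
# Print the host with the lowest priority value.
# Fails (returns 1) if the list is empty.
best_mx() {
    best_pri= best_host=
    for rec in $1; do
        pri=${rec%%:*} host=${rec#*:}
        if [ -z "$best_pri" ] || [ "$pri" -lt "$best_pri" ]; then
            best_pri=$pri best_host=$host
        fi
    done
    [ -n "$best_host" ] && echo "$best_host"
}

best_mx '20:mx2.example.com 10:mx1.example.com'
```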

=item B<dns-ptr> I<VARIABLE> I<IP-address>

Returns a list of verified DNS hostnames for I<IP-address>.  As
before, an empty string for I<VARIABLE> indicates no PTR records exist,
and no reply indicates an error.

=item B<dns-txt> I<VARIABLE> I<domain-name>

Similar to the other B<dns> commands, but looks up a record of type
TXT.  If multiple TXT records exist, returns only one.  Places some
restrictions on the TXT records, for example will not return one that
contains a newline character.

=item B<netpath> I<VARIABLE> I<IP-address>

Maps out the network hops to I<IP-address> (this is similar to the
traceroute system utility, but more efficient).  The reply is of the
form:

=over

I<VARIABLE>B<=>I<#hops> I<hop1> I<hop2> ...

=back

I<#hops> is the total number of network hops to I<IP-address> if
asmtpd can figure this out.  (It won't always be able to if
I<IP-address> is behind a firewall.)  If asmtpd cannot figure this
out, the value is -1.  I<hop1> and the remaining arguments are the
addresses of routers along the way to I<IP-address>.
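
Once the reply has been stored in a variable, the hop count and router
list can be separated as in the sketch below.  The function name, the
C<NET_*> variable names, and the sample addresses are assumptions made
for the example.

```shell
# parse_netpath "HOPS ROUTER1 ROUTER2 ..."
# Set NET_HOPS to the first word and NET_ROUTERS to the rest.
# Fails (returns 1) if the value is empty.
parse_netpath() {
    set -- $1                   # split the value on whitespace
    [ $# -gt 0 ] || return 1    # nothing to parse
    NET_HOPS=$1
    shift
    NET_ROUTERS=$*
}

parse_netpath '-1 192.0.2.1 198.51.100.1'   # sample value
if [ "$NET_HOPS" -lt 0 ]; then
    echo "hop count unknown; routers seen: $NET_ROUTERS"
fi
```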

=item B<redirect> I<local>

Terminates the current avenger process, and instead processes the mail
as though it is being sent to I<local>.  This command is only
available in "rcpt" mode, as opposed to "mail" mode (in which asmtpd
runs avenger to see if it should relay mail for a local user on a
non-local client machine).

I<local> can be a local user name, or a local user name followed by
the separator character and an extension.  The name is mapped using
the F<aliases> file (specified by B<AliasFile> in F<asmtpd.conf>).

Note that while the B<AvengerUser> user can redirect to other users,
ordinary users can only redirect to themselves or the B<AvengerUser>.

=item B<return> I<code> I<explanation>

=item S<           > or

=item B<return> I<code>B<->I<explanation>

=item I<code>B<->I<explanation>

=item I<code>B< >I<explanation>

Specifies the SMTP response desired.  Also avoids further processing of
the message with system-wide default rulesets (as typically happens
when avenger simply exits with status 0).  I<code> must be a
three-digit number beginning with 2, 4, or 5 (usually 250 for success,
451 to defer mail, and 554 to reject mail).

The first form of this command (with a space between I<code> and
I<explanation>) gives a single line explanation along with the result
code.  In the second form, avenger specifies a multi-line response.
In this case all but the last line must contain a B<-> between the
I<code> and I<explanation>, while the last line must contain a space.
(Note that the B<return> keyword only appears on the first line; after
starting to issue a B<return> command, no further commands can be
issued.)
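
The multi-line form is easy to get wrong by hand.  As a minimal
sketch (the helper name C<format_return> is hypothetical and not part
of avenger), the following function prints a B<return> command in the
format described above.  It checks the code first and puts B<-> on
every line except the last.  How the output is delivered to asmtpd is
left to the caller.

```shell

    # format_return CODE LINE...
    # Prints "return CODE-LINE" for the first line, "CODE-LINE" for
    # middle lines, and "CODE LINE" (with a space) for the last line.
    format_return() {
        test $# -gt 1 || return 1
        code=$1
        shift
        case "$code" in
            [245][0-9][0-9]) ;;
            *) echo "bad SMTP code: $code" >&2; return 1 ;;
        esac
        prefix="return "
        while test $# -gt 1; do
            printf '%s%s-%s\n' "$prefix" "$code" "$1"
            prefix=
            shift
        done
        printf '%s%s %s\n' "$prefix" "$code" "$1"
    }

    format_return 554 "This message appears to be infected with a virus" \
        "Eicar-Test-Signature FOUND"

```

With a single explanation line the function produces the one-line
form, with a space after the code.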

=item B<spf> I<VARIABLE> I<SPF-mechanism> ...

Evaluates the mail client based on SPF mechanisms.  It will return:

=over

I<VARIABLE>B<=>I<disposition>

=back

where I<disposition> is one of: B<none>, B<neutral>, B<pass>, B<fail>,
B<softfail>, B<error>, or B<unknown> (though the disposition B<none>
is actually impossible).

As an example, suppose that your username is C<joe>, B<Separator> is
C<+>, and you have subscribed to a number of Yahoo mailing lists using
email address C<joe+yahoo>.  If spammers started sending mail to
C<joe+yahoo>, you would want to reject all mail to that address except
that originating from Yahoo's computers.  Yahoo's computers might
correspond to anything ending in C<.yahoo.com> or sharing a 24-bit
IP-address prefix with any of yahoo.com's MX records.  This can be
accomplished with the following script in
F<$HOME/.avenger/rcpt+yahoo>:

    echo spf YAHOO ptr:yahoo.com mx:yahoo.com/24 -all >&3
    setvars
    case "$YAHOO" in
    fail)
        reject "Sorry, this is a private alias for Yahoo lists only"
        ;;
    error)
        defer "Sorry, temporary DNS error"
        ;;
    esac

=back

=head1 EXAMPLES

If you never use your email address as an envelope sender, you can
reject all bounces to that address with these commands in your F<rcpt>
file:

    test -z "$SENDER" \
        && reject "<$RECIPIENT> not a valid sender;" \
        " should not receive bounces"

The following script runs spamassassin (a popular spam filter,
available from L<http://www.spamassassin.org/>) on the body of a
message, unless the sender of the message has an SPF disposition of
pass or is already going to be rejected by default.

    # The next line immediately falls through to the default reject
    # disposition when mail has an SPF disposition of fail or the
    # sender does not accept bounce messages.
    errcheck

    test "$SPF" = pass \
        || bodytest edinplace -x 111 spamassassin -e 100

The following script immediately accepts any mail from any machine at
MIT or NYU (provided MAIL_ERROR is not set), "greylists" machines not
in one of those domains, and if the greylist passes, falls through to
the default, system-wide rules:

    errcheck

    spf TRUSTED ptr:nyu.edu ptr:mit.edu ?all
    setvars
    test pass = "$TRUSTED" && accept Trusted sender OK

    greylist_delay=5m
    greylist

The following script rejects mail from clients that have issued an
SMTP "POST" command (which doesn't exist) or used aggressive,
premature pipelining of commands.  If the client put a space after the
colon in the MAIL FROM: or RCPT TO: SMTP commands, it greylists the
message using a key that includes the SYN fingerprint and the first
24 bits of the IP address.  If the SPF disposition of the message is
error, it defers the message.  If the SPF disposition of the message
is softfail or none, it runs the body of the message through
spamassassin.

    errcheck

    test -n "$CLIENT_POST" -o -n "$CLIENT_PIPELINING" \
        && reject "no spam please"

    test -n "$CLIENT_COLONSPACE" \
        && greylist "${CLIENT_IP%.*} $CLIENT_SYNFP $SENDER"

    case "$SPF" in
        error)
            defer "Temporary error in SPF record processing"
            ;;
        softfail|none)
            bodytest edinplace -x 111 spamassassin -e 100
            ;;
    esac
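
The greylist key above relies on the C<${CLIENT_IP%.*}> parameter
expansion, which removes the shortest suffix matching C<.*>, that is,
the final octet of a dotted-quad address.  A small stand-alone sketch
(the function name and sample address are for illustration only):

```shell

    # prefix24 ADDRESS: print the first three octets of a dotted-quad
    # address by stripping the last "." and everything after it.
    prefix24() {
        printf '%s\n' "${1%.*}"
    }

    prefix24 192.0.2.57

```

Two clients whose addresses differ only in the last octet therefore
contribute the same prefix to the key.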

If you set your B<MACUTIL_SENDER> environment variable to be
C<user+bounce+*@your.host.com> and send mail with B<macutil
--sendmail>, you can create the following F<rcpt+bounce+default> to
accept mail only to valid bounce addresses.

    macutil --check "$SUFFIX" > /dev/null \
        || reject "<$RECIPIENT>.. user unknown"

In conjunction with this script, you may want to reject bounce
messages to your regular email address with your F<rcpt> script, as
described in the first example.

This example is slightly more complicated, and shows how to use a
bodytest to reject mail based on message contents.  The goal of this
set-up is to check each message with the ClamAV anti-virus software
(from L<http://www.clamav.net/>) and the spamassassin mail filter.  If
the message contains a virus or is flagged as spam, it should be
rejected with an explanation of the problem.  We construct a shell
script, F<$HOME/.avenger/body>, to run these tests on message bodies.
The script can be invoked with the line

=over

B<bodytest $HOME/.avenger/body>

=back

in your F<$HOME/.avenger/rcpt> file.  Or, alternatively the script
could be configured to run in the system-wide F<@etcdir@/default> file
(in which case you want to make sure that the B<AvengerUser> can write
its own home directory, so as to store spamassassin files).  The
script is as follows:

    #!/bin/sh
    out="`clamscan -i --no-summary --mbox -  2>&1`"
    if test "$?" = 1; then
        echo This message appears to be infected with a virus
        printf "%s\n" "$out" \
            | sed -e '/Warning:/d' -e 's/^[^:]*: //' | sort -u
        exit 100
    fi

    out="`edinplace -x 111 spamassassin -e 100`"
    case "$?" in
        0)
            exit 0
            ;;
        100)
            echo Sorry, spamassassin has flagged your message as spam
            while read a b c; do
                test "$a $b" = "Content analysis" && break
            done
            read a
            read a
            read a
            while read a b c; do
                case "$a" in
                "")
                    break
                    ;;
                -*)
                    ;;
                [0-9]*)
                    printf "  %s\n" "$c"
                    ;;
                *)
                    printf "    %s\n" "$a $b $c"
                    ;;
                esac
            done
            exit 100
            ;;
        *)
            if test -n "$out"; then
                echo spamassassin failure:
                printf "%s\n" "$out"
            else
                echo system error in spamassassin
            fi
            exit 111
            ;;
    esac

The first half of this script runs the clamscan virus checker, storing
the output in variable out.  clamscan exits with code 1 when a virus
is found, exits 0 on success, and uses other error codes to indicate
various system errors.  We only want to reject mail if clamscan exits
with code 1.  When this happens, we take the output of clamscan,
format it in a more pleasing way (stripping out warnings), and send it
to standard output.  An example of an SMTP transaction using this
bodytest and detecting a virus will look like this (tested with the
special EICAR test string that flags a positive with most virus
checkers):

    DATA
    354 enter mail, end with "." on a line by itself
    Subject: eicar test

    X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*
    .
    554-This message appears to be infected with a virus
    554 Eicar-Test-Signature FOUND
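
The clean-up pipeline can be tried without clamscan installed.  The
sketch below feeds it sample text.  The sample lines are an assumption
about what scanner output may look like (a warning line plus one
C<name: signature FOUND> line per detection); they are not captured
clamscan output, and the function name C<format_scan> is hypothetical.

```shell

    # format_scan: read scanner output on stdin, drop warning lines,
    # strip everything up to the first ": ", and remove duplicates.
    format_scan() {
        sed -e '/Warning:/d' -e 's/^[^:]*: //' | sort -u
    }

    printf '%s\n' \
        'LibClamAV Warning: sample warning text' \
        'stdin: Eicar-Test-Signature FOUND' \
        'stdin: Eicar-Test-Signature FOUND' \
        | format_scan

```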

If clamscan does not report a virus, the script runs the message through
spamassassin to check for spam.  Note that spamassassin modifies the
mail message, so that we must run it with edinplace.  Note also that
clamscan will read to the end of the input file, but this is okay
since edinplace rewinds its standard input.  We use the B<-e> flag to
tell spamassassin to exit 100 on spam.  Then, if spamassassin exits 0,
we accept the mail.  If it exits with anything but 100, something went
wrong and we temporarily defer the mail.  Note that it might also be
possible to accept the mail at this point, but since spamassassin
edits the file in place, the message may be truncated if spamassassin
exits unexpectedly.

If spamassassin exits 100, we reject the mail.  We also report on why
spamassassin has rejected the mail.  Here again we take advantage of
the fact that edinplace rewinds its standard input both before and
after processing a message.  Because the file descriptor has been
rewound, we can start processing the message one line at a time with
the shell script.  By default (if you have not configured it with
C<report_safe 0>), spamassassin's output contains a spam report like
this:

 Content analysis details:   (11.7 points, 5.0 required)

  pts rule name        description
 ---- --------------- --------------------------------------------------
  1.0 RATWARE_RCVD_AT Bulk email fingerprint (Received @) found
  4.2 X_MESSAGE_INFO  Bulk email fingerprint (X-Message-Info) found
  0.0 MONEY_BACK      BODY: Money back guarantee
  0.5 BIZ_TLD         URI: Contains a URL in the BIZ top-level domain
  0.6 URIBL_SBL       Contains a URL listed in the SBL blocklist
                      [URIs: crocpeptide.biz]
  0.5 URIBL_WS_SURBL  Contains a URL listed in the WS SURBL blocklist
                      [URIs: crocpeptide.biz]
 ...

We skip over the headers and print each result to the SMTP session.
We do not report negative/whitelist results (those starting with
B<->), and we print comment lines (those not starting with a number)
indented.  A typical SMTP session looks like this (using the special
GTUBE test line that triggers spam filters):

    DATA
    354 enter mail, end with "." on a line by itself
    Subject: gtube test

    XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X
    .
    554-Sorry, spamassassin has flagged your message as spam
    554-  Missing Date: header
    554   BODY: Generic Test for Unsolicited Bulk Email
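
The report-parsing loop can also be exercised on its own.  The sketch
below wraps the same logic in a function and feeds it a shortened
report.  The function name C<report_summary>, the rule
C<SAMPLE_NEGATIVE>, and the domain C<example.biz> are made up for the
illustration.

```shell

    # report_summary: read text containing a spamassassin-style report
    # on stdin and print one line per positive-scoring rule.
    report_summary() {
        # Skip everything up to the "Content analysis" line.
        while read -r a b c; do
            test "$a $b" = "Content analysis" && break
        done
        # Skip the blank line, the column titles, and the dashed line.
        read -r a
        read -r a
        read -r a
        while read -r a b c; do
            case "$a" in
                "")
                    break                   # blank line ends the report
                    ;;
                -*)
                    ;;                      # negative score: not reported
                [0-9]*)
                    printf '  %s\n' "$c"    # description only
                    ;;
                *)
                    printf '    %s\n' "$a $b${c:+ $c}"   # comment line
                    ;;
            esac
        done
    }

    printf '%s\n' \
        'Content analysis details:   (5.3 points, 5.0 required)' \
        '' \
        ' pts rule name        description' \
        '---- --------------- ------------------------------' \
        ' 4.2 X_MESSAGE_INFO  Bulk email fingerprint found' \
        '-1.0 SAMPLE_NEGATIVE Hypothetical whitelist rule' \
        ' 0.6 URIBL_SBL       Contains a URL listed in the SBL blocklist' \
        '                     [URIs: example.biz]' \
        '' \
        | report_summary

```

The negative-scoring line produces no output, and the C<[URIs: ...]>
line is printed with deeper indentation than the rule descriptions.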


Here's an example of how to use SSL client certificates for
authentication.  If you have a private CA with common name "My CA"
that signs the certificates of all your authorized mail clients, you
can place the following in F<@etcdir@/relay> to permit those clients
to relay:

    test "My CA" = "$SSL_ISSUER" \
        && accept "Relaying permitted for client $SSL_SUBJECT"
    reject "relaying denied"


=head1 FILES

F<@libexecdir@/avenger>,
F<@etcdir@/default>,
F<$HOME/.avenger/rcpt>,
F<$HOME/.avenger/rcpt*>,
F<$HOME/.avenger/mail>,
F<$HOME/.avenger/mail*>

=head1 SEE ALSO

L<dbutil(1)|dbutil(1)>,
L<deliver(1)|deliver(1)>,
L<edinplace(1)|edinplace(1)>,
L<escape(1)|escape(1)>,
L<macutil(1)|macutil(1)>,
L<match(1)|match(1)>,
L<synos(1)|synos(1)>,
L<asmtpd.conf(5)|asmtpd.conf(5)>,
L<asmtpd(8)|asmtpd(8)>,
L<avenger.local(8)|avenger.local(8)>

The Mail Avenger home page: L<http://www.mailavenger.org/>.

=head1 BUGS

avenger and the configuration files it reads are shell scripts.  In
a shell script, it is sometimes tempting to use C<echo ...> where one
should instead use the command C<printf '%s\n' ...>.  (The latter just
prints its argument to standard output, while the former may interpret
various C<\> escape codes, depending on the shell.)
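
A quick way to see the difference, as a sketch: the sample value below
contains a backslash sequence.  C<printf '%s\n'> prints it literally in
every shell, whereas what C<echo> prints depends on the shell in use.

```shell

    # A sample value containing a backslash sequence.
    value='line one\nline two'

    # Portable: prints the value exactly as stored, then a newline.
    printf '%s\n' "$value"

    # Not portable: some shells print one line here, others print two.
    echo "$value"

```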

In shell scripts, one must be careful about variables containing shell
metacharacters.  For example, it is not safe to run something like:

	bodytest "echo $VAR > $PWD/log"

if variable C<VAR> has untrusted contents that might contain
characters like C<E<gt>> or C<;>.  The reason is that C<$VAR> will be
expanded and sent back to the SMTP server, which will then pass the
expansion to the shell to execute the bodytest.  (C<$VAR> effectively
gets expanded twice.)  The escape utility can be used to avoid these
problems.  For example:

	bodytest echo `escape "$VAR"` ">" $PWD/log
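
The double expansion can be reproduced outside avenger with C<sh -c>,
which stands in here for the second shell that runs the bodytest.  In
this sketch, C<shquote> is a hypothetical helper written for the
illustration.  It is not the B<escape> utility, and nothing is assumed
here about the exact output of B<escape>.

```shell

    # shquote STRING: wrap STRING in single quotes, rewriting each
    # embedded single quote as '\'' so that a second shell reads the
    # result back as a single word.
    shquote() {
        printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
    }

    VAR='hello; echo INJECTED'

    # Unsafe: the second shell sees the ";" and runs "echo INJECTED".
    sh -c "echo $VAR"

    # Safer: the quoted value reaches the second shell as one argument.
    sh -c "printf '%s\n' $(shquote "$VAR")"

```

The first command prints two lines, showing that the text after the
C<;> was executed.  The second prints the value unchanged on one line.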

It is easy to forget to call B<setvars> after a B<dns>, B<rbl>, or
B<spf> command.

=head1 AUTHOR

David MaziE<egrave>res

