In these days of email worms, viruses, and ever-increasing spam, some sites want to apply a lot of checking to messages before accepting them. You can do a certain amount through string expansions and the condition condition in the ACL that runs after the SMTP DATA command (see chapter 37), but this has its limitations. To allow for more general checking that can be customized to a site's own requirements, there is the possibility of linking Exim with a private message scanning function, written in C. If you want to run code that is written in something other than C, you can of course use a little C stub to call it.
Unlike the ACL checks, which apply only to incoming SMTP messages, a local scan function is run for every incoming message. It can therefore be used to control non-SMTP messages from local processes.
To make use of this feature, you must tell Exim where your function is before building Exim, by setting LOCAL_SCAN_SOURCE in your Local/Makefile. A recommended place to put it is in the Local directory, so you might set
LOCAL_SCAN_SOURCE=Local/local_scan.c
for example. The function must be called local_scan(). It is called by Exim after it has received a message, when the success return code is about to be sent. For SMTP input, this is after all the ACLs have been run. The return code from your function controls whether the message is actually accepted or not. There is a commented template function (that just accepts the message) in the file src/local_scan.c.
You must include this line near the start of your code:
#include "local_scan.h"
This header file defines a number of variables and other values, and the prototype for the function itself. Exim is coded to use unsigned char values almost exclusively, and one of the things this header defines is a shorthand for unsigned char called uschar. The function prototype is:
extern int local_scan(int fd, uschar **return_text);
The arguments are as follows:
fd is a file descriptor for the file that contains the body of the message (the -D file). The descriptor is positioned at character 17 of the file, which is the first character of the body itself, because the first 17 characters are the message id followed by a newline. The file is open for reading and writing, but updating it is not recommended.
return_text is an address which you can use to return a pointer to a text string at the end of the function. The value it points to on entry is NULL.
The function must return an int value which is one of the following macros:
LOCAL_SCAN_ACCEPT
The message is accepted. If you pass back a string of text, it is saved with the message, and made available in the variable $local_scan_data. No newlines are permitted (if there are any, they are turned into spaces) and the maximum length of text is 1000 characters.
LOCAL_SCAN_REJECT
The message is rejected; returned text is used as an error message. Newlines are permitted - they cause a multiline response for SMTP rejections. If no message is given, ``Administrative prohibition'' is used.
LOCAL_SCAN_TEMPREJECT
The message is temporarily rejected; returned text is used as an error message. If no message is given, ``Temporary local problem'' is used.
If the message is not being received by interactive SMTP, failures are reported by writing to stderr or by sending an email, as configured by the -oe command line options.
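As an illustration of the calling convention and return codes described above, here is a minimal sketch of a scan function that rejects any message whose body contains a NUL byte. The rejection policy and the message texts are invented for this example. So that the fragment compiles on its own, it declares stand-ins for uschar and the three return macros; in a real build you would remove those and include local_scan.h instead.

```c
#include <unistd.h>

/* Stand-ins so that this sketch compiles by itself. In a real build these
   come from local_scan.h; the numeric values here are placeholders and are
   not claimed to match Exim's. */
typedef unsigned char uschar;
enum { LOCAL_SCAN_ACCEPT, LOCAL_SCAN_REJECT, LOCAL_SCAN_TEMPREJECT };

int local_scan(int fd, uschar **return_text)
{
    uschar buf[4096];
    ssize_t n;

    /* fd is already positioned at the first character of the body. */
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            if (buf[i] == 0) {
                /* Illustrative policy only: refuse bodies with NUL bytes. */
                *return_text = (uschar *)"Message body contains a NUL byte";
                return LOCAL_SCAN_REJECT;
            }
        }
    }

    if (n < 0) {
        /* A read error is treated as a temporary problem. */
        *return_text = (uschar *)"Could not read message body";
        return LOCAL_SCAN_TEMPREJECT;
    }

    /* Leave *return_text as NULL: no data is saved with the message. */
    return LOCAL_SCAN_ACCEPT;
}
```

The function only reads from the descriptor, in line with the advice not to update the -D file.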
The header local_scan.h gives you access to a number of Exim variables. These are the only ones that are guaranteed to be maintained from release to release:
uschar *sender_address
The envelope sender address. For bounce messages this is the empty string.
header_line *header_list
A pointer to a chain of header lines. The header_line structure is discussed below.
header_line *header_last
A pointer to the last of the header lines.
uschar *interface_address
The IP address of the interface that received the message, as a string. This is NULL for locally submitted messages.
int interface_port
The port on which this message was received.
uschar *received_protocol
The name of the protocol by which the message was received.
int recipients_count
The number of accepted recipients.
recipient_item *recipients_list
The list of accepted recipients, held in a vector of length recipients_count. The recipient_item structure is discussed below. You can add additional recipients by calling receive_add_recipient() (see below). You can delete recipients by removing them from the vector and adjusting the value in recipients_count. In particular, by setting recipients_count to zero you remove all recipients. If you then return the value LOCAL_SCAN_ACCEPT, the message is accepted, but immediately blackholed.
uschar *sender_host_address
The IP address of the sending host, as a string. This is NULL for locally-submitted messages.
uschar *sender_host_authenticated
The name of the authentication mechanism that was used, or NULL if the message was not received over an authenticated SMTP connection.
uschar *sender_host_name
The name of the sending host, if known.
int sender_host_port
The port on the sending host.
The header_line structure contains the members listed below. You can add additional header lines by calling the header_add() function (see below). You can cause header lines to be ignored (deleted) by setting their type to *.
struct header_line *next
A pointer to the next header line, or NULL for the last line.
int type
A code identifying certain headers that Exim recognizes. The codes are printing characters, and are documented in chapter 48 of this manual. Notice in particular that any header line whose type is * is not transmitted with the message. This flagging is used for header lines that have been rewritten, or are to be removed (for example, Envelope-sender: header lines). Effectively, * means ``deleted''.
int slen
The number of characters in the header line, including the terminating newline and any internal newlines.
uschar *text
A pointer to the text of the header. It always ends with a newline, followed by a zero byte. Internal newlines are preserved.
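The following sketch shows one way to walk the header chain and mark header lines as deleted by setting their type to *. The helper name delete_headers_named() is hypothetical, and the structure and the header_list variable are declared locally, mirroring the members listed above, only so that the fragment compiles by itself; real code would take them from local_scan.h. The match assumes that the colon follows the header name immediately.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Stand-ins mirroring the documented members; real code gets these
   from local_scan.h. */
typedef unsigned char uschar;
typedef struct header_line {
    struct header_line *next;
    int type;
    int slen;
    uschar *text;
} header_line;

header_line *header_list;

/* Hypothetical helper: flag every live header called `name` as deleted.
   Returns the number of header lines that were flagged. */
int delete_headers_named(const char *name)
{
    size_t nlen = strlen(name);
    int count = 0;

    for (header_line *h = header_list; h != NULL; h = h->next) {
        if (h->type == '*')
            continue;                       /* already deleted */
        if ((size_t)h->slen > nlen &&
            strncasecmp((const char *)h->text, name, nlen) == 0 &&
            h->text[nlen] == ':') {
            h->type = '*';                  /* not transmitted with the message */
            count++;
        }
    }
    return count;
}
```

A call such as delete_headers_named("X-Spam-Flag") would then suppress any existing header of that name before a fresh one is added.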
The recipient_item structure contains two members:
uschar *address
This is a pointer to the recipient address as it was received.
int pno
This is used in later Exim processing when top level addresses are created by the one_time option. It is not relevant at the time local_scan() is run and should always contain -1.
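Deleting recipients amounts to compacting the vector and reducing recipients_count. The sketch below removes every recipient whose address ends with a given suffix. The helper name drop_recipients_with_suffix() is hypothetical, and recipient_item, recipients_list, and recipients_count are declared locally as stand-ins so that the fragment is self-contained; real code would use the declarations in local_scan.h.

```c
#include <string.h>

/* Stand-ins mirroring the documented names; real code gets these
   from local_scan.h. */
typedef unsigned char uschar;
typedef struct recipient_item {
    uschar *address;
    int pno;
} recipient_item;

recipient_item *recipients_list;
int recipients_count;

/* Hypothetical helper: remove all recipients whose address ends with
   `suffix` (for example "@blocked.example"). Returns the number removed. */
int drop_recipients_with_suffix(const char *suffix)
{
    size_t slen = strlen(suffix);
    int kept = 0;

    for (int i = 0; i < recipients_count; i++) {
        const char *addr = (const char *)recipients_list[i].address;
        size_t alen = strlen(addr);
        int match = alen >= slen &&
                    strcmp(addr + alen - slen, suffix) == 0;

        if (!match)
            recipients_list[kept++] = recipients_list[i];  /* keep, compacting */
    }

    int removed = recipients_count - kept;
    recipients_count = kept;    /* zero here means the message is blackholed */
    return removed;
}
```

The comparison is case-sensitive for simplicity. If every recipient is removed and the function then returns LOCAL_SCAN_ACCEPT, the message is accepted and discarded, as described above.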
The header local_scan.h gives you access to a number of Exim functions. These are the only ones that are guaranteed to be maintained from release to release:
pid_t child_open_exim(int *fd)
int child_close(pid_t pid, int timeout)
These two functions provide you with a means of submitting a new message to Exim. (Of course, you can also call /usr/sbin/sendmail yourself if you want, but this packages it all up for you.) The first function creates a pipe, forks a subprocess that is running
exim -t -oem -oi -f <>
and returns to you (via the int * argument) a file descriptor for the pipe that is connected to the standard input. The yield of the function is the PID of the subprocess. You can then write a message to the file descriptor, with recipients in To:, Cc:, and/or Bcc: header lines. When you have finished, call child_close() with the PID as the first argument, and a timeout in seconds as the second. A value of zero means wait as long as it takes (which is usually fine in this circumstance). The return value is as follows:
>= 0
The process terminated by a normal exit and the value is the process ending status. Unless you have made a mistake with the recipient addresses, you should get a return code of zero.
< 0 and > -256
The process was terminated by a signal and the value is the negation of the signal number.
-256
The process timed out.
-257
There was some other error in wait(); errno is still set.
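The return value ranges listed above can be decoded with a small helper such as the following sketch. The enum and the function name classify_child_close() are hypothetical and are not part of Exim's interface; only the numeric ranges come from the description above. Values below -257 are not documented, so the sketch reports them as unknown.

```c
/* Hypothetical names for the documented outcomes of child_close(). */
typedef enum {
    CHILD_EXITED,       /* rc >= 0: normal exit, detail = exit status   */
    CHILD_SIGNALLED,    /* -256 < rc < 0: detail = signal number        */
    CHILD_TIMED_OUT,    /* rc == -256                                   */
    CHILD_WAIT_ERROR,   /* rc == -257: wait() failed, errno is set      */
    CHILD_UNKNOWN       /* anything else (not documented)               */
} child_outcome;

/* Decode a child_close() return value; *detail receives the exit status
   or the signal number where one applies, and 0 otherwise. */
child_outcome classify_child_close(int rc, int *detail)
{
    *detail = 0;
    if (rc >= 0) {
        *detail = rc;
        return CHILD_EXITED;
    }
    if (rc > -256) {
        *detail = -rc;          /* the value is the negated signal number */
        return CHILD_SIGNALLED;
    }
    if (rc == -256)
        return CHILD_TIMED_OUT;
    if (rc == -257)
        return CHILD_WAIT_ERROR;
    return CHILD_UNKNOWN;
}
```

In a scan function this would typically be applied to the result of child_close(pid, 0), with anything other than CHILD_EXITED and a zero status treated as a failed submission.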
uschar *expand_string(uschar *string)
This is an interface to Exim's string expansion code. The return value is the expanded string, or NULL if there was an expansion failure.
void header_add(int type, char *format, ...)
This function allows you to add additional header lines. The first argument is the type, and should normally be a space character. The second argument is a format string, which is followed by any number of substitution arguments, as for sprintf(). You may include internal newlines if you want, and you must ensure that the string ends with a newline.
void log_write(unsigned int selector, int which, char *format, ...)
This function writes to Exim's log files. The first argument should be zero (it is concerned with log_selector). The second argument can be LOG_MAIN or LOG_REJECT or the inclusive ``or'' of both of them. The remaining arguments are a format and relevant insertion arguments. The string should not contain any newlines, not even at the end.
void receive_add_recipient(uschar *address, int pno)
This function adds an additional recipient to the message. The first argument is the recipient address. The second argument should always be -1.
void *store_get(int)
This function accesses Exim's internal store manager. It gets a new chunk of memory whose size is given by the argument. Exim bombs out if it ever runs out of memory.
uschar *string_copy(uschar *string)
uschar *string_copyn(uschar *string, int length)
uschar *string_sprintf(char *format, ...)
These three functions create strings using Exim's dynamic store facilities. The first makes a copy of an entire string. The second copies up to a maximum number of characters, indicated by the second argument. The third uses a format and insertion arguments to create a new string. In each case, the result is a pointer to a new string.