Postfix performance tuning

Author: Ralf Hildebrandt and Patrick Koetter

Postfix is fast out of the box, but like other packages, you can usually tune it to work even faster. Furthermore, there are situations where Postfix may not perform as well as you expected, whether because of hardware or software limitations on the server system or other adverse conditions, such as a big influx of spam or undeliverable mail. This article shows you how to find and analyze the most common performance problems.

This article is excerpted from the newly published book “The Book of Postfix.”

We will first look at a few elementary tweaks that still may not
be terribly obvious. Think of the suggestions here as a checklist
for solving or avoiding simple problems. Above all, keep in mind
that many performance problems are actually caused by a flawed
setup, such as a bad /etc/resolv.conf file. The following points
appear in no particular order; they’re all of equal importance.

Speeding up DNS lookups

Postfix does a lot of DNS queries because SMTP requires lookups
for MX and A records. Furthermore, many of the Postfix restrictions
use DNS lookups to verify a client’s hostname or to perform a
blacklist lookup. Therefore, it’s critical that your server be able
to look up DNS records quickly, especially if you handle a high
volume of traffic.

The most common problem with DNS name resolution is that queries
take too long. You can use the dig command to perform a DNS lookup
and display detailed information about the query’s execution:

$ dig www.example.com 
; <<>> DiG 9.2.3rc4 <<>> www.example.com 
;; global options:  printcmd
;; Got answer: 
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48136 
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.example.com.   IN A
;; ANSWER SECTION:
www.example.com.  172800  IN A   192.0.34.166
;; Query time: 174 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) 
;; WHEN: Mon Oct  6 09:40:52 2003 
;; MSG SIZE  rcvd: 49

In this example, the query took 174 milliseconds. Now, let’s run
the query again:

$ dig www.example.com 
; <<>> DiG 9.2.3rc4 <<>> www.example.com 
;; global options:  printcmd
;; Got answer: 
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6398 
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.example.com.   IN A
;; ANSWER SECTION:
www.example.com.  172765   IN A   192.0.34.166
;; Query time: 18 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) 
;; WHEN: Mon Oct  6 09:41:27 2003 
;; MSG SIZE  rcvd: 49

This subsequent query for the same host took only 18 milliseconds, approximately 10 times faster. The reason that this second query was so quick is that this particular machine is accessing a caching DNS server.

If the lookups take significantly longer (or worse, time out),
then you’re having DNS problems. There are several possible
reasons:

resolv.conf settings: If you run Postfix in a chroot jail, you may have changed /etc/resolv.conf but forgotten to copy the updated file to the chroot jail (usually /var/spool/postfix/etc/resolv.conf).

The nameservers listed in /etc/resolv.conf could be slow or not
servicing requests at all. Verify that the specified servers answer
your DNS queries in a timely manner for each server line in
/etc/resolv.conf using the dig command.
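
As a rough sketch of these two checks, the shell functions below list the nameservers in a resolv.conf file, time a dig query against each one, and compare the system file with the copy in the chroot jail. The function names are made up for this illustration, the chroot path is the usual location mentioned above, and www.example.com is only a placeholder for a name your server really looks up.

```shell
# Print the address on every "nameserver" line of a resolv.conf-style
# file (default: /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Ask each listed nameserver for the same record and show how long
# the answer took. www.example.com is a placeholder name.
check_nameservers() {
    for ns in $(list_nameservers "$1"); do
        printf '%s: ' "$ns"
        dig "@$ns" www.example.com +time=5 +tries=1 | grep 'Query time' \
            || echo 'no answer'
    done
}

# Succeed only if the chroot copy is identical to the system file.
# The second default is the usual chroot location; adjust if yours differs.
resolv_conf_in_sync() {
    cmp -s "${1:-/etc/resolv.conf}" \
        "${2:-/var/spool/postfix/etc/resolv.conf}"
}
```

Calling check_nameservers prints one line per server, so a slow or dead nameserver stands out at once. Running resolv_conf_in_sync || echo 'chroot copy is stale' catches a forgotten copy.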

Network problems: Your uplink to the Internet might not be working as it should, or it could be saturated. If this is the case, you should consider getting more bandwidth or using traffic shaping to give
priority to nameserver queries.

Firewall settings: A firewall can block nameserver packets moving to and from your mail server.

Malfunctioning caching nameserver: If you’re running a caching nameserver locally, make sure that it’s actually working.

If your /etc/resolv.conf settings, your network, and your
firewall all seem fine, yet you still need to speed up your DNS
queries, you should consider running a local caching server, such
as djbdns dnscache or an instance of BIND on your server or
network. The cache significantly speeds up the lookup process and
decreases network utilization at the same time because recurring
lookups don’t result in outgoing packets.

Confirming that your server is not listed as an open relay

If you’re running an open relay, you can expect that many mail
servers will refuse any mail from your servers. In addition,
spammers will use your system to send their mail, increasing the
load on your system, because your system is handling your users as
well as your abusers.

Your system will typically end up on a blacklist after the open
relay has been confirmed. It can be a royal pain to get off a
blacklist, and it may take days or even weeks. Therefore, it’s
essential that you make sure that your system is not an open relay
or open proxy. Look up your IP address on http://openrbl.org. If
you’re listed, close the open relay immediately. Allow users to
relay in only one of these situations:

  • The user’s client is listed in the mynetworks parameter.
  • The user’s client successfully performed SMTP authentication.
  • The user’s client successfully authenticated itself using a TLS client certificate.
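
One way to express these three conditions is sketched below with postconf -e, which writes settings into main.cf. The function name is invented for this example and 192.0.2.0/24 is a placeholder network. The sketch also assumes that SMTP authentication (smtpd_sasl_auth_enable) and TLS client certificates (relay_clientcerts) are already configured elsewhere, and that you have no other recipient restrictions you need to keep, because the command replaces the whole list.

```shell
# Hypothetical helper: nothing is changed until you call it as root.
restrict_relaying() {
    # Placeholder network; list only the clients you really trust.
    postconf -e 'mynetworks = 127.0.0.0/8 192.0.2.0/24'

    # Relay for trusted networks, authenticated users, and known client
    # certificates; refuse relaying to foreign destinations for everyone else.
    postconf -e 'smtpd_recipient_restrictions = permit_mynetworks, permit_sasl_authenticated, permit_tls_clientcerts, reject_unauth_destination'

    postfix reload
}
```

The final reject_unauth_destination is what actually closes the relay: any client that matched none of the permit rules can send only to domains your server is responsible for.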

Refusing messages to nonexistent users

It’s a good idea to refuse messages for recipients that don’t
exist in your system. If Postfix were to accept such mail, it would
have to send a nondelivery notification to the sender address.
In the case of spam or viruses, that sender address is almost
certainly not the true origin of the mail. The resulting
MAILER-DAEMON notifications will clog the queue for several
days.

This shouldn’t be too much of a problem by itself, but if you
accept mail for users that do not exist on your system, the
messages have to be stored somewhere, and that place can eventually
fill up. If you run a relaying system, the final destination
of the message will eventually have to send a bounce back to the
envelope sender of the message (the address recorded in the
Return-Path header). Furthermore, this bounce may itself turn out
to be undeliverable, because the domain used as the sender
domain probably won’t accept anything.

In any case, these bounces will clutter your queue or go to the
mailbox specified by 2bounce_notice_recipient (which may be your
postmaster account). If you see something like this in your mail
queue, you may be having this problem:

$ mailq
 -Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
63BE9CF331     10658 Mon Jan 12 14:38:30  MAILER-DAEMON
       (connect to mail3.quickspress.com[63.89.113.198]: Connection timed out)
                                          platinum@quickspress.com
1C932CF30E      3753 Sat Jan 10 16:16:38  MAILER-DAEMON
       (connect to mx.unrealdeals.biz[69.5.69.110]: Connection refused)
                                          EntrepreneurCareers@unrealdeals.biz
98EC3CF3F9      5505 Sat Jan 10 20:25:06  MAILER-DAEMON
       (connect to fhweb8.ifollowup.com[216.171.193.38]: Connection refused)
                                          root@fhweb8.ifollowup.com
50B14CF31E      5196 Mon Jan 12 11:35:11  MAILER-DAEMON
       (connect to mail.refilladvice.net[218.15.192.166]: Connection timed out)
                                          clintoncopeland@refilladvice.net
F4009CF39D      5452 Sun Jan 11 01:58:27  MAILER-DAEMON
       (connect to fhweb9.ifollowup.com[216.171.193.39]: Connection refused)
                                          root@fhweb9.ifollowup.com
 --30 Kbytes in 5 Requests.

Here you can see five messages that are being bounced back to
the original senders (notice that the sender is MAILER-DAEMON), but
in each case, the recipient’s mail server is unreachable.
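
To keep an eye on this without reading the whole listing, you could count the queued bounces. The function below is a minimal sketch with an invented name. It assumes the short hexadecimal queue IDs shown in the listing above, so it would need adjusting if your system uses a different queue ID format.

```shell
# Count queue entries whose envelope sender is MAILER-DAEMON.
# Reads mailq-style output on standard input.
count_queued_bounces() {
    # A queue entry starts with a hex queue ID (optionally followed by
    # * or !) and ends with the sender address.
    awk '$1 ~ /^[0-9A-F]+[*!]?$/ && $NF == "MAILER-DAEMON" { n++ }
         END { print n + 0 }'
}
```

Run it as mailq | count_queued_bounces. A number that keeps growing from day to day suggests that your server is accepting mail it later has to bounce.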

To refuse messages for nonexistent recipients on your system,
set the local_recipient_maps and relay_recipient_maps parameters (the latter if you’re running a gateway that just relays mail to internal mail servers) to maps containing valid recipients.
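
The following sketch shows what such settings might look like. The function name and the file /etc/postfix/relay_recipients are assumptions made for this example. The local_recipient_maps value shown is the Postfix default, which checks system accounts and aliases. Use only the part that matches your setup.

```shell
# Hypothetical helper: nothing is changed until you call it as root.
enable_recipient_validation() {
    # Mail delivered on this machine: accept only addresses that match
    # a system account or an alias (this is the Postfix default value).
    postconf -e 'local_recipient_maps = proxy:unix:passwd.byname $alias_maps'

    # Gateway relaying to internal servers: accept only addresses listed
    # in a lookup table. The path is an example; each line of the file
    # looks like "user@example.com OK".
    postconf -e 'relay_recipient_maps = hash:/etc/postfix/relay_recipients'
    postmap /etc/postfix/relay_recipients

    postfix reload
}
```

With these maps in place, Postfix rejects mail for an unknown recipient during the SMTP dialog, so the sending server has to deal with the failure and no bounce is ever queued on your side.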

If bounces really get out of hand, you can also employ
RHSBL-style blacklists to reject mail from servers
that don’t accept bounces at all (because all bounces that need to
be sent back to these servers remain in your mail queue for several
days). There’s an RHSBL-style blacklist at RFC-Ignorant.Org that you can use like this:

check_rhsbl_sender dsn.rfc-ignorant.org

Blocking messages from blacklisted networks

There are many different kinds of blocklists and DNS blacklists
available that list individual IP addresses, whole IP ranges, and
even sender domains for all sorts of reasons. There’s at least one
list for every kind of perceived misbehavior.

The most useful of these list open relays and open proxies,
because they can be tested automatically in an objective manner.
Here are just a few of the blacklists:

  • relays.ordb.org
  • list.dsbl.org
  • cbl.abuseat.org
  • dul.dnsbl.sorbs.net

NOTE: Few things change faster than blacklists. Today’s hot blacklist may be out of
service tomorrow.

These blacklists have low probabilities of false positives
because they provide clear criteria for listing addresses. Running
an open proxy or open relay is generally considered wrong, so using
these lists puts social pressure on the administrators of the
misconfigured systems. (Of course, they may be clueless or just not
care.)
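
As an illustration, a DNS blacklist can be queried for every connecting client with the reject_rbl_client restriction. The sketch below uses one of the lists named above purely as an example. The function name is invented, and you should confirm that a list is still in service and read its listing policy before relying on it.

```shell
# Hypothetical helper: nothing is changed until you call it as root.
block_blacklisted_clients() {
    # Trusted networks skip the lookup; every other client is checked
    # against the blacklist and rejected if it is listed.
    postconf -e 'smtpd_client_restrictions = permit_mynetworks, reject_rbl_client cbl.abuseat.org'

    postfix reload
}
```

Each reject_rbl_client entry costs one DNS lookup per connecting client, which is one more reason to have a fast caching nameserver close to your mail server.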

Refusing messages from unknown sender domains

If possible, do not accept messages containing an envelope
sender from an invalid domain. If there’s a problem during
delivery, the error report always goes back to the envelope sender,
and if this address contains a nonexistent domain, there’s nowhere
to send the error report.

Postfix tries to send the error report, finds it to be
undeliverable, and then (since it cannot be bounced, because the
envelope sender is empty) sends it to 2bounce_notice_recipient.

You can avoid this by adding reject_unknown_sender_domain to smtpd_sender_restrictions or smtpd_recipient_restrictions.

Reducing the retransmission attempt frequency

If you have a lot of mail that your server can’t deliver on the
first few attempts, consider using a fallback relay (with the
fallback_relay parameter) or increasing the backoff time
(maximal_backoff_time) to reduce the frequency with which deferred mail reenters the active queue.

Without a fallback relay, Postfix spends precious time trying to
deliver mail to sites that are down or unreachable. Each of these
delivery attempts ties up one smtp process that has to wait until
the timeout is reached. A fallback relay can do the dirty work
of retrying transmission for messages that can’t be delivered on
the first try. This means your regular mail server can operate with
the default timeouts or even with reduced timeout values, speeding
up delivery.

On the other hand, increasing the maximal_backoff_time parameter bumps up the maximum time that the server ignores a certain
destination after a delivery problem occurs. Therefore, Postfix
makes fewer attempts to contact problematic servers.
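
Both approaches can be sketched with postconf -e as follows. The host fallback.example.com and the value 8000s are placeholders chosen for this example (the Postfix default for maximal_backoff_time is 4000s), and the function name is invented. Note that Postfix 2.3 and later call the first parameter smtp_fallback_relay.

```shell
# Hypothetical helper: nothing is changed until you call it as root.
tune_retries() {
    # Hand mail that cannot be delivered directly to another machine.
    # The square brackets turn off MX lookups for the relay host.
    postconf -e 'fallback_relay = [fallback.example.com]'

    # Wait longer between delivery attempts to a problematic
    # destination (example value; the default is 4000s).
    postconf -e 'maximal_backoff_time = 8000s'

    postfix reload
}
```

You would normally pick one of the two settings: the fallback relay moves the retry work to a different machine, whereas the longer backoff time keeps it on the same server and simply retries less often.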
