Qmail is well known as a scalable, high-performance, versatile open source mail server. Many ISPs, companies (from small to enterprise), and educational institutions rely on qmail to serve their messaging needs. On the hardware and operating system side, Compaq Tru64 Unix (formerly Digital Unix) was created specifically for mission-critical applications. Its clustering solution is considered one of the best in the enterprise class. Both products are modular, secure, and fast. What happens when these two giants meet?
One year ago, I managed this combination for a company in Indonesia -- let's call them XYZ. XYZ needed this combination because they had invested lots of money in Digital's Alpha servers but their messaging requirements could not be satisfied using Sendmail (the default messaging server on Tru64 Unix). Using qmail and its add-ons, XYZ can use virtual domains and a single UID for multiple users. Their other basic need was to act as an internal mail relay for around 20 affiliate companies and hundreds of stores. The traffic volume can reach thousands of messages per day. The company has a 256Kbps microwave connection for its Internet uplink; so far it has been sufficient for handling our load.
When I began the implementation, I faced one major obstacle: qmail needs to be compiled using GCC, but Tru64 Unix uses its own C compiler, and it's not 100% compliant with ANSI C. I had two options: download GCC and compile it myself or install an additional GCC package from Compaq. I chose the first option. GCC is the heart of most open source programs; using other C compilers sometimes creates weird errors when compiling or running programs.
I went to the GCC Web site and downloaded GCC version 2.95. I preferred not to use GCC 3 at that time to avoid compilation problems. The program compiled smoothly (using the Tru64 Unix C compiler) and I got a working GCC binary. This sounds confusing, creating a compiler using another compiler, but it works!
Now I was ready to install qmail. I downloaded the source code for qmail, ucspi-tcp, and daemontools. I followed the installation steps from the Life with qmail site and soon had the program installed.
Next, I had to figure out how to start qmail automatically. I chose to start qmail from an rc script. I created a script to call qmail-smtpd, qmail-pop3d, and qmail-send (managed by tcpserver). I created a soft link from the main rc directory to the rc3.d directory so qmail would start in run level 3. To avoid hassles, I disabled Sendmail in run level 3 but didn't remove it from my system. This let programs like cron inject local mail using Sendmail (rather than qmail-inject or the sendmail-to-qmail wrapper).
I restarted the machine and let it boot. In the rc execution stage, I saw an indication that qmail had started (I had added echo 'qmail start' to notify myself). I was happy, but not for long. After logging on, I ran a terminal program and executed ps ax | grep qmail to check that the program was running. Nothing came out. I checked the rc script and the soft link -- they were correct. I then suspected execution order. I moved qmail to the last daemon to be executed in run level 3. I restarted the machine and checked ps ax | grep qmail and ... nothing.
I tried to call qmail's rc script manually; it started perfectly. Checking the ps ax output showed that all qmail components started under the correct user and group IDs. Since the rc script worked when run manually, I decided to use the cron scheduler to make sure qmail was running. I created a script that is called every five minutes to check (using a combination of ps and grep) whether qmail-send is running, and to run the qmail startup script if it isn't.
I also set the script's permission bits as Tru64 Unix requires. (One note here: Previously the script used the default shell on Tru64 Unix, but I found it had lots of bugs. Because I'm familiar with Bash, I downloaded the Bash source code and compiled it, then adjusted the script to Bash syntax.)
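A check of this kind might look like the following sketch. It is an illustration only, not the script used at XYZ: the start-script path in QMAIL_START, the "start" argument, and the function names are all assumptions.

```shell
#!/bin/sh
# Hypothetical watchdog: start qmail when qmail-send is missing from ps output.
# QMAIL_START is an assumed path; point it at your own startup script.
QMAIL_START=${QMAIL_START:-/sbin/init.d/qmail}

# Read a process listing on stdin; succeed if a qmail-send process appears.
# Writing the pattern as [q]mail-send keeps this grep's own command line,
# which contains the brackets, from counting as a match.
qmail_send_running() {
    grep '[q]mail-send' >/dev/null
}

# Check the live process table and run the start script only when
# qmail-send is absent. The "start" argument is an assumption.
check_and_restart() {
    if ps ax | qmail_send_running; then
        return 0
    fi
    "$QMAIL_START" start
}
```

To run this from cron, the script would end with a call to check_and_restart. Older cron implementations may not accept step syntax, so a five-minute schedule can be written with an explicit minute list (0,5,10,...,55).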
With qmail working, it was time to add users. Using vpopmail, I created virtual domains and assigned users to them. I also installed qmail's Web interface (qmail Admin) for user management. These steps ran smoothly.
During the next several weeks I allowed users at XYZ company to create user accounts and see qmail work. After qmail had been running for six months, I began noticing that qmail's queue was stacking up quickly and messages were going out slowly during peak hours when users were actively sending and receiving email. I checked qmail's Web site and found this was a known qmail anomaly with qmail-todo, which works as an intermediate pipe in the qmail queuing system to reduce bottlenecks, especially piping to qmail-remote. I downloaded the qmail-exttodo patch, patched the qmail source code, and recompiled. The queuing system now worked normally even under high loads.
A healthy system, until ...
Around the middle of 2002, true disaster struck. The Klez worm attacked my mail server and hung it many times. Using the ps and vmstat commands and the kernel log, I determined several reasons for the system hangs. The SMTP service ran out of available connections on port 25, and the kernel itself allowed only a few open file descriptors per user. The solution was to apply a big concurrency patch, raise the number of open TCP connections for qmail-smtpd, and balance them with 'max user' and 'max open files per user' in sysconfigtab, the Tru64 Unix kernel's configuration file. To add further security, I retuned the soft limit for the qmail-smtpd and qmail-pop3d daemons, doubling each while carefully maintaining them so kernel activity didn't get overwhelmed.
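As a rough illustration of the two knobs on the qmail side, the sketch below builds a launch line that caps simultaneous connections with tcpserver's -c option and caps per-process memory with daemontools' softlimit -m. It assumes the "soft limit" mentioned above is the softlimit memory cap. The paths, the placeholder variables for the qmaild UID and nofiles GID, and every number are assumptions, not XYZ's settings. The kernel-side per-user limits have to be raised separately.

```shell
#!/bin/sh
# Sketch only: all numbers and paths here are assumptions.

# Convert megabytes to the byte count that softlimit -m expects.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print (not run) a qmail-smtpd launch line with a connection cap (-c)
# and a per-process memory cap (softlimit -m). $QMAILDUID and $NOFILESGID
# are printed literally as placeholders for the real uid and gid.
smtpd_command() {
    max_conn=$1
    mem_mb=$2
    echo "softlimit -m $(mb_to_bytes "$mem_mb") tcpserver -v -R -c $max_conn -x /etc/tcp.smtp.cdb -u \$QMAILDUID -g \$NOFILESGID 0 smtp /var/qmail/bin/qmail-smtpd"
}
```

If the previous connection cap had been 20 (a made-up figure), doubling it would mean calling smtpd_command 40 with the chosen memory cap in megabytes and comparing the result with the per-user process and file limits configured in the kernel.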
That solved the immediate problem, but I knew I had to do some filtering and virus checking from now on. I found qmail-qfilter, a script-based filter that can parse email before injecting it into qmail's queue. This add-on needed the QMAILQUEUE patch to reroute the queuing system to a filtering layer first. I also downloaded several other useful patches -- qmail-oversize-dns, qmail-mfcheck, and nullenvsender-recipcount -- that I found very helpful to combat spam. Oversize-dns lets qmail handle DNS responses larger than 512 bytes, which otherwise make lookups fail. Mfcheck checks for a valid DNS entry for the domain in the envelope sender (MAIL FROM) address, which blocks viruses that inject invalid addresses. Nullenvsender-recipcount blocks null sender messages with multiple recipients, and helps reduce bounced messages. For the sake of simplicity I don't use the RBL antispam add-on because I worry about overloading our low bandwidth and creating a bottleneck.
For filtering, I wrote some wrapper scripts to throw away messages with attachments such as .pif, .bat, .exe, .vbs, and other suspicious types. At this stage I learned that some viruses use non-standard headers for attachment info, and there are many variations. Again I used a nifty trick -- I scanned entire messages for occurrences of file name phrases. This caused several false alarms, so I needed to educate users to carefully compose their email. In my experience, business email rarely contains phrases such as "hey, I need an application.pif," so my choice was acceptable. To watch for double extensions, I had to create a regex like '\.pif*'.
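The whole-message scan described above could be sketched as follows. This is not the author's wrapper: the function name and the exact pattern are assumptions, and only the four extensions named in the article are listed. Read as a regular expression, a pattern like '\.pif*' also matches ".pi", so the sketch uses an explicit alternation followed by a non-alphanumeric character or end of line.

```shell
#!/bin/sh
# Hypothetical filter test: flag a message if any line mentions a risky
# file name. The extension list comes from the article; extend as needed.
SUSPICIOUS_EXT='pif|bat|exe|vbs'

# Read a message on stdin; succeed when a suspicious file name phrase is
# found anywhere in it, headers or body. Matching ".ext" wherever it occurs
# also catches double extensions such as "report.doc.pif".
has_suspicious_name() {
    grep -E -i "\.($SUSPICIOUS_EXT)([^[:alnum:]]|\$)" >/dev/null
}
```

A wrapper run by qmail-qfilter could use the function's exit status to decide whether to pass the message through or refuse it. The exit codes qmail-qfilter expects are not shown here and should be taken from its documentation.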
My ultimate headache finally came when I had to pick a virus scanner. Virtually no commercial virus scanners work with Tru64 Unix. After Googling, I found Sophos Sweep works with Tru64 Unix and also supports scanning inside many compressed file types, including even uncommon compression like UPX.
Antivirus alone is not enough, though; I had to integrate it with qmail. I chose Qmail-Scanner because it tightly couples with qmail and supports many antivirus packages. Installation was straightforward. I used an suid wrapper to run the scanner script. I also modified the tcp.smtp file to redirect email to Qmail-Scanner instead of qmail-qfilter. I injected several messages with attachments to the scanner. It worked as I hoped, but the antivirus scanner needed more available memory, so I added memory for a total of 1GB of RAM. The soft limit (general memory limit for one SMTP session) also needed some adjusting, by trial and error. A 10MB memory limit was not enough for one SMTP session, so I raised it to 40MB. That's not a small number, and I found the Perl interpreter and its modules (Qmail-Scanner uses Perl) flooding my memory.
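With the QMAILQUEUE patch in place, the redirection in tcp.smtp comes down to setting the QMAILQUEUE environment variable in each rule. The sketch below prints two such rules. The scanner path is an assumption (use wherever your suid wrapper actually lives), and the relay rule for 127. is only an example.

```shell
#!/bin/sh
# Sketch: emit tcp.smtp rules that route incoming SMTP mail through a scanner.
# The default path is an assumption, not necessarily where the wrapper lives.
SCANNER=${SCANNER:-/var/qmail/bin/qmail-scanner-queue}

# Print rules in tcprules format: the local host may relay, and mail from
# every client is handed to the scanner instead of qmail-queue.
print_tcp_smtp_rules() {
    printf '127.:allow,RELAYCLIENT="",QMAILQUEUE="%s"\n' "$SCANNER"
    printf ':allow,QMAILQUEUE="%s"\n' "$SCANNER"
}
```

The output would be saved as the rules file and compiled with tcprules, for example tcprules /etc/tcp.smtp.cdb /etc/tcp.smtp.tmp < /etc/tcp.smtp, so that tcpserver picks it up through its -x option.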
To help reduce memory load and lower scanning time, I installed a beta version of Sophie, a daemon for scanning using libsavi, the Sophos dynamic link library for virus scanning. After compilation using GCC, I installed the binary and told Qmail-Scanner to call Sophie instead of Sweep, and request scanning at Sophie's port.
The sequence for complete scanning was as follows: run Sophie in daemon mode, make sure it creates a Unix domain socket, then run Qmail-Scanner inside qmail. I had to change some lines in a Perl module to force it to read the correct socket file; I found no more elegant way to do this. This time, I downloaded a mail bomber program to stress-test the installation. Everything went fine, and the software was catching viruses as well, so I decided to deploy it in the production environment. I let it run for two weeks and archived the qmail logs (the qmail-smtpd, qmail-pop3d, and qmail-send logs). For a complete picture, I created a Bash script to log vmstat output (memory section).
During the two weeks everything seemed fine except (yes, again) a memory leak. The vmstat log indicated that after peak hours the amount of free memory didn't recover as expected. So, for example, if during peak hours there were 50 SMTP sessions each with a 10MB limit, they could take up to 500MB; when the load dropped to 10 SMTP connections, about 400MB should have been freed, but vmstat's output showed no such thing. I suspected Sophie was the main cause. The other possibility was Tru64 Unix's internal mechanism for virtual memory management. I examined Sophie's source code and browsed Sophie's Web site without finding a clue. I recompiled the source using various options; still there was a memory leak. I looked into the VM mechanism of Tru64 Unix, and there I ran smack up against the difference between the closed source kernel and the open source application. It was impossible to debug the operating system's internal garbage collection mechanism, and there was no solution in the archive of support documents on Compaq's Web site.
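The expectation in the paragraph above is simple arithmetic on upper bounds: the per-session cap multiplied by the number of sessions. The sketch below works through the figures given there. The function name is hypothetical, and real usage can sit well below the cap.

```shell
#!/bin/sh
# Worked arithmetic for the example figures: 50 and 10 sessions at 10MB each.

# Upper bound in MB for a number of sessions, each capped at limit_mb.
max_session_mb() {
    sessions=$1
    limit_mb=$2
    echo $(( sessions * limit_mb ))
}

peak=$(max_session_mb 50 10)       # 500
off_peak=$(max_session_mb 10 10)   # 100
echo "expected to be freed: $(( peak - off_peak )) MB"   # 400
```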
I tried detaching Sophie from Qmail-Scanner and attaching it to qmail-qfilter instead. This reduced overall use of resources while still maintaining the highest security possible. A test for approximately one month proved that this arrangement was enough to stop most viruses that spread via email. To enhance security, we also tightened security at each user's desktop and other Windows-based machines. My final recommendation was to buy additional servers for relay and scanning so the main mail server serves only as a mailer.
From this experience, I can say that qmail and Tru64 Unix are a great combination. Qmail can exploit the Tru64 Unix kernel at its maximum performance, while Tru64 Unix can provide excellent security and a highly reliable platform that might beat Linux/qmail or BSD/qmail in overall performance. For XYZ it surely saved their current investment.
Mulyadi Santosa is a freelance writer and IT consultant in Indonesia. He holds a Bachelor's Degree in Computer Science from Sepuluh November Institute of Technology.