June 9, 2005

Building a Linux virtual server

Author: Rohit Girhotra

With the explosive growth of the Internet, the workload on servers providing Web, email, and media services has increased greatly. More and more sites are being challenged to keep up with the growing demands and are employing several techniques to avoid overloading their servers. Building a scalable server on a cluster of computers is one of the solutions that is being effectively put to use. With such a cluster, the increasing requests can be easily managed by simply adding one or more new servers to the existing cluster as required. In this article we will look at setting up one such scalable, network load-balancing server cluster using a virtual server via the Linux Virtual Server Project.

The main advantage of using LVS is that unlike Microsoft network load-balancing clusters, the LVS allows you to add a node running any operating system that supports TCP/IP to the cluster.

The cluster setup (shown below at right) consists of a load balancer server -- also known as the virtual server -- running on the Linux operating system and one or more real servers connected to it through a hub or a switch. The real servers -- which can run any operating system -- provide network services to the Internet clients, whereas the virtual server does IP-level load balancing of the incoming traffic to the various real servers. The virtual server acts as an interface between the users and the real servers and, therefore, makes the parallel services of the real servers appear as a single virtual service on one IP address.

When the virtual server receives a client request for data, it transfers the request to the appropriate real server according to a scheduling algorithm. The real server then replies to the virtual server, which in turn forwards the reply to the client. Although it is actually the real server that services the client request, to the client it appears as if the response came from the virtual server. The IP address of the real server is masqueraded by the IP address of the virtual server.

The virtual server uses two network interfaces (dual-homed host), one connected to the Internet for the clients to access and the other connected to the internal local area network (LAN), where all the real servers are placed. Scalability is achieved by transparently adding or removing real servers from the internal LAN.

Rebuilding the kernel

Linux systems using kernel versions earlier than 2.4.28 do not have support for virtual server built into the kernel. Therefore, the first step involved in setting them up as a virtual server is to rebuild their kernel with the appropriate patch applied. Kernel versions 2.4.28 or later have LVS support built into them by default and, therefore, require no patching.

The patches can be downloaded from the LVS Web site. There are different patches for various kernel versions. For this article, we will be configuring a patch for the 2.4.x kernel: linux-2.4.21-ipvs-1.0.10.patch.gz.

To apply the patch to the kernel, move the patch file to the /usr/src directory and issue the following commands as root:

#cd /usr/src/linux*
#gunzip ../linux-2.4.21-ipvs-1.0.10.patch.gz
#patch -p1 < ../linux-2.4.21-ipvs-1.0.10.patch

This will patch the kernel; after that you'll need to compile it. In the /usr/src/linux* directory issue these commands:

#make mrproper
#make oldconfig
#make menuconfig

This will bring up a screen with several subheadings. Select the Networking Options subhead, and then IP: Virtual Server Configuration in the following screen. Then select the following options:

virtual server support (EXPERIMENTAL)
[*] IP virtual server debugging
(16) IPVS connection table size (the Nth power of 2)
--- IPVS scheduler
<M> round-robin scheduling
<M> weighted round-robin scheduling
<M> least-connection scheduling
<M> weighted least-connection scheduling
<M> locality-based least-connection scheduling
<M> locality-based least-connection with replication scheduling
<M> destination hashing scheduling
<M> source hashing scheduling
<M> shortest expected delay scheduling
<M> never queue scheduling
--- IPVS application helper
<M> FTP protocol helper

Save the current kernel configuration and exit from menuconfig. Then from the command prompt type:

#make dep && make clean && make bzImage && make modules && make modules_install

This will create a compressed kernel image (bzImage) in the /usr/src/linux*/arch/i386/boot directory and will also create and install all the modules for the new kernel. Now copy this new kernel image (bzImage) to the /boot directory.

Lastly, make your system boot from the new kernel: either add an entry for it to your /etc/grub.conf or /etc/lilo.conf file, or rename the new kernel image (/boot/bzImage) to the name already referred to in your bootloader configuration file. If you use LILO, rerun /sbin/lilo after any change so that the boot loader picks it up.

Installing IPTables and IPVsadm

After rebuilding the kernel you need to have the IPTables and IPVsadm packages installed on your system to configure it as a virtual server. IPTables is used to build, maintain, and inspect IPv4 packet filtering and NAT (network address translation) rules in the Linux kernel. It will provide IP masquerading for the real servers. IPVsadm is the administration utility for the Linux Virtual Server and will be used to set the scheduling algorithm and the rules for forwarding client requests to the real servers.

The IPTables package comes bundled with most Linux distributions and can be easily installed from the installation CDs for your distribution. The source RPM for the IPVsadm utility can be obtained from the LVS Web site. For this example we'll use the ipvsadm-1.21-10.src.rpm SRPM package.

Once the packages have been installed you need to enable IP forwarding on the server. Open the file /etc/sysctl.conf in a text editor and set the value below, then run sysctl -p as root to apply it without rebooting:

net.ipv4.ip_forward = 1
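Because /etc/sysctl.conf is only read at boot time or when sysctl -p is run, it is worth confirming that the running kernel has actually picked up the setting. The following is a minimal sketch of such a check; the function name forwarding_enabled is made up for illustration, and the optional file argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# Succeeds if IP forwarding is switched on.
# With no argument it reads the kernel's live setting from /proc;
# a different file can be passed for experimentation (hypothetical helper).
forwarding_enabled() {
    file=${1:-/proc/sys/net/ipv4/ip_forward}
    [ "$(cat "$file" 2>/dev/null)" = "1" ]
}

# Example use on the virtual server:
#   forwarding_enabled && echo "IP forwarding is on" || echo "IP forwarding is off"
```

If the check reports that forwarding is off after editing the file, the new value has most likely not been loaded yet.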

Next, issue the following command to start the IPTables service on your system. This allows the virtual server to forward replies from the real servers to the clients:

#service iptables start

Enabling IP masquerading

To enable masquerading for the real servers, we will assume that the external Internet interface on your Linux Virtual Server is eth0 and the internal LAN interface (connected to the real servers) is eth1. On the virtual server, issue these commands:

#iptables -t nat -P POSTROUTING DROP
#iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

The first command sets the default policy of the POSTROUTING chain in the nat table to DROP, which means that if none of the specific rules in that chain match, the packet will be dropped. This ensures that not every packet is masqueraded by the server and, thus, provides an extra level of security. The second command enables NAT and masquerades the internal IP addresses of all the real servers to the IP address of the external Internet interface (eth0) of the virtual server. For more on IPTables, refer to its man page.

Configuring the virtual server using IPVsadm

The next step is to configure the Linux Virtual Server using the IPVsadm utility. But before that, you must allocate proper IP addresses to all the machines on your network. Put the real servers in your internal LAN on a private IP address range, such as 10.0.0.0/255.255.255.0. Also, put the internal LAN interface of the virtual server on the same subnet. Assign the IP address of the internal LAN interface of the virtual server as the default gateway for all the real servers. For the external Internet interface of the virtual server, use a public IP address or the settings provided by your ISP.

In our example setup, we used two real servers running different operating systems: one with the IP address 10.0.0.2 (providing HTTP service) and the other with the IP address 10.0.0.3 (providing both HTTP and FTP services). The default gateway for both of them was set to 10.0.0.1, which is the IP address of the internal LAN interface of the virtual server. The external Internet interface of the virtual server was assigned the public IP address 61.16.130.100.

Now add the virtual service and link a scheduler to it with these commands:

#ipvsadm -A -t 61.16.130.100:80 -s wlc
#ipvsadm -A -t 61.16.130.100:21 -s wrr

The above two commands add wlc (weighted least-connection scheduling) and wrr (weighted round robin scheduling) algorithms for HTTP (port 80) and FTP (port 21) traffic on the virtual server, respectively. There are several other scheduling algorithms available; you can learn more about them from the IPVsadm man page.
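To give a feel for what weighted least-connection scheduling does, here is a simplified sketch in shell. It is not the kernel's implementation: the function name wlc_pick and the address:weight:connections argument format are made up for illustration, and the real IPVS scheduler also takes inactive connections into account. The sketch simply picks the server with the lowest ratio of active connections to weight.

```shell
#!/bin/sh
# Simplified weighted least-connection choice (illustration only).
# Each argument is "address:weight:active_connections" (hypothetical format).
wlc_pick() {
    best_addr=""
    best_weight=0
    best_conns=0
    for entry in "$@"; do
        addr=${entry%%:*}      # text before the first colon
        rest=${entry#*:}       # "weight:connections"
        weight=${rest%%:*}
        conns=${rest#*:}
        # A server with weight 0 receives no new connections.
        [ "$weight" -gt 0 ] || continue
        # Compare conns/weight ratios by cross-multiplying, which keeps
        # the arithmetic in integers:
        #   conns/weight < best_conns/best_weight
        #   <=> conns * best_weight < best_conns * weight
        if [ -z "$best_addr" ] ||
           [ $((conns * best_weight)) -lt $((best_conns * weight)) ]; then
            best_addr=$addr
            best_weight=$weight
            best_conns=$conns
        fi
    done
    printf '%s\n' "$best_addr"
}

# Example: a weight-2 server with 10 connections (ratio 5) loses to a
# weight-1 server with 4 connections (ratio 4):
#   wlc_pick 10.0.0.2:2:10 10.0.0.3:1:4
```

Under this rule a server with a higher weight is allowed to carry proportionally more connections before another server is preferred.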

Next, add real servers on the virtual server to which the client requests will be forwarded:

#ipvsadm -a -t 61.16.130.100:80 -r 10.0.0.3:80 -m
#ipvsadm -a -t 61.16.130.100:80 -r 10.0.0.2:80 -m -w 2
#ipvsadm -a -t 61.16.130.100:21 -r 10.0.0.3:21 -m

This will cause all the HTTP traffic on the virtual server to be forwarded to 10.0.0.2 and 10.0.0.3 according to the scheduling algorithm. All the FTP traffic will go to 10.0.0.3 only. The real server 10.0.0.2 is given a weight of 2 for HTTP traffic by the -w 2 switch. The default weight is 1.
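As the cluster grows, typing one ipvsadm line per real server becomes error-prone. The sketch below prints the commands for one NAT-mode virtual service so that they can be reviewed before being run as root. The function name lvs_rules and its argument format are hypothetical, and it assumes that every real server listens on the same port as the virtual service, as in the example above.

```shell
#!/bin/sh
# Print (do not execute) the ipvsadm commands for one NAT-mode service.
# Usage: lvs_rules VIP:PORT SCHEDULER REAL_IP:WEIGHT...
lvs_rules() {
    service=$1
    scheduler=$2
    shift 2
    port=${service##*:}        # text after the last colon
    # -A adds the virtual service, -s selects its scheduler.
    echo "ipvsadm -A -t $service -s $scheduler"
    for real in "$@"; do
        ip=${real%%:*}
        weight=${real##*:}
        # -a adds a real server, -m selects masquerading (NAT), -w sets the weight.
        echo "ipvsadm -a -t $service -r $ip:$port -m -w $weight"
    done
}

# Example: reproduce the HTTP rules from this article.
#   lvs_rules 61.16.130.100:80 wlc 10.0.0.3:1 10.0.0.2:2
```

Once the printed commands look right, the output can be piped to sh as root to apply them.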

Testing it out

After setting everything up, use a client machine to connect to the virtual server using its external IP address. To do this, open a Web browser and type the server's IP address (61.16.130.100 in the example) in the address bar. You will get a Web page served by the Web server running on one of the real servers. Open multiple connections to the virtual server and check the status of the various connections on the real servers. You will notice that the incoming load is being distributed among the real servers in proportion to their weights, so in the example 10.0.0.2 handles roughly twice as many HTTP connections as 10.0.0.3. Thus, the virtual server is performing IP load balancing.
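The same test can be scripted from a client instead of being clicked through in a browser. The sketch below assumes, purely for illustration, that each real server serves a small file named whoami.txt containing its own IP address on a single line; neither that file nor the tally helper is part of the setup described above.

```shell
#!/bin/sh
# Count how many responses came from each real server.
# Reads one server identifier per line on standard input and
# prints "identifier count" pairs (hypothetical helper).
tally() {
    sort | uniq -c | awk '{print $2, $1}'
}

# Example, run from a client machine (whoami.txt is an assumed test file):
#   for i in $(seq 1 30); do
#       curl -s http://61.16.130.100/whoami.txt
#   done | tally
```

On the virtual server itself, ipvsadm -L -n lists the virtual services together with the weight and the active and inactive connection counts of each real server, which gives the load balancer's own view of the same traffic.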

Although the virtual server setup described above (virtual server via NAT) can meet the performance requirements of many servers, the design is limited by the load balancer: all requests and replies pass through it, and it is a single point of failure for the whole cluster. However, you can ease this bottleneck by having multiple virtual servers, each connected to its own cluster of real servers, grouped together at a single domain name by round robin DNS.

Rohit Girhotra is a 22-year-old B.E. student from NSIT, New Delhi.
