August 18, 2008

Rocks clusters make sense for educational environments

Author: Cesar Covarrubias

Cluster computing has played a pivotal role in the way research is conducted in educational environments. Because the amount of available money and hardware varies among university researchers, it's often necessary to find a clustering solution that works well on a small scale but can also be expanded into a large computing cluster. To make the most of grant money, researchers typically ask for an open source solution. Despite lacking certain desirable features, Rocks is among the best open source solutions for building a computing cluster.

The Rocks distribution was developed at the University of California, San Diego (UCSD) with financial support from a grant by the National Science Foundation. The developers of Rocks have been driven by one goal: "make clusters easy." Originally based on Red Hat Linux, Rocks now uses the open source CentOS as its base operating system, although Red Hat Enterprise Linux also can be used.

A Rocks cluster consists of two parts: the front end node, which manages the software packages on the cluster as well as the submitted jobs, and the cluster's compute nodes, which provide the processing power.

One of the greatest benefits of using Rocks is that installing a cluster is simple. A Rocks cluster can be built on a very limited amount of hardware: The front end and the compute nodes each require a minimum of 16GB of hard drive space and 512MB of memory. The front end node also requires two physical Ethernet ports, and the compute nodes require the ability to boot via PXE for automatic installation.
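As a rough illustration, the minimums above can be written down as a small pre-installation check. This is only a sketch: the Node class, its field names, and the check_node function are hypothetical and are not part of Rocks; only the thresholds come from the requirements listed above.

```python
from dataclasses import dataclass
from typing import List

# Minimums quoted above for both the front end and the compute nodes.
MIN_DISK_GB = 16
MIN_MEMORY_MB = 512
# The front end additionally needs two physical Ethernet ports.
FRONT_END_MIN_ETHERNET_PORTS = 2


@dataclass
class Node:
    """Hypothetical description of one machine; not a Rocks data structure."""
    name: str
    role: str  # "frontend" or "compute"
    disk_gb: float
    memory_mb: int
    ethernet_ports: int = 1
    pxe_capable: bool = False


def check_node(node: Node) -> List[str]:
    """Return the unmet requirements for a node (empty list if it qualifies)."""
    if node.role not in ("frontend", "compute"):
        raise ValueError(f"unknown role: {node.role!r}")

    problems = []
    if node.disk_gb < MIN_DISK_GB:
        problems.append(f"{node.name}: needs at least {MIN_DISK_GB}GB of disk")
    if node.memory_mb < MIN_MEMORY_MB:
        problems.append(f"{node.name}: needs at least {MIN_MEMORY_MB}MB of memory")
    if node.role == "frontend" and node.ethernet_ports < FRONT_END_MIN_ETHERNET_PORTS:
        problems.append(f"{node.name}: front end needs two Ethernet ports")
    if node.role == "compute" and not node.pxe_capable:
        problems.append(f"{node.name}: compute node must be able to boot via PXE")
    return problems


if __name__ == "__main__":
    # Made-up inventory, purely for demonstration.
    inventory = [
        Node("frontend-0", "frontend", disk_gb=80, memory_mb=2048, ethernet_ports=2),
        Node("compute-0-0", "compute", disk_gb=8, memory_mb=256),
    ]
    for node in inventory:
        for problem in check_node(node):
            print(problem)
```

Running a check like this against a hardware inventory before installation would flag machines that cannot serve in their intended role, such as an older desktop without PXE support.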

You can install a front end server for the cluster via CD or DVD. Once installation of the front end node is complete, you can install the compute nodes by booting them through PXE and downloading the image built by the front end node. This process allows for standardization of the software across all the compute nodes in the cluster.

Rocks and Rolls

During the installation process, you are asked which Rolls you would like to install on your cluster. A Roll is a software component that the cluster uses either during the computational process or for management of the server. At a minimum, a new cluster requires base, kernel, and OS Rolls. Aside from these minimum required Rolls, the Rocks distribution comes with many other useful Rolls.

One of the goals of Rocks is to minimize administrative tasks. Rocks comes with a variety of time-saving tools; for example, each cluster has a Web page through which an administrator or user can monitor the status of the cluster. What the page reports depends on which Rolls are installed.

The most important Roll included in the Rocks distribution is the Sun Grid Engine (SGE), which is responsible for scheduling, dispatching, and managing user jobs across the cluster. SGE is one of the most commonly used schedulers on computing grids and clusters. It provides an easy way to use the cluster's resources evenly, maximizing the processing power in the grid. SGE is managed entirely from the command line, and its queue can be monitored through the Web interface and the Ganglia Roll.
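To give a sense of what a user job looks like, here is a minimal sketch that generates an SGE job script. The build_sge_job_script helper, the job name, and the command are hypothetical; the "#$" lines are standard SGE directives embedded in the script (-N names the job, -cwd runs it in the submission directory, -j y merges error output into standard output, and -S selects the shell). A real cluster may require additional site-specific options.

```python
def build_sge_job_script(job_name: str, command: str, shell: str = "/bin/bash") -> str:
    """Return the text of a simple SGE job script.

    Hypothetical helper for illustration; it only builds a string and
    does not talk to the scheduler.
    """
    lines = [
        f"#!{shell}",
        f"#$ -N {job_name}",  # job name shown in the queue
        "#$ -cwd",            # run from the directory the job was submitted in
        "#$ -j y",            # merge stderr into stdout
        f"#$ -S {shell}",     # shell used to interpret the script
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up job name and command, purely for demonstration.
    print(build_sge_job_script("align-run", "./run_alignment.sh"), end="")
```

Saved to a file on the front end, a script like this would typically be submitted with the qsub command, after which SGE chooses a compute node with free capacity to run it.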

The Ganglia Roll allows systems administrators and users to monitor the performance of the cluster via the front end node's Web interface. The report created by Ganglia includes the number of nodes up and running, as well as CPU use, load, and memory used during the last hour. Ganglia also allows users to view the job queue on the cluster and user metrics.

The Area51 Roll allows the administrator to monitor the file and kernel integrity of the cluster, using Tripwire and a rootkit check. Tripwire reports are generated automatically through cron jobs on the front end node and are published via the cluster's Web site. The rootkit check must be run from the command line, using the chkrootkit command.

Rocks also has many Rolls that assist researchers in computational work. Many of the researchers using Rocks are in the biological sciences, and use the Bio-informatics Roll. (As defined by Biowww.net, bioinformatics is "the use of techniques including applied mathematics, informatics, statistics, computer science [and so on] to solve biological problems.")

Many Rolls, including the PGI and Intel Rolls, contain C, C++, and Fortran compilers. Packaging the compilers makes it easier for the administrator and users to employ the full potential of the cluster.

In case the Rolls provided by the default distribution are not enough for your needs, documentation for development of new Rolls is provided on the Rocks Web site.

Limitations of Rocks

Though it provides many useful Rolls, the distribution has some limitations that could deter a group from using it as a clustering solution.

Rocks uses the 411 Secure Information Service for authentication, which has functionality similar to Network Information Service within the cluster. However, it's not possible to merge the cluster into a current NIS setup. Workarounds are available that use 411 to pull authentication data from an NIS server, but this approach doesn't provide live mapping.

As with other open source solutions, you may find it difficult to get your support questions answered. Despite a large community of users contributing to and supporting the distribution, you may come up against issues that are difficult to resolve. The developers at UCSD offer limited "office hours" for support.

Perhaps the greatest potential problem lies in installing patches on a Rocks cluster. Users and developers continue to debate whether it's safe to patch a cluster using yum or up2date (which tool applies depends on the OS distribution used when the cluster was installed). In certain instances, patching the cluster has caused unpredictable behavior. To be prudent, you should try the patching procedure in a test environment first to minimize the chance of unstable behavior on the production cluster.

Summary

When evaluating a clustering solution in an educational environment, it's difficult to pass up the Rocks cluster distribution. This scalable computing solution allows you to use a limited amount of hardware or a large server farm to meet research needs. Installation of the cluster is easy and requires few resources, which saves administrators time. Many Rolls are available for installation on the cluster, and you can roll your own if you need to. Despite a few limitations, Rocks is a valuable tool for any group doing processor-intensive work in an educational setting.
