Linux.com

Feature

CLI Magic: Tracking system performance with sar

By Keith Winston on March 20, 2006 (8:00:00 AM)

Sar is the "system activity report" program found on *nix systems. In Linux, you can usually find it in the sysstat package, which includes programs and scripts to capture and summarize performance data, then produce detailed reports. This suite of programs can be useful in tracking down performance bottlenecks and providing insight into how the system is used throughout the day.

User Level: Intermediate
Sadc (system activity data collector) is the program that gathers performance data. It pulls its data out of the virtual /proc filesystem, then it saves the data in a file (one per day) named /var/log/sa/saDD where DD is the day of the month.
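
As a small illustration of that naming scheme, the shell sketch below prints the data file path for a given day of the month. The function name sa_data_file and the SA_DIR variable are made up for this example; the default directory is the one named above, and your system may keep the files elsewhere.

```shell
# Print the sadc data file path for a day of the month (default: today).
# sa_data_file and SA_DIR are illustrative names, not part of sysstat.
sa_data_file() {
    day="${1:-$(date +%d)}"
    printf '%s/sa%s\n' "${SA_DIR:-/var/log/sa}" "$day"
}
```

For example, sa_data_file 05 prints /var/log/sa/sa05.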

Two shell scripts from the sysstat package control how the data collector is run. The first script, sa1, controls how often data is collected, while sa2 creates summary reports (one per day) in /var/log/sa/sarDD. Both scripts are run from cron. In the default configuration, data is collected every 10 minutes and summarized just before midnight.
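
The cron setup described above might look roughly like the two lines this sketch prints. The script directory (/usr/lib/sa), the root user field, the "1 1" arguments to sa1, the -A argument to sa2, and the 23:53 summary time are assumptions based on common packaging, so compare them with the cron file your distribution actually installs. The helper name sysstat_cron is hypothetical.

```shell
# Print a cron.d-style fragment for sa1 and sa2.
# Usage: sysstat_cron [minutes-between-samples] [script-directory]
# Paths and schedule are assumptions; verify against your own system.
sysstat_cron() {
    minutes="${1:-10}"
    sa_bin="${2:-/usr/lib/sa}"
    # Collect one sample every $minutes minutes.
    printf '*/%s * * * * root %s/sa1 1 1\n' "$minutes" "$sa_bin"
    # Build the daily summary report shortly before midnight.
    printf '53 23 * * * root %s/sa2 -A\n' "$sa_bin"
}
```

Calling sysstat_cron 1 would print a fragment that samples every minute instead of every 10.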

If you suspect a performance problem with a particular program, you can use sadc to collect data on a particular process (with the -x argument), or its children (-X), but you will need to set up a custom script using those flags.

As Dr. Heisenberg showed, the act of measuring something changes it. Any tool that collects performance data has some overall negative impact on system performance, but with sar, the impact seems to be minimal. I ran a test with the sa1 cron job set to gather data every minute (on a server that was not busy) and it didn't cause any serious issues. That may not hold true on a busy system.

Creating reports

If the daily summary reports created by the sa2 script are not enough, you can create your own custom reports using sar. The sar program reads data from the current daily data file unless you specify otherwise. To have sar read a particular data file, use the -f /var/log/sa/saDD option. You can select multiple files by using multiple -f options. Since many of sar's reports are lengthy, you may want to redirect the output to a file.
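
Here is a minimal sketch of reading one day's data file with -f and saving the report as text. The names sar_day_report and SA_DIR are invented for the example, and the directory is the one used in this article.

```shell
# Write sar's default report for one day's data file to a text file.
# Usage: sar_day_report DD OUTPUT_FILE
# sar_day_report and SA_DIR are illustrative names, not part of sysstat.
sar_day_report() {
    day="$1"
    out="$2"
    sar -f "${SA_DIR:-/var/log/sa}/sa${day}" > "$out"
}
```

For example, sar_day_report 15 /tmp/cpu-15.txt would save the report for the 15th, assuming that day's data file still exists.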

To create a basic report showing CPU usage and I/O wait time percentage, use sar with no flags. It produces a report similar to this:

01:10:00 PM       CPU     %user     %nice   %system   %iowait     %idle
01:20:00 PM       all      7.78      0.00      3.34     20.94     67.94
01:30:00 PM       all      0.75      0.00      0.46      1.71     97.08
01:40:00 PM       all      0.65      0.00      0.48      1.63     97.23
01:50:00 PM       all      0.96      0.00      0.74      2.10     96.19
02:00:00 PM       all      0.58      0.00      0.54      1.87     97.01
02:10:00 PM       all      0.80      0.00      0.60      1.27     97.33
02:20:01 PM       all      0.52      0.00      0.37      1.17     97.94
02:30:00 PM       all      0.49      0.00      0.27      1.18     98.06
Average:          all      1.85      0.00      0.44      2.56     95.14

If %idle stays near zero, the CPU is likely overloaded. If %iowait is consistently large, the system is spending much of its time waiting on I/O, which usually points to overloaded disks.
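
To pick the busy intervals out of a long report, a small AWK filter can help. The sketch below assumes the layout shown in the sample above (a 12-hour timestamp followed by AM or PM, and a header row naming the %iowait column); the function name flag_iowait and the default threshold of 10 percent are arbitrary choices for illustration.

```shell
# Print the time and %iowait of every interval above a threshold.
# Usage: sar | flag_iowait [threshold-percent]
# Assumes the report layout from the sample above; flag_iowait is a made-up name.
flag_iowait() {
    awk -v limit="${1:-10}" '
        # Header row: remember which column holds %iowait, then skip the row.
        { for (i = 1; i <= NF; i++) if ($i == "%iowait") { col = i; next } }
        # Data rows: print time and %iowait when it exceeds the limit.
        # The Average: row is skipped because its columns are shifted.
        col && $1 != "Average:" && NF >= col && $col + 0 > limit + 0 {
            print $1, $2, $col
        }
    '
}
```

Run against the sample report above with a threshold of 10, this prints only the 01:20:00 PM interval.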

To check the kernel's paging performance, use sar -B, which will produce a report similar to this:

11:00:00 AM  pgpgin/s pgpgout/s   fault/s  majflt/s
11:10:00 AM      8.90     34.08      0.00      0.00
11:20:00 AM      2.65     26.63      0.00      0.00
11:30:00 AM      1.91     34.92      0.00      0.00
11:40:01 AM      0.26     36.78      0.00      0.00
11:50:00 AM      0.53     32.94      0.00      0.00
12:00:00 PM      0.17     30.70      0.00      0.00
12:10:00 PM      1.22     27.89      0.00      0.00
12:20:00 PM      4.11    133.48      0.00      0.00
12:30:00 PM      0.41     31.31      0.00      0.00
Average:       130.91     27.04      0.00      0.00

Raw paging numbers may not be of concern, but a high number of major faults (majflt/s) indicates that the system needs more memory. Note that majflt/s is only valid with kernel versions 2.5 and later.

For network statistics, use sar -n DEV. The -n DEV option tells sar to generate a report that shows the number of packets and bytes sent and received for each interface. Here is an abbreviated version of the report:

11:00:00 AM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s
11:10:00 AM        lo      0.62      0.62     35.03     35.03
11:10:00 AM      eth0     29.16     36.71   4159.66  34309.79
11:10:00 AM      eth1      0.00      0.00      0.00      0.00
11:20:00 AM        lo      0.29      0.29     15.85     15.85
11:20:00 AM      eth0     25.52     32.08   3535.10  29638.15
11:20:00 AM      eth1      0.00      0.00      0.00      0.00
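
Because the report repeats every interface at every interval, it can be handy to boil it down to one line per interface. The following AWK sketch averages the rxbyt/s and txbyt/s columns, assuming the exact column layout shown above (12-hour timestamps, interface name in the third field, byte counters in the sixth and seventh). The name iface_mean_bytes is hypothetical.

```shell
# Print "interface mean-rxbyt/s mean-txbyt/s" for each interface.
# Usage: sar -n DEV | iface_mean_bytes
# Assumes the column layout of the sample above; output order is not defined.
iface_mean_bytes() {
    awk '
        # Skip header rows, the Average: summary, and short or blank lines.
        $3 == "IFACE" || $1 == "Average:" || NF < 7 { next }
        # Accumulate per-interface sums and sample counts.
        { n[$3]++; rx[$3] += $6; tx[$3] += $7 }
        END {
            for (i in n)
                printf "%s %.2f %.2f\n", i, rx[i] / n[i], tx[i] / n[i]
        }
    '
}
```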

To see network errors, try sar -n EDEV, which reports failure statistics for each interface.

Reports on current activity

Sar can also be used to view what is happening with a specific subsystem, such as networking or I/O, almost in real time. By passing a time interval (in seconds) and a count for the number of reports to produce, you can take an immediate snapshot of a system to find a potential bottleneck.

For example, to see the basic report every second for the next 10 seconds, use sar 1 10. You can run any of the reports this way to see near real-time results.
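
Several reports can be requested in one run so that they all cover the same intervals. This sketch combines the CPU (-u), paging (-B), and network (-n DEV) reports; the wrapper name sar_snapshot and its defaults of 1 second and 10 samples are just illustrative.

```shell
# Show CPU, paging, and per-interface network reports side by side in time.
# Usage: sar_snapshot [interval-seconds] [count]
# sar_snapshot is a made-up wrapper name; the flags are the ones discussed above.
sar_snapshot() {
    interval="${1:-1}"
    count="${2:-10}"
    sar -u -B -n DEV "$interval" "$count"
}
```

Calling sar_snapshot 2 5 would sample every two seconds, five times.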

Benchmarking

Even if you have plenty of horsepower to run your applications, you can use sar to track changes in the workload over time. To do this, save the summary reports (by default, only the last seven days are kept) to a different directory over a period of a few weeks or a month. This set of reports can serve as a baseline for the normal system workload. Then compare new reports against the baseline to see how the workload is changing over time. You can automate your comparison reports with AWK or your favorite programming language.

In large systems management, benchmarking is important to predict when and how hardware should be upgraded. It also provides ammunition to justify your hardware upgrade requests.

Digging deeper

In my experience, most hardware performance problems are related to the disks, memory, or CPU. Perhaps more frequently, application programming errors or poorly designed databases cause serious performance issues.

Whatever the problems, sar and friends can give you a comprehensive view of how things are working and help track down bottlenecks to fix a sluggish system. The examples here just scratch the surface of what sar can do. If you take a look at the man pages, it should be easy to customize a set of reports for your needs.

Comments

on CLI Magic: Tracking system performance with sar

Note: Comments are owned by the poster. We are not responsible for their content.

Reading?

Posted by: Anonymous Coward on March 20, 2006 07:04 PM
Is something broke, or is an account required just for reading now? I am losing hope in the linux community.

#

Re:Reading?

Posted by: Anonymous Coward on March 20, 2006 07:48 PM
No, they just included wrong link. The real one is
http://enterprise.linux.com/enterprise/06/02/24/2058233.shtml?tid=89

#

nice article, thank you

Posted by: Anonymous Coward on March 24, 2006 03:01 AM
I had not heard of sar before, and it is very useful. Thank you!

#

CLI Magic: Tracking system performance with sar

Posted by: Anonymous [ip: 128.112.203.213] on September 10, 2007 02:53 PM
Benchmarking... "To do this, save the summary reports (sar only saves seven) to a different directory over a period of a few weeks or a month."

You can use /etc/sysconfig/sysstat to change the default number of reports saved, up to a month's worth.

#

CLI Magic: Tracking system performance with sar

Posted by: Anonymous [ip: 74.0.247.74] on October 11, 2007 04:30 AM
very nice

do you have a way of converting device name to physical hardware device?

#

This story has been archived. Comments can no longer be posted.



 