February 27, 2008

Gplot simplifies gnuplot graph creation

Author: Ben Martin

Gnuplot can generate sophisticated graphs and output them in vector or bitmap image formats. It can produce many graph types, and you can customize the look of the output to a great extent. But the customizability of gnuplot can work against it when all you want is a simple line graph comparing two series of data points. In those cases, gplot lets you use gnuplot to create simple graphs using more semantic options to customize the appearance of common graph objects.

Gplot's main dependencies, gnuplot and Perl, should be packaged by your distribution. As gplot is a Perl script, installation simply involves copying the gplot.pl file from the distribution tarball to a location in your path, such as /usr/local/bin.

You need one or more sets of input data in order to generate a graph image. By default each graph line comes from its own input file. Each input file holds a single coordinate per line in the format "X Y", with the two values separated by white space. If your input contains only the Y values and you are happy to have the X values start at zero and increment by one for each value, you can use the -onecolumn option to let gplot accept a file containing only Y values. If you have a matrix of data values stored in a single data file, the -using X:Y option lets you pick out which columns in the matrix supply the X and Y values for the data you wish to plot.
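To make the input formats concrete, here is a minimal Python sketch of the two transformations just described: numbering a Y-only file from zero, and picking two columns out of a multi-column file. It is an illustration only, not gplot's own code. The function names one_column_to_xy and select_columns are hypothetical, and 1-based column numbering is an assumption that mirrors gnuplot's "using" convention.

```python
def one_column_to_xy(lines):
    """Pair each Y value with an X that starts at 0 and increments by 1."""
    values = [line.strip() for line in lines if line.strip()]
    return [f"{x} {y}" for x, y in enumerate(values)]


def select_columns(lines, x_col, y_col):
    """Pick the X and Y columns from whitespace-separated rows.

    Columns are numbered from 1 here; that is an assumption, not a
    documented gplot guarantee.
    """
    rows = [line.split() for line in lines if line.strip()]
    return [f"{row[x_col - 1]} {row[y_col - 1]}" for row in rows]


if __name__ == "__main__":
    print("\n".join(one_column_to_xy(["1", "2", "3", "4"])))
    print("\n".join(select_columns(["10 0.5 7", "20 0.7 9"], 1, 3)))
```

Running a quick conversion like this by hand is a cheap way to confirm that a data file has the shape you expect before handing it to gplot.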

The commands below take an input data series and create a JPEG graph of it, shown after the command output. I have broken the command line after the title because the next two options (name and onecolumn) relate to the data file t1.data. Options that relate to a particular data file apply to the next data file path given on the command line. For example, if there were a second data file, I could repeat the name and onecolumn options after t1.data but before citing the second data file's path to create two lines in the output graph, one for each data file.

$ cat t1.data
1
2
3
4
2
3
2
3
$ gplot.pl -type jpg -title "Fun Index" \
-name "kitten index" -onecolumn t1.data
Converting 't1.data' into two columns of data
Created GUPLOT output in '/tmp/gplot.jpg'

[Figure: the "Fun Index" graph produced by the command above]
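The way per-file options bind to the next data file can be sketched by assembling the argument list programmatically. This Python sketch is an illustration only: the helper build_gplot_args, the second data file t2.data, and its series name are hypothetical, and only options already shown above are used.

```python
import shlex


def build_gplot_args(title, series, image_type="jpg"):
    """Build a gplot.pl argument list.

    `series` is a list of (name, path) pairs. Each pair's options are
    emitted immediately before its path, because per-file options
    apply to the next data file on the command line.
    """
    args = ["gplot.pl", "-type", image_type, "-title", title]
    for name, path in series:
        args += ["-name", name, "-onecolumn", path]
    return args


if __name__ == "__main__":
    # t2.data and "puppy index" are made up for illustration.
    args = build_gplot_args(
        "Fun Index",
        [("kitten index", "t1.data"), ("puppy index", "t2.data")],
    )
    # shlex.join quotes arguments that contain spaces.
    print(shlex.join(args))
```

Keeping each series as a (name, path) pair makes it hard to attach an option to the wrong file by accident, which is the main risk of order-sensitive options.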

You can change the bland X and Y axis labels using the -xlabel foo and -ylabel bar command-line options, respectively. If you use the X axis to show time data -- for example, for Web traffic statistics -- you can use the -dateformat option to tell gplot what time format the X axis time data is in. For example, %d/%m/%y will parse slash-separated day/month/year information as X values. For other time parsing field descriptors, see strptime(3).
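Because the field descriptors are the standard strptime ones, you can check a format string against your data before graphing it. The short Python sketch below does this with datetime.strptime, which accepts largely the same descriptors. The sample dates are invented for illustration, and the helper check_dates is hypothetical.

```python
from datetime import datetime


def check_dates(date_format, samples):
    """Parse each sample with date_format; raises ValueError on a mismatch."""
    return [datetime.strptime(sample, date_format) for sample in samples]


if __name__ == "__main__":
    for parsed in check_dates("%d/%m/%y", ["27/02/08", "01/03/08"]):
        print(parsed.date().isoformat())
```

A ValueError from this check points at the first value that does not match the format, which is usually easier to diagnose than a graph with misplaced points.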

Since gplot uses gnuplot to create graphs, you might wish to see what is fed to gnuplot. The -showgnuplot option prints this intermediate gnuplot file to stdout before gplot proceeds to process it.

If you wish to customize the gnuplot output that gplot generates, you can modify a copy of the template gnuplot file that gplot uses and pass it to gplot with -plotcmds, so that gplot starts from your customized file. The template gnuplot file allows for very simple substitutions and conditionals. The default template file is embedded in the gplot.pl Perl script itself on line 104 where GNUCMDS is defined.

Names in the template gnuplot file that are enclosed between % characters are replaced with their values when gplot is executed. The example below places the value of the -xlabel command-line parameter where %XLABEL% appears in the template.

set xlabel "%XLABEL%" font "%FONT%,%FONTSIZE%"
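The substitution step can be sketched in a few lines of Python. This is a minimal illustration of the idea, not gplot's actual implementation: the function name substitute and the sample option values are hypothetical, and leaving unknown tokens untouched is a design choice of this sketch rather than documented gplot behavior.

```python
import re

# A token is an upper-case name between two % characters, e.g. %XLABEL%.
TOKEN = re.compile(r"%([A-Z]+)%")


def substitute(template_line, opts):
    """Replace each %NAME% token with opts['name']; leave unknown tokens alone."""
    def lookup(match):
        key = match.group(1).lower()
        return str(opts[key]) if key in opts else match.group(0)
    return TOKEN.sub(lookup, template_line)


if __name__ == "__main__":
    # Sample values are made up for illustration.
    opts = {"xlabel": "Month", "font": "Helvetica", "fontsize": 12}
    print(substitute('set xlabel "%XLABEL%" font "%FONT%,%FONTSIZE%"', opts))
```

Restricting tokens to upper-case names means that gnuplot time formats such as "%b/%Y" pass through unchanged in this sketch, because they never form a complete %NAME% token.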

One might hope that any command-line parameters, or those starting with a given prefix, could be substituted in this manner. Gplot will generate an error if you attempt to pass a new command-line option for template substitution like this. Adding options for use in template substitution is not difficult though. The gplot.pl script needs two slight modifications: change my %opts on line 131 to include the new options, and Getopt::Long::GetOptions on line 155 to include each new option and its type:

my %opts = (type => 'xwin', outfile => '/tmp/gplot.%TYPE%',
...
style => 'linespoints', foo => 'defaultfoo',
);
...
Getopt::Long::GetOptions( \%opts, 'help', 'showgnuplot',
...'font=s', 'fontsize=i',.... 'verbose', 'foo=s',
@plotoptions
) || die "Failed to parse options\n";

With the addition of the foo command-line option, the template can include %FOO% wherever you wish:

set title "%TITLE% %FOO%" font "%TFONT%,%TFONTSIZE%"

Customizing gnuplot template files lets you produce some fairly advanced graphs, using templates to specialize how things appear while retaining the simple interface of gplot.

The following example uses three data series of per-month statistics. Since I'm using time-related data I can take advantage of the -dateformat option to parse human-readable month names and numeric years. One downside to using -dateformat is that the axis ticks are not optimal in the default configuration. To get around that I created my own gplot template file that labels the X axis ticks in a more intuitive format. The main things to note in my-template.gplot are the last few lines, which turn off the minor X ticks and define how the major X ticks are labeled.

$ cat my-template.gplot
# GPLOT.PL pseudo commands for GNUPLOT
IF TYPE eq ps set terminal postscript landscape color
IF TYPE eq eps set terminal postscript eps color
IF TYPE eq jpg set terminal jpeg
IF TYPE eq png set terminal png
IF TYPE eq pbm set terminal pbm
IF TYPE ne xwin set output "%OUTFILE%"
set xlabel "%XLABEL%" font "%FONT%,%FONTSIZE%"
set ylabel "%YLABEL%" font "%FONT%,%FONTSIZE%"
set multiplot
set autoscale
set data style lines
set border 3
set xtics border nomirror
set ytics border nomirror
set origin 0.0,0.0
set title "%TITLE%" font "%TFONT%,%TFONTSIZE%"
set xdata time
set timefmt "%b/%Y"
set format x "%b/%Y"
set style line 10 lt 1 lw 1 pt 5 ps 0.65
set style line 11 lt 3 lw 1 pt 1 ps 0.65
set style line 12 lt 7 lw 1 pt 10 ps 0.65
unset mxtics
set xtics ("jan/2007", "feb/2007", "mar/2007", "apr/2007" )
$ cat 1
jan/2007 1200
feb/2007 1542
mar/2007 4343
apr/2007 2424
$ gplot.pl -plotcmds my-template.gplot -showgnuplot -dateformat '%b/%Y' \
-title "Web Hits" -ylabel Hits -xlabel Month -type jpg \
-name sub1.example.com 1 \
-name sub2.example.com 2 \
-name sub3.example.com 3

[Figure: the "Web Hits" graph produced by the command above]
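The IF lines at the top of my-template.gplot can be read as one-line conditionals: the rest of the line is kept only when the named option compares equal (eq) or not equal (ne) to the given value. The Python sketch below implements that reading. The semantics are inferred from the template itself rather than taken from gplot.pl, and the function name apply_conditionals is hypothetical.

```python
import re

# Assumed line shape: IF <NAME> <eq|ne> <value> <gnuplot command>
CONDITIONAL = re.compile(r"^IF\s+([A-Z]+)\s+(eq|ne)\s+(\S+)\s+(.*)$")


def apply_conditionals(template_lines, opts):
    """Keep plain lines; keep an IF line's command only if its test passes."""
    kept = []
    for line in template_lines:
        match = CONDITIONAL.match(line)
        if match is None:
            kept.append(line)
            continue
        name, op, value, command = match.groups()
        is_equal = str(opts.get(name.lower(), "")) == value
        # "eq" passes when the values match, "ne" when they differ.
        if is_equal == (op == "eq"):
            kept.append(command)
    return kept


if __name__ == "__main__":
    template = [
        "IF TYPE eq jpg set terminal jpeg",
        "IF TYPE eq png set terminal png",
        'IF TYPE ne xwin set output "%OUTFILE%"',
        "set autoscale",
    ]
    print("\n".join(apply_conditionals(template, {"type": "jpg"})))
```

Read this way, the template selects exactly one terminal line for the requested output type and sets an output file for every type except the on-screen xwin display.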

Gplot can also generate vector format output using -type eps. If you use pdflatex to generate academic papers, you can use epstopdf to convert the EPS output to PDF so that you can include gplot graphs.

At times you can run up against the boundaries of gplot and have to write gnuplot directives in a gplot template file to achieve the desired result, but the ability to substitute variables in the gnuplot template file should allow you to handle many graphing needs using gplot. Gplot makes creating simple graphs a simple task, and still lets you tap into anything that gnuplot can do if the need arises.
