October 29, 2008

More tricks with BashDiff

Author: Ben Martin

Yesterday we took a look at BashDiff, a patch for the bash shell that adds new capabilities, and covered some of the additions it makes to bash's commands and string parsing abilities. Today we'll look at modifying positional parameters, parsing XML, talking to ISAM and relational databases, creating GTK+2 GUIs, and a few other tricks and issues.

BashDiff adds commands that help you manipulate positional parameters, often with much greater efficiency than the normal bash routes. The pcursor command lets you manipulate the current positional parameters as a single group: you can clear them all, and store and restore them all from a stack. New commands prefixed with pp_ help you manipulate the positional parameters themselves. These commands all take an optional -a parameter that lets you name an array to work on instead of the implicit positional parameters array. For example, the normal bash command to append to the positional parameters is shown below, followed by the BashDiff equivalent.

$ set -- "$@" Z
$ pp_append Z
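
In vanilla bash, quoting "$@" matters when you append this way, because an unquoted $@ is split again on whitespace. The short sketch below shows the difference; the function names count_unquoted and count_quoted are made up for illustration, and nothing here needs the patched shell.

```shell
#!/usr/bin/env bash
# Vanilla bash only. Both function names are hypothetical, for illustration.

# Unquoted $@ is re-split on whitespace, so "a b" becomes two parameters.
count_unquoted() { set -- $@ Z; echo "$#"; }

# Quoted "$@" keeps every existing parameter intact before appending Z.
count_quoted() { set -- "$@" Z; echo "$#"; }

count_unquoted "a b" c   # prints 4
count_quoted "a b" c     # prints 3
```

If none of your parameters can contain whitespace or glob characters, the two forms behave the same.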

Many of the pp_ commands are much more efficient than the alternatives in the standard bash shell. How much that gain matters to you depends on how heavily you manipulate positional parameters, particularly in a loop. The case shown below is one extreme, where an expensive prepend is done 1,000 times in a loop. As you can see, the pp_ version is much more efficient. However, if the loop runs for only 100 iterations, the standard bash set command version completes in 0.15 seconds. Even though the BashDiff version needs only 0.006 seconds, the speed of the standard bash syntax is likely to be acceptable, unless the script that manipulates the positional parameters 100 times is itself called many times from another script.

$ pp_trim 10000
$ time for i in `seq 1 1000`; do set -- Zoldpre $@; done

real 0m10.157s
user 0m9.880s
sys 0m0.004s

$ pp_trim 10000
$ time for i in `seq 1 1000`; do pp_push Zoldpre; done

real 0m0.088s
user 0m0.036s
sys 0m0.002s

BashDiff includes support for the Expat XML parser. You can process only one XML file per call of the new expat builtin. As the XML is processed, the callbacks you specify are called when elements, comments, namespaces, and so on are encountered. BashDiff does some housekeeping for you; for example, when you are in a callback, the XML_TAG_STACK array includes the names of all XML elements that lead to the current one, while XML_ELEMENT_DEPTH tells you the depth of the current XML element. XML attributes do not get explicit callbacks of their own; rather, all the attributes are passed as parameters to the start-of-element callback.

It is convenient to use bash functions as your Expat callback entry points. The example below parses the simple ugu.xml file using the BashDiff expat builtin. The startelem function handles the start of a new XML element and uses the standard bash declare builtin to show the contents of the XML_TAG_STACK array when it is called. I've displayed the second parameter that was handed to startelem by BashDiff to show you how XML attributes are handled.

$ cat ugu.xml
<ugu foo="bar" linux="fun">
<nested one="two" three="four" />
</ugu>
$ startelem() {
declare -p XML_TAG_STACK;
echo $2;
}
$ expat -s startelem ugu.xml
declare -a XML_TAG_STACK='([0]="ugu")'
declare -a XML_TAG_STACK='([0]="nested" [1]="ugu")'
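
Because XML_TAG_STACK lists the current element first and its ancestors after it, a callback can turn the stack into a path-like string. The sketch below is one way that might look; element_path is a hypothetical helper, not part of BashDiff, and the stack is filled in by hand here so the code can be tried in an unpatched bash.

```shell
#!/usr/bin/env bash
# element_path is a hypothetical helper, not a BashDiff command.
# It assumes index 0 of XML_TAG_STACK holds the current element and the
# last index holds the outermost one, as in the output shown above.
element_path() {
    local path="" i
    # Walk from the outermost element down to the current one.
    for (( i = ${#XML_TAG_STACK[@]} - 1; i >= 0; i-- )); do
        path+="/${XML_TAG_STACK[i]}"
    done
    printf '%s\n' "$path"
}

# Simulate the stack reported for the <nested> element.
XML_TAG_STACK=([0]="nested" [1]="ugu")
element_path   # prints /ugu/nested
```

In a patched shell you could presumably call element_path from inside startelem, though that use is an assumption and was not tested here.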

The BashDiff patch brings in support for easily dealing with gdbm ISAM files. To create a database and set a value, use the gdbm builtin, supplying the filename and the key-value pairs you want to set. The -e option to gdbm lets you test whether a key is set, and optionally, when it is set, that it has the value you are expecting. This latter case is useful when you are writing a shell script that can be configured to optionally perform additional processing. There are also options to bring in all the keys or values or import the entire gdbm database into a shell array.

The example below first sets a single key-value pair, creating the test.gdb file if it does not already exist. It then performs various tests to see if a key-value pair exists, and shows the result of each test. Two key-value pairs are then set at once, and the entire gdbm file is read into a bash array with the -W option. I used the standard bash declare builtin as the last command to show you what the imported data looks like when in a bash array. Note that you can also use the -K and -V options to set up arrays containing only the keys and values respectively.

$ gdbm test.gdb key value1
$ gdbm -e test.gdb key
$ echo $?
$ gdbm -e test.gdb key2
$ echo $?
$ gdbm -e test.gdb key value1
$ echo $?
$ gdbm -e test.gdb key value2
$ echo $?
$ gdbm test.gdb key2 value2 key3 value3
$ gdbm -W thedata test.gdb
$ declare -p thedata
declare -a thedata='([0]="key2" [1]="value2" [2]="key" [3]="value1" [4]="key3" [5]="value3")'
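
The array that -W fills is flat, with keys and values alternating. If you would rather look values up by key, the following sketch folds such an array into a bash associative array. It assumes bash 4 or later, the name lookup is made up for illustration, and thedata is copied by hand from the output above so the code runs without BashDiff.

```shell
#!/usr/bin/env bash
# Requires bash 4 or later for associative arrays.
# Hand-copied from the gdbm -W output shown in the article.
thedata=([0]="key2" [1]="value2" [2]="key" [3]="value1" [4]="key3" [5]="value3")

# lookup is a hypothetical name for the key-to-value map.
declare -A lookup
for (( i = 0; i + 1 < ${#thedata[@]}; i += 2 )); do
    # Even indexes hold keys, the following odd index holds the value.
    lookup["${thedata[i]}"]="${thedata[i+1]}"
done

echo "${lookup[key]}"    # prints value1
echo "${lookup[key3]}"   # prints value3
```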

The Lsql command allows you to get at SQLite databases from a BashDiff shell. The general form is Lsql -d file.sqlite SQLquery. When called this way, Lsql prints any results to standard output. You can also use the optional -a option to supply the name of a bash array in which to store the results instead of printing them.

The example below doesn't really use the data from any SQLite database, but if realdata.sqlite contained a database, you could change the SQL command to perform real work like table joins, and the results would be stored into the table array just as in the example.

$ Lsql -a table -d realdata.sqlite "select 'foo','bar','bill','ted';"
$ declare -p table
declare -a table='([0]="foo" [1]="bar" [2]="bill" [3]="ted")'

There are also Psql and Msql commands to connect to PostgreSQL and MySQL databases in a similar manner to Lsql. The -d option to Lsql makes no sense for Psql and Msql and is replaced with other parameters telling BashDiff where your database is running and the login credentials to use when connecting.

If you want to make a small, modern GUI from bash, BashDiff's gtk command might be what you are looking for. The invocation is simply gtk gtk.xml, where the XML can also be made available through redirection. Using XML from bash to define the GUI is not as clunky as you might expect, as the XML schema is quite simple. You can generate options and buttons using bash shell scripts that just spit out the expected XML elements. You get communication from the GUI by using special XML attributes like command, which, for example, allows you to specify a bash command to execute when the user clicks on a button. Also, the id attribute names a shell variable that the value in the text entry or combo box should be stored into when the user closes the GUI.

I managed to get the gtk command to segfault a few times during testing, but on later testing was unable to easily replicate the issue.

Shown below is the XML to create a simple text entry GUI that stores the value of the entry into the $entry shell variable when the user clicks the OK button.

$ cat gtk2.xml
<dialog border="10" buttons="gtk-ok,gtk-cancel" id="dialog">
<entry id="entry" initial="initial text"/>
</dialog>
$ gtk gtk2.xml
$ echo $entry
this is the text
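
Since the GUI description is plain XML, a shell function can generate it. The sketch below prints a dialog like the one in gtk2.xml, using only the elements and attributes that appear in this article and closing the dialog element so the XML is well-formed. The function name make_entry_dialog is made up for illustration, and the code only prints text, so it runs in an unpatched bash.

```shell
#!/usr/bin/env bash
# make_entry_dialog is a hypothetical helper; it only prints XML.
# $1 is the shell variable name for the entry, $2 is its initial text.
# Note: the arguments are inserted as-is, with no XML escaping.
make_entry_dialog() {
    local id=$1 initial=$2
    cat <<EOF
<dialog border="10" buttons="gtk-ok,gtk-cancel" id="dialog">
<entry id="$id" initial="$initial"/>
</dialog>
EOF
}

make_entry_dialog entry "initial text"
```

In a BashDiff shell the output could presumably be handed to gtk through redirection, as described above, but that step is an assumption and was not tested here.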

BashDiff includes many other handy little expansions to the bash syntax and semantics. For example, <<+ allows you to have a here document where leading indentation is preserved relative to the input.

The vplot command allows you to take data from one or two bash arrays and create an x-y plot in the terminal. You can supply the names of the arrays or simply give the points directly on the command line in the form x1 y1 x2 y2 etc.

The BashDiff patch also includes an RPN calculator, and allows you to mix text and bash commands into a single file in a manner similar to the way PHP works.

There are also some functions with less general utility, such as creditcard and cardswipe, for dealing with credit card numbers and extracting information from raw cardswipe data supplied on stdin respectively. These are complemented by other commands like protobase for dealing with a specific company's point-of-sale hardware.

BashDiff also exposes many of the functions from ctype.h to let you test whether a string has a given form. Unfortunately I couldn't figure out how to get these to work properly. Executing isnumber with no arguments gives the usage pattern shown below. However, executing isnumber upper G produced an "invalid number" error instead of the expected confirmation that G is indeed upper case.

$ isnumber
isnumber: usage: isnumber { alnum | alpha | ascii | blank ...
| cntrl | digit | graph | lower | print | punct | space ...
| upper | xdigit | letter | word } number...

$ isnumber upper G
bash+william: isnumber: G: invalid number
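
If all you need is a character-class test on a string, vanilla bash can do that with POSIX character classes in a [[ ]] pattern match. This is a different technique from BashDiff's ctype builtins, not a fix for them, and is_upper is a made-up helper name.

```shell
#!/usr/bin/env bash
# is_upper is a hypothetical helper built on vanilla bash pattern matching,
# not on BashDiff's ctype builtins.
is_upper() {
    # Succeeds only for a non-empty string with no character outside [:upper:].
    [[ -n $1 && $1 != *[^[:upper:]]* ]]
}

if is_upper G; then echo "G is upper case"; fi
if ! is_upper g; then echo "g is not upper case"; fi
```

The same pattern works with the other class names, such as [:digit:] or [:alpha:].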

Whether the features of BashDiff are enticing enough for you to want to replace your unpatched shell is the hard question. BashDiff adds a collection of things that make scripting more convenient, such as non-match alternatives for commands like case, but you can work around the lack of these options in your scripts using just the vanilla bash at the expense of having slightly more contorted scripts. If you are dealing with relational databases or large lookup tables (gdbm), BashDiff might be worth having around just for these two features, even if it does not become your login bash.

