What to watch out for when writing portable shell scripts

Shell scripts are a popular choice for writing small programs that do file manipulation. They are generally portable across platforms, but there are a number of things that can make a shell script work fine on one machine and fail on another. This article reviews some of the issues shell programmers may run into when trying to write widely portable scripts.

Portable to what?

It can be tough to know what your target platforms must be when you first write a script. Do you want to target just Linux boxes? Just a specific Linux distribution? System V systems only? There’s a lot to be said for developing for a broader selection of platforms rather than a narrower one. For instance, one program I used had an installation script that specified /bin/sh as its interpreter. It depended on extensions found only in bash. The script was developed to run on a Linux system, but, apart from this installer, it worked perfectly well in NetBSD’s Linux emulation.

One common surprise is the discovery that a given script has to run on Windows. You might laugh this off, given the impossibility of porting a Unix shell script to a COMMAND.COM batch file; unfortunately, requirements sometimes come from left field. Cygwin and the MKS Toolkit both allow the use of Unix shell scripting facilities on Windows systems. (Windows Services For Unix, while in principle freely available for the same purpose, runs only on some Windows systems. It won’t install on Windows XP Home Edition, for example, and can reasonably be ignored.)

Shell standards

In general, the shell’s behavior and the behavior of common utilities are governed by the POSIX standard. A number of other standards, such as the Single Unix Specification, may apply, but you can probably do everything you want within the POSIX spec, which is a little more reliably supported.

There are a few major criteria in shell script portability. These include the features of the shell itself, the set of available utilities, and the exact features and options of these utilities.

For the most part, as with any exercise in portability, the best thing to do is minimize dependencies. Don’t start by targeting the full feature set of ksh or bash for a simple script. Don’t use utilities that aren’t available on all of your target systems. When possible, find out what your target systems will be and arrange for testing. Testing, however, should not substitute for verification of official support. If you don’t realize that the Cygwin system you’re testing on was modified by another user to add a command missing from the standard installation, you’ll get an unpleasant surprise in the field. Testing is there to let you know when a system lacks a standard feature, not to let you know when you can rely on a non-standard feature.

We’ll look at what you get when you run /bin/sh. This may or may not really be a Bourne shell. The most common variants are bash (running in “compatibility mode”) and ash, a Bourne shell clone originally developed by Kenneth Almquist. Most versions of ash have evolved substantially since the initial release. On most systems, /bin/sh has POSIX features not found in the historical Bourne shell. With the MKS tools, /bin/sh is really a ksh variant — but, while #!/bin/sh scripts work with it, you can’t just invoke it as /bin/sh. If you really need to, you can get clever and set up symbolic links from /bin to the directory the shell is really in, but this is probably overkill. Most shells will do the right thing if you just invoke them as sh.

While bash does make a number of compatibility changes when invoked under the name sh, it is a shell with extensions. It will not necessarily warn about or reject syntax which would not work in a pure Bourne shell. The same goes for the MKS toolkit shell, which is really a single shell that answers to the names sh, ksh, and bash. That’s right — there’s a “bash” out there that isn’t really the GNU bash.

Shell syntax

The big thing to watch out for here is ksh and bash extensions. For instance, the [[ ... ]] and (( ... )) syntax found in ksh and bash is not found in ash-derived shells. Be sure to check your choice of features and syntax against the shell documentation.
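
As a rough sketch of how the ksh/bash-only forms can usually be avoided: case handles pattern matching, [ ... ] with quoted operands handles string comparison, and $(( )) arithmetic expansion (a POSIX feature, so absent from a historical Bourne shell) replaces (( ... )). The function and variable names here are made up for illustration.

```shell
# Hypothetical helper: replaces  [[ $1 == *.txt ]]  with a case pattern.
is_text_file() {
    case "$1" in
        *.txt) return 0 ;;
        *)     return 1 ;;
    esac
}

# Replaces  (( count++ ))  with POSIX arithmetic expansion.
count=0
count=$((count + 1))

# Replaces  [[ $answer == yes && ... ]]  with [ ] and the shell's && operator.
answer=yes
result=
if [ "$answer" = "yes" ] && is_text_file "notes.txt"; then
    result="matched after $count increment"
fi
echo "$result"
```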

Standard shells all support shell globbing (the ? and * wildcards). Brace expansion is a bash (and csh) feature; you should not expect it to work in other shells. Most of the things brace expansion is used for can be addressed by looping or globs.
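
For example, a bash command such as mkdir -p "$base"/{src,doc,test} can be written as a plain loop. This is only a sketch; the scratch directory name is an assumption chosen for the demonstration.

```shell
# Hypothetical scratch directory for the demonstration.
base=${TMPDIR:-/tmp}/brace_demo.$$

# Loop over the words that brace expansion would have generated.
for sub in src doc test
do
    mkdir -p "$base/$sub"
done

# List what was created, using an ordinary glob.
made=$(cd "$base" && echo *)
echo "$made"

rm -rf "$base"
```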

Shell expansion rules, such as ${variable##string}, are pretty widely supported. There are three groups of shell expansion rules. The first is just plain variable expansion; this works everywhere. The second is traditional Bourne shell conditional expansion rules, such as ${parameter:-word}. These have been around long enough that you shouldn’t ever find a shell which doesn’t support them. The more flexible variable expansion rules using # and % to strip patterns from shell variables were introduced in Korn shell, but are available in every /bin/sh I was able to find today. Still, there have been shells that didn’t support them; for instance, the classic SunOS /bin/sh didn’t have them.
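
A small illustration of the second and third groups, using a made-up path and a made-up variable name:

```shell
path=/usr/local/lib/libfoo.so.1    # hypothetical path
unset demo_editor                  # hypothetical variable, guaranteed unset here

# Bourne-style conditional expansion: fall back to a default.
editor=${demo_editor:-vi}

# Korn-style pattern stripping.
base=${path##*/}     # strip the longest prefix matching */
dir=${path%/*}       # strip the shortest suffix matching /*
stem=${base%%.*}     # strip the longest suffix matching .*

echo "$editor $dir $base $stem"
```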

Modern shells generally support the $(command) syntax for substituting the output of a command. On the other hand, the shorthand of $(<file) for $(cat file) is a Korn shell extension, so /bin/sh may not have it. Just use the slightly longer form.
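
In practice the longer form looks like this (the temporary file name is an assumption for the demonstration):

```shell
# Hypothetical temporary file.
tmpfile=${TMPDIR:-/tmp}/subst_demo.$$
printf '%s\n' 'hello' > "$tmpfile"

# Portable form; the ksh shorthand would be $(<"$tmpfile").
contents=$(cat "$tmpfile")

rm -f "$tmpfile"
echo "$contents"
```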

Shell globbing may or may not be case-sensitive. It is by default, but some filesystems (such as FAT32 and HFS, the traditional Windows and Mac filesystems) smash case in one way or another such that globbing isn’t case-sensitive. Likewise, you can’t safely assume that you can have two files that have the same name except for capitalization.

If you may be working with large numbers of files, you can’t rely on shell globbing across platforms being consistent; the number of arguments allowed can vary widely. Using find and xargs is a good thing. Another useful idiom is:

# "process" stands for whatever command handles one file.
find . -print | while IFS= read -r i
do
	process "$i"
done

This works on arbitrarily long lists as long as none of the filenames contain newlines.
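
When the per-file work can be handed to a command rather than done in the current shell, another option is to let find build the argument lists itself with -exec ... {} +. This form is specified by POSIX, though very old find implementations may lack it. The sketch below uses a made-up scratch directory and just prints the base name of each match.

```shell
# Hypothetical scratch directory with a few files, one containing a space.
demo=${TMPDIR:-/tmp}/find_demo.$$
mkdir -p "$demo"
: > "$demo/a.txt"
: > "$demo/b c.txt"
: > "$demo/d.log"

# find passes the names to sh in batches that fit the system's
# argument limit; each name arrives as one intact argument.
found=$(find "$demo" -type f -name '*.txt' -exec sh -c '
    for f in "$@"
    do
        printf "%s\n" "${f##*/}"
    done' sh {} + | sort)

rm -rf "$demo"
echo "$found"
```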

Shell builtins

There are two kinds of shell builtins you need to be aware of. One is features, which are merely performance enhancements; for instance, many shells have a builtin test, to avoid the overhead of launching an external command for every test performed. The other kind is commands, such as set or getopts, which must be part of the shell to work at all.

One potential gotcha is that getopts, which has been part of most shells for a long time, was disabled in the Cygwin version of /bin/sh until fairly recently. (The code to implement it was still in the shell, but it was omitted from the lookup table for builtin functions.) It is the most portable and standard way to parse options, and now that Cygwin’s got it again, it works almost everywhere.
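
A minimal getopts loop might look like the following. The option letters (-v as a flag, -o taking an argument) and the function name are invented for the example; a real script would normally run the loop at top level rather than inside a function.

```shell
# Hypothetical option set: -v (flag) and -o FILE (takes an argument).
parse_args() {
    verbose=0
    outfile=
    OPTIND=1                       # restart parsing on each call
    while getopts vo: opt
    do
        case "$opt" in
            v) verbose=1 ;;
            o) outfile=$OPTARG ;;
            *) echo "usage: script [-v] [-o file] [name ...]" >&2
               return 2 ;;
        esac
    done
    shift $((OPTIND - 1))          # drop the options that were parsed
    remaining=$*
}

parse_args -v -o out.txt input1 input2
echo "verbose=$verbose outfile=$outfile remaining=$remaining"
```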

Another issue is the question of what, if anything, you can pass to echo to suppress the newline. BSD systems tended to use echo -n; System V systems tended to use echo 'arguments\c'. POSIX echo uses the \c behavior, and will simply emit a -n when it sees that in the argument list of echo. The exact behavior of this in more complicated circumstances may vary; furthermore, some shells don’t support it. Sven Mascheck’s page has an elegant little hack which works almost all the time:

  if [ "X`echo -n`" = "X-n" ]; then
    echo_n() { echo ${1+"$@"}"\c"; }
  else
    echo_n() { echo -n ${1+"$@"}; }
  fi

You may be better off checking for availability of the printf command, also specified by POSIX, which has the ability to do this, and many other things, much more reliably. In particular, the %b format of the POSIX command-line printf supports backslash escapes. In this one, of course, \c does something different; it doesn’t just suppress some upcoming newline, it terminates all output immediately. So, for instance:

    $ printf '%b' 'foo\cbar' 'baz'
    foo$

Try to avoid using 'echo -n' or one of its equivalents.
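
With printf, a prompt without a trailing newline needs no feature test at all. The helper name below is made up; passing the text through %s keeps any percent signs or backslashes in it from being interpreted.

```shell
# Hypothetical helper: print the argument with no trailing newline.
prompt() {
    printf '%s' "$1"
}

# The following echo lands on the same line, showing no newline was added.
out=$(prompt 'Continue? [y/n] '; echo 'END')
echo "$out"
```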

If you want to read in another file, the command to do this is ., not source. Bash, in an enthusiastic movement towards ecumenicalism, has chosen to implement a few csh builtin commands, such as source and setenv, but these are not portable to other shells. If you need to set an environment variable, set it and export it as two separate instructions; the export NAME=value syntax is not entirely portable.
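
Both points can be sketched together. The file and variable names are assumptions for the demonstration; note that the file is named with an explicit path, since a bare name given to . may be looked up in $PATH.

```shell
# Hypothetical settings file.
conf=${TMPDIR:-/tmp}/dot_demo.$$.sh
printf '%s\n' 'GREETING=hello' > "$conf"

# Read the file into the current shell with ".", not "source".
. "$conf"

# Assign, then export, as two separate commands.
GREETING_TARGET=world
export GREETING_TARGET

# A child process sees only the exported variable.
child=$(sh -c 'echo "$GREETING_TARGET"')

rm -f "$conf"
echo "$GREETING $child"
```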

Shell variables

The shell variable most relevant to portability is $PATH. Avoid setting this if you don’t have to. While users could, in theory, set very bad paths that would break the behavior of a shell script, it’s very hard to predict what path you should use. It’s also possible that the user actually does know something about his system that you don’t. Path setting isn’t always portable. As a particularly notable case, the MKS toolkit tends to favor DOS-style paths, including the use of colons for drive letters and semicolons to separate path components. This behavior is reasonable for the Windows environment, but might be surprising if you make a lot of changes to $PATH. Guessing in which path a given program is installed is often difficult. If you run on a system where a commonly used program is in a path you’ve never heard of, you’ll get bitten.

One environment variable that may matter is $POSIXLY_CORRECT. This is the variable used to tell most GNU utilities to use POSIX standard behavior, even when someone thinks it’s broken. This is especially useful when you’re trying to get consistent behavior from multiple systems. It helps you pick a baseline which, while sometimes inconvenient, is at least consistent. This mostly affects the behavior of commands you call, so be sure to export it into the environment.

External commands

Needless to say, you’re not out of the woods yet. The exact options and features available in basic Unix commands sometimes vary a bit. You can do everything you need to do without stepping on this too much, but the surprises can be alarming. For instance, if you’re going back and forth between System V systems, you may be used to cpio as an archiver. Unfortunately, it’ll probably be missing on BSD systems. The many varieties of tar out there only serve to confuse the issue further. POSIX gives us an archiver called pax, which is reasonably widely available and very flexible about input formats. It also meets one of the most common reasons to want an archiver in shell scripts; the desire to copy files from one directory to another.
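
As a sketch of that directory-copy use, pax in copy mode (-r and -w together) reads a file hierarchy and writes it under an existing destination directory. The function name is hypothetical, and the example assumes pax is actually installed, which is not the case on every Linux distribution by default.

```shell
# Hypothetical helper: copy the tree under $1 into directory $2.
copy_tree() {
    mkdir -p "$2" || return 1
    # Make the destination absolute, because we change directory below.
    dest=$(cd "$2" && pwd) || return 1
    # Copy mode: read the hierarchy rooted at "." and write it under $dest.
    (cd "$1" && pax -rw . "$dest")
}
```

It would be called as copy_tree /path/to/source /path/to/destination.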

You can generally expect core file manipulation utilities to exist. Don’t get clever and try to initialize files from /dev/zero or copy them from /dev/null; in general, don’t assume that there even is a /dev, let alone that you know what’s in it. If you want an empty file, the Bourne shell offers nice, terse syntax:
>file
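
A slightly more defensive spelling puts the null command in front of the redirection, which behaves the same way in shells that treat a bare redirection specially (zsh in its native mode, for one). The file name is an assumption for the demonstration.

```shell
# Hypothetical file name.
empty=${TMPDIR:-/tmp}/empty_demo.$$

: > "$empty"                 # create the file, or truncate it if it exists

size=$(wc -c < "$empty")
size=$((size + 0))           # normalize: some wc implementations pad with spaces

rm -f "$empty"
echo "$size"
```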

Don’t get too involved with the command line options for commands; this is where differences tend to show up. The Berkeley version of ls has twenty-one of the lowercase letters used as options; they don’t all work elsewhere. The exact format of ls -l output may vary; the columns seem to be fairly consistent, but column width is likely to change a lot. You can parse it tolerably well with awk.

There’s a lot of variance in which operators are supported by the test command, sometimes also called [. Common extensions include the -a and -o operators, which perform “and” and “or” conjunctions of other tests, and the use of parentheses for grouping. Unfortunately, most man pages aren’t very specific about which of the features provided are extensions, and which are part of the formal spec. To make things more interesting, test is one of the main candidates for being a builtin shell function, so you can get even more exciting and unpredictable behavior! Tread lightly here.
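
One way to stay on safe ground is to keep each test simple and let the shell do the combining with && and ||, using { ...; } for grouping. A sketch, with made-up file and variable names:

```shell
# Hypothetical file with some content.
f=${TMPDIR:-/tmp}/test_demo.$$
printf 'data\n' > "$f"

# Instead of:  [ -f "$f" -a -s "$f" ]
if [ -f "$f" ] && [ -s "$f" ]; then
    state=nonempty
else
    state=missing-or-empty
fi

# Instead of:  [ \( "$a" = x -o "$a" = y \) -a -n "$b" ]
a=y
b=set
if { [ "$a" = x ] || [ "$a" = y ]; } && [ -n "$b" ]; then
    group=ok
else
    group=no
fi

rm -f "$f"
echo "$state $group"
```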

Hardware

The best thing to do about hardware features, such as how to echo text in a given color, or find the right name in /dev to get access to a CD, is simple. Don’t even go there. You cannot reasonably predict device names from one system to another. Not all systems offer /dev/tty or /dev/stdin, let alone specific names for hardware devices such as CD-ROM drives. If you want access to a CD, make the user mount it, then ask the user where it’s mounted.

Don’t bet on being able to control cursors or text colors. Most scripts don’t try this, but some do. Don’t assume that the target terminal is vt100-compatible, or supports ANSI escape sequences. Don’t even emit these; they can cause other terminals to behave badly. If you want colored text, write in a language with a GUI toolkit.

Where am I?

If you need to know what kind of system you’re on, or what shell you’re running, here are two things you can check:

The output of the uname command. Particularly useful are the -s (system name) and -r (system release) flags. These won’t always correspond to the name a product is sold under; OS X 10.3 is “Darwin 7.2.0.”
$BASH_VERSION and $KSH_VERSION detect most implementations of ksh and bash. MKS Toolkit gives $SHELL_VERSION.

Do not, under any circumstances, try to parse output of system utilities to guess at the operating system type. uname is more reliable. But don’t even check unless you’re sure it matters.
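
If a check really is needed, a case statement on the uname -s output and a look at the version variables is usually enough. The function names and the category labels they print are invented for this sketch, and the list of system names is far from complete.

```shell
# Hypothetical helper: map a uname -s value to a coarse category.
describe_system() {
    case "$1" in
        Linux)   echo linux ;;
        Darwin)  echo macos ;;
        *BSD)    echo bsd ;;
        CYGWIN*) echo cygwin ;;
        *)       echo unknown ;;
    esac
}

# Hypothetical helper: report which shell family is running the script.
describe_shell() {
    if [ -n "${BASH_VERSION:-}" ]; then
        echo bash
    elif [ -n "${KSH_VERSION:-}" ]; then
        echo ksh
    else
        echo other
    fi
}

os=$(uname -s)
kind=$(describe_system "$os")
echo "$os is treated as $kind, shell family: $(describe_shell)"
```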

Moving on

This article hasn’t covered everything you’ll ever run into. There are simply too many things you might have to deal with. Be cautious about using extensions and test your scripts as widely as you can. Try to test them a little further afield than you think they’ll need to run; users are full of strange requirements.

Get yourself copies of relevant specs, and look things up. Don’t just use two slightly different distributions of Linux, or two of the BSDs, and call your testing complete. If you need to target Windows systems, both Cygwin and MKS Toolkit are viable options. So far, I like MKS a little better.

A well-written shell script should be able to survive a number of years of migration to new systems with different shells. Even if you’re writing a script just for personal use, the time taken to write it correctly in the first place is time well spent.
