Common shell pitfalls
The same issues crop up again and again (unfortunately
even in my own scripts sometimes).
While there are lots of shell programming
pitfalls, the interpreter will at least tell you immediately about many of them.
The mistakes I describe below generally mean
that your script will run fine now, but may break if the data changes or you move the script
to another system.
I think part of the reason shell scripts tend to have so many issues is that
shell scripting is commonly not learned like "traditional" programming languages.
Instead, scripts tend to evolve from existing interactive command line use, or are
based on existing scripts which have themselves propagated the limitations of ancient
shell script interpreters.
It's definitely worth spending the relatively small amount of time required to
learn the shell script language properly
if one uses Linux/BSD/Mac OS X desktops or servers. This is because shell is the
main domain-specific language designed to manipulate the UNIX abstractions for data
and logic, i.e. files and processes.
So as well as being useful at the command line, its use permeates any UNIX system.
Stylistic issues
First I'll mention some ways to clean up shell scripts without changing their functionality.
Note that below (and in my shell scripts) I use a shortcut form of the conditional operator
for simple conditional operations, as it's much more concise. So I use
[ "$var" = "find" ] && echo "found" instead of the equivalent:
if [ "$var" = "find" ]; then
echo "found"
fi
[ x"$var" = x"find" ] && echo found
The x"$var" prefix was traditionally required in case $var was empty or started with a hyphen.
Thinking about this
for a moment should indicate that the shell can handle both of those cases unambiguously,
and if it doesn't, it's a bug. That bug was probably fixed about 20 years ago, so
stop propagating this nonsense please! Shell doesn't have the cleanest syntax to
start with, so polluting it with stuff like this is horrible.
[ ! -z "$var" ] && echo "var not empty"
This is a double negative, and for some reason it is very prevalent in shell scripts.
Just test the string directly, as in [ "$var" ] && echo "var not empty".
[ "$var" ] || val="value"
Setting a variable iff it's not previously set is a common idiom and can be more
succinctly expressed as
: ${var="value"}. This is portable to the vast majority of shells. Note that ${var="value"}
only assigns when var is unset; use ${var:="value"} to also assign when var is set but empty,
which matches the test above exactly.
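For example, a minimal sketch of the default-value idiom (EDITOR and the vi fallback here are just illustrative):
: ${EDITOR:=vi}    # use vi unless the user has set $EDITOR
echo "editing with $EDITOR"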
redundant use of $?
For example:
pidof program
if [ $? = 1 ]; then
    echo "program not found"
fi
Note this is not actually just stylistic. Consider what happens if `pidof` fails in
some other way and returns 2: the message would not be printed even though the program
was not found.
Instead just test the exit status of the process directly as in these examples:
if ! pidof program; then
    echo "program not found"
fi
if grep -qF "string" file; then
    echo 'file contains "string"'
fi
needless shell logic
We'll expand on this below, but one should do as little in shell as possible, beyond
its domain of connecting processes to files. For example, the following common shell
idiom of testing for files and directories can often be pushed into the programs
themselves. I.e. instead of:
[ ! -d "$dir" ] && mkdir "$dir"
[ -f "$file" ] && rm "$file"
do:
mkdir -p "$dir" #also creates a hierarchy for you
rm -f "$file" #also never prompts
Robustness
globbing
In the example below to count the lines in each file, there is a common mistake.
for file in `ls *`; do
    wc -l $file
done
Perhaps the idiom above stems from a system where the shell did not do globbing,
but in any case it's neither scalable nor robust. It's not robust because word
splitting is done, so it doesn't handle spaces in file names. It also redundantly starts
an ls process to list the files, and on some systems this form can overflow static
command line buffers when there are many files. Shell script is a language designed
to operate on files, so it has this functionality built in!
for file in *; do
    wc -l "$file"
done
Notice how we just use the '*' directly, which as well as not starting a redundant
`ls` process, doesn't do word splitting on file names containing spaces. Note this
is still slow, as we use shell looping and start a `wc` process per file, so we'll
come back to this example in the performance section below.
stopping automatically on error
Often one doesn't want a script to proceed if some command fails. Checking the status
of each command, though, can become very messy and error prone. One can instead execute
set -e at the top of the script, which usually just works as expected,
terminating the script when any command fails (that is not already part of a conditional
etc.).
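A minimal sketch (the paths here are just illustrative):
#!/bin/sh
set -e  # terminate the script as soon as any command fails

mkdir /tmp/myapp              # if this fails (e.g. the directory already exists)...
cp /etc/hostname /tmp/myapp/  # ...the copy is never attempted

# commands that are part of a conditional don't trigger the exit:
grep -q localhost /etc/hostname || echo "no match (script continues)"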
cleaning up temp files
One should always try to avoid temp files for performance/maintainability reasons,
and instead use pipes if at all possible to pass data between processes. Temporary
files can be slow as they're usually written to disk, and also you must handle cleaning
them up when your script exits, possibly in unexpected ways. The general method
for cleaning up temp files if you really do need them is to use traps as follows:
#!/bin/sh
tf=/tmp/tf.$$  # note this name is predictable; see mktemp below
cleanup() {
    rm -f "$tf"
}
trap "cleanup" EXIT
touch "$tf"
echo "$tf created"
sleep 10  # can Ctrl-C, and the temp file will still be removed
# temp file auto removed on exit
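Where available, mktemp is preferable as it generates an unpredictable name,
avoiding symlink attacks in shared directories like /tmp. mktemp isn't in POSIX
but is very widespread:
tf=$(mktemp) || exit 1
trap 'rm -f "$tf"' EXIT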
echoing errors
If you just echo "Error occurred" then you will not be able to pipe
or redirect any normal output from your script independently. It's much more standard
and maintainable to output errors to stderr like echo "Error occurred" >&2.
Note you can echo multiple lines together as in the following example:
echo "\
Usage: $(basename $0) option1
more info
even more" >&2
Portability
There are really two aspects to portability for shell scripts: the shell
language itself, and the various tools being called by the script. We'll just
consider the former here. To support really old implementations of shell script
one can test with the heirloom
shell for example, but for a contemporary list of portable shell capabilities,
see the Open Group
spec which describes the POSIX standard. Note also the
Autoconf info on shell portability, which lists details you need to consider
when writing very portable shell scripts, and the Ubuntu
dash conversion info.
It's much better to test scripts directly in a POSIX compliant shell if possible.
The `bash --posix` option doesn't suffice as it still accepts some "bashisms", but
the `dash` shell, which is the default interpreter of shell scripts on Ubuntu, is
very good in this regard. One should be testing with this shell anyway due to the
popularity of Ubuntu, and dash is easy to install on Fedora for example.
bashisms
`bash` is the most common interactive shell used on UNIX systems, and consequently
syntax specific to `bash` is often used in shell scripts. Note I've never needed
to resort to bash specific constructs in my scripts. If you find yourself doing
complex string manipulations or loops in bash, then you should probably consider
existing UNIX tools instead, or a more general scripting language like Python for
example.
[ "$var" == "find" ] && echo "found"
Shell script can't assign variable values in conditional constructs, so the double
equals is redundant. Moreover it gives a syntax error on older busybox (ash) and
dash at least, so avoid it.
echo {not,portable}
Brace expansion is not portable. While useful, it's mostly so at the interactive
prompt, and it can easily be worked around in scripts, as shown below.
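For example (the file names here are just illustrative), instead of cp file{.c,.bak}
one can write the words out explicitly, or loop when there are many:
cp file.c file.bak
for ext in .c .h .o; do echo "file$ext"; done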
signal specifications
Be wary when specifying signals to the trap builtin, for example, which was mentioned
above. I was even caught out by this in my
timeout script. That script handles the "CHLD"
signal, which for bash at least can be specified as "sigchld", "SIGCHLD", "chld",
"17" or "CHLD", only the last of which is portable.
echo $(seq 15) $((0x10))
The command above, containing both $(command substitution) and an $((arithmetic expression)),
is portable. Traditionally one did command substitution using backquotes
like `seq 15`. That's awkward to nest though, and not very readable in the presence
of other quoting. $((arithmetic expressions)) can also be handy for quick calculations,
rather than spawning off `bc` or `expr` for example. Note bash supports the non-portable
form $[1+1] for arithmetic expressions, which you should avoid.
Note also that vim 7.1.135 at least highlights $() as a syntax error unless #!/bin/bash
is at the top of the script; I must send
a patch. [Update June
2008: Strangely it looks like vim explicitly chooses to highlight #!/bin/sh scripts
as original Bourne shell scripts rather than to the POSIX standard which the vast
majority of systems currently use. I've
asked for
this to be changed, but in the meantime you can add "let g:is_posix = 1" to your
.vimrc]
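To illustrate the nesting point, a quick sketch:
parent=$(basename "$(dirname "$PWD")")   # $() nests cleanly
parent=`basename \`dirname "$PWD"\``     # backquotes need awkward escaping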
Performance
We'll expand here on our globbing example above to illustrate
some performance characteristics of the shell script interpreter. Comparing the
`bash` and `dash` interpreters for this example, where a process is spawned for each
of 30,000 files, shows that dash can fork the `wc` processes nearly twice as fast
as `bash`:
$ time dash -c 'for i in *; do wc -l "$i">/dev/null; done'
real 0m14.440s
user 0m3.753s
sys 0m10.329s
$ time bash -c 'for i in *; do wc -l "$i">/dev/null; done'
real 0m24.251s
user 0m8.660s
sys 0m14.871s
Comparing the base looping speed by not invoking the `wc` processes shows that
dash's looping is more than 4 times faster!
$ time bash -c 'for i in *; do echo "$i">/dev/null; done'
real 0m1.715s
user 0m1.459s
sys 0m0.252s
$ time dash -c 'for i in *; do echo "$i">/dev/null; done'
real 0m0.375s
user 0m0.169s
sys 0m0.203s
The looping is still relatively slow in either shell as demonstrated
previously, so for scalability we should try to use more
functional techniques, so that iteration
is performed within compiled processes.
$ time find . -type f -print0 | wc -l --files0-from=- | tail -n1
30000 total
real 0m0.299s
user 0m0.072s
sys 0m0.221s
The above is by far the most efficient solution, and illustrates well the point that
one should do as little as possible in shell script, aiming just to use it to connect
the existing logic available in the rich set of UNIX utilities.
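Another common way (a sketch) to push the iteration out of the shell is xargs;
-print0 and -0 are widespread extensions that keep file names containing spaces intact:
$ find . -type f -print0 | xargs -0 wc -l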
disk seeks
It's worth giving special mention to this since disk seeks are so expensive, and
since shell script is designed to deal with files which commonly reside on disks.
If you check for the presence of 2 files for example with [ -e FOO -o -e BAR
], then the check isn't short circuited and 2 disk seeks are performed. The
bash specific format of [[ -e FOO || -e BAR ]] does short circuit the
second test, however it's better to use the [ -e FOO ] || [ -e BAR ]
conditional format, which is both portable and efficient. Traditionally this last
example would have used 2 processes, one for each of the '['. But modern shells
implement '[' internally, so there is no such overhead.
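A minimal sketch of the portable form (the paths here are just illustrative):
if [ -e /etc/app.conf ] || [ -e /usr/local/etc/app.conf ]; then
    echo "config found"
fi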