
Shell scripting best practices

28 May 2020 » shell, programming

Most of the time, a shell script is used for some “hack” job of sewing together various utilities and their output. While shell scripting is easy and straightforward most of the time, it does spring surprises from time to time. The end result is not just bugs but silent bugs. It would save a lot of headache and debugging hours to treat shell scripting as a proper programming language and apply the coding discipline demanded by other high-level languages. For very complex jobs, it might be worth considering Python, as shell scripting, after all, comes with a very limited vernacular. Following are some pointers for robust shell programming. Note that there are variations between shells and the POSIX standard; you might want to try things out and adjust usage accordingly.

Environment

Using env to locate the interpreter is more portable than hard-coding a direct path to the shell.

#!/usr/bin/env bash

Declare variables

It is good practice to declare variables upfront in a shell script. Global variables should be upper case and locals should be declared early in the function.

GLOBAL_VAR=
foo() {
  local var1 var2=0
  var1=${GLOBAL_VAR:-100} # if GLOBAL_VAR is empty use 100 as default
}

Notice there is no use of the keyword ‘function’. Also, surround variable names with {} so the shell does not misinterpret where the name ends. For special built-in variables like $@, always use quotes.
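
For example, a small sketch (the variable and function names here are made up):

prefix="build"
echo "${prefix}_dir"      # without braces, $prefix_dir would be looked up as a variable named prefix_dir

print_args() {
  for arg in "$@"; do     # quoting "$@" preserves arguments that contain spaces
    echo "arg: $arg"
  done
}
print_args "one two" three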

Prefer [[ over [

While [ makes your script portable (it is the POSIX-compliant condition test), most shells now support [[, which is more robust and extends functionality. With the [[ conditional construct, quotes are not needed around variables, extended pattern matching is supported, and multiple conditions can be combined naturally (instead of using -o or -a), among other things.
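
For instance, a quick sketch (the file name and pattern are just illustrative):

file="report-2020.txt"
if [[ $file == report-*.txt && -r $file ]]; then   # pattern match plus a second test, no quotes needed
  echo "readable report found"
fi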

Exit if externally sourced file doesn’t exist

Instead of sourcing an external file and moving on, use a circuit breaker to exit immediately if it is missing. Otherwise, your script might continue and either print a lot of error messages or even silently corrupt data, depending on what was used from the external file.

. common_file || exit 1

Check return code of all external commands or utils

If an external command can fail, then check its return code. Such commands can also be wrapped in a general function, as sketched further below.

var=$( cmd )
[[ $? -ne 0 ]] && exit 2
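
One possible wrapper, as a sketch (the function name and exit code are arbitrary):

run_or_die() {
  "$@" || { echo "command failed: $*" >&2; exit 2; }
}

run_or_die mkdir -p /tmp/workdir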

Use whitespace with $( )

Makes it easier to read.

PROGNAME=$( basename "$0" )

Use $() instead of ``

For command substitution, use $(). Commands can also be nested naturally with it.

last_line=$( dmesg | tail -n1 )
all_lines=$( echo $( cat file ) )   # nested substitution; unquoted, this collapses newlines into spaces

When changing directories, use directory stack

If supported by your shell, it is better to use pushd and popd for saving and returning to the current working directory instead of explicitly using cd.

( pushd /tmp; ls )
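
When you need to stay in the current shell rather than a subshell, pairing pushd with popd explicitly works too (a minimal sketch):

pushd /tmp > /dev/null    # save the current directory and switch to /tmp
ls
popd > /dev/null          # return to the saved directory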

Use ((var++))

Instead of let var=var+1. The same goes for subtraction and other arithmetic.
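
For example (variable names are made up):

count=1
((count++))        # count is now 2
((count += 5))     # count is now 7
echo "$count"
# caveat: (( )) returns a non-zero exit status when the expression evaluates to 0, which matters under set -e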

Use regular expressions

For complex regular expressions, use a separate variable for regex pattern.

re="^[0-9]{2}-[0-9]{4}-[0-9]{2} *$"
if [[ ! $serial_num =~ $re ]]; then
  echo "invalid serial number: $serial_num" >&2
  exit 1
fi
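
On a successful match, any captured groups are available in bash’s BASH_REMATCH array; a quick illustration:

re="^([0-9]{2})-([0-9]{4})"
if [[ "12-3456-78" =~ $re ]]; then
  echo "prefix: ${BASH_REMATCH[1]}, middle: ${BASH_REMATCH[2]}"
fi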

Use signal traps for cleanup

Traps can be used to clean up temporary resources (lock files, temp files), or even to roll back to a previous state (say, if a set of files was being modified and one step failed).

cleanup() {
  echo "Cleaning up resources ..."
  rm -f "$lockfile"
  printf '\e[2J\e[H\e[m'     # clear screen and reset attributes
  trap - EXIT INT TERM       # clear traps before exiting
  exit 0
}
trap cleanup EXIT INT TERM

Use set builtins for a robust script

These options can be set globally, for a function, or within a block, depending on the context. Some of the most common and useful ones are listed below.

errexit (-e) makes your script exit when a command fails. You can allow the script to continue past a failing command by appending || true.

nounset (-u) exits when your script tries to use undeclared variables.

pipefail fails your script if any command in a pipe chain fails; the return code is set to that of the failing command. The default is the return status of the last command in the chain.

Set these options explicitly with set instead of via #!/usr/bin/env bash -e; options on the shebang line do not apply when the script is invoked as bash script.sh.

set -euo pipefail
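
With these options active, an expected failure can be tolerated explicitly (the file name below is made up):

grep -q "pattern" maybe-missing.log || true   # grep returning non-zero would otherwise abort the script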

Reading a file

file="$(<"file")"         # to read whole file at once. Make sure size is manageable before reading it.
IFS=$'\n' read -d "" -ra file < "file" # read line by line
mapfile -t file < "file"  # read line by line. mapfile is available in bash 4.

Also, the while read pattern can be useful in specific situations. Note that the pipe below runs the loop in a subshell, so variables set inside the loop do not persist; a process-substitution variant that avoids this is sketched after the loop.

awk '{print $2}' file.txt | while read -r num
do
  if [ "$num" = "0" ]; then
    echo "zero it is!"
  fi
done
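
To keep loop variables in the current shell, the same loop can read from a process substitution instead; a sketch:

count=0
while read -r num; do
  [[ $num == "0" ]] && ((count += 1))
done < <(awk '{print $2}' file.txt)
echo "zeros seen: $count"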

Avoid temporary files for ephemeral use

Instead, use <(cmd), which turns the output into something that can be used as a file.
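
For example, comparing the output of two commands without creating temporary files (file names are illustrative):

diff <(sort file1.txt) <(sort file2.txt)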

Using lockfiles without race conditions

First checking whether the lockfile exists and then creating it if not obviously has a race condition. The following can be used instead.

if ( set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null;
then
   trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT
   do_critical_section
   rm -f "$lockfile"
   trap - INT TERM EXIT
else
   echo "failed to acquire lock, held by $(cat $lockfile)."
fi

On Linux, one can use flock instead.
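
A rough flock-based sketch of the same idea (assuming the flock utility from util-linux; the descriptor number is arbitrary):

exec 200> "$lockfile"     # open the lockfile on a spare file descriptor
if flock -n 200; then     # -n: fail immediately instead of blocking
   do_critical_section
   flock -u 200           # release the lock
else
   echo "failed to acquire lock."
fi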

Use colors but with caution

Using colors, particularly when dumping a lot of information, is very useful. However, some terminals might not be able to render colors and could screw up the output.
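
One defensive approach is to emit colors only when stdout is a terminal that reports color support; a sketch using tput:

if [[ -t 1 ]] && (( $(tput colors 2>/dev/null || echo 0) >= 8 )); then
  red=$(tput setaf 1)
  reset=$(tput sgr0)
else
  red="" reset=""
fi
echo "${red}error:${reset} something went wrong"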

Debugging options

set -n to dry-run the script. Useful for a syntax check.

set -v to print every shell input line as it is read.

set -x to trace every command after expansion.

Instead of setting these options directly, it might be useful to control debugging behavior via a global DEBUG variable. echo can be wrapped in another print function, which changes verbosity based on the DEBUG variable.

debug() {
  (( DEBUG )) || return 0   # returning 0 keeps a disabled debug from tripping errexit
  echo ">>> $*"
}

Debug with function names

While printing debug info, FUNCNAME[x] can be used to print the current or prior functions in the call stack. FUNCNAME[@] is the array of all functions in the call chain.

"${FUNCNAME[0]}"
"${FUNCNAME[@]}"

Use here-docs instead of individual prints

cat << EOF
usage: $PROGNAME <arg> <arg>
version: 0.1
EOF

Use brace expansion

echo a{d,c,b}e    # will print - ade ace abe
echo repos/ntrivedi/code/{0,1}{1..9}.cpp  # cartesian product of two braces

Use parameter expansion

${parameter:-word}          # if `parameter` is unset or null, use `word` as a substitute
${parameter:=word}          # if `parameter` is unset or null, assign `word` to `parameter`
${parameter:?word}          # if `parameter` is unset or null, print `word` to stderr and exit
${parameter:+word}          # if `parameter` is set and not null, substitute `word`; otherwise nothing
${parameter:offset:length}  # substring expansion

Some one-liners and clever ways of using shell script

Collected from the internet over a period of time.

  1. Remove strangely named files
    touch \-test
    rm -- -test # dangerous
    rm ./-test # remove using ./
    find . -inum $inode -exec rm -i {} \;  # remove by inode, handy when the name cannot be matched any other way
    
  2. As RPC
    ssh user@remote "
         $(declare -p var1 var2 var3)
         $(declare -f func1 func2 remotemain)
         remotemain
       "
    ssh user@remote "dump_logs | gzip -c" | gunzip -c | read_logs
    
  3. Strip the longest prefix matching “*/” from $0 (i.e., the basename)
    ${0##*/}
    
  4. Strip the suffix matching “b” from $str (%% removes the longest match, which matters with wildcard patterns)
    ${str%%b}
    
  5. Find email addresses in a file
    grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" file.txt
    
  6. Print 5th to 10th line in file
    sed -n '5,10p' file                    # using sed
    awk 'NR>=5{print} NR==10{exit}' file   # using awk
    
  7. Extract lines between two patterns
    sed -n '/pattern1/,/pattern2/p' file
    
  8. Remove dups but preserve the order
    awk '!visited[$0]++' your_file > deduplicated_file
    
  9. Copy from bash to clipboard using OSC command
    echo -e "\033]52;c;$(base64 <<< "copy something here")\a"
    
  10. Raise notification from terminal using OSC command
    echo -e "\033]9;This is a notification\a"
    
