
Monday, June 13, 2022

Shell Linting


  1. Always start with a shebang: The first rule of scripting is to always start with a shebang. The shebang is a special character sequence in a script file that specifies which interpreter should be called to run the script. It is always the first line in the script. Without the shebang line, the system does not know which interpreter the script was written for.

#!/bin/bash


The shebang, also written sha-bang (#!), at the top of a script tells the operating system that the file is a set of commands to be fed to the interpreter named after it. The character pair #! is a two-byte magic number, a special marker that designates a file type, in this case an executable shell script. Immediately after the shebang comes the full path to the interpreter, that is, the program that runs the commands in the script, whether it is a shell, a programming language, or a utility.


  1. Variables and Naming Conventions

1. variables should always have the form name=value


2. Ideally, a variable name consists of upper case letters, digits and '_' (underscore)


3. Variable annotations : Bash allows for a limited form of variable annotations. The most important ones are:


     local (for local variables inside a function)

     readonly (for read-only variables)

     

Strive to annotate almost all variables in a bash script with either local or readonly.

 

      4. Prefer local variables within functions over global variables

      

      5. If you need global variables, make them read only

      

      6. Use capitalization consistently so that the code is easier to understand:

     Environment (exported) variables: ${ALL_CAPS}

     Local variables: ${lower_case}

     Constants : CONSTANT_NAME
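
The annotations and naming rules above can be sketched as follows. This is a minimal illustration rather than a prescribed layout; MAX_RETRIES and count_retries are hypothetical names chosen for the example.

```shell
#!/usr/bin/env bash

# Constant: readonly and upper case (hypothetical name).
readonly MAX_RETRIES=3

counter=100   # global variable, deliberately left untouched by the function

count_retries() {
    # 'local' keeps this counter from overwriting the global of the same name.
    local counter=0
    while (( counter < MAX_RETRIES )); do
        counter=$(( counter + 1 ))
    done
    echo "${counter}"
}

count_retries            # prints 3
echo "${counter}"        # prints 100: the global was not modified
```

Without the local keyword, the loop inside the function would have overwritten the global counter.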


7. Name a loop variable after the collection it iterates over. Note the variable zone in the loop below:

   for zone in "${zones[@]}"; do

      something_with "${zone}"

   done


8. Constants and environment variable names: All caps, separated with underscores, declared at the top of the file. Constants and anything exported to the environment should be capitalized.


# Constant

readonly PATH_TO_FILES='/some/path'


# Both constant and environment

declare -xr USER_SID='PROD'


9. Define Default variables when required


VARIABLE="${1:-$DEFAULTVALUE}"

This assigns to VARIABLE the value of the first argument passed to the script, or the value of DEFAULTVALUE if no argument was passed or it was empty. Quoting prevents globbing and word splitting.


Default values

: "${S3_HOST:="https://minio.superevil.io:9000"}"

: "${S3_BUCKET_NAME:="foo/bar"}"

: "${S3_ACCESS_KEY:=""}"

: "${S3_SECRET_KEY:=""}"


10. Declare all variables: Bash doesn't have a strong type system. To allow type-like behavior, it uses attributes that can be set with a command. declare is a bash builtin that sets attributes on variables within the scope of your shell. It can also be used to declare a variable in longhand. A simple use case looks like this:


  $ declare var

  $ declare -i int

  $ var="1+1"

  $ int="1+1"

  $ echo "$var"

  1+1                 ## The literal "1+1"

  $ echo "$int"

  2                   ## The result of the evaluation of 1+1


11. Don’t start Variable Name with special characters or Numbers

   [root@ip-172-31-19-247 ~]# cat simple.sh 

   #!/bin/bash


   -one="java"

   123one="java"


   echo $-one

   echo $123one


   [root@ip-172-31-19-247 ~]# sh simple.sh 

   simple.sh: line 4: -one=java: command not found

   simple.sh: line 5: 123one=java: command not found

   hBone

   23one


12. Surround variable names with {}. Otherwise, in /srv/$ENVIRONMENT_app bash expands the variable $ENVIRONMENT_app, whereas you probably intended /srv/${ENVIRONMENT}_app.


13. Surround your variables with " as in if [ "${NAME}" = "java" ], because if $NAME is empty or unset, an unquoted expansion disappears and [ reports an error such as "unary operator expected" (also see nounset).


14. Use :- if you want to test variables that could be undeclared. For instance, if [ "${NAME:-}" = "java" ] substitutes an empty string when $NAME is not declared, without assigning to it. You can also substitute a placeholder such as noname: if [ "${NAME:-noname}" = "java" ]


15. Set magic variables for current file, basename, and directory at the top of your script for convenience.


# Set magic variables for current file & dir

__dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

__file="${__dir}/$(basename "${BASH_SOURCE[0]}")"

__base="$(basename "${__file}" .sh)"

__root="$(cd "$(dirname "${__dir}")" && pwd)" # <-- change this as it depends on your app


16. Variable substitution :


echo "${var}"

echo "Substitute the value of var."

    

echo "${var:-word}"

echo "If var is null or unset, word is substituted for var. The value of var does not change."


echo "${var:=word}"

echo "If var is null or unset, var is set to the value of word."

    

echo "${var:?message}"

echo "If var is null or unset, message is printed to standard error. This checks that variables are set correctly."

    

echo "${var:+word}"

echo "If var is set, word is substituted for var. The value of var does not change."
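
The forms listed above can be tried out one after another. The sketch below uses a throwaway variable named var plus a hypothetical variable named missing; the :? case runs in a subshell because it aborts a non-interactive shell.

```shell
#!/usr/bin/env bash

unset var
echo "${var:-fallback}"     # prints "fallback"; var is still unset
echo "${var:+alternate}"    # prints an empty line: var is unset
echo "${var:=assigned}"     # prints "assigned" and also assigns it to var
echo "${var:+alternate}"    # prints "alternate" now that var is set
echo "${var}"               # prints "assigned"

# ${var:?message} aborts a non-interactive shell, so try it in a subshell.
( unset missing; echo "${missing:?missing is required}" ) 2>/dev/null \
    || echo "subshell failed because 'missing' was unset"
```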


      17. Global Script Variables

A number of predefined global variables are used and available to all scripts:


Global Directory Declaration:


  • BASE_DIR  This is the base directory of the project that can be used to reference other files

  • SCRIPT_DIR This is the /scripts project directory

  • CNF_DIR  This is the /etc project directory

  • LOG_DIR This is the /log project directory

  • TMP_DIR This is a temporary working directory, currently this defaults to /tmp 


When creating global variables for important paths, allow for override of these by the controlling shell environment. For example.


[ -z "${TMP_DIR:-}" ] && TMP_DIR="/tmp"    # Correct definition: keeps a value set by the environment


TMP_DIR="/tmp"                             # Incorrect definition: always overwrites


Global File Name Declaration :

  • TMP_FILE A pre-defined unique temporary file that is auto removed on completion

  • STOP_FILE A pre-defined file to stop script processing in loops (only if used in functions)

  • DEFAULT_CNF_FILE A pre-defined standard /etc config file name

  • DEFAULT_LOG_FILE  A predefined standard /log log file name


Variables:

  • DATE_TIME   – The date/time of the script execution

  • DATE_TIME_TZ  – The date/time/timezone of the script execution

  • USER_ID  – The running user id

  • FULL_HOSTNAME  – The full and qualified hostname

  • SHORT_HOSTNAME – The short hostname

  • LOG_DATE_FORMAT – The Date Format used for all log files

Other Variables

  • QUIET  – Quiet Logging, ERROR and WARN only

  • USE_DEBUG – Enable Debugging


18. Variable Usages

When using variables, they are always to be enclosed in curly brackets.


${TMP_FILE}   is acceptable

$TMP_FILE   is NOT acceptable

When displaying variables in stdout, they should always be enclosed in single quotes (') so that the actual value, including any leading or trailing whitespace, can be determined.


info "Exiting with status code of '${EXIT_CODE}'"


  1. No space before or after the equal sign: Make sure there are no spaces around = when defining variables. See the language variable in the script below:

   [root@ip-172-31-19-247 ~]# cat simple.sh 

   #!/bin/bash


   language = java

   echo $language


   [root@ip-172-31-19-247 ~]# sh simple.sh 

   simple.sh: line 1: !#/bin/bash: No such file or directory

   simple.sh: line 3: language: command not found


  1. Double quotes around every parameter expansion: Word splitting is the demon inside Bash that is out to get unsuspecting newcomers or even veterans who let down their guard. It's not just spaces you need to protect. Word splitting occurs on all whitespace, including tabs, newlines, and any other characters in the IFS variable. Always double quote the variables in a script, as the example below shows.

   [root@ip-172-31-19-247 ~]# touch "java is lang"


   [root@ip-172-31-19-247 ~]# cat simple.sh 

   #!/bin/bash


   language="java is lang"

   rm $language


   [root@ip-172-31-19-247 ~]# sh simple.sh 

   rm: cannot remove 'java': No such file or directory

   rm: cannot remove 'is': No such file or directory

   rm: cannot remove 'lang': No such file or directory


   The modified code looks like this:

   [root@ip-172-31-19-247 ~]# cat simple.sh 

   #!/bin/bash


   language="java is lang"

   rm "$language"
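
The effect can also be observed without deleting anything. In this small sketch, count_args is a hypothetical helper that only reports how many arguments it received, and the default IFS is assumed.

```shell
#!/usr/bin/env bash

# count_args is a hypothetical helper that reports how many arguments it received.
count_args() {
    echo "$#"
}

language="java is lang"

count_args $language      # unquoted: split into 3 words, prints 3
count_args "$language"    # quoted: stays one word, prints 1
```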


  1. Use good indentation: Indentation makes code more readable and therefore more maintainable. Wherever code has more than one level of logic, make sure it is indented. It doesn't matter much how many spaces you indent, though most people seem to use 4 spaces or 8. Just make sure that each do lines up with its done (and each if with its fi) and you'll be fine.



  A simple example of indentation:

  #!/bin/bash


  if [ $# -ge 1 ] && [ -d "$1" ]; then

    for file in "$1"/*

    do

        if [ "${debug:-off}" = "on" ]; then

            echo "working on $file"

        fi

        wc -l "$file"

    done

  else

    echo "USAGE: $0 directory"

    exit 1

  fi


  1. Always provide usage of the script: It is important to let users of the script know how to run it, which parameters need to be passed, and how. In the simple example below, running the script without a filename prints a usage message and exits.


  [root@ip-172-31-19-247 ~]# cat simple.sh 

  #!/bin/bash


  if [ $# -eq 0 ]; then

     echo "Usage: $0 filename"

     exit 1

  fi

  1. Sensible commenting: Comment your code as you write it. Explaining what a piece of code does helps a lot when you revisit it after some time. Don't explain the obvious lines, though: if you comment every command, the important comments get lost in the mix.

 

  user=$1


  # The below logic is to check if account exists on the system or not

  grep "^${user}:" /etc/passwd

  if [ $? != 0 ]; then

      echo "No such user: $user"

      exit 1

  fi


  1. Long Notation : Always use long parameter notation when available. This makes the script more readable, especially for lesser known/used commands that you don’t remember all the options for.


      # Avoid:

      rm -rf -- "${dir}"


      # Good:

      rm --recursive --force -- "${dir}"


  1. Return an exit code: Always return a non-zero exit code when something goes wrong. Much of the time we don't check what a script returns, but the code becomes essential when the script is called from other scripts or from other languages.


  [root@ip-172-31-19-247 ~]# cat simple.sh 

  #!/bin/bash


  cat file.txt


  if [ $? -eq 0 ]

   then

     echo "The script ran ok"

     exit 0

  else

     echo "The script failed" >&2

     exit 1

  fi


  [root@ip-172-31-19-247 ~]# sh simple.sh 

  cat: file.txt: No such file or directory

  The script failed


  1. Check argument types: You can save a lot of time by making sure the arguments provided to your script are of the expected type before you start to use them. A simple numeric check looks like this:

if ! [ "$1" -eq "$1" 2> /dev/null ]

        then

          echo "ERROR: $1 is not a number!"

          exit 1

      fi
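
The same check can be wrapped in a reusable function. The sketch below swaps the -eq trick for a regular expression match, which is a different technique from the one above; is_integer is a hypothetical name.

```shell
#!/usr/bin/env bash

# is_integer is a hypothetical helper: it succeeds only for an optionally
# signed run of decimal digits.
is_integer() {
    [[ "${1:-}" =~ ^-?[0-9]+$ ]]
}

for candidate in 42 -7 4.2 abc; do
    if is_integer "${candidate}"; then
        echo "${candidate}: integer"
    else
        echo "${candidate}: not an integer"
    fi
done
```

A pattern match produces no error output that has to be discarded, at the cost of requiring bash for the =~ operator.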



  1. Use arrays wherever possible: Scripts often store a collection of items in a single string. Use an array whenever you have a collection of items, as below:


declare -r hosts="host1 host2 host3"

for host in $hosts  # not quoting $hosts here, we want word splitting

do

    echo "$host"

done


# use an array instead!

declare -r -a host_array=( host1 host2 host3 )

for host in "${host_array[@]}"

do

    echo "$host"

done
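
The benefit shows up as soon as an element contains whitespace. A short sketch, with hypothetical file names:

```shell
#!/usr/bin/env bash

# Hypothetical file names; the second one contains spaces.
declare -a files=( "report.txt" "java is lang.txt" "notes.md" )

echo "${#files[@]}"           # number of elements: 3

for file in "${files[@]}"; do
    echo "item: ${file}"
done

files+=( "extra file.log" )   # append without re-parsing a string
echo "${#files[@]}"           # now 4
```

A space-separated string would have split "java is lang.txt" into three items; the array keeps it as one.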


  1. Avoid unnecessary pipelines: Shell code leans heavily on pipes. Avoid unnecessary ones by using redirection or the options of the command itself, as below:


  # instead of

    cat file | command

  # use

    command < file


  # instead of

    echo text | command

  # use

    command <<< text


  # instead of

    grep pattern file | awk '{print $1}'

  # use

    awk '/pattern/{print $1}' file

 

  # instead of

    grep pattern file | sed 's/foo/bar/g'

  # use

    sed -n '/pattern/{s/foo/bar/g;p;}' file


   # instead of

    command | sort | uniq

   # use

    command | sort -u

 

   # instead of

    command | grep pattern | wc -l

   # use

    command | grep -c pattern


  1. Use process and command substitution wherever possible: Process substitution is a form of inter-process communication that allows the input or output of a command to appear as a file. The command is substituted in-line, where a file name would normally occur, by the command shell. This allows programs that normally only accept files to directly read from or write to another program.


  A simple example is avoiding temporary files

  

  # using temp files

    command1 > file1

    command2 > file2

    diff file1 file2

    rm file1 file2

 

  # using process substitution

    diff <(command1) <(command2)


  # don't use

    echo `echo "hello world"`

  # use

    echo $(echo "hello world")


  1. Activate Bash strict and debug modes: In many situations, bash will continue executing the script even when a specific part fails, impacting the rest of the script badly. To ensure that the script exits upon a fatal error, it's recommended to have the following line at the start.


set -o errexit


set -e : The short form of set -o errexit. This tells the shell to exit the script as soon as a command returns a non-zero exit code.


set -u : By default, bash expands a variable that doesn't exist to an empty string. With set -u (nounset), expanding an unset variable is an error and the script aborts.


set -a : With set -a (allexport), any variable or function that you create is automatically exported so that child processes and scripts can use it.


set -x : Enables the xtrace option, which prints each command, after expansion, just before it is run.
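
The options can be observed safely by running each case in a child shell. This is a sketch, not a recommended script layout; it also includes the pipefail option, and undefined_variable is a deliberately unset, hypothetical name.

```shell
#!/usr/bin/env bash

# Each case runs in a child bash so that a failure does not end this demo.

# errexit: the shell stops at the first failing command.
bash -c 'set -o errexit; false; echo "not reached"'
echo "errexit status: $?"       # non-zero, and "not reached" is never printed

# nounset: expanding an unset variable is an error.
bash -c 'set -o nounset; echo "${undefined_variable}"' 2>/dev/null
echo "nounset status: $?"       # non-zero

# pipefail: a pipeline fails if any stage fails, not only the last one.
bash -c 'false | true; echo "without pipefail: $?"'               # prints 0
bash -c 'set -o pipefail; false | true; echo "with pipefail: $?"' # prints 1
```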


  1. Write error messages to stderr: Error messages belong on stderr, not stdout.

 echo "An error message" >&2


  1. Comparisons: Use = instead of == for string comparisons. Inside [ ], bash accepts == as a synonym for =, but only the single = is defined by POSIX, so use that. For instance:


           value1="java.com"

           value2="shell.com"

           if [ "$value1" = "$value2" ]; then

               echo "values match"

           fi


  1. printf over echo : For various reasons, printf is preferable to echo. printf gives more control over the output, it’s more portable and its behavior is defined better. Print error messages on stderr. E.g., I use the following function:


      error() {

         printf "${red}!!! %s${reset}\\n" "${*}" 1>&2

      }
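
The function above relies on red and reset being defined elsewhere. A self-contained sketch follows; the ANSI escape values assigned to the two variables are an assumption, not taken from the original function.

```shell
#!/usr/bin/env bash

# ANSI escape sequences; the variable names red and reset follow the
# function above, their values are an assumption.
readonly red=$'\e[31m'
readonly reset=$'\e[0m'

error() {
    # Colours are passed as arguments, so the format string stays constant.
    printf '%s!!! %s%s\n' "${red}" "${*}" "${reset}" 1>&2
}

error "disk is almost full"               # goes to stderr
error "disk is almost full" 2>/dev/null   # so this prints nothing
```

Keeping variables out of the format string means a stray % in their values cannot be misread as a format directive.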


  1. Trap forced exit of the script: Don't let your script exit unexpectedly. Trap the signal sent when someone presses Ctrl+C and exit from your script gracefully.


    # trap ctrl-c and call ctrl_c()

      trap ctrl_c INT


      function ctrl_c() {

         echo "** Trapped CTRL-C"

      }


      for ((i = 1; i <= 5; i++)); do

        sleep 1

        echo -n "."

      done


  1. Use $() over backticks: Avoid using backticks (`). They are hard to read, in some fonts they are easily confused with single quotes, and nesting them requires a lot of escaping. Use $(command) instead of `command` because it is easier to nest and makes your code more readable.


# dont use

$ echo "one-`echo two-\`echo three-\`\`echo four\`\`\``"

one-two-three-four 


# use

$ echo "one-$(echo two-$(echo three-$(echo four)))"

one-two-three-four


  1. Logging: Logs are critical for everyone, whether developer, sysadmin, or DevOps engineer. Debugging is close to impossible without them. Most applications generate logs to show what is happening inside them, and the same practice can be applied to shell scripts. For generating logs there is a standard command-line utility called logger, which writes messages to the system log.


[root@ip-172-31-19-247 ~]# cat simple.sh 

#!/bin/bash


DATE=$(date)

declare DATE

check_file() {

     local filename="$1"

     if [ ! -e "${filename}" ]

     then

            logger -s "${DATE}: ${filename} doesn't exist"

     else

            logger -s "${DATE}: ${filename} found successfully"

     fi

}

check_file "/etc/passwd"


[root@ip-172-31-19-247 ~]# sh simple.sh 

<13>Feb 26 12:48:19 ec2-user: Sat Feb 26 12:48:19 UTC 2022: /etc/passwd found successfully


  1. Builtin commands vs. external commands: Given the choice between invoking a shell builtin and invoking a separate process, choose the builtin. We prefer builtins such as the parameter expansion features described in bash(1), as they are more robust and portable (especially when compared to things like sed).



# Dont use 

addition="$(expr "${X}" + "${Y}")"

substitution="$(echo "${string}" | sed -e 's/^foo/bar/')"


# Use this

addition=$(( X + Y ))

substitution="${string/#foo/bar}"


  1. Minimize process spawning: Use bash builtins for any sort of work unless none is available. Every external command makes bash spawn a new process, which is slower and can lead to other issues. For instance, a simple count using seq looks like this:

for number in $(seq 1 10); do


Bash is able to do the counting for you. You do not need to spawn an external application (especially a single-platform one) to do some counting and then pass that application's output to Bash for word splitting. A C-style for loop is the best method for implementing a counter, so the above can be written as for ((i=1; i<=10; i++)); do. Some more builtin replacements:


# instead of dirname, use:

declare -r file_dir="${my_file%/*}"

 

# instead of basename, use:

declare -r file_base="${my_file##*/}"

 

# instead of sed 's/blah/hello/', use:

declare -r new_file="${my_file/blah/hello}"

 

# instead of bc <<< "2+2", use:

echo $(( 2+2 ))

 

# instead of grepping a pattern in a string, use:

[[ $line =~ .*blah$ ]]

 

# instead of cut -d:, use an array:

IFS=: read -r -a arr <<< "one:two:three"
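
Put together, the replacements can be checked against a sample value. The path below is hypothetical, and the variable names other than my_file are chosen for this sketch.

```shell
#!/usr/bin/env bash

my_file="/var/log/blah/app.log"    # hypothetical path

file_dir="${my_file%/*}"           # like dirname:  /var/log/blah
file_base="${my_file##*/}"         # like basename: app.log
new_file="${my_file/blah/hello}"   # like sed 's/blah/hello/'

echo "${file_dir}"
echo "${file_base}"
echo "${new_file}"

# Counting with a C-style loop instead of seq.
total=0
for (( i = 1; i <= 10; i++ )); do
    total=$(( total + i ))
done
echo "${total}"                    # 55

# Splitting on ':' with read instead of cut.
IFS=: read -r -a fields <<< "one:two:three"
echo "${fields[1]}"                # two
```

Note that ${my_file%/*} only matches dirname when the value actually contains a slash; for a bare file name it returns the name unchanged, whereas dirname prints a dot.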

  1. Use [[ … ]] vs. [ … ]: Unless a script must run in a POSIX-compatible environment, use [[ ... ]] rather than [ ... ] when performing conditional tests. Unlike the [ and test builtins, [[ ... ]] is part of shell syntax, not a command. This means it can handle its internal elements (test conditions) in a more robust fashion, as pathname expansion and word splitting do not occur. Also, [[ ... ]] adds some additional capabilities such as =~ to perform regular expression tests.
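
Both properties can be seen in a few lines. The values of name and version below are hypothetical.

```shell
#!/usr/bin/env bash

name="java is lang"

# No word splitting inside [[ ]], so the unquoted variable is safe here.
if [[ $name = "java is lang" ]]; then
    echo "string match"
fi

# =~ performs an extended regular expression match; groups land in BASH_REMATCH.
version="release-12.4"    # hypothetical value
if [[ $version =~ ^release-([0-9]+)\.([0-9]+)$ ]]; then
    echo "major=${BASH_REMATCH[1]} minor=${BASH_REMATCH[2]}"
fi
```

The same unquoted $name inside [ ... ] would be split into three words and the test would fail with an error.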


  1. Use shift to read function arguments : Instead of using $1, $2 etc to pick up function arguments, use shift as shown below. This makes it easier to reorder arguments, if you change your mind later.


# Processes a file.

# $1 - the name of the input file

# $2 - the name of the output file

process_file(){

    local -r input_file="$1";  shift

    local -r output_file="$1"; shift

}


shift is a bash builtin that removes arguments from the beginning of the argument list. Given three arguments available in $1, $2, and $3, a call to shift makes the old $2 the new $1 and the old $3 the new $2. A shift 2 shifts by two, making the old $3 the new $1.
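
A side effect of this style is that whatever remains after the named arguments is still available in "$@". A minimal sketch, with show_args as a hypothetical function:

```shell
#!/usr/bin/env bash

# show_args is a hypothetical function: it takes two named arguments and
# treats everything that remains as a list of extra options.
show_args() {
    local -r input_file="$1";  shift
    local -r output_file="$1"; shift
    echo "input=${input_file} output=${output_file} extras=$#"
}

show_args in.txt out.txt --verbose --force
# prints: input=in.txt output=out.txt extras=2
```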


  1. Use null-delimited output where possible: In order to correctly handle filenames containing whitespace and newline characters, you should use null-delimited output, in which each item is terminated by a NUL (\0) character instead of a newline. Many common tools support this. For example, find -print0 outputs file names followed by a null character and xargs -0 reads arguments separated by null characters.


# instead of

find . -type f -mtime +5 | xargs rm -f


# use

find . -type f -mtime +5 -print0 | xargs -0 rm -f

 

# looping over files

find . -type f -print0 | while IFS= read -r -d $'' filename; do

    echo "$filename"

done
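
When the loop has to update a variable, piping into while is a trap, because the loop then runs in a subshell and the update is lost. A sketch that reads from a process substitution instead; the directory and file names are created only for the demonstration.

```shell
#!/usr/bin/env bash

# Work in a throwaway directory with one awkward file name.
work_dir="$(mktemp -d)"
touch "${work_dir}/plain.txt" "${work_dir}/java is lang.txt"

count=0
# Reading from <(...) keeps the loop in the current shell, so count survives.
while IFS= read -r -d '' filename; do
    count=$(( count + 1 ))
    echo "found: ${filename##*/}"
done < <(find "${work_dir}" -type f -print0)

echo "${count}"     # 2

rm -r -- "${work_dir}"
```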


  1. Functions : Bash can be hard to read and interpret. Using functions can greatly improve readability. Shell functions are a way to group commands for later execution using a single name for the group. They are executed just like a "regular" command


  1. Apply the Single Responsibility Principle: a function does one thing.

  2. Give function names an underscore prefix. It is a good idea to have a naming convention for your bash functions to avoid clashes with builtins or with functions you might include from other files.

  3. Function location: Put all functions together in the file just below the constants. Don't hide executable code between functions. Doing so makes the code difficult to follow and results in nasty surprises when debugging. If you've got functions, put them all together near the top of the file. Only includes, set statements and setting constants may be done before declaring functions.

  1. Document all functions: Every function we write should be documented. A meaningful explanation of what a function does makes the code easier to use and maintain.

  

            # Processes a file.

            # $1 - the name of the input file

            # $2 - the name of the output file

              process_file(){

                  :  # function body goes here; bash rejects an empty { }

              }

  1. Function Comments : Any function that is not both obvious and short must be commented. Any function in a library must be commented regardless of length or complexity. It should be possible for someone else to learn how to use your program or to use a function in your library by reading the comments (and self-help, if provided) without reading the code.


All function comments should describe the intended API behavior using:

  • Description of the function.

  • Globals: List of global variables used and modified.

  • Arguments: Arguments taken.

  • Outputs: Output to STDOUT or STDERR.

  • Returns: Returned values other than the default exit status of the last command run.

Example:

#######################################

# Cleanup files from the backup directory.

# Globals:

#   BACKUP_DIR

#   ORACLE_SID

# Arguments:

#   None

# Outputs : stdout

# Returns : None

#######################################

function cleanup() {

  …

}

  1. Function Variable Declaration : Declare variables with a meaningful name for positional parameters of functions


     happy() {

       local first_arg="${1}"

       local second_arg="${2}"

       [...]

     }


   7. Create functions with a meaningful name for complex tests


     # Don't do this

       if [ "$#" -ge "1" ] && [ "$1" = '-h' ] || [ "$1" = '--help' ] || [ "$1" = "-?" ]; then

         usage

         exit 0

       fi


     # Do this

      help_wanted() {

        [ "$#" -ge "1" ] && [ "$1" = '-h' ] || [ "$1" = '--help' ] || [ "$1" = "-?" ]

      }


      if help_wanted "$@"; then

        usage

        exit 0

      fi


   8. Cleanup code: An idiom for tasks that need to be done before the script ends (e.g. removing temporary files). The finish function is registered with trap so that it runs on exit, and it saves $? first so that the exit status of the script is the status of the last statement executed before finish was called.


    finish() {

       result=$?

       # Your cleanup code here

       exit ${result}

     }

     trap finish EXIT
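
A runnable sketch of the idiom with a real temporary file. run_job is a hypothetical task whose body is a subshell, so the EXIT trap fires when that subshell ends and the demonstration itself keeps running.

```shell
#!/usr/bin/env bash

# run_job is a hypothetical task; it runs in a subshell so that the EXIT
# trap fires when the subshell ends, not when the calling script ends.
run_job() (
    tmp_file="$(mktemp)"

    finish() {
        result=$?
        rm -f -- "${tmp_file}"
        exit "${result}"
    }
    trap finish EXIT

    echo "${tmp_file}"   # report the path so the caller can check it is gone
    exit 3               # simulate a failure part-way through
)

status=0
leftover="$(run_job)" || status=$?
echo "status: ${status}"     # 3: the original exit status is preserved
[ -e "${leftover}" ] || echo "temporary file was removed"
```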

 

  9. Mandatory script functions: All scripts are to have the following default functions:


bootstrap : The bootstrap function is to be identical in all scripts. It is used to source the common functions needed by all scripts.


help or usage : The help function displays the usage of the script and then exits. The usage needs to specify all command line arguments and clearly identify which are mandatory and which are optional.


process_args : This function processes the command line arguments of the script. The getopts builtin is used for processing arguments; however, it only supports single-character options (e.g. -v -h -p etc). Scripts should be written for single-character options only.


To improve the namespace, and to distinguish operational parameters from informational parameters, the following two long options are also used.


--help   Display script help and exit

--version Display single line script version and exit
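
A possible shape for process_args, sketched with the getopts builtin. The options -v, -h and -p and the variables verbose and port are hypothetical, and the two long options are left out of this sketch.

```shell
#!/usr/bin/env bash

# process_args is sketched with the getopts builtin. The options -v, -h and
# -p <port> and the variables they set are hypothetical.
process_args() {
    local OPTIND=1 opt
    verbose="off"
    port="8080"
    # The leading ':' selects silent mode, so errors are reported by the cases below.
    while getopts ":vhp:" opt; do
        case "${opt}" in
            v) verbose="on" ;;
            p) port="${OPTARG}" ;;
            h) echo "Usage: $0 [-v] [-p port]"; return 0 ;;
            :) echo "Option -${OPTARG} needs a value" >&2; return 1 ;;
            *) echo "Unknown option -${OPTARG}" >&2; return 1 ;;
        esac
    done
}

process_args -v -p 9000
echo "verbose=${verbose} port=${port}"    # verbose=on port=9000
```

Declaring OPTIND local and resetting it to 1 lets the function be called more than once in the same shell.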


main : The main function, where execution starts. Its code should be minimal.


finish : Cleanup code.

 

10. Standard body template: Unless we are writing a small script, we need functions to modularise the code and make it more readable, reusable and maintainable. A template for longer scripts is shown below.

#! /usr/bin/env bash

#

# Author: Jagadish manchala <

#

#/ Usage: SCRIPTNAME [OPTIONS]... [ARGUMENTS]...

#/

#/ 

#/ OPTIONS

#/   -h, --help

#/                Print this help message

#/

#/ EXAMPLES

#/  



#{{{ Bash settings

# abort on nonzero exitstatus

set -o errexit

# abort on unbound variable

set -o nounset

# don't hide errors within pipes

set -o pipefail

#}}}



#{{{ Variables

readonly script_name=$(basename "${0}")

readonly script_dir=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )

IFS=$'\t\n'   # Split on newlines and tabs (but not on spaces)


#}}}


main() {

  # check_args "${@}"

  :

}


#{{{ Helper functions

# Placeholder bodies: bash does not accept an empty { }, so each stub runs ':'

usage() { :; }



bootstrap() { :; }


my_function() { :; }



finish() { :; }


#}}}


main "${@}"

