Linux bash scripts

One way of utilizing the flexibility of Linux is to use command scripts. A command script is simply a file that contains a set of normal Linux commands, which the command shell performs automatically in the given order. Compared to real programming languages like Python, Perl or C, programming with a Linux shell (bash, tcsh, csh or sh) is computationally rather inefficient. However, handy Linux scripts can often be constructed in a few minutes. You do not need to know much about command scripting to be able to write simple programs that save a lot of work.

Constructing a script file

A script file is a simple text file that can be constructed with normal text editors like nano, emacs or vi. To create a new script file, type for example:

nano my_test.script

A script file usually starts with a line that defines the command shell to be used. In this guide we use the bash shell, which is the default command shell at CSC. The line that selects bash is:

#!/bin/bash

After that, add the Linux commands you wish to perform. In practice, just type into the file the commands that you would normally use to do the task in an interactive command shell. For example, the following script can be used to create a subdirectory "mapfiles" and copy all .map files there:

#!/bin/bash
mkdir mapfiles
cp *.map mapfiles/

If a line in the script starts with a # mark, it is skipped: the rest of the line is considered a comment (with the exception of the first line, which starts with #!).

#!/bin/bash
# This is a comment line that is not executed

mkdir mapfiles
cp *.map mapfiles/

After saving the script file and closing the editor, you can perform the commands in the script file by giving a command:

source my_test.script

Alternatively, you can give yourself execution permission for the script file with the command:

chmod u+x my_test.script

And then execute the script with command:

./my_test.script
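The two ways of running a script differ in one respect that is easy to miss: source runs the commands in your current shell, while ./my_test.script starts a new bash process. A minimal sketch of the difference, assuming a hypothetical script called demo.script that is written to a temporary directory:

```shell
#!/bin/bash
# Work in a temporary directory so that no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a two-line script (demo.script is a hypothetical name).
printf '%s\n' '#!/bin/bash' 'greeting="set by the script"' > demo.script
chmod u+x demo.script

# Executed as a program: the variable is set in a child process only.
greeting=""
./demo.script
after_execute=$greeting

# Sourced: the commands run in the current shell, so the variable stays set.
source ./demo.script
after_source=$greeting

echo "After ./demo.script: '${after_execute}'"
echo "After source:        '${after_source}'"
```

In practice this means that a script which should change your current session, for example by setting variables, needs to be run with source.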

Variables and arrays

You can use variables, loops and conditional statements in the scripts. Variables can be set with syntax:

variable=value

Note that there must be no spaces around the equals sign.

Variables are recalled with the $ sign:

$variable

or

${variable}

For example, command:

echo $variable

writes the value of the variable to the output. Note that in bash scripts variables are treated as either strings (i.e. text) or integers. This means that decimal numbers can't be used for mathematical operations in bash scripts.

Example of using string variables

[kkayttaj@puhti ~]$ name=Veikko
[kkayttaj@puhti ~]$ familyname=Salo
[kkayttaj@puhti ~]$ address="CSC Espoo"
[kkayttaj@puhti ~]$ echo "Person: ${name} ${familyname} works at ${address}."

Person: Veikko Salo works at CSC Espoo.
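The ${variable} form matters when the variable name is followed directly by other characters. A short sketch, reusing the variable name from above (the _backup suffix is only an illustration, and name_backup is assumed not to be set):

```shell
name=Veikko

# Without braces bash looks for a variable called name_backup, which is unset.
without_braces="$name_backup.txt"

# With braces the variable name ends at the closing brace.
with_braces="${name}_backup.txt"

echo "Without braces: ${without_braces}"
echo "With braces:    ${with_braces}"
```

The first echo prints only .txt after the label, while the second prints Veikko_backup.txt.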

For integer variables, you can do simple arithmetic with the syntax ((expression)). Commonly used arithmetic operations are listed in the table below:

Table: Basic arithmetic operations in bash

Operator	Function
+	addition
-	subtraction
*	multiplication
/	division
%	division remainder
**	exponentiation

Simple integer arithmetic examples:

[kkayttaj@puhti ~]$ a=5
[kkayttaj@puhti ~]$ b=3
[kkayttaj@puhti ~]$ ((c = a + b))
[kkayttaj@puhti ~]$ echo $a plus $b is equal to $c
5 plus 3 is equal to 8
[kkayttaj@puhti ~]$ ((d = a / b))
[kkayttaj@puhti ~]$ ((e = a % b))
[kkayttaj@puhti ~]$ echo "$a divided by $b results $d and remainder $e"
5 divided by 3 results 1 and remainder 2
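Arithmetic results can also be used directly in a command with the $((expression)) form, which expands to the value of the expression. A small sketch with the same numbers as above (the result variable names are arbitrary); note that the division is integer division, so the fractional part is dropped:

```shell
a=5
b=3

# $(( )) expands to the result, so it can be assigned or printed directly.
product=$((a * b))
power=$((a ** b))
quotient=$((a / b))     # integer division: 5 / 3 gives 1
remainder=$((a % b))

echo "$a times $b is $product"
echo "$a to the power of $b is $power"
echo "$a divided by $b is $quotient with remainder $remainder"
```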

Bash can also use one-dimensional array variables, i.e. variables that contain a list of items. A specific array item can be recalled by using an index number in brackets after the array variable name (${variable[index]}). For example, we can define a simple three-element array with the command:

array=(a b c)

We can now recall either the whole array or just one element from it. The command:

echo ${array[*]}

now prints out:

a b c

while command:

echo ${array[2]}

prints:

c

Note that array indexing starts from 0, so the sample command above prints the third element of the array. You can check the number of items in the array by adding a # sign to the beginning of the variable name. For example, in this case the command:

echo ${#array[*]}

prints out the value:

3

A special case of array-like variables are the positional parameters, which hold the command line arguments, i.e. the items that you provide as input parameters for your script. Here $0 refers to the name of the script itself, $1 to the first argument, $2 to the second and so on. $# refers to the number of arguments and $@ to the full argument list. Below is a sample script that illustrates the use of these variables:

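#!/bin/bash
# The first argument is the source directory, the second the target directory.
from_dir=$1
to_dir=$2
# The quotes keep directory names that contain spaces in one piece.
mkdir "$to_dir"
cp "$from_dir"/*.map "$to_dir"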

If we now execute this script, named e.g. my_script2.sh, we have to give two arguments to the command. The first argument is used to define the source directory for the copy command and the second argument the target directory. For example, the command:

./my_script2.sh source_data map_files

would copy all the files with the extension .map from a directory named source_data to a new directory called map_files.
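A script that expects arguments is easier to use if it checks them first. The sketch below is one possible way to do this. It uses the set -- builtin to simulate the command line arguments, so that it can be tried without creating any directories; the variable names arg_count and first_argument are only illustrations:

```shell
#!/bin/bash
# Simulate running the script as: ./my_script2.sh source_data map_files
# (set -- replaces the positional parameters of the current shell).
set -- source_data map_files

arg_count=$#
first_argument=$1
echo "Number of arguments: ${arg_count}"
echo "First argument: ${first_argument}"

# Stop early with a usage message if the argument count is wrong.
if [[ $# -ne 2 ]]
then
   echo "Usage: $0 from_dir to_dir"
   exit 1
fi

# "$@" expands to all arguments, each kept as a separate item.
for argument in "$@"
do
   echo "Got argument: ${argument}"
done
```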

Quotation marks

Three different quotation marks are used in bash. Quotation marks are frequently needed to define variables and commands to be executed. The following quotation marks can be used:

" " Take the text within the quotes literally after substituting any variables
' ' Take the text within the quotes literally
` ` Take the text within the quotes as a command, execute it and replace the quoted text with the output of the command

Below are some examples that illustrate the functional differences between the quotation marks. Quotation marks can be used to operate with variables and arguments. When double or single quotation marks are used, all the text inside the quotation marks is treated as one argument. The difference between the two is that inside double quotation marks variables are substituted by their values, while inside single quotation marks all text is used as it is. If you run the commands:

variable=sample1
echo "value = $variable"

the result will be

value = sample1

But if you use single quotation marks instead:

echo 'value = $variable'

you will get output:

value = $variable

In Linux commands and scripts, quotation marks are typically used to define arguments that contain spaces or other special characters. Say we would like to use grep to pick all rows that contain the string file size from a file called files.txt. The following command would not work:

grep file size files.txt

If you run the command above, grep searches for the string file and interprets the word size as the name of an input file. Unless a file called size happens to exist, you get an error message. We can fix the situation by using quotation marks:

grep "file size" files.txt

Now the first argument, defining the string to be searched, is file size (including the space between the words), and the second argument, defining the input file, is files.txt, as originally intended.
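The same rule applies when the text comes from a variable: an unquoted variable whose value contains spaces is split into several arguments. A minimal sketch that counts the resulting arguments with the set -- builtin (the variable name pattern is just an example):

```shell
pattern="file size"

# Unquoted: the value is split at the space into two arguments.
set -- $pattern
unquoted_count=$#

# Quoted: the value is passed on as a single argument.
set -- "$pattern"
quoted_count=$#

echo "Unquoted variable gives ${unquoted_count} arguments"
echo "Quoted variable gives ${quoted_count} argument"
```

For this reason it is usually safest to write variables inside double quotation marks when they are used as command arguments.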

The third quotation mark type, `, has a special meaning. With these quotation marks, you can make one Linux command produce an argument for another Linux command. The basic syntax of the ` ` marks is:

command1 `command2`

where command1 will use the output of command2 as an argument. In bash scripts, the same functionality is also available with the syntax $(command).
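A common use is to store the output of a command in a variable. A short sketch, in which the input text is produced with printf so that the example does not depend on any existing files (the variable names are arbitrary and the path is only used as text):

```shell
# Count the lines produced by printf; wc -l prints the number of lines.
line_count=$(printf 'one\ntwo\nthree\n' | wc -l)

# The older backquote syntax gives the same result.
line_count_old=`printf 'one\ntwo\nthree\n' | wc -l`

# $(( )) is used here to strip any padding that wc may add to the number.
echo "Lines counted: $((line_count)) and $((line_count_old))"

# The $( ) form is easier to nest than backquotes.
parent_name=$(basename "$(dirname /wrk/testuser/project_3)")
echo "Parent directory name: ${parent_name}"
```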

Loops and conditional statements

Loops and conditional statements are rarely used in interactive command line usage. However, they are frequently used in scripts to perform similar commands several times and to control which commands are executed. Bash provides a wide selection of loops, conditional statements and other control structures. In this section we show examples of some of the most commonly used control structures.

A for loop performs the specified commands iteratively so that on each iteration the loop variable is set to one of the items in the given element list. In bash a for loop is made with the command structure:

for variable in element_list
do
   commands
done

For example, the loop:

for filename in sample1.txt sample2.txt sample3.txt
do
  echo ${filename}
done

Would print out:

sample1.txt
sample2.txt
sample3.txt
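The element list can also come from an array variable. A minimal sketch, assuming an array called samples (a hypothetical name); writing "${samples[@]}" in double quotation marks keeps each array item as one element even if it contains spaces:

```shell
samples=(sample1.txt sample2.txt "sample 3.txt")

count=0
for filename in "${samples[@]}"
do
   echo "Item: ${filename}"
   ((count = count + 1))
done

echo "Looped over ${count} of ${#samples[@]} items"
```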

Typically the element list contains file names to be processed, but it can contain any other items too. For example, say we have a directory called project_3 that contains nine files called sample1.txt, sample2.txt ... sample9.txt. To see the content of the directory we can use the command ls:

testuser@hippu1:/wrk/testuser> ls project_3/
sample1.txt sample3.txt sample5.txt sample7.txt sample9.txt  
sample2.txt sample4.txt sample6.txt sample8.txt

If we would like to rename each of these files so that they have the extension .old, we could run the command mv nine times, or we could use a for loop:

for filename in sample1.txt sample2.txt sample3.txt sample4.txt  \
sample5.txt  sample6.txt  sample7.txt  sample8.txt  sample9.txt
do
   echo "Renaming file: ${filename}"
   mv project_3/${filename} project_3/${filename}.old
done

The for loop above is still quite clumsy, as we need to write all the file names to the element list. We can avoid this by replacing the element list with $(ls project_3/). Now the command ls project_3/ is used to produce the list of file names to be processed:

for filename in $(ls project_3/)
do
  echo "Moving file: $filename"
  mv project_3/$filename project_3/"$filename".old
done
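One caveat with $(ls project_3/) is that the output of ls is split at every space, so a file name that contains spaces would be processed as several separate names. A sketch of an alternative that lets bash expand a wildcard pattern instead; it builds its own project_3 directory under a temporary directory so that it can be tried safely, and the file names are made up:

```shell
#!/bin/bash
# Create a throwaway project_3 directory with a few test files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir project_3
touch project_3/sample1.txt project_3/sample2.txt "project_3/sample 3.txt"

# The wildcard is expanded by bash into one item per matching file.
for filepath in project_3/*.txt
do
   echo "Moving file: ${filepath}"
   mv "${filepath}" "${filepath}.old"
done

ls project_3/
```

Note that here the loop variable contains the directory name as well, so it is not repeated in the mv command.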

In bash you can also create a for loop with a numerical index variable that is increased automatically by a certain step size in each iteration. In this case the syntax is:

for ((variable=start; variable<=end; variable++))

Below is a for loop that performs the same renaming operation as above, but using just numbers as elements.

for ((number=1; number<=9; number++))
do
  echo "Moving file: sample${number}.txt"
  mv project_3/sample${number}.txt project_3/sample${number}.txt.old
done
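The last part of the (( )) expression sets the step size. A small sketch that counts from 0 to 10 in steps of 2 and collects the visited values in a variable (both variable names are arbitrary):

```shell
values=""
for ((number=0; number<=10; number+=2))
do
   values="${values} ${number}"
done

echo "Visited values:${values}"
```

This prints the even numbers 0 2 4 6 8 10 on one line.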

In a while loop, the loop keeps running as long as the defined condition is true. In bash a while loop can be made with the syntax:

while [[ condition ]]
do
  commands
done

The renaming operation done above with a for loop could also be done with a while loop:

number=1
while [[ $number -le 9 ]]
do
   echo "Moving file: sample${number}.txt"
   mv project_3/sample${number}.txt project_3/sample${number}.txt.old
   ((number = number + 1))
done

In the example above, a variable called number is first set to the value 1. The value of this variable is then increased by 1 at the end of each iteration. The iterations continue until the variable reaches the value 10.
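A while loop is most useful when the number of iterations is not known in advance. As a simple illustration with made-up numbers, the sketch below counts how many times the value 100 can be halved with integer division before it reaches 1:

```shell
value=100
halvings=0
while [[ $value -gt 1 ]]
do
   ((value = value / 2))
   ((halvings = halvings + 1))
done

echo "Halved ${halvings} times, final value ${value}"
```

The values go through 50, 25, 12, 6, 3 and 1, so the loop runs six times.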

Conditional statements (if) can be made as follows:

if [[ condition ]]
then
   commands
else
   commands
fi

You can use the operators listed in the table below in the condition statements of if and while commands. Note that bash uses different comparison operators for strings and integers. For example, the equality of strings is tested with "==" while the equality of integers is tested with "-eq". The syntax is also strict about the spaces between the brackets and the condition statement: the definition [[a == b]] will not work and should be written as [[ a == b ]].

Table: Commonly used string, integer and file operators of if and while statements

Statement	Operation
[[ a == b ]]	True if strings a and b are equal.
[[ a != b ]]	True if strings a and b are not equal.
[[ a =~ b ]]	True if string a matches the regular expression b.
[[ a < b ]]	True if string a is alphabetically before string b.
[[ a > b ]]	True if string a is alphabetically after string b.
[[ a -eq b ]]	True if integers a and b are equal.
[[ a -ne b ]]	True if integers a and b are not equal.
[[ a -lt b ]]	True if integer a is less than b.
[[ a -gt b ]]	True if integer a is greater than b.
[[ a -le b ]]	True if integer a is less than or equal to b.
[[ a -ge b ]]	True if integer a is greater than or equal to b.
[[ -e name ]]	True if a file called name exists.
[[ -n a ]]	True if string a has non-zero length.
[[ A || B ]]	True if condition A or condition B is true (logical OR).
[[ A && B ]]	True if both condition A and condition B are true (logical AND).
[[ ! A ]]	True if condition A is not true.

Below are some examples of if command structures.

Check if the integer variable x is greater than 10:

if [[ $x -gt 10 ]]
then
   echo "The value of variable x is more than 10"
fi

Check that the variable x is greater than 10 but smaller than 20:

if [[ $x -gt 10 && $x -lt 20 ]]
then
   echo "The value of variable x is more than 10 but less than 20"
else
   echo "The value of x is out of range"
fi

You can also compare variables containing text (strings):

if [[ $answer == "yes" ]]
then
   echo "Your answer was: yes"
elif [[ $answer == "no" ]]
then
   echo "Your answer was no"
else
   echo "you didn't answer yes or no"
fi
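The =~ operator compares a string against a regular expression, which is useful for checking that a value has the expected form. A sketch, assuming file names of the form sampleN.txt as in the earlier examples; the pattern is stored in a variable (here called pattern) and used unquoted, and the part matched by the parentheses is available in the BASH_REMATCH array:

```shell
filename=sample7.txt
pattern='^sample([0-9]+)\.txt$'

if [[ $filename =~ $pattern ]]
then
   # BASH_REMATCH[1] holds the text matched by the first ( ) group.
   sample_number=${BASH_REMATCH[1]}
   echo "${filename} is sample number ${sample_number}"
else
   echo "${filename} does not look like a sample file"
fi
```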

When using less than and greater than comparisons, be careful not to mix string and integer comparisons. For example, the following condition:

[[ 123 > 3 ]]

is FALSE because string 123 is alphabetically before string 3. The numerical comparison:

[[ 123 -gt 3 ]]

is TRUE.
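Inside the arithmetic (( )) syntax the < and > signs always compare numbers, which gives another way to avoid the mix-up. A small sketch that evaluates the same pair of values in both ways (the result variable names are arbitrary):

```shell
# String comparison: "123" sorts before "3".
if [[ 123 > 3 ]]
then
   string_result=TRUE
else
   string_result=FALSE
fi

# Arithmetic comparison: 123 is a larger number than 3.
if (( 123 > 3 ))
then
   integer_result=TRUE
else
   integer_result=FALSE
fi

echo "String comparison  [[ 123 > 3 ]] is ${string_result}"
echo "Integer comparison (( 123 > 3 )) is ${integer_result}"
```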

There are a number of operators you can use to test different attributes of a file. The most commonly used operator is -e, which checks if a file exists. As an example, let's assume that we have a simple list of file names in a file called checklist.txt. Now we would like to check which of these files are found in the current directory. We can use a for loop to go through all the file names and an if command with the -e condition to test whether each file exists.

for file_name in $(cat checklist.txt)
do
   if [[ -e $file_name ]]
   then
      echo "File $file_name was found"
   else
      echo "File $file_name was not found"
   fi
done
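Note that $(cat checklist.txt) splits the list at every space, not only at line ends. If the file names may contain spaces, the list can instead be read one line at a time with a while loop and the read command. A sketch under that assumption, using a temporary directory and made-up file names:

```shell
#!/bin/bash
# Build a small test case in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch "my data.map"
printf '%s\n' "my data.map" "missing.map" > checklist.txt

found=0
missing=0
# read -r stores one whole line of checklist.txt in file_name per iteration.
while IFS= read -r file_name
do
   if [[ -e $file_name ]]
   then
      echo "File $file_name was found"
      ((found = found + 1))
   else
      echo "File $file_name was not found"
      ((missing = missing + 1))
   fi
done < checklist.txt

echo "Found: ${found}, not found: ${missing}"
```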

Printing the output

In the previous examples we have already used the echo command to write text and variables to the standard output (i.e. to the screen or to a file by standard output redirection). For example, the command:

echo "Hello world"

Prints out:

Hello world

Echo can be used for printing output in many cases, but it does not provide good tools for creating well formatted output with defined columns. In situations where well structured text output is needed, printf should be used instead of echo. The syntax of printf is:

printf "format definition" arguments_to_print

The format definition defines what types of output are to be printed. Common types include text (%s), integers (%i), and floating point numbers (%f). The format statements can also define how much space is reserved for each argument and how it is aligned in the column. Below are some simple examples that illustrate the use of the printf command.

Command:

printf "%i %s %s %f\n" 1 Hello World 23.75

Prints out:

1 Hello World 23.750000

Here the format statement defines that the first argument is treated as an integer, the second and third as strings and the fourth as a floating point number. Note that by default printf does not add a newline character to the end of the output. For that reason the format statement ends with \n.

In the next example we define how many characters are reserved for each argument. Command:

printf "%4i %10s %10s %6.2f\n" 1 Hello World 23.75

Prints out:

   1      Hello      World  23.75

Here we reserve four characters for the first integer, then ten characters for each of the strings. The floating point number is presented with six characters, two of which are after the decimal point.
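By default printf pads values on the left, which aligns them to the right edge of the column. Adding a minus sign to the width, as in %-10s, aligns the value to the left edge instead. A short sketch that stores the formatted text in variables with command substitution, so that the padding can be seen between the | marks (the variable names are arbitrary, and a locale that uses the decimal point is assumed):

```shell
# Default alignment: values are pushed to the right edge of the column.
right_aligned=$(printf "|%10s|%6.2f|" Hello 23.75)

# A minus sign before the width aligns values to the left edge.
left_aligned=$(printf "|%-10s|%-6.2f|" Hello 23.75)

echo "$right_aligned"
echo "$left_aligned"
```

Left alignment is typically used for text columns and right alignment for numbers.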

You can also add text and control characters, such as the tab character (\t), to the format statement. Command:

printf "This is my %i:st %s %s\t %6.1f\n" 1 Hello World 23.75

Prints out:

This is my 1:st Hello World	   23.8

In Linux scripts printf is typically used to print out values stored in variables. For example, the commands:

unit=3g
value=5.3
printf "The resulting value from:%4s\t is:\t%6.2f\n" $unit $value

Prints out:

The resulting value from:  3g	 is:	  5.30
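Combined with a for loop, printf can produce a simple table from values stored in arrays. A sketch with made-up units and integer values (the array names units and values are hypothetical); the same column widths are used for the header and for every row so that the columns line up:

```shell
units=(3g 12g 150g)
values=(5 48 1200)

# Header row: left-aligned text column, right-aligned number column.
printf "%-6s %8s\n" "Unit" "Value"

# One row per array item, using the array index as the loop variable.
for ((i=0; i<${#units[@]}; i++))
do
   printf "%-6s %8i\n" "${units[i]}" "${values[i]}"
done
```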