Referenced from: http://user.it.uu.se/~matkin/documents/shell/
This is a document that covers some issues regarding shell script programming. Note that this page is still under construction. The intention is that it should be possible to use it as a WWW text for "advanced" shell programming, but right now I am just collecting stuff.
Note! I use Bourne shell or derivatives thereof, like BASH. Therefore the scripts contained herein are written for Bourne shell (usually found under /bin/sh), unless said otherwise. Also note that this is work in progress. Hence some of the descriptions might be bad, some might be confusing, and yet some may be missing. If that is the case, please send me a note about it.
This document is structured as follows.
In the early days computers were used to run programs, and nothing more. You punched your program on cards and delivered the pack to the computer department. The staff there loaded the program into the computer, executed it, and retrieved the result on paper. This was in turn returned to you, and you had to sit down and figure out why you got the result you got.
Modern computers are a little more complex than that. There you have a complete environment where you can execute your programs and even have such astonishing things as interactive programs (hear-hear). It is no longer enough to be able to load your program and just print the result. You also need support to reformat the results, process them in other manners (maybe printing a nice diagram), and store them in a database. It would of course be possible to write specially designed programs that formatted the output of your programs according to your wishes, but the number of specialized programs would quickly increase, leaving your computer loaded with "might come in handy" programs.
A better approach would be to have a small set of processing programs together with a program made to "glue the parts together." On a UNIX system such a program is called the shell (in contrast with the core that contains time-sharing code, file access code and other system oriented code). The shell is used to issue commands, start processes, control jobs, redirect input and output, and other mundane things that you do on a modern computer. Not only that, the shell is a pretty complete programming language.
In this paper we will introduce concepts and methods that a good shell programmer can use to get the most out of his/her UNIX system. We will start from the beginning, but a basic familiarity with programming and/or the basic principles of computers will be assumed. This is not a paper for the complete novice, although you are welcome to read it.
There are basically two types of commands: simple commands and complex commands. In this section we will explain what simple commands are and go through the complex commands one by one. We will start by going through the process of executing a simple program, together with some of the standard ones, and then continue by investigating the steps in the execution more thoroughly.
From the manual of sh(1), we have the following definition of simple commands.
A simple command is a sequence of non-blank words separated by blanks. The first word specifies the name of the command to be executed. Except as specified below, the remaining words are passed as arguments to the invoked command. The command name is passed as argument 0 (see execve(2V) ).
To execute a simple command you simply type it to the shell. For example, to execute the simple command ls you do the following.

$ ls
Mail        doc         lib         public_html
News        emacs       man         bin
outgoing    incoming    tmp

In this example, $ is the prompt, i.e. the text the computer prints to tell you that it is ready to accept commands. ls is the command you type to the computer and the remaining lines are the result of the ls command. We will in the sequel print the text that the computer prints in 'Courier' and the text that you type to the computer in 'boldface Courier'.
If you for some reason lack a prompt, you will not be able to give commands to the shell. There are many possible reasons why you may not have a prompt.
There are many more simple commands that are useful. Some examples are: man, cat, echo, and rm. The man command is one of the more useful. It is used to get a manual of a command, i.e. a description of what the command does, and possible variations of its use. To get information on how to use a command, you just type man followed by the command you want information on, e.g. to get information on rm you type

$ man rm

and the result will be a manual of how the command rm works and what it does.

Exercise. Try the above command. What does the rm command do? How should you use it? Give at least 2 examples of use.
Builtin simple commands. There is a special set of commands that are built into the shell. Exactly which commands are built in depends entirely on the type of shell that you are using. The more common builtin commands are: echo, cd, pwd, . (or alternatively source), trap, return, hash, eval, and kill.
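To see whether a given command is a builtin or an external program, most Bourne-style shells (including BASH) provide a type builtin. A small sketch; the exact wording of the output varies between shells:

$ type cd
cd is a shell builtin
$ type ls
ls is /bin/ls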
There is a set of predefined variables in the shell. These variables are used to store values and also to change the behavior of called programs. A simple example of a variable is the HOME variable, which contains the path to your home directory (where you end up when you log in). If you type cd without any arguments, this is the directory where you will end up. To print the value of the variable HOME, you can write the following:
$ echo $HOME
/users/matkin
PATH variable

One of the more important variables is the path variable. The path variable controls where the shell searches for commands when you type them at the prompt. Let us go through the process in detail.

When you type a non-builtin command to the shell, the shell searches for a program to execute. The programs are simply executable files somewhere in the file system; they are executable either because they are compiled programs (written in C, C++, Pascal, Ada, or some other language) or because they are scripts that may be executed. Since we don't want to go through all files in the file system (on my account alone, I have approximately 3700 files), we have a path of directories where the program may be stored. This path is given to the shell as a colon-separated list of directories stored in the environment variable PATH. To change the directories where your shell should look, just alter the value of PATH.
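To see which directories your shell currently searches, you can print the value of PATH with one directory per line; a small sketch (some older tr implementations want '\012' instead of '\n'):

$ echo $PATH | tr ':' '\n'
/usr/bin
/bin
/usr/local/bin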
Example: If your path contains `/usr/bin:/bin:/usr/local/bin' you may extend the path with /home/matkin/bin (which could be the place where you put your own scripts) by writing
$ PATH=/home/matkin/bin:$PATH
The effect of this only remains as long as you are logged in. If you log out, your changes will be undone, since each new shell starts with a fresh set of variables. To set the path each time you log in, you have to add a line to your startup file.
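For a Bourne-style login shell the startup file is usually ~/.profile (BASH may also read ~/.bash_profile); a minimal sketch of the lines you would add there, assuming you keep your own scripts in $HOME/bin:

# in ~/.profile: make sure private scripts are found first
PATH=$HOME/bin:$PATH
export PATH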
Note! It is very common to put `.' in your path, either at the beginning or at the end. This is highly insecure and you should never do that. There are some common traps that can be used to crack an account which has `.' in its path.
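A harmless sketch of the kind of trap the note refers to: an attacker plants an executable file named ls in a world-writable directory such as /tmp. If `.' comes before the directory holding the real ls in your path and you happen to run ls while standing in /tmp, the planted script runs instead of /bin/ls:

$ cat /tmp/ls
#!/bin/sh
# a planted "ls": harmless here, but it could have done anything
echo "gotcha"
$ cd /tmp
$ ls
gotcha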
There is a set of other predefined variables used for different purposes related to the behaviour of the shell. We will here go through the more important ones and describe their purpose.
HOME
    The path to your home directory, i.e. the directory you end up in if you type cd without any arguments.

PS1
    The primary prompt, usually `$ '. This is what the computer prints whenever it is ready to process another command.

PS2
    The secondary prompt, usually `> '. The shell prints this prompt whenever you have typed an incomplete line, e.g., if you are missing a closing quote. For example:

    $ echo 'hello
    > world'
    hello
    world

SHELL
    The path of the shell you are using. The SHELL variable is usually set to /bin/sh.

IFS
    The internal field separators, i.e. the characters the shell uses to split lines into words; normally space, tab, and newline.
When you give a command to the shell, it undergoes some transformations before being executed. Since we are not ready to give the complete picture of what happens, we start by considering one of the more fundamental transformations, that of variable expansion (also called parameter substitution). As an example, let us assume that we have typed the following line to the shell:
$ ls $HOME
The shell will then see that $HOME is a variable (also called parameter) and replace it with its value. In my case, $HOME will be replaced with /home/matkin. The line that the shell sees is therefore
$ ls /home/matkin
Observe that no more variable expansion takes place after this initial variable expansion. This means that if $HOME happened to have the value `$PS1', the line above would try to find a file with the name `$PS1'.
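A small demonstration of this single pass of expansion, using a scratch variable FOO (the name is just for illustration) so that we do not disturb HOME:

$ FOO='$HOME'
$ echo $FOO
$HOME

The value $HOME is printed literally; it is not expanded a second time.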
Here a lot of text is missing. I'll fill it in as I go along.
Here are some one-liners that might come in handy some time. These one-liners serve both as educational examples and as solutions to "real problems". If you have suggestions for more one-liners, please mail me.
If you have a number of files named foo.C, bar.C.gz, etc. and want to rename them to foo.cc, bar.cc.gz, etc., this line will do the trick.
\ls *.C* | sed 's/\(.*\).C\(.*\)/mv & \1.cc\2/' | sh
The backslash before the ls command is to prevent it from being expanded, in case it is an alias and you are using a shell that has aliases (such as BASH). We want to prevent the shell from doing this expansion since ls might come out as ls -F (which would behave strangely) or ls -l, which is really bad.
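If you want to check what the one-liner is going to do before actually doing it, you can drop the final `| sh' and just look at the generated commands (the file names shown are the ones from the example above):

$ \ls *.C* | sed 's/\(.*\).C\(.*\)/mv & \1.cc\2/'
mv bar.C.gz bar.cc.gz
mv foo.C foo.cc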
An alternative is to install the rename script, which is written in Perl.
If you want to find out the full name for a user name you can use one of these one-liners to do the job:
ypmatch matkin passwd | cut -d: -f5 | cut -d, -f1
grep "^matkin:" /etc/passwd | cut -d: -f5 | cut -d, -f1
Which version you use depends on what type of system you have. If you only want the first name (or only the surname) you can pipe the output through cut -d' ' -f1 (or alternatively cut -d' ' -f2, if the second word is the surname).
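The reason for the two cut commands is that the fifth field of the passwd file (the GECOS field) may contain more than the full name, separated by commas. A sketch with a made-up passwd entry:

$ grep "^jdoe:" /etc/passwd
jdoe:x:1000:1000:Jane Doe,Room 12,555-1234:/home/jdoe:/bin/sh
$ grep "^jdoe:" /etc/passwd | cut -d: -f5 | cut -d, -f1
Jane Doe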
If you have a number of processes that you want to kill, one of the following one-liners might be useful:
kill `ps xww | grep "sleep" | cut -c1-5` 2>/dev/null
ps xww | grep "sleep" | cut -c1-5 | xargs kill 2>/dev/null
This will kill any processes that have the word "sleep" in the calling command. If your kill does not handle multiple PIDs you can either use the one-liner
ps xww | grep "sleep" | cut -c1-5 | xargs -i kill {} 2>/dev/null
or use a for-loop:
for x in `ps xww | grep "sleep" | cut -c1-5`
do
    kill $x 2>/dev/null
done
But then it is not a one-liner any more.
Note! Be very careful about what you use as expression to grep. You might get more processes killed than you wanted.
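One common surprise is that the grep process itself contains the word you are searching for on its command line and therefore shows up in the list. A small variation that avoids this is to write the pattern so that it no longer matches itself:

ps xww | grep "[s]leep" | cut -c1-5 | xargs kill 2>/dev/null

The command line of the grep process contains `[s]leep', which does not match the regular expression [s]leep.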
Assume that you have a large directory containing a lot of small C programs together with some real applications. The small files might be test programs to test details of the large program. In this case I usually compile the test programs into executable code with the same name as the file it came from (with the .c removed). If I want to remove all the test programs in one go I type the following line to my shell
for x in *; do [ -x $x -a -f $x.c ] && echo $x; done | xargs rm -f
This will remove all executable files with a corresponding .c file but keep all other executable files.
Note! As always when you use a complex command, or a command with wildcards in it, together with rm, insert an echo in front of the rm to make sure that the command does not do anything weird.
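For the one-liner above, such a dry run could look as follows; the file names are printed by echo instead of being removed:

for x in *; do [ -x $x -a -f $x.c ] && echo $x; done | xargs echo rm -f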
Sometimes you want to find a program matching some wildcard. The following line might be useful. Here I wanted to find all executable files that matched the filename pattern *gif*.
( IFS=: ; for D in $PATH; do for F in $D/*gif*; do [ -x $F ] && echo $F; done; done )
This script requires you to have some more elaborate commands. One command that you might not have on your system is the command socket(1), written by Juergen Nickelsen. Assuming that you have all your addresses in a file called mail-list and want to verify them using the SMTP daemon at kay.docs.uu.se (this is the computer I use), the following line will do the job.
( sed 's/.*/VRFY <&>/' mail-list ; echo QUIT ) | socket -c kay smtp
There are a few tricks that you can use when you write shell scripts; I will try to summarize some of them here.
sh
A useful trick is to rewrite a line into a shell command and then pipe the output into sh. One such example can be seen above, when we rewrite a line containing a file name into a command involving the file name.
read
If you want to split a line into parts you can always use the read command.

Assume that, for some reason, you have a variable FOO containing "foo is a bar". The following code can be used to put the first word of the variable into the variable first, and the rest into the variable rest:
echo $FOO | { read first rest ; echo "$first and $rest"; }
See below for a better example of when this trick is useful.
IFS
If you want to split a line into parts where the parts are separated by something other than whitespace, you can reset the variable IFS to split the line. It is worth noting that IFS is only regarded when using the read command or the for control structure. As an example, here is a script that works almost the same way as the which command:
#!/bin/sh
IFS=:
for p in $PATH
do
    if [ -x $p/$1 ]
    then
        echo $p/$1
        exit 0
    fi
done
echo "No $1 in your path" 1>&2
exit 1
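Assuming the script above has been saved in an executable file called mywhich (the name is just for illustration), it can be used like this; the path printed will of course depend on your system:

$ ./mywhich ls
/usr/bin/ls
$ ./mywhich no-such-program
No no-such-program in your path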
If you want to set a variable for one command only, you can always write the variable assignment first, immediately followed by the simple command (don't write any ; after the assignment). This will result in the variable being set for that command only, but keeping its old value (if it had one) after the command has been executed. As an example, the command
MANPATH=/usr/man:/usr/local/man man test
will look for the manual for test only in the directories /usr/man and /usr/local/man, regardless of what value MANPATH had before the call.
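A small demonstration that the old value is kept; the variable FOO and the inner sh -c command are just for illustration:

$ FOO=old
$ FOO=new sh -c 'echo inside: $FOO'
inside: new
$ echo outside: $FOO
outside: old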
Sometimes you want to split a string into components, in the same manner as above, but now you want to store the components in variables, not go through them using a for loop. To do this you can feed a script (or a function) through standard input instead of supplying arguments to the script, and then use the read command to set variables to the supplied values. This is most useful when you have a string that you need to split at something other than whitespace.

Assume that you want to split an e-mail address into name and domain address. We will also supply an extra Subject line, to show that the IFS setting doesn't affect the second use of read. We write the following script into the file email:
#!/bin/sh
IFS=@ read name address
echo "A mail to $name at $address"
read subject
echo "Subject: $subject"
The script is fed its input through standard input. Calling this script in the following manner (the > and $ are prompts)
$ email <<EOT
> matkin@docs.uu.se
> Something strange @ my place
> EOT
will produce the output
A mail to matkin at docs.uu.se
Subject: Something strange @ my place
When you use pipes to pass strings along, try to avoid putting the pipes deep down in the control structure. Remember that you are not restricted to putting a pipe after a simple command, but that you can put the pipe after a for-loop, a while-loop, or any other control structure. If you do this properly, you will not spawn more processes than necessary.
As an example, assume that you want to go through all C files of a directory and, if they are readable to you, convert the filenames to contain uppercase letters only (this example may be a little contrived). We write two scripts that do this, but they do it in slightly different ways.

The first script calls tr inside the for-loop:
#!/bin/sh
for x in *.c
do
    [ -r $x ] && echo $x | tr 'a-z' 'A-Z'
done
and the second script calls tr outside the loop:
#!/bin/sh
for x in *.c
do
    [ -r $x ] && echo $x
done | tr 'a-z' 'A-Z'
On this computer (a SPARCstation 10), the first script takes approximately 6.2 seconds to process 33 C files while the second takes approximately 0.7 seconds.
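If you want to reproduce such a comparison yourself, you can time the two variants with the time command; the script names inside.sh and outside.sh are hypothetical, and the numbers will of course depend on your machine and the number of files.

$ time sh inside.sh > /dev/null
$ time sh outside.sh > /dev/null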
This chapter contains some exercises. Some of them have to do with what is in this document and others require some thought and some testing.
If you have a file named `#foo.c#' in your directory and you want to remove it, you cannot remove it by typing:

$ rm #foo.c#
If you have a file named `-foo' in your directory and you want to remove it, it will not work if you type:

$ rm -foo
Following the above example you could try to type

$ rm '-foo'
but this will not work either.
How do you remove the file?
You are making a batch of beer. The beer has to stand in a warm place for 7 days and after that it has to stand in a cooler place for 2 weeks. You have a tendency to lose track of such mundane things, so you want to write a small script that sends you a letter after 7 days telling you to move the beer and another letter 2 weeks after that telling you that the beer is finished.
References: at(1), mail(1), sh(1)
Hint: Look up how the << redirection works in sh(1).
It is 3 o'clock in the morning and you have been working on an AD2 exercise that involves a lot of measuring and other dreadful stuff (don't laugh, you'll do them yourself some time :-). You have a program measure that will take a file name as argument and write a line:
<file> <number> <average length>
on standard output. Unfortunately this is not what you want.
The compiler is mounted on a computer that just went down so you can't recompile the program to print the data you want. You are very tired and don't want to wait around for the computer to restart itself. Each run takes a very long time and you don't want to spend your time watching it. Write a script that runs the programs, times them and reformats the output to:
<number> <user time + system time>
and sorts it on increasing <number> . Also show what command you type before you log out and leave for a good night's rest.
References: batch(1), time(1), calc(1), sh(1), sort(1), test(1), echo(1)
Hints: This is not an easy exercise.
This section gives some examples of simple scripts that might come in handy at times. As far as possible I avoid explicit BASH constructs, but the more interesting examples use BASH functionality quite heavily.
Sometimes you want to find all programs in your path that are executable, not only the first one. The program which does almost this work, but unfortunately it only prints the first instance of the program it finds. What we want is a generalization of the command described above.

Let us instead try the following approach. Let us call the program whereare (in contrast with the program whereis, which searches for a program). As always when writing programs, we first write down the specification, in this case in the form of a manual.
SYNOPSIS
    whereare pattern . . .

DESCRIPTION
    whereare takes a list of file name patterns and looks for executable files in the path that match the file name patterns.

EXAMPLES
    The following command will look for ispell or spell.

        $ whereare ispell spell

    The following command will look for a file matching the pattern foo:*.txt.

        $ whereare 'foo:*.txt'

ENVIRONMENT
    PATH    The path whereare uses to search for executable programs.
Not too complicated. Well, the script appears as follows:
#!/bin/sh
for P in "$@"; do
    IFS=:
    for D in $PATH; do
        for F in $D/$P; do
            [ -x "$F" ] && echo $F
        done
    done
done
The outer loop will go through the supplied list of file name patterns and the inner loop will, for each pattern, go through the path to see what matches the file name pattern. The innermost loop is used to expand the file name pattern. Remember that if the pattern matches one or more files, the expansion results in a list of the files that matched. We then have to go through each of these files to see which of them are executable.
A short example of how to use the program. Note that we have to surround the file patterns with single quotes (') to prevent the pattern from being expanded by the shell before it is sent to the script.
[ 17:42:56 ] @ Owein $ ./whereare '*mail*'
/export/matkin/bin/mailserver
/export/matkin/bin/mailto
/export/matkin/bin/mailto-hebrew
/export/matkin/bin/metamail
/export/matkin/bin/patch-metamail
/export/matkin/bin/splitmail
/usr/ucb/mail
/usr/sup/misc/bin/ml.mail
/usr/openwin/bin/mailp
/usr/openwin/bin/mailprint
/usr/openwin/bin/mailtool
/usr/bin/mail
/usr/bin/mailcompat
/usr/bin/mailq
/usr/bin/mailstats
/usr/bin/mailx
/usr/bin/rmail
It seems to work quite ok.
Here are some complete scripts. You might have to do some minor tweaking to make them work (changing paths, turning ${foo:?well...} into ${foo?well...} if you use the "old style" BSD shell, etc.). Some of them are just hacks that I wrote to do something that I needed done, and others are serious scripts intended for distribution. These scripts can be used as they are, serve as inspiration, or serve as examples of ways to do things. I give no guarantee that they are correct or even that they will work.
A script for /bin/sh that will send mail to users having more garbage than a predefined limit.
It assumes that there is an executable file named test_sort in the same directory and that it accepts a number as its first (and only) argument. The number represents the size of the array to be sorted.
The script makes $COUNT measurements starting at $START and increasing by $INC each step. It then emits calc code to store the user time for each execution in a matrix m, and stores the supplied number in the matrix t. Afterwards it generates code for a matrix A representing the function n + n * log(n) + 1 (i.e. without the constants a, b, and c which we want to compute). It then generates code to multiply A and m with the transpose of A, thereby projecting A and m to a 3-dimensional space, and solves the resulting linear equation.
You don't really have to know any linear algebra to use the script; just replace the command and the functions accordingly to make a least-squares approximation to other linear functions.
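The script itself is not reproduced here, but a minimal sketch of the measurement loop described above could look like this; COUNT, START, INC, and test_sort are the names used in the text, the values are placeholders, the generation of calc code is omitted, and time prints its figures on standard error:

#!/bin/sh
# Run test_sort on arrays of increasing size, COUNT times in total.
COUNT=10 START=1000 INC=1000          # placeholder values
N=$START
I=0
while [ $I -lt $COUNT ]
do
    echo "Sorting $N elements:"
    time ./test_sort $N
    N=`expr $N + $INC`
    I=`expr $I + 1`
done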
Last modified: Tue Jun 25 12:37:03 2002