The bash shell is just amazing. There are so many tasks that can be simplified using its handy features. This tutorial tells about some of those features, explains what exactly they do and learns you how to use them.
Difficulty: Basic - Medium
Sometimes you know that you ran a command a while ago and you want to run it again. You know a bit of the command, but you don't exactly know all options, or when you executed the command. Of course, you could just keep pressing the Up Arrow until you encounter the command again, but there is a better way. You can search the bash history in an interactive mode by pressing Ctrl + r. This will put bash in history mode, allowing you to type a part of the command you're looking for. In the meanwhile, it will show the most recent occasion where the string you're typing was used. If it is showing you a too recent command, you can go further back in history by pressing Ctrl + r again and again. Once you found the command you were looking for, press enter to run it. If you can't find what you're looking for and you want to try it again or if you want to get out of history mode for an other reason, just press Ctrl + c. By the way, Ctrl + c can be used in many other cases to cancel the current operation and/or start with a fresh new line.
You can repeat the last argument of the previous command in multiple ways. Have a look at this example:
[rechosen@localhost ~]$ mkdir /path/to/exampledir
[rechosen@localhost ~]$ cd !$
The second command might look a little strange, but it will just cd to /path/to/exampledir. The "!$" syntax repeats the last argument of the previous command. You can also insert the last argument of the previous command on the fly, which enables you to edit it before executing the command. The keyboard shortcut for this functionality is Esc + . (a period). You can also repeatedly press these keys to get the last argument of commands before the previous one.
There are some pretty useful keyboard shortcuts for editing in bash. They might appear familiar to Emacs users:
If you've just started a huge process (like backupping a lot of files) using an ssh terminal and you suddenly remember that you need to do something else on the same server, you might want to get the huge process to the background. You can do this by pressing Ctrl + z, which will suspend the process, and then executing the bg command:
[rechosen@localhost ~]$ bg
[1]+ hugeprocess &
This will make the huge process continue happily in the background, allowing you to do what you need to do. If you want to background another process with the huge one still running, just use the same steps. And if you want to get a process back to the foreground again, execute fg:
[rechosen@localhost ~]$ fg
hugeprocess
But what if you want to foreground an older process that's still running? In a case like that, use the jobs command to see which processes bash is managing:
[rechosen@localhost ~]$ jobs
[1]- Running hugeprocess &
[2]+ Running anotherprocess &
Note: A "+" after the job id means that that job is the 'current job', the one that will be affected if bg or fg is executed without any arguments. A "-" after the job id means that that job is the 'previous job'. You can refer to the previous job with "%-".
Use the job id (the number on the left), preceded by a "%", to specify which process to foreground / background, like this:
[rechosen@localhost ~]$ fg %3
And:
[rechosen@localhost ~]$ bg %7
The above snippets would foreground job [3] and background job [7].
There are multiple ways to embed a command in an other one. You could use the following way (which is called command substitution):
[rechosen@localhost ~]$ du -h -a -c $(find . -name *.conf 2>&-)
The above command is quite a mouthful of options and syntax, so I'll explain it.
And there's another way to substitute, called process substitution:
[rechosen@localhost ~]$ diff <(ps axo comm) <(ssh user@host ps axo comm)
The command in the snippet above will compare the running processes on the local system and a remote system with an ssh server. Let's have a closer look at it:
Let's have a look at the whole snippet now:
This way, you have done in one line what would normally require at least two: comparing the outputs of two processes.
And there even is another way, called xargs. This command can feed arguments, usually imported through a pipe, to a command. See the next chapter for more information about pipes. We'll now focus on xargs itself. Have a look at this example:
[rechosen@localhost ~]$ find . -name *.conf -print0 | xargs -0 grep -l -Z mem_limit | xargs -0 -i cp {} {}.bak
Note: the "-l" after grep is an L, not an i.
The command in the snippet above will make a backup of all .conf files in the current directory and accessible subdirectories that contain the string "mem_limit".
These substitutions can, once mastered, provide short and quick solutions for complicated problems.
One of the most powerful features is the ability to pipe data through commands. You could see this as letting bash take the output of a command, then feed it to an other command, take the output of that, feed it to another and so on. This is a simple example of using a pipe:
[rechosen@localhost ~]$ ps aux | grep init
If you don't know the commands yet: "ps aux" lists all processes executed by a valid user that are currently running on your system (the "a" means that processes of other users than the current user should also be listed, the "u" means that only processes executed by a valid user should be shown, and the "x" means that background processes (without a tty) should also be listed). The "grep init" searches the output of "ps aux" for the string "init". It does so because bash pipes the output of "ps aux" to "grep init", and bash does that because of the "|" operator.
The "|" operator makes bash redirect all data that the command left of it returns to the stdout (more about that later) to the stdin of the command right of it. There are a lot of commands that support taking data from the stdin, and almost every program supports returning data using the stdout.
The stdin and stdout are part of the standard streams; they were introduced with UNIX and are channels over which data can be transported. There are three standard streams (the third one is stderr, which should be used to report errors over). The stdin channel can be used by other programs to feed data to a running process, and the stdout channel can be used by a program to export data. Usually, stdout output (and stderr output, too) is received by the terminal environment in which the program is running, in our case bash. By default, bash will show you the output by echoing it onto the terminal screen, but now that we pipe it to an other command, we are not shown the data.
Please note that, as in a pipe only the stdout of the command on the left is passed on to the next one, the stderr output will still go to the terminal. I will explain how to alter this further on in this tutorial.
If you want to see the data that's passed on between programs in a pipe, you can insert the "tee" command between it. This program receives data from the stdin and then writes it to a file, while also passing it on again through the stdout. This way, if something is going wrong in a pipe sequence, you can see what data was passing through the pipes. The "tee" command is used like this:
[rechosen@localhost ~]$ ps aux | tee filename | grep init
The "grep" command will still receive the output of "ps aux", as tee just passes the data on, but you will be able to read the output of "ps aux" in the file <filename> after the commands have been executed. Note that "tee" tries to replace the file <filename> if you specify the command like this. If you don't want "tee" to replace the file but to append the data to it instead, use the -a option, like this:
[rechosen@localhost ~]$ ps aux | tee -a filename | grep init
As you have been able to see in the above command, you can place a lot of command with pipes after each other. This is not infinite, though. There is a maximum command-line length, which is usually determined by the kernel. However, this value usually is so big that you are very unlikely to hit the limit. If you do, you can always save the stdout output to a file somewhere inbetween and then use that file to continue operation. And that introduces the next subject: saving the stdout output to a file.
You can save the stdout output of a command to a file like this:
[rechosen@localhost ~]$ ps aux > filename
The above syntax will make bash write the stdout output of "ps aux" to the file filename. If filename already exists, bash will try to overwrite it. If you don't want bash to do so, but to append the output of "ps aux" to filename, you could do that this way:
[rechosen@localhost ~]$ ps aux >> filename
You can use this feature of bash to split a long line of pipes into multiple lines:
[rechosen@localhost ~]$ command1 | command2 | ... | commandN > tempfile1
[rechosen@localhost ~]$ cat tempfile1 | command1 | command2 | ... | commandN > tempfile2
And so on. Note that the above use of cat is, in most cases, a useless one. In many cases, you can let command1 in the second snippet read the file, like this:
[rechosen@localhost ~]$ command1 tempfile1 | command2 | ... | commandN > tempfile2
And in other cases, you can use a redirect to feed a file to command1:
[rechosen@localhost ~]$ command1 < tempfile1 | command2 | ... | commandN> tempfile2
To be honest, I mainly included this to avoid getting the Useless Use of Cat Award=).
Anyway, you can also use bash's ability to write streams to file for logging the output of script commands, for example. By the way, did you know that bash can also write the stderr output to a file, or both the stdout and the stderr streams?
The bash shell allows us to redirect streams to other streams or to files. This is quite a complicated feature, so I'll try to explain it as clearly as possible. Redirecting a stream is done like this:
[rechosen@localhost ~]$ ps aux 2>&1 | grep init
In the snippet above, "grep init" will not only search the stdout output of "ps aux", but also the stderr output. The stderr and the stdout streams are combined. This is caused by that strange "2>&1" after "ps aux". Let's have a closer look at that.
First, the "2". As said, there are three standard streams (stin, stdout and stderr).These standard streams also have default numbers:
As you can see, "2" is the stream number of stderr. And ">", we already know that from making bash write to a file. The actual meaning of this symbol is "redirect the stream on the left to the stream on the right". If there is no stream on the left, bash will assume you're trying to redirect stdout. If there's a filename on the right, bash will redirect the stream on the left to that file, so that everything passing through the pipe is written to the file.
Note: the ">" symbol is used with and without a space behind it in this tutorial. This is only to keep it clear whether we're redirecting to a file or to a stream: in reality, when dealing with streams, it doesn't matter whether a space is behind it or not. When substituting processes, you shouldn't use any spaces.
Back to our "2>&1". As explained, "2" is the stream number of stderr, ">" redirects the stream somewhere, but what is "&1"? You might have guessed, as the "grep init" command mentioned above searches both the stdout and stderr stream, that "&1" is the stdout stream. The "&" in front of it tells bash that you don't mean a file with filename "1". The streams are sent to the same destination, and to the command receiving them it will seem like they are combined.
If you'd want to write to a file with the name "&1", you'd have to escape the "&", like this:
[rechosen@localhost ~]$ ps aux > /&1
Or you could put "&1" between single quotes, like this:
[rechosen@localhost ~]$ ps aux > '&1'
Wrapping a filename containing problematic characters between single quotes generally is a good way to stop bash from messing with it (unless there are single quotes in the string, then you'd have have escape them by putting a / in front of them).
Back again to the "2>&1". Now that we know what it means, we can also apply it in other ways, like this:
[rechosen@localhost ~]$ ps aux > filename 2>&1
The stdout output of ps aux will be sent to the file filename, and the stderr output, too. Now, this might seem unlogical. If bash would interpret it from the left to the right (and it does), you might think that it should be like:
[rechosen@localhost ~]$ ps aux 2>&1 > filename
Well, it shouldn't. If you'd execute the above syntax, the stderr output would just be echoed to the terminal. Why? Because bash does not redirect to a stream, but to the current final destination of the stream. Let me explain it:
If we put the redirections the other way around ("> filename" first), it does work. I'll explain that, too:
Get it? The redirects cause a stream to go to the same final destination as the specified one. It does not actually merge the streams, however.
Now that we know how to redirect, we can use it in many ways. For example, we could pipe the stderr output instead of the stdout output:
[rechosen@localhost ~]$ ps aux 2>&1 > /dev/null | grep init
The syntax in this snippet will send the stderr output of "ps aux" to "grep init", while the stdout output is sent to /dev/null and therefore discarded. Note that "grep init" will probably not find anything in this case as "ps aux" is unlikely to report any errors.
When looking more closely to the snippet above, a problem arises. As bash reads the command statements from the left to the right, nothing should go through the pipe, you might say. At the moment that "2>&1" is specified, stdout should still point to the terminal, shouldn't it? Well, here's a thing you should remember: bash reads command statements from the left to the right, but, before that, determines if there are multiple command statements and in which way they are separated. Therefore, bash already read and applied the "|" pipe symbol and stdout is already pointing to the pipe. Note that this also means that stream redirections must be specified before the pipe operator. If you, for example, would move "2>&1" to the end of the command, after "grep init", it would not affect ps aux anymore.
We can also swap the stdout and the stderr stream. This allows to let the stderr stream pass through a pipe while the stdout is printed to the terminal. This will require a 3rd stream. Let's have a look at this example:
[rechosen@localhost ~]$ ps aux 3>&1 1>&2 2>&3 | grep init
That stuff seems to be quite complicated, right? Let's analyze what we're doing here:
So, by using a backup stream, we can swap the stdout and stderr stream. This backup stream does not belong to the standard streams, but it is pretty much always available in bash. If you're using it in a script, make sure you aren't breaking an earlier command by playing with the 3rd stream. You can also use stream 4, 5, 6 7 and so on if you need more streams. The highest stream number usually is 1023 (there are 1024 streams, but the first stream is stream 0, stdin). This may be different on other linux systems. Your mileage may vary. If you try to use a non-existing stream, you will get an error like this:
bash: 1: Bad file descriptor
If you want to return a non-standard stream to it's default state, redirect it to "&-", like this:
[rechosen@localhost ~]$ ps aux 3>&1 1>&2 2>&3 3>&- | grep init
Note that the stream redirections are always reset to their initial state if you're using them in a command. You'll only need to do this manually if you made a redirect using, for example, exec, as redirects made this way will last until changed manually.
Well, I hope you learned a lot from this tutorial. If the things you read here were new for you, don't worry if you can't apply them immediately. It already is useful if you just know what a statement means if you stumble upon it sometime. If you liked this, please help spreading the word about this blog by posting a link to it here and there. Thank you for reading, and good luck working with bash!