In Bash Baby Steps, we learned the difference between executing commands on the command line for one-time use and collecting those commands into an executable script for repeated use. We learned how to make a script executable and how to adjust our path so that we can execute the script from anywhere. Today we'll learn about pipes and redirection.
Redirecting Input/Output. In UNIX, there are three terms you should know: standard input, standard output, and standard error. Standard input is data a program needs to run, such as an input file, a data file, or information it asks you to enter at the command line. Standard output is what the program says back to you, either when asking for more information or in telling you what you wanted to know in the first place. Standard error is how the program tells you about any problems it thinks you should know about. These three types of information are regarded by Bash (and all UNIX programs) as Type 0 (standard input), Type 1 (standard output), and Type 2 (standard error), respectively. (Actually, these numbers are file descriptors, a low-level UNIX I/O concept we don't need to get into now. Instead, just think of these three types of information flowing through a program.)
Example:
$ ls -la ~/bin > myscripts.txt
This command lists all files in your $HOME/bin directory (where you would often keep your personal scripts). But instead of listing them on the display, the > redirection symbol redirects the output into the myscripts.txt file, which preserves the output and allows you to print it or email it or do something creative with it. This is a simple example of output redirection.
(Note: The default behavior of the shell is to overwrite any file that output is redirected to. So, if myscripts.txt already exists, it will be overwritten, or ``clobbered,'' as it is known in the UNIX world. You can, however, turn off clobbering by issuing the set -C command.)
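To see the note above in action, here is a small sketch of clobbering with and without set -C. The scratch directory and the file name demo.txt are made up for this illustration; the >| operator, which overrides set -C for a single command, is standard shell syntax that the note does not mention.

```shell
# Sketch: how "set -C" (noclobber) changes the behavior of ">".
# The scratch directory and demo.txt are hypothetical names for this example.
scratch=$(mktemp -d)
demo="$scratch/demo.txt"

echo "first" > "$demo"         # creates the file
echo "second" > "$demo"        # default behavior: silently overwrites it
before=$(cat "$demo")

set -C                         # same as: set -o noclobber
if ( echo "third" > "$demo" ) 2>/dev/null; then
    blocked=no
else
    blocked=yes                # the shell refused to overwrite the file
fi
echo "fourth" >| "$demo"       # ">|" overrides noclobber for this one command
after=$(cat "$demo")
set +C                         # back to the default behavior

echo "before=$before blocked=$blocked after=$after"
rm -rf "$scratch"
```

With clobbering turned off, an accidental > onto an existing file fails instead of destroying it, and you have to ask for the overwrite explicitly.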
You can also append standard output to a file so as to preserve
previous standard output. Example:
$ du -sc >> filesystem.txt
Now, every time you execute this line, filesystem.txt will receive
another dump of information about your filesystem size. The file
will keep a running log of changes in the size of the filesystem.
But what if another filesystem that is mounted under the current
directory becomes unmounted or inaccessible? There will probably be
an error, but because >> redirects only standard output, the message
will appear on the display instead of being saved anywhere. We can
fix this.
We can tell the shell to redirect standard error
into a different file:
$ du -sc >> filesystem.txt 2>duerrors.txt
Remember Type 0, Type 1, and Type 2? Here we're telling Bash to
take Type 2 output and direct it to the file duerrors.txt, if there
are any errors. (Type 1 output is assumed, so it doesn't need to be
specified explicitly, unless you are directing some other output to
it, as the example below illustrates.)
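The following sketch shows the two streams landing in separate files. Since du's errors depend on your system, it uses a hypothetical shell function named report as a stand-in for any program that writes to both streams; the file names are also made up.

```shell
# "report" is a hypothetical stand-in for a program that writes one line
# to standard output and one line to standard error.
report() {
    echo "42 total"                  # Type 1: standard output
    echo "cannot read /mnt" >&2      # Type 2: standard error
}

scratch=$(mktemp -d)

# Run it twice: ">>" appends standard output, "2>" overwrites the error file.
report >> "$scratch/out.txt" 2> "$scratch/err.txt"
report >> "$scratch/out.txt" 2> "$scratch/err.txt"

out_lines=$(grep -c "total" "$scratch/out.txt")    # both runs were kept
err_lines=$(grep -c "cannot" "$scratch/err.txt")   # only the last run is kept

echo "output lines: $out_lines, error lines: $err_lines"
rm -rf "$scratch"
```

If you want a running log of errors as well, 2>> appends standard error in the same way that >> appends standard output.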
Sometimes standard error has already been directed someplace other than the display, for example by an enclosing script. But you might wish to see just what the errors are, so you want standard error to go wherever standard output is going, typically the display, where you can see how a program is proceeding. To do this, you would type:
$ programname 2>&1
Typing 2>&1 tells the shell to take Type 2 output and send it to the same place Type 1 output is currently going. In other words, standard error joins standard output: on the display by default, or in a file or pipe if standard output has been redirected there. Mind you, this only affects the standard error stream itself; a program that writes its own error log, like /tmp/programname.err, will keep doing so.
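One detail worth trying for yourself is that the shell processes redirections from left to right, so the position of 2>&1 on the line matters. This is a sketch only; report is again a hypothetical function standing in for a real program, and the file names are invented.

```shell
# "report" is a hypothetical program that writes to both streams.
report() {
    echo "normal output"
    echo "error output" >&2
}

scratch=$(mktemp -d)

# Standard output goes to the file first, then standard error follows it there.
report > "$scratch/both.txt" 2>&1
both_lines=$(grep -c "output" "$scratch/both.txt")

# Reversed order: standard error is pointed at standard output's current
# destination (here, the command substitution), and only then is standard
# output moved to the file.
shown=$(report 2>&1 > "$scratch/only_out.txt")
saved=$(cat "$scratch/only_out.txt")

echo "both.txt holds $both_lines lines; still shown: $shown; saved: $saved"
rm -rf "$scratch"
```

So to capture everything in one file, write the file redirection first and 2>&1 after it.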
Other times, you may wish to dispense with all standard error from a program, because its worth is negligible, or it occurs from within a script where the errors are ignored. You could do this:
$ programname 2>/dev/null
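As a quick illustration, the sketch below lists one file that exists and one that doesn't (both names are made up) and throws the complaint away. Note that discarding the error message does not discard the exit status, so a script can still tell that something went wrong.

```shell
scratch=$(mktemp -d)
touch "$scratch/present.txt"           # hypothetical file that exists

# List one existing and one missing file; the error message for the
# missing one is sent to /dev/null and never seen.
status=0
listing=$(ls "$scratch/present.txt" "$scratch/missing.txt" 2>/dev/null) || status=$?

echo "listed: ${listing##*/}"
echo "exit status was $status"         # non-zero, because one file was missing
rm -rf "$scratch"
```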
Sometimes you want your program to read its input from a file or
from another program. You can tell it to do so with a
redirector. Let's say the administrator wants to mail himself a
copy of a logfile each hour. He can instruct the cron daemon
with a command like
mail -s "your log" root@localhost < logfile
Here, the mail program is sending a mail to
root@localhost consisting of the contents of logfile. The -s switch means ``Subject''. The mail program will set the
``Subject'' header to ``your log''.
You can test this redirection out thusly:
$ cat < textfile1 > textfile2
This is essentially like saying cp textfile1 textfile2. You're saying take the contents of textfile1 and redirect its contents to textfile2. Again, watch out for ``clobbering'' files whose contents you want to preserve.
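Here is the same test as a small script you can run safely, since it works in a scratch directory. The file contents are invented for the example. It also shows tr, a program that reads only from standard input, which is where the < redirector really earns its keep.

```shell
scratch=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$scratch/textfile1"

# Copy by redirection: cat reads textfile1 on standard input
# and writes it to textfile2 on standard output.
cat < "$scratch/textfile1" > "$scratch/textfile2"

if cmp -s "$scratch/textfile1" "$scratch/textfile2"; then
    same=yes
else
    same=no
fi

# tr takes no file name argument, so input redirection is how it reads a file.
upper=$(tr 'a-z' 'A-Z' < "$scratch/textfile1")

echo "identical copies: $same"
echo "$upper"
rm -rf "$scratch"
```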
Pipes. UNIX philosophy urges the use of small yet highly
focused programs that can be used together to perform complex tasks.
So, learning how to string together a number of small commands on the
commandline is an intrinsic part of being comfortable with Bash.
To do this, we direct the standard output of one program into the
standard input of another program, but instead of redirectors
we use the pipe operator, ``|''. In UNIX, processes
connected by pipes run together dynamically as data flows between
them.
Example:
$ ps aml | sort -r -k 7 | less
In the above example, the ``process status'' command (ps) lists all processes with memory information in long format, and standard output from this command is fed as standard input into the sort command, where it is reverse sorted (-r) on the seventh column (-k 7), and then is ``piped'' into less for easy viewing by the user. (Substitute more if less is not on your system.)
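To connect pipes back to the redirectors from earlier, this sketch does the same job twice: once with two redirections and a temporary file, and once with a pipe. The fruit names and the file name are made up for the example.

```shell
scratch=$(mktemp -d)

# Two steps with redirectors and a temporary file in between.
printf 'pear\napple\nfig\n' > "$scratch/unsorted.txt"
via_file=$(sort < "$scratch/unsorted.txt")

# One step with a pipe: no temporary file needed.
via_pipe=$(printf 'pear\napple\nfig\n' | sort)

if [ "$via_file" = "$via_pipe" ]; then
    echo "same result either way:"
    echo "$via_pipe"
fi
rm -rf "$scratch"
```

The pipe saves you from creating, naming, and cleaning up the file in the middle.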
There are many examples of practical ways we can use pipes every day:
$ ps axl | grep zombie
Helps you find zombie processes.
$ ls -al | sort -n -r -k 5 | head -n 10
Prints the 10 largest files, sorted by size, in your current
directory. (The -n switch makes sort compare the sizes as numbers
rather than as text.)
$ locate agick | grep -E Image\|image | more
Finds ImageMagick on your system when you don't know how it is
capitalized, if at all. Also, note how you must escape the pipe
when using extended regular expressions so the shell doesn't
interpret it prematurely. (More about escaping special symbols in
another article about regular expressions!)
$ ps axl | grep FLAGS ; ps axl | grep xv
Gives a long listing of all Xview processes, with a nice header
printed on top. You could replace Xview with whatever processes you
are interested in. Notice how you can execute two processes
sequentially (not by piping them) by joining them with ``;'' on the
command line.
$ ls -C | more
Lets you view large directories in a neat and tidy format, page by
page.
$ ls -ps | grep / | pr -3 | more
Lets you view directory names and their sizes (in kilobytes) in the
current directory, all formatted into columns.
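If you'd like to experiment with a sort-and-head pipeline like the ones above without depending on what is in your own directories, here is a sketch that feeds invented ``name size'' records through the same kind of pipeline. It also shows why the choice of sort switches matters: without -n, sort compares the sizes as text.

```shell
# Made-up "name size" records standing in for real ls output.
records='a.txt 120
b.txt 9
c.txt 3000
d.txt 45'

# Numeric reverse sort on the second column, then keep the top two.
largest=$(printf '%s\n' "$records" | sort -n -r -k 2 | head -n 2)

# Same pipeline without -n: sizes are compared as text, so 9 beats 3000.
as_text=$(printf '%s\n' "$records" | sort -r -k 2 | head -n 1)

echo "largest two:"
echo "$largest"
echo "top entry when sorted as text: $as_text"
```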
These are just a few of the ways we can use pipes and redirectors in our daily life at the command prompt. Your imagination is the only limit!