Piping
Since the glue of POSIX utilities is their ability to shuffle text around, a basic skill is moving text from a stream to a file and back again.
Basic redirects
The characters >, >>, and < all let you redirect a program’s streams: > and >> send its output to a file, and < feeds a file to its stdin.
cat file1 > file2 is a really silly way of writing cp file1 file2. More usefully, you can use > to save the output of commands for use later. (This includes commands written in the Slurm command wrappers on clusters.)
>> does the same thing, except it lets you append rather than overwrite.
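The three basic redirects can be tried safely in a scratch directory. This is a minimal sketch; the file name log.txt and the variable names are placeholders chosen for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# > creates the file, or truncates it if it already exists.
echo "first run" > "$workdir/log.txt"
echo "second run" > "$workdir/log.txt"   # replaces "first run"

# >> appends to the end of the file instead of truncating it.
echo "third run" >> "$workdir/log.txt"

# < connects the file to stdin; wc -l counts the lines it reads there.
line_count=$(wc -l < "$workdir/log.txt")
contents=$(cat "$workdir/log.txt")

echo "$contents"
echo "lines: $line_count"

rm -rf "$workdir"
```

After this runs, the file has held only the second and third lines, because the second > discarded the first.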
stdout and stderr
In console terms, stdout is file descriptor 1 and stderr is 2. (stdin is 0.)
Any redirect can redirect one or both of the streams.
A very common use of this is when you want to grab both stdout and stderr and send them to a file using 2>&1 at the end of a command:
python myscript.py > output_log 2>&1
would run your Python script and send both stdout and stderr to the same stream, which then gets written to the file output_log.
The logic of redirects is fairly complicated, but the shell processes them left to right:
1) Point stdout (1) at output_log.
2) Point stderr (2) at whatever 1 currently points to (output_log).
This pointer logic (as always) causes oddities:
python myscript.py 2>&1 > output_log
You would think this would be functionally equivalent, but it is not. The logic is instead:
1) Point 2 at whatever 1 currently points to (usually the terminal).
2) Point 1 at output_log (but NOT 2, since it’s still pointing at where 1 used to point!)
So you’ve just written a really complicated python myscript.py > output_log.
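The difference between the two orderings can be checked directly. In this sketch, the emit function is a hypothetical stand-in for myscript.py that writes one line to each stream, and the file names are placeholders. The second command is wrapped in a group whose stdout goes to stray.log, so the stderr line that would normally land on the terminal can be inspected.

```shell
# Hypothetical stand-in for myscript.py: one line to stdout, one to stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

workdir=$(mktemp -d)

# Order A: 1 -> both.log first, then 2 -> wherever 1 points (both.log).
emit > "$workdir/both.log" 2>&1

# Order B: 2 -> wherever 1 points right now (stray.log, via the group),
# then 1 -> stdout.log. Only stdout reaches stdout.log.
{ emit 2>&1 > "$workdir/stdout.log"; } > "$workdir/stray.log"

both_log=$(cat "$workdir/both.log")
stdout_log=$(cat "$workdir/stdout.log")
stray_log=$(cat "$workdir/stray.log")

echo "both.log:   $both_log"
echo "stdout.log: $stdout_log"
echo "stray.log:  $stray_log"

rm -rf "$workdir"
```

With order A both lines end up in one file; with order B the stderr line escapes to wherever stdout pointed before the file redirect was applied.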
You can also redirect any of the streams to a temporary placeholder descriptor, numbered 3 through 9, and then redirect that.
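One use for a placeholder descriptor is swapping stdout and stderr so that only the error stream gets captured. The sketch below is one common idiom rather than the only way to do it; emit is again a hypothetical command that writes to both streams.

```shell
# Hypothetical command that writes one line to each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Read left to right:
#   3>&1  park the current stdout on descriptor 3
#   1>&2  point stdout at the current stderr
#   2>&3  point stderr at the parked stdout
#   3>&-  close the placeholder again
# Command substitution captures only stdout, which after the swap
# carries the stderr line. The original stdout line now travels on
# stderr and is discarded by the 2>/dev/null on the group.
captured=$( { emit 3>&1 1>&2 2>&3 3>&-; } 2>/dev/null )

echo "captured: $captured"
```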
Bash also includes shortcuts for much of this (e.g., python myscript.py &> output_log is equivalent to python myscript.py > output_log 2>&1).
See the [manual](http://www.gnu.org/software/bash/manual/bash.html#Redirections) and also a very illuminating answer (and argument) on [StackOverflow](https://stackoverflow.com/questions/2342826/how-to-pipe-stderr-and-not-stdout).
Piping |
The pipe | operator takes the stdout of a command (|& for both stdout and stderr) and passes it to the stdin of the next command. If a program can read from stdin (most can), you can use this to chain inputs and outputs to an absolutely hilarious degree.
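A small sketch of both pipe forms and a longer chain follows. The emit function and the sample words are made up for illustration. The portable spelling 2>&1 | is used here; in Bash, |& is shorthand for it.

```shell
# Hypothetical command that writes one line to each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Plain pipe: only stdout reaches tr. stderr bypasses the pipe
# (and is discarded here so it does not clutter the terminal).
stdout_only=$( { emit | tr 'a-z' 'A-Z'; } 2>/dev/null )

# 2>&1 | sends both streams down the pipe (Bash shorthand: |&).
both=$(emit 2>&1 | tr 'a-z' 'A-Z')

# A longer chain: find the most frequent word in a list.
top_word=$(printf 'pear\napple\npear\n' \
    | sort | uniq -c | sort -nr | head -n 1 \
    | awk '{print $2}')

echo "stdout only: $stdout_only"
echo "both:        $both"
echo "top word:    $top_word"
```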
Here’s a one-liner I used recently:
find . -mindepth 1 -maxdepth 1 \
-type d ! -name media ! -name "log*" ! -name "font*" \
-printf "%T@\t%Tc\t%p\n" \
| sort -nr | sed -e "1,5d" \
| awk -F $'\t' '{print $3}' \
| xargs rm -rf
This uses the find command to search exactly one level below the current directory . (-mindepth 1, -maxdepth 1), finding only directories (-type d) that are not named media and whose names do not start with log or font, and prints each one’s last-modified timestamp and name. sort orders them most recent first, sed drops the first five lines (so the five most recent directories are kept), awk strips each line down to just the path, and finally xargs passes those lines as an argument list to rm -rf.
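Before letting a pipeline like this delete anything, it can be rehearsed by putting echo in front of rm so that xargs only prints the command it would run. The sketch below assumes GNU find (for -printf) and GNU touch (for -d). The build-NN directory names and dates are invented for the rehearsal, and awk is given '\t' as its separator, which it interprets as a tab.

```shell
# Build a throwaway tree: seven dated directories plus three that
# the name filters should exclude. All names here are hypothetical.
workdir=$(mktemp -d)
for day in 01 02 03 04 05 06 07; do
    mkdir "$workdir/build-$day"
    touch -d "2024-01-$day 12:00" "$workdir/build-$day"
done
mkdir "$workdir/media" "$workdir/logs" "$workdir/fonts"

# Same pipeline as above, run inside the scratch directory, with
# "echo" in front of rm so nothing is actually deleted.
preview=$(cd "$workdir" && find . -mindepth 1 -maxdepth 1 \
    -type d ! -name media ! -name "log*" ! -name "font*" \
    -printf "%T@\t%Tc\t%p\n" \
    | sort -nr | sed -e "1,5d" \
    | awk -F '\t' '{print $3}' \
    | xargs echo rm -rf)

echo "$preview"

rm -rf "$workdir"
```

Only the two oldest build directories show up in the previewed command; the five newest are skipped by sed, and media, logs, and fonts never make it past find.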
This is a bit ridiculous, but it gives you some idea of what you can really do in Bash with the built-in utilities.