This module requires a sandbox to complete. A sandbox gives you access to free resources. Your personal subscription will not be charged. The sandbox may only be used to complete training on Microsoft Learn. Use for any other reason is prohibited, and may result in permanent loss of access to the sandbox.
Microsoft provides this lab experience and related content for educational purposes. All presented information is owned by Microsoft and intended solely for learning about the covered products and services in this Microsoft Learn module.
So far, you’ve been introduced to commands to inspect files, regular expressions, and the stream editor.
We also now know that commands such as cat and grep write their standard output to the terminal screen by default. In many cases, you’ll want to redirect that output to a file by using the > redirect operator, or send it to another program that transforms the data for further processing. In UNIX, we call these programs filters.
You might have started to guess by now that it’s possible to construct a pipeline of commands by stringing together multiple filters. To do so, we can use pipes. Pipes are a form of redirection, which allows us to send the output of one program (or filter) to another program for further processing.
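As a rough illustration of the difference between redirecting to a file and piping to another program, here is a minimal sketch. The file names fruit.txt and sorted.txt and the temporary directory are made up for this example and are not part of the module’s sample data.

```bash
# Hypothetical sample data, created in a temporary directory.
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/fruit.txt"

# Redirect: write the sorted output to a file instead of the terminal.
sort "$workdir/fruit.txt" > "$workdir/sorted.txt"

# Pipe: send the output of cat to sort, and the output of sort to head.
first=$(cat "$workdir/fruit.txt" | sort | head -n 1)
echo "$first"
```

Each program in the pipeline reads what the previous one wrote, so no intermediate file is needed.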
For example, you might want to merge the stdout and stderr streams and then display them on the terminal. You also might want to save the results to a file to examine build issues. You can take these actions by using pipes and filters.
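One possible way to do this is sketched below. The build function is a hypothetical stand-in for a real build step, and the use of 2>&1 and tee is an assumption of this sketch; the module text does not name the exact commands.

```bash
# Hypothetical stand-in for a build step:
# one line goes to stdout, one line goes to stderr.
build() {
    echo "compiling main.c"
    echo "warning: unused variable" >&2
}

# Temporary file that will hold the merged output.
log=$(mktemp)

# 2>&1 merges stderr into stdout; tee prints the merged stream
# on the terminal and also saves a copy in "$log".
build 2>&1 | tee "$log"

# Filter the saved log to count the lines that mention a warning.
warnings=$(grep -c 'warning' "$log")
echo "warnings: $warnings"
```

Saving the merged stream with tee means the same output can be watched live and filtered again later.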
Let’s see some examples of pipes and filters by using the commands you’ve already learned in the previous units.
- Here’s a simple pipe from the cat command to the nl command by using the NASA-software-API.txt file:
  cat NASA-software-API.txt | nl
- Count the number of lines in each .txt file in a directory, and sort the output in ascending order:
  wc -l *.txt | sort -n
  Notice how we’re using the wildcard *.txt to indicate all files ending in .txt.
- Now what about displaying only the first line of the preceding command’s output?
  wc -l *.txt | sort -n | head -n 1
- Number the lines in NASA-software-API.txt, keep the first five lines, and then sort them in reverse order:
  nl -s = NASA-software-API.txt | head -n 5 | sort -r
- Revisiting the example from the sed unit:
  - Use the grep command to find the first 10 lines that contain NASA in the NASA-logs-1995.txt file.
  - Sort the output in reverse order.
  - Replace NASA with National Aerospace Agency, printing only the lines where a substitution was made (replicating grep’s behavior).
  grep -m10 'NASA' NASA-logs-1995.txt | sort -r | sed -n 's/NASA/National Aerospace Agency/p'
As you can see in the preceding examples, we can build many different pipelines with just a few commands. This module briefly introduced you to different kinds of tools and commands. But this introduction should be enough to get you started with data wrangling in the UNIX shell.
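To try the same pipeline shapes without the NASA sample files, the following sketch runs them on small made-up inputs. The files a.txt, b.txt, and sample.log and their contents are hypothetical and exist only for this illustration.

```bash
# Hypothetical sample files in a temporary directory.
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/a.txt"
printf 'one\n' > "$workdir/b.txt"
printf 'NASA launch 1\nother line\nNASA launch 2\n' > "$workdir/sample.log"

# Count lines per .txt file, sort numerically, keep the smallest count.
smallest=$(wc -l "$workdir"/*.txt | sort -n | head -n 1)
echo "$smallest"

# Find lines containing NASA, sort them in reverse order, and replace
# NASA; sed -n with the p flag prints only the lines that were changed.
result=$(grep -m10 'NASA' "$workdir/sample.log" | sort -r \
    | sed -n 's/NASA/National Aerospace Agency/p')
echo "$result"
```

Because every stage reads standard input and writes standard output, stages can be added, removed, or reordered without touching the others.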
Summary
Some of the key takeaways from this module are:
- The cat command allows you to display, concatenate, and append files.
- Pipes (|) send the output of one command as input to another command.
- A filter takes input from one command, does some processing, and gives an output.
- Regular expressions (regex) are used for pattern matching and can be used by other commands and tools, such as sed and grep.
- The grep command can be used to find strings and values in a text file.
- Piping output through grep is one of its most common uses.
- The most common use of the stream editor (sed) is text substitution.