Awk, Sed, and Tee
awk
awk is a powerful text manipulation tool used for pattern scanning and processing.
awk 'pattern { action }' input-file
Where:
pattern: A regular expression or condition that specifies which lines to match.
action: A series of commands to execute on each matched line.
input-file: The file to process. If omitted, awk reads from standard input.
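As a small illustration of the pattern { action } form, the sketch below feeds three made-up lines to awk on standard input (no input-file is given), so only the line matching the pattern triggers the action. The sample lines and the variable name result are assumptions for illustration.

```shell
# Feed sample lines on standard input; no input-file is given.
# Pattern: /error/ selects lines containing "error".
# Action: print the first field of each selected line.
result=$(printf '%s\n' 'ok start' 'error disk' 'ok end' |
  awk '/error/ { print $1 }')
echo "$result"
```

Lines that do not match the pattern are skipped silently, which is why the two "ok" lines produce no output.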
Common Options
-
-F: Specifies the field separator.
awk -F':' 'pattern { action }' input-file
-
-v: Assigns a variable.
awk -v var=value 'pattern { action }' input-file
Field Variables
$0: The entire current line.
$1, $2, ...: The first, second, and subsequent fields of the current line.
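The options and field variables above can be combined in one command. The following is a minimal sketch, assuming hypothetical passwd-style records (name:uid:shell) and an illustrative awk variable called label; neither comes from the original text.

```shell
# Hypothetical passwd-style records: name:uid:shell
records='alice:1001:/bin/bash
bob:1002:/bin/zsh'

# -F':' splits each line on colons; -v sets the awk variable "label"
# before the program runs. $1 and $3 are the first and third fields.
fields=$(printf '%s\n' "$records" |
  awk -F':' -v label='user' '{ print label "=" $1, "shell=" $3 }')
echo "$fields"
```

Adjacent strings in awk are concatenated, while the comma in print inserts the output field separator (a space by default).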
Print Specific Fields
-
Print the first and second fields of each line:
awk '{ print $1, $2 }' file.txt
-
Print the first and second fields, using a comma as the field separator:
awk -F',' '{ print $1, $2 }' file.csv
Filtering and Patterns
-
Print lines containing a specific pattern:
awk '/pattern/ { print }' file.txt
-
Print lines where the second field is greater than 100:
awk '$2 > 100 { print }' file.txt
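To see how a regular-expression pattern differs from a comparison pattern, the sketch below runs both against the same made-up data. The sample data and variable names are assumptions for illustration.

```shell
# Hypothetical sales data: item name, then quantity.
sales='apple 150
pear 180
apricot 20'

# Regex pattern: lines whose text contains "ap".
matched=$(printf '%s\n' "$sales" | awk '/ap/ { print $1 }')

# Comparison pattern: lines whose second field exceeds 100.
large=$(printf '%s\n' "$sales" | awk '$2 > 100 { print $1 }')

echo "$matched"
echo "$large"
```

The regex tests the whole line, while the comparison tests one field as a number, so the two patterns select different lines here.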
Built-in Functions
-
Print the length of the third field:
awk '{ print length($3) }' file.txt
-
Calculate the sum of the second field:
awk '{ sum += $2 } END { print sum }' file.txt
-
Convert the first field to uppercase:
awk '{ print toupper($1) }' file.txt
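The three built-in operations above can be tried on inline data instead of file.txt. This is a sketch with made-up rows; the variable names are illustrative only.

```shell
# Hypothetical rows: name, number, colour.
data='alice 10 red
bob 32 green'

# length() returns the number of characters in its argument.
lengths=$(printf '%s\n' "$data" | awk '{ print length($3) }')
# sum accumulates across lines; END runs once after the last line.
total=$(printf '%s\n' "$data" | awk '{ sum += $2 } END { print sum }')
# toupper() returns an uppercase copy of its argument.
upper=$(printf '%s\n' "$data" | awk '{ print toupper($1) }')

echo "$lengths"
echo "$total"
echo "$upper"
```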
Formatting Output
-
Print fields with formatted output:
awk '{ printf "%-10s %-10s\n", $1, $2 }' file.txt
-
Add headers to the output:
awk 'BEGIN { print "Name\tAge" } { print $1 "\t" $2 }' file.txt
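A header and aligned columns can also be produced with printf alone. The sketch below assumes two made-up records; the column widths (10 and 3) are arbitrary choices for illustration.

```shell
# Hypothetical records: name, then age.
people='Ann 31
Roberto 7'

# BEGIN runs once before any input, so it is a natural place for a header.
# %-10s left-justifies a string in 10 columns; %3s and %3d right-justify
# a string or an integer in 3 columns.
table=$(printf '%s\n' "$people" |
  awk 'BEGIN { printf "%-10s %3s\n", "Name", "Age" }
       { printf "%-10s %3d\n", $1, $2 }')
echo "$table"
```

Because the header and the rows use the same widths, every output line has the same length and the columns line up.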
Advanced Examples
-
Calculate and print the average of the second field:
awk '{ sum += $2; count++ } END { if (count > 0) print sum / count }' file.txt
-
Extract specific columns from a CSV file and save to another file:
awk -F',' '{ print $1, $3, $5 }' input.csv > output.txt
-
Print lines where the first field is "John" and the second field is greater than 50:
awk '$1 == "John" && $2 > 50 { print }' file.txt
-
Pass an external variable to awk:
threshold=100
awk -v threshold="$threshold" '$2 > threshold { print }' file.txt
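The averaging and external-variable examples can be checked against a small inline data set. This is a sketch assuming made-up name and score pairs; the threshold of 50 is arbitrary.

```shell
# Hypothetical scores: name, then score.
scores='John 60
Mary 40
John 50'

# Average of the second field; the count > 0 guard avoids division by zero
# on empty input.
average=$(printf '%s\n' "$scores" |
  awk '{ sum += $2; count++ } END { if (count > 0) print sum / count }')

# The shell variable is passed in with -v and used inside a compound condition.
threshold=50
above=$(printf '%s\n' "$scores" |
  awk -v threshold="$threshold" '$1 == "John" && $2 > threshold { print $2 }')

echo "$average"
echo "$above"
```

Passing the value with -v keeps the awk program in single quotes, which avoids mixing shell expansion into the awk source.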
Combining awk with other commands
-
Use awk to process the output of another command:
ls -l | awk '{ print $9, $5 }'
-
Find and process files with find and awk:
find /path -type f -name "*.log" -exec awk '/ERROR/ { print FILENAME, $0 }' {} +
Sed
sed (stream editor) is a powerful Unix utility for parsing and transforming text in a data stream (typically a file or input from a pipeline).
sed [options] 'script' input-file
Where:
options: Command-line options to control the behavior of sed.
script: A sequence of editing commands.
input-file: The file to process. If omitted, sed reads from standard input.
Common Options
-e script: Adds the script to the commands to be executed.
-f script-file: Adds the contents of script-file to the commands to be executed.
-n: Suppresses automatic printing of the pattern space.
-i[SUFFIX]: Edits files in place (optionally creating a backup with the given suffix).
Basic Commands
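-
s/pattern/replacement/flags: Substitutes replacement for the first match of pattern on each line. Common flags:
g: Replace every match on the line, not just the first.
p: Print the line if a substitution was made (usually combined with -n).
I (or i): Ignore case when matching (a GNU sed extension).
N (a number): Replace only the Nth match on the line.
-
d: Deletes the lines selected by the preceding address or pattern.
-
p: Prints the lines selected by the preceding address or pattern.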
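The effect of the substitution flags is easiest to see on a line with several matches. The following sketch uses a made-up input line; the variable names are illustrative.

```shell
line='one apple, two apple, three apple'

first=$(printf '%s\n' "$line" | sed 's/apple/pear/')     # first match only
all=$(printf '%s\n' "$line" | sed 's/apple/pear/g')      # every match
second=$(printf '%s\n' "$line" | sed 's/apple/pear/2')   # second match only

# With -n, the p flag prints only lines where a substitution happened.
changed=$(printf '%s\n' 'apple' 'plum' | sed -n 's/apple/pear/p')

echo "$first"
echo "$all"
echo "$second"
echo "$changed"
```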
Substitution
-
Replace the first occurrence of "apple" with "orange" in each line:
sed 's/apple/orange/' file.txt
-
Replace all occurrences of "apple" with "orange" in each line:
sed 's/apple/orange/g' file.txt
-
Replace "apple" with "orange" only on lines containing "fruit":
sed '/fruit/s/apple/orange/' file.txt
-
Replace "apple" with "orange" ignoring case:
sed 's/apple/orange/I' file.txt
Deletion
-
Delete lines containing "apple":
sed '/apple/d' file.txt
-
Delete the first line of the file:
sed '1d' file.txt
-
Delete lines from 2 to 4:
sed '2,4d' file.txt
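The three deletion forms differ only in the address placed before d: a regular expression, a single line number, or a line range. A minimal sketch on five made-up lines:

```shell
lines=$(printf '%s\n' 'header' 'apple pie' 'plain' 'apple tart' 'footer')

no_apple=$(printf '%s\n' "$lines" | sed '/apple/d')   # drop lines matching /apple/
no_first=$(printf '%s\n' "$lines" | sed '1d')         # drop line 1
no_range=$(printf '%s\n' "$lines" | sed '2,4d')       # drop lines 2 through 4

echo "$no_apple"
echo "$no_first"
echo "$no_range"
```

None of these commands modifies its input; the result is written to standard output.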
Printing
-
Print lines containing "apple":
sed -n '/apple/p' file.txt
-
Print the first line of the file:
sed -n '1p' file.txt
-
Print lines from 2 to 4:
sed -n '2,4p' file.txt
In-place Editing
-
Replace "apple" with "orange" in-place (modify the original file):
sed -i 's/apple/orange/g' file.txt
-
Replace "apple" with "orange" in-place, creating a backup with a .bak extension:
sed -i.bak 's/apple/orange/g' file.txt
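The backup behaviour can be verified in a scratch directory. This sketch assumes a temporary directory created with mktemp and a made-up two-line file. Note that -i is not part of POSIX sed; the attached-suffix form used here is accepted by both GNU and BSD sed, whereas a bare -i behaves differently between them.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'apple apple' 'banana' > "$workdir/file.txt"

# -i.bak edits file.txt in place and keeps the original as file.txt.bak.
sed -i.bak 's/apple/orange/g' "$workdir/file.txt"

edited=$(cat "$workdir/file.txt")
backup=$(cat "$workdir/file.txt.bak")
echo "$edited"
echo "$backup"
rm -r "$workdir"
```

Keeping a backup is a cheap safeguard, since an in-place edit with a wrong pattern cannot otherwise be undone.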
Multiple Commands
-
Delete lines 1 to 3 and replace "apple" with "orange":
sed '1,3d; s/apple/orange/g' file.txt
-
Replace "apple" with "orange" and "banana" with "grape":
sed 's/apple/orange/g; s/banana/grape/g' file.txt
Using a Script File
-
Create a sed script file (script.sed) with multiple commands:
# script.sed
1,3d
s/apple/orange/g
s/banana/grape/g
-
Run sed with the script file:
sed -f script.sed file.txt
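The script-file workflow can be reproduced end to end in a scratch directory. The sketch below writes the same three commands to script.sed and runs them against a made-up five-line file; the temporary directory and file contents are assumptions.

```shell
workdir=$(mktemp -d)

# Write the three commands into a script file, one per line.
cat > "$workdir/script.sed" <<'EOF'
1,3d
s/apple/orange/g
s/banana/grape/g
EOF

printf '%s\n' 'l1' 'l2' 'l3' 'apple banana' 'apple' > "$workdir/file.txt"

# Lines 1-3 are deleted; both substitutions apply to the remaining lines.
scripted=$(sed -f "$workdir/script.sed" "$workdir/file.txt")
echo "$scripted"
rm -r "$workdir"
```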
Extracting Data
-
Extract lines between patterns:
sed -n '/START/,/END/p' file.txt
-
Extract the first 10 lines of a file:
sed -n '1,10p' file.txt
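A pattern range is inclusive, so the START and END lines themselves are printed. The sketch below shows that on made-up input, followed by one possible way to drop the marker lines; the second command is an illustrative refinement, not part of the original examples.

```shell
log=$(printf '%s\n' 'noise' 'START' 'payload 1' 'payload 2' 'END' 'more noise')

# Inclusive range: the START and END marker lines are printed too.
block=$(printf '%s\n' "$log" | sed -n '/START/,/END/p')

# Hypothetical refinement: inside the range, delete the marker lines
# and print whatever is left.
inner=$(printf '%s\n' "$log" | sed -n '/START/,/END/{ /START/d; /END/d; p; }')

echo "$block"
echo "$inner"
```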
Combining sed with Other Commands
-
Use sed in a pipeline to process the output of another command:
ls -l | sed 's/^/Line: /'
-
Find and replace text in multiple files:
find /path -type f -name "*.txt" -exec sed -i 's/apple/orange/g' {} +
Another Example
Suppose we want to remove all HTML tags from file.txt. Here is a sample file.txt:
<html>
<p>
<>
hello world
hello galaxy
hello universe
We need to match everything between a pair of angle brackets:
<...>
Then we replace each match with nothing, delete any lines left empty, and save the changes in place. The g flag removes every tag on a line rather than only the first:
[root@tst-rhel]# sed -i 's/<[^>]*>//g; /^$/d' file.txt
[root@tst-rhel]# cat file.txt
hello world
hello galaxy
hello universe
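When a line holds more than one tag, the g flag decides whether all of them are removed. A minimal sketch on a made-up line, writing to standard output rather than editing a file:

```shell
html='<p>hello <b>world</b></p>'

# Without g, only the first tag on the line is removed.
first_only=$(printf '%s\n' "$html" | sed 's/<[^>]*>//')
# With g, every tag on the line is removed.
all_tags=$(printf '%s\n' "$html" | sed 's/<[^>]*>//g')

echo "$first_only"
echo "$all_tags"
```

This line-by-line approach suits simple input like the sample; a tag that spans several lines would not be matched by this expression.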
Tee
tee reads from standard input and writes to standard output and one or more files simultaneously. This is useful for logging or capturing command output while still displaying it on the terminal.
command | tee [options] file(s)
Where:
command: The command whose output you want to capture.
file(s): One or more files to which the output should be written.
Common Options
-a, --append: Append the output to the files, rather than overwriting them.
-i, --ignore-interrupts: Ignore interrupt signals.
Basic Usage
-
Write the output of ls to a file:
ls | tee output.txt
This command lists the contents of the current directory and writes the output to output.txt while also displaying it on the terminal.
-
Write the output of ls to multiple files:
ls | tee output1.txt output2.txt
This command lists the directory contents and writes the output to both output1.txt and output2.txt.
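That tee really duplicates its input can be confirmed by comparing what it prints with what it writes. The sketch below replaces ls with two fixed lines so the result is predictable; the temporary directory and file names are assumptions.

```shell
workdir=$(mktemp -d)

# tee copies standard input to standard output and to each named file.
shown=$(printf '%s\n' 'a.txt' 'b.txt' | tee "$workdir/out1.txt" "$workdir/out2.txt")

copy1=$(cat "$workdir/out1.txt")
copy2=$(cat "$workdir/out2.txt")
echo "$shown"
rm -r "$workdir"
```

Here shown, copy1, and copy2 all hold the same two lines.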
Examples
-
Append the output of ls to a file:
ls | tee -a output.txt
-
Run the command while ignoring interrupt signals (e.g., Ctrl+C):
command | tee -i output.txt
-
Capture both standard output and standard error:
command 2>&1 | tee output.txt
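The difference between overwriting and appending, and the effect of 2>&1, can be checked with a short sequence. This is a sketch using a temporary log file and made-up messages; terminal output is discarded with > /dev/null so only the file is inspected.

```shell
workdir=$(mktemp -d)
log="$workdir/output.txt"

echo 'first run' | tee "$log" > /dev/null       # creates or truncates the file
echo 'second run' | tee -a "$log" > /dev/null   # appends instead of truncating

# 2>&1 before the pipe merges stderr into stdout, so tee sees both streams.
{ echo 'to stdout'; echo 'to stderr' >&2; } 2>&1 | tee -a "$log" > /dev/null

logged=$(cat "$log")
line_count=$(printf '%s\n' "$logged" | wc -l | tr -d ' ')
echo "$logged"
rm -r "$workdir"
```

Without 2>&1, the "to stderr" line would go straight to the terminal and never reach the log file.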
Combining tee with Other Commands
-
Capture the output of a pipeline:
ps aux | grep httpd | tee processes.txt
This command captures the output of ps aux | grep httpd and writes it to processes.txt.
-
Use tee in a script to log output:
#!/bin/bash
echo "Starting script..."
command1 | tee command1.log
command2 | tee command2.log
echo "Script completed."
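One caveat when logging with tee in a script: by default a pipeline's exit status is that of its last command, so a failing command is masked by a successful tee. The sketch below, which assumes bash is installed and uses false as a stand-in for a failing command, shows how the pipefail option changes that.

```shell
# Default behaviour: the pipeline reports tee's status (0), hiding the failure.
status_default=$(bash -c 'false | tee /dev/null; echo $?')

# With pipefail, the pipeline reports the failing command's non-zero status.
status_pipefail=$(bash -c 'set -o pipefail; false | tee /dev/null; echo $?')

echo "default: $status_default, pipefail: $status_pipefail"
```

Adding set -o pipefail near the top of a bash script is one way to keep error handling working when output is piped through tee.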
Practical Use Cases
-
Continuously monitor the system log and write the output to syslog_monitor.log while displaying it on the terminal:
tail -f /var/log/syslog | tee syslog_monitor.log
-
Copy a configuration file and log the backup operation to backup.log. cp prints nothing by default, so -v is needed for there to be any output to log:
cp -v /etc/config.conf /etc/config.conf.bak 2>&1 | tee backup.log