Awk, Sed, and Tee
awk
awk is a powerful text manipulation tool used for pattern scanning and processing.
awk 'pattern { action }' input-file
Where:
- pattern: A regular expression or condition that selects which lines to process.
- action: One or more statements to execute on each matched line.
- input-file: The file to process. If omitted, awk reads from standard input.
Common Options
- -F: Specifies the field separator.
  awk -F':' 'pattern { action }' input-file
- -v: Assigns a variable.
  awk -v var=value 'pattern { action }' input-file
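As a small illustration of both options used together, the sketch below feeds a few made-up, passwd-style records to awk. The sample data and the variable name min_uid are assumptions chosen for the example.

```shell
# Hypothetical colon-separated records: name:placeholder:uid
records='root:x:0
daemon:x:1
alice:x:1000'

# -F':' splits each line on colons; -v passes min_uid into the awk program.
regular_users=$(printf '%s\n' "$records" | awk -F':' -v min_uid=1000 '$3 >= min_uid { print $1 }')
printf '%s\n' "$regular_users"
```

Only the record whose third field is at least 1000 is printed.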
Field Variables
- $0: The entire current line.
- $1, $2, ...: The first, second, ... field in the current line.
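A minimal sketch of the field variables on made-up input. It also uses NF, the standard awk built-in holding the number of fields on the current line, which the list above does not mention.

```shell
# Hypothetical sample input: three whitespace-separated fields per line.
sample='alice 30 paris
bob 25 rome'

# $0 is the whole line, $1 and $3 are single fields, NF is the field count.
fields_out=$(printf '%s\n' "$sample" | awk '{ print NF ": " $1 " lives in " $3 " [" $0 "]" }')
printf '%s\n' "$fields_out"
# 3: alice lives in paris [alice 30 paris]
# 3: bob lives in rome [bob 25 rome]
```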
Print Specific Fields
- Print the first and second fields of each line:
  awk '{ print $1, $2 }' file.txt
- Print the first and second fields, using a comma as the field separator:
  awk -F',' '{ print $1, $2 }' file.csv
Filtering and Patterns
- Print lines containing a specific pattern:
  awk '/pattern/ { print }' file.txt
- Print lines where the second field is greater than 100:
  awk '$2 > 100 { print }' file.txt
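The two kinds of pattern can be tried on a few made-up records. The data below is purely illustrative.

```shell
# Hypothetical records: item name and amount.
sales='widget 150
gadget 80
widget 40
gizmo 200'

# A regular-expression pattern selects lines that contain "widget".
matched=$(printf '%s\n' "$sales" | awk '/widget/ { print }')

# A comparison pattern selects lines whose second field exceeds 100.
large=$(printf '%s\n' "$sales" | awk '$2 > 100 { print }')

printf 'regex match:\n%s\nfield comparison:\n%s\n' "$matched" "$large"
```

The first command keeps both widget lines regardless of amount; the second keeps "widget 150" and "gizmo 200".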
Built-in Functions
- Print the length of the third field:
  awk '{ print length($3) }' file.txt
- Calculate the sum of the second field:
  awk '{ sum += $2 } END { print sum }' file.txt
- Convert the first field to uppercase:
  awk '{ print toupper($1) }' file.txt
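A short sketch that runs the three commands above on made-up input, so the effect of each function is visible. The sample rows are an assumption.

```shell
# Hypothetical rows: name, number, colour.
rows='alice 10 red
bob 20 green'

lengths=$(printf '%s\n' "$rows" | awk '{ print length($3) }')          # 3 and 5
total=$(printf '%s\n' "$rows" | awk '{ sum += $2 } END { print sum }') # 30
upper=$(printf '%s\n' "$rows" | awk '{ print toupper($1) }')           # ALICE and BOB

printf '%s\n' "$lengths" "$total" "$upper"
```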
Formatting Output
- Print fields with formatted output:
  awk '{ printf "%-10s %-10s\n", $1, $2 }' file.txt
- Add headers to the output (OFS is set so the data rows are tab-separated like the header):
  awk 'BEGIN { OFS = "\t"; print "Name", "Age" } { print $1, $2 }' file.txt
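One way to keep a header aligned with the rows is to use printf for both, with the same column widths. This is a sketch on made-up data; the widths 8 and 5 are arbitrary choices.

```shell
# Hypothetical rows: name and age.
people='alice 30
bob 5'

# %-8s left-justifies in 8 columns; %5s and %5d right-justify in 5 columns.
formatted=$(printf '%s\n' "$people" |
  awk 'BEGIN { printf "%-8s %5s\n", "Name", "Age" } { printf "%-8s %5d\n", $1, $2 }')
printf '%s\n' "$formatted"
```

Because the header and the data share the same widths, the columns line up regardless of how long each name is, as long as it fits in the chosen width.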
Advanced Examples
- Calculate and print the average of the second field:
  awk '{ sum += $2; count++ } END { if (count > 0) print sum / count }' file.txt
- Extract specific columns from a CSV file and save to another file:
  awk -F',' '{ print $1, $3, $5 }' input.csv > output.txt
- Print lines where the first field is "John" and the second field is greater than 50:
  awk '$1 == "John" && $2 > 50 { print }' file.txt
- Pass an external variable to awk:
  threshold=100
  awk -v threshold="$threshold" '$2 > threshold { print }' file.txt
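The sketch below runs the average and threshold examples on made-up data, including the empty-input case that the count > 0 guard protects against. It also shows OFS, the standard awk output field separator (not used in the list above), which keeps extracted CSV columns comma-separated instead of space-separated.

```shell
# Hypothetical rows: label and score.
scores='a 10
b 20
c 60'

average=$(printf '%s\n' "$scores" | awk '{ sum += $2; count++ } END { if (count > 0) print sum / count }')

# With no input lines, the guard avoids a division by zero and nothing is printed.
empty_average=$(printf '' | awk '{ sum += $2; count++ } END { if (count > 0) print sum / count }')

threshold=15
above=$(printf '%s\n' "$scores" | awk -v threshold="$threshold" '$2 > threshold { print }')

# Setting OFS makes print join the selected columns with commas.
csv_cols=$(printf 'a,b,c,d,e\n' | awk -F',' -v OFS=',' '{ print $1, $3, $5 }')

printf '%s\n' "$average" "$above" "$csv_cols"
```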
Combining awk with other commands
- Use awk to process the output of another command:
  ls -l | awk '{ print $9, $5 }'
- Find and process files with find and awk:
  find /path -type f -name "*.log" -exec awk '/ERROR/ { print FILENAME, $0 }' {} +
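To try the find and awk combination safely, the sketch below builds a throwaway directory with a few made-up log files. The file names and contents are assumptions, and mktemp -d is assumed to be available, as it is on common Linux and BSD systems.

```shell
# Build a scratch directory with hypothetical log files.
logdir=$(mktemp -d)
printf 'INFO start\nERROR disk full\n' > "$logdir/app.log"
printf 'ERROR timeout\n' > "$logdir/db.log"
printf 'ERROR ignored, wrong extension\n' > "$logdir/notes.txt"

# find passes every *.log file to a single awk run; FILENAME names the current file.
# sort makes the order stable, since find does not guarantee one.
errors=$(find "$logdir" -type f -name '*.log' -exec awk '/ERROR/ { print FILENAME ": " $0 }' {} + | sort)
printf '%s\n' "$errors"

rm -r "$logdir"
```

The notes.txt file is skipped because it does not match the -name filter, even though it contains ERROR.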
Sed
sed (stream editor) is a powerful Unix utility for parsing and transforming text in a data stream (typically a file or input from a pipeline).
sed [options] 'script' input-file
Where:
- options: Command-line options to control the behavior of sed.
- script: A sequence of editing commands.
- input-file: The file to process. If omitted, sed reads from standard input.
Common Options
- -e script: Adds the script to the commands to be executed.
- -f script-file: Adds the contents of script-file to the commands to be executed.
- -n: Suppresses automatic printing of pattern space.
- -i[SUFFIX]: Edits files in-place (optionally creating a backup with the given suffix).
Basic Commands
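- s/pattern/replacement/flags: Substitutes replacement for pattern. Flags:
  - g: Replace every occurrence on the line, not just the first.
  - p: Print the pattern space if a substitution was made.
  - I (or i): Ignore case when matching (GNU sed).
  - N (a number): Replace only the Nth occurrence.
- d: Deletes the pattern space; with an address, deletes the lines that match it.
- p: Prints the pattern space; with an address, prints the lines that match it (usually combined with -n).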
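The substitution flags are easiest to compare side by side on a single line of input. The sketch below uses a made-up line and only flags that are part of POSIX sed (the ignore-case flag is left out because it is an extension).

```shell
line='apple apple apple'

first=$(printf '%s\n' "$line" | sed 's/apple/orange/')    # orange apple apple
every=$(printf '%s\n' "$line" | sed 's/apple/orange/g')   # orange orange orange
second=$(printf '%s\n' "$line" | sed 's/apple/orange/2')  # apple orange apple

# With -n, the p flag prints only lines on which a substitution happened.
changed_only=$(printf 'apple\npear\n' | sed -n 's/apple/orange/p')  # orange

printf '%s\n' "$first" "$every" "$second" "$changed_only"
```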
Substitution
- Replace the first occurrence of "apple" with "orange" in each line:
  sed 's/apple/orange/' file.txt
- Replace all occurrences of "apple" with "orange" in each line:
  sed 's/apple/orange/g' file.txt
- Replace "apple" with "orange" only on lines containing "fruit":
  sed '/fruit/s/apple/orange/' file.txt
- Replace "apple" with "orange" ignoring case (the I flag is a GNU sed extension):
  sed 's/apple/orange/I' file.txt
Deletion
- Delete lines containing "apple":
  sed '/apple/d' file.txt
- Delete the first line of the file:
  sed '1d' file.txt
- Delete lines from 2 to 4:
  sed '2,4d' file.txt
Printing
- Print lines containing "apple":
  sed -n '/apple/p' file.txt
- Print the first line of the file:
  sed -n '1p' file.txt
- Print lines from 2 to 4:
  sed -n '2,4p' file.txt
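Deleting and printing are mirror images: d removes the addressed lines and keeps the rest, while -n with p keeps only the addressed lines. A small sketch on made-up input:

```shell
numbers='one
two
three
four
five'

# Pattern address: drop every line containing the letter t.
without_t=$(printf '%s\n' "$numbers" | sed '/t/d')           # one, four, five

# Line-range address: drop lines 2 through 4.
without_2_to_4=$(printf '%s\n' "$numbers" | sed '2,4d')      # one, five

# Same range with -n and p: keep only lines 2 through 4.
only_2_to_4=$(printf '%s\n' "$numbers" | sed -n '2,4p')      # two, three, four

printf '%s\n---\n%s\n---\n%s\n' "$without_t" "$without_2_to_4" "$only_2_to_4"
```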
In-place Editing
- Replace "apple" with "orange" in-place, modifying the original file (GNU sed syntax; BSD/macOS sed requires an explicit suffix argument, e.g. -i ''):
  sed -i 's/apple/orange/g' file.txt
- Replace "apple" with "orange" in-place, creating a backup with a .bak extension:
  sed -i.bak 's/apple/orange/g' file.txt
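Since in-place editing overwrites the file, it is worth trying on a scratch copy first. The sketch below uses a temporary directory and made-up contents; mktemp -d is assumed to be available.

```shell
# Work in a scratch directory so no real file is touched.
workdir=$(mktemp -d)
printf 'apple pie\napple tart\n' > "$workdir/file.txt"

# Edit in place and keep the original as file.txt.bak.
sed -i.bak 's/apple/orange/g' "$workdir/file.txt"

edited=$(cat "$workdir/file.txt")
backup=$(cat "$workdir/file.txt.bak")
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"

rm -r "$workdir"
```

The attached-suffix form -i.bak is used here because it is accepted by both GNU and BSD sed.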
Multiple Commands
- Delete lines 1 to 3 and replace "apple" with "orange":
  sed '1,3d; s/apple/orange/g' file.txt
- Replace "apple" with "orange" and "banana" with "grape":
  sed 's/apple/orange/g; s/banana/grape/g' file.txt
Using a Script File
- Create a sed script file (script.sed) with multiple commands:
  # script.sed
  1,3d
  s/apple/orange/g
  s/banana/grape/g
- Run sed with the script file:
  sed -f script.sed file.txt
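A script file and a chain of -e expressions should give the same result. The sketch below checks that on made-up input in a temporary directory; the file contents are assumptions.

```shell
workdir=$(mktemp -d)
printf 'header 1\nheader 2\nheader 3\napple banana\napple\n' > "$workdir/file.txt"

# Write the sed commands, one per line, into a script file.
cat > "$workdir/script.sed" <<'EOF'
1,3d
s/apple/orange/g
s/banana/grape/g
EOF

from_file=$(sed -f "$workdir/script.sed" "$workdir/file.txt")
inline=$(sed -e '1,3d' -e 's/apple/orange/g' -e 's/banana/grape/g' "$workdir/file.txt")
printf '%s\n' "$from_file"

rm -r "$workdir"
```

Both variables end up holding the two remaining lines, "orange grape" and "orange".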
Extracting Data
- Extract lines between patterns:
  sed -n '/START/,/END/p' file.txt
- Extract the first 10 lines of a file:
  sed -n '1,10p' file.txt
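A quick sketch of range extraction on made-up input. Note that a pattern range includes the lines that match the start and end patterns themselves.

```shell
log='noise
START
payload
END
more noise'

# Pattern range: from the first line matching START through the next line matching END.
block=$(printf '%s\n' "$log" | sed -n '/START/,/END/p')

# Line-number range: the first two lines only.
head2=$(printf '%s\n' "$log" | sed -n '1,2p')

printf '%s\n---\n%s\n' "$block" "$head2"
```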
Combining sed with Other Commands
- Use sed in a pipeline to process the output of another command:
  ls -l | sed 's/^/Line: /'
- Find and replace text in multiple files:
  find /path -type f -name "*.txt" -exec sed -i 's/apple/orange/g' {} +
Another Example
Suppose we want to remove all HTML tags from file.txt. Here is a sample file:
# file.txt
<html>
<p>
<>
hello world
hello galaxy
hello universe
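We need to match every run of characters enclosed in angle brackets:
<...>
Each match is then replaced with nothing, and the change is saved in place. The g flag removes every tag on a line rather than only the first, and because lines that held only a tag are left empty, a second expression deletes the empty lines:
[root@tst-rhel]# sed -i -e 's/<[^>]*>//g' -e '/^$/d' file.txt
[root@tst-rhel]# cat file.txt
hello world
hello galaxy
hello universe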
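The same idea can be tried as a pipeline, so that no file is modified. In this sketch the input is made up and includes a line with two inline tags, which is the case where the g flag matters; the second expression drops lines that become empty.

```shell
# Hypothetical input mixing tag-only lines and text with inline tags.
page='<html>
<p>
<>
hello <b>world</b>
hello galaxy'

# s/<[^>]*>//g removes every <...> run on a line; /^$/d deletes lines left empty.
text=$(printf '%s\n' "$page" | sed -e 's/<[^>]*>//g' -e '/^$/d')
printf '%s\n' "$text"
# hello world
# hello galaxy
```

A regular expression like this is adequate for simple, well-formed input; it is not a general HTML parser.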