awk Command in Linux With Examples

Use the awk command in Linux for fields, filters, totals, grouped counts, printf reports, scripts, pipelines, and safe output.

Author: Joshua James | Read time: 11 min | Guide type: Linux Commands

Delimited text becomes easier to work with when each column, line range, or repeated value can be tested directly. The awk command in Linux reads input record by record, splits each record into fields, and runs compact pattern/action programs for filtering logs, summarizing inventories, comparing files, and formatting reports.

Use awk when the result depends on fields, totals, grouped counts, ranges, or calculated output. For simpler line matching, use grep; for stream substitutions and targeted text edits, use sed. Awk is the middle ground for column-aware logic that is too specific for line matching but too small for a full script.

Understand the awk Command in Linux

An awk program usually contains a pattern, an action, or both. The pattern decides which records match, and the action inside braces decides what to print, calculate, store, or transform. When no pattern is provided, the action runs for every record; when no action is provided, awk prints each matching record.

Basic awk Syntax

awk 'pattern { action }' file

Awk also reads from standard input, which makes it useful after commands that generate text:

command | awk 'pattern { action }'
  • pattern: A condition such as NR > 1, $2 == "frontend", or /ERROR/. If the condition is true, the action runs.
  • action: One or more statements inside braces, such as { print $1 }, { total += $4 }, or { printf "%s\n", $1 }.
  • file: One or more input files. If no file is named, awk reads standard input.
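As a minimal sketch of the three program shapes described above, the commands below run a pattern-only program, an action-only program, and a combined one over the same input. The three sample records are made up for illustration and are not part of the fixture files.

```shell
# Illustrative input: three made-up records, not taken from the fixture files.
sample='alpha 1
beta 2
gamma 3'

# Pattern only: awk prints every record where the condition is true.
pattern_only=$(printf '%s\n' "$sample" | awk '$2 >= 2')

# Action only: the action runs for every record.
action_only=$(printf '%s\n' "$sample" | awk '{ print $1 }')

# Pattern and action: the action runs only for matching records.
both=$(printf '%s\n' "$sample" | awk '$2 >= 2 { print $1 }')

printf '%s\n' "$pattern_only" "---" "$action_only" "---" "$both"
```

The pattern-only form prints whole records, while the combined form prints only the first field of the same matching records.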

Records, Fields, and Separators

Awk reads input one record at a time. A record is usually one line, and fields are the columns inside that line. These built-in variables and field references appear throughout most awk one-liners:

Reference | Meaning | Common Use
$0 | The full current record. | Print or test the whole line.
$1, $2 | The first field, second field, and so on. | Select columns from structured text.
NF | The number of fields in the current record. | Skip blank lines or find the last field.
$NF | The field whose number equals NF. | Print the last field without hardcoding its position.
NR | The total record number across all input. | Skip headers, print line ranges, or add row numbers.
FNR | The record number inside the current file. | Separate first-file setup from second-file processing.
FS or -F | The input field separator. | Split records on colons, commas, tabs, or another pattern.
OFS | The output field separator. | Control spacing or delimiters between printed fields.

By default, awk splits fields on runs of whitespace. Use -F when the input has a clearer delimiter, such as a colon in account-style data, a comma in simple CSV, or a tab in TSV files. A single-character -F value other than a space is used literally, while a longer value is treated as an extended regular expression. Either way, awk splits on every match, so plain -F',' should stay limited to rows where commas do not appear inside quoted fields.
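The separator behavior can be sketched with three short inputs. The sample strings below are illustrative only; the commands use POSIX awk features and assume nothing beyond a standard shell.

```shell
# Tab-separated input: awk interprets the \t escape as a tab character.
tsv=$(printf 'web01\tfrontend\n' | awk -F'\t' '{ print $2 }')

# A multi-character separator is treated as a regular expression,
# so one program can split on either a comma or a semicolon.
mixed=$(printf 'a,b;c\n' | awk -F'[,;]' '{ print NF, $3 }')

# A single character such as | is taken literally, not as regex alternation.
piped=$(printf 'x|y|z\n' | awk -F'|' '{ print $2 }')

printf '%s\n' "$tsv" "$mixed" "$piped"
```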

Portable scripts should start with POSIX awk features such as -F, NR, FNR, NF, BEGIN, END, arrays, printf, -v, and -f. The POSIX awk utility definition defines that baseline. The GNU Awk manual covers GNU extensions, including newer CSV handling, but portable awk programs should depend on those extensions only when the target systems are known to use GNU awk.

awk Quick Reference

Use these patterns as starting points for common awk tasks. Predictable input files make each pattern easier to adapt without mixing several awk concepts at once.

Field and Filter Patterns

Task | Command Pattern | What It Does
Print one field | awk '{ print $1 }' file | Prints the first field from each input record.
Use a delimiter | awk -F':' '{ print $1 }' file | Splits records on colons instead of whitespace.
Print the last field | awk '{ print $NF }' file | Uses NF as the number of the final field.
Skip a header | awk 'NR > 1 { print }' file | Starts processing after the first record.
Filter by field | awk '$2 == "error" { print $0 }' file | Prints records where the second field matches exactly.
Filter by regex | awk '/WARN|ERROR/ { print }' file | Prints records matching either regular expression branch.
Print a line range | awk 'NR >= 10 && NR <= 20 { print }' file | Prints records by line number.
Skip blanks and comments | awk 'NF && $1 !~ /^#/ { print }' file | Ignores empty lines and lines whose first field starts with #.

Reporting and Reuse Patterns

Task | Command Pattern | What It Does
Add values | awk '{ total += $1 } END { print total }' file | Accumulates values and prints the final total after input ends.
Count values | awk '{ count[$1]++ } END { for (key in count) print key, count[key] }' file | Uses an associative array to count repeated values.
Remove duplicates | awk '!seen[$0]++' file | Prints the first occurrence of each unique input record.
Compare two files | awk 'FNR == NR { seen[$1]; next } $1 in seen { print }' file1 file2 | Builds an array from the first file, then checks the second file.
Format output | awk '{ printf "%-10s %s\n", $1, $2 }' file | Uses awk’s printf statement for aligned output.
Pass a shell value safely | awk -v limit=10 '$1 >= limit { print }' file | Passes an external value without shell-interpolating the awk program.
Run a script file | awk -f report.awk file | Loads awk logic from a reusable script file.

Verify awk Availability and Build Example Files

Most general-purpose Linux distributions include awk. Confirm the command path before using the fixture commands.

command -v awk

A working system returns an executable path:

/usr/bin/awk

When the system uses GNU awk, the GNU-specific version option prints an implementation banner:

awk --version | head -1

A GNU awk version line begins with the implementation name and version:

GNU Awk 5.3.2, API 4.0, PMA Avon 8-g1, (GNU MPFR 4.2.2, GNU MP 6.3.0)

If awk --version fails, the command may still work through another implementation. A small BEGIN program proves awk can start without waiting for an input file:

awk 'BEGIN { print "awk works" }'
awk works

Create a disposable workspace for the examples. The files cover colon-delimited inventories, whitespace-delimited logs, simple CSV, two-file comparisons, section markers, comments, blank lines, and uneven spacing.

mkdir -p ~/awk-demo
cd ~/awk-demo

cat > inventory.txt <<'EOF'
hostname:role:cpu:memory:status
web01:frontend:4:8:active
web02:frontend:4:8:active
api01:backend:8:16:active
api02:backend:8:32:maintenance
db01:database:16:64:active
cache01:cache:4:16:active
EOF

cat > app.log <<'EOF'
2026-05-27T10:00:01Z INFO web01 request completed ms=42
2026-05-27T10:00:02Z WARN api01 retrying upstream ms=180
2026-05-27T10:00:03Z ERROR api01 upstream timeout ms=900
2026-05-27T10:00:04Z INFO db01 query completed ms=75
2026-05-27T10:00:05Z ERROR web02 disk full ms=310
2026-05-27T10:00:06Z WARN api02 cache stale ms=240
2026-05-27T10:00:07Z INFO cache01 request completed ms=35
EOF

cat > services.csv <<'EOF'
service,port,owner,enabled
ssh,22,ops,yes
http,80,web,yes
https,443,web,yes
postgres,5432,data,no
redis,6379,data,yes
EOF

cat > limits.txt <<'EOF'
frontend 12
backend 16
database 32
cache 8
EOF

cat > sections.conf <<'EOF'
[frontend]
web01
web02
[backend]
api01
api02
[database]
db01
[cache]
cache01
EOF

cat > settings.conf <<'EOF'
# local app settings
host = web01
port = 8080

mode = production
EOF

cat > messy.txt <<'EOF'
web01      frontend     active
api01 backend    maintenance
EOF

The fixture files keep each example focused on one awk behavior:

File | Used For
inventory.txt | Colon-delimited fields, headers, numeric comparisons, totals, groups, and output formatting.
app.log | Whitespace fields, log-level filters, timing extraction, duplicates, counts, and pipelines.
services.csv | Simple comma-separated rows without quoted commas.
limits.txt | Two-file lookups with FNR and NR.
sections.conf | Range patterns and marker-controlled section extraction.
settings.conf and messy.txt | Comment skipping, blank-line handling, and whitespace normalization.

Preview the first records with awk itself:

awk 'NR <= 4 { print }' inventory.txt
hostname:role:cpu:memory:status
web01:frontend:4:8:active
web02:frontend:4:8:active
api01:backend:8:16:active

Print Specific Fields

Use -F':' to split the inventory file on colons, then print the first two fields. Awk separates comma-separated print arguments with a space unless you set a different output separator.

awk -F':' 'NR > 1 { print $1, $2 }' inventory.txt
web01 frontend
web02 frontend
api01 backend
api02 backend
db01 database
cache01 cache

Print the Last Field with NF

NF stores the number of fields in the current record, so $NF means the last field. This is useful when records have a stable ending column but a changing number of earlier fields.

awk -F':' 'NR > 1 { print $1, $NF }' inventory.txt
web01 active
web02 active
api01 active
api02 maintenance
db01 active
cache01 active
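A short sketch of $NF on records with different field counts, using made-up input rather than the fixture files. It also shows $(NF-1), which selects the second-to-last field by the same arithmetic.

```shell
# Illustrative records with three, two, and one field.
last=$(printf 'a b c\nd e\nf\n' | awk '{ print NF, $NF }')

# $(NF-1) selects the second-to-last field; the guard skips one-field records.
second_last=$(printf 'a b c\nd e\nf\n' | awk 'NF > 1 { print $(NF-1) }')

printf '%s\n' "$last" "$second_last"
```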

Add Line Numbers to Output

NR counts every input record read so far. Subtract one after skipping a header when the output should show data-row numbers instead of physical file line numbers.

awk -F':' 'NR > 1 { print NR - 1, $1 }' inventory.txt
1 web01
2 web02
3 api01
4 api02
5 db01
6 cache01

Append Text to Printed Fields

Awk concatenates adjacent strings and variables without a separate operator. This makes labels and units easy to attach to field values.

awk -F':' 'NR > 1 { print $1, $4 " GB" }' inventory.txt
web01 8 GB
web02 8 GB
api01 16 GB
api02 32 GB
db01 64 GB
cache01 16 GB

Read Simple CSV Fields

-F',' works for simple comma-separated rows where commas never appear inside quoted fields. Use it for quick admin data, not for full CSV exports with quoted commas, embedded newlines, or escaped quotes.

awk -F',' 'NR > 1 && $4 == "yes" { print $1, $2 }' services.csv
ssh 22
http 80
https 443
redis 6379
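To see why plain -F',' fits only simple rows, the sketch below counts fields in two illustrative rows that each hold three logical values. The second row is made up for this example and contains a comma inside a quoted field.

```shell
# Both rows have three logical columns, but the quoted comma splits the second row again.
counts=$(printf 'ssh,22,ops\n"backup, nightly",873,ops\n' | awk -F',' '{ print NF }')
printf '%s\n' "$counts"
```

The first row reports 3 fields and the second reports 4, because awk splits on every comma without interpreting the quotes.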

Filter Rows and Ranges with awk

Match an Exact Field Value

Field comparisons are one of awk’s strongest everyday uses. Print only hosts where the role field equals frontend.

awk -F':' '$2 == "frontend" { print $1 }' inventory.txt
web01
web02

Compare Numeric Fields

Numeric comparisons work directly when the field contains a number. Keep the header guard in place so text labels do not take part in the comparison.

awk -F':' 'NR > 1 && $3 >= 8 { print $1, $3 " CPUs" }' inventory.txt
api01 8 CPUs
api02 8 CPUs
db01 16 CPUs

Skip Blank Lines and Comments

NF is zero on blank lines, so it can remove empty records. Pair it with a comment check when parsing simple configuration-style files.

awk 'NF && $1 !~ /^#/ { print }' settings.conf
host = web01
port = 8080
mode = production

Match Log Lines with a Regular Expression

A bare regular expression pattern checks the whole record. This form is useful when you want awk to keep only log severities or message classes before printing selected fields.

awk '/WARN|ERROR/ { print $1, $2, $3 }' app.log
2026-05-27T10:00:02Z WARN api01
2026-05-27T10:00:03Z ERROR api01
2026-05-27T10:00:05Z ERROR web02
2026-05-27T10:00:06Z WARN api02

Print Rows by Line Number

Use NR comparisons for line-number ranges. The command here prints physical input lines 3 through 5. Because the header occupies line 1, line 3 is the second host row.

awk -F':' 'NR >= 3 && NR <= 5 { print NR, $1 }' inventory.txt
3 web02
4 api01
5 api02

Print Lines Between Markers

An awk range pattern prints from the first record that matches the start pattern through the next record that matches the end pattern, including both marker lines. That behavior is useful when the markers belong in the output.

awk '/^\[backend\]$/, /^\[database\]$/' sections.conf
[backend]
api01
api02
[database]

When you only want the records inside the section, set a flag at the start marker and clear it at the next section header:

awk '/^\[backend\]$/ { active = 1; next } /^\[/ { active = 0 } active { print }' sections.conf
api01
api02
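The flag approach can be made reusable by passing the section name with -v. This is a sketch under assumptions: the variable name section is hypothetical, and the input is an inline copy of the section layout rather than the fixture file.

```shell
# Illustrative copy of the section layout used above.
conf='[frontend]
web01
[backend]
api01
api02
[database]
db01'

# "section" is a hypothetical variable name chosen for this sketch.
members=$(printf '%s\n' "$conf" | awk -v section=backend '
  $0 == "[" section "]" { active = 1; next }  # start after the chosen header
  /^\[/                 { active = 0 }        # any other header ends the section
  active                { print }
')
printf '%s\n' "$members"
```

Comparing $0 to a built string avoids embedding the section name in a regular expression, so names with special characters are matched literally.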

Transform Text with awk

Extract Values with split

split() divides a string into an array. The log file stores timing as ms=value, so splitting field 6 on = extracts the numeric value without changing the original log line.

awk '$2 == "ERROR" { split($6, timing, "="); print $3, timing[2] }' app.log
api01 900
web02 310
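split() also returns the number of pieces it produced, which helps when a record carries several key=value tokens. The record below is illustrative, and the array name kv is an arbitrary choice for this sketch.

```shell
# Illustrative record with several key=value tokens after the host name.
pairs=$(printf 'api01 ms=900 status=504 retries=3\n' | awk '{
  for (i = 2; i <= NF; i++) {
    n = split($i, kv, "=")           # n is the number of pieces
    if (n == 2) print kv[1], kv[2]   # print only well-formed pairs
  }
}')
printf '%s\n' "$pairs"
```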

Normalize Uneven Whitespace

Assigning to any field makes awk rebuild $0 with the output field separator. With the default OFS, this collapses uneven spacing to single spaces.

awk '{ $1 = $1; print }' messy.txt
web01 frontend active
api01 backend maintenance
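The same rebuild step can convert whitespace-separated columns to another delimiter when OFS is set first. A minimal sketch with illustrative input, including leading and trailing spaces on the second line:

```shell
# Setting OFS before the $1 = $1 assignment makes awk rejoin fields with commas.
normalized=$(printf 'web01      frontend     active\n  api01 backend    maintenance  \n' |
  awk 'BEGIN { OFS = "," } { $1 = $1; print }')
printf '%s\n' "$normalized"
```

With the default field separator, leading and trailing whitespace is not part of any field, so it disappears from the rebuilt record.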

Replace Text in One Field with gsub

gsub() replaces every match in the target string. Give it a field target, such as $1, when the replacement should not touch the whole record. Set OFS first when the edited record should keep the original delimiter.

awk -F':' 'BEGIN { OFS=":" } NR == 1 { print; next } { gsub(/^web/, "frontend-", $1); print }' inventory.txt | awk 'NR <= 3 { print }'
hostname:role:cpu:memory:status
frontend-01:frontend:4:8:active
frontend-02:frontend:4:8:active

The second awk process limits the preview to three records. Remove that final pipe when you want the complete transformed stream.
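For comparison, sub() replaces only the first match, while gsub() replaces every match and returns how many replacements it made. A short sketch on an illustrative string, with no target argument so both functions operate on $0:

```shell
# sub() changes only the first hyphen.
first_only=$(printf 'a-b-c\n' | awk '{ sub(/-/, "_"); print }')

# gsub() changes every hyphen and returns the replacement count.
all=$(printf 'a-b-c\n' | awk '{ n = gsub(/-/, "_"); print n, $0 }')

printf '%s\n' "$first_only" "$all"
```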

Calculate Totals and Reports with awk

Add Values Across Records

Awk variables are created on first use. Add each memory value to total, then print the result in an END block after awk has read every record.

awk -F':' 'NR > 1 { total += $4 } END { printf "total memory: %d GB\n", total }' inventory.txt
total memory: 144 GB

Calculate an Average

Track both the total and the number of data records when you need an average. Skipping the header keeps the count aligned with real host rows; add an END guard when a filter might match no records.

awk -F':' 'NR > 1 { total += $4; count++ } END { printf "average memory: %.1f GB\n", total / count }' inventory.txt
average memory: 24.0 GB

Sum Values by Group

Associative arrays let awk group values by any field. Add memory by role, then pipe the result to sort so the report order stays stable.

awk -F':' 'NR > 1 { memory[$2] += $4 } END { for (role in memory) print role, memory[role] }' inventory.txt | sort
backend 48
cache 16
database 64
frontend 16

Count Log Levels

The same array pattern counts repeated values. This command counts how often each severity appears in the log file.

awk '{ count[$2]++ } END { for (level in count) print level, count[level] }' app.log | sort
ERROR 2
INFO 3
WARN 2

Calculate Percentages by Group

Percentages need both the grouped subtotal and the overall total. The doubled percent sign in printf prints a literal percent character.

awk -F':' 'NR > 1 { memory[$2] += $4; total += $4 } END { for (role in memory) printf "%-10s %5.1f%%\n", role, memory[role] * 100 / total }' inventory.txt | sort
backend     33.3%
cache       11.1%
database    44.4%
frontend    11.1%

Show the Top Rows by Value

Awk can extract the sort key and label, then standard shell tools can order and limit the result. Put the numeric value first so sort -k1,1nr sorts by that column only.

awk -F':' 'NR > 1 { print $4, $1 }' inventory.txt | sort -k1,1nr | head -3
64 db01
32 api02
16 api01

Align Columns with printf

Use awk’s printf statement when spacing matters. The format string controls column width and data type, and the shell's separate printf command uses the same formatting ideas.

awk -F':' 'BEGIN { printf "%-10s %-12s %8s\n", "HOST", "ROLE", "MEMORY" } NR > 1 { printf "%-10s %-12s %6s GB\n", $1, $2, $4 }' inventory.txt
HOST       ROLE           MEMORY
web01      frontend          8 GB
web02      frontend          8 GB
api01      backend          16 GB
api02      backend          32 GB
db01       database         64 GB
cache01    cache            16 GB

Set a Custom Output Separator

OFS controls the separator awk uses between comma-separated arguments to print. Set it in BEGIN before the first output record.

awk -F':' 'BEGIN { OFS="," } NR == 1 { print "host","role","memory_gb"; next } { print $1,$2,$4 }' inventory.txt
host,role,memory_gb
web01,frontend,8
web02,frontend,8
api01,backend,16
api02,backend,32
db01,database,64
cache01,cache,16
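OFS applies only where print arguments are separated by commas. Adjacent values without a comma are concatenated directly, as this short sketch with illustrative input shows:

```shell
# Comma between arguments: awk inserts OFS.
joined=$(printf 'web01 frontend\n' | awk 'BEGIN { OFS = "," } { print $1, $2 }')

# No comma: the two fields are concatenated with nothing in between.
concatenated=$(printf 'web01 frontend\n' | awk 'BEGIN { OFS = "," } { print $1 $2 }')

printf '%s\n' "$joined" "$concatenated"
```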

Compare Files, Remove Duplicates, and Use Pipelines with awk

Remove Duplicate Values While Preserving First Seen Order

The expression !seen[value]++ prints only the first time a value appears. Use a field as the array key when uniqueness should be based on one column instead of the whole line.

awk '!seen[$3]++ { print $3 }' app.log
web01
api01
db01
web02
api02
cache01
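The compact !seen[value]++ idiom can be hard to read at first. The sketch below runs it next to an expanded form that should behave the same way, using a made-up host list.

```shell
hosts='web01
api01
api01
web01
db01'

# Short form: seen[$1]++ yields the old count, so ! is true only on the first hit.
short=$(printf '%s\n' "$hosts" | awk '!seen[$1]++')

# Expanded form with the same result, written out step by step.
long=$(printf '%s\n' "$hosts" | awk '{
  if (seen[$1] == 0) print   # first time this value appears
  seen[$1]++                 # remember it for later records
}')

printf '%s\n' "$short"
```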

Count Unique Values

Counting a repeated field is often more useful than removing duplicates. Sort the final output when array traversal order should not affect the report.

awk '{ hits[$3]++ } END { for (host in hits) print host, hits[host] }' app.log | sort
api01 2
api02 1
cache01 1
db01 1
web01 1
web02 1

Compare Two Files with FNR and NR

FNR == NR is true only while awk reads the first file. This lets you load lookup data into an array, then use that array while reading the second file. The field separator here matches either colons or runs of whitespace, so one command can read both fixture formats.

awk -F'[:[:space:]]+' 'FNR == NR { min[$1] = $2; next } FNR > 1 && $4 < min[$2] { print $1, $2, $4 " GB", "below", min[$2] " GB" }' limits.txt inventory.txt
web01 frontend 8 GB below 12 GB
web02 frontend 8 GB below 12 GB
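To see why FNR == NR identifies the first file, this sketch prints both counters for two small throwaway files. The file names and the temporary directory are assumptions made for the example.

```shell
# Work in a throwaway directory so no existing files are touched.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first.txt"
printf 'c\nd\ne\n' > "$tmpdir/second.txt"

# FNR restarts at 1 for each file, while NR keeps counting across files.
counters=$(cd "$tmpdir" && awk '{ print FILENAME, NR, FNR, (FNR == NR ? "first" : "later") }' first.txt second.txt)
printf '%s\n' "$counters"

rm -rf "$tmpdir"
```

One caveat: if the first file is empty, NR is still zero when the second file starts, so FNR == NR stays true for the second file as well.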

Use awk in a Pipeline

Pipeline input is useful when another command narrows the data first. Here, tail reads the newest log lines, then awk extracts only the timestamp, host, and timing field from error rows.

tail -n 5 app.log | awk '$2 == "ERROR" { print $1, $3, $6 }'
2026-05-27T10:00:03Z api01 ms=900
2026-05-27T10:00:05Z web02 ms=310

Reuse awk Logic with Variables and Scripts

Pass Shell Values with -v

Use -v to pass shell-controlled values into awk before awk starts reading input. This keeps shell values separate from the awk program, avoids quoting mistakes, and makes the condition easier to reuse.

awk -F':' -v min_mem=32 'NR > 1 && $4 >= min_mem { print $1, $4 }' inventory.txt
api02 32
db01 64
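In a script, the value usually comes from a shell variable. A minimal sketch, assuming a hypothetical shell variable named threshold and three illustrative rows:

```shell
# "threshold" is a hypothetical shell variable for this sketch.
threshold=16
rows='web01:8
api01:16
db01:64'

# The awk program stays single-quoted; only the -v assignment is expanded by the shell.
matches=$(printf '%s\n' "$rows" | awk -F':' -v min_mem="$threshold" '$2 >= min_mem { print $1 }')
printf '%s\n' "$matches"
```

Because the value looks numeric, awk compares it with the field as a number rather than as a string.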

Move Reusable Logic into an awk Script

Short one-liners are convenient, but a script file is easier to review when the logic grows past a few statements. Keep shell setup separate from the awk program so each layer remains readable, and pass runtime values with -v instead of editing the script for each run.

cat > high-memory.awk <<'EOF'
BEGIN { FS = ":" }
NR == 1 { next }
$4 >= limit { printf "%s has %d GB\n", $1, $4 }
EOF

awk -v limit=32 -f high-memory.awk inventory.txt
api02 has 32 GB
db01 has 64 GB

Use awk Safely in Shell Workflows

  • Quote awk programs with single quotes so the shell does not expand $1, $2, or other awk variables before awk sees them.
  • Use -v for shell-provided values instead of inserting variables directly into the awk program string.
  • Remember that -F is a regular expression. Use simple delimiters confidently, but do not treat plain awk splitting as a full parser for quoted CSV, JSON, YAML, XML, or other structured formats.
  • Do not overwrite files as the first step. Write transformed output to a separate file, inspect it, then replace the original only after the preview matches what you intended.
  • Sort array-based reports when order matters. POSIX awk does not specify the order of for (key in array) traversal.

Rewrite the frontend memory values into a separate file and preview only the first records. The original inventory.txt remains unchanged.

awk -F':' 'BEGIN { OFS=":" } NR == 1 { print; next } $2 == "frontend" { $4 = 12 } { print }' inventory.txt > inventory.updated
awk 'NR <= 3 { print }' inventory.updated
hostname:role:cpu:memory:status
web01:frontend:4:12:active
web02:frontend:4:12:active

Replace a real source file only after reviewing the generated output. A safer pattern is to write to a sibling file, inspect or compare it, make a backup when the source matters, and then move the replacement into place.
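The write, inspect, back up, and replace sequence might look like the sketch below. The throwaway directory, the .new and .bak suffixes, and the three-column sample file are assumptions for illustration, not part of the fixture set.

```shell
# Throwaway directory and file names are assumptions for this sketch.
workdir=$(mktemp -d)
printf 'hostname:role:memory\nweb01:frontend:8\ndb01:database:64\n' > "$workdir/inventory.txt"

src="$workdir/inventory.txt"
new="$workdir/inventory.txt.new"

# 1. Write the transformed data to a sibling file.
awk -F':' 'BEGIN { OFS = ":" } NR > 1 && $2 == "frontend" { $3 = 12 } { print }' "$src" > "$new"

# 2. Inspect the difference; diff exits non-zero when files differ, so ignore its status.
diff "$src" "$new" || true

# 3. Keep a backup, then move the replacement into place.
cp "$src" "$src.bak"
mv "$new" "$src"

updated=$(cat "$src")
backup=$(cat "$src.bak")
printf '%s\n' "$updated"

rm -rf "$workdir"
```

Writing the sibling file in the same directory as the source keeps the final mv on one filesystem, where it is a simple rename.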

Troubleshoot Common awk Errors

awk Command Not Found

If the shell cannot find awk, confirm the command lookup first:

command -v awk

No output means the current shell cannot find awk in $PATH. On Ubuntu, Fedora, and Rocky Linux systems, GNU awk is provided by the gawk package. Install it with the distribution package manager, then open a new shell if the command path still does not refresh.

awk Version Check Does Not Work

awk --version is a GNU awk option. Another awk implementation can reject that option while still running normal awk programs. Test the interpreter with a BEGIN action before assuming awk itself is broken.

awk 'BEGIN { print "awk works" }'
awk works

Double Quotes Print the Whole Record Instead of a Field

The shell expands $1 inside double quotes before awk receives the program. In a normal shell with no first positional parameter, $1 becomes empty, so awk receives { print } and prints whole records. In a strict shell using set -u, the shell can stop earlier with an unbound-variable error.

awk -F':' "{ print $1 }" inventory.txt | head -3
hostname:role:cpu:memory:status
web01:frontend:4:8:active
web02:frontend:4:8:active

Single quotes keep awk variables intact until awk parses them:

awk -F':' '{ print $1 }' inventory.txt | head -3
hostname
web01
web02

Unexpected Newline or End of String

A missing quote or closing brace makes awk parse the program as incomplete. This broken command leaves the action unfinished:

awk '{ print $1 ' inventory.txt

Relevant GNU awk output includes:

awk: cmd. line:1: { print $1 
awk: cmd. line:1:            ^ unexpected newline or end of string

Close both the action brace and the shell quote before rerunning the command:

awk '{ print $1 }' inventory.txt

Fields Do Not Split Correctly

When $1, $2, or later fields do not contain what you expect, print each parsed field from one known record. This diagnoses the separator before you rewrite the main command.

awk -F':' 'NR == 2 { for (i = 1; i <= NF; i++) print i, $i }' inventory.txt
1 web01
2 frontend
3 4
4 8
5 active

If the whole line appears as field 1, the input probably uses a different delimiter than the one passed to -F. Preview one known record, confirm the real separator, then update the main command.

Averages Fail When No Rows Match

An average needs at least one matching record. Guard the END block so awk prints a clear result instead of dividing by zero or producing implementation-specific output.

awk -F':' '$2 == "worker" { total += $4; count++ } END { if (count) printf "%.1f\n", total / count; else print "no matching records" }' inventory.txt
no matching records

Grouped Output Appears in a Different Order

Awk arrays are associative, and POSIX awk does not specify the order in which for (key in array) visits the keys. Sort the printed rows when report order matters:

awk -F':' 'NR > 1 { count[$2]++ } END { for (role in count) print role, count[role] }' inventory.txt | sort
backend 2
cache 1
database 1
frontend 2

Clean Up the awk Practice Files

Remove the disposable workspace after testing the examples. The command targets only the directory created earlier.

cd ~
rm -rf ~/awk-demo

Conclusion

Awk is ready for field-based filtering, line ranges, grouped counts, file comparisons, and formatted reports from plain text streams. Keep one-off programs single-quoted, pass shell values with -v, sort array reports when order matters, and write transformed data to a separate output file before replacing source data.
