split Command in Linux with Examples

Use the split command to divide files by lines, bytes, or parts. Covers advanced options, tar piping, gzip, and troubleshooting.

Author: Joshua James | Read time: 8 min | Guide type: Linux Commands

Large logs, database exports, and backup archives get easier to move or process when you can cut them into predictable pieces without changing the original data. The split command in Linux handles that job from the terminal, using line counts, byte limits, record-aware size limits, or a fixed number of output chunks. GNU split also supports filters, custom suffixes, and round-robin distribution for workflows that need compressed pieces or balanced worker inputs.

Understand the split Command

GNU split is part of GNU coreutils and ships with most Linux distributions. Confirm that your system is using GNU split before relying on long options such as --filter or --additional-suffix:

split --version | head -n 1

Example output:

split (GNU coreutils) 9.4

The version number can differ by distribution. The important part is the GNU coreutils label; BusyBox implementations expose a smaller option set.

split Command Syntax

The basic syntax follows this pattern:

split [OPTION]... [INPUT [PREFIX]]

When INPUT is omitted, or when INPUT is -, split reads from standard input. Without options, GNU split writes 1000-line chunks, uses x as the prefix, and creates names such as xaa, xab, and xac. The GNU split manual documents the full GNU option set.
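The defaults are easy to confirm on a throwaway file. The following is a minimal sketch that assumes GNU split; numbers.txt and the temporary directory are hypothetical names used only for illustration.

```shell
# Work in a scratch directory so the default x* names cannot collide with real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: 2500 numbered lines.
seq 1 2500 > numbers.txt

# No options and no prefix: 1000-line chunks named xaa, xab, and xac.
split numbers.txt

# Count the lines in each generated chunk.
wc -l x*
```

With 2500 input lines, the first two chunks hold 1000 lines each and the last chunk holds the remaining 500.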

Use split when the split point is based on a line count, byte size, record size, or fixed number of output files. If the boundary depends on matching content, such as a section marker or header pattern, use csplit instead.

split Command Quick Reference

Task | Command Pattern | What It Does
Split by line count | split -l 1000 access.log chunk- | Writes 1000 lines per output file.
Split by byte size | split -b 100M backup.tar.gz backup-part- | Writes chunks near the requested byte size.
Keep records under a size limit | split -C 50M app.log log-part- | Keeps complete records when possible while limiting each file size.
Create a fixed number of byte-balanced pieces | split -n 4 dataset.csv parts- | Creates four roughly equal byte ranges, which may split lines.
Create a fixed number of line-safe pieces | split -n l/4 dataset.csv parts- | Creates four chunks without splitting records between files.
Distribute records round robin | split -n r/4 requests.log worker- | Sends alternating records to four output files.
Use numeric suffixes | split -d -a 4 -l 100 data.csv chunk- | Creates names such as chunk-0000 and chunk-0001.
Add a file extension | split -d --additional-suffix=.log -l 500 app.log chunk- | Appends an extension after each generated suffix.
Filter each chunk through a command | split -b 50M --filter='gzip > "$FILE.gz"' data.bin chunk- | Passes each chunk to a shell command using the $FILE variable.
Use a custom record separator | split -t '\0' -l 100 records.bin record- | Uses NUL instead of newline as the record separator.

GNU size values accept suffixes such as K, M, and G for powers of 1024, plus KB, MB, and GB for powers of 1000. Binary forms such as KiB and MiB are also valid in GNU coreutils.
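The difference between the 1024-based and 1000-based suffixes shows up clearly on a small file. This sketch assumes GNU split; sample.bin and the binary- and decimal- prefixes are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 3000-byte input made of zero bytes.
head -c 3000 /dev/zero > sample.bin

# K means 1024 bytes; KB means 1000 bytes.
split -b 1K sample.bin binary-
split -b 1KB sample.bin decimal-

# Print the byte count of each piece.
wc -c binary-* decimal-*
```

The binary- pieces come out as 1024, 1024, and 952 bytes, while the decimal- pieces are 1000 bytes each.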

GNU and BusyBox split Compatibility

These examples use GNU split. Minimal containers and embedded Linux systems may provide BusyBox split instead, so check split --help before using GNU-only options:

Feature | GNU split | BusyBox split
Line chunks with -l | Supported | Supported
Byte chunks with -b | Supported | Supported
Suffix length with -a | Supported | Supported
Fixed chunk count with -n | Supported | Not in the documented BusyBox applet
Record-aware byte limit with -C | Supported | Not in the documented BusyBox applet
Numeric or hexadecimal suffixes | -d, --numeric-suffixes, -x, --hex-suffixes | Not in the documented BusyBox applet
Additional suffixes | --additional-suffix | Not in the documented BusyBox applet
Output filters | --filter=COMMAND | Not in the documented BusyBox applet
Custom record separator | -t, --separator | Not in the documented BusyBox applet
Size suffixes | Large GNU unit set, including K, M, G, KB, MB, and GB | Documented as k and m

If you need a script to run on both GNU and BusyBox systems, limit the command to -l, -b, -a, and a simple prefix. Use GNU split when you need --filter, -n, -C, numeric suffix controls, or custom record separators.
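A script can probe the local split before choosing a code path. The sketch below is one possible approach, not an official detection method: it searches the help text for an option name, and split_supports is a hypothetical helper defined here.

```shell
# Report whether the local split advertises a given option in its help text.
# Some implementations print help on stderr, so both streams are searched.
split_supports() {
  split --help 2>&1 | grep -q -e "$1"
}

for opt in '--filter' '--additional-suffix' '--number'; do
  if split_supports "$opt"; then
    echo "$opt: available"
  else
    echo "$opt: not advertised, fall back to -l, -b, and -a"
  fi
done
```

Checking the help text keeps the probe free of side effects, because no test files are written.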

Practical split Command Examples

Split a File by Line Count

Line-based splitting works well for logs, CSV files, and other text files where each record occupies one line. This command writes five lines per chunk and names the outputs with the log-chunk- prefix:

split -l 5 access.log log-chunk-

The generated files use alphabetic suffixes such as log-chunk-aa, log-chunk-ab, and log-chunk-ac. Verify the line counts with wc:

wc -l log-chunk-*

For a 12-line sample input, the output looks like this:

 5 log-chunk-aa
 5 log-chunk-ab
 2 log-chunk-ac
12 total

For large live logs, the tail command can inspect the newest lines in the final chunk without opening the entire file.

Split a File by Byte Size

Byte-based splitting is better for archives, disk images, and backups where line boundaries do not matter. This example creates chunks of 100 MiB each, except for the final remainder file:

split -b 100M backup.tar.gz backup-part-

Check the logical size of each generated file:

stat -c '%n %s bytes' backup-part-*

For a 243 MiB input, example output is:

backup-part-aa 104857600 bytes
backup-part-ab 104857600 bytes
backup-part-ac 45088768 bytes
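The sizes above follow from integer division. This short sketch reproduces the arithmetic in the shell; the variable names are illustrative only.

```shell
# Sizes in bytes: a 243 MiB input and a 100 MiB chunk size (1 MiB = 1048576 bytes).
input_bytes=$((243 * 1024 * 1024))
chunk_bytes=$((100 * 1024 * 1024))

# Full-size chunks, then whatever is left for the final file.
full_chunks=$((input_bytes / chunk_bytes))
remainder=$((input_bytes % chunk_bytes))

echo "full chunks: $full_chunks"
echo "final chunk: $remainder bytes"
```

The result is two full 104857600-byte chunks and a 45088768-byte remainder, which matches the stat output.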

If you are splitting files to fit a disk quota, use the du command to check real storage consumption before you move the chunks.

Split a File into N Equal Parts

The -n option divides a file into a fixed number of output files. Plain -n N works by byte range, so it can cut through the middle of a text line:

split -n 4 dataset.csv parts-

List the generated files:

ls parts-*
parts-aa
parts-ab
parts-ac
parts-ad

Use the l/N form when every output file must contain complete lines:

split -n l/4 dataset.csv parts-

The line-safe form may create chunks with slightly different byte sizes because record boundaries rarely land on exact byte divisions.
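A side-by-side run makes the difference between the two forms visible. This is a sketch on hypothetical sample data (100 short lines written to dataset.csv in a temporary directory), assuming GNU split.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: 100 short lines, 292 bytes in total.
seq 1 100 > dataset.csv

split -n 4 dataset.csv bytes-     # four byte ranges
split -n l/4 dataset.csv lines-   # four line-safe chunks

# wc -l counts newline characters; grep -c '' also counts a trailing partial line.
for piece in bytes-* lines-*; do
  printf '%s: %s bytes, %s newlines, %s lines including partial\n' \
    "$piece" "$(wc -c < "$piece")" "$(wc -l < "$piece")" "$(grep -c '' "$piece")"
done
```

Each bytes- piece is exactly 73 bytes, and the first one ends in the middle of a line, so its two line counts disagree. For the lines- pieces the two counts agree because every piece ends on a newline.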

Keep Lines Intact When Splitting by Size

The -C option sets a maximum byte count while trying to keep each record whole. It is useful for logs and line-oriented exports that need size-limited chunks without broken records:

split -C 50M app.log app-log-

GNU split keeps complete records when possible, but a single record longer than the requested size can still be split. Choose a size larger than the longest expected line when the input must remain strictly record-safe.
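The size cap and the whole-record behavior can be checked together on a small input. The sketch below assumes GNU split and uses a hypothetical 100-line app.log with a deliberately tiny 50-byte limit.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample log: 100 short lines.
seq 1 100 > app.log

# At most 50 bytes of complete lines per output file.
split -C 50 app.log app-log-

# Show the size and line count of each piece.
for piece in app-log-*; do
  printf '%s: %s bytes, %s lines\n' "$piece" "$(wc -c < "$piece")" "$(wc -l < "$piece")"
done
```

Because every line in this sample is far shorter than 50 bytes, each piece stays at or under the limit and ends on a line boundary.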

Use Numeric Suffixes for Split Files

Numeric suffixes make generated names easier to scan and sort in scripts. Combine -d with -a when you expect many output files:

split -l 1000 -d -a 4 auth.log auth-
ls auth-*

For a 4000-line input split into 1000-line chunks, the names look like this:

auth-0000
auth-0001
auth-0002
auth-0003

The -a 4 option reserves four suffix characters, which gives numeric output names from 0000 through 9999. Increase the value if the input can produce more than 10,000 chunks.

Add a Custom File Extension to Split Output

By default, split output files have no extension. Add --additional-suffix when another tool or workflow expects a recognizable extension:

split -l 2000 -d --additional-suffix=.log server.log chunk-
ls chunk-*

For a 6000-line input, the output names look like this:

chunk-00.log
chunk-01.log
chunk-02.log

The extra suffix appears after the generated suffix. In this example, chunk-00 becomes chunk-00.log.

Use an Empty Prefix for Suffix-Only Output Names

GNU split accepts an empty string as the output prefix. This is useful only when suffix-only names are intentional, such as numbered objects inside a dedicated output directory:

split -b 10M -d --additional-suffix=.part archive.tar.gz ""

For a roughly 25 MiB input, the generated names look like this:

ls *.part
00.part
01.part
02.part

Use the empty prefix only in a clean directory or with an output filter that writes into a dedicated directory. Otherwise, short names such as 00 and 01 are easy to confuse with unrelated files.

Split and Archive a Directory with tar

The split command works on files and streams, not directories. To split a directory, create a tar stream first, then send that stream into split. The tar and gzip file guide covers the archive side in more detail.

tar -C "$HOME" -cf - project | split -b 100M -d - project-backup-

The tar -C "$HOME" -cf - project portion archives ~/project to standard output, and split writes numbered chunks such as project-backup-00 and project-backup-01. Restore the archive into a separate directory before replacing the original data:

mkdir -p ~/restore
cat project-backup-* | tar -C ~/restore -xf -
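The full round trip can be rehearsed on disposable data before touching a real directory. This sketch assumes GNU split and tar; the directory layout under the temporary workdir and the 8 KiB chunk size are hypothetical choices for a small test.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/src/project" "$workdir/chunks" "$workdir/restore"

# Hypothetical project content.
seq 1 5000 > "$workdir/src/project/data.txt"
echo "sample notes" > "$workdir/src/project/notes.txt"

# Archive to stdout and split the stream into 8 KiB numbered pieces.
tar -C "$workdir/src" -cf - project | split -b 8K -d - "$workdir/chunks/project-backup-"

# Reassemble in suffix order and extract into a separate directory.
cat "$workdir"/chunks/project-backup-* | tar -C "$workdir/restore" -xf -

# diff -r prints nothing and exits 0 when both trees match.
diff -r "$workdir/src/project" "$workdir/restore/project" && echo "Restore matches source"
```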

Combine split with gzip Compression

To compress first and split the compressed stream, pipe gzip -c into split. This keeps the original file in place and writes size-limited compressed pieces:

gzip -c app.log | split -b 25M -d - app-log.gz.part-

Reassemble the pieces in suffix order before decompressing:

cat app-log.gz.part-* | gunzip > app-restored.log
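A quick rehearsal confirms that the pieces decompress back to the original. This sketch assumes GNU split and gzip, with a hypothetical app.log and an 8 KiB piece size chosen so a small test produces several parts.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample log.
seq 1 20000 > app.log

# Compress once, then cut the compressed stream into 8 KiB pieces.
gzip -c app.log | split -b 8K -d - app-log.gz.part-
ls app-log.gz.part-*

# Join the pieces in order, decompress, and compare with the original.
cat app-log.gz.part-* | gunzip | cmp -s - app.log && echo "Round trip OK"
```

The pieces are slices of one gzip stream, so they are meant to be joined before decompression rather than decompressed individually.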

Reassemble Split Files with cat

Reassembly uses cat because split writes byte-for-byte pieces of the original input. Shell globbing sorts padded suffixes lexicographically, which matches the order GNU split creates by default:

cat backup-part-* > backup-restored.tar.gz

Confirm the restored file matches the original:

cmp -s backup.tar.gz backup-restored.tar.gz && echo "Files match"
Files match

For checksum-based verification, compare both files with sha256sum and make sure the hashes are identical.
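One way to script the hash comparison is shown below. It is a sketch that assumes GNU coreutils sha256sum; backup.bin is a hypothetical stand-in for the real archive.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for the archive: 10000 random bytes.
head -c 10000 /dev/urandom > backup.bin
split -b 4K backup.bin backup-part-
cat backup-part-* > backup-restored.bin

# Reading from stdin keeps the file name out of the output; cut keeps only the hash.
original_hash=$(sha256sum < backup.bin | cut -d ' ' -f 1)
restored_hash=$(sha256sum < backup-restored.bin | cut -d ' ' -f 1)

if [ "$original_hash" = "$restored_hash" ]; then
  echo "Hashes match"
else
  echo "Hashes differ" >&2
fi
```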

Advanced split Techniques

Use Verbose Mode to Monitor split Progress

When a large split operation takes time, --verbose prints each output filename before GNU split opens it:

split -b 500M --verbose database-dump.sql db-part-

For a dump large enough to create four files, output looks like this:

creating file 'db-part-aa'
creating file 'db-part-ab'
creating file 'db-part-ac'
creating file 'db-part-ad'

Pipe Data Directly into split

Use - as the input name when another command produces the data. This example builds a file list using find with -exec, then splits the list into 50-line batches:

find ~/logs -name "*.log" -type f -exec printf '%s\n' {} \; | split -l 50 -d - filelist-batch-

Each output file contains up to 50 paths. The pipeline is useful when another script or worker process needs a manageable batch list.

Use split --filter and the FILE Variable

GNU split --filter sends each chunk through a shell command instead of writing the chunk directly. During each run, GNU split sets the $FILE environment variable to the output name it would have used:

split -b 50M --filter='gzip > "$FILE.gz"' large-export.csv chunk-

For a 120 MiB input, the compressed chunk names look like this:

ls chunk-*.gz
chunk-aa.gz
chunk-ab.gz
chunk-ac.gz

Use single quotes around the filter in your parent shell so $FILE reaches GNU split. If you use double quotes, your shell can expand $FILE too early and create a wrong name such as .gz. The filter syntax uses $FILE, not {}, %f, or another placeholder.

An empty prefix pairs well with --filter when you want suffix-only values inside a controlled directory:

mkdir -p chunks
split -b 50M -d --filter='gzip > "chunks/$FILE.csv.gz"' large-export.csv ""

With the empty prefix, $FILE becomes 00, 01, 02, and so on. The filter then writes names such as chunks/00.csv.gz.
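The filter output can be verified the same way as plain chunks. The sketch below assumes GNU split and gzip; it uses a hypothetical 20000-line large-export.csv and splits by line count so the number of chunks is predictable.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p chunks

# Hypothetical sample export.
seq 1 20000 > large-export.csv

# Single quotes keep $FILE literal so split, not the parent shell, expands it.
split -l 5000 -d --filter='gzip > "chunks/$FILE.csv.gz"' large-export.csv ""

ls chunks

# Each piece is a complete gzip file, and concatenated gzip files decompress as one stream.
cat chunks/*.csv.gz | gunzip | cmp -s - large-export.csv && echo "Filtered chunks restore the original"
```

Unlike compressing first and splitting afterwards, each filtered chunk here can also be decompressed by itself.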

Use Shell Variables for Dynamic Splitting

Shell variables make split commands easier to reuse in scripts. Quote the variables so spaces or special characters in a prefix do not break the command line:

CHUNK_SIZE="50M"
PREFIX="export-chunk-"
split -b "$CHUNK_SIZE" -d --verbose data-export.csv "$PREFIX"

The same pattern works with -l, -C, and -n as long as the variable contains the complete value that the option expects.

Suppress Empty Files When Splitting into N Parts

When -n asks for more output files than the input can fill, GNU split may create zero-byte files. Add -e to write only non-empty chunks:

split -n 20 -e small-file.txt config-part-

For example, a five-byte input split into 20 byte ranges produces only five files with -e, instead of 20 files where most are empty.
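The effect of -e is easiest to see by running both variants into separate directories. This sketch assumes GNU split; the plain and elided directory names are hypothetical, and the script counts files instead of assuming a number.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir plain elided

# Five-byte input, far smaller than the 20 requested chunks.
printf 'abcde' > small-file.txt

split -n 20 small-file.txt plain/config-part-
split -n 20 -e small-file.txt elided/config-part-

# Count all files and zero-byte files in each directory.
echo "without -e: $(ls plain | wc -l) files, $(find plain -type f -size 0 | wc -l) empty"
echo "with -e:    $(ls elided | wc -l) files, $(find elided -type f -size 0 | wc -l) empty"
```

Both directories still concatenate back to the same five bytes, because -e only skips files that would have been empty.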

Round-Robin Distribution Across Files

The -n r/N form distributes records across N files in a round-robin pattern. The first record goes to the first file, the second to the second file, and the sequence repeats after the Nth file:

split -n r/4 requests.log worker-

Round-robin mode can balance worker inputs better than sequential splitting when line lengths or record costs vary widely.
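A tiny input shows the distribution pattern directly. This sketch assumes GNU split and uses a hypothetical eight-line requests.log.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: eight numbered records.
seq 1 8 > requests.log

split -n r/4 requests.log worker-

# Print each worker file on one line.
for piece in worker-*; do
  printf '%s: %s\n' "$piece" "$(paste -sd ' ' "$piece")"
done
```

The output is worker-aa: 1 5, worker-ab: 2 6, worker-ac: 3 7, and worker-ad: 4 8. Because records are interleaved, concatenating the worker files with cat does not restore the original line order.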

Troubleshoot Common split Errors

Output File Suffixes Are Exhausted

If the suffix space is too small, GNU split stops with this error:

split: output file suffixes exhausted

This usually happens when -a sets a suffix length that cannot hold all required names. Increase the suffix length before rerunning the split:

split -l 10 -d -a 4 huge-log.txt chunk-

A four-digit numeric suffix provides 10,000 names. Use a larger value if your input and chunk size can produce more files.
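The required suffix length can be estimated before running the split. The following sketch uses plain shell arithmetic on a hypothetical workload of 250000 lines in 10-line chunks; the variable names are illustrative.

```shell
# Hypothetical workload: 250000 lines split into 10-line chunks.
total_lines=250000
lines_per_chunk=10

# Round up so a partial final chunk still gets a name.
chunks=$(( (total_lines + lines_per_chunk - 1) / lines_per_chunk ))

# Find the smallest -a value whose 10^length covers the chunk count.
suffix_length=1
capacity=10
while [ "$capacity" -lt "$chunks" ]; do
  suffix_length=$((suffix_length + 1))
  capacity=$((capacity * 10))
done

echo "chunks needed: $chunks"
echo "use: split -l $lines_per_chunk -d -a $suffix_length"
```

Here 25000 chunks need five digits. For the default alphabetic suffixes, the same loop applies with 26 in place of 10.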

split Cannot Open the Input File

Permission or path problems produce an error like this:

split: cannot open '/var/log/secure' for reading: Permission denied

Check that the file exists and that your user can read it:

ls -l /var/log/secure

If the input requires elevated read access but the output should stay in the current directory, read the file with sudo cat and let split run as your normal user:

sudo cat /var/log/secure | split -l 5000 - secure-chunk-

split Creates Empty Files

Empty files commonly appear when -n asks for more chunks than the input can fill. Find zero-byte output files with this check:

find . -maxdepth 1 -name 'chunk-*' -size 0 -print

Example output:

./chunk-ad
./chunk-ae

Add -e to suppress empty outputs, or choose a smaller chunk count:

split -n 10 -e small-file.txt chunk-

Reassembled File Does Not Match the Original

If a restored file differs from the original, first print the exact names that your shell will pass to cat. Replace backup-part- with your actual split prefix:

printf '%s\n' backup-part-*

Then confirm the expected number of chunk files exists:

find . -maxdepth 1 -type f -name 'backup-part-*' | wc -l

Missing chunks, renamed files, or a glob that matches unrelated files can corrupt the reassembly. Re-split from the original source when any chunk is missing or damaged.
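A per-chunk checksum manifest helps pinpoint which piece is missing or damaged. This is a sketch assuming GNU coreutils sha256sum; backup.bin and chunks.sha256 are hypothetical names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for the original file.
head -c 10000 /dev/urandom > backup.bin
split -b 4K -d backup.bin backup-part-

# Record one checksum per chunk right after splitting.
sha256sum backup-part-* > chunks.sha256

# Later, after copying or moving the chunks, verify each one.
# --quiet prints only failures; the exit status is non-zero if any chunk fails.
if sha256sum -c --quiet chunks.sha256; then
  echo "All chunks verified"
else
  echo "At least one chunk is missing or damaged" >&2
fi
```

Keeping the manifest alongside the chunks lets the receiving side run the same check before reassembly.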

Conclusion

With these options, the split command can divide text, binary data, archives, and streams into pieces that fit the job: line-safe chunks, byte-limited parts, fixed worker batches, compressed filter output, or suffix-only names. Keep GNU-only options separate from BusyBox-compatible scripts, and verify reassembled files before deleting the original input.
