bzip2recover Command in Linux: Recover Damaged BZ2 Files

When a .bz2 file will not decompress, this workflow keeps the damaged source intact, splits recoverable blocks, tests each piece, and rebuilds usable text or tar data without pretending the archive is fully repaired.

Author: Joshua James | Read time: 7 min | Guide type: Linux Commands

Interrupted transfers and failing disks can leave a .bz2 file that normal decompression refuses to read, but some compressed blocks may still be intact. The bzip2recover command in Linux extracts those blocks into separate .bz2 files so you can test each piece, decompress the usable parts, and salvage partial data without modifying the damaged original.

bzip2recover is not a full file repair tool. It cannot rebuild a damaged block, restore missing bytes, or make a broken .tar.bz2 archive complete again. Its value is narrower and practical: split a damaged bzip2 stream into recoverable block files, then let the bzip2 command with -t decide which pieces are safe to use.

Run bzip2recover on a Copy

Run bzip2recover against a copied damaged file inside a clean working directory. The command takes one filename, writes recovered block files into the current directory, and leaves the damaged input file in place.

mkdir -p ~/bzip2recover-work
cp -- ~/Downloads/damaged.log.bz2 ~/bzip2recover-work/
cd ~/bzip2recover-work
bzip2recover damaged.log.bz2

Replace ~/Downloads/damaged.log.bz2 with the real path before running the copy command. Working from a copy avoids adding new files beside an important backup, log, or download and keeps old recovery attempts from mixing with the current run.
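The copy step can be wrapped in a small helper that also confirms the copy is byte-identical to the source before any recovery starts. This is a minimal sketch, not part of bzip2: the function name stage_copy is hypothetical, and it assumes the standard cmp utility is available.

```shell
# stage_copy SRC WORKDIR
# Hypothetical helper: copy SRC into WORKDIR and confirm the copy matches.
stage_copy() {
    src=$1
    work=$2
    mkdir -p -- "$work" || return 1
    cp -- "$src" "$work/" || return 1
    # cmp -s is silent and returns 0 only when both files are byte-identical.
    if cmp -s -- "$src" "$work/$(basename -- "$src")"; then
        printf 'copy verified: %s\n' "$work/$(basename -- "$src")"
    else
        printf 'copy differs from source: %s\n' "$src" >&2
        return 1
    fi
}
```

A call such as stage_copy ~/Downloads/damaged.log.bz2 ~/bzip2recover-work would replace the separate mkdir and cp lines. A failed comparison is worth investigating first, because it may point to a disk that is still failing.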

The official bzip2 manual explains why this works: bzip2 compresses data in independent blocks, each holding up to 900 KB of uncompressed input at the default setting, and each block carries its own integrity check (a 32-bit CRC). Packaged Linux builds commonly write filenames such as rec00001damaged.log.bz2, rec00002damaged.log.bz2, and later numbered pieces.

| Task | Command Pattern | What It Does |
| --- | --- | --- |
| Recover blocks | bzip2recover damaged.log.bz2 | Scans the damaged bzip2 stream and writes recovered block files. |
| Test one recovered piece | bzip2 -t rec00001damaged.log.bz2 | Checks whether a recovered block file is internally valid. |
| Decompress one valid piece | bzip2 -dc rec00001damaged.log.bz2 > part001.log | Writes decompressed data without deleting the recovered .bz2 file. |
| Join named valid pieces | bzip2 -dc rec00001damaged.log.bz2 rec00002damaged.log.bz2 > recovered.log | Streams only the recovered block files you already tested as valid. |
| Inspect file format | file damaged.log.bz2 | Confirms whether the input is really bzip2 data before retrying recovery. |

Do not run recovery in a directory that already contains old rec* files for the same damaged archive. Create a fresh working directory or move the older pieces aside first.

Verify or Install bzip2recover

The bzip2 package provides bzip2recover along with bzip2, the bunzip2 decompression command, the bzcat streaming command, bzdiff, and related helpers. Many Linux systems already include it, but minimal servers and containers may not.

command -v bzip2recover

A common installed path is:

/usr/bin/bzip2recover

Run the command without a file only when you want to check the build information. It prints usage text and exits with an error because no damaged file was supplied:

bzip2recover
bzip2recover 1.0.8: extracts blocks from damaged .bz2 files.
bzip2recover: usage is `bzip2recover damaged_file_name'.
	restrictions on size of recovered file: None

Older enterprise releases may print an older bzip2 version, such as 1.0.6, while keeping the same single-file bzip2recover syntax. The version line matters mainly when you are comparing local output with examples.

If the command is missing, install the bzip2 package for your distribution.

APT-Based Distributions

sudo apt update
sudo apt install bzip2

Fedora and RHEL-Family Distributions

sudo dnf install bzip2

Arch Linux and Manjaro

sudo pacman -S bzip2

openSUSE

sudo zypper install bzip2

After installation, repeat command -v bzip2recover. If the command still does not appear, open a new shell and confirm that standard system paths such as /usr/bin and /usr/sbin are present in $PATH.
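The same check can cover every tool the workflow needs in one pass. This is a small sketch with a hypothetical function name, missing_tools. It relies only on the command -v builtin already used above.

```shell
# missing_tools NAME...
# Hypothetical helper: print each command name that is not found on $PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running missing_tools bzip2 bzip2recover prints nothing when both commands are installed. Any name it prints is a command the shell could not find.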

Create a Disposable Damaged BZ2 File

A controlled practice file helps you learn the recovery workflow before touching an important archive. These commands create a text log under ~/bzip2recover-demo, then report its uncompressed size.

mkdir -p ~/bzip2recover-demo
cd ~/bzip2recover-demo
seq -f "line %08g status ok payload abcdefghijklmnopqrstuvwxyz0123456789" 1 18000 > server.log
wc -c server.log
1242000 server.log

Compress the practice log with a smaller bzip2 block size, copy the compressed file, and damage only the copy. The original server.log and server.log.bz2 stay available for comparison.

bzip2 -1 -k server.log
cp server.log.bz2 damaged.log.bz2
dd if=/dev/zero of=damaged.log.bz2 bs=1 seek=6000 count=64 conv=notrunc status=none
ls -1
damaged.log.bz2
server.log
server.log.bz2

The -1 option makes smaller 100 KB bzip2 blocks, which produces more pieces in a small demo file. For real backups, use the file you already have; do not recompress a damaged file before recovery.
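The effect of the block size on the number of pieces can be estimated with integer arithmetic. The sketch below assumes each block holds about level × 100,000 bytes of uncompressed input. Treat the result as an approximation, because bzip2 run-length encodes the input before filling a block, so real counts can differ slightly. The function name estimate_blocks is hypothetical.

```shell
# estimate_blocks SIZE_BYTES LEVEL
# Rough estimate only: assumes each block holds LEVEL * 100000 input bytes.
estimate_blocks() {
    size=$1
    level=$2
    block=$((level * 100000))
    # Integer ceiling division: round up so a partial final block counts.
    printf '%s\n' $(( (size + block - 1) / block ))
}

estimate_blocks 1242000 1   # practice log compressed with -1
estimate_blocks 1242000 9   # same log at the default -9 level
```

For the 1,242,000-byte practice log, the estimate is 13 blocks at -1 and 2 blocks at -9. With only two blocks, one damaged block could cost most of the file, which is why the demo uses -1.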

Test the Damaged BZ2 File First

Start with bzip2 -t so you know whether normal decompression fails because of bzip2 data damage. The test command is read-only.

bzip2 -t damaged.log.bz2

Relevant output for the damaged practice file looks like this:

bzip2: damaged.log.bz2: data integrity (CRC) error in data

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.

A CRC error means bzip2 recognized the stream but found damaged compressed data. If the error says bad magic number, confirm the file format before assuming bzip2 recovery is the right tool.

Recover Blocks from a Damaged BZ2 File

Run bzip2recover from the clean directory that contains the damaged file. The command scans for block boundaries, then writes one recovered .bz2 file per block that it can split out.

bzip2recover damaged.log.bz2

Relevant output from the practice file includes the detected block ranges and the generated piece names:

bzip2recover 1.0.8: extracts blocks from damaged .bz2 files.
bzip2recover: searching for block boundaries ...
   block 1 runs from 80 to 12690
   block 2 runs from 12739 to 26811
   block 3 runs from 26860 to 40214
   block 4 runs from 40263 to 54067
   block 5 runs from 54116 to 67783
   block 6 runs from 67832 to 81754
   block 7 runs from 81803 to 95112
   block 8 runs from 95161 to 109195
   block 9 runs from 109244 to 123550
   block 10 runs from 123599 to 137538
   block 11 runs from 137587 to 151772
   block 12 runs from 151821 to 166247
   block 13 runs from 166296 to 174536
   block 14 runs from 174585 to 174624 (incomplete)
bzip2recover: splitting into blocks
   writing block 1 to `rec00001damaged.log.bz2' ...
   writing block 2 to `rec00002damaged.log.bz2' ...
   writing block 3 to `rec00003damaged.log.bz2' ...
   writing block 4 to `rec00004damaged.log.bz2' ...
   writing block 5 to `rec00005damaged.log.bz2' ...
   writing block 6 to `rec00006damaged.log.bz2' ...
   writing block 7 to `rec00007damaged.log.bz2' ...
   writing block 8 to `rec00008damaged.log.bz2' ...
   writing block 9 to `rec00009damaged.log.bz2' ...
   writing block 10 to `rec00010damaged.log.bz2' ...
   writing block 11 to `rec00011damaged.log.bz2' ...
   writing block 12 to `rec00012damaged.log.bz2' ...
   writing block 13 to `rec00013damaged.log.bz2' ...
bzip2recover: finished
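The offsets in the block listing appear to be bit positions rather than byte positions. That is an assumption based on how bzip2recover scans the stream, and it is consistent with the size of this practice file. Under that assumption, the byte range overwritten by the earlier dd command can be mapped onto the listing to predict which block was hit. The function name blocks_hit and the file name recover.log are hypothetical.

```shell
# blocks_hit FIRST_BYTE BYTE_COUNT < bzip2recover-output
# Hypothetical helper: print the numbers of the blocks that overlap a damaged
# byte range, assuming the listed offsets are bit positions.
blocks_hit() {
    first_bit=$(( $1 * 8 ))
    last_bit=$(( ($1 + $2) * 8 - 1 ))
    awk -v first="$first_bit" -v last="$last_bit" '
        # Match lines such as "   block 4 runs from 40263 to 54067".
        $1 == "block" && $3 == "runs" {
            if (($5 + 0) <= (last + 0) && ($7 + 0) >= (first + 0)) print $2
        }'
}
```

With the listing saved to recover.log, blocks_hit 6000 64 < recover.log prints 4 for the practice file. Bytes 6000 to 6063 cover bits 48000 to 48511, which fall inside the range shown for block 4. For a real damaged file the damaged range is usually unknown, so the integrity test on each piece remains the deciding check.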

List the recovered files before testing them. This also confirms that the current directory contains only pieces from the current recovery attempt.

ls -1 rec*damaged.log.bz2 | head -n 5
rec00001damaged.log.bz2
rec00002damaged.log.bz2
rec00003damaged.log.bz2
rec00004damaged.log.bz2
rec00005damaged.log.bz2

Test Recovered Blocks Before Decompressing

Test every recovered block file before you join or decompress anything. A recovered piece can still contain damaged data, and skipping this step can put a corrupt block back into your output.

for f in rec*damaged.log.bz2; do
    if bzip2 -t "$f" 2>/dev/null; then
        printf 'OK %s\n' "$f"
    else
        printf 'BAD %s\n' "$f"
    fi
done

The practice file produced 13 recovered pieces. Twelve passed the integrity test, while one damaged block failed:

OK rec00001damaged.log.bz2
OK rec00002damaged.log.bz2
OK rec00003damaged.log.bz2
BAD rec00004damaged.log.bz2
OK rec00005damaged.log.bz2
OK rec00006damaged.log.bz2
OK rec00007damaged.log.bz2
OK rec00008damaged.log.bz2
OK rec00009damaged.log.bz2
OK rec00010damaged.log.bz2
OK rec00011damaged.log.bz2
OK rec00012damaged.log.bz2
OK rec00013damaged.log.bz2

Only use the OK pieces. Keep the BAD pieces until you finish inspecting the recovery, but do not feed them into the restored output.
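The OK/BAD loop can also be turned into a helper that prints only the names of passing pieces, which is convenient for saving a list to review before joining anything. This is a minimal sketch. The function name valid_pieces and the list file name ok-pieces.txt are hypothetical.

```shell
# valid_pieces FILE...
# Hypothetical helper: print the names of the pieces that pass bzip2 -t,
# one per line. Missing files and failing pieces are skipped silently.
valid_pieces() {
    for piece in "$@"; do
        [ -f "$piece" ] || continue
        if bzip2 -t -- "$piece" 2>/dev/null; then
            printf '%s\n' "$piece"
        fi
    done
}
```

For example, valid_pieces rec*damaged.log.bz2 > ok-pieces.txt records the passing pieces in numeric order, because the zero-padded names sort correctly in a shell glob.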

Decompress Valid Blocks into a Recovered File

For a single compressed text file, append only valid recovered blocks into a new output file. The loop tests each piece again, skips failures, and keeps the recovered .bz2 files untouched.

: > recovered.log
for f in rec*damaged.log.bz2; do
    if bzip2 -t "$f" 2>/dev/null; then
        bzip2 -dc "$f" >> recovered.log
    fi
done
wc -l recovered.log
16572 recovered.log

The original practice file had 18,000 lines, so the recovered output is partial. That is the expected tradeoff: bzip2recover salvages readable blocks, not the bytes lost from damaged blocks.

Preview the recovered file before treating it as trustworthy:

head -n 3 recovered.log
tail -n 3 recovered.log
line 00000001 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
line 00000002 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
line 00000003 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
line 00017998 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
line 00017999 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
line 00018000 status ok payload abcdefghijklmnopqrstuvwxyz0123456789

If the recovered text has ordered line numbers, timestamps, IDs, or sequence fields, scan for gaps before treating the file as complete. The practice log uses a line number in the second field, so this check exposes the missing range caused by the damaged block:

awk '
NR == 1 { previous = $2 + 0; next }
($2 + 0) != previous + 1 {
    printf "gap after line %08d before line %08d\n", previous, $2
}
{ previous = $2 + 0 }
' recovered.log
gap after line 00004301 before line 00005730
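The gap report can be cross-checked against the line counts with a little arithmetic. The sketch below uses awk for the subtraction because shell arithmetic such as $((...)) may treat zero-padded values like 00005730 as octal. The function name missing_lines is hypothetical.

```shell
# missing_lines PREVIOUS NEXT
# Hypothetical helper: count the sequence numbers missing between two values.
missing_lines() {
    # awk converts zero-padded strings such as 00004301 to decimal numbers.
    awk -v prev="$1" -v nxt="$2" 'BEGIN { print nxt - prev - 1 }'
}

missing_lines 00004301 00005730
```

For the practice file this prints 1428. The 18,000 original lines minus 1,428 missing lines gives the 16,572 lines that wc -l reported.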

For binary data, choose an output name that matches the content, such as recovered.bin, and verify it with the application that normally reads that file. For text logs and exports, compare the recovered result against an older copy when one exists. The bzdiff command guide is useful when the comparison files are still compressed with bzip2.

Recover Data from tar.bz2 Archives Carefully

A .tar.bz2 file has two layers: a tar archive stream compressed by bzip2. bzip2recover only understands the bzip2 layer, so recovered blocks may decompress into partial tar data with missing tar records, shifted file data, or no usable tar listing at all.

Use the same block-testing workflow inside the recovery workspace, but treat the rebuilt tar file as damaged and incomplete. Replace backup.tar.bz2 in the pattern with the damaged archive’s exact basename, then confirm the pattern matches recovered pieces before rebuilding the tar stream.

ls -1 rec*backup.tar.bz2 | head -n 5

A matching pattern should show numbered recovered pieces for that archive:

rec00001backup.tar.bz2
rec00002backup.tar.bz2
rec00003backup.tar.bz2
rec00004backup.tar.bz2
rec00005backup.tar.bz2

Write valid pieces into a separate tar file first, then ask tar what it can still list before extracting anything.

: > recovered.tar
for f in rec*backup.tar.bz2; do
    if bzip2 -t "$f" 2>/dev/null; then
        bzip2 -dc "$f" >> recovered.tar
    fi
done
if tar -tf recovered.tar > recovered-tar-list.txt 2> recovered-tar-errors.txt; then
    sed -n '1,40p' recovered-tar-list.txt
else
    sed -n '1,40p' recovered-tar-list.txt
    sed -n '1,20p' recovered-tar-errors.txt
fi

A damaged tar stream may still list a few paths before tar reaches the broken record. Example output can look like this:

backup/
backup/etc/
backup/etc/app.conf
backup/logs/
backup/logs/app.log
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
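A quick header check can hint at whether the rebuilt stream still begins with a tar header. This sketch assumes the archive was created in the common ustar or GNU tar format, where the string ustar sits at byte offset 257 of each header. The function name has_ustar_magic is hypothetical. A missing marker does not prove the data is useless, because a rebuilt stream whose first block was dropped starts in the middle of the archive.

```shell
# has_ustar_magic FILE
# Hypothetical helper: succeed when FILE has "ustar" at byte offset 257,
# where ustar and GNU tar headers store their format marker.
has_ustar_magic() {
    magic=$(dd if="$1" bs=1 skip=257 count=5 2>/dev/null | tr -d '\0')
    [ "$magic" = "ustar" ]
}
```

Running has_ustar_magic recovered.tar returns success when the marker is present. The tar listing remains the real test of what can be read.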

If the list file is empty or the errors say the result does not look like a tar archive, stop and keep the damaged source plus recovered block files. If tar lists useful paths, extract into a new directory rather than the current working directory:

mkdir -p recovered-tar
if tar -xf recovered.tar -C recovered-tar 2> recovered-tar-extract-errors.txt; then
    find recovered-tar -maxdepth 3 -type f | sort | head
else
    find recovered-tar -maxdepth 3 -type f | sort | head
    sed -n '1,20p' recovered-tar-extract-errors.txt
fi

Extraction can also produce partial files before the archive error. Keep both the files and the error log until you decide whether the recovered data is usable:

recovered-tar/backup/etc/app.conf
recovered-tar/backup/logs/app.log
tar: Unexpected EOF in archive
tar: rmtlseek not stopped at a record boundary
tar: Error is not recoverable: exiting now

If tar reports archive errors after recovery, treat any extracted files as partial results. Preserve the damaged source, recovered pieces, listing file, error files, and extracted directory for a backup or forensic workflow. The tar command examples explain listing and extraction behavior, but they cannot replace missing tar records after a damaged bzip2 block.

Troubleshoot bzip2recover Errors

Fix Cannot Read File Errors

The message can't read usually means the filename is wrong, the current directory is not what you expected, or the user cannot read the file.

bzip2recover missing-file.bz2
bzip2recover: can't read `missing-file.bz2'

Check the current directory and the file's permissions:

pwd
ls -lh -- damaged.log.bz2

If the file lives elsewhere, copy it into the clean recovery directory and retry from there:

cp -- /path/to/damaged.log.bz2 ~/bzip2recover-work/
cd ~/bzip2recover-work
bzip2recover damaged.log.bz2
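The three causes listed above can be told apart with standard shell file tests. This is a minimal sketch with a hypothetical function name, explain_unreadable. It reports on the path you pass in and changes nothing.

```shell
# explain_unreadable FILE
# Hypothetical helper: report the most likely reason a file cannot be read.
explain_unreadable() {
    if [ ! -e "$1" ]; then
        printf 'missing: %s (current directory: %s)\n' "$1" "$PWD"
    elif [ ! -f "$1" ]; then
        printf 'not a regular file: %s\n' "$1"
    elif [ ! -r "$1" ]; then
        printf 'no read permission: %s\n' "$1"
    else
        printf 'readable: %s\n' "$1"
    fi
}
```

If the result is readable but bzip2recover still fails, the problem is more likely in the file contents than in the path or permissions.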

Fix Bad Magic Number or Wrong Format Errors

A bad magic number error during the bzip2 -t precheck means the file does not begin like a bzip2 stream. Confirm the real format before retrying recovery.

file damaged.log.bz2

Example output for a mislabeled gzip file looks like this:

damaged.log.bz2: gzip compressed data, from Unix, original size modulo 2^32 17

If the file is gzip data, use gzip tools instead of bzip2 tools. The gunzip command examples cover gzip integrity checks and decompression choices.
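The magic number that bzip2 checks is the first few bytes of the file: a bzip2 stream starts with the characters BZh followed by a digit from 1 to 9 that records the block size. A minimal sketch of that check, using a hypothetical function name, looks like this:

```shell
# looks_like_bzip2 FILE
# Hypothetical helper: succeed when FILE starts with the bzip2 signature.
looks_like_bzip2() {
    # Read the first four bytes; drop NUL bytes so the shell can store them.
    magic=$(dd if="$1" bs=1 count=4 2>/dev/null | tr -d '\0')
    case $magic in
        BZh[1-9]) return 0 ;;
        *) return 1 ;;
    esac
}
```

This only inspects the header. A file can pass this check and still contain damaged blocks, and the file command gives a fuller answer when the signature does not match.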

Handle No Useful Blocks Recovered

bzip2recover helps most when a damaged file contains several bzip2 blocks. If the file was small enough to fit into one block, damage inside that block can leave nothing useful to recover. A recovery run that cannot find block boundaries is a signal to look for a clean copy instead of repeating the same command.

bzip2recover damaged.log.bz2
bzip2recover 1.0.8: extracts blocks from damaged .bz2 files.
bzip2recover: searching for block boundaries ...
bzip2recover: sorry, I couldn't find any block boundaries.

Confirm that the recovery run did not write usable numbered pieces:

find . -maxdepth 1 -type f -name 'rec*damaged.log.bz2' -print

No output from the find command means no matching recovery pieces exist in the current directory. Preserve the damaged source and retrieve another copy from the original system, backup target, mirror, or sender. Repeated recovery runs against the same bytes will not repair a damaged single block. For future backup jobs where localized damage recovery matters, a smaller compression block size such as bzip2 -1 can reduce how much data one damaged block costs, but it will not help a file that is already damaged.

Avoid Mixing Old and New Recovered Pieces

Old rec* files can make a later recovery look better than it is. Check the working directory before running another recovery attempt.

find . -maxdepth 1 -type f -name 'rec*.bz2' -print

If the listed files are from an older attempt, move them to an archive directory or start in a new recovery workspace. Do not delete recovered pieces until you know whether you still need them.
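Moving older pieces aside can be scripted so nothing is deleted by accident. This is a sketch under assumptions: the function name archive_old_pieces and the old-pieces-… directory naming are hypothetical, and it only touches files matching rec*.bz2 directly inside the chosen directory.

```shell
# archive_old_pieces [DIR]
# Hypothetical helper: move leftover rec*.bz2 files in DIR (default: current
# directory) into a new timestamped subdirectory instead of deleting them.
archive_old_pieces() {
    dir=${1:-.}
    dest="$dir/old-pieces-$(date +%Y%m%d-%H%M%S)"
    moved=0
    for piece in "$dir"/rec*.bz2; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$piece" ] || continue
        mkdir -p -- "$dest" || return 1
        mv -- "$piece" "$dest/" || return 1
        moved=$((moved + 1))
    done
    printf 'moved %d file(s)\n' "$moved"
}
```

Run it from the recovery workspace before a new bzip2recover attempt. The damaged source file is left in place because its name does not start with rec.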

Clean Up the Practice Files

Remove only the disposable practice directory created for the examples, including any recovered tar listing or error files created inside it. Keep real recovery work until you have copied the recovered output somewhere safe and decided whether the damaged source is still needed.

cd ~
rm -rf ~/bzip2recover-demo

Conclusion

A damaged .bz2 file is recoverable only when intact bzip2 blocks remain: work from a copy, split it with bzip2recover, test each recovered piece, and decompress only passing blocks. Keep the damaged source and recovered pieces until you have verified the restored file or inspected any rebuilt tar archive in a separate directory.
