Interrupted transfers and failing disks can leave a .bz2 file that normal decompression refuses to read, but some compressed blocks may still be intact. The bzip2recover command in Linux extracts those blocks into separate .bz2 files so you can test each piece, decompress the usable parts, and salvage partial data without modifying the damaged original.
bzip2recover is not a full file repair tool. It cannot rebuild a damaged block, restore missing bytes, or make a broken .tar.bz2 archive complete again. Its value is narrower and practical: split a damaged bzip2 stream into recoverable block files, then let the bzip2 command with -t decide which pieces are safe to use.
Run bzip2recover on a Copy
Run bzip2recover against a copy of the damaged file inside a clean working directory. The command takes one filename, writes the recovered block files next to that file (the working directory here, because the file is named without a path), and leaves the damaged input file in place.
mkdir -p ~/bzip2recover-work
cp -- ~/Downloads/damaged.log.bz2 ~/bzip2recover-work/
cd ~/bzip2recover-work
bzip2recover damaged.log.bz2
Replace ~/Downloads/damaged.log.bz2 with the real path before running the copy command. Working from a copy avoids adding new files beside an important backup, log, or download and keeps old recovery attempts from mixing with the current run.
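The copy step can be wrapped in a small helper that refuses to reuse an existing directory and confirms the copy is byte-identical. This is a minimal sketch, not part of bzip2: the function name make_recovery_copy and its two arguments are hypothetical.

```shell
# Hypothetical helper: copy a damaged file into a new, empty work directory
# and confirm the copy is byte-identical before any recovery attempt.
make_recovery_copy() {
    src=$1
    workdir=$2
    # Refuse to reuse a path so old rec* pieces cannot mix into this run.
    if [ -e "$workdir" ]; then
        echo "refusing to reuse existing path: $workdir" >&2
        return 1
    fi
    mkdir -p -- "$workdir" || return 1
    cp -- "$src" "$workdir/" || return 1
    # cmp -s is silent and exits non-zero if any byte differs.
    cmp -s -- "$src" "$workdir/${src##*/}"
}
```

A call such as make_recovery_copy ~/Downloads/damaged.log.bz2 ~/bzip2recover-work returns success only when the directory was new and the copy matches the source.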
The official bzip2 manual explains why this works: bzip2 compresses data in independently handled blocks, each covering up to 900 KB of uncompressed input at the default setting, and each block has its own integrity check. Packaged Linux builds commonly write filenames such as rec00001damaged.log.bz2, rec00002damaged.log.bz2, and later numbered pieces.
| Task | Command Pattern | What It Does |
|---|---|---|
| Recover blocks | bzip2recover damaged.log.bz2 | Scans the damaged bzip2 stream and writes recovered block files. |
| Test one recovered piece | bzip2 -t rec00001damaged.log.bz2 | Checks whether a recovered block file is internally valid. |
| Decompress one valid piece | bzip2 -dc rec00001damaged.log.bz2 > part001.log | Writes decompressed data without deleting the recovered .bz2 file. |
| Join named valid pieces | bzip2 -dc rec00001damaged.log.bz2 rec00002damaged.log.bz2 > recovered.log | Streams only the recovered block files you already tested as valid. |
| Inspect file format | file damaged.log.bz2 | Confirms whether the input is really bzip2 data before retrying recovery. |
Do not run recovery in a directory that already contains old rec* files for the same damaged archive. Create a fresh working directory or move the older pieces aside first.
Verify or Install bzip2recover
The bzip2 package provides bzip2recover along with bzip2, the bunzip2 decompression command, the bzcat streaming command, bzdiff, and related helpers. Many Linux systems already include it, but minimal servers and containers may not.
command -v bzip2recover
A common installed path is:
/usr/bin/bzip2recover
Run the command without a file only when you want to check the build information. It prints usage text and exits with an error because no damaged file was supplied:
bzip2recover
bzip2recover 1.0.8: extracts blocks from damaged .bz2 files.
bzip2recover: usage is `bzip2recover damaged_file_name'.
restrictions on size of recovered file: None
Older enterprise releases may print an older bzip2 version, such as 1.0.6, while keeping the same single-file bzip2recover syntax. The version line matters mainly when you are comparing local output with examples.
If the command is missing, install the bzip2 package for your distribution.
APT-Based Distributions
sudo apt update
sudo apt install bzip2
Fedora and RHEL-Family Distributions
sudo dnf install bzip2
Arch Linux and Manjaro
sudo pacman -S bzip2
openSUSE
sudo zypper install bzip2
After installation, repeat command -v bzip2recover. If the command still does not appear, open a new shell and confirm that standard system paths such as /usr/bin and /usr/sbin are present in $PATH.
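To check several of the package's commands at once, a loop over command -v is enough. The sketch below is illustrative only; missing_tools is a hypothetical name, and the list of tools to pass is up to you.

```shell
# List which of the named commands are not found on $PATH.
# Prints one missing name per line; returns 0 only if none are missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example call: missing_tools bzip2 bzip2recover bzcat
```

No output and a zero exit status mean every named command resolved.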
Create a Disposable Damaged BZ2 File
A controlled practice file helps you learn the recovery workflow before touching an important archive. These commands create a text log under ~/bzip2recover-demo, then report its uncompressed size.
mkdir -p ~/bzip2recover-demo
cd ~/bzip2recover-demo
seq -f "line %08g status ok payload abcdefghijklmnopqrstuvwxyz0123456789" 1 18000 > server.log
wc -c server.log
1242000 server.log
Compress the practice log with a smaller bzip2 block size, copy the compressed file, and damage only the copy. The original server.log and server.log.bz2 stay available for comparison.
bzip2 -1 -k server.log
cp server.log.bz2 damaged.log.bz2
dd if=/dev/zero of=damaged.log.bz2 bs=1 seek=6000 count=64 conv=notrunc status=none
ls -1
damaged.log.bz2
server.log
server.log.bz2
The -1 option makes smaller 100 KB bzip2 blocks, which produces more pieces in a small demo file. For real backups, use the file you already have; do not recompress a damaged file before recovery.
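A rough block count follows from the file size and the compression level. The sketch below assumes level N holds about N × 100,000 bytes of input per block; actual boundaries shift slightly because bzip2 applies run-length encoding before filling a block, so treat the result as an estimate. The name estimate_blocks is hypothetical.

```shell
# Rough estimate of how many bzip2 blocks a file needs at a given level.
# Assumption: level N holds about N * 100000 bytes of input per block;
# real boundaries shift slightly because bzip2 run-length encodes first.
estimate_blocks() {
    size_bytes=$1
    level=$2
    block_bytes=$((level * 100000))
    # Integer ceiling division.
    echo $(( (size_bytes + block_bytes - 1) / block_bytes ))
}

estimate_blocks 1242000 1   # practice log at -1: prints 13
estimate_blocks 1242000 9   # same log at the default -9: prints 2
```

With only two blocks at -9, a single damaged block could cost most of the practice file, which is why the demo uses -1.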
Test the Damaged BZ2 File First
Start with bzip2 -t so you know whether normal decompression fails because of bzip2 data damage. The test command is read-only.
bzip2 -t damaged.log.bz2
Relevant output for the damaged practice file looks like this:
bzip2: damaged.log.bz2: data integrity (CRC) error in data
You can use the `bzip2recover' program to attempt to recover data from undamaged sections of corrupted files.
A CRC error means bzip2 recognized the stream but found damaged compressed data. If the error says bad magic number, confirm the file format before assuming bzip2 recovery is the right tool.
Recover Blocks from a Damaged BZ2 File
Run bzip2recover from the clean directory that contains the damaged file. The command scans for block boundaries, then writes one recovered .bz2 file per block that it can split out.
bzip2recover damaged.log.bz2
Relevant output from the practice file includes the detected block ranges and the generated piece names:
bzip2recover 1.0.8: extracts blocks from damaged .bz2 files.
bzip2recover: searching for block boundaries ...
block 1 runs from 80 to 12690
block 2 runs from 12739 to 26811
block 3 runs from 26860 to 40214
block 4 runs from 40263 to 54067
block 5 runs from 54116 to 67783
block 6 runs from 67832 to 81754
block 7 runs from 81803 to 95112
block 8 runs from 95161 to 109195
block 9 runs from 109244 to 123550
block 10 runs from 123599 to 137538
block 11 runs from 137587 to 151772
block 12 runs from 151821 to 166247
block 13 runs from 166296 to 174536
block 14 runs from 174585 to 174624 (incomplete)
bzip2recover: splitting into blocks
writing block 1 to `rec00001damaged.log.bz2' ...
writing block 2 to `rec00002damaged.log.bz2' ...
writing block 3 to `rec00003damaged.log.bz2' ...
writing block 4 to `rec00004damaged.log.bz2' ...
writing block 5 to `rec00005damaged.log.bz2' ...
writing block 6 to `rec00006damaged.log.bz2' ...
writing block 7 to `rec00007damaged.log.bz2' ...
writing block 8 to `rec00008damaged.log.bz2' ...
writing block 9 to `rec00009damaged.log.bz2' ...
writing block 10 to `rec00010damaged.log.bz2' ...
writing block 11 to `rec00011damaged.log.bz2' ...
writing block 12 to `rec00012damaged.log.bz2' ...
writing block 13 to `rec00013damaged.log.bz2' ...
bzip2recover: finished
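If you know roughly where the damage sits in the file, the reported ranges can point to the affected block. The sketch below assumes the "runs from A to B" numbers are bit offsets, which fits the 80-bit start of block 1 (a 4-byte stream header plus a 6-byte block marker), so byte N begins at bit N × 8. The function name block_for_byte is hypothetical, and it reads saved bzip2recover messages on standard input.

```shell
# Report which block range contains a given byte offset.
# Assumption: the "runs from A to B" numbers are bit offsets, so
# byte N starts at bit N * 8. Reads saved bzip2recover output on stdin.
block_for_byte() {
    awk -v byte="$1" '
        $1 == "block" && $3 == "runs" {
            bit = byte * 8
            # Fields: block <n> runs from <start> to <end>
            if (bit >= $5 && bit <= $7) { print $2; found = 1 }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For the practice run, where dd overwrote bytes starting at offset 6000, feeding the ranges above to block_for_byte 6000 prints 4, because bit 48000 falls between 40263 and 54067.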
List the recovered files before testing them. This also confirms that the current directory contains only pieces from the current recovery attempt.
ls -1 rec*damaged.log.bz2 | head -n 5
rec00001damaged.log.bz2
rec00002damaged.log.bz2
rec00003damaged.log.bz2
rec00004damaged.log.bz2
rec00005damaged.log.bz2
Test Recovered Blocks Before Decompressing
Test every recovered block file before you join or decompress anything. A recovered piece can still contain damaged data, and skipping this step can put a corrupt block back into your output.
# Test each recovered piece; bzip2's own error text is discarded.
for f in rec*damaged.log.bz2; do
    if bzip2 -t "$f" 2>/dev/null; then
        printf 'OK %s\n' "$f"
    else
        printf 'BAD %s\n' "$f"
    fi
done
The practice file produced 13 recovered pieces. Twelve passed the integrity test, while one damaged block failed:
OK rec00001damaged.log.bz2
OK rec00002damaged.log.bz2
OK rec00003damaged.log.bz2
BAD rec00004damaged.log.bz2
OK rec00005damaged.log.bz2
OK rec00006damaged.log.bz2
OK rec00007damaged.log.bz2
OK rec00008damaged.log.bz2
OK rec00009damaged.log.bz2
OK rec00010damaged.log.bz2
OK rec00011damaged.log.bz2
OK rec00012damaged.log.bz2
OK rec00013damaged.log.bz2
Only use the OK pieces. Keep the BAD pieces until you finish inspecting the recovery, but do not feed them into the restored output.
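With many pieces, a count is easier to read than a long list. The sketch below assumes you redirected the OK/BAD report into a file; the file name block-report.txt and the function name summarize_report are hypothetical.

```shell
# Summarize a saved OK/BAD report (one "STATUS filename" pair per line).
summarize_report() {
    awk '
        $1 == "OK"  { ok++ }
        $1 == "BAD" { bad++; bad_files = bad_files " " $2 }
        END {
            printf "ok=%d bad=%d\n", ok, bad
            if (bad) printf "bad pieces:%s\n", bad_files
        }
    ' "$1"
}

# Example call: summarize_report block-report.txt
```

For the practice run the first line would read ok=12 bad=1, followed by the name of the failing piece.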
Decompress Valid Blocks into a Recovered File
For a single compressed text file, append only valid recovered blocks into a new output file. The loop tests each piece again, skips failures, and keeps the recovered .bz2 files untouched.
# Start with an empty output file, then append only pieces that pass the test.
: > recovered.log
for f in rec*damaged.log.bz2; do
    if bzip2 -t "$f" 2>/dev/null; then
        bzip2 -dc "$f" >> recovered.log
    fi
done
wc -l recovered.log
16572 recovered.log
The original practice file had 18,000 lines, so the recovered output is partial. That is the expected tradeoff: bzip2recover salvages readable blocks, not the bytes lost from damaged blocks.
Preview the recovered file before treating it as trustworthy:
head -n 3 recovered.log
tail -n 3 recovered.log
line 00000001 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
line 00000002 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
line 00000003 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
line 00017998 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
line 00017999 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
line 00018000 status ok payload abcdefghijklmnopqrstuvwxyz0123456789
If the recovered text has ordered line numbers, timestamps, IDs, or sequence fields, scan for gaps before treating the file as complete. The practice log uses a line number in the second field, so this check exposes the missing range caused by the damaged block:
awk '
    NR == 1 { previous = $2 + 0; next }
    ($2 + 0) != previous + 1 {
        printf "gap after line %08d before line %08d\n", previous, $2
    }
    { previous = $2 + 0 }
' recovered.log
gap after line 00004301 before line 00005730
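The same idea can total how many sequence numbers are missing across all gaps. This is a sketch under the same assumption as the check above: field 2 holds an increasing integer. It does not notice lines missing before the first or after the last recovered line, and count_missing_lines is a hypothetical name.

```shell
# Total the sequence numbers missing from field 2 of a recovered log.
# Assumes field 2 is an increasing integer, as in the practice log.
count_missing_lines() {
    awk '
        NR > 1 && ($2 + 0) > previous + 1 { missing += ($2 + 0) - previous - 1 }
        { previous = $2 + 0 }
        END { print missing + 0 }
    ' "$1"
}

# Example call: count_missing_lines recovered.log
```

A single gap from 4301 to 5730 accounts for 1,428 missing lines, which equals the difference between the 18,000 original lines and the 16,572 recovered ones.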
For binary data, choose an output name that matches the content, such as recovered.bin, and verify it with the application that normally reads that file. For text logs and exports, compare the recovered result against an older copy when one exists. The bzdiff command guide is useful when the comparison files are still compressed with bzip2.
Recover Data from tar.bz2 Archives Carefully
A .tar.bz2 file has two layers: a tar archive stream compressed by bzip2. bzip2recover only understands the bzip2 layer, so recovered blocks may decompress into partial tar data with missing tar records, shifted file data, or no usable tar listing at all.
Use the same block-testing workflow inside the recovery workspace, but treat the rebuilt tar file as damaged evidence. Replace backup.tar.bz2 in the pattern with the damaged archive’s exact basename, then confirm the pattern matches recovered pieces before rebuilding the tar stream.
ls -1 rec*backup.tar.bz2 | head -n 5
A matching pattern should show numbered recovered pieces for that archive:
rec00001backup.tar.bz2
rec00002backup.tar.bz2
rec00003backup.tar.bz2
rec00004backup.tar.bz2
rec00005backup.tar.bz2
Write valid pieces into a separate tar file first, then ask tar what it can still list before extracting anything.
# Rebuild a tar stream from the pieces that pass the integrity test.
: > recovered.tar
for f in rec*backup.tar.bz2; do
    if bzip2 -t "$f" 2>/dev/null; then
        bzip2 -dc "$f" >> recovered.tar
    fi
done
# List the archive; always show the listing, and show errors if tar failed.
if tar -tf recovered.tar > recovered-tar-list.txt 2> recovered-tar-errors.txt; then
    sed -n '1,40p' recovered-tar-list.txt
else
    sed -n '1,40p' recovered-tar-list.txt
    sed -n '1,20p' recovered-tar-errors.txt
fi
A damaged tar stream may still list a few paths before tar reaches the broken record. Example output can look like this:
backup/
backup/etc/
backup/etc/app.conf
backup/logs/
backup/logs/app.log
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
If the list file is empty or the errors say the result does not look like a tar archive, stop and keep the damaged source plus recovered block files. If tar lists useful paths, extract into a new directory rather than the current working directory:
mkdir -p recovered-tar
if tar -xf recovered.tar -C recovered-tar 2> recovered-tar-extract-errors.txt; then
    find recovered-tar -maxdepth 3 -type f | sort | head
else
    find recovered-tar -maxdepth 3 -type f | sort | head
    sed -n '1,20p' recovered-tar-extract-errors.txt
fi
Extraction can also produce partial files before the archive error. Keep both the files and the error log until you decide whether the recovered data is usable:
recovered-tar/backup/etc/app.conf
recovered-tar/backup/logs/app.log
tar: Unexpected EOF in archive
tar: rmtlseek not stopped at a record boundary
tar: Error is not recoverable: exiting now
If tar reports archive errors after recovery, treat any extracted files as partial results. Preserve the damaged source, recovered pieces, listing file, error files, and extracted directory for a backup or forensic workflow. The tar command examples explain listing and extraction behavior, but they cannot replace missing tar records after a damaged bzip2 block.
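When tar says the rebuilt file does not look like a tar archive, a quick look at the first header can show whether the stream at least starts correctly. The sketch below checks for the "ustar" magic that POSIX and GNU tar headers store at byte offset 257. Old v7-format archives have no such magic, and a rebuilt stream whose first block was lost will not begin with a header at all, so a failure is a hint rather than proof. The name looks_like_tar is hypothetical.

```shell
# Check for the "ustar" magic that POSIX and GNU tar headers store at
# byte offset 257. Old v7-format archives lack it, so a failure here is
# a hint, not proof.
looks_like_tar() {
    magic=$(dd if="$1" bs=1 skip=257 count=5 2>/dev/null)
    [ "$magic" = "ustar" ]
}

# Example call: looks_like_tar recovered.tar && echo "starts with a tar header"
```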
Troubleshoot bzip2recover Errors
Fix Cannot Read File Errors
The message can't read usually means the filename is wrong, the current directory is not what you expected, or the user cannot read the file.
bzip2recover missing-file.bz2
bzip2recover: can't read `missing-file.bz2'
pwd
ls -lh -- damaged.log.bz2
If the file lives elsewhere, copy it into the clean recovery directory and retry from there:
cp -- /path/to/damaged.log.bz2 ~/bzip2recover-work/
cd ~/bzip2recover-work
bzip2recover damaged.log.bz2
Fix Bad Magic Number or Wrong Format Errors
A bad magic number error during the bzip2 -t precheck means the file does not begin like a bzip2 stream. Confirm the real format before retrying recovery.
file damaged.log.bz2
Example output for a mislabeled gzip file looks like this:
damaged.log.bz2: gzip compressed data, from Unix, original size modulo 2^32 17
If the file is gzip data, use gzip tools instead of bzip2 tools. The gunzip command examples cover gzip integrity checks and decompression choices.
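Where the file command is unavailable, the leading magic bytes can be checked directly. This is a minimal sketch covering only three formats: bzip2 streams begin with the ASCII characters BZh, gzip with the bytes 1f 8b, and xz with fd 37 7a 58 5a 00. The function name detect_compression is hypothetical.

```shell
# Guess a compression format from the leading magic bytes.
detect_compression() {
    # Hex-dump the first six bytes with no offsets, spaces, or newlines.
    magic=$(od -An -tx1 -N6 "$1" | tr -d ' \n')
    case $magic in
        425a68*)      echo bzip2 ;;   # "BZh"
        1f8b*)        echo gzip ;;
        fd377a585a00) echo xz ;;
        *)            echo unknown ;;
    esac
}

# Example call: detect_compression damaged.log.bz2
```

A result of unknown does not rule out bzip2 data further into the file, since damage may have hit the header itself.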
Handle No Useful Blocks Recovered
bzip2recover helps most when a damaged file contains several bzip2 blocks. If the file was small enough to fit into one block, damage inside that block can leave nothing useful to recover. A recovery run that cannot find block boundaries is a signal to look for a clean copy instead of repeating the same command.
bzip2recover damaged.log.bz2
bzip2recover 1.0.8: extracts blocks from damaged .bz2 files.
bzip2recover: searching for block boundaries ...
bzip2recover: sorry, I couldn't find any block boundaries.
Confirm that the recovery run did not write usable numbered pieces:
find . -maxdepth 1 -type f -name 'rec*damaged.log.bz2' -print
No output from the find command means no matching recovery pieces exist in the current directory. Preserve the damaged source and retrieve another copy from the original system, backup target, mirror, or sender. Repeated recovery runs against the same bytes will not repair a damaged single block. For future backup jobs where localized damage recovery matters, a smaller compression block size such as bzip2 -1 can reduce how much data one damaged block costs, but it will not help a file that is already damaged.
Avoid Mixing Old and New Recovered Pieces
Old rec* files can make a later recovery look better than it is. Check the working directory before running another recovery attempt.
find . -maxdepth 1 -type f -name 'rec*.bz2' -print
If the listed files are from an older attempt, move them to an archive directory or start in a new recovery workspace. Do not delete recovered pieces until you know whether you still need them.
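Moving older pieces aside can be scripted so nothing is deleted. The sketch below moves rec*.bz2 files from the current directory into a dated subdirectory; the old-pieces- prefix and the function name stash_old_pieces are arbitrary, hypothetical choices.

```shell
# Move rec*.bz2 pieces from the current directory into a dated
# subdirectory (the old-pieces- prefix is an arbitrary choice).
stash_old_pieces() {
    stash="old-pieces-$(date +%Y%m%d-%H%M%S)"
    moved=0
    for f in rec*.bz2; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        mkdir -p -- "$stash" || return 1
        mv -- "$f" "$stash/" || return 1
        moved=$((moved + 1))
    done
    echo "moved $moved piece(s)"
}
```

Run it from the recovery directory before a new attempt; the stash directory is created only if at least one piece is found.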
Clean Up the Practice Files
Remove only the disposable practice directory created for the examples, including any recovered tar listing or error files created inside it. Keep real recovery work until you have copied the recovered output somewhere safe and decided whether the damaged source is still needed.
cd ~
rm -rf ~/bzip2recover-demo
Conclusion
A damaged .bz2 file is recoverable only when intact bzip2 blocks remain: work from a copy, split it with bzip2recover, test each recovered piece, and decompress only passing blocks. Keep the damaged source and recovered pieces until you have verified the restored file or inspected any rebuilt tar archive in a separate directory.

