Corrupt an Excel File for QA Tests

Corrupt an Excel File for QA Tests (Checksum & Cell-Level Tips)

Intro
When you need to validate error handling, recovery flows, or user messaging around broken spreadsheets, a deliberately corrupted Excel file is a safe, reliable test input. This guide shows simple, reversible ways to “break” an .xlsx without special tools, plus how to verify the change by size and checksum. We’ll also cover gentle, cell-level tweaks that trip parsing without nuking the whole file. Always work on copies, keep backups, and never touch real production data.

What You’ll Need

  • A copy of the file (never the original)
  • A backup location (local folder or versioned cloud drive)
  • Any simple tools already on your computer (File Explorer/Finder, PowerShell/Terminal, built-in checksum utilities)

Quick Methods (choose 2–3)

Below are low-risk ways to corrupt an Excel file. Excel .xlsx files are ZIP containers with XML parts inside. Tiny changes at the archive level can create realistic failures in Excel and other viewers.

Tip: Keep notes on what you changed so results are repeatable.

1) Small change to the file header (fast & effective)

Goal: Flip 1 byte in the ZIP signature so apps refuse to open the file.

What’s happening?
Every ZIP starts with the bytes 50 4B (“PK”). Change one of them and the archive looks invalid.

macOS/Linux (Terminal)

# Make a working copy first!
cp report.xlsx report_corrupt.xlsx

# Flip the 2nd byte from 0x4B to 0x4C (or any different value)
printf '\x4C' | dd of=report_corrupt.xlsx bs=1 seek=1 conv=notrunc

Windows (PowerShell)

# Make a working copy first!
Copy-Item report.xlsx report_corrupt.xlsx

# Read bytes, modify one, write back
$bytes = [IO.File]::ReadAllBytes("report_corrupt.xlsx")
$bytes[1] = 0x4C   # change second byte (from 0x4B)
[IO.File]::WriteAllBytes("report_corrupt.xlsx", $bytes)

Expected result:
Excel shows an “invalid file format” or “We found a problem with some content” message and may refuse to open or prompt to repair.

2) Trim or pad the file slightly (breaks end-of-archive metadata)

Goal: Change the file length by a tiny amount so central directory offsets inside the ZIP no longer match.

macOS/Linux (Terminal)

cp report.xlsx report_trim.xlsx
# Trim 16 bytes from the end
truncate -s -16 report_trim.xlsx

cp report.xlsx report_pad.xlsx
# Add 8 bytes of junk padding at the end
truncate -s +8 report_pad.xlsx

Windows (PowerShell)

# Trim 16 bytes
Copy-Item report.xlsx report_trim.xlsx
$fs = [IO.File]::OpenWrite("report_trim.xlsx")
$fs.SetLength($fs.Length - 16)
$fs.Close()

# Pad 8 bytes
Copy-Item report.xlsx report_pad.xlsx
$bytes = [IO.File]::ReadAllBytes("report_pad.xlsx")
$bytes += ,0,0,0,0,0,0,0,0
[IO.File]::WriteAllBytes("report_pad.xlsx", $bytes)

Expected result:
Opening fails or Excel tries to repair but reports damaged parts (often styles, workbook, or shared strings).

3) Archive tweak (XLSX is a ZIP): remove or break one XML part

Goal: Keep the file a valid ZIP, but make a specific XML part invalid (great for targeted tests and cell-level scenarios).

Steps (any OS):

  1. Duplicate your file (e.g., report.xlsxreport_tweak.xlsx).
  2. Rename the copy to .zip: report_tweak.xlsxreport_tweak.zip.
  3. Extract it (double-click or right-click > Extract). You’ll see folders like:
    • xl/worksheets/sheet1.xml
    • xl/sharedStrings.xml
    • xl/styles.xml
    • xl/workbook.xml
    • [Content_Types].xml
  4. Break a part (pick one):
    • Delete xl/sharedStrings.xml (if the sheet uses shared strings, this often breaks).
    • Open xl/worksheets/sheet1.xml in a plain text editor and remove a closing tag (e.g., delete </row> or </sheetData>).
    • Change an attribute to create a type mismatch (see Cell-Level Tips below).
  5. Re-zip the extracted contents (not the parent folder) back into a ZIP:
    • On macOS: select all extracted items → right-click → Compress
    • On Windows: select all → Send to → Compressed (zipped) folder
  6. Rename back to .xlsx: report_tweak.zipreport_tweak.xlsx.

Expected result:
Excel opens with a repair dialog (“Removed Records: /xl/worksheets/sheet1.xml”) or fails with a parsing error—perfect for testing error messages and recovery flows.

How to Check It Worked

  • Try opening the file
    Note the exact error text, any repair prompts, and which parts were reported as damaged. This wording is gold for acceptance criteria.
  • Compare size and checksum
    Create a before/after hash to prove the file changed and to track your test asset over time. macOS shasum -a 256 report.xlsx shasum -a 256 report_corrupt.xlsx Linux sha256sum report.xlsx sha256sum report_corrupt.xlsx Windows (PowerShell) CertUtil -hashfile report.xlsx SHA256 CertUtil -hashfile report_corrupt.xlsx SHA256
  • Add a short note to your QA ticket
    Include what you changed (“flipped byte 1”, “trimmed 16 bytes”, “removed sharedStrings.xml”), expected behavior (error/repair), and a before/after hash.

Undo & Safety

  • Restore from your backup copy
    Keep the original safe in a separate folder or a versioned drive.
  • Keep changes small and documented
    One tiny tweak per file makes root-causing much easier. Name files descriptively (e.g., report_trim16.xlsx).
  • Never test on production data
    Use anonymized or synthetic spreadsheets. Don’t upload corrupted files to systems that sync or auto-process production artifacts.

Troubleshooting

  • File still opens cleanly?
    • Make a slightly bigger change (trim 64 bytes instead of 16).
    • Target the start (header flip) or end (trim) of the file.
    • If you removed the wrong part, try breaking xl/worksheets/sheet1.xml or xl/workbook.xml.
  • App crashes instead of showing a friendly error?
    That’s a valuable bug. Capture:
    • Exact steps to reproduce
    • The corrupted sample
    • App version/OS build
    • Any logs or crash dumps
  • Excel “repairs” and silently opens
    That might be acceptable—or not. If your acceptance criteria require a visible warning, choose a harder break (malformed XML tag) rather than a missing optional part.

Cell-Level Tips (precise, realistic failures)

Sometimes you want Excel to pinpoint a worksheet or cell instead of failing the whole file. Use the ZIP tweak method above and edit XML inside xl/.

  1. Type mismatch in a cell
    • Open xl/worksheets/sheet1.xml.
    • Find a cell like: <c r="A1" t="s"><v>0</v></c> t="s" means “shared string”; the <v> holds an index into sharedStrings.xml.
    • Change it to a numeric type with a non-numeric value, e.g.: <c r="A1" t="n"><v>hello</v></c>
    • Result: Excel flags “Removed Records: Cell information from /xl/worksheets/sheet1.xml”.
  2. Broken reference
    • In xl/workbook.xml, change a sheet relationship ID to a non-existent one (r:id="rId999").
    • Or in xl/_rels/workbook.xml.rels, modify a Target so it points to a missing worksheets/sheetX.xml.
    • Result: Excel reports a missing sheet or relationship.
  3. Malformed XML
    • In xl/worksheets/sheet1.xml, delete a single closing tag like </row> or </sheetData>.
    • Result: Excel shows an XML parsing error and may offer to repair just that sheet.
  4. Shared strings inconsistency
    • In xl/sharedStrings.xml, remove one <si> block but keep its index referenced in the sheet.
    • Result: Excel flags shared string corruption with details during repair.

Why these help

  • They preserve most of the workbook, so you can test error UX that points to a sheet/cell and validate granular recovery flows.
  • They generate consistent, repeatable warnings—ideal for automation screenshots and acceptance tests.

Extra Validation Ideas

Extra Validation Ideas
  • Open with multiple apps (Excel desktop, Excel Online, Google Sheets, LibreOffice) and compare error messages.
  • Automate checks:
    • Use a CI step that asserts a corrupted file’s SHA256 matches the expected value for your test fixture.
    • Script opening via headless libraries (where permitted) to capture return codes or exceptions.
  • Log capture: On Windows, use Event Viewer; on macOS, Console.app to record app-level errors during open attempts.

Example Playbook (copy/paste)

Goal: Create three predictable corrupted variants from report.xlsx.

  1. Header flip
    • Copy to report_hdr.xlsx.
    • Flip byte 1 (see commands above).
    • Expect: “The file format is invalid” (no open).
  2. Trim 16 bytes
    • Copy to report_trim16.xlsx.
    • Trim 16 bytes (see truncate / PowerShell).
    • Expect: “We found a problem with some content” (repair prompt).
  3. Sheet1 malformed XML
    • Copy to report_sheet1_broken.xlsx.
    • Unzip → delete </sheetData> in xl/worksheets/sheet1.xml.
    • Re-zip → rename back to .xlsx.
    • Expect: “Removed Records: /xl/worksheets/sheet1.xml” (sheet repairs).

Record each file’s SHA256 and observed message in your QA ticket.

FAQ

Will antivirus complain?
Unlikely. You’re not introducing malware—just making a container unreadable. Still, keep these files in a dedicated test folder and avoid sharing them widely.

Do cloud previews fail the same way as desktop Excel?
Not always. Web viewers sometimes sanitize or partially repair files. Test both desktop and cloud if your users rely on each.

How big should the change be?
Start tiny: flip 1 byte or trim 8–16 bytes. Increase only if the app still opens cleanly. Small, focused changes make defects easier to diagnose.

Can I target a single cell for a tidy error?
Yes—use the Cell-Level Tips to create a type mismatch or malformed XML inside one worksheet. Excel usually reports exactly which part was repaired.

Scroll to Top