How to test a migration you can't see inside
Most migration tooling is opaque to the team that has to trust it. Maybe it's a vendor product whose conversion logic is a sealed black box. Maybe the mappings are proprietary, or generated, or buried in a configuration layer nobody on the project fully owns. Or maybe it's just too large to audit line by line in any honest timeframe — tens of thousands of field rules, accreted over years. Whatever the reason, you end up in the same place: a machine turns mainframe records into something modern, and you cannot read its insides closely enough to certify them.
The instinct is to demand the source, instrument the internals, trace every mapping. Usually you can't, and even when you can, reading the implementation tells you what the tool intends to do, not what it actually did to your records. There's a better move.
You don't need the inside — you need the boundary
A migration is a function. Records go in, a new representation comes out. You don't have to understand the function's body to test it; you have to test its input/output contract. Treat the converter as a black box and ask one behavioral question: does the output faithfully represent the input? That question has a definitive answer, and getting it doesn't require seeing a single line of the tool's code.
Round-trip equivalence
The core technique is a round trip. Take the migrated representation, decode it back into the original record format, and compare the result to the source bytes. If they match, every field survived the journey. The comparison is a hash:
original F0 F1 F2 F3 C4 12 34 5C ... (EBCDIC + COMP-3, 80 bytes) │ ▼ migrate (black box) migrated { "polNum": "0123", "premium": 12345.67, ... } │ ▼ re-encode back to copybook layout re-encoded F0 F1 F2 F3 C4 12 34 5C ... sha256(re-encoded) == sha256(original) ──────────────────────────────────────── match -> every field round-tripped exactly differ -> something was lost; the hash tells you so
The re-encoder is yours, not the vendor's — it's a small, auditable inverse you control. If sha256(re-encoded) == sha256(original), the migrated representation carried enough information to reconstruct the source perfectly, which is the strongest possible statement that nothing was dropped or distorted.
Why this beats spot-checking fields
The usual QA approach is field-by-field sampling: pick some records, pick some fields, eyeball the values, sign off. The trouble is that you can only check the errors you thought to look for. The dangerous failures in COBOL migrations live in the places nobody samples — a sign nibble that flipped S9(5) from negative to positive, a trailing space that should have been low-values padding, a COMP-3 field that decoded one digit short, an EBCDIC-to-Unicode mapping that quietly substituted a character.
A round trip needs no such list. A single wrong bit anywhere in the record changes the re-encoded bytes, which changes the hash, which fails the test. You don't enumerate the failure modes — padding, encoding, implied decimals, array boundaries — because the hash covers all of them at once, including the ones you'd never have written a check for. That is the whole advantage: it catches the errors you didn't predict, not just the ones you did.
The checks that complete the picture
Round-trip equivalence is the load-bearing test, but a few cheaper checks frame it and catch problems earlier. Confirm the structure parses — the migrated output is well-formed in its target format. Confirm the field count matches the source, so nothing was added or silently merged. Confirm every PIC clause decodes to a concrete, typed value rather than a fallback blob. And confirm the emitted schema compiles, so the shape you're promising downstream is real. Together with the byte-identical round trip, these turn a vague "looks migrated" into a small set of pass/fail facts. For how the field-level clauses map, see the clause field guide; for the full check definitions, the spec.
From "trust the tool" to "prove the output"
Black-box round-trip verification changes the question you're answering at sign-off. You're no longer asserting that the migration tool is correct in general — an assertion you can't actually back without auditing its internals. You're asserting that these records round-tripped to a byte-identical reconstruction of the source, which you can back with a hash anyone can recompute. The internals stay sealed; the boundary is proven. That proof — structure parses, count matches, every clause decodes, schema compiles, bytes round-trip — is exactly what a parity receipt attests, record by record, with a signature you can verify independently of the tool that produced the data.
See it on a real record
IronParse runs the round trip deterministically and signs a parity receipt you can verify yourself. Generate one against a sample ACORD record, or read the spec.
Live parity receipt → The spec