How to verify AI output before you trust it

AI is fluent enough to make wrong answers sound right. A practical checklist for catching hallucinations, broken facts, and silent drift — for code, copy, research, and analysis.

7 min read·Updated May 25, 2026

AI produces fluent output. Fluent is not the same as correct. This guide is the practical checklist for catching AI mistakes before they ship — for code, copy, research, analysis, and decisions.

The right amount of verification depends on what you are using the output for. A draft email to yourself: light check. Customer-facing copy, production code, a board update: thorough check. The patterns below scale up or down based on stakes.

The five categories AI gets wrong

Before you check, know what you are looking for. AI's failure modes cluster into five categories:

1. Hallucinated facts. Confident-sounding statements that are not true. Wrong dates, wrong company names, made-up product features, fabricated quotes. The output reads as authoritative because the model was trained to write authoritatively.

2. Hallucinated sources. Citations, URLs, paper titles, and author names that look real but are not. One of the most common AI failure modes.

3. Numeric drift. Arithmetic that is close to right but not right. Percentages that do not add up. Sums that are off by a meaningful amount. Models generate digits; they do not calculate.

4. Code that looks right but is not. Method names that do not exist, library versions that do not match, signatures that almost compile, edge cases that crash in production.

5. Subtle factual drift. Statements that are partly true. "X was acquired by Y in 2020" when the acquisition actually happened in 2021. "The default port for Z is 5432" when it is actually 5433. These are the hardest to catch because they sound plausible.

Each category has a different verification step.

Verification by content type

Verifying AI-generated copy (emails, blog posts, marketing, social)

Run this pass before you publish or send anything customer-facing:

  1. Highlight every name, company, date, number, and product feature. For each one, ask: "Where did the model get this?" If it was not in your input, verify it.
  2. Check any claim that compares your product to a competitor. Competitive claims are where AI most often invents specifics.
  3. Check any statistic. "70% of buyers prefer..." is exactly the kind of fluent-sounding number AI will fabricate.
  4. Read it out loud. Stilted phrasing, off-brand sentences, and tone breaks are easier to catch by ear than eye.
  5. If it cited a source, open the source. AI-generated citations are often wrong in some way: the wrong title, the wrong author, or a claim that is not actually in the article.
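
Step 1 can be partly mechanized for numbers. The sketch below is a minimal illustration, not a complete fact-checker: it lists every number in a draft that never appeared in the material you gave the model, so you know which figures to trace by hand. The function name, the regular expression, and the sample texts are all made up for this example.

```python
import re

# Matches integers, thousands-separated numbers, decimals, and percentages,
# for example 120, 2,500, 14.7, 70%, or 2026.
NUMBER_PATTERN = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?%?")


def unsupported_numbers(source_text: str, ai_output: str) -> list[str]:
    """Return numbers in ai_output that never appear in source_text."""
    source_numbers = set(NUMBER_PATTERN.findall(source_text))
    flagged = []
    for number in NUMBER_PATTERN.findall(ai_output):
        if number not in source_numbers and number not in flagged:
            flagged.append(number)
    return flagged


# Hypothetical input you gave the model, and the draft it returned.
brief = "Launch is on March 3. We onboarded 120 customers in the pilot."
draft = (
    "After onboarding 120 pilot customers, 70% of buyers prefer our plan. "
    "Launching March 3, 2026."
)

print(unsupported_numbers(brief, draft))  # ['70%', '2026']
```

A flagged number is not necessarily wrong. It only means the model did not get it from your input, so it needs a source. Names, companies, and product features still need a manual pass.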

Verifying AI-generated code

The full version is in "Catching AI-generated bugs before they ship". The short version:

  1. Run it. Most hallucinated code fails at runtime. Linters and TypeScript catch a meaningful share before runtime.
  2. Check every import and the methods called on it. AI invents method names that almost exist. If there are too many to check, spot-check at least three or four against the actual library documentation.
  3. Read every error handler. Empty catch blocks and silently-swallowed errors are common AI patterns.
  4. Check async boundaries. Missing await calls, and await added where it does not belong, are both frequent.
  5. Look at any new file the model created. Ask why it exists. AI often spawns helper files that should not have been created.
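
For Python code, step 3 can be given a first pass automatically. The sketch below uses the standard library ast module to list except blocks that do nothing. It is one possible approach under simple assumptions (it only treats `pass`, `...`, and bare constants as "doing nothing"), and the sample source is invented for illustration.

```python
import ast


def _is_noop(stmt: ast.stmt) -> bool:
    """True for statements that have no effect: `pass`, `...`, or a bare constant."""
    if isinstance(stmt, ast.Pass):
        return True
    return isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Constant)


def find_swallowed_errors(source: str) -> list[int]:
    """Return line numbers of except blocks whose body does nothing."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler) and all(_is_noop(s) for s in node.body):
            flagged.append(node.lineno)
    return sorted(flagged)


# Hypothetical AI-generated code to review.
generated = '''
def load(path):
    try:
        return open(path).read()
    except OSError:
        pass

def parse(text):
    try:
        return int(text)
    except ValueError as exc:
        raise RuntimeError("bad input") from exc
'''

print(find_swallowed_errors(generated))  # [5]
```

This only finds the emptiest handlers. A handler that logs and carries on, or returns a default value, still needs a human to decide whether swallowing the error is acceptable.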

If the code calls an API or uses a library you do not know well, verify the API exists in the version you are using. The AI Output Verifier skill automates this.
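
In Python, a rough version of this check takes a few lines of standard library code. The sketch below is not the AI Output Verifier skill and makes no claim about how that skill works. It simply asks the installed environment whether a name resolves and which package version is present. The helper names are made up for this example.

```python
import importlib
from importlib import metadata
from typing import Optional


def check_api(module_name: str, attribute_path: str) -> bool:
    """Return True if module_name.attribute_path resolves in this environment."""
    try:
        target = importlib.import_module(module_name)
    except ImportError:
        return False
    for part in attribute_path.split("."):
        if not hasattr(target, part):
            return False
        target = getattr(target, part)
    return True


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print(check_api("json", "dumps"))              # True
print(check_api("json", "dump_pretty"))        # False: a plausible but invented name
print(check_api("pathlib", "Path.read_text"))  # True
```

A True result only shows that the name exists. It does not confirm the signature or the behavior, and attributes that are set on instances at runtime will not be found this way, so read the documentation for the version that installed_version reports.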

Verifying AI-generated research summaries

AI is good at producing coherent-sounding summaries. It is bad at being right about specifics.

  1. Treat every citation as suspicious. Click each link. If the link does not exist or does not say what the model claims, the source is hallucinated.
  2. Check quotes against the source. AI will paraphrase a quote slightly enough that it is no longer what the source actually said.
  3. Re-derive any numbers in the summary. If the summary says "study found 42%", open the study and check.
  4. Check the source date. "A 2023 study" might be a 2019 study that the model is confidently mis-dating.
  5. Cross-check controversial claims against a second source. If only one source supports a strong claim, treat it as unverified.
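
Step 2 is easy to script once you have the source text in hand. The sketch below is a minimal example that assumes you have already pasted the source into a string. It checks whether a quoted passage appears verbatim after normalizing case, whitespace, and curly quotes. The function names and sample texts are invented for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, straighten curly quotes, and collapse runs of whitespace."""
    text = text.replace("\u201c", '"').replace("\u201d", '"').replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_appears_in_source(quote: str, source_text: str) -> bool:
    """Return True if the quote occurs verbatim in the source after normalization."""
    return normalize(quote) in normalize(source_text)


# Hypothetical source paragraph and two quotes attributed to it.
source = (
    "Participants reported that response times   improved in 42% of trials,\n"
    "though the effect faded after six weeks."
)
exact_quote = "response times improved in 42% of trials"
drifted_quote = "response times improved in most trials"

print(quote_appears_in_source(exact_quote, source))    # True
print(quote_appears_in_source(drifted_quote, source))  # False
```

A False result means the wording was changed or the quote is not in this source at all. Either way, it should not be presented as a direct quotation until you have found the original sentence.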

For AI-search tools (Perplexity, ChatGPT with browsing, Gemini), the citations are usually real, but the model's summary of what each citation said can still drift. Read the actual source paragraph, not just the AI summary of it.

Verifying AI-generated analysis of data

This is the most dangerous kind of AI output: it looks like analysis, and sometimes it is not.

  1. Re-compute any key number yourself. If the analysis says "revenue grew 23% year-over-year", open the spreadsheet and check.
  2. Use a code interpreter (ChatGPT, Claude with code execution). When AI runs Python on your data, the numbers in the output are computed, not generated, which makes them far more reliable than prose analysis. Still skim the code to confirm it computes what you asked for.
  3. Watch for spurious precision. A figure like "Conversion rate improved 14.7% in Q3" that you cannot trace back to a calculation is usually generated. AI tends to add false specificity to make numbers sound real.
  4. Check the categorical breakdowns. AI sometimes invents segment names that are not in your data, or assigns numbers to segments that do not have those numbers.
  5. Sanity-check the totals. If the segment breakdowns are supposed to sum to a total, check that they do.
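
Steps 1 and 5 are plain arithmetic, so they are worth scripting rather than eyeballing. The sketch below recomputes a year-over-year growth figure and checks that segment values sum to the stated total. All figures, segment names, and function names are made up for this example.

```python
import math


def yoy_growth_percent(previous: float, current: float) -> float:
    """Year-over-year growth as a percentage of the previous period."""
    return (current - previous) / previous * 100


def segments_match_total(
    segments: dict[str, float], total: float, tolerance: float = 0.01
) -> bool:
    """Return True if the segment values sum to the total, within a tolerance."""
    return math.isclose(sum(segments.values()), total, abs_tol=tolerance)


# Hypothetical figures taken from your own spreadsheet.
revenue_last_year = 1_200_000
revenue_this_year = 1_416_000
claimed_growth = 23.0  # the number the AI analysis reported

actual_growth = yoy_growth_percent(revenue_last_year, revenue_this_year)
print(f"{actual_growth:.1f}")                    # 18.0
print(abs(actual_growth - claimed_growth) > 0.5)  # True: the claim does not hold

# Hypothetical segment breakdown from the AI analysis.
segments = {"Enterprise": 800_000, "Mid-market": 400_000, "SMB": 200_000}
print(segments_match_total(segments, revenue_this_year))  # False: 16,000 is unaccounted for
```

The tolerance exists so that rounding in the source data does not trigger false alarms. Set it to whatever rounding error is reasonable for your numbers.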

Verifying AI-generated decisions or recommendations

For decision frameworks, pros/cons, "should I do X" output:

  1. Treat the framework as a draft, not the answer. AI is good at structure, often weak on judgment.
  2. Check that it addresses the actual constraint. AI sometimes optimizes for a generic version of the problem, not the one you have.
  3. Look for missing options. AI tends to give you the obvious choices and miss the unconventional one.
  4. Re-stress-test it against your context. "What changes if my team is 3 people, not 30?" "What changes if I am cash-constrained?" The original answer was usually generic.
  5. Cross-check with a real expert if the decision is meaningful. AI is not a substitute for a lawyer, doctor, accountant, or experienced operator on high-stakes calls.

When verification is not worth it

You do not need to verify everything. The verification overhead defeats the speed benefit on low-stakes work.

Skip thorough verification when:

  • The output is for your own internal use (a brainstorm list, a research note for yourself).
  • The output is a transformation of something you provided (a summary of your own document).
  • The cost of a small error is low (a first-draft email you will edit before sending).
  • You are using AI to learn, not to ship (asking it to explain something to you).

Verify thoroughly when:

  • The output is customer-facing or public.
  • The output contains specific facts, numbers, citations, or claims about other people or companies.
  • The output goes into production code, financial calculations, or any system with downstream effects.
  • The output influences a meaningful decision.

The fastest verification habit

If you do nothing else: before you ship AI output, ask "what is the single most load-bearing claim in this, and is it true?"

For an email: the customer's name and the specific thing they said. For a sales claim: the percentage or the comparison. For code: the method name that does the actual work. For an analysis: the headline number. For a citation: that the source exists and says what was claimed.

One claim, one check, 30 seconds. It catches most of what would have embarrassed you.

Tools that help

A few tools shift the verification burden from "you" to "machine":

  • Code: linters, type checkers, runtime tests, the AI Output Verifier skill. Catch a meaningful share of hallucinated code automatically.
  • Numbers: code interpreter (ChatGPT, Claude). Let AI run Python rather than generate digits as prose.
  • Sources: AI search tools (Perplexity, Gemini with search). Real citations, though you still read the actual source.
  • Drift over time: the 12-point AI PR review checklist. Structured pass on AI-generated code changes.

These reduce verification time; they do not eliminate it. The judgment call about whether output is correct stays with you.


Related: "What AI is good at, and what it still gets wrong" for the upstream calibration, and "AI at work: what is safe to share" for the related question of what AI should see in the first place.

How to verify AI output before you trust it | AINews