SOP 05: Verdict Rules

How Verdicts Are Made

Verdicts are made after reviewing:

  • primary metric movement
  • guardrail behavior
  • implementation quality
  • sample quality
  • contextual business changes that may distort interpretation

The goal is not to force every test into a win or loss.

The goal is to make a clean next decision.

Accepted Verdict Types

  • KEEP
  • KILL
  • ITERATE
  • INCONCLUSIVE

What Each Verdict Means

KEEP

The change created enough positive evidence to retain or roll out.

KILL

The change failed, hurt outcomes, or does not justify continued use.

ITERATE

The direction looks promising but the current version is not final.

INCONCLUSIVE

The signal is too weak, too noisy, or too compromised to support a confident decision.

How Neutral Results Are Handled

Neutral results should not be forced into KEEP.

If the test showed no meaningful movement, choose:

  • KILL if the test was clean and the hypothesis no longer looks worth pursuing
  • ITERATE if the test was clean and the underlying logic still seems strong but the treatment was weak
  • INCONCLUSIVE if the sample or setup was not good enough to trust the result

When to Kill a Test

Kill when:

  • the primary metric declines
  • guardrails are harmed
  • user friction clearly increases
  • the hypothesis was wrong enough that further investment is hard to justify

When to Keep a Change

Keep when:

  • the primary metric improves in a meaningful way
  • guardrails remain acceptable
  • the implementation is stable
  • the learning is strong enough to support rollout

When to Iterate

Iterate when:

  • the problem still looks real
  • the direction seems valid
  • the current execution was too weak, too broad, or too unclear

Simple Decision Table

Situation | Verdict
Primary metric up, guardrails stable | KEEP
Primary metric down or harmful side effects | KILL
Direction looks right but treatment needs refinement | ITERATE
Data quality or sample quality too weak to decide | INCONCLUSIVE
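
For teams that log verdicts in a script or tracker, the table can be sketched as a small Python function. This is a minimal sketch, not an official implementation: `ExperimentReview`, its field names, and `decide` are hypothetical, and the order in which the rules are checked (data quality first, then harm, then wins) is an assumption, since the table does not state a precedence. The flat-metric branch follows the guidance in "How Neutral Results Are Handled".

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    KEEP = "KEEP"
    KILL = "KILL"
    ITERATE = "ITERATE"
    INCONCLUSIVE = "INCONCLUSIVE"


@dataclass
class ExperimentReview:
    # Hypothetical fields: each one is a reviewer judgment, not a computed statistic.
    data_trustworthy: bool     # sample and setup were good enough to decide
    primary_metric: str        # "up", "down", or "flat"
    guardrails_ok: bool        # no harmful side effects observed
    direction_promising: bool  # logic still looks right, treatment needs refinement


def decide(review: ExperimentReview) -> Verdict:
    if review.primary_metric not in ("up", "down", "flat"):
        raise ValueError(f"unknown primary_metric: {review.primary_metric!r}")
    # Assumed precedence: weak data blocks any confident call, so check it first.
    if not review.data_trustworthy:
        return Verdict.INCONCLUSIVE
    # A declining primary metric or harmed guardrails maps to KILL.
    if review.primary_metric == "down" or not review.guardrails_ok:
        return Verdict.KILL
    if review.primary_metric == "up":
        return Verdict.KEEP
    # Flat result on a clean test: iterate only if the direction still looks right.
    return Verdict.ITERATE if review.direction_promising else Verdict.KILL
```

For example, `decide(ExperimentReview(True, "flat", True, True))` returns `Verdict.ITERATE`. The inputs remain human judgments; the function only keeps the mapping from judgment to verdict consistent.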

Rule

A verdict must always create a next action:

  • keep and scale
  • kill and archive
  • iterate and rewrite
  • mark inconclusive and decide whether to rerun or stop
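
One way to enforce this rule in tooling is to refuse any verdict that has no next action attached. The sketch below assumes verdicts are recorded as plain strings; `NEXT_ACTION` and `next_action` are hypothetical names, and the action text is copied from the list above.

```python
# Each accepted verdict maps to exactly one next action from the rule above.
NEXT_ACTION = {
    "KEEP": "keep and scale",
    "KILL": "kill and archive",
    "ITERATE": "iterate and rewrite",
    "INCONCLUSIVE": "mark inconclusive and decide whether to rerun or stop",
}


def next_action(verdict: str) -> str:
    """Return the next action for a verdict, rejecting anything not on the accepted list."""
    key = verdict.strip().upper()
    if key not in NEXT_ACTION:
        raise ValueError(f"not an accepted verdict: {verdict!r}")
    return NEXT_ACTION[key]
```

Calling `next_action("keep")` returns `"keep and scale"`, while a label such as `"WIN"` raises `ValueError`, so a test cannot be closed out with a verdict that leads nowhere.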