Advertising budget decision evidence ladder
Budget meetings often compress mixed evidence into a single vote: scale, cut, renew, or move on. That is where measurement language can outrun the design behind it. The better habit is to choose the strongest action the evidence can actually support.
Use this ladder when a campaign readout, attribution report, MMM, lift test, brand study, or attention dashboard is being used to justify a budget decision. It separates operational actions from causal actions, so a useful signal can improve the next move without pretending to prove more than it does.
Start with the size of the decision
Evidence needs rise with the size and irreversibility of the action. A creative repair can use weaker evidence than a major budget shift. A causal impact claim needs stronger evidence than a renewal decision.
| Decision | Evidence needed | Claim ceiling |
|---|---|---|
| Fix tracking, creative, destination, or lead routing. | Operational QA, row-level fields, traffic-quality checks, and clear failure symptoms. | The campaign has a fixable execution or handoff issue. |
| Renew the same scope. | Clean delivery, response quality, outcome status, and a recommendation that stays inside observed performance. | The package produced useful observed response under the reported conditions. |
| Shift mix among placements, audiences, devices, or creative. | Comparable rows, minimum delivery thresholds, quality flags, and stable slice differences. | These rows look stronger within this campaign record. |
| Increase or cut material budget. | A credible comparison, visible uncertainty, business threshold, and sensitivity to exclusions. | The evidence supports a bounded budget-direction decision. |
| Claim incremental impact. | Protected holdout, randomized test, matched market, credible calibrated model, or another design that estimates what would have happened anyway. | The design supports a bounded lift claim for this population, outcome, and window. |
| Change future planning assumptions. | Repeatable evidence, model calibration, documented priors, transfer limits, and a next-test plan for unresolved uncertainty. | The planning assumption is plausible within named boundaries. |
The evidence ladder
Use the highest rung that all material evidence can clear. If one critical lane of evidence (for example delivery, response, or comparison) is weaker than the others, the decision inherits that weaker limit.
| Rung | What the reader has | Best action | Do not claim |
|---|---|---|---|
| 0. Missing source trail | Totals, screenshots, or summary statements without placement, creative, outcome, or date fields. | Request the evidence packet before deciding. | That the campaign worked, failed, or deserves budget movement. |
| 1. Delivery and exposure evidence | Eligible placements, impressions, viewability, frequency, invalid-traffic review, and pacing. | Fix delivery, preserve clean inventory, or rebrief scope. | That delivery quality caused business outcomes. |
| 2. Observed response evidence | Qualified visits, leads, conversions, matches, survey responses, or engagement after quality filters. | Renew carefully, improve creative or destination, or plan a better comparison. | That observed response is incremental lift. |
| 3. Directional comparison | Prior period, matched context, benchmark, model estimate, or survey-control result with visible caveats. | Make a bounded budget-direction call or prioritize the next test. | That the comparison settles causality or future scale. |
| 4. Preplanned comparison | Holdout, matched market, fixed baseline, or fixed readout rule defined before results were visible. | Act within the design limit and archive the assumptions. | That the result applies to every audience, season, channel, or creative. |
| 5. Repeatable decision-grade evidence | Several credible tests, calibrated model evidence, stable operational quality, and clear uncertainty ranges. | Update planning assumptions and define the next uncertainty to test. | That uncertainty has disappeared or that all future conditions are covered. |
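The weakest-lane rule can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the `Rung` enum and `BEST_ACTION` text mirror the ladder table, while the lane names (`delivery`, `response`, `comparison`) and the function name `decision_rung` are hypothetical.

```python
from enum import IntEnum


class Rung(IntEnum):
    """Rungs of the evidence ladder, numbered as in the table."""
    MISSING_SOURCE_TRAIL = 0
    DELIVERY_AND_EXPOSURE = 1
    OBSERVED_RESPONSE = 2
    DIRECTIONAL_COMPARISON = 3
    PREPLANNED_COMPARISON = 4
    REPEATABLE_DECISION_GRADE = 5


# "Best action" column of the ladder table, keyed by rung.
BEST_ACTION = {
    Rung.MISSING_SOURCE_TRAIL: "Request the evidence packet before deciding.",
    Rung.DELIVERY_AND_EXPOSURE: "Fix delivery, preserve clean inventory, or rebrief scope.",
    Rung.OBSERVED_RESPONSE: "Renew carefully, improve creative or destination, or plan a better comparison.",
    Rung.DIRECTIONAL_COMPARISON: "Make a bounded budget-direction call or prioritize the next test.",
    Rung.PREPLANNED_COMPARISON: "Act within the design limit and archive the assumptions.",
    Rung.REPEATABLE_DECISION_GRADE: "Update planning assumptions and define the next uncertainty to test.",
}


def decision_rung(lanes):
    """Return the rung the decision inherits: the weakest material lane.

    `lanes` maps a lane name (hypothetical labels) to the rung that lane clears.
    With no lanes at all, there is no source trail, so the result is rung 0.
    """
    if not lanes:
        return Rung.MISSING_SOURCE_TRAIL
    return min(lanes.values())


# Illustrative input only: the comparison lane is strong, the response lane is not.
example_lanes = {
    "delivery": Rung.PREPLANNED_COMPARISON,
    "response": Rung.OBSERVED_RESPONSE,
    "comparison": Rung.PREPLANNED_COMPARISON,
}
limit = decision_rung(example_lanes)
print(limit.name, "->", BEST_ACTION[limit])
```

In this example the comparison lane clears rung 4, but the response lane stops at rung 2, so the recommendation inherits rung 2 and its best action.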
Method-to-action crosswalk
Most methods can inform budget decisions, but each method earns a different kind of action. The method name is less important than whether the design matches the decision.
| Evidence source | Usually supports | Needs help before | Pair with |
|---|---|---|---|
| Attribution or path report | Tracking QA, journey diagnosis, destination repair, and tactical follow-up. | Budget lift, channel incrementality, or long-term planning claims. | Attribution window checklist |
| Campaign readout | Renewal, creative repair, issue closeout, outcome-status review, and next-test planning. | Incrementality language or broad scaling. | Campaign evidence triage tree |
| Brand study | Awareness, recall, favorability, or consideration decisions inside the surveyed population. | Sales, profit, or pipeline budget claims. | Brand lift readout checklist |
| Attention metric | Exposure-quality diagnosis, creative rotation checks, and placement-quality discussions. | Business impact, sales lift, or universal value scoring. | Attention measurement guide |
| Randomized lift test | Bounded causal readout for the assigned population, outcome, and window. | Generalizing to other tactics, seasons, audiences, or future budget levels. | Randomized lift test checklist |
| Geo or matched-market test | Market-level budget decisions when pre-period balance and local shocks are visible. | National scaling or channel ranking outside the tested markets. | Geo lift test checklist |
| MMM | Planning ranges, channel response curves, and scenario testing when controls, priors, calibration, and uncertainty are visible. | Treating modeled contribution as settled causal truth. | MMM readout QA checklist |
Budget meeting packet
Before a recommendation becomes a budget move, the packet should make the action, evidence, limits, and next uncertainty visible. Use the budget decision meeting worksheet when the team needs a printable record of the same fields.
- Decision requested: Name the exact action: repair, renew, narrow scope, shift mix, increase budget, cut budget, add test, or hold.
- Evidence rung: State the highest rung the evidence clears and the weakest material lane that limits the recommendation.
- Rows used: List the campaign, package, placement, creative, device, date, audience, and outcome rows that define the decision.
- Rows excluded: Name pooled, off-scope, immature, low-volume, missing, or quality-flagged rows that should not drive the decision.
- Threshold: Show the business threshold, minimum effect, outcome maturity, or quality bar needed for the action.
- Next uncertainty: Define the specific unknown that should be measured next if the decision remains material.
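One way to keep the packet honest is to treat it as a record with required fields and refuse to proceed while any are blank. The sketch below assumes a simple Python dataclass; the class name `BudgetPacket`, the field names, and the helper `missing_fields` are hypothetical stand-ins for the packet fields listed above, not part of any worksheet format.

```python
from __future__ import annotations

from dataclasses import dataclass, field, fields


@dataclass
class BudgetPacket:
    """Hypothetical record holding the packet fields described above."""
    decision_requested: str = ""      # e.g. "renew", "shift mix", "hold"
    evidence_rung: int | None = None  # highest rung cleared, 0 to 5
    weakest_lane: str = ""            # the material lane that limits the call
    rows_used: list[str] = field(default_factory=list)
    rows_excluded: list[str] = field(default_factory=list)
    threshold: str = ""
    next_uncertainty: str = ""


def missing_fields(packet: BudgetPacket) -> list[str]:
    """List the fields left blank.

    An empty list counts as blank, so a packet with nothing excluded should
    say so explicitly (for example ["none"]). Rung 0 is a valid value and is
    not treated as blank.
    """
    return [
        f.name
        for f in fields(packet)
        if getattr(packet, f.name) in ("", None, [])
    ]


# Illustrative draft only: several fields are still blank.
draft = BudgetPacket(decision_requested="renew", evidence_rung=2)
print(missing_fields(draft))
```

A draft that names the action and rung but nothing else is reported as incomplete, which is the point at which the meeting should ask for the rest of the packet rather than vote.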
Downgrade triggers
- The recommendation depends on attributed conversions, but the report does not show pre-existing intent or a comparison group.
- The strongest result is a small slice without minimum cell thresholds, comparable delivery, or quality flags.
- A platform, vendor, or internal report says lift but does not show assignment, exposure compliance, control leakage, or uncertainty.
- A model has strong predictive fit but weak controls, unexplained priors, no calibration sensitivity, or unstable channel rankings.
- A survey result is being used for budget movement without showing sample source, respondent balance, question wording, and outcome relevance.
- The recommendation would change if incomplete status rows, unmatched outcomes, low-viewability impressions, or delayed conversions were removed.
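The last trigger, sensitivity to exclusions, lends itself to a mechanical check: evaluate the decision rule on all rows and again on the clean rows, and downgrade if the answer differs. The sketch below is illustrative only. It assumes a cost-per-lead threshold as the decision rule, which the ladder does not prescribe, and the row fields (`spend`, `leads`, `flagged`), function names, and numbers are all hypothetical.

```python
def cost_per_lead(rows):
    """Total spend divided by total leads; infinite when there are no leads."""
    leads = sum(r["leads"] for r in rows)
    spend = sum(r["spend"] for r in rows)
    return spend / leads if leads else float("inf")


def flips_on_exclusion(rows, max_cost_per_lead):
    """True when removing flagged rows changes whether the threshold is met."""
    passes_with_all_rows = cost_per_lead(rows) <= max_cost_per_lead
    clean_rows = [r for r in rows if not r["flagged"]]
    passes_with_clean_rows = cost_per_lead(clean_rows) <= max_cost_per_lead
    return passes_with_all_rows != passes_with_clean_rows


# Made-up rows: the flagged row carries most of the leads.
rows = [
    {"spend": 1000, "leads": 20, "flagged": False},
    {"spend": 500, "leads": 30, "flagged": True},
]

# All rows: 1500 / 50 = 30 per lead. Clean rows only: 1000 / 20 = 50 per lead.
# Against a threshold of 40, the recommendation passes only with the flagged row.
print(flips_on_exclusion(rows, max_cost_per_lead=40))
```

When the check returns `True`, the recommendation rests on rows that should not drive the decision, so the packet should move down a rung or name the flagged rows as the next uncertainty.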
Decision language matrix
| Evidence condition | Careful wording | Wording to avoid |
|---|---|---|
| Delivery and exposure are clean, but outcome evidence is weak. | The campaign delivered in the intended context; response evidence is not yet strong enough for renewal or impact language. | The media worked but conversions lagged. |
| Observed response is strong without a comparison. | The campaign produced strong observed response under the reported conditions. | The campaign drove the response. |
| Directional comparison favors the campaign but uncertainty is material. | The evidence supports a cautious budget-direction decision and a cleaner next test. | The result proves the budget should scale. |
| Preplanned comparison supports lift within a narrow design. | The design supports a bounded lift claim for this audience, outcome, and window. | The channel now has a proven return everywhere. |
| MMM and experiments point in the same direction with visible uncertainty. | The evidence supports updating the planning range while naming the assumptions to retest. | The model and tests settle the budget question. |
Pair with
Use this ladder after the measurement method selector and before the budget decision meeting worksheet or campaign renewal memo template when a budget decision needs wording. If the worksheet shows the evidence is below the needed rung, use the next measurement method guide to choose MMM calibration, geo lift, randomized holdouts, or brand studies. Pair it with the campaign evidence triage tree for readout diagnosis, the uncertainty interval readout checklist for thresholds and ranges, the minimum detectable effect planning checklist before testing, the evidence-to-claim language matrix for final language, and the MMM calibration evidence checklist when model assumptions depend on experiments or benchmarks.