Incrementality
Comparison market and holdout planning guide
Most advertising measurement claims depend on one quiet question: compared with what? A campaign can look persuasive when the comparison group was easier to convert, exposed through another route, already trending differently, or chosen after the result was visible.
This guide helps a team choose the right comparison before launch, protect it during the campaign, and write a readout that does not claim more than the design can support.
Choose the comparison type
| Design | Use when | Main risk | Minimum protection |
|---|---|---|---|
| User or household holdout | Exposure can be suppressed for eligible users, households, accounts, or devices. | Identity gaps, cross-device exposure, or other campaigns reaching the control group. | Fixed eligibility, suppression audit, exposure leakage check, and one primary outcome window. |
| Geo or store holdout | Media, retail activity, pricing, or operations can be changed by market, store, or region. | Markets differ in trend, seasonality, distribution, or competitor pressure. | Pre-period trend checks, matched controls, logged local events, and market-level readout. |
| Matched-market comparison | Random assignment is not practical, but untreated markets can be chosen before launch. | Convenient controls that match on size while missing the business trend. | Matching locked before results, placebo period checks, and sensitivity analysis. |
| Business-as-usual benchmark | No clean holdout exists and the decision can tolerate weaker evidence. | Seasonality, pricing, distribution, or demand changes masquerade as campaign impact. | Plain language that labels the result directional rather than causal. |
The pre-launch planning sheet
The planning sheet should be written before the campaign team can see the answer. It does not need to be long. It needs to remove avoidable judgment calls from the readout.
- Decision: Name the action the result will inform: renew a package, scale spend, change audience, adjust bids, approve a market rollout, or pause the tactic.
- Eligible population: Define who can enter treatment or control before exposure begins. Include geography, customer status, product availability, audience rules, and any exclusions.
- Comparison rule: Explain how the control group, holdout, or matched market is chosen. The rule should not depend on which option later produces the strongest lift.
- Primary outcome: Choose one outcome with a source of truth, reporting lag, and fixed readout window. Keep secondary metrics useful but clearly subordinate.
- Downgrade triggers: List events that would weaken the claim: suppression failure, budget underdelivery, tracking changes, stockouts, price changes, overlapping launches, or large unplanned promotions.
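
One way to keep the sheet from drifting after launch is to store it as an immutable record. The sketch below is a minimal illustration, not a prescribed format: the class name `PlanningSheet`, its field names, the `problems` helper, and all example values are assumptions made up for this example.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class PlanningSheet:
    """Hypothetical record mirroring the five planning-sheet fields."""
    decision: str
    eligible_population: str
    comparison_rule: str
    primary_outcome: str
    readout_start: date
    readout_end: date
    downgrade_triggers: tuple = ()

    def problems(self) -> list:
        """Return reasons the sheet is not ready to lock (empty list if none)."""
        issues = []
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, str) and not value.strip():
                issues.append(f"{f.name} is blank")
        if self.readout_end <= self.readout_start:
            issues.append("readout window is empty or reversed")
        if not self.downgrade_triggers:
            issues.append("no downgrade triggers listed")
        return issues


# Illustrative values only; none of these come from a real campaign.
sheet = PlanningSheet(
    decision="scale spend",
    eligible_population="existing customers in launch regions",
    comparison_rule="random 10% household holdout, fixed before exposure",
    primary_outcome="orders from the order system",
    readout_start=date(2025, 3, 3),
    readout_end=date(2025, 4, 14),
    downgrade_triggers=("suppression failure", "stockout", "price change"),
)
print(sheet.problems())
```

Because the record is frozen, changing the primary outcome or the readout window after launch requires creating a new sheet, which makes the change visible rather than silent.
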
Balance checks that matter
| Check | What to compare | Why it matters |
|---|---|---|
| Starting level | Revenue, conversions, traffic, account base, distribution, store count, or eligible customers before launch. | A much larger or smaller control can make normal movement look like lift. |
| Pre-period trend | Daily or weekly movement before treatment, including seasonality and promotion cycles. | A treatment group already accelerating before launch is not a neutral comparison. |
| Outcome volatility | Historic swings, outlier weeks, small-market instability, and high-value transactions. | A noisy comparison can produce confident-looking but fragile readouts. |
| Business drivers | Price, inventory, retail distribution, sales coverage, product launches, local events, and competitor pressure. | Untracked operating changes can become hidden causes in the readout. |
| Exposure separation | Suppression logs, geography boundaries, audience extension, frequency overlap, and partner delivery. | A control group that receives the treatment no longer estimates what would have happened without it. |
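
The first three checks in the table can be approximated with a few summary numbers computed on pre-launch data. The following is a rough sketch under stated assumptions: the function names, the choice of a least-squares slope scaled by the mean as the trend measure, the coefficient of variation as the volatility measure, and the weekly figures are all illustrative choices, not a method the guide prescribes.

```python
from statistics import mean, pstdev


def slope_per_period(values):
    """Least-squares slope of values against period index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two periods to estimate a trend")
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def balance_summary(treatment, control):
    """Compare pre-period level, relative trend, and relative volatility."""
    t_mean, c_mean = mean(treatment), mean(control)
    return {
        # Starting level: how much larger the treatment group is on average.
        "level_ratio": t_mean / c_mean,
        # Pre-period trend: difference in growth per period, as a share of the mean.
        "trend_gap": slope_per_period(treatment) / t_mean
        - slope_per_period(control) / c_mean,
        # Outcome volatility: population standard deviation relative to the mean.
        "volatility_treatment": pstdev(treatment) / t_mean,
        "volatility_control": pstdev(control) / c_mean,
    }


# Made-up weekly pre-period outcomes for illustration.
treatment_weeks = [100, 104, 108, 112]
control_weeks = [50, 51, 50, 51]
summary = balance_summary(treatment_weeks, control_weeks)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A level ratio far from 1 or a trend gap far from 0 would be a prompt to revisit the comparison. What counts as "far" is a judgment call that belongs in the planning sheet, set before launch. Business drivers and exposure separation need logs rather than summary statistics, so they are not covered here.
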
Protect the control during the campaign
- Keep treatment and control eligibility fixed unless a pre-stated rule says otherwise.
- Log delivery by the same units used for assignment: user, household, store, market, account, or region.
- Check that control units were not reached by audience extension, reseller media, retargeting, sales outreach, or overlapping promotions.
- Record budget underdelivery, pacing changes, creative substitutions, tracking outages, product availability issues, and price changes as they happen.
- Do not replace weak-looking controls after the campaign starts unless the readout is explicitly downgraded.
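
The delivery-logging and leakage checks above can be sketched as a simple join between the assignment list and the delivery log. This is a minimal sketch that assumes delivery is logged at the assignment unit; the function name `control_leakage`, the arm labels, and the store IDs and impression counts are hypothetical.

```python
def control_leakage(assignment, delivery_log):
    """Summarize treatment exposure that reached control units.

    assignment: dict mapping unit id -> "treatment" or "control"
    delivery_log: iterable of (unit id, impressions delivered)
    Returns (share of control units exposed, exposed control units,
    delivered-to units missing from the assignment list).
    """
    control_units = {u for u, arm in assignment.items() if arm == "control"}
    exposed = {
        u for u, impressions in delivery_log
        if impressions > 0 and u in control_units
    }
    # Delivery to units nobody assigned suggests the log and the
    # assignment list are not using the same unit definitions.
    unassigned = {u for u, _ in delivery_log if u not in assignment}
    rate = len(exposed) / len(control_units) if control_units else 0.0
    return rate, sorted(exposed), sorted(unassigned)


# Made-up example data.
assignment = {
    "store_1": "treatment",
    "store_2": "control",
    "store_3": "control",
    "store_4": "control",
    "store_5": "control",
}
delivery_log = [("store_1", 500), ("store_3", 20), ("store_9", 5), ("store_2", 0)]

rate, exposed, unassigned = control_leakage(assignment, delivery_log)
print(rate, exposed, unassigned)
```

Running a check like this on a schedule during the campaign, rather than once at readout, gives the team a chance to log contamination as it happens.
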
Readout language by evidence strength
| Design condition | Language the design supports | Language to avoid |
|---|---|---|
| Clean randomized holdout, limited leakage, clear outcome window. | The test estimates incremental effect for this eligible population and campaign setup. | The channel always drives this lift. |
| Matched markets with strong pre-period trend similarity and logged controls. | The tested markets outperformed a pre-selected comparison estimate under these conditions. | The campaign proved national incrementality. |
| Control contamination or major operating changes. | The result is directional because the comparison no longer cleanly represents business as usual. | The positive result confirms the tactic. |
| No holdout, only before-and-after reporting. | The report shows observed movement after launch, not a causal estimate of lift. | The campaign caused the full change from the prior period. |
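
The table can also be read as a small decision rule. The sketch below encodes it as a function; the function name, the design labels, and the order in which conditions are checked (no comparison first, then contamination, then design type) are assumptions made for illustration, and the returned phrases paraphrase the table rather than replace it.

```python
def readout_language(design, contaminated=False, major_operating_change=False):
    """Map a design condition to the strongest language the table supports.

    design: "randomized_holdout", "matched_market", or "none"
    """
    if design == "none":
        return "observed movement after launch, not a causal estimate of lift"
    if contaminated or major_operating_change:
        # A damaged comparison downgrades the claim regardless of design.
        return "directional: the comparison no longer cleanly represents business as usual"
    if design == "randomized_holdout":
        return "estimated incremental effect for this eligible population and campaign setup"
    if design == "matched_market":
        return "tested markets outperformed a pre-selected comparison estimate under these conditions"
    raise ValueError(f"unknown design: {design!r}")


print(readout_language("randomized_holdout"))
print(readout_language("matched_market", contaminated=True))
```
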
Useful questions for a vendor call
- Who was eligible for treatment and control before the campaign began?
- What exact rule assigned or matched the comparison group?
- How much treatment exposure reached the control group?
- Which events, markets, users, or outcomes were excluded, and were those rules set before results were visible?
- What result would have been called inconclusive even if the point estimate was positive?
Takeaway
A comparison group is not a formality. It is the claim. If the holdout or matched market does not represent what would have happened anyway, the readout may still be useful as operational reporting, but it should not be treated as strong causal evidence.