The 2026 RIDGE Back-test

169 historical Amazon FBA niches · 2022-2023 entry cohort · observed through 2026 · published April 2026

Headline Results

96.2% NO-GO precision
97.8% accuracy on GO calls
41% HIDDEN GEM precision (2.04× base-rate lift)
47.1% HIDDEN GEM recall

What We Measured

We sampled 169 Amazon FBA niches where market entry occurred in the 2022-2023 window, a period far enough in the past that outcomes are now observable. For each niche we reconstructed the data RIDGE would have seen at time of entry, produced a verdict with the current engine, and compared that verdict to the outcome observed in 2026 (still selling or dead; commercially viable or not).

The test is deliberately brutal: the model has never seen these niches, the ground truth is observable reality (not a model prediction), and the time gap eliminates leakage. This is the standard back-test protocol used in quantitative finance, adapted to FBA.
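The comparison step can be sketched in a few lines. This is a minimal illustration and not the RIDGE engine: the function name, the record layout, and the rule that GO and HIDDEN GEM both count as predicting "viable" are assumptions made for the example, and the toy records are invented.

```python
from collections import Counter

# Assumption: GO and HIDDEN GEM predict "viable"; NO-GO predicts "not viable".
VIABLE_VERDICTS = {"GO", "HIDDEN GEM"}

def precision_by_verdict(records):
    """records: iterable of (verdict, observed_viable) pairs, one per niche."""
    totals, correct = Counter(), Counter()
    for verdict, observed_viable in records:
        totals[verdict] += 1
        # A call is correct when the predicted viability matches the outcome.
        if (verdict in VIABLE_VERDICTS) == observed_viable:
            correct[verdict] += 1
    return {v: correct[v] / totals[v] for v in totals}

# Toy records, invented for illustration only (not the back-test data).
toy_records = [("NO-GO", False)] * 3 + [("NO-GO", True)] + [("GO", True)] * 2
toy_precision = precision_by_verdict(toy_records)
print(toy_precision)
```

Scoring each verdict type separately matters because a single overall accuracy figure can hide a weak signal behind a strong one.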

What the Numbers Mean

96.2% NO-GO precision: of the niches RIDGE recommended avoiding, 96.2% turned out to be non-viable (n=159 NO-GO verdicts, bootstrap with 2,000 resamples). When RIDGE says "don't enter," you should listen.
41% HIDDEN GEM precision: 41% of the niches flagged by the HIDDEN GEM signal (DIY, accessory-anchor, craft/décor categories) turned out to be viable, 2.04× the base rate. False positives exist, but the lift is real and meaningful.
97.8% GO precision: overall GO/NO-GO calls were correct 97.8% of the time on the 169-niche set.
2,710 ground-truth labels: the calibrated machine-learning verdict head was trained against an independently held cohort and ranks borderline niches; the deterministic verdict rule remains the binary gate.
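To show how a bootstrap interval on precision and a base-rate lift are computed, here is a minimal sketch using a percentile bootstrap. It is an assumption that RIDGE uses the percentile method; the function name is hypothetical, and the 153-of-159 count is inferred from the rounded 96.2% figure, not taken from published data.

```python
import random

def bootstrap_precision_ci(hits, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of a 0/1 list (1 = call was correct)."""
    rng = random.Random(seed)
    n = len(hits)
    # Resample with replacement, recompute precision each time, then sort.
    stats = sorted(sum(rng.choices(hits, k=n)) / n for _ in range(n_resamples))
    lower = stats[round(alpha / 2 * n_resamples)]
    upper = stats[round((1 - alpha / 2) * n_resamples) - 1]
    return sum(hits) / n, lower, upper

# Assumption: 153 correct out of 159 is inferred from the rounded 96.2% figure.
no_go_hits = [1] * 153 + [0] * 6
point, lower, upper = bootstrap_precision_ci(no_go_hits)
print(f"NO-GO precision {point:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")

# Lift = precision / base rate, so the base rate implied by the rounded figures is:
implied_base_rate = 0.41 / 2.04
print(f"Implied base rate: {implied_base_rate:.3f}")
```

The implied base rate is only arithmetic on the two rounded headline figures, so it is an approximation.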

Baseline disclosure: a heuristic that labels every niche DEAD (always-DEAD) reaches 46.2% accuracy on this test set, a result of the 2024-2026 Amazon FBA shakeout. We disclose this so the headline 97.8% GO precision can be read in context: on a DEAD-heavy sample, the value of RIDGE lies in NO-GO precision reported with a bootstrap CI and in correctly identifying the minority of viable niches, not in beating the trivial baseline.
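A constant-prediction baseline of this kind is simple to reproduce. The sketch below is illustrative only: the function name and the "ALIVE" label are hypothetical, and the 78-of-169 split is inferred from the rounded 46.2% figure, not taken from the underlying data.

```python
def constant_baseline_accuracy(outcomes, label):
    """Accuracy of a rule that predicts `label` for every niche."""
    return sum(1 for o in outcomes if o == label) / len(outcomes)

# Assumption: 78 DEAD of 169 is inferred from the rounded 46.2% figure.
outcomes = ["DEAD"] * 78 + ["ALIVE"] * 91
always_dead = constant_baseline_accuracy(outcomes, "DEAD")
print(f"always-DEAD accuracy: {always_dead:.3f}")
```

Any model's accuracy is best read against this number, since it is what a rule with no information about the niche would score.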

Where RIDGE Was Wrong

RIDGE is not perfect. It underperforms in two places:

Emerging-category novelty — niches with no historical cohort are harder to score. Where the underlying category did not exist in 2022, the model leans conservative.
Black-swan macro events — supply-chain shocks and policy changes are under-weighted if they occurred after the training window. We refresh the calibration quarterly to compensate.

These caveats are disclosed in every RIDGE report. No forecast is a guarantee; it is a probabilistic recommendation grounded in triangulated evidence.

Why No Competitor Publishes This

Publishing back-tested accuracy is expensive and commercially risky. If the number were mediocre, it would drive customers elsewhere. The typical FBA-SaaS playbook is to market on testimonials, screenshots, and case studies — all low-falsifiability marketing. RIDGE takes the opposite bet: the methodology is strong enough to publish, and customers who care about rigor will reward transparency. The numbers above are updated whenever new cohorts become observable.

Related research

Methodology & Ablation — what each model version contributed and which experiments were rejected.
Calibration Audit — calibrated machine-learning verdict head, audited against 2,710 ground-truth labels with bootstrap 95% CI on every metric.
Public Leaderboard — open submission benchmark with nested temporal CV.
