RIDGE Open Benchmark
The first public held-out leaderboard for Amazon FBA niche outcome prediction.
169 ground-truth niches, each tracked from a 2022–2023 entry window to its 2026 outcome. Any FBA research vendor or independent researcher may submit predictions and be added to the public leaderboard.
Leaderboard
Honest disclosure: the 169-niche test set is DEAD-heavy (an always-DEAD baseline scores 46.2% accuracy). Binary accuracy on a DEAD-heavy set is a weaker signal than NO-GO precision; we cite both, with bootstrap 95% confidence intervals on each, so the reader can compare against the trivial baseline directly. The baseline cannot abstain; RIDGE flags borderline niches as uncertain rather than guessing.
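To make the comparison against the trivial baseline concrete, here is a minimal sketch of the two metrics. It assumes labels are encoded as 1 = DEAD (NO-GO) and 0 = not DEAD, and that an abstention is represented as `None`; neither encoding is specified by the benchmark, and the data below is a toy example, not benchmark output.

```python
from typing import Optional, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[Optional[int]]) -> float:
    """Share of correct verdicts among niches where a verdict was given."""
    decided = [(t, p) for t, p in zip(y_true, y_pred) if p is not None]
    if not decided:
        return float("nan")
    return sum(t == p for t, p in decided) / len(decided)


def no_go_precision(y_true: Sequence[int], y_pred: Sequence[Optional[int]]) -> float:
    """Of the niches flagged NO-GO (pred == 1), the share that were truly DEAD."""
    flagged = [t for t, p in zip(y_true, y_pred) if p == 1]
    if not flagged:
        return float("nan")
    return sum(flagged) / len(flagged)


# Toy data (hypothetical): 1 = DEAD, 0 = not DEAD, None = abstain.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
model = [1, 1, None, 0, 0, 1, 0, 1]
baseline = [1] * len(y_true)  # always-DEAD baseline: it can never abstain

print("baseline accuracy:", accuracy(y_true, baseline))
print("baseline NO-GO precision:", no_go_precision(y_true, baseline))
print("model accuracy (decided only):", accuracy(y_true, model))
print("model NO-GO precision:", no_go_precision(y_true, model))
```

For an always-DEAD baseline both metrics collapse to the DEAD share of the test set, which is why that share is the number to beat.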
GO-side disclosure. The GO-side test cell on the 169-niche cohort is small. We publish NO-GO precision as the headline metric because it is the cell with sufficient samples to support a tight bootstrap interval. Cohort expansion is in progress so future versions can be stress-tested on a wider GO-side slice.
Download the benchmark
- niches.jsonl — 169 rows: keyword, category, entry-window ASINs, ground-truth outcome
- predictions.jsonl — RIDGE out-of-fold scores, conformal sets, binary correctness
- leaderboard.md — raw markdown leaderboard (re-generated on each benchmark run)
- README.md — how the labels were observed and how to reproduce
Production catalog scale
Beyond the 169-niche held-out leaderboard, RIDGE has scored a wider production catalog of 28,922 niches across 4 marketplaces using the same v10 multi-class classifier (DEAD / ALIVE / THRIVING). We publish the per-market distribution because the marketplace gap is real and we will not paper over it.
* Honest disclosure on DE/UK/JP DEAD-rate. The current sampling window for DE/UK/JP catalog pulls reaches recent BSR data only and does not yet have enough trajectory depth to surface the DEAD class, which is structurally rarer (≈3–4% of US niches) and requires a longer back-window than ALIVE detection. We surface the catalog under each market's BSR scale via the routing layer, but until verified-cohort counts reach at least 200 ASINs per marketplace we will not claim native per-market DEAD detection. This is a sampling gap, not a model gap, and the reader should not interpret 0% DEAD as "no risky niches in DE/UK/JP."
Conformal prediction — why this matters
Every other vendor gives you a single confidence number — or nothing. RIDGE ships a coverage-guaranteed prediction set per verdict. At 90% target coverage on cross-validation we measure 90% empirical coverage, with a singleton (decisive) verdict on the supermajority of niches and an honest abstain flag on the rest. When the model cannot confidently decide, it says so instead of guessing.
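The page does not say which conformal procedure RIDGE uses. As an illustration only, the sketch below shows standard split conformal prediction with the nonconformity score 1 − p(class), at a 90% coverage target. The function names, the calibration values, and the `UNCERTAIN` label are assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence

CLASSES = ("DEAD", "ALIVE", "THRIVING")


def conformal_threshold(cal_true_probs: Sequence[float], alpha: float = 0.10) -> float:
    """Score threshold from calibration data.

    cal_true_probs holds the probability the model gave to the TRUE class
    of each calibration niche; the nonconformity score is 1 - that value.
    """
    scores = sorted(1.0 - p for p in cal_true_probs)
    n = len(scores)
    # Rank of the finite-sample conformal quantile (epsilon guards float noise).
    k = math.ceil((n + 1) * (1 - alpha) - 1e-9)
    if k > n:
        return 1.0  # too few calibration points: every class is kept
    return scores[k - 1]


def prediction_set(probs: Dict[str, float], qhat: float) -> List[str]:
    """All classes whose nonconformity score is within the threshold."""
    return [c for c in CLASSES if 1.0 - probs[c] <= qhat]


def verdict(pset: List[str]) -> str:
    """A singleton set is a decisive verdict; anything else is an abstain."""
    return pset[0] if len(pset) == 1 else "UNCERTAIN"


# Toy calibration data (hypothetical values).
cal_true_probs = [0.9, 0.8, 0.85, 0.7, 0.95, 0.6, 0.75, 0.65, 0.35]
qhat = conformal_threshold(cal_true_probs, alpha=0.10)

confident = {"DEAD": 0.90, "ALIVE": 0.08, "THRIVING": 0.02}
ambiguous = {"DEAD": 0.50, "ALIVE": 0.45, "THRIVING": 0.05}

print(verdict(prediction_set(confident, qhat)))
print(verdict(prediction_set(ambiguous, qhat)))
```

The coverage guarantee of split conformal holds on average over exchangeable data, so empirical coverage on a given fold can land slightly above or below the target.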
Cohort discipline — the model does not drift
Cross-validation inside a fixed dataset proves calibration but does not prove stability as new labels arrive. To check that, RIDGE keeps the 169-niche cohort completely held out — no niche from this cohort is ever used for training. Headline numbers come from this slice only. Bootstrap 95% confidence intervals (2,000 resamples) bound every metric we publish; no point estimates without CI.
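A bootstrap interval of this kind can be sketched as follows. This is a plain percentile bootstrap with 2,000 resamples; whether RIDGE uses the percentile variant or another one (for example BCa) is not stated, so treat the method, the function name, and the toy data as assumptions.

```python
import random
from statistics import mean
from typing import Callable, Sequence, Tuple


def bootstrap_ci(
    values: Sequence[float],
    stat: Callable[[Sequence[float]], float] = mean,
    n_resamples: int = 2000,
    level: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for stat(values)."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(values)
    # Resample with replacement, recompute the statistic each time, then sort.
    stats = sorted(stat(rng.choices(values, k=n)) for _ in range(n_resamples))
    lo_idx = round((1 - level) / 2 * n_resamples)
    hi_idx = round((1 + level) / 2 * n_resamples) - 1
    return stats[lo_idx], stats[hi_idx]


# Toy per-niche correctness flags (hypothetical): 1 = correct, 0 = wrong.
correct = [1] * 80 + [0] * 20
lo, hi = bootstrap_ci(correct)
print(f"accuracy = {mean(correct):.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

The same function works for NO-GO precision by passing only the niches flagged NO-GO, with 1 marking a truly DEAD niche.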
How to submit competing numbers
- Download niches.jsonl and run your tool or classifier on the same keywords and entry-window ASINs.
- Produce a JSONL with one row per niche: {"keyword":"...", "binary_pred":0|1, "confidence":0.0-1.0}.
- Email the JSONL plus a short methodology description to research@ridgeworldwide.com.
- We re-score the submission against held-out ground truth and add the row to the leaderboard with method description, submission date, and a link back to the submitter's methodology.
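The submission format above can be produced and sanity-checked with a few lines of Python. This is a minimal sketch: the three field names come from the format shown above, while the validation rules, the helper names, and the example keywords are assumptions. The page does not state which outcome `binary_pred` = 1 stands for, so confirm that before submitting.

```python
import json
from typing import Dict, Iterable


def validate_row(row: Dict) -> Dict:
    """Check one prediction row against the documented three-field format."""
    keyword = row.get("keyword")
    if not isinstance(keyword, str) or not keyword:
        raise ValueError(f"keyword must be a non-empty string: {row!r}")
    pred = row.get("binary_pred")
    if type(pred) is not int or pred not in (0, 1):
        raise ValueError(f"binary_pred must be 0 or 1: {row!r}")
    conf = row.get("confidence")
    if isinstance(conf, bool) or not isinstance(conf, (int, float)):
        raise ValueError(f"confidence must be a number: {row!r}")
    if not 0.0 <= conf <= 1.0:
        raise ValueError(f"confidence must be in [0.0, 1.0]: {row!r}")
    return {"keyword": keyword, "binary_pred": pred, "confidence": float(conf)}


def to_jsonl(rows: Iterable[Dict]) -> str:
    """Serialize validated rows as JSONL: one JSON object per line."""
    return "".join(json.dumps(validate_row(r)) + "\n" for r in rows)


# Hypothetical predictions; the keywords are placeholders, not benchmark rows.
predictions = [
    {"keyword": "example keyword a", "binary_pred": 1, "confidence": 0.82},
    {"keyword": "example keyword b", "binary_pred": 0, "confidence": 0.55},
]

with open("submission.jsonl", "w", encoding="utf-8") as fh:
    fh.write(to_jsonl(predictions))
```

Validating before writing means a malformed row fails locally and not at re-scoring time.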
Related research
- Methodology & Validation — how RIDGE produces a verdict and what we publish vs. what we keep proprietary.
- Calibration Audit — how we audit calibration and why we cite expected calibration error on held-out folds.
- 2026 Back-test Report — full 169-niche held-out report with bootstrap 95% confidence intervals.
- Niche Database — 6,779 scored niches, 16 categories, 19 marketplaces.
Use the evidence, not adjectives
Order a RIDGE report — the same methodology leading this leaderboard is applied to your niche. 48-hour delivery. 40+ sections. 14-day money-back guarantee.