YC Startup Failure Dataset: 1,764 Dead Companies — Patterns, Sources & Scraping Methods
Merged YC Graveyard + Startups.RIP dataset of 1,764 dead YC startups with failure patterns, industry breakdowns, and data acquisition methods.
Core Dataset
1,764 deduplicated dead/inactive YC companies merged from two primary sources:
- YC Graveyard (1,002 companies, structured: id/name/batch/industry/website/description/tags)
- Startups.RIP (1,000 slugs via sitemap; detailed post-mortems behind Pro paywall)
- Overlap between sources: only 238 companies (13.5% of the merged 1,764; 1,002 + 1,000 - 238 = 1,764), so definitions and coverage diverge significantly
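The merge step can be sketched roughly as follows. This is a minimal illustration, not the method actually used to build the dataset: the normalization rule in `norm_key`, the `sources` field, and the function names are all assumptions, and real deduplication would likely also need manual review of near-matches.

```python
import re


def norm_key(name: str) -> str:
    """Hypothetical normalization: lowercase and drop non-alphanumerics,
    so that 'Foo.io' and 'foo-io' map to the same key."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def merge_sources(graveyard: list, rip_slugs: list) -> tuple:
    """Merge YC Graveyard rows with Startups.RIP slugs.

    Returns (merged, overlap): merged maps normalized key -> record,
    overlap counts companies that appear in both sources.
    """
    merged = {}
    for row in graveyard:
        # Prefer the slug when present, otherwise fall back to the name.
        key = norm_key(row.get("slug") or row["name"])
        merged[key] = {**row, "sources": ["yc_graveyard"]}

    overlap = 0
    for slug in rip_slugs:
        key = norm_key(slug)
        if key in merged:
            # Count each overlapping company once, even if the slug repeats.
            if "startups_rip" not in merged[key]["sources"]:
                merged[key]["sources"].append("startups_rip")
                overlap += 1
        else:
            merged[key] = {"slug": slug, "sources": ["startups_rip"]}
    return merged, overlap
```

With the counts above, a merge of this shape would give 1,002 + 1,000 - 238 = 1,764 records.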
Data Acquisition Methods
YC Graveyard
- Request ycgraveyard.iamwillwang.com; the Astro SSR page embeds the full JSON in its HTML (entity-encoded)
- Extract with regex: yields id, name, slug, batch, industry, website, description, tags for all 1,002 entries
- No auth required; single request
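The extraction steps above might look something like the sketch below. The `https://` scheme, the `props="..."` attribute pattern, and the shape of the decoded JSON are assumptions rather than details confirmed by the source. Inspect the actual page markup before relying on the regex, since Astro may wrap values in its own serialization format.

```python
import html
import json
import re
from urllib.request import urlopen

# Assumed scheme; the notes give only the host name.
URL = "https://ycgraveyard.iamwillwang.com"

# Hypothetical pattern: assumes the payload sits in a props="..." attribute
# whose inner double quotes are entity-encoded as &quot;.
PROPS_RE = re.compile(r'props="([^"]*)"')


def fetch_page(url: str = URL) -> str:
    """Download the page with a single unauthenticated GET."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def extract_embedded_json(page_html: str) -> list:
    """Return every props payload that decodes to valid JSON."""
    payloads = []
    for match in PROPS_RE.finditer(page_html):
        decoded = html.unescape(match.group(1))  # &quot; -> ", &amp; -> &
        try:
            payloads.append(json.loads(decoded))
        except json.JSONDecodeError:
            continue  # not a JSON payload; skip it
    return payloads
```

Calling `extract_embedded_json(fetch_page())` would then return the decoded payloads, from which the company records can be picked out once their structure is known.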
Startups.RIP
- Fetch /sitemap.xml and extract company URL slugs (1,000+)
- Full post-mortem content requires Pro subscription
- Sitemap gives company list for free; use as slug index
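Building the slug index from the sitemap can be sketched as below, using only the standard library. The `path_prefix` default of `/company/` is a hypothetical URL layout, not one confirmed by the source; adjust it to whatever path the sitemap entries actually use.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Standard namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def slugs_from_sitemap(xml_text: str, path_prefix: str = "/company/") -> list:
    """Extract unique company slugs from sitemap XML, preserving order.

    path_prefix is an assumed URL layout (hypothetical).
    """
    root = ET.fromstring(xml_text)
    seen = set()
    slugs = []
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        path = urlparse((loc.text or "").strip()).path
        if not path.startswith(path_prefix):
            continue  # skip non-company pages such as /about
        slug = path[len(path_prefix):].strip("/")
        if slug and slug not in seen:
            seen.add(slug)
            slugs.append(slug)
    return slugs
```

The resulting list serves only as an index of company names; it carries none of the paywalled post-mortem content.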
Failory / Forbes
- Both protected by CAPTCHA (Cloudflare / Datadome)
- Workaround: pull from search engine cached results / snippets
- Sufficient for top-N case extraction, not bulk scraping
Failure Statistics
| Metric | Value |
|--------|-------|
| YC total portfolio (2005–2025) | 5,700+ |
| Confirmed dead/inactive | 1,764+ |
| Surface failure rate | ~20% |
| Early cohorts (first 17 batches) actual failure rate | ~40% |
Death by Vintage
| Period | Deaths | Note |
|--------|--------|------|
| 2005–2009 | 82 | Small early batches |
| 2010–2014 | 206 | Scaling phase |
| 2015–2018 | 273 | Rapid expansion |
| 2019–2021 | 294 | Peak deaths — W20=57, W21=58, S21=53 |
| 2022–2025 | 147 | Still in observation window |
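As a quick consistency check on the two tables, the arithmetic can be reproduced as below. The period counts sum to 1,002, which matches the YC Graveyard count and not the merged 1,764. The source does not state which numerator the ~20% surface rate uses, so both ratios are shown. The portfolio and death totals are given as "5,700+" and "1,764+", so the results are approximate.

```python
# Figures copied from the tables above.
deaths_by_period = {
    "2005-2009": 82,
    "2010-2014": 206,
    "2015-2018": 273,
    "2019-2021": 294,
    "2022-2025": 147,
}
PORTFOLIO = 5700    # "5,700+" in the table
MERGED_DEAD = 1764  # "1,764+" in the table

total = sum(deaths_by_period.values())

# Share of the tabulated deaths that falls in each period, in percent.
shares = {p: round(100 * n / total, 1) for p, n in deaths_by_period.items()}

# Two candidate surface failure rates, in percent.
rate_from_vintage = round(100 * total / PORTFOLIO, 1)
rate_from_merged = round(100 * MERGED_DEAD / PORTFOLIO, 1)

print(total)              # 1002
print(rate_from_vintage)  # 17.6
print(rate_from_merged)   # 30.9
print(shares["2019-2021"])  # 29.3
```

The vintage-table ratio of about 17.6% is the closer of the two to the stated ~20%, while the merged count implies roughly 31%.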