
What we scraped, what we didn't, and why

The dataset behind these awards is publicly auditable. This page enumerates every field on the site, where it came from, and where it's missing — so nothing is taken on faith.

Data coverage at a glance

Top-25 communities: 25. All have full public-data coverage.

Fully covered: 5. Public + auth-gated fields populated.

Partial coverage: 20. Auth-gated fields missing.

Field-by-field source map

Every field surfaced on the site is one of two kinds: public (sourced from the public Skool API, authoritative across all 25 communities) or auth-gated (requires an authenticated session cookie to scrape — currently complete for 5/25).

| Field | Gate | Coverage | Source / actor |
| --- | --- | --- | --- |
| Member count | Public | 25/25 | Public Skool API. Authoritative. |
| Price per month | Public | 25/25 | Public Skool API (monthlyPriceUsd). Authoritative. |
| MRR tier | Public | 25/25 | Public: Skool's internal MRR ranking exposed on owner profile. |
| Founder social links | Public | 25/25 | Public: owner.social object on community page. |
| Landing page copy | Public | 25/25 | Public: marketing.landingPage.description. |
| Category labels & post counts | Auth-gated | 5/25 | Requires authenticated community page scrape (community.labels). |
| Top contributors | Auth-gated | 5/25 | Requires authenticated leaderboard scrape (sourabhbgp/skool-scraper). |
| Recent posts feed | Auth-gated | 5/25 | Requires auth_token cookie + memo23/skool-posts-with-comments-scraper. |
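The coverage column above can be recomputed from the raw records. The following is a minimal sketch, assuming each community is stored as a dictionary in which an unscraped field is None; the field names (member_count, top_contributors, and so on) are hypothetical stand-ins, not the site's actual schema.

```python
from typing import Any

# Hypothetical field names, chosen to mirror the rows of the source map.
PUBLIC_FIELDS = [
    "member_count",
    "price_per_month",
    "mrr_tier",
    "founder_social_links",
    "landing_page_copy",
]
AUTH_GATED_FIELDS = ["category_labels", "top_contributors", "recent_posts"]
ALL_FIELDS = PUBLIC_FIELDS + AUTH_GATED_FIELDS


def field_coverage(communities: list[dict[str, Any]]) -> dict[str, str]:
    """Return 'populated/total' for every field, e.g. '5/25'."""
    total = len(communities)
    coverage = {}
    for field in ALL_FIELDS:
        # A field counts as populated only if it was actually scraped.
        populated = sum(1 for c in communities if c.get(field) is not None)
        coverage[field] = f"{populated}/{total}"
    return coverage


def is_fully_covered(community: dict[str, Any]) -> bool:
    """True when both public and auth-gated fields are populated."""
    return all(community.get(field) is not None for field in ALL_FIELDS)
```

Counting None rather than falsy values keeps a legitimately empty result (for example, a community with no labels) distinct from a field that was never scraped.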

Actors used

Each row below is an Apify actor we ran. Hard cap: $40 spend across the whole pipeline (discovery + deep-dive + control group + posts feed + founder research). Spend tracked in data/raw/spend.json.
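One way to enforce a cap like this is a pre-flight check before each actor run. The sketch below assumes data/raw/spend.json holds a list of entries shaped like {"actor": ..., "usd": ...}; that schema and the helper names are assumptions for illustration, not the pipeline's actual format.

```python
import json
from pathlib import Path

SPEND_CAP_USD = 40.0  # hard cap stated above


def load_entries(path: str = "data/raw/spend.json") -> list[dict]:
    """Load spend entries; assumes a JSON list of {"actor", "usd"} objects."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def total_spend(entries: list[dict]) -> float:
    """Sum the USD cost recorded so far."""
    return sum(float(entry["usd"]) for entry in entries)


def can_run(entries: list[dict], estimated_usd: float,
            cap: float = SPEND_CAP_USD) -> bool:
    """True if running an actor with this estimated cost stays within the cap."""
    return total_spend(entries) + estimated_usd <= cap
```

For example, with 32.50 USD already recorded, can_run(entries, 7.5) allows the run and can_run(entries, 8.0) rejects it.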

| Actor | Purpose | Auth | Phase |
| --- | --- | --- | --- |
| easyapi/skool-groups-scraper | Discovery: query-based search across 50 AI terms | Public | Phase 2 |
| futurizerush/skool-group-scraper | Per-community deep-dive (member count, price, owner) | Public | Phase 4 |
| goat255/skool-scraper-goat | Enriched community profile (features, MRR tier, marketing) | Public | Phase 4 |
| sourabhbgp/skool-scraper | Top contributors leaderboard | Auth-gated | Phase 4 |
| memo23/skool-posts-with-comments-scraper | Posts feed + comment threads | Auth-gated | Phase 4 |
| futurizerush/skool-profile-scraper | Founder property crawl (Phase 6.8 Pass 2) | Public | Phase 6.8 |

Why some data is missing

Skool.com gates posts feeds, leaderboards, and category labels behind an authenticated session. Scraping those fields requires a fresh auth_token cookie from a logged-in browser session, which expires in 24-72 hours.

During this build, the auth cookie expired partway through the deep-dive pass. 5 of 25 communities completed before expiry; the remaining 20 have authoritative public-data fields but empty auth-gated fields. We deliberately do not fabricate or interpolate missing values — gaps are surfaced as 🔒 callouts on each affected community page rather than hidden.
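The no-fabrication rule can be made explicit at render time. This is a minimal sketch, assuming unscraped fields are stored as None; the callout wording and the function names are hypothetical.

```python
from typing import Any

# Hypothetical callout text; the site's actual wording may differ.
LOCKED_CALLOUT = "🔒 Auth-gated field not scraped"


def render_field(value: Any) -> str:
    """Show real values as-is; show a locked callout for missing ones.

    None means "not scraped". It is never replaced by an estimate,
    an average, or a value borrowed from another community.
    """
    if value is None:
        return LOCKED_CALLOUT
    return str(value)


def render_community(record: dict[str, Any]) -> dict[str, str]:
    """Render every field of one community record for display."""
    return {field: render_field(value) for field, value in record.items()}
```

An empty list is rendered as an empty list, not as a callout: it would mean the scrape ran and found nothing, which is different from the scrape never having run.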

To complete coverage: refresh the cookie at data/skool.cookies.json and re-run npm run posts-feed && npm run render-data.
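Because the cookie can expire in as little as 24 hours, a freshness check before re-running may avoid another partial pass. The sketch below uses the cookie file's modification time as a proxy for the cookie's age, which assumes the file is rewritten each time the cookie is refreshed; the function names are illustrative.

```python
import time
from pathlib import Path
from typing import Optional

# Conservative bound: the shortest lifetime in the 24-72 hour range above.
MIN_LIFETIME_HOURS = 24.0


def cookie_age_hours(path: str = "data/skool.cookies.json",
                     now: Optional[float] = None) -> float:
    """Hours since the cookie file was last modified."""
    current = time.time() if now is None else now
    return (current - Path(path).stat().st_mtime) / 3600.0


def is_probably_fresh(age_hours: float,
                      min_lifetime_hours: float = MIN_LIFETIME_HOURS) -> bool:
    """True if the cookie is younger than the shortest expected lifetime."""
    return 0 <= age_hours < min_lifetime_hours
```

A check like this only estimates freshness. The session can still be invalidated early, so a failed authenticated request remains the definitive signal.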

Causal blueprint methodology

Beyond the descriptive leaderboard, we ran a control-group pairwise analysis: each top-25 community was paired with a near-miss "control" (same niche, same launch year, different founder, 5-25% of the size). Traits that show up in the top-25 but not in their controls are candidate causal factors; traits common to both are treated as noise.
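The pairing criteria can be written as a single predicate. This is a sketch under stated assumptions: the Community record and its fields are hypothetical, and "size" is taken to mean member count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Community:
    # Hypothetical record; field names are illustrative only.
    name: str
    niche: str
    launch_year: int
    founder: str
    members: int


def is_valid_control(top: Community, candidate: Community,
                     min_ratio: float = 0.05, max_ratio: float = 0.25) -> bool:
    """Check the four pairing criteria for a near-miss control."""
    if top.members <= 0:
        return False
    size_ratio = candidate.members / top.members
    return (
        candidate.niche == top.niche
        and candidate.launch_year == top.launch_year
        and candidate.founder != top.founder
        and min_ratio <= size_ratio <= max_ratio
    )
```

Under these assumptions, a 1,500-member community qualifies as a control for a 10,000-member one in the same niche and launch year, while a 3,000-member one is too large.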

Eighteen factors across six thematic clusters (founder/distribution, timing/waves, offer/format, trust/proof, AI-platform/infrastructure, network/geography) are scored on prevalence gap. A factor is flagged "high confidence" when it differentiates the top-25 community from its control in at least 18 of the 25 paired comparisons; these are the load-bearing levers in the K1 launch playbook.
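The scoring for a single factor can be sketched as follows. Two definitions here are assumptions, since the page does not spell them out: the prevalence gap is taken as the share of top-25 communities with the factor minus the share of controls with it, and a pair counts as "differentiated" when the factor is present in the top-25 community but absent in its control.

```python
def score_factor(pairs: list[tuple[bool, bool]], threshold: int = 18) -> dict:
    """Score one factor across paired comparisons.

    Each pair is (present_in_top, present_in_control).
    """
    n = len(pairs)
    if n == 0:
        raise ValueError("at least one paired comparison is required")
    top_prevalence = sum(1 for top, _ in pairs if top) / n
    control_prevalence = sum(1 for _, ctrl in pairs if ctrl) / n
    # Assumed definition: factor present in top-25, absent in its control.
    differentiated = sum(1 for top, ctrl in pairs if top and not ctrl)
    return {
        "prevalence_gap": top_prevalence - control_prevalence,
        "differentiated": differentiated,
        "high_confidence": differentiated >= threshold,
    }
```

With 25 pairs in which 18 are differentiated, 4 show the factor on both sides, and 3 show it on neither, the prevalence gap is 22/25 - 4/25 = 0.72 and the factor is flagged high confidence.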

Full results: /blueprint.