From aa33c361c25db1b37c70a13c94a2f5c58f42a708 Mon Sep 17 00:00:00 2001
From: sttil
Date: Wed, 6 May 2026 20:09:18 +0000
Subject: [PATCH] Update README for CLI args usage

---
 README.md | 105 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 105 insertions(+)

diff --git a/README.md b/README.md
index e69de29..6530ff3 100644
--- a/README.md
+++ b/README.md
@@ -0,0 +1,105 @@
+# skool-lesson-scrape
+
+Download lessons from any Skool community classroom to local Markdown files.
+
+- Works on Mac, Windows, and Linux
+- Skips lessons already saved — safe to re-run when new content is added
+- Saves one `.md` file per lesson: `Course Name -- Lesson Title.md`
+- Respects your membership tier — only downloads content your account can access
+- Works great with Obsidian, Notion, or any Markdown-based knowledge system
+
+---
+
+## Requirements
+
+- Python 3.8 or later
+- A paid Skool account with access to the community you want to scrape
+
+---
+
+## Setup
+
+```bash
+pip install -r requirements.txt
+playwright install chromium
+```
+
+---
+
+## Usage
+
+```bash
+python scrape.py
+```
+
+A browser window will open. Log in to Skool normally (email/password or Google). The script takes over automatically once you land on the community.
+
+**With a custom output folder:**
+```bash
+python scrape.py --output ~/Documents/my-lessons
+```
+
+**Re-run anytime** — already-saved lessons are skipped automatically.
+
+**Debug mode** — inspect page structure without saving anything:
+```bash
+python scrape.py --discover
+```
+
+### Finding your community slug
+
+Your community slug is the part of the URL after `skool.com/`:
+
+```
+https://www.skool.com/my-community -> my-community
+```
+
+### Examples
+
+```bash
+python scrape.py navaigate
+python scrape.py my-community --output ~/Documents/Lessons
+python scrape.py my-community --discover
+```
+
+### Default output folder
+
+If you omit `--output`, lessons are saved to `~/skool-lessons`.
+
+**Obsidian users** — point `--output` at a folder inside your vault:
+```bash
+python scrape.py my-community --output ~/Documents/MyVault/Lessons
+```
+
+**Windows users** — use quotes around paths with spaces:
+```bash
+python scrape.py my-community --output "C:/Users/YourName/Documents/skool-lessons"
+```
+
+---
+
+## How it works
+
+Skool embeds course and lesson structure as JSON in the page source (`__NEXT_DATA__`). The script reads that directly to get course and lesson IDs, then navigates to each lesson and extracts the body text from Skool's TipTap editor (`.ProseMirror` selector). No fragile DOM scraping — the JSON structure is stable.
+
+---
+
+## Notes
+
+- Content is gated by your own Skool membership — you can only download lessons your account has access to
+- This tool is for personal offline backup, not redistribution of community content
+- Re-running after new lessons are posted will only download what's new
+
+---
+
+## Troubleshooting
+
+**"No courses found"** — a diagnostic HTML file is saved to your system temp folder. The page structure may have changed; open an issue with the HTML attached.
+
+**Browser closes immediately** — make sure you completed the Playwright browser install: `playwright install chromium`
+
+**Lessons saving as navigation boilerplate** — run `--discover` and open an issue with the output.
+
+---
+
+Built by [Kisa Fenn](https://github.com/kisasttil-gif) — STTIL Solutions
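
The "How it works" section of the README describes reading course structure from the `__NEXT_DATA__` JSON and saving one file per lesson as `Course Name -- Lesson Title.md`. The following is a minimal sketch of those two steps, not the code in `scrape.py`. The function names `extract_next_data` and `lesson_path` are hypothetical, and so is the filename sanitizing rule; the README does not say how illegal filename characters are handled. The only thing assumed about the page is the standard Next.js convention of a `<script id="__NEXT_DATA__">` tag holding JSON.

```python
import json
import re
from pathlib import Path

# Next.js pages embed their initial data in a <script id="__NEXT_DATA__"> tag.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> dict:
    """Return the parsed __NEXT_DATA__ JSON from a page's HTML source."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))


def lesson_path(output_dir: Path, course: str, lesson: str) -> Path:
    """Build the 'Course Name -- Lesson Title.md' path for one lesson.

    Assumption: characters that are illegal in Windows filenames are
    replaced with underscores. The real script may sanitize differently.
    """
    safe = re.sub(r'[<>:"/\\|?*]', "_", f"{course} -- {lesson}")
    return output_dir / f"{safe}.md"


def needs_download(path: Path) -> bool:
    """A lesson is skipped on re-runs when its file already exists."""
    return not path.exists()
```

The keys inside the parsed JSON (where courses and lesson IDs live) are specific to Skool and are not documented in the README, so the sketch stops at returning the parsed dictionary.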