Update README for CLI args usage
parent 6d56c30f3b
commit aa33c361c2
1 changed file with 105 additions and 0 deletions
README.md
@@ -0,0 +1,105 @@
# skool-lesson-scrape

Download lessons from any Skool community classroom to local Markdown files.

- Works on macOS, Windows, and Linux
- Skips lessons that are already saved, so it is safe to re-run when new content is added
- Saves one `.md` file per lesson, named `Course Name -- Lesson Title.md`
- Respects your membership tier: it only downloads content your account can access
- Works well with Obsidian, Notion, or any Markdown-based knowledge system

---

## Requirements

- Python 3.8 or later
- A paid Skool account with access to the community you want to scrape

---

## Setup

```bash
pip install -r requirements.txt
playwright install chromium
```

---

## Usage

```bash
python scrape.py <community>
```

A browser window opens. Log in to Skool as usual (email and password, or Google). The script takes over automatically once you land on the community page.

**With a custom output folder:**

```bash
python scrape.py <community> --output ~/Documents/my-lessons
```

**Re-run any time.** Lessons that are already saved are skipped automatically.

**Debug mode** inspects the page structure without saving anything:

```bash
python scrape.py <community> --discover
```
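
For readers who want to see how these arguments fit together, here is a minimal sketch of a parser that accepts the same command line. It is an illustration only: `build_parser` is a hypothetical name, and the actual `scrape.py` may define its arguments differently. The default folder is the one documented under "Default output folder".

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented in this README."""
    parser = argparse.ArgumentParser(
        prog="scrape.py",
        description="Download Skool lessons to local Markdown files.",
    )
    # Positional argument: the community slug from the Skool URL.
    parser.add_argument("community", help="community slug (the part after skool.com/)")
    # Optional output folder; falls back to the documented default.
    parser.add_argument(
        "--output",
        type=Path,
        default=Path("~/skool-lessons"),
        help="folder to save lessons into",
    )
    # Debug switch: inspect the page structure without saving anything.
    parser.add_argument("--discover", action="store_true", help="inspect only, save nothing")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["my-community", "--output", "~/Documents/my-lessons"])
print(args.community, args.output, args.discover)
```

Passing an explicit list to `parse_args` keeps the example runnable on its own; a real script would call `parse_args()` with no arguments to read `sys.argv`.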

### Finding your community slug

Your community slug is the first path segment of the URL after `skool.com/`:

```
https://www.skool.com/my-community -> my-community
```
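
If you would rather paste a full URL than pick out the slug by hand, the rule above can be sketched in a few lines of Python. The helper name `community_slug` is hypothetical and is not part of `scrape.py`; it assumes the slug is always the first path segment.

```python
from urllib.parse import urlparse


def community_slug(value: str) -> str:
    """Return the slug from a Skool URL, or the value itself if it is already a slug."""
    value = value.strip()
    if "skool.com" not in value:
        # Assume bare input such as "my-community" is already a slug.
        return value.strip("/")
    if "://" not in value:
        # urlparse needs a scheme to separate the host from the path.
        value = "https://" + value
    segments = [part for part in urlparse(value).path.split("/") if part]
    if not segments:
        raise ValueError(f"no community slug found in {value!r}")
    # The slug is the first path segment; anything after it is ignored.
    return segments[0]


print(community_slug("https://www.skool.com/my-community"))
```

The same function accepts a bare slug, so it could sit in front of the command line without changing how existing invocations behave.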

### Examples

```bash
python scrape.py navaigate
python scrape.py my-community --output ~/Documents/Lessons
python scrape.py my-community --discover
```

### Default output folder

If you omit `--output`, lessons are saved to `~/skool-lessons`.

**Obsidian users** can point `--output` at a folder inside their vault:

```bash
python scrape.py my-community --output ~/Documents/MyVault/Lessons
```

**Windows users** should put quotes around paths that contain spaces:

```bash
python scrape.py my-community --output "C:/Users/Your Name/Documents/skool-lessons"
```
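
Not every shell expands `~` before the script sees it, so a script like this one would typically expand it itself. The sketch below shows one way to do that with `pathlib`; `resolve_output_dir` and `DEFAULT_OUTPUT` are hypothetical names, and the real script may handle paths differently.

```python
from pathlib import Path
from typing import Optional

# Default folder documented above; the constant name is an assumption.
DEFAULT_OUTPUT = "~/skool-lessons"


def resolve_output_dir(raw: Optional[str] = None) -> Path:
    """Expand '~', fall back to the default, and create the folder if needed."""
    path = Path(raw or DEFAULT_OUTPUT).expanduser()
    # parents=True also creates missing intermediate folders;
    # exist_ok=True makes re-runs harmless.
    path.mkdir(parents=True, exist_ok=True)
    return path
```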

---

## How it works

Skool embeds the course and lesson structure as JSON in the page source (`__NEXT_DATA__`). The script reads that JSON directly to get course and lesson IDs, then navigates to each lesson and extracts the body text from Skool's TipTap editor (the `.ProseMirror` selector). Course discovery therefore avoids fragile DOM scraping, on the expectation that the JSON structure changes less often than the page markup; only the lesson body is read from the rendered page, through that single selector.
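
The JSON step can be sketched with the standard library alone. This is an illustration under assumptions, not the script's actual code: `extract_next_data` is a hypothetical helper, and the `courses` key in the sample page is invented for the example and does not reflect Skool's real schema.

```python
import json
import re

# Matches the <script id="__NEXT_DATA__"> tag that Next.js pages embed.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> dict:
    """Return the parsed __NEXT_DATA__ payload from a page's HTML source."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))


# Hypothetical page: the "courses" layout is made up for illustration.
sample = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"courses": [{"id": "c1", "title": "Intro"}]}}}'
    "</script></body></html>"
)
data = extract_next_data(sample)
for course in data["props"]["pageProps"]["courses"]:
    print(course["id"], course["title"])
```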

---

## Notes

- Content is gated by your own Skool membership: you can only download lessons your account has access to
- This tool is for personal offline backup, not for redistributing community content
- Re-running after new lessons are posted downloads only the new ones
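
The skip-on-re-run behaviour follows naturally from the `Course Name -- Lesson Title.md` naming scheme: if the file exists, the lesson is already saved. A minimal sketch, assuming that is how the check works; `lesson_path` and `save_if_new` are hypothetical helpers, and the character replacement is an assumption about how unsafe file name characters might be handled.

```python
import re
from pathlib import Path


def lesson_path(output_dir: Path, course: str, title: str) -> Path:
    """Build '<Course Name> -- <Lesson Title>.md' inside output_dir."""

    def clean(text: str) -> str:
        # Assumption: replace characters that are invalid in Windows file names.
        return re.sub(r'[<>:"/\\|?*]', "-", text).strip()

    return output_dir / f"{clean(course)} -- {clean(title)}.md"


def save_if_new(output_dir: Path, course: str, title: str, body: str) -> bool:
    """Write the lesson unless its file already exists; return True if written."""
    path = lesson_path(output_dir, course, title)
    if path.exists():
        return False  # already saved on an earlier run, so skip it
    path.write_text(body, encoding="utf-8")
    return True
```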

---

## Troubleshooting

**"No courses found"**: a diagnostic HTML file is saved to your system temp folder. The page structure may have changed; open an issue and attach that HTML file.
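
The location of the system temp folder varies by platform. Python reports it through `tempfile.gettempdir()`, and a diagnostic dump could be written along these lines. This is a sketch only: `dump_diagnostic_html` and the file name pattern are hypothetical, so the file the script actually writes may be named differently.

```python
import tempfile
from pathlib import Path


def dump_diagnostic_html(html: str, community: str) -> Path:
    """Save page HTML to the system temp folder and return the file path."""
    # Hypothetical file name; the real script may use another one.
    path = Path(tempfile.gettempdir()) / f"skool-{community}-diagnostic.html"
    path.write_text(html, encoding="utf-8")
    return path


# Show where the temp folder is on this machine.
print(tempfile.gettempdir())
```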

**Browser closes immediately**: make sure you have completed the Playwright browser install with `playwright install chromium`.

**Lesson files contain navigation boilerplate instead of lesson text**: run the script with `--discover` and open an issue with the output.

---

Built by [Kisa Fenn](https://github.com/kisasttil-gif), STTIL Solutions