* Add worker threads architecture plan for blog generation * fix: tries to optimize rendering, still slow * feat: moved site rendering into web worker * fix: calendar grabs from central data source for calendar * fix: feeds now use blog language content and not canonical content --------- Co-authored-by: hugo <hugoms@me.com>
Worker Threads Architecture for Blog Generation
Problem
Cmd-R (Render Site) is slow with 10k+ posts / ~20k pages. The rendering pipeline is CPU-bound (Liquid templates + Markdown parsing). All current Promise.all parallelism just interleaves I/O on a single core. Actual multi-core parallelism via worker_threads is needed.
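The single-core claim can be illustrated with a small sketch. The `cpuTask` helper and its busy loop are illustrative stand-ins, not project code. An `async` function runs synchronously until its first `await`, so CPU-bound tasks passed to `Promise.all` run one after another on the same thread:

```typescript
// Hypothetical stand-in for a CPU-bound render step (no I/O, no await inside).
const log: string[] = [];

async function cpuTask(name: string, iterations: number): Promise<number> {
  log.push(`${name}:start`);
  let acc = 0;
  for (let i = 0; i < iterations; i++) {
    acc = (acc + i * 31) % 1000003; // busy work standing in for template rendering
  }
  log.push(`${name}:end`);
  return acc;
}

// Each call finishes its CPU work before the next one starts,
// even though both are handed to Promise.all "in parallel".
const pending = Promise.all([cpuTask('single', 1000000), cpuTask('category', 1000000)]);
void pending;

// The log is already complete here and strictly sequential.
console.log(log.join(' '));
```

Only tasks that actually yield (I/O, timers) interleave this way; pure computation does not, which is why a separate thread per chunk is needed.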
Current Architecture
    blogHandlers.ts (IPC entry)
      → preloadGenerationData()  (loads all posts, translations, menu)
      → Promise.all([5 section tasks])  ← runs on ONE CPU core
          ├── core     (root pages, sitemap, feeds, assets)
          ├── single   (one page per post; THE bottleneck, ~20k pages)
          ├── category (paginated category index pages)
          ├── tag      (paginated tag index pages)
          └── date     (year/month/day archive pages)
Each section calls BlogGenerationEngine.generate() which:
1. Builds GenerationPostIndex (tags, categories, date buckets)
2. Bulk-loads file hashes from DB
3. Creates route renderer (Liquid + PreviewServer + cached engines)
4. Renders pages sequentially/batched → writes files if hash changed
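Step 4 (write only when the hash changed) could look roughly like the sketch below. It assumes SHA-256 hex digests and uses an in-memory `Map` as a stand-in for the hash cache that is bulk-loaded from the DB; the function name and the temp-directory path are hypothetical.

```typescript
import { createHash } from 'node:crypto';
import { mkdirSync, mkdtempSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { dirname, join } from 'node:path';

// Assumption: output path → content hash, preloaded from the DB in the real engine.
type HashCache = Map<string, string>;

function writeIfChanged(outputPath: string, html: string, hashes: HashCache): 'written' | 'skipped' {
  const hash = createHash('sha256').update(html).digest('hex');
  if (hashes.get(outputPath) === hash) {
    return 'skipped'; // identical content was already written
  }
  mkdirSync(dirname(outputPath), { recursive: true });
  writeFileSync(outputPath, html, 'utf8');
  hashes.set(outputPath, hash); // the real engine would also persist this to the DB
  return 'written';
}

// Usage: the second call with identical HTML is skipped.
const hashes: HashCache = new Map();
const out = join(mkdtempSync(join(tmpdir(), 'gen-')), 'posts', 'hello', 'index.html');
const first = writeIfChanged(out, '<h1>Hello</h1>', hashes);
const second = writeIfChanged(out, '<h1>Hello</h1>', hashes);
```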
Shared mutable state across sections:
- SQLite database (libsql, WAL mode): singleton connection
- File system output directory (but sections write to disjoint paths)
- Template caches (Liquid `cache: true`): populated once, read-only after
- PreloadedGenerationData: read-only after creation
Existing worker_threads usage: Pyodide macro workers (pythonMacro.worker.ts, BlogmarkPythonWorkerRuntime.ts) already use worker_threads successfully.
Target Architecture
    MAIN THREAD                    WORKER THREADS (N = cpu_count - 1)
    ───────────                    ──────────────────────────────────
    blogHandlers.ts
      preloadGenerationData()
      serialize + partition posts
          │
          ├──► Worker 1: posts[0..chunk]        → own DB conn, own Liquid, render + write
          ├──► Worker 2: posts[chunk..2chunk]   → own DB conn, own Liquid, render + write
          ├──► Worker 3: posts[2chunk..3chunk]  → own DB conn, own Liquid, render + write
          └──► Worker N: posts[...rest]         → own DB conn, own Liquid, render + write
          │
          ├── receive progress messages → TaskManager.emit()
          ├── receive results → merge stats
          └── return merged BlogGenerationResult
Phased Implementation
Phase 1 — Single Post Worker Pool (highest impact)
Move generateSinglePostPages to a worker pool. Single posts are the bottleneck: they account for nearly all of the ~20k pages. Other sections stay in the main thread.
1.1 Spike: Verify dependencies work in worker_threads
- Test `@libsql/client` opens a second connection in a worker thread (WAL mode)
- Test `liquidjs` renders a template in a worker thread
- Measure memory overhead per worker with 10k post metadata
1.2 Create generation.worker.ts
New worker entry point that:
- Receives via `workerData`: serialized options, post chunk, template roots, DB path (the hash cache is loaded from the DB by each worker, see 1.3)
- Opens its own `@libsql/client` connection (WAL mode allows concurrent readers alongside one writer; writes from different workers are serialized)
- Creates its own `Liquid` instance with `cache: true` + registers custom filters
- Creates its own `PageRenderer`, `PreviewServer`, route renderer
- Renders assigned posts → writes HTML files + updates file hashes in DB
- Sends progress via `parentPort.postMessage({ type: 'progress', ... })`
- Sends result via `parentPort.postMessage({ type: 'result', stats })`
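A minimal sketch of what the worker entry could look like. The `WorkerJob` shape, `runJob`, and the `renderPost` callback are hypothetical placeholders; real rendering would be asynchronous and would create the DB connection and Liquid instance first. Passing `send` as a parameter keeps the loop testable without spawning a thread.

```typescript
import { isMainThread, parentPort, workerData } from 'node:worker_threads';

// Hypothetical message and job shapes; the real payload would carry more fields.
type WorkerMessage =
  | { type: 'progress'; value: number; message: string }
  | { type: 'result'; stats: { pagesWritten: number; pagesSkipped: number } };

interface WorkerJob {
  postIds: string[];
}

// renderPost stands in for "render + write if hash changed"; it returns true when a file was written.
function runJob(
  job: WorkerJob,
  renderPost: (postId: string) => boolean,
  send: (msg: WorkerMessage) => void,
): void {
  const stats = { pagesWritten: 0, pagesSkipped: 0 };
  job.postIds.forEach((postId, index) => {
    if (renderPost(postId)) stats.pagesWritten++;
    else stats.pagesSkipped++;
    send({ type: 'progress', value: (index + 1) / job.postIds.length, message: postId });
  });
  send({ type: 'result', stats });
}

// Wire up parentPort only when this file actually runs as a worker.
if (!isMainThread && parentPort) {
  const port = parentPort;
  runJob(workerData as WorkerJob, () => true /* real rendering goes here */, (msg) => port.postMessage(msg));
}
```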
1.3 Serialize PreloadedGenerationData
- `PostData[]` contains `Date` objects → serialize to ISO strings, parse back in worker
- Post content is lazy-loaded from filesystem → workers read post files directly
- `HtmlRewriteContext` maps → pass as plain `Record<string, string>` (already partially converted)
- Each worker bulk-loads its own `generatedHashCache` from DB
1.4 Extract PageRenderer factory for workers
- Extract filter registration (markdown, i18n) into a shared `createPageRenderer(config)` function
- Workers call this with their own DB-backed engines
- Keep `macroTemplateCache` and `macroLiquid` as worker-local singletons (they self-populate)
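One possible shape for the shared filter registration that `createPageRenderer(config)` would call. The filter names (`markdown`, `t`), the config fields, and `registerSharedFilters` are assumptions for illustration; the only engine method relied on is `registerFilter(name, fn)`, which liquidjs's `Liquid` class provides.

```typescript
// Structural view of the single engine method this sketch needs.
interface FilterHost {
  registerFilter(name: string, filter: (...args: any[]) => unknown): void;
}

// Hypothetical config: the real factory would receive DB-backed engines.
interface PageRendererConfig {
  renderMarkdown: (source: string) => string;
  translate: (key: string, language: string) => string;
  language: string;
}

// Called once per Liquid instance, in the main thread and in each worker.
function registerSharedFilters(engine: FilterHost, config: PageRendererConfig): void {
  engine.registerFilter('markdown', (source: string) => config.renderMarkdown(String(source)));
  engine.registerFilter('t', (key: string) => config.translate(String(key), config.language));
}
```

Because the function only depends on the config it is given, each worker can call it with its own engines and no state is shared across threads.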
1.5 Create GenerationWorkerPool
New class that:
- Spawns N workers (`os.cpus().length - 1`, configurable, min 1)
- Distributes post chunks to workers (round-robin or equal split)
- Collects progress messages → relays to `onProgress` callback
- Collects results → merges stats (pagesWritten, pagesSkipped)
- Handles worker errors/crashes gracefully
- Tears down workers when generation completes
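The pool responsibilities above can be sketched as three small pieces: worker count, an equal split, and a promise wrapper around one worker. Function names, the message shape, and the `workerFile` path are assumptions; the real pool would add retries and a configurable size.

```typescript
import { cpus } from 'node:os';
import { Worker } from 'node:worker_threads';

interface ChunkStats {
  pagesWritten: number;
  pagesSkipped: number;
}

type WorkerMessage =
  | { type: 'progress'; value: number; message: string }
  | { type: 'result'; stats: ChunkStats };

// Leave one core for the main thread, but always use at least one worker.
function defaultWorkerCount(): number {
  return Math.max(1, cpus().length - 1);
}

// Equal split into at most `parts` contiguous chunks.
function partition<T>(items: readonly T[], parts: number): T[][] {
  const count = Math.max(1, Math.min(parts, items.length || 1));
  const size = Math.ceil(items.length / count);
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Runs one chunk in one worker. workerFile is the compiled worker entry (path is an assumption).
function runChunk(workerFile: string, chunk: unknown, onProgress: (value: number) => void): Promise<ChunkStats> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerFile, { workerData: chunk });
    let finished = false;
    worker.on('message', (msg: WorkerMessage) => {
      if (msg.type === 'progress') onProgress(msg.value);
      if (msg.type === 'result') {
        finished = true;
        resolve(msg.stats);
        void worker.terminate(); // tear down once the result is in
      }
    });
    worker.on('error', reject); // uncaught exception inside the worker
    worker.on('exit', (code) => {
      if (!finished) reject(new Error(`worker exited with code ${code} before sending a result`));
    });
  });
}
```

A coordinator would then call `Promise.all(partition(posts, defaultWorkerCount()).map(...))` and decide per rejected chunk whether to retry or report a partial result.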
1.6 Refactor BlogGenerationEngine.generate() coordinator
- Split into coordinator (main thread) + worker payload
- Coordinator: loads data, partitions posts, manages worker pool, merges results
- Multi-language subtree loop: each language pass creates a new set of worker tasks
- Non-single sections (core, category, tag, date) remain in main thread
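A sketch of the coordinator loop, under the assumption that each language pass dispatches its own set of chunks and that a `runChunk` callback (hypothetical, injected here) wraps the worker pool. It uses a round-robin split, the other distribution option mentioned in 1.5.

```typescript
interface ChunkStats {
  pagesWritten: number;
  pagesSkipped: number;
}

// Round-robin distribution: item i goes to chunk i % parts; empty chunks are dropped.
function roundRobin<T>(items: readonly T[], parts: number): T[][] {
  const count = Math.max(1, parts);
  const chunks: T[][] = Array.from({ length: count }, () => [] as T[]);
  items.forEach((item, i) => {
    chunks[i % count].push(item);
  });
  return chunks.filter((chunk) => chunk.length > 0);
}

function mergeStats(results: readonly ChunkStats[]): ChunkStats {
  return results.reduce(
    (acc, r) => ({
      pagesWritten: acc.pagesWritten + r.pagesWritten,
      pagesSkipped: acc.pagesSkipped + r.pagesSkipped,
    }),
    { pagesWritten: 0, pagesSkipped: 0 },
  );
}

// One pass per language; chunks within a pass run concurrently.
async function generateAllLanguages(
  postIdsByLanguage: Record<string, string[]>,
  workers: number,
  runChunk: (language: string, postIds: string[]) => Promise<ChunkStats>,
): Promise<ChunkStats> {
  const results: ChunkStats[] = [];
  for (const language of Object.keys(postIdsByLanguage)) {
    const chunks = roundRobin(postIdsByLanguage[language], workers);
    const passResults = await Promise.all(chunks.map((chunk) => runChunk(language, chunk)));
    results.push(...passResults);
  }
  return mergeStats(results);
}
```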
1.7 Progress reporting
- Workers: `parentPort.postMessage({ type: 'progress', value, message })`
- Main thread: listen on each worker, relay to `TaskManager.emit()` → IPC → renderer
- Aggregate progress across all workers for accurate progress bar
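Aggregation could weight each worker's progress by its chunk size so that a small final chunk does not distort the bar. This is a sketch; the `ProgressAggregator` class is hypothetical, and it assumes each worker reports `value` as a fraction between 0 and 1.

```typescript
class ProgressAggregator {
  private readonly weights: readonly number[];
  private readonly fractions: number[];
  private readonly onProgress: (value: number) => void;

  // chunkSizes[i] is the number of posts assigned to worker i.
  constructor(chunkSizes: readonly number[], onProgress: (value: number) => void) {
    this.weights = chunkSizes;
    this.fractions = chunkSizes.map(() => 0);
    this.onProgress = onProgress;
  }

  // Called for every { type: 'progress' } message from worker `workerIndex`.
  update(workerIndex: number, fraction: number): void {
    const clamped = Math.min(1, Math.max(0, fraction));
    // Never move backwards, even if messages arrive out of order.
    this.fractions[workerIndex] = Math.max(this.fractions[workerIndex], clamped);
    const total = this.weights.reduce((sum, w) => sum + w, 0);
    const done = this.fractions.reduce((sum, f, i) => sum + f * this.weights[i], 0);
    this.onProgress(total === 0 ? 1 : done / total);
  }
}

// Usage: two workers with 100 and 300 posts.
const reported: number[] = [];
const aggregator = new ProgressAggregator([100, 300], (value) => {
  reported.push(value);
});
aggregator.update(0, 1);   // worker 0 finished its 100 posts
aggregator.update(1, 0.5); // worker 1 is halfway through its 300 posts
```

In the main thread, the `onProgress` callback would be the place to call `TaskManager.emit()`, possibly throttled so the IPC channel is not flooded.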
1.8 Testing
- Unit tests: mock worker pool, test coordinator logic
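- Integration test: spawn real workers against a temp-file SQLite database + template files (a plain in-memory database is private to the connection that created it, so workers opening their own connections would not see the test data)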
- Verify existing `BlogGenerationEngine.test.ts` tests still pass (they mock at engine boundary)
Phase 2 — All Sections in Workers
Move category/tag/date sections to workers too. Each section gets one worker.
- Category pages → one worker
- Tag pages → one worker
- Date archive pages → one worker
- Core pages stay in main thread (sitemap/feeds/assets are one-time + small)
Phase 3 — Python Macro Handling
Handle posts with Python macros across worker boundaries.
Recommended approach: Pre-expansion pass
- Before distributing posts to workers, scan for Python macro markers
- Expand macros in main thread (Pyodide is already in a worker, so reuse the existing `PythonMacroWorkerRuntime`)
- Cache expanded content
- Pass pre-expanded content to generation workers
Alternative approaches (if pre-expansion is too slow):
- Workers send macro calls back to main thread via messages (RPC pattern)
- Workers skip macro posts; main thread renders them in a second pass
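Both the pre-expansion pass and the second-pass alternative start by separating posts that contain macros from those that do not. A minimal sketch, in which the marker check is injected because the actual macro syntax is not specified here; the `[[python:` marker in the usage example is purely hypothetical.

```typescript
interface PostSource {
  id: string;
  content: string;
}

// Splits posts into those needing macro handling and those safe to send straight to workers.
function splitByMacroUsage<T extends PostSource>(
  posts: readonly T[],
  hasMacro: (content: string) => boolean,
): { withMacros: T[]; plain: T[] } {
  const withMacros: T[] = [];
  const plain: T[] = [];
  for (const post of posts) {
    if (hasMacro(post.content)) withMacros.push(post);
    else plain.push(post);
  }
  return { withMacros, plain };
}

// Hypothetical marker syntax, for illustration only.
const looksLikePythonMacro = (content: string): boolean => content.includes('[[python:');

const split = splitByMacroUsage(
  [
    { id: 'a', content: 'plain text' },
    { id: 'b', content: 'total: [[python: 1 + 1]]' },
  ],
  looksLikePythonMacro,
);
```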
Key Files to Modify
| File | Change |
|---|---|
| `src/main/engine/generation.worker.ts` | NEW: worker entry point |
| `src/main/engine/GenerationWorkerPool.ts` | NEW: worker pool manager |
| `src/main/engine/BlogGenerationEngine.ts` | Refactor generate() into coordinator |
| `src/main/engine/PageRenderer.ts` | Extract filter registration into factory function |
| `src/main/engine/GenerationRouteRendererFactory.ts` | Make usable from worker context |
| `src/main/ipc/blogHandlers.ts` | Pass DB path + template roots to worker pool |
| `src/main/engine/RoutePageGenerationService.ts` | `generateSinglePostPages` moves to worker |
| `vite.config.ts` / `tsconfig.main.json` | Worker entry point build config |
Data Serialization Requirements
| Data | Size (10k posts) | Strategy |
|---|---|---|
| `BlogGenerationOptions` | ~1KB | Pass as workerData (plain object) |
| `PreloadedGenerationData` | ~2-5MB | Serialize Date→ISO string, pass via workerData |
| Post content (body) | N/A | Workers read from filesystem (lazy) |
| `HtmlRewriteContext` | ~500KB | Pass as Record<string, string> in workerData |
| `generatedHashCache` | ~1MB | Each worker bulk-loads from DB independently |
| Template files | ~50KB | Workers read from filesystem |
| Progress updates | tiny | parentPort.postMessage() |
Risks & Mitigations
| Risk | Mitigation |
|---|---|
| `@libsql/client` native bindings may not work in workers | Spike first (1.1). Fallback: use better-sqlite3 directly in workers. |
| Memory pressure (N copies of post metadata) | Measure in spike. Could use SharedArrayBuffer or reduce per-worker data. |
| Pyodide macros can't run in generation workers | Phase 3 pre-expansion pass. Most posts don't use Python macros. |
| Worker crashes lose progress | Pool manager catches errors, reports partial results, coordinator can retry failed chunks. |
| Template root paths differ in packaged app | Pass process.resourcesPath via workerData. Already has CWD fallback. |
| Build configuration for worker entry point | Add worker to Vite/esbuild config (existing pattern from pythonMacro.worker.ts). |
Success Criteria
- Render Site with 10k posts uses all available CPU cores
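- Wall-clock time for single-post rendering scales roughly linearly with worker count (e.g., 4 cores → 3 workers → ~3x faster on that section); work that stays on the main thread caps the overall gain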
- No regression in output correctness (identical HTML output)
- Progress bar still works smoothly
- Memory usage stays under 2GB total
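To set expectations for the scaling criterion, Amdahl's law gives an upper bound on speedup when only part of the run is parallelized. The sketch below is a back-of-the-envelope estimate; the parallel fractions used in the example are illustrative guesses, not measurements from this project.

```typescript
// Amdahl's law: speedup = 1 / ((1 - p) + p / n)
// p = fraction of the original run time that is parallelized, n = number of workers.
function estimateSpeedup(parallelFraction: number, workers: number): number {
  if (parallelFraction < 0 || parallelFraction > 1) {
    throw new RangeError('parallelFraction must be between 0 and 1');
  }
  if (workers < 1) {
    throw new RangeError('workers must be at least 1');
  }
  return 1 / ((1 - parallelFraction) + parallelFraction / workers);
}

// Illustrative numbers only: if 90% of the run is single-post rendering
// and 3 workers are used, the bound is about 2.5x rather than 3x.
const withSerialPart = estimateSpeedup(0.9, 3);
const fullyParallel = estimateSpeedup(1, 4);
console.log(withSerialPart.toFixed(2), fullyParallel.toFixed(2));
```

Measuring the actual parallel fraction during the spike (1.1) would make this criterion checkable.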