bDS/WORKER_PLAN.md
Georg Bauer 4f9be93c6d Feature/worker threads generation (#43)
* Add worker threads architecture plan for blog generation

* fix: tries to optimize rendering, still slow

* feat: moved site rendering into web worker

* fix: calendar grabs from central data source for calendar

* fix: feeds now use blog language content and not canonical content

---------

Co-authored-by: hugo <hugoms@me.com>
2026-03-09 22:49:25 +01:00


Worker Threads Architecture for Blog Generation

Problem

Cmd-R (Render Site) is slow with 10k+ posts / ~20k pages. The rendering pipeline is CPU-bound (Liquid templates + Markdown parsing). All current Promise.all parallelism just interleaves I/O on a single core. Actual multi-core parallelism via worker_threads is needed.

Current Architecture

blogHandlers.ts (IPC entry)
  → preloadGenerationData() — loads all posts, translations, menu
  → Promise.all([5 section tasks])  ← runs on ONE CPU core
      ├── core     (root pages, sitemap, feeds, assets)
      ├── single   (one page per post — THE bottleneck, ~20k pages)
      ├── category (paginated category index pages)
      ├── tag      (paginated tag index pages)
      └── date     (year/month/day archive pages)

Each section calls BlogGenerationEngine.generate() which:
  1. Builds GenerationPostIndex (tags, categories, date buckets)
  2. Bulk-loads file hashes from DB
  3. Creates route renderer (Liquid + PreviewServer + cached engines)
  4. Renders pages sequentially/batched → writes files if hash changed
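
The "writes files if hash changed" check in step 4 could look roughly like this. It is a minimal sketch, assuming SHA-256 content hashes and an in-memory map keyed by output path; the plan does not say which hash function or cache shape the engine uses, and `needsWrite` is a hypothetical name.

```typescript
import { createHash } from "node:crypto";

// Assumed cache shape: output path -> hash of the HTML last written there.
type HashCache = Map<string, string>;

// Hash function is an assumption; the engine may use a different one.
function contentHash(html: string): string {
  return createHash("sha256").update(html, "utf8").digest("hex");
}

// Returns true when the page must be written, and records the new hash.
// Returns false when the rendered HTML matches what is already on disk.
function needsWrite(cache: HashCache, outputPath: string, html: string): boolean {
  const next = contentHash(html);
  if (cache.get(outputPath) === next) return false;
  cache.set(outputPath, next);
  return true;
}
```

Rendering still costs CPU even when the write is skipped, which is why the hash check alone does not fix the Cmd-R slowdown.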

Shared mutable state across sections:

  • SQLite database (libsql, WAL mode) — singleton connection
  • File system output directory (but sections write to disjoint paths)
  • Template caches (Liquid cache: true) — populated once, read-only after
  • PreloadedGenerationData — read-only after creation

Existing worker_threads usage: Pyodide macro workers (pythonMacro.worker.ts, BlogmarkPythonWorkerRuntime.ts) already use worker_threads successfully.

Target Architecture

MAIN THREAD                              WORKER THREADS (N = cpu_count - 1)
───────────                              ──────────────────────────────────
blogHandlers.ts                          
  preloadGenerationData()               
  serialize + partition posts            
       │                                 
       ├──► Worker 1: posts[0..chunk]    → own DB conn, own Liquid, render + write
       ├──► Worker 2: posts[chunk..2chunk] → own DB conn, own Liquid, render + write
       ├──► Worker 3: posts[2chunk..3chunk] → own DB conn, own Liquid, render + write
       └──► Worker N: posts[...rest]     → own DB conn, own Liquid, render + write
       │
       ├── receive progress messages → TaskManager.emit()
       ├── receive results → merge stats
       └── return merged BlogGenerationResult
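
The fan-out and merge in the diagram can be sketched with Node's `worker_threads` API. This is a sketch only: the inline worker body (run with `eval: true`) stands in for the planned `generation.worker.ts` and merely counts posts, and `runChunk` / `runChunksInWorkers` are hypothetical names.

```typescript
import { Worker } from "node:worker_threads";

interface ChunkStats { pagesWritten: number; pagesSkipped: number }

// Stand-in worker body (CommonJS, evaluated inline). A real worker would
// open its own DB connection and Liquid engine, then render and write pages.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  let pagesWritten = 0;
  for (const slug of workerData.posts) {
    pagesWritten += 1; // placeholder for render + write of one post
    parentPort.postMessage({ type: "progress", value: pagesWritten });
  }
  parentPort.postMessage({ type: "result", stats: { pagesWritten, pagesSkipped: 0 } });
`;

// Runs one chunk in its own worker and resolves with that worker's stats.
function runChunk(posts: string[]): Promise<ChunkStats> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { posts } });
    worker.on("message", (msg: { type: string; stats?: ChunkStats }) => {
      if (msg.type === "result" && msg.stats) resolve(msg.stats);
    });
    worker.on("error", reject);
    // No-op if a result already resolved the promise.
    worker.on("exit", (code) => reject(new Error(`worker exited (code ${code}) without a result`)));
  });
}

// Starts one worker per chunk and merges the per-worker stats.
async function runChunksInWorkers(chunks: string[][]): Promise<ChunkStats> {
  const results = await Promise.all(chunks.map(runChunk));
  return results.reduce(
    (acc, s) => ({
      pagesWritten: acc.pagesWritten + s.pagesWritten,
      pagesSkipped: acc.pagesSkipped + s.pagesSkipped,
    }),
    { pagesWritten: 0, pagesSkipped: 0 },
  );
}
```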

Phased Implementation

Phase 1 — Single Post Worker Pool (highest impact)

Move generateSinglePostPages to a worker pool. Single posts are the bottleneck: they account for nearly all of the ~20k pages. Other sections stay in the main thread.

1.1 Spike: Verify dependencies work in worker_threads

  • Test @libsql/client opens a second connection in a worker thread (WAL mode)
  • Test liquidjs renders a template in a worker thread
  • Measure memory overhead per worker with 10k post metadata

1.2 Create generation.worker.ts

New worker entry point that:

  • Receives via workerData: serialized options, post chunk, template roots, DB path, hash cache
  • Opens its own @libsql/client connection (WAL mode allows concurrent readers alongside a single writer; writes from different workers are serialized, not parallel)
  • Creates its own Liquid instance with cache: true + registers custom filters
  • Creates its own PageRenderer, PreviewServer, route renderer
  • Renders assigned posts → writes HTML files + updates file hashes in DB
  • Sends progress via parentPort.postMessage({ type: 'progress', ... })
  • Sends result via parentPort.postMessage({ type: 'result', stats })
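
The two message kinds could be typed as a discriminated union so the main thread can validate what it receives. This is a sketch: the field names follow the shapes used in this plan (`value`, `message`, `stats`), while `GenerationStats` and `isWorkerMessage` are hypothetical names.

```typescript
interface GenerationStats { pagesWritten: number; pagesSkipped: number }
interface ProgressMessage { type: "progress"; value: number; message: string }
interface ResultMessage { type: "result"; stats: GenerationStats }
type WorkerMessage = ProgressMessage | ResultMessage;

// Type guard for data arriving from a worker: rejects anything that is not
// a well-formed progress or result message.
function isWorkerMessage(raw: unknown): raw is WorkerMessage {
  if (typeof raw !== "object" || raw === null) return false;
  const m = raw as Record<string, unknown>;
  if (m.type === "progress") {
    return typeof m.value === "number" && typeof m.message === "string";
  }
  if (m.type === "result") {
    const stats = m.stats as Record<string, unknown> | null | undefined;
    return (
      typeof stats === "object" &&
      stats !== null &&
      typeof stats.pagesWritten === "number" &&
      typeof stats.pagesSkipped === "number"
    );
  }
  return false;
}
```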

1.3 Serialize PreloadedGenerationData

  • PostData[] contains Date objects → serialize to ISO strings, parse back in worker
  • Post content is lazy-loaded from filesystem → workers read post files directly
  • HtmlRewriteContext maps → pass as plain Record<string, string> (already partially converted)
  • Each worker bulk-loads its own generatedHashCache from DB
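
The Date round trip could be sketched as below. The `PostData` fields shown are a hypothetical subset, since the real type is not listed in this plan.

```typescript
// Hypothetical subset of PostData; the real type has more fields.
interface PostData {
  id: number;
  slug: string;
  title: string;
  publishedAt: Date;
  updatedAt: Date | null;
}

// Wire format: same fields, with dates as ISO 8601 strings.
type SerializedPost = Omit<PostData, "publishedAt" | "updatedAt"> & {
  publishedAt: string;
  updatedAt: string | null;
};

function serializePost(post: PostData): SerializedPost {
  return {
    ...post,
    publishedAt: post.publishedAt.toISOString(),
    updatedAt: post.updatedAt ? post.updatedAt.toISOString() : null,
  };
}

function deserializePost(wire: SerializedPost): PostData {
  return {
    ...wire,
    publishedAt: new Date(wire.publishedAt),
    updatedAt: wire.updatedAt ? new Date(wire.updatedAt) : null,
  };
}
```

Note that workerData is copied with the structured clone algorithm, which preserves Date instances, so the explicit ISO step mainly matters if the payload also passes through JSON; the spike in 1.1 is a good place to confirm which path is used.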

1.4 Extract PageRenderer factory for workers

  • Extract filter registration (markdown, i18n) into a shared createPageRenderer(config) function
  • Workers call this with their own DB-backed engines
  • Keep macroTemplateCache and macroLiquid as worker-local singletons (they self-populate)

1.5 Create GenerationWorkerPool

New class that:

  • Spawns N workers (os.cpus().length - 1, configurable, min 1)
  • Distributes post chunks to workers (round-robin or equal split)
  • Collects progress messages → relays to onProgress callback
  • Collects results → merges stats (pagesWritten, pagesSkipped)
  • Handles worker errors/crashes: catches them, reports partial results, and lets the coordinator retry failed chunks
  • Tears down workers when generation completes
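
Worker sizing and the equal-split option could be sketched as follows. `resolveWorkerCount` and `partitionEqual` are hypothetical names; the CPU count is taken as a parameter (the caller would pass `os.cpus().length`) to keep the functions pure.

```typescript
// Default is cpuCount - 1, overridable by configuration, never below 1.
function resolveWorkerCount(cpuCount: number, configured?: number): number {
  const wanted = configured ?? cpuCount - 1;
  return Math.max(1, Math.floor(wanted));
}

// Equal split: contiguous chunks whose sizes differ by at most one.
// Never returns more chunks than there are items.
function partitionEqual<T>(items: readonly T[], workers: number): T[][] {
  if (items.length === 0) return [];
  const count = Math.max(1, Math.min(workers, items.length));
  const base = Math.floor(items.length / count);
  const extra = items.length % count; // the first `extra` chunks get one more item
  const chunks: T[][] = [];
  let start = 0;
  for (let i = 0; i < count; i++) {
    const size = base + (i < extra ? 1 : 0);
    chunks.push(items.slice(start, start + size));
    start += size;
  }
  return chunks;
}
```

An equal split assumes posts cost about the same to render. If a few posts are much heavier than the rest, smaller chunks handed out on demand would balance better, at the cost of more messages.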

1.6 Refactor BlogGenerationEngine.generate() coordinator

  • Split into coordinator (main thread) + worker payload
  • Coordinator: loads data, partitions posts, manages worker pool, merges results
  • Multi-language subtree loop: each language pass creates a new set of worker tasks
  • Non-single sections (core, category, tag, date) remain in main thread

1.7 Progress reporting

  • Workers: parentPort.postMessage({ type: 'progress', value, message })
  • Main thread: listen on each worker, relay to TaskManager.emit() → IPC → renderer
  • Aggregate progress across all workers for accurate progress bar
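
Aggregation across workers could look like this. It is a sketch that assumes each worker's `value` is the number of pages it has finished so far; `createProgressAggregator` is a hypothetical helper, and its callback would be wired to TaskManager.emit().

```typescript
interface WorkerProgress { done: number; total: number }

// totals[i] is the number of pages assigned to worker i.
// Returns a function to call for every progress message from a worker.
function createProgressAggregator(
  totals: number[],
  onProgress: (fraction: number) => void,
): (workerIndex: number, done: number) => void {
  const state: WorkerProgress[] = totals.map((total) => ({ done: 0, total }));
  const grandTotal = totals.reduce((a, b) => a + b, 0);
  return (workerIndex, done) => {
    const slot = state[workerIndex];
    if (!slot) return; // unknown worker: ignore
    // Keep per-worker progress monotonic and clamped to its total.
    slot.done = Math.min(Math.max(done, slot.done), slot.total);
    const finished = state.reduce((sum, s) => sum + s.done, 0);
    onProgress(grandTotal === 0 ? 1 : finished / grandTotal);
  };
}
```

Weighting by page count rather than averaging per-worker percentages keeps the bar accurate when chunks differ in size.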

1.8 Testing

  • Unit tests: mock worker pool, test coordinator logic
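  • Integration test: spawn real workers against a temporary on-disk SQLite file + template files (a plain in-memory database is private to the connection that opened it, so workers opening their own connections would not see the test data)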
  • Verify existing BlogGenerationEngine.test.ts tests still pass (they mock at engine boundary)

Phase 2 — All Sections in Workers

Move category/tag/date sections to workers too. Each section gets one worker.

  • Category pages → one worker
  • Tag pages → one worker
  • Date archive pages → one worker
  • Core pages stay in main thread (sitemap/feeds/assets are one-time + small)

Phase 3 — Python Macro Handling

Handle posts with Python macros across worker boundaries.

Recommended approach: Pre-expansion pass

  1. Before distributing posts to workers, scan for Python macro markers
  2. Expand macros in main thread (Pyodide is already in a worker — reuse existing PythonMacroWorkerRuntime)
  3. Cache expanded content
  4. Pass pre-expanded content to generation workers
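
The pre-expansion pass could be sketched as below. The macro marker pattern is purely hypothetical because this plan does not specify the macro syntax, and `expand` stands in for a call into the existing Pyodide runtime.

```typescript
// HYPOTHETICAL marker; replace with the real Python macro syntax.
const PYTHON_MACRO_PATTERN = /\{%\s*python\b/;

interface PostBody { slug: string; body: string }

// Scans posts for macro markers and expands only those that contain one.
// Returns slug -> expanded body; posts without macros are not included.
async function preExpandMacros(
  posts: PostBody[],
  expand: (body: string) => Promise<string>,
): Promise<Map<string, string>> {
  const expanded = new Map<string, string>();
  for (const post of posts) {
    if (!PYTHON_MACRO_PATTERN.test(post.body)) continue;
    // Sequential on purpose: assumes a single Pyodide worker serves all calls.
    expanded.set(post.slug, await expand(post.body));
  }
  return expanded;
}
```

Generation workers would then look up a post's slug in the returned map and fall back to the raw body when there is no entry.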

Alternative approaches (if pre-expansion is too slow):

  • Workers send macro calls back to main thread via messages (RPC pattern)
  • Workers skip macro posts; main thread renders them in a second pass

Key Files to Modify

| File | Change |
|---|---|
| src/main/engine/generation.worker.ts | NEW: worker entry point |
| src/main/engine/GenerationWorkerPool.ts | NEW: worker pool manager |
| src/main/engine/BlogGenerationEngine.ts | Refactor generate() into coordinator |
| src/main/engine/PageRenderer.ts | Extract filter registration into factory function |
| src/main/engine/GenerationRouteRendererFactory.ts | Make usable from worker context |
| src/main/ipc/blogHandlers.ts | Pass DB path + template roots to worker pool |
| src/main/engine/RoutePageGenerationService.ts | generateSinglePostPages moves to worker |
| vite.config.ts / tsconfig.main.json | Worker entry point build config |

Data Serialization Requirements

| Data | Size (10k posts) | Strategy |
|---|---|---|
| BlogGenerationOptions | ~1 KB | Pass as workerData (plain object) |
| PreloadedGenerationData | ~2-5 MB | Serialize Date→ISO string, pass via workerData |
| Post content (body) | N/A | Workers read from filesystem (lazy) |
| HtmlRewriteContext | ~500 KB | Pass as Record<string, string> in workerData |
| generatedHashCache | ~1 MB | Each worker bulk-loads from DB independently |
| Template files | ~50 KB | Workers read from filesystem |
| Progress updates | tiny | parentPort.postMessage() |

Risks & Mitigations

| Risk | Mitigation |
|---|---|
| @libsql/client native bindings may not work in workers | Spike first (1.1). Fallback: use better-sqlite3 directly in workers. |
| Memory pressure (N copies of post metadata) | Measure in spike. Could use SharedArrayBuffer or reduce per-worker data. |
| Pyodide macros can't run in generation workers | Phase 3 pre-expansion pass. Most posts don't use Python macros. |
| Worker crashes lose progress | Pool manager catches errors, reports partial results, coordinator can retry failed chunks. |
| Template root paths differ in packaged app | Pass process.resourcesPath via workerData. Already has CWD fallback. |
| Build configuration for worker entry point | Add worker to Vite/esbuild config (existing pattern from pythonMacro.worker.ts). |
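
The "retry failed chunks" mitigation could be sketched as follows. `runWithRetry` is a hypothetical helper, and the default of two attempts is an assumption rather than a value from this plan.

```typescript
// Runs every chunk through `run`, retrying a failed chunk up to maxAttempts
// times. Chunks that never succeed are returned in `failed` so the caller
// can report partial results instead of losing the whole run.
async function runWithRetry<T, R>(
  chunks: T[],
  run: (chunk: T) => Promise<R>,
  maxAttempts = 2,
): Promise<{ results: R[]; failed: T[] }> {
  const results: R[] = [];
  const failed: T[] = [];
  await Promise.all(
    chunks.map(async (chunk) => {
      for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
          results.push(await run(chunk));
          return;
        } catch {
          // Fall through to the next attempt (a fresh worker in practice).
        }
      }
      failed.push(chunk);
    }),
  );
  return { results, failed };
}
```

Because workers skip pages whose hash is unchanged, a retried chunk should mostly re-render rather than re-write, assuming the crashed worker had already persisted its hash updates.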

Success Criteria

  • Render Site with 10k posts uses all available CPU cores
  • Wall-clock time for the parallelized sections scales roughly linearly with worker count (e.g., 3 workers on a 4-core machine → up to ~3x faster, since N = cpu_count - 1)
  • No regression in output correctness (identical HTML output)
  • Progress bar still works smoothly
  • Memory usage stays under 2GB total