# Python Scripting Integration Plan (Electron + Pyodide)

## 1. Goal and Scope

Primary goal: all render-time macros run in Python, with predictable performance and safe sandboxing.

Secondary goals:

- User-editable Python scripts with project persistence (`scripts/` folder + DB index).
- Python scripting reuse in bookmarklet/post-processing pipelines.
- Keep architecture consistent with existing `main/engine` + `ipc` + `renderer` boundaries.

This document defines a staged path from MVP to full scope.

---

## 2. Viability Summary (Realistic Expectations)

### Is this realistic?

Yes, if we optimize for **low bridge overhead** and **stable execution contracts**.

Key reality checks:

- Pyodide in a worker is viable for user scripting and sandboxing.
- Macro execution in render loops can be fast enough if we avoid frequent JS↔Python conversions.
- `< 1ms` per macro call is possible only for simple macros with precompiled code and minimal marshaling; it must be treated as a benchmark target, not a guarantee.
- For heavy loops, the work should stay inside Python once called (coarse-grained calls), not bounce per item between JS and Python.

Decision:

- Keep Pyodide as the default engine.
- Design a strict, minimal ABI (Application Binary Interface-like contract) for macro inputs/outputs.
- Use preloading, precompilation, and caching before adding advanced optimizations.

---

## 3. Architecture Fit for bDS

### 3.1 Layering

Keep existing project boundaries:

- `src/main/engine`: script metadata, script storage/indexing, render orchestration, benchmark/logging.
- `src/main/ipc`: typed handlers for script CRUD, run, and diagnostics.
- `src/renderer`: script editor UI, run controls, output panel integration.
- Python runtime (Pyodide) stays in renderer-side worker context; main process never executes untrusted script code.

### 3.2 Runtime Placement

- Host a long-lived `PythonRuntimeWorker` in renderer.
- Initialize Pyodide once per app session (or lazily on first script run).
- Maintain an in-memory registry of loaded scripts and compiled callables.

### 3.3 Macro Execution Contract (Performance-Critical)

Use one narrow contract for all macros:

Python side:

```python
def render(context: dict) -> dict:
    ...
```

Contract rules:

- Input `context` is plain JSON-compatible data only.
- Output is plain JSON-compatible data only.
- No Node/Electron direct access from Python.
- No per-token/per-node callbacks into JS while rendering.

### 3.4 Bridge Strategy (Keep Conversions Simple)

- Preferred: pass compact JSON payloads (single call in, single result out).
- Avoid dynamic proxy-style JS objects in hot paths.
- Avoid `toPy()/toJs()` inside tight loops.
- Use `pyodide.globals` only for stable utility bindings set once during worker startup.

### 3.5 Security Model

- Scripts execute only in worker.
- Hard timeout + termination + runtime restart on runaway scripts.
- Allowlist API surface exposed to Python (pure functions where possible).
- Validate and sanitize all script outputs in JS before applying to render pipeline.

---

## 4. Staged Delivery Plan

## Phase 0 — Technical Spike (timeboxed)

Objective: prove runtime viability before product surface growth.

Deliverables:

- [ ] Add `pyodide` dependency and worker boot sequence.
- [ ] Run a sample script end-to-end (`run_script`, timeout, captured stdout).
- [ ] Benchmark baseline cold start + warm run + repeated macro calls.
- [ ] Define initial macro ABI (`render(context) -> result`) and schema docs.

Exit criteria:

- Warm script execution is stable.
- Timeout recovery works.
- Measured baseline captured in repo docs.

## Phase 1 — MVP (Minimal but Usable)

Objective: user can create/run scripts and see output.

Deliverables:

- [ ] Script storage model (DB index + filesystem source in `scripts/*.py`).
- [ ] CRUD APIs in `main/engine` + `ipc` handlers.
- [ ] Renderer scripts list + editor + run button.
- [ ] Console/output capture in existing bottom output area.
- [ ] Project rebuild picks up `scripts/` changes.

Out of scope for MVP:

- Macro replacement.
- Bookmarklet integration.
- AI assistant tool access from Python.

Exit criteria:

- Scripts can be created, persisted, run, and debugged.
- Script files round-trip correctly with filesystem.

## Phase 2 — Macro Runtime Foundation

Objective: integrate Python macros into renderer loop with low overhead.

Deliverables:

- [ ] Add script type/metadata (`kind: macro | utility | transform`).
- [ ] Resolve macro references from content to script IDs.
- [ ] Implement macro runtime cache: module load once, callable reuse.
- [ ] Convert existing macro parameter parsing into typed context object once per macro invocation.
- [ ] Add perf counters (call count, p50/p95 runtime, timeout count).

Exit criteria:

- Python macro path is feature-equivalent for at least 1–2 existing macros.
- Measured overhead acceptable against baseline.

## Phase 3 — Macro Migration (Full Goal)

Objective: all current built-in macros are Python-backed.

Deliverables:

- [ ] Port each existing macro implementation to Python scripts.
- [ ] Keep default macro scripts versioned in repo and bundled with app.
- [ ] On startup/project init, seed missing default macro scripts into filesystem + DB.
- [ ] Add script-as-macro assignment in metadata and editor UX.
- [ ] Keep parameter typing rules explicit (`"123"` quoted string stays string; unquoted numerics map to int/float).

Exit criteria:

- All built-in macros execute via Python runtime.
- Legacy JS macro path is removed after parity confirmation.

## Phase 4 — Performance Hardening

Objective: reach production-grade speed and stability for render loops.

Deliverables:

- [ ] Precompile/load scripts once per worker lifecycle.
- [ ] Batch render APIs where beneficial (`render_many(contexts)`).
- [ ] Reduce marshaling size (compact context shape, no redundant fields).
- [ ] Optional SharedArrayBuffer experiments only if measured need justifies added complexity.
- [ ] Failure isolation and automatic runtime reset strategy.

Exit criteria:

- Stable long-run benchmarks in CI/manual perf suite.
- No UI thread stalls during heavy generation.

## Phase 5 — Bookmarklet/Post Transform Integration

Objective: reuse Python runtime for post-ingest transformations.

Deliverables:

- [ ] Hook script transforms into bookmarklet pipeline after data sanitization.
- [ ] Input: validated post object; output: transformed validated post object.
- [ ] Add transform-specific script type and error handling/reporting.

Exit criteria:

- Transform scripts can safely modify incoming post content.
- Fallback behavior exists when transform fails.

## Phase 6 — Advanced Capabilities (Optional)

Objective: add power-user features only after core stability.

Candidates:

- [ ] Python-accessible app tools (strict allowlist).
- [ ] AI assistant tooling from Python scripts.
- [ ] Script package/dependency policy for curated modules.

---

## 5. Data and Storage Design

- Source of truth for scripts follows existing pattern: filesystem + DB index.
- Files: `scripts/.py`.
- Metadata can be stored in:
  - DB columns (preferred for indexing/query), and/or
  - leading Python block comment for file portability.
- Rebuild/meta-diff must include `scripts/` exactly like posts/media flow.

Recommended script metadata:

- `id`, `slug`, `title`, `kind`, `entrypoint`, `enabled`, `version`, `updatedAt`.

---

## 6. Performance Plan (Macro-Critical)

Principles:

- Coarse-grained calls: one macro invocation should do meaningful work in Python.
- Stable ABI: small, predictable context payload.
- Warm runtime reuse: no repeated Pyodide boot.
- Compile/load once, execute many.

Initial target envelope (to validate in Phase 0/2):

- Warm invocation overhead target: low single-digit milliseconds for typical macros.
- p95 render stability target under large generation batches.
- Timeout and memory guardrails for pathological scripts.
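
The p50/p95 figures in this envelope could be gathered with a small per-macro timing collector. The following is a minimal sketch under assumptions, not part of the plan itself: `MacroTimings`, `timed_call`, and `percentile` are hypothetical names, and nearest-rank is only one of several reasonable percentile definitions.

```python
import math
import time
from collections import defaultdict


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (pct in 0..100)."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    # Nearest-rank: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


class MacroTimings:
    """Hypothetical collector: call durations in milliseconds, keyed by macro name."""

    def __init__(self):
        self._samples = defaultdict(list)

    def record(self, macro_name, duration_ms):
        self._samples[macro_name].append(duration_ms)

    def timed_call(self, macro_name, render, context):
        """Call render(context) once and record its wall-clock duration."""
        start = time.perf_counter()
        result = render(context)
        self.record(macro_name, (time.perf_counter() - start) * 1000.0)
        return result

    def summary(self, macro_name):
        samples = self._samples[macro_name]
        return {
            "count": len(samples),
            "p50_ms": percentile(samples, 50),
            "p95_ms": percentile(samples, 95),
        }
```

Keeping samples per macro name makes it straightforward to report separate numbers for simple, medium, and heavy macros rather than one blended figure.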
Note: The previous strict `<1ms` universal target is replaced by benchmark tiers by macro class (simple/medium/heavy), which is more realistic.

---

## 7. Security and Reliability

- No direct filesystem/network/process APIs in Python runtime.
- Worker watchdog timeout and hard-kill policy.
- Structured errors returned to UI and logs.
- Script output validation before use in rendering.
- Versioned default scripts to ensure deterministic behavior across app updates.

---

## 8. Testing and Rollout Strategy

- Unit tests for engine-level script registry, metadata, and macro resolution.
- Integration tests for worker protocol and timeout recovery.
- Golden tests to compare macro output parity before/after migration.
- Performance regression checks for macro hot paths.
- Feature flag for staged rollout before removing legacy macro path.

---

## 9. Coding Agent Execution Pack

This section makes the plan directly executable by coding agents.

### 9.1 Working Rules for Agents

- Work one phase at a time; do not start the next phase before exit criteria pass.
- Keep changes layered by architecture boundary (`main/engine`, `main/ipc`, `renderer`).
- For each task: write/adjust tests first where feasible, then implement minimal code.
- Keep runtime contract stable once introduced; changes require updating ABI docs and tests.
- Do not add broad API exposure from JS/Electron into Python; only allowlisted calls.

### 9.2 Definition of Done (Per Phase)

Each phase is done only if all are true:

- [ ] Deliverables implemented.
- [ ] Exit criteria verified.
- [ ] Relevant tests pass.
- [ ] Full test suite passes (`npm test`).
- [ ] Full build passes (`npm run build`).
- [ ] Plan document updated with decisions/benchmarks where applicable.

### 9.3 Task Card Template (Use for Every Agent Task)

```md
Task:
Scope:
Files expected to change:
Out of scope:
Acceptance checks:
Commands to run:
Notes/Risks:
```

### 9.4 Phase-by-Phase Agent Backlog (Suggested)

#### Phase 0 backlog

1.
   Runtime bootstrap spike
   - Scope: add Pyodide dependency and worker startup path only.
   - Files likely: `package.json`, new worker file under `src/renderer/`.
   - Acceptance: worker initializes once, reports ready state.
2. Safe execute protocol
   - Scope: request/response protocol (`run`, `stdout`, `error`, `timeout`).
   - Files likely: renderer runtime manager + worker + related types.
   - Acceptance: sample script run succeeds; timeout kills and recovers runtime.
3. Baseline benchmark harness
   - Scope: cold start, warm run, repeated macro invoke metrics.
   - Files likely: engine/diagnostic service or dedicated benchmark utility + docs.
   - Acceptance: numbers recorded in this document or linked benchmark doc.
4. ABI v1 spec
   - Scope: formal JSON schema for macro `context` and `result`.
   - Files likely: shared type definitions + docs.
   - Acceptance: schema used by both caller and worker-side validator.

#### Phase 1 backlog

1. Script persistence model
   - Scope: DB + filesystem mapping for `scripts/*.py`.
   - Acceptance: create/update/delete round-trips both stores.
2. Main engine + IPC CRUD
   - Scope: add script engine methods and typed IPC handlers.
   - Acceptance: renderer can list/read/write scripts through IPC only.
3. Renderer MVP UI
   - Scope: scripts list, editor panel, run button, output panel integration.
   - Acceptance: user edits script, runs it, sees stdout/errors.
4. Rebuild/meta-diff integration
   - Scope: include scripts in existing rebuild and metadata diff flow.
   - Acceptance: external file changes in `scripts/` are detected and synchronized.

#### Phase 2 backlog

1. Macro script typing + mapping
   - Scope: `kind` metadata and mapping from macro token to script id.
   - Acceptance: at least one macro resolved to Python script.
2. Runtime cache path
   - Scope: load/compile once; callable reuse.
   - Acceptance: repeated macro invocations avoid re-init/re-import.
3. Context adapter
   - Scope: convert existing macro params into ABI v1 `context` once per invocation.
   - Acceptance: typed values obey conversion rules.
4. Perf counters
   - Scope: call count, p50/p95, timeout/error counts.
   - Acceptance: counters visible in logs/diagnostics.

#### Phase 3 backlog

1. Built-in macro parity migration
   - Scope: port each macro to Python scripts and add parity tests.
   - Acceptance: output parity with legacy macros for baseline fixtures.
2. Default script seeding/versioning
   - Scope: bundle defaults, seed missing scripts on init.
   - Acceptance: clean project bootstraps required macro scripts automatically.
3. Legacy path removal
   - Scope: remove JS macro implementations after parity gate.
   - Acceptance: tests pass with Python-only macro path.

#### Phase 4–6 backlog

- Keep as optimization/integration tracks only after parity and stability gates pass.

### 9.5 Anti-Patterns for Agents (Do Not Do)

- Do not call JS functions per token/item from Python in hot paths.
- Do not pass large proxy objects through the bridge in render loops.
- Do not introduce direct filesystem/network access in Python runtime.
- Do not couple UI/editor work with macro migration in one PR-sized change.
- Do not remove legacy macro code before golden parity tests pass.

### 9.6 Handoff Checklist (Agent to Agent)

Every handoff should include:

- Completed task cards and remaining task cards.
- Files changed and rationale.
- Test/build command outputs summary.
- Known risks and benchmark deltas.
- Any ABI changes (must be explicit).

### 9.7 Suggested PR Boundaries (One Task, One PR)

Use small PRs with one primary purpose each.

PR-00: Pyodide bootstrap spike
- Includes: dependency, worker init, ready signal.
- Excludes: script persistence, UI/editor.
- Merge gate: runtime initializes and tests/build pass.

PR-01: Worker run protocol + timeout recovery
- Includes: run/stdout/error/timeout messaging, watchdog + restart behavior.
- Excludes: macro integration.
- Merge gate: timeout test and recovery test pass.

PR-02: ABI v1 types + schema validation
- Includes: shared types and validation for `context/result`.
- Excludes: macro migration.
- Merge gate: caller and worker both use ABI validators.

PR-03: Script persistence model
- Includes: DB + filesystem model for `scripts/*.py`.
- Excludes: renderer UI.
- Merge gate: round-trip persistence tests pass.

PR-04: Script engine + IPC CRUD
- Includes: `main/engine` methods and typed `ipc` handlers.
- Excludes: macro runtime.
- Merge gate: IPC integration tests pass.

PR-05: Renderer MVP scripts UI
- Includes: scripts list/editor/run/output integration.
- Excludes: macro substitution.
- Merge gate: end-to-end manual run path works + tests/build pass.

PR-06: Rebuild/meta-diff integration
- Includes: include `scripts/` in rebuild and metadata diff paths.
- Excludes: macro migration.
- Merge gate: external script file changes are detected and synchronized.

PR-07: Macro mapping + runtime cache foundation
- Includes: macro-to-script mapping, callable cache, first Python-backed macro.
- Excludes: full macro parity.
- Merge gate: at least one macro parity fixture passes.

PR-08: Macro parity migration batch A
- Includes: port a small set of built-in macros (e.g., 2–3) + golden tests.
- Excludes: removal of legacy path.
- Merge gate: parity fixtures pass for migrated macros.

PR-09: Macro parity migration batch B (repeat as needed)
- Includes: additional macro ports + fixtures.
- Excludes: removal of legacy path.
- Merge gate: all targeted macro parity tests pass.

PR-10: Default script seeding/versioning
- Includes: bundled default scripts + startup seeding behavior.
- Excludes: advanced scripting APIs.
- Merge gate: clean project gets default scripts deterministically.

PR-11: Legacy JS macro path removal
- Includes: delete legacy macro implementations after full parity.
- Excludes: bookmarklet transforms.
- Merge gate: full test suite and render parity suite pass.

PR-12: Performance hardening
- Includes: benchmark harness refinements, caching improvements, optional batch APIs.
- Excludes: unrelated UI changes.
- Merge gate: regression thresholds (p50/p95) stay within agreed envelope.

PR-13: Bookmarklet transform integration
- Includes: transform script type, pipeline hook, validation/fallback.
- Excludes: optional advanced tool APIs.
- Merge gate: sanitized input/output transform tests pass.

PR-14+: Optional advanced capabilities
- Includes: allowlisted app tools, AI-assistant script tools, curated package policy.
- Merge gate: explicit security review and feature-flag rollout.

---

## 10. Current Status

Status: Revised staged plan (MVP-first, full-scope preserved).

Recommended next action:

1. Approve Phase 0 scope and benchmarks.
2. Implement spike and record numbers.
3. Lock ABI before building full UI and migration layers.
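
Because locking the ABI gates the later phases, it may help to see what a worker-side check of the `render(context) -> dict` contract from section 3.3 could look like. This is only a sketch under assumptions: `is_json_compatible` and `call_macro` are hypothetical helpers, and rejecting non-finite floats is a choice made here (standard JSON cannot represent them), not something the plan specifies.

```python
import math


def is_json_compatible(value):
    """Return True if value consists only of plain JSON-representable data."""
    if value is None or isinstance(value, (bool, int, str)):
        return True
    if isinstance(value, float):
        # Assumption: NaN/Infinity are rejected because JSON cannot represent them.
        return math.isfinite(value)
    if isinstance(value, list):
        return all(is_json_compatible(item) for item in value)
    if isinstance(value, dict):
        # JSON object keys must be strings.
        return all(
            isinstance(key, str) and is_json_compatible(item)
            for key, item in value.items()
        )
    return False


def call_macro(render, context):
    """Hypothetical wrapper enforcing the contract around one macro call."""
    if not isinstance(context, dict) or not is_json_compatible(context):
        raise TypeError("context must be a JSON-compatible dict")
    result = render(context)
    if not isinstance(result, dict) or not is_json_compatible(result):
        raise TypeError("macro result must be a JSON-compatible dict")
    return result


def render(context: dict) -> dict:
    # Example macro: wraps a title in a heading tag.
    return {"html": "<h2>" + context["title"] + "</h2>"}
```

With these definitions, `call_macro(render, {"title": "Hello"})` returns `{"html": "<h2>Hello</h2>"}`, while a macro that returns a list or a set-valued field raises `TypeError` before anything reaches the render pipeline.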