Python Scripting Integration Plan (Electron + Pyodide)
1. Goal and Scope
Primary goal: all render-time macros run in Python, with predictable performance and safe sandboxing.
Secondary goals:
- User-editable Python scripts with project persistence (scripts/ folder + DB index).
- Python scripting reuse in bookmarklet/post-processing pipelines.
- Keep architecture consistent with existing main/engine + ipc + renderer boundaries.
This document defines a staged path from MVP to full scope.
2. Viability Summary (Realistic Expectations)
Is this realistic?
Yes, if we optimize for low bridge overhead and stable execution contracts.
Key reality checks:
- Pyodide in a worker is viable for user scripting and sandboxing.
- Macro execution in render loops can be fast enough if we avoid frequent JS↔Python conversions.
- < 1 ms per macro call is possible only for simple macros with precompiled code and minimal marshaling; it must be treated as a benchmark target, not a guarantee.
- For heavy loops, the work should stay inside Python once called (coarse-grained calls), not bounce per item between JS and Python.
Decision:
- Keep Pyodide as the default engine.
- Design a strict, minimal ABI (Application Binary Interface-like contract) for macro inputs/outputs.
- Use preloading, precompilation, and caching before adding advanced optimizations.
3. Architecture Fit for bDS
3.1 Layering
Keep existing project boundaries:
- src/main/engine: script metadata, script storage/indexing, render orchestration, benchmark/logging.
- src/main/ipc: typed handlers for script CRUD, run, and diagnostics.
- src/renderer: script editor UI, run controls, output panel integration.
- Python runtime (Pyodide) stays in renderer-side worker context; main process never executes untrusted script code.
3.2 Runtime Placement
- Host a long-lived PythonRuntimeWorker in renderer.
- Initialize Pyodide once per app session (or lazily on first script run).
- Maintain an in-memory registry of loaded scripts and compiled callables.
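The registry of compiled callables can be sketched on the Python side roughly as follows. This is a minimal illustration, not the project's implementation: the names `_REGISTRY` and `load_callable` are hypothetical, and keying the cache on a source hash is an assumption about how edits would be detected.

```python
import hashlib
from typing import Callable, Dict, Tuple

# Hypothetical registry: script_id -> (source hash, compiled entrypoint).
_REGISTRY: Dict[str, Tuple[str, Callable[[dict], dict]]] = {}


def load_callable(script_id: str, source: str,
                  entrypoint: str = "render") -> Callable[[dict], dict]:
    """Return the cached entrypoint, recompiling only when the source changes."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cached = _REGISTRY.get(script_id)
    if cached is not None and cached[0] == digest:
        return cached[1]  # warm path: no compile, no exec

    namespace: dict = {"__name__": f"script_{script_id}"}
    code = compile(source, f"<script:{script_id}>", "exec")
    exec(code, namespace)  # runs the script's top level exactly once
    fn = namespace.get(entrypoint)
    if not callable(fn):
        raise ValueError(f"script {script_id!r} does not define {entrypoint}()")
    _REGISTRY[script_id] = (digest, fn)
    return fn
```

Repeated calls with unchanged source return the same function object, so the render loop pays the compile cost once per script version.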
3.3 Macro Execution Contract (Performance-Critical)
Use one narrow contract for all macros:
Python side:
def render(context: dict) -> dict:
    ...
Contract rules:
- Input context is plain JSON-compatible data only.
- Output is plain JSON-compatible data only.
- No Node/Electron direct access from Python.
- No per-token/per-node callbacks into JS while rendering.
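As an illustration of the contract rules above, here is a small macro plus a round-trip check for "plain JSON-compatible data". The field names `url`, `label`, and `html`, and the helper `is_json_compatible`, are assumptions made for this sketch; the actual ABI fields are still to be defined in Phase 0.

```python
import html
import json


def render(context: dict) -> dict:
    """Example macro: build a link from plain JSON-compatible input."""
    url = str(context.get("url", ""))
    label = str(context.get("label") or url)
    # Escape both values so the macro never emits raw user input as markup.
    return {"html": f'<a href="{html.escape(url, quote=True)}">{html.escape(label)}</a>'}


def is_json_compatible(value: object) -> bool:
    """True if value survives a JSON round trip unchanged."""
    try:
        return json.loads(json.dumps(value)) == value
    except (TypeError, ValueError):
        return False
```

The round-trip check rejects sets, tuples, and non-string dict keys, which are exactly the values that would need proxying or lossy conversion at the bridge.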
3.4 Bridge Strategy (Keep Conversions Simple)
- Preferred: pass compact JSON payloads (single call in, single result out).
- Avoid dynamic proxy-style JS objects in hot paths.
- Avoid toPy()/toJs() inside tight loops.
- Use pyodide.globals only for stable utility bindings set once during worker startup.
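One way to keep the bridge to "single call in, single result out" is a string-in, string-out entry point on the Python side, so that only two strings cross the boundary per macro call. The name `invoke_json` is hypothetical, and this is a sketch under that assumption rather than the chosen design.

```python
import json
from typing import Callable


def invoke_json(fn: Callable[[dict], dict], payload: str) -> str:
    """Decode once, call once, encode once; no proxy objects cross the bridge."""
    context = json.loads(payload)
    if not isinstance(context, dict):
        raise TypeError("context must be a JSON object")
    result = fn(context)
    # Compact separators keep the returned payload small.
    return json.dumps(result, separators=(",", ":"))
```

The JS caller would serialize the context once, call this function, and parse the returned string once.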
3.5 Security Model
- Scripts execute only in worker.
- Hard timeout + termination + runtime restart on runaway scripts.
- Allowlist API surface exposed to Python (pure functions where possible).
- Validate and sanitize all script outputs in JS before applying to render pipeline.
4. Staged Delivery Plan
Phase 0 — Technical Spike (timeboxed)
Objective: prove runtime viability before product surface growth.
Deliverables:
- Add pyodide dependency and worker boot sequence.
- Run a sample script end-to-end (run_script, timeout, captured stdout).
- Benchmark baseline cold start + warm run + repeated macro calls.
- Define initial macro ABI (render(context) -> result) and schema docs.
Exit criteria:
- Warm script execution is stable.
- Timeout recovery works.
- Measured baseline captured in repo docs.
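The Python half of the sample run in the deliverables above (captured stdout plus a structured error) might look like the sketch below. The envelope fields `ok`, `stdout`, and `error` are assumptions. Timeouts are deliberately absent: a runaway loop cannot be interrupted from inside the interpreter, so the watchdog has to terminate the worker from the JS side.

```python
import contextlib
import io
import traceback


def run_script(source: str) -> dict:
    """Execute source and return a JSON-compatible result envelope."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # compile() sits inside the try so syntax errors are reported too.
            exec(compile(source, "<user-script>", "exec"), {"__name__": "__main__"})
    except Exception as exc:  # user code may raise anything
        return {
            "ok": False,
            "stdout": buffer.getvalue(),  # output printed before the failure
            "error": {
                "type": type(exc).__name__,
                "message": str(exc),
                "traceback": traceback.format_exc(),
            },
        }
    return {"ok": True, "stdout": buffer.getvalue(), "error": None}
```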
Phase 1 — MVP (Minimal but Usable)
Objective: user can create/run scripts and see output.
Deliverables:
- Script storage model (DB index + filesystem source in scripts/*.py).
- CRUD APIs in main/engine + ipc handlers.
- Renderer scripts list + editor + run button.
- Console/output capture in existing bottom output area.
- Project rebuild picks up scripts/ changes.
Out of scope for MVP:
- Macro replacement.
- Bookmarklet integration.
- AI assistant tool access from Python.
Exit criteria:
- Scripts can be created, persisted, run, and debugged.
- Script files round-trip correctly with filesystem.
Phase 2 — Macro Runtime Foundation
Objective: integrate Python macros into renderer loop with low overhead.
Deliverables:
- Add script type/metadata (kind: macro | utility | transform).
- Resolve macro references from content to script IDs.
- Implement macro runtime cache: module load once, callable reuse.
- Convert existing macro parameter parsing into typed context object once per macro invocation.
- Add perf counters (call count, p50/p95 runtime, timeout count).
Exit criteria:
- Python macro path is feature-equivalent for at least 1–2 existing macros.
- Measured overhead acceptable against baseline.
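The perf counters named in the Phase 2 deliverables (call count, p50/p95 runtime, timeout count) reduce to a small amount of bookkeeping. The sketch below shows it in Python for illustration; the class name `PerfCounters` is hypothetical, the nearest-rank percentile is one of several reasonable definitions, and the real counters may well live on the JS side.

```python
import math
import time
from typing import Callable, List


class PerfCounters:
    """Collects per-call timings for one macro."""

    def __init__(self) -> None:
        self.samples_ms: List[float] = []
        self.timeout_count = 0  # incremented by the watchdog, not by timed()

    @property
    def call_count(self) -> int:
        return len(self.samples_ms)

    def record(self, elapsed_ms: float) -> None:
        self.samples_ms.append(elapsed_ms)

    def timed(self, fn: Callable[[dict], dict], context: dict) -> dict:
        """Call fn(context) and record its wall-clock duration, even on error."""
        start = time.perf_counter()
        try:
            return fn(context)
        finally:
            self.record((time.perf_counter() - start) * 1000.0)

    def percentile(self, p: float) -> float:
        """Nearest-rank percentile for p in (0, 100]."""
        if not self.samples_ms:
            raise ValueError("no samples recorded")
        ordered = sorted(self.samples_ms)
        rank = max(1, math.ceil(p / 100.0 * len(ordered)))
        return ordered[rank - 1]
```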
Phase 3 — Macro Migration (Full Goal)
Objective: all current built-in macros are Python-backed.
Deliverables:
- Port each existing macro implementation to Python scripts.
- Keep default macro scripts versioned in repo and bundled with app.
- On startup/project init, seed missing default macro scripts into filesystem + DB.
- Add script-as-macro assignment in metadata and editor UX.
- Keep parameter typing rules explicit ("123" quoted string stays string; unquoted numerics map to int/float).
Exit criteria:
- All built-in macros execute via Python runtime.
- Legacy JS macro path is removed after parity confirmation.
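The parameter typing rule from the Phase 3 deliverables (a quoted "123" stays a string, unquoted numerics map to int/float) can be pinned down with a small coercion function. This is a sketch: `coerce_param` is a hypothetical name, treating any other unquoted token as a string is an assumption, and exponent notation is intentionally not handled.

```python
import re
from typing import Union

_INT_RE = re.compile(r"[+-]?\d+")
_FLOAT_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+)")


def coerce_param(raw: str) -> Union[str, int, float]:
    """Map one raw macro parameter token to str, int, or float."""
    text = raw.strip()
    # Quoted values always stay strings, even if they look numeric.
    if len(text) >= 2 and text[0] == text[-1] == '"':
        return text[1:-1]
    if _INT_RE.fullmatch(text):
        return int(text)
    if _FLOAT_RE.fullmatch(text):
        return float(text)
    return text  # assumption: other unquoted tokens are plain strings
```

Golden tests for the migration could assert on both the value and the type, since `123` and `"123"` must stay distinguishable after parsing.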
Phase 4 — Performance Hardening
Objective: reach production-grade speed and stability for render loops.
Deliverables:
- Precompile/load scripts once per worker lifecycle.
- Batch render APIs where beneficial (render_many(contexts)).
- Reduce marshaling size (compact context shape, no redundant fields).
- Optional SharedArrayBuffer experiments only if measured need justifies added complexity.
- Failure isolation and automatic runtime reset strategy.
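A batch API like the render_many(contexts) deliverable above amortizes the bridge crossing over many macro calls. The sketch below passes the macro callable explicitly to stay self-contained, which differs from the one-argument signature named above; the per-item `ok`/`result`/`error` envelope is an assumption chosen so that one failing context does not discard the whole batch.

```python
from typing import Callable, List


def render_many(fn: Callable[[dict], dict], contexts: List[dict]) -> List[dict]:
    """Run fn over every context inside Python; isolate per-item failures."""
    results: List[dict] = []
    for context in contexts:
        try:
            results.append({"ok": True, "result": fn(context)})
        except Exception as exc:  # one bad context must not abort the batch
            results.append({"ok": False, "error": f"{type(exc).__name__}: {exc}"})
    return results
```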
Exit criteria:
- Stable long-run benchmarks in CI/manual perf suite.
- No UI thread stalls during heavy generation.
Phase 5 — Bookmarklet/Post Transform Integration
Objective: reuse Python runtime for post-ingest transformations.
Deliverables:
- Hook script transforms into bookmarklet pipeline after data sanitization.
- Input: validated post object; output: transformed validated post object.
- Add transform-specific script type and error handling/reporting.
Exit criteria:
- Transform scripts can safely modify incoming post content.
- Fallback behavior exists when transform fails.
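The fallback requirement can be read as "keep the original post unless the transform returns something valid". Below is a minimal sketch of that rule. The required fields `title` and `content` are placeholders, since the actual post schema is not specified here, and `apply_transform` is a hypothetical name.

```python
import copy
from typing import Callable, Optional, Tuple

# Assumption: placeholder schema; the real validated post object may differ.
REQUIRED_FIELDS = ("title", "content")


def apply_transform(post: dict,
                    transform: Callable[[dict], dict]) -> Tuple[dict, Optional[str]]:
    """Return (post_to_use, error); on any failure the original post is kept."""
    try:
        # Hand the script a copy so it cannot mutate the validated original.
        candidate = transform(copy.deepcopy(post))
    except Exception as exc:
        return post, f"{type(exc).__name__}: {exc}"
    if not isinstance(candidate, dict):
        return post, "transform did not return an object"
    invalid = [f for f in REQUIRED_FIELDS if not isinstance(candidate.get(f), str)]
    if invalid:
        return post, "invalid fields: " + ", ".join(invalid)
    return candidate, None
```

The returned error string is what would be surfaced through the transform-specific error reporting mentioned in the deliverables.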
Phase 6 — Advanced Capabilities (Optional)
Objective: add power-user features only after core stability.
Candidates:
- Python-accessible app tools (strict allowlist).
- AI assistant tooling from Python scripts.
- Script package/dependency policy for curated modules.
5. Data and Storage Design
- Source of truth for scripts follows existing pattern: filesystem + DB index.
- Files: scripts/<slug>.py.
- Metadata can be stored in:
  - DB columns (preferred for indexing/query), and/or
  - leading Python block comment for file portability.
- Rebuild/meta-diff must include scripts/ exactly like posts/media flow.
Recommended script metadata:
- id, slug, title, kind, entrypoint, enabled, version, updatedAt.
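To make the "leading Python block comment" option concrete, the sketch below parses the recommended metadata fields from `# key: value` lines at the top of a script file. The header format, the defaults, and the field types (for example `version` as an integer) are assumptions made for illustration; only the field names come from the list above.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ScriptMeta:
    id: str
    slug: str
    title: str
    kind: str        # "macro" | "utility" | "transform"
    entrypoint: str
    enabled: bool
    version: int     # assumption: integer version counter
    updatedAt: str   # assumption: ISO 8601 timestamp string


def parse_header(source: str) -> Dict[str, str]:
    """Read '# key: value' lines from the leading comment block."""
    fields: Dict[str, str] = {}
    for line in source.splitlines():
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        key, sep, value = line[1:].partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def meta_from_source(source: str) -> ScriptMeta:
    """Build ScriptMeta from a script's header, applying assumed defaults."""
    raw = parse_header(source)
    return ScriptMeta(
        id=raw["id"],
        slug=raw["slug"],
        title=raw["title"],
        kind=raw["kind"],
        entrypoint=raw.get("entrypoint", "render"),
        enabled=raw.get("enabled", "true").lower() == "true",
        version=int(raw.get("version", "1")),
        updatedAt=raw.get("updatedAt", ""),
    )
```

Because the header is ordinary comments, the file stays a valid Python script and can be copied between projects without the DB index.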
6. Performance Plan (Macro-Critical)
Principles:
- Coarse-grained calls: one macro invocation should do meaningful work in Python.
- Stable ABI: small, predictable context payload.
- Warm runtime reuse: no repeated Pyodide boot.
- Compile/load once, execute many.
Initial target envelope (to validate in Phase 0/2):
- Warm invocation overhead target: low single-digit milliseconds for typical macros.
- p95 render stability target under large generation batches.
- Timeout and memory guardrails for pathological scripts.
Note: The previous strict <1ms universal target is replaced by benchmark tiers by macro class (simple/medium/heavy), which is more realistic.
7. Security and Reliability
- No direct filesystem/network/process APIs in Python runtime.
- Worker watchdog timeout and hard-kill policy.
- Structured errors returned to UI and logs.
- Script output validation before use in rendering.
- Versioned default scripts to ensure deterministic behavior across app updates.
8. Testing and Rollout Strategy
- Unit tests for engine-level script registry, metadata, and macro resolution.
- Integration tests for worker protocol and timeout recovery.
- Golden tests to compare macro output parity before/after migration.
- Performance regression checks for macro hot paths.
- Feature flag for staged rollout before removing legacy macro path.
9. Coding Agent Execution Pack
This section makes the plan directly executable by coding agents.
9.1 Working Rules for Agents
- Work one phase at a time; do not start the next phase before exit criteria pass.
- Keep changes layered by architecture boundary (main/engine, main/ipc, renderer).
- For each task: write/adjust tests first where feasible, then implement minimal code.
- Keep runtime contract stable once introduced; changes require updating ABI docs and tests.
- Do not add broad API exposure from JS/Electron into Python; only allowlisted calls.
9.2 Definition of Done (Per Phase)
Each phase is done only if all are true:
- Deliverables implemented.
- Exit criteria verified.
- Relevant tests pass.
- Full test suite passes (npm test).
- Full build passes (npm run build).
- Plan document updated with decisions/benchmarks where applicable.
9.3 Task Card Template (Use for Every Agent Task)
Task:
Scope:
Files expected to change:
Out of scope:
Acceptance checks:
Commands to run:
Notes/Risks:
9.4 Phase-by-Phase Agent Backlog (Suggested)
Phase 0 backlog
- Runtime bootstrap spike
  - Scope: add Pyodide dependency and worker startup path only.
  - Files likely: package.json, new worker file under src/renderer/.
  - Acceptance: worker initializes once, reports ready state.
- Safe execute protocol
  - Scope: request/response protocol (run, stdout, error, timeout).
  - Files likely: renderer runtime manager + worker + related types.
  - Acceptance: sample script run succeeds; timeout kills and recovers runtime.
- Baseline benchmark harness
  - Scope: cold start, warm run, repeated macro invoke metrics.
  - Files likely: engine/diagnostic service or dedicated benchmark utility + docs.
  - Acceptance: numbers recorded in this document or linked benchmark doc.
- ABI v1 spec
  - Scope: formal JSON schema for macro context and result.
  - Files likely: shared type definitions + docs.
  - Acceptance: schema used by both caller and worker-side validator.
Phase 1 backlog
- Script persistence model
  - Scope: DB + filesystem mapping for scripts/*.py.
  - Acceptance: create/update/delete round-trips both stores.
- Main engine + IPC CRUD
  - Scope: add script engine methods and typed IPC handlers.
  - Acceptance: renderer can list/read/write scripts through IPC only.
- Renderer MVP UI
  - Scope: scripts list, editor panel, run button, output panel integration.
  - Acceptance: user edits script, runs it, sees stdout/errors.
- Rebuild/meta-diff integration
  - Scope: include scripts in existing rebuild and metadata diff flow.
  - Acceptance: external file changes in scripts/ are detected and synchronized.
Phase 2 backlog
- Macro script typing + mapping
  - Scope: kind metadata and mapping from macro token to script id.
  - Acceptance: at least one macro resolved to Python script.
- Runtime cache path
  - Scope: load/compile once; callable reuse.
  - Acceptance: repeated macro invocations avoid re-init/re-import.
- Context adapter
  - Scope: convert existing macro params into ABI v1 context once per invocation.
  - Acceptance: typed values obey conversion rules.
- Perf counters
  - Scope: call count, p50/p95, timeout/error counts.
  - Acceptance: counters visible in logs/diagnostics.
Phase 3 backlog
- Built-in macro parity migration
  - Scope: port each macro to Python scripts and add parity tests.
  - Acceptance: output parity with legacy macros for baseline fixtures.
- Default script seeding/versioning
  - Scope: bundle defaults, seed missing scripts on init.
  - Acceptance: clean project bootstraps required macro scripts automatically.
- Legacy path removal
  - Scope: remove JS macro implementations after parity gate.
  - Acceptance: tests pass with Python-only macro path.
Phase 4–6 backlog
- Keep as optimization/integration tracks only after parity and stability gates pass.
9.5 Anti-Patterns for Agents (Do Not Do)
- Do not call JS functions per token/item from Python in hot paths.
- Do not pass large proxy objects through the bridge in render loops.
- Do not introduce direct filesystem/network access in Python runtime.
- Do not couple UI/editor work with macro migration in one PR-sized change.
- Do not remove legacy macro code before golden parity tests pass.
9.6 Handoff Checklist (Agent to Agent)
Every handoff should include:
- Completed task cards and remaining task cards.
- Files changed and rationale.
- Test/build command outputs summary.
- Known risks and benchmark deltas.
- Any ABI changes (must be explicit).
9.7 Suggested PR Boundaries (One Task, One PR)
Use small PRs with one primary purpose each.
PR-00: Pyodide bootstrap spike
- Includes: dependency, worker init, ready signal.
- Excludes: script persistence, UI/editor.
- Merge gate: runtime initializes and tests/build pass.
PR-01: Worker run protocol + timeout recovery
- Includes: run/stdout/error/timeout messaging, watchdog + restart behavior.
- Excludes: macro integration.
- Merge gate: timeout test and recovery test pass.
PR-02: ABI v1 types + schema validation
- Includes: shared types and validation for context/result.
- Excludes: macro migration.
- Merge gate: caller and worker both use ABI validators.
PR-03: Script persistence model
- Includes: DB + filesystem model for scripts/*.py.
- Excludes: renderer UI.
- Merge gate: round-trip persistence tests pass.
PR-04: Script engine + IPC CRUD
- Includes: main/engine methods and typed ipc handlers.
- Excludes: macro runtime.
- Merge gate: IPC integration tests pass.
PR-05: Renderer MVP scripts UI
- Includes: scripts list/editor/run/output integration.
- Excludes: macro substitution.
- Merge gate: end-to-end manual run path works + tests/build pass.
PR-06: Rebuild/meta-diff integration
- Includes: scripts/ in rebuild and metadata diff paths.
- Excludes: macro migration.
- Merge gate: external script file changes are detected and synchronized.
PR-07: Macro mapping + runtime cache foundation
- Includes: macro-to-script mapping, callable cache, first Python-backed macro.
- Excludes: full macro parity.
- Merge gate: at least one macro parity fixture passes.
PR-08: Macro parity migration batch A
- Includes: port a small set of built-in macros (e.g., 2–3) + golden tests.
- Excludes: removal of legacy path.
- Merge gate: parity fixtures pass for migrated macros.
PR-09: Macro parity migration batch B (repeat as needed)
- Includes: additional macro ports + fixtures.
- Excludes: removal of legacy path.
- Merge gate: all targeted macro parity tests pass.
PR-10: Default script seeding/versioning
- Includes: bundled default scripts + startup seeding behavior.
- Excludes: advanced scripting APIs.
- Merge gate: clean project gets default scripts deterministically.
PR-11: Legacy JS macro path removal
- Includes: delete legacy macro implementations after full parity.
- Excludes: bookmarklet transforms.
- Merge gate: full test suite and render parity suite pass.
PR-12: Performance hardening
- Includes: benchmark harness refinements, caching improvements, optional batch APIs.
- Excludes: unrelated UI changes.
- Merge gate: regression thresholds (p50/p95) stay within agreed envelope.
PR-13: Bookmarklet transform integration
- Includes: transform script type, pipeline hook, validation/fallback.
- Excludes: optional advanced tool APIs.
- Merge gate: sanitized input/output transform tests pass.
PR-14+: Optional advanced capabilities
- Includes: allowlisted app tools, AI-assistant script tools, curated package policy.
- Merge gate: explicit security review and feature-flag rollout.
10. Current Status
Status: Revised staged plan (MVP-first, full-scope preserved).
Recommended next actions:
- Approve Phase 0 scope and benchmarks.
- Implement spike and record numbers.
- Lock ABI before building full UI and migration layers.