chore: updated python scripting plan

2026-02-22 21:08:32 +01:00
parent 51c1963c55
commit d7c97f4d85


@@ -1,70 +1,455 @@
# Python Scripting Integration Plan (Electron + Pyodide)
## 1. Overview
This document outlines the architecture for integrating a user-accessible Python scripting engine into the Electron application. The goal is to allow users to write custom scripts stored in the database, which can be executed both interactively and in high-performance batch rendering processes.
## 1. Goal and Scope
## 2. Core Technology: Pyodide
We will use **[Pyodide](https://pyodide.org)** (Python via WebAssembly) as the scripting engine.
- **Security:** Scripts run in a WebAssembly sandbox, protecting the host system.
- **Zero-Dependency:** No local Python installation required on the user's machine.
- **Extensibility:** Support for libraries like `numpy` for heavy data processing.
Primary goal: all render-time macros run in Python, with predictable performance and safe sandboxing.
## 3. Architecture Components
Secondary goals:
- User-editable Python scripts with project persistence (`scripts/` folder + DB index).
- Python scripting reuse in bookmarklet/post-processing pipelines.
- Keep architecture consistent with existing `main/engine` + `ipc` + `renderer` boundaries.
### A. The Scripting Worker (Background Thread)
To ensure the UI remains responsive during "large render-batches," all Python execution will occur in a **Web Worker**.
- **Isolation:** Each script or batch job runs in a dedicated worker.
- **Sync Execution:** Within the worker, Pyodide executes scripts synchronously to minimize the overhead of the JavaScript event loop during loops.
### B. The API Bridge
A bi-directional bridge will expose application internals to the Python environment:
- **`js` module:** Users can `import js` to access exposed APIs.
- **Synchronous Proxies:** Functions required for rendering will be injected into `pyodide.globals` to ensure minimal latency during batch calls.
### C. Data Management (Zero-Copy)
For performance-critical rendering:
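- Use **[SharedArrayBuffer](https://developer.mozilla.org)** to share large datasets (e.g., pixel buffers or geometry data) between the renderer thread and the Python Worker. A `SharedArrayBuffer` is shared between threads of one process, so it cannot span the Electron main process and the renderer.
- Pass large JavaScript objects to Python as proxies (by reference) rather than converting them with `pyodide.toPy()`, which copies the data.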
## 4. Implementation Roadmap
### Phase 1: Integration & Setup
- [ ] Integrate `pyodide` NPM package into the Electron Renderer process.
- [ ] Implement a `PythonWorker.js` to initialize the WASM runtime off-main-thread.
- [ ] Create a "Safe-Execute" wrapper that fetches script strings from the database.
### Phase 2: API Surface Definition
- [ ] Define the `AppAPI` object to be exposed to Python.
- [ ] Expose the same tools to user scripts as the existing API offers, reusing the identical code.
- [ ] Implement tools to use the AI assistant from Python scripts.
- [ ] Map Python-style snake_case to JavaScript camelCase for a native Python feel.
### Phase 3: Performance Optimization
- [ ] **Script Pre-compilation:** Store compiled `PyCode` objects in memory for batch processing to avoid re-parsing Python code.
- [ ] **Batch Loop:** Implement a high-speed loop in JavaScript that calls the Python `render()` function 10,000+ times without thread-switching.
### Phase 4: User Experience
- [ ] Implement a scripts sidebar that opens a list of stored scripts
- [ ] Integrate a code editor (**Monaco Editor**) with Python syntax highlighting.
- [ ] Store scripts like posts in a `scripts/` project subfolder, with the frontmatter in a leading Python block comment. The file extension for Python scripts is `.py`.
- [ ] Handle changes the same way as for media files: saving a script also updates the file on disk.
- [ ] Scan script files on rebuild, as is already done for posts and media. Add this to the overall rebuild and add a separate button in preferences next to the other rebuild buttons.
- [ ] Wire the scripts folder into the metadata diff as well.
- [ ] Add a "Console Output" redirect to capture Python `print()` statements in the UI. Use the "output" tab of the bottom panel, where tasks and the git log are shown.
- [ ] Add a run button that executes a Python script.
## 5. Rewrite Macros into Python
- [ ] Provide a way to mark a script as a macro in the editor, and keep this flag as metadata.
- [ ] Convert all existing macros into Python scripts.
- [ ] Keep in mind that scripts can receive unquoted content that must still be treated as a string, while fully numeric parameters become numbers (int or float, depending on format).
- [ ] On program start, if no scripts exist, create the default Python scripts in the filesystem and load them into the database.
- [ ] Version the default scripts in the repository so they can be bundled with the app and propagated to the project.
## 6. Security Considerations
- **Resource Limits:** Implement a watchdog timer to terminate Web Workers if a user script enters an infinite loop.
- **Filesystem Access:** Restrict all I/O to the provided application APIs; do not expose Node.js `fs` module to the Pyodide environment.
This document defines a staged path from MVP to full scope.
---
**Status:** Initial Draft
**Target Performance:** < 1ms overhead per script execution in batch mode.
## 2. Viability Summary (Realistic Expectations)
### Is this realistic?
Yes, if we optimize for **low bridge overhead** and **stable execution contracts**.
Key reality checks:
- Pyodide in a worker is viable for user scripting and sandboxing.
- Macro execution in render loops can be fast enough if we avoid frequent JS↔Python conversions.
- `< 1ms` per macro call is possible only for simple macros with precompiled code and minimal marshaling; it must be treated as a benchmark target, not a guarantee.
- For heavy loops, the work should stay inside Python once called (coarse-grained calls), not bounce per item between JS and Python.
Decision:
- Keep Pyodide as the default engine.
- Design a strict, minimal ABI (Application Binary Interface-like contract) for macro inputs/outputs.
- Use preloading, precompilation, and caching before adding advanced optimizations.
---
## 3. Architecture Fit for bDS
### 3.1 Layering
Keep existing project boundaries:
- `src/main/engine`: script metadata, script storage/indexing, render orchestration, benchmark/logging.
- `src/main/ipc`: typed handlers for script CRUD, run, and diagnostics.
- `src/renderer`: script editor UI, run controls, output panel integration.
- Python runtime (Pyodide) stays in renderer-side worker context; main process never executes untrusted script code.
### 3.2 Runtime Placement
- Host a long-lived `PythonRuntimeWorker` in renderer.
- Initialize Pyodide once per app session (or lazily on first script run).
- Maintain an in-memory registry of loaded scripts and compiled callables.
### 3.3 Macro Execution Contract (Performance-Critical)
Use one narrow contract for all macros:
Python side:
```python
def render(context: dict) -> dict:
    ...
```
Contract rules:
- Input `context` is plain JSON-compatible data only.
- Output is plain JSON-compatible data only.
- No Node/Electron direct access from Python.
- No per-token/per-node callbacks into JS while rendering.
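A minimal sketch of a macro that follows the contract rules above, together with a round-trip check for JSON compatibility. The word-count behaviour, the `text` and `html` keys, and the `is_json_compatible` helper are hypothetical names chosen for illustration; the plan does not define them.

```python
import json


def is_json_compatible(value) -> bool:
    """Return True if `value` survives a JSON round trip unchanged."""
    try:
        return json.loads(json.dumps(value)) == value
    except (TypeError, ValueError):
        return False


def render(context: dict) -> dict:
    """Hypothetical macro: count the words in context["text"]."""
    text = str(context.get("text", ""))
    return {"html": f"<span>{len(text.split())} words</span>"}


result = render({"text": "hello macro world"})
```

A check of this kind could run on both sides of the bridge, so that a macro returning a set, a tuple, or a custom object is rejected before its output reaches the render pipeline.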
### 3.4 Bridge Strategy (Keep Conversions Simple)
- Preferred: pass compact JSON payloads (single call in, single result out).
- Avoid dynamic proxy-style JS objects in hot paths.
- Avoid `toPy()/toJs()` inside tight loops.
- Use `pyodide.globals` only for stable utility bindings set once during worker startup.
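The "single call in, single result out" idea can be sketched on the Python side as one function that decodes the payload once, calls the macro once, and encodes the result once. The name `invoke_macro` and the `names`/`items` payload keys are assumptions made for this example only.

```python
import json


def invoke_macro(render, payload_json: str) -> str:
    """Decode once, call the macro once, encode once."""
    context = json.loads(payload_json)
    result = render(context)
    # Compact separators keep the payload that crosses the bridge small.
    return json.dumps(result, separators=(",", ":"))


def render(context: dict) -> dict:
    # Hypothetical macro that does all per-item work inside Python.
    return {"items": [name.upper() for name in context["names"]]}


reply = invoke_macro(render, '{"names": ["a", "b"]}')
```

With this shape the JS side only ever handles two strings per invocation, so no proxy objects are created in the hot path.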
### 3.5 Security Model
- Scripts execute only in worker.
- Hard timeout + termination + runtime restart on runaway scripts.
- Allowlist API surface exposed to Python (pure functions where possible).
- Validate and sanitize all script outputs in JS before applying to render pipeline.
---
## 4. Staged Delivery Plan
## Phase 0 — Technical Spike (timeboxed)
Objective: prove runtime viability before product surface growth.
Deliverables:
- [ ] Add `pyodide` dependency and worker boot sequence.
- [ ] Run a sample script end-to-end (`run_script`, timeout, captured stdout).
- [ ] Benchmark baseline cold start + warm run + repeated macro calls.
- [ ] Define initial macro ABI (`render(context) -> result`) and schema docs.
Exit criteria:
- Warm script execution is stable.
- Timeout recovery works.
- Measured baseline captured in repo docs.
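A rough sketch of the "run a sample script and capture stdout" deliverable, as it might look inside the Python runtime. The `run_script` signature and the returned dictionary shape are assumptions. The timeout is not handled here, because the plan assigns it to the worker watchdog outside Python.

```python
import contextlib
import io


def run_script(source: str) -> dict:
    """Execute `source` in a fresh namespace and capture what it prints."""
    buffer = io.StringIO()
    namespace = {"__name__": "__script__"}
    with contextlib.redirect_stdout(buffer):
        exec(compile(source, "<user-script>", "exec"), namespace)
    return {"stdout": buffer.getvalue()}


outcome = run_script("print('hello from script')")
```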
## Phase 1 — MVP (Minimal but Usable)
Objective: user can create/run scripts and see output.
Deliverables:
- [ ] Script storage model (DB index + filesystem source in `scripts/*.py`).
- [ ] CRUD APIs in `main/engine` + `ipc` handlers.
- [ ] Renderer scripts list + editor + run button.
- [ ] Console/output capture in existing bottom output area.
- [ ] Project rebuild picks up `scripts/` changes.
Out of scope for MVP:
- Macro replacement.
- Bookmarklet integration.
- AI assistant tool access from Python.
Exit criteria:
- Scripts can be created, persisted, run, and debugged.
- Script files round-trip correctly with filesystem.
## Phase 2 — Macro Runtime Foundation
Objective: integrate Python macros into renderer loop with low overhead.
Deliverables:
- [ ] Add script type/metadata (`kind: macro | utility | transform`).
- [ ] Resolve macro references from content to script IDs.
- [ ] Implement macro runtime cache: module load once, callable reuse.
- [ ] Convert existing macro parameter parsing into typed context object once per macro invocation.
- [ ] Add perf counters (call count, p50/p95 runtime, timeout count).
Exit criteria:
- Python macro path is feature-equivalent for at least 1–2 existing macros.
- Measured overhead acceptable against baseline.
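One possible shape for the macro runtime cache (module load once, callable reuse) is sketched below. The `MacroCache` class, its content-hash key, and the `compile_count` counter are illustrative assumptions, not a prescribed design.

```python
import hashlib


class MacroCache:
    """Compile each script once and reuse its `render` callable."""

    def __init__(self):
        self._callables = {}
        self.compile_count = 0

    def get(self, script_id: str, source: str):
        # Key on the content hash so an edited script is recompiled.
        key = (script_id, hashlib.sha256(source.encode("utf-8")).hexdigest())
        if key not in self._callables:
            namespace = {}
            exec(compile(source, f"<macro:{script_id}>", "exec"), namespace)
            self._callables[key] = namespace["render"]
            self.compile_count += 1
        return self._callables[key]


cache = MacroCache()
source = "def render(context):\n    return {'n': context['n'] * 2}\n"
outputs = [cache.get("double", source)({"n": i}) for i in range(3)]
```

Three invocations compile the source only once; the remaining calls reuse the cached callable.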
## Phase 3 — Macro Migration (Full Goal)
Objective: all current built-in macros are Python-backed.
Deliverables:
- [ ] Port each existing macro implementation to Python scripts.
- [ ] Keep default macro scripts versioned in repo and bundled with app.
- [ ] On startup/project init, seed missing default macro scripts into filesystem + DB.
- [ ] Add script-as-macro assignment in metadata and editor UX.
- [ ] Keep parameter typing rules explicit (`"123"` quoted string stays string; unquoted numerics map to int/float).
Exit criteria:
- All built-in macros execute via Python runtime.
- Legacy JS macro path is removed after parity confirmation.
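The parameter typing rule from the Phase 3 deliverables could be expressed as a small parser like the one below. Which numeric formats count as "numeric" (here: optional sign, digits, optional decimal point, no exponent) is an assumption that the ABI documentation would need to pin down.

```python
import re

_INT = re.compile(r"[+-]?\d+")
_FLOAT = re.compile(r"[+-]?(\d+\.\d*|\.\d+)")


def parse_param(raw: str):
    """Map one raw macro parameter to str, int, or float."""
    raw = raw.strip()
    if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in "\"'":
        return raw[1:-1]  # quoted: always a string, even "123"
    if _INT.fullmatch(raw):
        return int(raw)
    if _FLOAT.fullmatch(raw):
        return float(raw)
    return raw  # unquoted non-numeric content stays a string
```

For example, `parse_param('"123"')` yields the string `"123"`, while `parse_param("123")` yields the integer `123` and `parse_param("left")` stays a string.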
## Phase 4 — Performance Hardening
Objective: reach production-grade speed and stability for render loops.
Deliverables:
- [ ] Precompile/load scripts once per worker lifecycle.
- [ ] Batch render APIs where beneficial (`render_many(contexts)`).
- [ ] Reduce marshaling size (compact context shape, no redundant fields).
- [ ] Optional SharedArrayBuffer experiments only if measured need justifies added complexity.
- [ ] Failure isolation and automatic runtime reset strategy.
Exit criteria:
- Stable long-run benchmarks in CI/manual perf suite.
- No UI thread stalls during heavy generation.
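A sketch of what a batch API such as `render_many(contexts)` might look like, so that a whole batch crosses the bridge once. The per-item `ok`/`result`/`error` envelope and the `label` key are assumptions made for illustration.

```python
def render(context: dict) -> dict:
    # Hypothetical single-item macro.
    return {"html": f"<b>{context['label']}</b>"}


def render_many(contexts: list) -> list:
    """Render a whole batch inside Python in one bridge crossing."""
    results = []
    for context in contexts:
        try:
            results.append({"ok": True, "result": render(context)})
        except Exception as exc:  # one bad item must not sink the batch
            results.append({"ok": False, "error": f"{type(exc).__name__}: {exc}"})
    return results


batch = render_many([{"label": "a"}, {}])
```

Isolating failures per item keeps one malformed context from discarding the rest of a large generation batch.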
## Phase 5 — Bookmarklet/Post Transform Integration
Objective: reuse Python runtime for post-ingest transformations.
Deliverables:
- [ ] Hook script transforms into bookmarklet pipeline after data sanitization.
- [ ] Input: validated post object; output: transformed validated post object.
- [ ] Add transform-specific script type and error handling/reporting.
Exit criteria:
- Transform scripts can safely modify incoming post content.
- Fallback behavior exists when transform fails.
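The transform-with-fallback behaviour can be illustrated as follows. The validation rule (a post keeps a string `title` and `body`) and the name `apply_transform` are hypothetical; the real post schema is defined elsewhere in the application.

```python
import copy


def apply_transform(transform, post: dict) -> dict:
    """Run `transform` on a copy; fall back to the original on failure."""
    try:
        candidate = transform(copy.deepcopy(post))
    except Exception:
        return post
    # Hypothetical validation rule: a post must keep a string title and body.
    valid = isinstance(candidate, dict) and all(
        isinstance(candidate.get(key), str) for key in ("title", "body")
    )
    return candidate if valid else post


post = {"title": "Hi", "body": "text"}
shouted = apply_transform(lambda p: {**p, "title": p["title"].upper()}, post)
broken = apply_transform(lambda p: p["missing"], post)
```

Because the transform receives a deep copy, a script that fails halfway cannot leave the original post partially modified.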
## Phase 6 — Advanced Capabilities (Optional)
Objective: add power-user features only after core stability.
Candidates:
- [ ] Python-accessible app tools (strict allowlist).
- [ ] AI assistant tooling from Python scripts.
- [ ] Script package/dependency policy for curated modules.
---
## 5. Data and Storage Design
- Source of truth for scripts follows existing pattern: filesystem + DB index.
- Files: `scripts/<slug>.py`.
- Metadata can be stored in:
  - DB columns (preferred for indexing/query), and/or
  - leading Python block comment for file portability.
- Rebuild/meta-diff must include `scripts/` exactly like posts/media flow.
Recommended script metadata:
- `id`, `slug`, `title`, `kind`, `entrypoint`, `enabled`, `version`, `updatedAt`.
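If metadata is kept in a leading comment block, reading it back could look roughly like this. The `# key: value` line format is an assumption for this sketch; the plan only states that frontmatter lives in a leading Python block comment.

```python
def parse_header(source: str) -> dict:
    """Read `# key: value` lines from the leading comment block."""
    metadata = {}
    for line in source.splitlines():
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        key, sep, value = line.lstrip("#").partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return metadata


script = (
    "# title: Word count\n"
    "# kind: macro\n"
    "# entrypoint: render\n"
    "\n"
    "def render(context):\n"
    "    return {}\n"
)
header = parse_header(script)
```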
---
## 6. Performance Plan (Macro-Critical)
Principles:
- Coarse-grained calls: one macro invocation should do meaningful work in Python.
- Stable ABI: small, predictable context payload.
- Warm runtime reuse: no repeated Pyodide boot.
- Compile/load once, execute many.
Initial target envelope (to validate in Phase 0/2):
- Warm invocation overhead target: low single-digit milliseconds for typical macros.
- p95 render stability target under large generation batches.
- Timeout and memory guardrails for pathological scripts.
Note: The previous strict `<1ms` universal target is replaced by benchmark tiers by macro class (simple/medium/heavy), which is more realistic.
---
## 7. Security and Reliability
- No direct filesystem/network/process APIs in Python runtime.
- Worker watchdog timeout and hard-kill policy.
- Structured errors returned to UI and logs.
- Script output validation before use in rendering.
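Structured errors could be produced inside the Python runtime by wrapping each call, as in this sketch. The record layout (`type`, `message`, `line`) and the `call_safely` name are assumptions, not a fixed format.

```python
import traceback


def call_safely(render, context: dict) -> dict:
    """Return either the macro result or a structured error record."""
    try:
        return {"ok": True, "result": render(context)}
    except Exception as exc:
        frames = traceback.extract_tb(exc.__traceback__)
        return {
            "ok": False,
            "error": {
                "type": type(exc).__name__,
                "message": str(exc),
                # Line of the innermost frame, where the exception was raised.
                "line": frames[-1].lineno if frames else None,
            },
        }


def failing(context: dict) -> dict:
    return {"value": 1 / context["divisor"]}


report = call_safely(failing, {"divisor": 0})
```

A plain dictionary like this is JSON-compatible, so it can travel over the same bridge path as a normal result.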
- Versioned default scripts to ensure deterministic behavior across app updates.
---
## 8. Testing and Rollout Strategy
- Unit tests for engine-level script registry, metadata, and macro resolution.
- Integration tests for worker protocol and timeout recovery.
- Golden tests to compare macro output parity before/after migration.
- Performance regression checks for macro hot paths.
- Feature flag for staged rollout before removing legacy macro path.
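A golden parity check might be structured as below. The `legacy_upper` function merely stands in for output recorded from an existing JS macro, and the fixtures are invented for the example.

```python
def legacy_upper(params: dict) -> str:
    # Stand-in for an existing JS macro's recorded behaviour.
    return params["text"].upper()


def python_upper(context: dict) -> dict:
    return {"html": context["text"].upper()}


# Golden fixtures: inputs paired with output recorded from the legacy path.
FIXTURES = [{"text": "abc"}, {"text": "Mixed Case"}, {"text": ""}]
GOLDEN = [legacy_upper(fixture) for fixture in FIXTURES]


def parity_failures(render, fixtures, golden) -> list:
    """Return the fixtures whose Python output differs from the golden output."""
    return [f for f, want in zip(fixtures, golden) if render(f)["html"] != want]


failures = parity_failures(python_upper, FIXTURES, GOLDEN)
```

In practice the golden outputs would be stored as files captured from the legacy path before migration, so they stay fixed while the Python port evolves.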
---
## 9. Coding Agent Execution Pack
This section makes the plan directly executable by coding agents.
### 9.1 Working Rules for Agents
- Work one phase at a time; do not start the next phase before exit criteria pass.
- Keep changes layered by architecture boundary (`main/engine`, `main/ipc`, `renderer`).
- For each task: write/adjust tests first where feasible, then implement minimal code.
- Keep runtime contract stable once introduced; changes require updating ABI docs and tests.
- Do not add broad API exposure from JS/Electron into Python; only allowlisted calls.
### 9.2 Definition of Done (Per Phase)
Each phase is done only if all are true:
- [ ] Deliverables implemented.
- [ ] Exit criteria verified.
- [ ] Relevant tests pass.
- [ ] Full test suite passes (`npm test`).
- [ ] Full build passes (`npm run build`).
- [ ] Plan document updated with decisions/benchmarks where applicable.
### 9.3 Task Card Template (Use for Every Agent Task)
```md
Task:
Scope:
Files expected to change:
Out of scope:
Acceptance checks:
Commands to run:
Notes/Risks:
```
### 9.4 Phase-by-Phase Agent Backlog (Suggested)
#### Phase 0 backlog
1. Runtime bootstrap spike
   - Scope: add Pyodide dependency and worker startup path only.
   - Files likely: `package.json`, new worker file under `src/renderer/`.
   - Acceptance: worker initializes once, reports ready state.
2. Safe execute protocol
   - Scope: request/response protocol (`run`, `stdout`, `error`, `timeout`).
   - Files likely: renderer runtime manager + worker + related types.
   - Acceptance: sample script run succeeds; timeout kills and recovers runtime.
3. Baseline benchmark harness
   - Scope: cold start, warm run, repeated macro invoke metrics.
   - Files likely: engine/diagnostic service or dedicated benchmark utility + docs.
   - Acceptance: numbers recorded in this document or linked benchmark doc.
4. ABI v1 spec
   - Scope: formal JSON schema for macro `context` and `result`.
   - Files likely: shared type definitions + docs.
   - Acceptance: schema used by both caller and worker-side validator.
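For the baseline benchmark item, the warm-run measurement inside Python could be sketched as follows. This covers only in-Python call time; cold start and bridge overhead would have to be measured from the JS side. The `benchmark` helper and its defaults are assumptions.

```python
import statistics
import time


def benchmark(render, context: dict, warmup: int = 10, runs: int = 200) -> dict:
    """Time repeated warm calls and summarise them in milliseconds."""
    for _ in range(warmup):
        render(context)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        render(context)
        samples.append((time.perf_counter() - start) * 1000.0)
    cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return {"runs": runs, "p50_ms": cuts[49], "p95_ms": cuts[94]}


stats = benchmark(lambda context: {"n": context["n"] + 1}, {"n": 1})
```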
#### Phase 1 backlog
1. Script persistence model
   - Scope: DB + filesystem mapping for `scripts/*.py`.
   - Acceptance: create/update/delete round-trips both stores.
2. Main engine + IPC CRUD
   - Scope: add script engine methods and typed IPC handlers.
   - Acceptance: renderer can list/read/write scripts through IPC only.
3. Renderer MVP UI
   - Scope: scripts list, editor panel, run button, output panel integration.
   - Acceptance: user edits script, runs it, sees stdout/errors.
4. Rebuild/meta-diff integration
   - Scope: include scripts in existing rebuild and metadata diff flow.
   - Acceptance: external file changes in `scripts/` are detected and synchronized.
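Detecting external changes in `scripts/` comes down to comparing what is on disk with what the DB index recorded. The sketch below shows that comparison with content hashes. The plan places this logic in `main/engine`, so the Python form only illustrates the rule, and the `diff_scripts` name and data shapes are assumptions.

```python
import hashlib


def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_scripts(files: dict, index: dict) -> dict:
    """Compare {slug: source} on disk with {slug: hash} in the DB index."""
    return {
        "added": sorted(files.keys() - index.keys()),
        "removed": sorted(index.keys() - files.keys()),
        "changed": sorted(
            slug
            for slug in files.keys() & index.keys()
            if content_hash(files[slug]) != index[slug]
        ),
    }


index = {"a": content_hash("print(1)"), "b": content_hash("print(2)")}
files = {"a": "print(1)", "b": "print(22)", "c": "print(3)"}
changes = diff_scripts(files, index)
```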
#### Phase 2 backlog
1. Macro script typing + mapping
   - Scope: `kind` metadata and mapping from macro token to script id.
   - Acceptance: at least one macro resolved to Python script.
2. Runtime cache path
   - Scope: load/compile once; callable reuse.
   - Acceptance: repeated macro invocations avoid re-init/re-import.
3. Context adapter
   - Scope: convert existing macro params into ABI v1 `context` once per invocation.
   - Acceptance: typed values obey conversion rules.
4. Perf counters
   - Scope: call count, p50/p95, timeout/error counts.
   - Acceptance: counters visible in logs/diagnostics.
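The perf counters item could be backed by a small accumulator like this one, which uses the nearest-rank method for percentiles. The `PerfCounters` class and its field names are hypothetical, and the durations fed in below are synthetic.

```python
import math


class PerfCounters:
    """Accumulate per-macro timings and report nearest-rank percentiles."""

    def __init__(self):
        self.durations_ms = []
        self.timeouts = 0
        self.errors = 0

    def record(self, duration_ms: float) -> None:
        self.durations_ms.append(duration_ms)

    def percentile(self, pct: int) -> float:
        ordered = sorted(self.durations_ms)
        rank = max(1, math.ceil(pct * len(ordered) / 100))
        return ordered[rank - 1]

    def snapshot(self) -> dict:
        return {
            "calls": len(self.durations_ms),
            "p50_ms": self.percentile(50),
            "p95_ms": self.percentile(95),
            "timeouts": self.timeouts,
            "errors": self.errors,
        }


counters = PerfCounters()
for ms in range(1, 101):  # synthetic durations 1..100 ms
    counters.record(float(ms))
summary = counters.snapshot()
```

This sketch assumes at least one recorded call before `snapshot()` is used; an empty counter would need its own handling.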
#### Phase 3 backlog
1. Built-in macro parity migration
   - Scope: port each macro to Python scripts and add parity tests.
   - Acceptance: output parity with legacy macros for baseline fixtures.
2. Default script seeding/versioning
   - Scope: bundle defaults, seed missing scripts on init.
   - Acceptance: clean project bootstraps required macro scripts automatically.
3. Legacy path removal
   - Scope: remove JS macro implementations after parity gate.
   - Acceptance: tests pass with Python-only macro path.
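The seeding rule (create missing defaults, never overwrite existing files) is illustrated below. In the plan this belongs to the application's init path rather than to Python, so the sketch only demonstrates the behaviour. The slugs in `DEFAULT_SCRIPTS` are invented placeholders.

```python
import tempfile
from pathlib import Path

# Hypothetical bundled defaults, keyed by slug.
DEFAULT_SCRIPTS = {
    "gallery": "def render(context):\n    return {'html': ''}\n",
    "youtube": "def render(context):\n    return {'html': ''}\n",
}


def seed_defaults(project_dir: Path, defaults: dict) -> list:
    """Write missing default scripts; never overwrite user edits."""
    scripts_dir = project_dir / "scripts"
    scripts_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for slug, source in sorted(defaults.items()):
        target = scripts_dir / f"{slug}.py"
        if not target.exists():
            target.write_text(source, encoding="utf-8")
            created.append(slug)
    return created


with tempfile.TemporaryDirectory() as tmp:
    first = seed_defaults(Path(tmp), DEFAULT_SCRIPTS)
    second = seed_defaults(Path(tmp), DEFAULT_SCRIPTS)
```

Running the seeding twice creates files only on the first pass, which is the idempotence the "clean project bootstraps automatically" acceptance check relies on.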
#### Phase 4–6 backlog
- Keep as optimization/integration tracks only after parity and stability gates pass.
### 9.5 Anti-Patterns for Agents (Do Not Do)
- Do not call JS functions per token/item from Python in hot paths.
- Do not pass large proxy objects through the bridge in render loops.
- Do not introduce direct filesystem/network access in Python runtime.
- Do not couple UI/editor work with macro migration in one PR-sized change.
- Do not remove legacy macro code before golden parity tests pass.
### 9.6 Handoff Checklist (Agent to Agent)
Every handoff should include:
- Completed task cards and remaining task cards.
- Files changed and rationale.
- Test/build command outputs summary.
- Known risks and benchmark deltas.
- Any ABI changes (must be explicit).
### 9.7 Suggested PR Boundaries (One Task, One PR)
Use small PRs with one primary purpose each.
PR-00: Pyodide bootstrap spike
- Includes: dependency, worker init, ready signal.
- Excludes: script persistence, UI/editor.
- Merge gate: runtime initializes and tests/build pass.
PR-01: Worker run protocol + timeout recovery
- Includes: run/stdout/error/timeout messaging, watchdog + restart behavior.
- Excludes: macro integration.
- Merge gate: timeout test and recovery test pass.
PR-02: ABI v1 types + schema validation
- Includes: shared types and validation for `context/result`.
- Excludes: macro migration.
- Merge gate: caller and worker both use ABI validators.
PR-03: Script persistence model
- Includes: DB + filesystem model for `scripts/*.py`.
- Excludes: renderer UI.
- Merge gate: round-trip persistence tests pass.
PR-04: Script engine + IPC CRUD
- Includes: `main/engine` methods and typed `ipc` handlers.
- Excludes: macro runtime.
- Merge gate: IPC integration tests pass.
PR-05: Renderer MVP scripts UI
- Includes: scripts list/editor/run/output integration.
- Excludes: macro substitution.
- Merge gate: end-to-end manual run path works + tests/build pass.
PR-06: Rebuild/meta-diff integration
- Includes: include `scripts/` in rebuild and metadata diff paths.
- Excludes: macro migration.
- Merge gate: external script file changes are detected and synchronized.
PR-07: Macro mapping + runtime cache foundation
- Includes: macro-to-script mapping, callable cache, first Python-backed macro.
- Excludes: full macro parity.
- Merge gate: at least one macro parity fixture passes.
PR-08: Macro parity migration batch A
- Includes: port a small set of built-in macros (e.g., 2–3) + golden tests.
- Excludes: removal of legacy path.
- Merge gate: parity fixtures pass for migrated macros.
PR-09: Macro parity migration batch B (repeat as needed)
- Includes: additional macro ports + fixtures.
- Excludes: removal of legacy path.
- Merge gate: all targeted macro parity tests pass.
PR-10: Default script seeding/versioning
- Includes: bundled default scripts + startup seeding behavior.
- Excludes: advanced scripting APIs.
- Merge gate: clean project gets default scripts deterministically.
PR-11: Legacy JS macro path removal
- Includes: delete legacy macro implementations after full parity.
- Excludes: bookmarklet transforms.
- Merge gate: full test suite and render parity suite pass.
PR-12: Performance hardening
- Includes: benchmark harness refinements, caching improvements, optional batch APIs.
- Excludes: unrelated UI changes.
- Merge gate: regression thresholds (p50/p95) stay within agreed envelope.
PR-13: Bookmarklet transform integration
- Includes: transform script type, pipeline hook, validation/fallback.
- Excludes: optional advanced tool APIs.
- Merge gate: sanitized input/output transform tests pass.
PR-14+: Optional advanced capabilities
- Includes: allowlisted app tools, AI-assistant script tools, curated package policy.
- Merge gate: explicit security review and feature-flag rollout.
---
## 10. Current Status
Status: Revised staged plan (MVP-first, full-scope preserved).
Recommended next action:
1. Approve Phase 0 scope and benchmarks.
2. Implement spike and record numbers.
3. Lock ABI before building full UI and migration layers.