hugo/bDS

hugo d7c97f4d85 chore: updated python scripting plan

2026-02-22 21:08:32 +01:00

Python Scripting Integration Plan (Electron + Pyodide)

1. Goal and Scope

Primary goal: all render-time macros run in Python, with predictable performance and safe sandboxing.

Secondary goals:

User-editable Python scripts with project persistence (scripts/ folder + DB index).
Python scripting reuse in bookmarklet/post-processing pipelines.
Keep architecture consistent with existing main/engine + ipc + renderer boundaries.

This document defines a staged path from MVP to full scope.

2. Viability Summary (Realistic Expectations)

Is this realistic?

Yes, if we optimize for low bridge overhead and stable execution contracts.

Key reality checks:

Pyodide in a worker is viable for user scripting and sandboxing.
Macro execution in render loops can be fast enough if we avoid frequent JS↔Python conversions.
< 1ms per macro call is possible only for simple macros with precompiled code and minimal marshaling; it must be treated as a benchmark target, not a guarantee.
For heavy loops, the work should stay inside Python once called (coarse-grained calls), not bounce per item between JS and Python.

Decision:

Keep Pyodide as the default engine.
Design a strict, minimal ABI (an Application Binary Interface-style contract) for macro inputs and outputs.
Use preloading, precompilation, and caching before adding advanced optimizations.

3. Architecture Fit for bDS

3.1 Layering

Keep existing project boundaries:

src/main/engine: script metadata, script storage/indexing, render orchestration, benchmark/logging.
src/main/ipc: typed handlers for script CRUD, run, and diagnostics.
src/renderer: script editor UI, run controls, output panel integration.
The Python runtime (Pyodide) stays in a renderer-side worker context; the main process never executes untrusted script code.

3.2 Runtime Placement

Host a long-lived PythonRuntimeWorker in the renderer.
Initialize Pyodide once per app session (or lazily on first script run).
Maintain an in-memory registry of loaded scripts and compiled callables.

3.3 Macro Execution Contract (Performance-Critical)

Use one narrow contract for all macros:

Python side:

def render(context: dict) -> dict:
    ...

Contract rules:

Input context is plain JSON-compatible data only.
Output is plain JSON-compatible data only.
No direct Node/Electron access from Python.
No per-token/per-node callbacks into JS while rendering.
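
A minimal sketch of a macro that follows these rules. The `badge` macro, its `label` parameter, and the `params`/`html` field names are hypothetical and not taken from the bDS codebase. The round-trip helper is one simple way to test the "plain JSON-compatible data" rule.

```python
import html
import json


def render(context: dict) -> dict:
    """Hypothetical `badge` macro: plain data in, plain data out."""
    params = context.get("params", {})
    label = html.escape(str(params.get("label", "")))
    return {"html": f'<span class="badge">{label}</span>'}


def is_json_compatible(value: object) -> bool:
    """True if the value survives a strict JSON round trip unchanged."""
    try:
        return json.loads(json.dumps(value, allow_nan=False)) == value
    except (TypeError, ValueError):
        return False
```

Sets, tuples, non-string dict keys, and NaN all fail the round trip, so a check of this kind would reject them before they reach the bridge.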

3.4 Bridge Strategy (Keep Conversions Simple)

Preferred: pass compact JSON payloads (single call in, single result out).
Avoid dynamic proxy-style JS objects in hot paths.
Avoid toPy()/toJs() inside tight loops.
Use pyodide.globals only for stable utility bindings set once during worker startup.
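
The "single call in, single result out" idea can be sketched as a thin Python wrapper that takes a JSON string and returns a JSON string, so no proxy objects cross the bridge. The name `render_json` and the placeholder macro body are assumptions for illustration only.

```python
import json


def render(context: dict) -> dict:
    # Placeholder macro body; a real macro would come from scripts/.
    return {"html": context["params"]["text"].upper()}


def render_json(payload: str) -> str:
    """Decode once, run the macro, encode once."""
    context = json.loads(payload)
    result = render(context)
    # Compact separators keep the returned payload small.
    return json.dumps(result, separators=(",", ":"))
```

On the JS side this would correspond to one `JSON.stringify` before the call and one `JSON.parse` after it.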

3.5 Security Model

Scripts execute only in the worker.
Hard timeout + termination + runtime restart on runaway scripts.
Allowlist API surface exposed to Python (pure functions where possible).
Validate and sanitize all script outputs in JS before applying to render pipeline.

4. Staged Delivery Plan

Phase 0 — Technical Spike (timeboxed)

Objective: prove runtime viability before product surface growth.

Deliverables:

Add pyodide dependency and worker boot sequence.
Run a sample script end-to-end (run_script, timeout, captured stdout).
Benchmark baseline cold start + warm run + repeated macro calls.
Define initial macro ABI (render(context) -> result) and schema docs.
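
As a rough illustration of the Python half of the `run_script` deliverable, the sketch below executes script source and captures its stdout and any traceback. The return shape (`ok`, `stdout`, `error`) is an assumption. The timeout is deliberately absent, because the plan enforces it from outside by terminating the worker.

```python
import contextlib
import io
import traceback


def run_script(source: str) -> dict:
    """Execute script source, returning captured stdout and any error text."""
    stdout = io.StringIO()
    error = None
    try:
        with contextlib.redirect_stdout(stdout):
            # A fresh namespace per run keeps scripts from leaking state.
            exec(compile(source, "<script>", "exec"), {"__name__": "__main__"})
    except Exception:
        error = traceback.format_exc()
    return {"ok": error is None, "stdout": stdout.getvalue(), "error": error}
```

Output printed before a failure is still returned, which is what an output panel would typically want to show.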

Exit criteria:

Warm script execution is stable.
Timeout recovery works.
Measured baseline captured in repo docs.

Phase 1 — MVP (Minimal but Usable)

Objective: user can create/run scripts and see output.

Deliverables:

Script storage model (DB index + filesystem source in scripts/*.py).
CRUD APIs in main/engine + ipc handlers.
Renderer scripts list + editor + run button.
Console/output capture in existing bottom output area.
Project rebuild picks up scripts/ changes.

Out of scope for MVP:

Macro replacement.
Bookmarklet integration.
AI assistant tool access from Python.

Exit criteria:

Scripts can be created, persisted, run, and debugged.
Script files round-trip correctly with filesystem.

Phase 2 — Macro Runtime Foundation

Objective: integrate Python macros into renderer loop with low overhead.

Deliverables:

Add script type/metadata (kind: macro | utility | transform).
Resolve macro references from content to script IDs.
Implement macro runtime cache: module load once, callable reuse.
Convert parsed macro parameters into a typed context object once per macro invocation.
Add perf counters (call count, p50/p95 runtime, timeout count).
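
One possible shape for the "module load once, callable reuse" cache is sketched below. Keying the cache on a source hash is an assumption, chosen so that an edited script is recompiled while unchanged scripts reuse their callable. `load_macro` and the `render` entrypoint name are illustrative.

```python
import hashlib
from typing import Callable

# script id -> (source digest, compiled render callable)
_cache: dict[str, tuple[str, Callable[[dict], dict]]] = {}


def load_macro(script_id: str, source: str) -> Callable[[dict], dict]:
    """Return the script's render callable, compiling only when source changes."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cached = _cache.get(script_id)
    if cached is not None and cached[0] == digest:
        return cached[1]
    namespace: dict = {"__name__": f"macro_{script_id}"}
    exec(compile(source, f"scripts/{script_id}.py", "exec"), namespace)
    render = namespace["render"]
    _cache[script_id] = (digest, render)
    return render
```

Repeated invocations then cost one dictionary lookup and one hash of the source. A caller that already knows the script is unchanged could skip the hash as well.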

Exit criteria:

Python macro path is feature-equivalent for at least 1–2 existing macros.
Measured overhead is acceptable against the Phase 0 baseline.

Phase 3 — Macro Migration (Full Goal)

Objective: all current built-in macros are Python-backed.

Deliverables:

Port each existing macro implementation to Python scripts.
Keep default macro scripts versioned in repo and bundled with app.
On startup/project init, seed missing default macro scripts into filesystem + DB.
Add script-as-macro assignment in metadata and editor UX.
Keep parameter typing rules explicit ("123" quoted string stays string; unquoted numerics map to int/float).
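
The typing rule in the last item could look like the sketch below. It assumes that single quotes also count as quoting and that exponent notation is not treated as numeric; neither detail is specified in this plan.

```python
import re

_INT = re.compile(r"[+-]?\d+")
_FLOAT = re.compile(r"[+-]?(\d+\.\d*|\.\d+)")


def coerce_param(raw: str) -> object:
    """Map one raw macro parameter token to a typed Python value."""
    token = raw.strip()
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "\"'":
        return token[1:-1]  # quoted: always a string, even "123"
    if _INT.fullmatch(token):
        return int(token)
    if _FLOAT.fullmatch(token):
        return float(token)
    return token  # anything else stays a string
```

For example, `coerce_param('"123"')` yields the string `"123"`, while `coerce_param("123")` yields the integer `123`.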

Exit criteria:

All built-in macros execute via Python runtime.
Legacy JS macro path is removed after parity confirmation.

Phase 4 — Performance Hardening

Objective: reach production-grade speed and stability for render loops.

Deliverables:

Precompile/load scripts once per worker lifecycle.
Batch render APIs where beneficial (render_many(contexts)).
Reduce marshaling size (compact context shape, no redundant fields).
Optional SharedArrayBuffer experiments only if measured need justifies added complexity.
Failure isolation and automatic runtime reset strategy.
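
A batch API such as `render_many(contexts)` might be sketched as follows. Passing the callable explicitly and wrapping each item in an `ok`/`result`/`error` envelope are assumptions, made so that one failing item does not discard the rest of the batch.

```python
from typing import Callable


def render_many(render: Callable[[dict], dict], contexts: list[dict]) -> list[dict]:
    """Run one macro over many contexts in a single bridge crossing."""
    results = []
    for context in contexts:
        try:
            results.append({"ok": True, "result": render(context)})
        except Exception as exc:  # isolate failures per item
            results.append({"ok": False, "error": f"{type(exc).__name__}: {exc}"})
    return results
```

The loop stays inside Python, so the bridge is crossed once per batch rather than once per item.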

Exit criteria:

Stable long-run benchmarks in CI/manual perf suite.
No UI thread stalls during heavy generation.

Phase 5 — Bookmarklet/Post Transform Integration

Objective: reuse Python runtime for post-ingest transformations.

Deliverables:

Hook script transforms into bookmarklet pipeline after data sanitization.
Input: validated post object; output: transformed validated post object.
Add transform-specific script type and error handling/reporting.

Exit criteria:

Transform scripts can safely modify incoming post content.
Fallback behavior exists when transform fails.
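
The validated-in, validated-out flow with a fallback can be sketched like this. The post shape (`title`, `content`) and the function names are hypothetical. The fallback chosen here, keeping the original post, is one option among several.

```python
import copy
from typing import Callable, Optional

REQUIRED_FIELDS = {"title": str, "content": str}  # hypothetical post shape


def is_valid_post(post: object) -> bool:
    """Check that the post is a dict carrying the required typed fields."""
    return isinstance(post, dict) and all(
        isinstance(post.get(name), kind) for name, kind in REQUIRED_FIELDS.items()
    )


def apply_transform(
    transform: Callable[[dict], dict], post: dict
) -> tuple[dict, Optional[str]]:
    """Return (post, error). On any failure the original post is kept."""
    try:
        # The transform works on a copy so it cannot mutate the original.
        candidate = transform(copy.deepcopy(post))
    except Exception as exc:
        return post, f"{type(exc).__name__}: {exc}"
    if not is_valid_post(candidate):
        return post, "transform returned an invalid post"
    return candidate, None
```

The returned error string gives the reporting layer something structured to log without interrupting ingestion.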

Phase 6 — Advanced Capabilities (Optional)

Objective: add power-user features only after core stability.

Candidates:

Python-accessible app tools (strict allowlist).
AI assistant tooling from Python scripts.
Script package/dependency policy for curated modules.

5. Data and Storage Design

Source of truth for scripts follows existing pattern: filesystem + DB index.
Files: scripts/<slug>.py.
Metadata can be stored in:
- DB columns (preferred for indexing/query), and/or
- leading Python block comment for file portability.
Rebuild/meta-diff must include scripts/ exactly like posts/media flow.

Recommended script metadata:

id, slug, title, kind, entrypoint, enabled, version, updatedAt.
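
For illustration, the recommended metadata could map to a record like the one below, together with a reader for the optional leading comment block. The field types, the defaults, and the `# key: value` header syntax are assumptions; the plan only names the fields.

```python
import re
from dataclasses import dataclass


@dataclass
class ScriptMeta:
    id: str
    slug: str
    title: str
    kind: str  # "macro" | "utility" | "transform"
    entrypoint: str = "render"
    enabled: bool = True
    version: int = 1
    updatedAt: str = ""


_HEADER_LINE = re.compile(r"#\s*(\w+)\s*:\s*(.*)")


def read_header(source: str) -> dict:
    """Collect `# key: value` pairs from the leading comment block."""
    fields = {}
    for line in source.splitlines():
        match = _HEADER_LINE.fullmatch(line.strip())
        if match is None:
            break  # header ends at the first non-matching line
        fields[match.group(1)] = match.group(2).strip()
    return fields
```

A file that starts with `# title: Badge` and `# kind: macro` would then carry enough metadata to be re-indexed if the DB is rebuilt from the filesystem.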

6. Performance Plan (Macro-Critical)

Principles:

Coarse-grained calls: one macro invocation should do meaningful work in Python.
Stable ABI: small, predictable context payload.
Warm runtime reuse: no repeated Pyodide boot.
Compile/load once, execute many.

Initial target envelope (to validate in Phase 0/2):

Warm invocation overhead target: low single-digit milliseconds for typical macros.
Stable p95 render time under large generation batches.
Timeout and memory guardrails for pathological scripts.

Note: The previous strict <1ms universal target is replaced by benchmark tiers by macro class (simple/medium/heavy), which is more realistic.
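
A warm-invocation measurement for these tiers might be taken with a small harness like the following sketch. The function name and the default of 1000 runs are arbitrary choices. Timer resolution inside a browser worker may be coarser than in a native interpreter, so very small numbers should be read with care.

```python
import statistics
import time
from typing import Callable


def benchmark(render: Callable[[dict], dict], context: dict, runs: int = 1000) -> dict:
    """Time repeated warm calls and report p50/p95 in milliseconds."""
    render(context)  # warm-up call, excluded from the samples
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        render(context)
        samples.append((time.perf_counter() - start) * 1000.0)
    cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return {"runs": runs, "p50_ms": cuts[49], "p95_ms": cuts[94]}
```

This measures only the Python-side call. The JS-to-Python bridge cost would need a matching measurement taken from the worker host.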

7. Security and Reliability

No direct filesystem/network/process APIs in Python runtime.
Worker watchdog timeout and hard-kill policy.
Structured errors returned to UI and logs.
Script output validation before use in rendering.
Versioned default scripts to ensure deterministic behavior across app updates.

8. Testing and Rollout Strategy

Unit tests for engine-level script registry, metadata, and macro resolution.
Integration tests for worker protocol and timeout recovery.
Golden tests to compare macro output parity before/after migration.
Performance regression checks for macro hot paths.
Feature flag for staged rollout before removing legacy macro path.
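
A golden parity check can be as small as the sketch below. The fixture shape (`name`, `context`, `expected`) is an assumption, with `expected` holding output recorded from the legacy macro path.

```python
import json
from typing import Callable


def check_parity(render: Callable[[dict], dict], fixtures: list[dict]) -> list[str]:
    """Compare Python macro output with recorded legacy output per fixture."""
    failures = []
    for fixture in fixtures:
        actual = render(fixture["context"])
        if actual != fixture["expected"]:
            failures.append(
                f"{fixture['name']}: expected {json.dumps(fixture['expected'])}, "
                f"got {json.dumps(actual)}"
            )
    return failures
```

An empty list means parity holds for every fixture, which is the condition the plan requires before the legacy path is removed.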

9. Coding Agent Execution Pack

This section makes the plan directly executable by coding agents.

9.1 Working Rules for Agents

Work one phase at a time; do not start the next phase before exit criteria pass.
Keep changes layered by architecture boundary (main/engine, main/ipc, renderer).
For each task: write/adjust tests first where feasible, then implement minimal code.
Keep runtime contract stable once introduced; changes require updating ABI docs and tests.
Do not add broad API exposure from JS/Electron into Python; only allowlisted calls.

9.2 Definition of Done (Per Phase)

Each phase is done only if all are true:

Deliverables implemented.
Exit criteria verified.
Relevant tests pass.
Full test suite passes (npm test).
Full build passes (npm run build).
Plan document updated with decisions/benchmarks where applicable.

9.3 Task Card Template (Use for Every Agent Task)

Task:
Scope:
Files expected to change:
Out of scope:
Acceptance checks:
Commands to run:
Notes/Risks:

9.4 Phase-by-Phase Agent Backlog (Suggested)

Phase 0 backlog

Runtime bootstrap spike

Scope: add Pyodide dependency and worker startup path only.
Files likely: package.json, new worker file under src/renderer/.
Acceptance: worker initializes once, reports ready state.

Safe execute protocol

Scope: request/response protocol (run, stdout, error, timeout).
Files likely: renderer runtime manager + worker + related types.
Acceptance: sample script run succeeds; timeout kills and recovers runtime.

Baseline benchmark harness

Scope: cold start, warm run, repeated macro invoke metrics.
Files likely: engine/diagnostic service or dedicated benchmark utility + docs.
Acceptance: numbers recorded in this document or linked benchmark doc.

ABI v1 spec

Scope: formal JSON schema for macro context and result.
Files likely: shared type definitions + docs.
Acceptance: schema used by both caller and worker-side validator.
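
Until the formal schema exists, a worker-side validator could be approximated as below. The field names (`abi`, `macro`, `params`, `html`) are placeholders, not the ABI v1 definition, and the check covers only required fields, top-level types, and unexpected keys.

```python
CONTEXT_SCHEMA = {"abi": int, "macro": str, "params": dict}  # hypothetical
RESULT_SCHEMA = {"html": str}  # hypothetical


def validate(value: object, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the value conforms."""
    if not isinstance(value, dict):
        return ["value must be an object"]
    problems = []
    for name, kind in schema.items():
        if name not in value:
            problems.append(f"missing field: {name}")
        elif not isinstance(value[name], kind) or (
            kind is int and isinstance(value[name], bool)
        ):
            # bool is a subclass of int in Python, so reject it explicitly.
            problems.append(f"wrong type for field: {name}")
    extra = sorted(set(value) - set(schema))
    problems.extend(f"unexpected field: {name}" for name in extra)
    return problems
```

Rejecting unexpected fields keeps the contract narrow, in line with the stable-ABI rule in section 9.1.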

Phase 1 backlog

Script persistence model

Scope: DB + filesystem mapping for scripts/*.py.
Acceptance: create/update/delete round-trips both stores.

Main engine + IPC CRUD

Scope: add script engine methods and typed IPC handlers.
Acceptance: renderer can list/read/write scripts through IPC only.

Renderer MVP UI

Scope: scripts list, editor panel, run button, output panel integration.
Acceptance: user edits script, runs it, sees stdout/errors.

Rebuild/meta-diff integration

Scope: include scripts in existing rebuild and metadata diff flow.
Acceptance: external file changes in scripts/ are detected and synchronized.

Phase 2 backlog

Macro script typing + mapping

Scope: kind metadata and mapping from macro token to script id.
Acceptance: at least one macro resolved to Python script.

Runtime cache path

Scope: load/compile once; callable reuse.
Acceptance: repeated macro invocations avoid re-init/re-import.

Context adapter

Scope: convert existing macro params into ABI v1 context once per invocation.
Acceptance: typed values obey conversion rules.

Perf counters

Scope: call count, p50/p95, timeout/error counts.
Acceptance: counters visible in logs/diagnostics.

Phase 3 backlog

Built-in macro parity migration

Scope: port each macro to Python scripts and add parity tests.
Acceptance: output parity with legacy macros for baseline fixtures.

Default script seeding/versioning

Scope: bundle defaults, seed missing scripts on init.
Acceptance: clean project bootstraps required macro scripts automatically.

Legacy path removal

Scope: remove JS macro implementations after parity gate.
Acceptance: tests pass with Python-only macro path.

Phase 4–6 backlog

Keep as optimization/integration tracks only after parity and stability gates pass.

9.5 Anti-Patterns for Agents (Do Not Do)

Do not call JS functions per token/item from Python in hot paths.
Do not pass large proxy objects through the bridge in render loops.
Do not introduce direct filesystem/network access in Python runtime.
Do not couple UI/editor work with macro migration in one PR-sized change.
Do not remove legacy macro code before golden parity tests pass.

9.6 Handoff Checklist (Agent to Agent)

Every handoff should include:

Completed task cards and remaining task cards.
Files changed and rationale.
Test/build command outputs summary.
Known risks and benchmark deltas.
Any ABI changes (must be explicit).

9.7 Suggested PR Boundaries (One Task, One PR)

Use small PRs with one primary purpose each.

PR-00: Pyodide bootstrap spike

Includes: dependency, worker init, ready signal.
Excludes: script persistence, UI/editor.
Merge gate: runtime initializes and tests/build pass.

PR-01: Worker run protocol + timeout recovery

Includes: run/stdout/error/timeout messaging, watchdog + restart behavior.
Excludes: macro integration.
Merge gate: timeout test and recovery test pass.

PR-02: ABI v1 types + schema validation

Includes: shared types and validation for context/result.
Excludes: macro migration.
Merge gate: caller and worker both use ABI validators.

PR-03: Script persistence model

Includes: DB + filesystem model for scripts/*.py.
Excludes: renderer UI.
Merge gate: round-trip persistence tests pass.

PR-04: Script engine + IPC CRUD

Includes: main/engine methods and typed ipc handlers.
Excludes: macro runtime.
Merge gate: IPC integration tests pass.

PR-05: Renderer MVP scripts UI

Includes: scripts list/editor/run/output integration.
Excludes: macro substitution.
Merge gate: end-to-end manual run path works + tests/build pass.

PR-06: Rebuild/meta-diff integration

Includes: include scripts/ in rebuild and metadata diff paths.
Excludes: macro migration.
Merge gate: external script file changes are detected and synchronized.

PR-07: Macro mapping + runtime cache foundation

Includes: macro-to-script mapping, callable cache, first Python-backed macro.
Excludes: full macro parity.
Merge gate: at least one macro parity fixture passes.

PR-08: Macro parity migration batch A

Includes: port a small set of built-in macros (e.g., 2–3) + golden tests.
Excludes: removal of legacy path.
Merge gate: parity fixtures pass for migrated macros.

PR-09: Macro parity migration batch B (repeat as needed)

Includes: additional macro ports + fixtures.
Excludes: removal of legacy path.
Merge gate: all targeted macro parity tests pass.

PR-10: Default script seeding/versioning

Includes: bundled default scripts + startup seeding behavior.
Excludes: advanced scripting APIs.
Merge gate: clean project gets default scripts deterministically.

PR-11: Legacy JS macro path removal

Includes: delete legacy macro implementations after full parity.
Excludes: bookmarklet transforms.
Merge gate: full test suite and render parity suite pass.

PR-12: Performance hardening

Includes: benchmark harness refinements, caching improvements, optional batch APIs.
Excludes: unrelated UI changes.
Merge gate: regression thresholds (p50/p95) stay within agreed envelope.

PR-13: Bookmarklet transform integration

Includes: transform script type, pipeline hook, validation/fallback.
Excludes: optional advanced tool APIs.
Merge gate: sanitized input/output transform tests pass.

PR-14+: Optional advanced capabilities

Includes: allowlisted app tools, AI-assistant script tools, curated package policy.
Merge gate: explicit security review and feature-flag rollout.

10. Current Status

Status: Revised staged plan (MVP-first, full-scope preserved).

Recommended next action:

Approve Phase 0 scope and benchmarks.
Implement spike and record numbers.
Lock ABI before building full UI and migration layers.
