diff --git a/AGUI.md b/AGUI.md new file mode 100644 index 0000000..dccfe41 --- /dev/null +++ b/AGUI.md @@ -0,0 +1,352 @@ +# AGUI Modernization Plan (Protocol-First) + +## Purpose + +This document defines the target architecture and execution plan to evolve the current chat assistant from a retrofit UI renderer into a mature, best-practice, protocol-first Agentic UI system. + +Branch goal: ship a full-fledged assistant that can use all app capabilities, ask clarifying questions through rich UI controls, and reliably present actionable responses in both chat surfaces. + +--- + +## Product Goal for This Branch + +Build a **protocol-first chat assistant** that: + +1. Emits deterministic structured responses and UI controls. +2. Uses rich UI to gather missing inputs (forms, pickers, approvals). +3. Executes app actions safely and reports outcomes back into the reasoning loop. +4. Works identically in editor chat and assistant sidebar. +5. Is observable, testable, and versioned for internal iteration. + +--- + +## Current State (Baseline) + +### Already implemented + +- AGUI schema parsing for canonical `specVersion: "1"` payloads. +- Rich widget rendering (chart, form, input, datePicker, card, image, tabs). +- Shared controls renderer reused by both chat surfaces. +- Text + UI mixed response extraction. +- Action dispatcher and action-result persistence to chat history. +- Metadata propagation (`surface: tab|sidebar`) through renderer → preload → IPC → manager. +- Compatibility normalization for common non-canonical model outputs. + +### Why this is still retrofit + +- Structured output is still inferred from free-form model text. +- UI contract is validated client-side after generation, not guaranteed at generation time. +- No explicit protocol envelope enforced server-side each turn. +- No formal capability handshake/version negotiation. +- No repair/retry orchestration as a first-class protocol step. +- No end-to-end telemetry contract for AGUI reliability metrics. + +--- + +## Missing Elements to Reach Best Practice + +## 1) Protocol Contract Maturity + +- Missing: strict response envelope (`assistant_text`, `ui`, `intent`, `confidence`, `needs_input`). +- Missing: request envelope (`messages`, `context`, `capabilities`, `surface`, `protocol_version`). +- Missing: version negotiation and deprecation strategy. +- Missing: canonical machine-readable error model for invalid AGUI payloads. + +## 2) Model Interaction Strategy + +- Missing: hard structured-output mode at provider call boundary. +- Missing: deterministic prompt templates per intent class (analysis, edit, workflow). +- Missing: first-class clarification mode (`needs_input`) with required fields contract. +- Missing: server-side repair loop when response is invalid or incomplete. + +## 3) Capability Discovery / Negotiation + +- Missing: runtime capability registry sent to model each turn. +- Missing: widget/action availability by surface and app state. +- Missing: capability flags for disabled features and permission gates. + +## 4) Agent Workflow Engine + +- Missing: protocol-level finite-state turn machine (Plan → Ask → Execute → Observe → Continue). +- Missing: action dependency graph for multi-step guided flows. +- Missing: state checkpointing for resumable workflows. + +## 5) UI Runtime and Action Safety + +- Missing: explicit action confirmation policy levels (`silent`, `confirm`, `danger`). +- Missing: standardized form validation and inline error schema. +- Missing: universal fallback component for unsupported widgets. +- Missing: deterministic conflict handling for stale context. + +## 6) Observability and Reliability + +- Missing: telemetry for parse success, fallback rate, repair rate, action success. +- Missing: protocol health dashboard and SLOs. +- Missing: reproducible turn traces for debugging and regression analysis. + +## 7) Test Architecture + +- Missing: protocol conformance suite (golden request/response cases). +- Missing: end-to-end AGUI scenario tests (clarify + execute + reflect). +- Missing: fuzz tests for malformed payload handling. +- Missing: migration tests for protocol version compatibility. + +## 8) Governance and Docs + +- Missing: authoritative AGUI protocol spec doc with examples. +- Missing: widget/action compatibility matrix by version. +- Missing: internal governance for protocol changes and ownership boundaries. + +--- + +## Target Architecture (Protocol-First) + +## A. Core Components + +1. **AGUI Protocol Layer (Main Process)** + - Owns request/response envelopes, validation, normalization, repair loop. + +2. **Capability Registry Service (Main Process)** + - Publishes current widgets/actions/tools/features by surface and context. + +3. **Agent Orchestrator (Main Process)** + - Executes turn state machine and mediates tool calls + UI requests. + +4. **Action Runtime (Renderer + Main IPC)** + - Executes declared actions with policy checks and structured result events. + +5. **AGUI Renderer (Renderer Shared)** + - Renders protocol `ui` payloads with strict schema and graceful fallbacks. + +6. **Observability Pipeline (Main + Renderer)** + - Emits protocol metrics and trace IDs for each turn. + +## B. Canonical Envelope (v2 proposal) + +```json +{ + "protocolVersion": "2.0", + "assistantText": "...", + "ui": { + "specVersion": "1", + "elements": [] + }, + "intent": "analyze|ask_input|propose_action|execute_action|summarize", + "needsInput": { + "required": false, + "fields": [] + }, + "actions": [], + "confidence": 0.0, + "traceId": "..." +} +``` + +Rules: + +- `assistantText` is always present (possibly empty string). +- `ui` is optional but schema-valid when present. +- `needsInput.required=true` requires at least one field in `needsInput.fields`. +- `actions` are declarative, validated against capability registry. +- Unknown properties are rejected in strict mode. + +--- + +## Implementation Plan (Phased) + +## Phase 0 — Foundation Contracts (Required First) + +### Scope + +- Add AGUI protocol specification section in repo docs. +- Introduce canonical request/response TypeScript contracts. +- Add protocol validator module in main process. + +### Deliverables + +- `src/main/agentic/protocol/types.ts` +- `src/main/agentic/protocol/validator.ts` +- `AGUI.md` + protocol examples + error codes + +### Acceptance Criteria + +- All responses crossing IPC can be validated by `protocolVersion`. +- Invalid payloads produce structured protocol errors, never silent drops. + +## Phase 1 — Server-Side Enforcement + Repair Loop + +### Scope + +- Move UI parsing/normalization from renderer-first to main-first. +- Implement deterministic repair retry when response violates contract. +- Remove legacy response handling once envelope enforcement is in place. + +### Deliverables + +- `ProtocolResponseBuilder` in main process. +- `repairAttempt` policy with max retry count + fallback response. +- Trace IDs propagated to renderer. + +### Acceptance Criteria + +- 95%+ UI-intent prompts return valid envelope without client fallback. +- No renderer-side protocol normalization required for valid turns. + +## Phase 2 — Capability Registry + Negotiation + +### Scope + +- Add runtime registry of widgets/actions/tools per surface/context. +- Inject capability snapshot into every model request. +- Validate action and widget availability before returning to renderer. + +### Deliverables + +- `src/main/agentic/capabilities/registry.ts` +- Capability snapshot in request envelope. +- Contract tests for per-surface differences. + +### Acceptance Criteria + +- Model never receives unsupported widget/action lists. +- Unsupported action attempts are blocked pre-render with clear diagnostics. + +## Phase 3 — Workflow Engine and Clarification UX + +### Scope + +- Implement turn state machine with explicit `needsInput` handling. +- Add reusable clarification controls (forms/selects/date/radio). +- Persist workflow state for resumable conversations. + +### Deliverables + +- `AgentTurnStateMachine` module. +- Clarification form primitives + validation schema. +- Workflow checkpoint storage. + +### Acceptance Criteria + +- Assistant can pause for missing data and resume execution deterministically. +- Multi-step tasks survive app refresh/reopen. + +## Phase 4 — Action Policy and Safety + +### Scope + +- Add action policy levels and confirmation requirements. +- Add preconditions and postconditions for critical actions. +- Add structured rollback hints for reversible actions. + +### Deliverables + +- Action policy map. +- Confirmation UI flow integrated into AGUI actions. +- Action audit log entries with trace IDs. + +### Acceptance Criteria + +- Dangerous actions always require explicit confirmation. +- Every executed action emits structured success/failure payload. + +## Phase 5 — Observability and QA + +### Scope + +- Instrument protocol metrics and error taxonomy. +- Build conformance + E2E AGUI test suites. +- Add internal test gates that block merges on protocol drift. + +### Deliverables + +- Protocol metrics dashboard. +- Golden test fixtures for representative workflows. +- CI quality gates for protocol conformance and AGUI scenarios. + +### Acceptance Criteria + +- Defined SLOs met for parse validity, action success, and fallback rate. +- Regression suite blocks merges on protocol drift. + +--- + +## Architectural Rework Tasks (Concrete Backlog) + +## Main Process + +- Create `agentic/` domain package with protocol/orchestrator/capabilities. +- Refactor `OpenCodeManager` response handling into envelope builder. +- Add repair retry policy for chart/control requests. +- Add `traceId` generation and propagation. + +## IPC / Shared Contracts + +- Version all chat IPC payloads with `protocolVersion`. +- Replace raw string assistant response return with envelope return. +- Remove or rewrite legacy IPC methods that cannot satisfy envelope guarantees. + +## Renderer + +- Consume envelope (`assistantText` + `ui`) directly. +- Remove renderer responsibility for protocol normalization over time. +- Keep widget renderer pure and stateless by schema. + +## Testing + +- Add protocol conformance tests for all envelope fields. +- Add model-output compatibility fixtures and repair-path tests. +- Add scenario tests: “ask chart” → chart render → user input → action execute. + +## Documentation + +- Add AGUI protocol appendix with canonical and invalid examples. +- Add migration guide from legacy message parsing to v2 envelope. + +--- + +## Success Metrics (Definition of Done) + +The branch is complete when: + +1. **Protocol reliability** + - ≥ 98% of AGUI-intent turns produce valid envelope without renderer fallback. + +2. **UI execution reliability** + - ≥ 95% of emitted actions execute successfully or fail with structured actionable error. + +3. **Clarification quality** + - Missing-input tasks use `needsInput` controls instead of textual back-and-forth in ≥ 90% of cases. + +4. **Cross-surface parity** + - Same AGUI payload renders and behaves equivalently in chat tab and sidebar. + +5. **Governance and maintainability** + - Protocol conformance suite and migration tests are mandatory in CI. + +--- + +## Risks and Mitigations + +- **Risk:** Provider inconsistency in structured outputs. + - **Mitigation:** enforce server-side validation + repair loop + fallback envelope. + +- **Risk:** Action safety regressions. + - **Mitigation:** policy levels, confirmations, audit logs, blocked dangerous defaults. + +- **Risk:** Protocol churn breaks compatibility. + - **Mitigation:** strict versioned envelopes, migration tests, and single-source protocol ownership. + +- **Risk:** Feature complexity growth. + - **Mitigation:** domain separation (`protocol`, `orchestrator`, `renderer`), clear ownership boundaries. + +--- + +## Recommended Execution Order for This Branch + +1. Phase 0 + Phase 1 (contract + enforcement + repair) +2. Phase 2 (capability negotiation) +3. Phase 3 (clarification/workflow engine) +4. Phase 4 (safety policy) +5. Phase 5 (observability + QA) + +This order converts the current retrofit into a stable protocol platform first, then builds mature agentic behavior on top.