wip: desparate models fucking around
This commit is contained in:
434
A2UI.md
Normal file
434
A2UI.md
Normal file
@@ -0,0 +1,434 @@
|
||||
# A2UI Modernization Plan (Protocol-First)
|
||||
|
||||
## Purpose
|
||||
|
||||
This document defines the target architecture and execution plan to evolve the current chat assistant from a retrofit UI renderer into a mature, best-practice, protocol-first Agentic UI system.
|
||||
|
||||
Branch goal: ship a full-fledged assistant that can use all app capabilities, ask clarifying questions through rich UI controls, and reliably present actionable responses in both chat surfaces.
|
||||
|
||||
---
|
||||
|
||||
## Product Goal for This Branch
|
||||
|
||||
Build a **protocol-first chat assistant** that:
|
||||
|
||||
1. Emits deterministic structured responses and UI controls.
|
||||
2. Uses rich UI to gather missing inputs (forms, pickers, approvals).
|
||||
3. Executes app actions safely and reports outcomes back into the reasoning loop.
|
||||
4. Works identically in editor chat and assistant sidebar.
|
||||
5. Is observable, testable, and versioned for internal iteration.
|
||||
|
||||
## Implementation Status (This Branch)
|
||||
|
||||
- ✅ Phase 0 — Foundation Contracts (**completed 2026-02-25**)
|
||||
- ✅ Phase 1 — Server-Side Enforcement + Repair Loop (**completed 2026-02-25**)
|
||||
- ✅ Phase 2 — Capability Registry + Negotiation (**completed 2026-02-25**)
|
||||
- ✅ Phase 3 — Workflow Engine and Clarification UX (**completed 2026-02-25**)
|
||||
- ✅ Phase 4 — Action Policy and Safety (**completed 2026-02-25**)
|
||||
- ✅ Phase 5 — Observability and QA (**completed 2026-02-25**)
|
||||
|
||||
### Implemented Artifacts
|
||||
|
||||
- `src/main/agentic/protocol/types.ts`
|
||||
- `src/main/agentic/protocol/errors.ts`
|
||||
- `src/main/agentic/protocol/validator.ts`
|
||||
- `src/main/agentic/protocol/uiSchema.ts`
|
||||
- `src/main/agentic/protocol/uiSpecParser.ts`
|
||||
- `src/main/agentic/protocol/responseBuilder.ts`
|
||||
- `src/main/agentic/capabilities/registry.ts`
|
||||
- `src/main/agentic/workflow/turnStateMachine.ts`
|
||||
- `src/main/agentic/workflow/checkpointStore.ts`
|
||||
- `src/main/agentic/policy/actionPolicy.ts`
|
||||
- `src/main/agentic/observability/protocolTelemetry.ts`
|
||||
|
||||
### Branch Completion Notes
|
||||
|
||||
- Main-process protocol validation/envelope building now runs server-side before renderer consumption.
|
||||
- Renderer consumes protocol envelope directly; legacy text parsing remains as compatibility fallback only.
|
||||
- Capability snapshots are injected into each model request and unsupported widgets/actions are filtered with diagnostics.
|
||||
- Workflow checkpoints are persisted per conversation for resumable needs-input turns.
|
||||
- Action policy levels (`silent`, `confirm`, `danger`) are enforced at render-time before dispatch.
|
||||
- Protocol telemetry is emitted and available via `chat:getProtocolHealth` IPC.
|
||||
|
||||
---
|
||||
|
||||
## Current State (Baseline)
|
||||
|
||||
### Already implemented
|
||||
|
||||
- A2UI schema parsing for canonical `specVersion: "1"` payloads.
|
||||
- Rich widget rendering (chart, form, input, datePicker, card, image, tabs).
|
||||
- Shared controls renderer reused by both chat surfaces.
|
||||
- Text + UI mixed response extraction.
|
||||
- Action dispatcher and action-result persistence to chat history.
|
||||
- Metadata propagation (`surface: tab|sidebar`) through renderer → preload → IPC → manager.
|
||||
- Compatibility normalization for common non-canonical model outputs.
|
||||
|
||||
### Why this is still retrofit
|
||||
|
||||
- Structured output is still inferred from free-form model text.
|
||||
- UI contract is validated client-side after generation, not guaranteed at generation time.
|
||||
- No explicit protocol envelope enforced server-side each turn.
|
||||
- No formal capability handshake/version negotiation.
|
||||
- No repair/retry orchestration as a first-class protocol step.
|
||||
- No end-to-end telemetry contract for A2UI reliability metrics.
|
||||
|
||||
---
|
||||
|
||||
## Missing Elements to Reach Best Practice
|
||||
|
||||
## 1) Protocol Contract Maturity
|
||||
|
||||
- Missing: strict response envelope (`assistant_text`, `ui`, `intent`, `confidence`, `needs_input`).
|
||||
- Missing: request envelope (`messages`, `context`, `capabilities`, `surface`, `protocol_version`).
|
||||
- Missing: version negotiation and deprecation strategy.
|
||||
- Missing: canonical machine-readable error model for invalid A2UI payloads.
|
||||
|
||||
## 2) Model Interaction Strategy
|
||||
|
||||
- Missing: hard structured-output mode at provider call boundary.
|
||||
- Missing: deterministic prompt templates per intent class (analysis, edit, workflow).
|
||||
- Missing: first-class clarification mode (`needs_input`) with required fields contract.
|
||||
- Missing: server-side repair loop when response is invalid or incomplete.
|
||||
|
||||
## 3) Capability Discovery / Negotiation
|
||||
|
||||
- Missing: runtime capability registry sent to model each turn.
|
||||
- Missing: widget/action availability by surface and app state.
|
||||
- Missing: capability flags for disabled features and permission gates.
|
||||
|
||||
## 4) Agent Workflow Engine
|
||||
|
||||
- Missing: protocol-level finite-state turn machine (Plan → Ask → Execute → Observe → Continue).
|
||||
- Missing: action dependency graph for multi-step guided flows.
|
||||
- Missing: state checkpointing for resumable workflows.
|
||||
|
||||
## 5) UI Runtime and Action Safety
|
||||
|
||||
- Missing: explicit action confirmation policy levels (`silent`, `confirm`, `danger`).
|
||||
- Missing: standardized form validation and inline error schema.
|
||||
- Missing: universal fallback component for unsupported widgets.
|
||||
- Missing: deterministic conflict handling for stale context.
|
||||
|
||||
## 6) Observability and Reliability
|
||||
|
||||
- Missing: telemetry for parse success, fallback rate, repair rate, action success.
|
||||
- Missing: protocol health dashboard and SLOs.
|
||||
- Missing: reproducible turn traces for debugging and regression analysis.
|
||||
|
||||
## 7) Test Architecture
|
||||
|
||||
- Missing: protocol conformance suite (golden request/response cases).
|
||||
- Missing: end-to-end A2UI scenario tests (clarify + execute + reflect).
|
||||
- Missing: fuzz tests for malformed payload handling.
|
||||
- Missing: migration tests for protocol version compatibility.
|
||||
|
||||
## 8) Governance and Docs
|
||||
|
||||
- Missing: authoritative A2UI protocol spec doc with examples.
|
||||
- Missing: widget/action compatibility matrix by version.
|
||||
- Missing: internal governance for protocol changes and ownership boundaries.
|
||||
|
||||
---
|
||||
|
||||
## Target Architecture (Protocol-First)
|
||||
|
||||
## A. Core Components
|
||||
|
||||
1. **A2UI Protocol Layer (Main Process)**
|
||||
- Owns request/response envelopes, validation, normalization, repair loop.
|
||||
|
||||
2. **Capability Registry Service (Main Process)**
|
||||
- Publishes current widgets/actions/tools/features by surface and context.
|
||||
|
||||
3. **Agent Orchestrator (Main Process)**
|
||||
- Executes turn state machine and mediates tool calls + UI requests.
|
||||
|
||||
4. **Action Runtime (Renderer + Main IPC)**
|
||||
- Executes declared actions with policy checks and structured result events.
|
||||
|
||||
5. **A2UI Renderer (Renderer Shared)**
|
||||
- Renders protocol `ui` payloads with strict schema and graceful fallbacks.
|
||||
|
||||
6. **Observability Pipeline (Main + Renderer)**
|
||||
- Emits protocol metrics and trace IDs for each turn.
|
||||
|
||||
## B. Canonical Envelope (v2 proposal)
|
||||
|
||||
```json
|
||||
{
|
||||
"protocolVersion": "2.0",
|
||||
"assistantText": "...",
|
||||
"ui": {
|
||||
"specVersion": "1",
|
||||
"elements": []
|
||||
},
|
||||
"intent": "analyze|ask_input|propose_action|execute_action|summarize",
|
||||
"needsInput": {
|
||||
"required": false,
|
||||
"fields": []
|
||||
},
|
||||
"actions": [],
|
||||
"confidence": 0.0,
|
||||
"traceId": "..."
|
||||
}
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- `assistantText` is always present (possibly empty string).
|
||||
- `ui` is optional but schema-valid when present.
|
||||
- `needsInput.required=true` requires at least one field in `needsInput.fields`.
|
||||
- `actions` are declarative, validated against capability registry.
|
||||
- Unknown properties are rejected in strict mode.
|
||||
|
||||
### Canonical Request Envelope Example
|
||||
|
||||
```json
|
||||
{
|
||||
"protocolVersion": "2.0",
|
||||
"surface": "tab",
|
||||
"messages": [
|
||||
{ "role": "user", "content": "Show posting trend by month" }
|
||||
],
|
||||
"context": {
|
||||
"projectId": "project-1"
|
||||
},
|
||||
"capabilities": {
|
||||
"widgets": ["chart", "form", "tabs"],
|
||||
"actions": ["openPost", "openSettings"],
|
||||
"tools": ["search_posts", "list_posts"],
|
||||
"disabled": []
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Invalid Envelope Example (strict mode)
|
||||
|
||||
```json
|
||||
{
|
||||
"protocolVersion": "2.0",
|
||||
"assistantText": "Please provide missing fields",
|
||||
"intent": "ask_input",
|
||||
"needsInput": {
|
||||
"required": true,
|
||||
"fields": []
|
||||
},
|
||||
"actions": [],
|
||||
"confidence": 0.9,
|
||||
"traceId": "trace-123",
|
||||
"extra": "not-allowed"
|
||||
}
|
||||
```
|
||||
|
||||
Reason invalid:
|
||||
|
||||
- `needsInput.required=true` but `needsInput.fields` is empty.
|
||||
- `extra` is an unknown property in strict mode.
|
||||
|
||||
### Protocol Error Codes
|
||||
|
||||
- `A2UI_PROTOCOL_VALIDATION_ERROR`
|
||||
- Emitted for request/response envelope validation failures.
|
||||
- Includes human-readable `message` and per-field `details`.
|
||||
|
||||
---
|
||||
|
||||
## Implementation Plan (Phased)
|
||||
|
||||
## Phase 0 — Foundation Contracts (Required First)
|
||||
|
||||
### Scope
|
||||
|
||||
- Add A2UI protocol specification section in repo docs.
|
||||
- Introduce canonical request/response TypeScript contracts.
|
||||
- Add protocol validator module in main process.
|
||||
|
||||
### Deliverables
|
||||
|
||||
- `src/main/agentic/protocol/types.ts`
|
||||
- `src/main/agentic/protocol/validator.ts`
|
||||
- `A2UI.md` + protocol examples + error codes
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- All responses crossing IPC can be validated by `protocolVersion`.
|
||||
- Invalid payloads produce structured protocol errors, never silent drops.
|
||||
|
||||
## Phase 1 — Server-Side Enforcement + Repair Loop
|
||||
|
||||
### Scope
|
||||
|
||||
- Move UI parsing/normalization from renderer-first to main-first.
|
||||
- Implement deterministic repair retry when response violates contract.
|
||||
- Remove legacy response handling once envelope enforcement is in place.
|
||||
|
||||
### Deliverables
|
||||
|
||||
- `ProtocolResponseBuilder` in main process.
|
||||
- `repairAttempt` policy with max retry count + fallback response.
|
||||
- Trace IDs propagated to renderer.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- 95%+ UI-intent prompts return valid envelope without client fallback.
|
||||
- No renderer-side protocol normalization required for valid turns.
|
||||
|
||||
## Phase 2 — Capability Registry + Negotiation
|
||||
|
||||
### Scope
|
||||
|
||||
- Add runtime registry of widgets/actions/tools per surface/context.
|
||||
- Inject capability snapshot into every model request.
|
||||
- Validate action and widget availability before returning to renderer.
|
||||
|
||||
### Deliverables
|
||||
|
||||
- `src/main/agentic/capabilities/registry.ts`
|
||||
- Capability snapshot in request envelope.
|
||||
- Contract tests for per-surface differences.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Model never receives unsupported widget/action lists.
|
||||
- Unsupported action attempts are blocked pre-render with clear diagnostics.
|
||||
|
||||
## Phase 3 — Workflow Engine and Clarification UX
|
||||
|
||||
### Scope
|
||||
|
||||
- Implement turn state machine with explicit `needsInput` handling.
|
||||
- Add reusable clarification controls (forms/selects/date/radio).
|
||||
- Persist workflow state for resumable conversations.
|
||||
|
||||
### Deliverables
|
||||
|
||||
- `AgentTurnStateMachine` module.
|
||||
- Clarification form primitives + validation schema.
|
||||
- Workflow checkpoint storage.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Assistant can pause for missing data and resume execution deterministically.
|
||||
- Multi-step tasks survive app refresh/reopen.
|
||||
|
||||
## Phase 4 — Action Policy and Safety
|
||||
|
||||
### Scope
|
||||
|
||||
- Add action policy levels and confirmation requirements.
|
||||
- Add preconditions and postconditions for critical actions.
|
||||
- Add structured rollback hints for reversible actions.
|
||||
|
||||
### Deliverables
|
||||
|
||||
- Action policy map.
|
||||
- Confirmation UI flow integrated into A2UI actions.
|
||||
- Action audit log entries with trace IDs.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Dangerous actions always require explicit confirmation.
|
||||
- Every executed action emits structured success/failure payload.
|
||||
|
||||
## Phase 5 — Observability and QA
|
||||
|
||||
### Scope
|
||||
|
||||
- Instrument protocol metrics and error taxonomy.
|
||||
- Build conformance + E2E A2UI test suites.
|
||||
- Add internal test gates that block merges on protocol drift.
|
||||
|
||||
### Deliverables
|
||||
|
||||
- Protocol metrics dashboard.
|
||||
- Golden test fixtures for representative workflows.
|
||||
- CI quality gates for protocol conformance and A2UI scenarios.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Defined SLOs met for parse validity, action success, and fallback rate.
|
||||
- Regression suite blocks merges on protocol drift.
|
||||
|
||||
---
|
||||
|
||||
## Architectural Rework Tasks (Concrete Backlog)
|
||||
|
||||
## Main Process
|
||||
|
||||
- Create `agentic/` domain package with protocol/orchestrator/capabilities.
|
||||
- Refactor `OpenCodeManager` response handling into envelope builder.
|
||||
- Add repair retry policy for chart/control requests.
|
||||
- Add `traceId` generation and propagation.
|
||||
|
||||
## IPC / Shared Contracts
|
||||
|
||||
- Version all chat IPC payloads with `protocolVersion`.
|
||||
- Replace raw string assistant response return with envelope return.
|
||||
- Remove or rewrite legacy IPC methods that cannot satisfy envelope guarantees.
|
||||
|
||||
## Renderer
|
||||
|
||||
- Consume envelope (`assistantText` + `ui`) directly.
|
||||
- Remove renderer responsibility for protocol normalization over time.
|
||||
- Keep widget renderer pure and stateless by schema.
|
||||
|
||||
## Testing
|
||||
|
||||
- Add protocol conformance tests for all envelope fields.
|
||||
- Add model-output compatibility fixtures and repair-path tests.
|
||||
- Add scenario tests: “ask chart” → chart render → user input → action execute.
|
||||
|
||||
## Documentation
|
||||
|
||||
- Add A2UI protocol appendix with canonical and invalid examples.
|
||||
- Add migration guide from legacy message parsing to v2 envelope.
|
||||
|
||||
---
|
||||
|
||||
## Success Metrics (Definition of Done)
|
||||
|
||||
The branch is complete when:
|
||||
|
||||
1. **Protocol reliability**
|
||||
- ≥ 98% of A2UI-intent turns produce valid envelope without renderer fallback.
|
||||
|
||||
2. **UI execution reliability**
|
||||
- ≥ 95% of emitted actions execute successfully or fail with structured actionable error.
|
||||
|
||||
3. **Clarification quality**
|
||||
- Missing-input tasks use `needsInput` controls instead of textual back-and-forth in ≥ 90% of cases.
|
||||
|
||||
4. **Cross-surface parity**
|
||||
- Same A2UI payload renders and behaves equivalently in chat tab and sidebar.
|
||||
|
||||
5. **Governance and maintainability**
|
||||
- Protocol conformance suite and migration tests are mandatory in CI.
|
||||
|
||||
---
|
||||
|
||||
## Risks and Mitigations
|
||||
|
||||
- **Risk:** Provider inconsistency in structured outputs.
|
||||
- **Mitigation:** enforce server-side validation + repair loop + fallback envelope.
|
||||
|
||||
- **Risk:** Action safety regressions.
|
||||
- **Mitigation:** policy levels, confirmations, audit logs, blocked dangerous defaults.
|
||||
|
||||
- **Risk:** Protocol churn breaks compatibility.
|
||||
- **Mitigation:** strict versioned envelopes, migration tests, and single-source protocol ownership.
|
||||
|
||||
- **Risk:** Feature complexity growth.
|
||||
- **Mitigation:** domain separation (`protocol`, `orchestrator`, `renderer`), clear ownership boundaries.
|
||||
|
||||
---
|
||||
|
||||
## Recommended Execution Order for This Branch
|
||||
|
||||
1. Phase 0 + Phase 1 (contract + enforcement + repair)
|
||||
2. Phase 2 (capability negotiation)
|
||||
3. Phase 3 (clarification/workflow engine)
|
||||
4. Phase 4 (safety policy)
|
||||
5. Phase 5 (observability + QA)
|
||||
|
||||
This order converts the current retrofit into a stable protocol platform first, then builds mature agentic behavior on top.
|
||||
Reference in New Issue
Block a user