chore: analyse specs against code

chore: remove redundant export-only tests, add test audit procedure
Deleted chat_editor_test.exs and import_editor_test.exs, which only checked function_exported?/Code.ensure_loaded? without exercising any behavior. Both components are already tested via LiveView rendering in shell_live_test.exs and import_shell_live_test.exs respectively.

Added TESTAUDIT.md documenting the procedure for periodic test suite audits to catch non-behavioral tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-11 11:56:34 +02:00 · 2026-05-11 10:35:24 +02:00
5 changed files with 474 additions and 20 deletions
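
Reviewer note: the export-only pattern this commit removes can be located mechanically before a manual audit. The sketch below is not part of the commit; the function name `find_export_only_candidates` is hypothetical, and it assumes a `grep` that supports `--include` (GNU and BSD grep both do). A hit is only a candidate, because a file may call `function_exported?` and still exercise real behavior elsewhere.

```shell
# Hypothetical helper (not part of this commit): list test files that
# mention export/load checks so they can be reviewed by hand.
find_export_only_candidates() {
  # Directory to scan; defaults to the project's test/ directory.
  local test_dir="${1:-test}"
  # -r: recurse, -l: print file names only, -E: extended regex.
  # --include limits the scan to ExUnit test files.
  grep -rlE --include='*_test.exs' \
    'function_exported\?|Code\.ensure_loaded\?' "$test_dir" | sort
}
```

Running `find_export_only_candidates test` from the project root prints one path per line; each listed file still needs to be read to confirm that no production function is actually invoked.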
--- a/SPECAUDIT.md
+++ b/SPECAUDIT.md
@@ -0,0 +1,191 @@
+# Spec Audit Process
+
+This document describes the repeatable process for auditing the Allium specifications against the bDS2 codebase and test suite. Run it whenever specs or code change materially.
+
+## Overview
+
+The audit produces three categories of findings:
+
+1. **Spec-claims-not-in-code** — spec describes behavior the code does not implement
+2. **Code-not-in-spec** — code implements behavior the spec does not describe
+3. **Spec-claims-not-in-tests** — spec invariants/rules/behaviors lack test coverage
+
+## Step 1: Map the Territory
+
+```bash
+# List all spec files
+ls specs/*.allium
+
+# List all source modules
+find lib/bds -type f -name '*.ex'
+
+# List all test files
+find test/bds -type f -name '*_test.exs'
+```
+
+Record the mapping between specs and code/test files. Use `specs/bds.allium` as the index — it lists every `use` directive with its domain label.
+
+## Step 2: Extract Spec Claims
+
+For each `.allium` file, extract:
+
+| Claim Type | Pattern | Example |
+|---|---|---|
+| **invariant** | `invariant Name:` or lines describing always-true properties | `UniqueSlugPerProject: slugs unique within project` |
+| **rule** | `rule Name { requires: ... ensures: ... }` | `CreatePost: creates with slug, status=draft` |
+| **guarantee** | `guarantee Name:` | `SandboxedExecution: no filesystem/process loading` |
+| **config** | `config { key = value }` | `macro_timeout = 10.seconds` |
+| **behavior** | Explicit claims in comments or entity descriptions | `"HomeAlwaysPresent: menu always has Home entry"` |
+
+Record the spec file name, claim name, claim type, and line number for each.
+
+## Step 3: Compare Spec Claims Against Code
+
+For each claim, find the corresponding code and verify:
+
+### 3a. Entity/field existence
+- Does the Ecto schema have the fields the spec declares?
+- Are relationships (has_many, belongs_to) present?
+- Are enum/status values complete?
+
+```bash
+# Check schema fields
+grep -n "field :" lib/bds/posts/post.ex
+grep -n "has_many\|belongs_to" lib/bds/posts/post.ex
+```
+
+### 3b. Rule implementation
+- Does the code enforce the `requires` preconditions?
+- Does the code produce the `ensures` postconditions?
+- Are side-effects (FTS, embeddings, file writes) triggered?
+
+```bash
+# Check function implementation
+grep -n "def create_post" lib/bds/posts.ex
+grep -n "def publish_post" lib/bds/posts.ex
+```
+
+### 3c. Invariant enforcement
+- Are constraints enforced at the schema level (unique_index, check_constraint)?
+- Are constraints enforced in changeset validations?
+- Are constraints enforced in business logic?
+
+```bash
+# Check database constraints
+grep -n "unique_index\|constraint" priv/repo/migrations/*.exs
+grep -n "unique_constraint\|validate_" lib/bds/posts/post.ex
+```
+
+### 3d. File format compliance
+- Does the serialization format match the spec's frontmatter values?
+- Are conditional fields omitted when falsy?
+- Are required fields always present?
+
+```bash
+# Check serialization
+grep -n "serialize\|write_file\|Frontmatter" lib/bds/frontmatter.ex lib/bds/posts/file_sync.ex
+```
+
+## Step 4: Compare Code Against Spec Claims
+
+Search for code that implements behavior NOT described in any spec:
+
+### 4a. Public API functions not in any spec rule
+```bash
+# List public functions in a module
+grep -n "def " lib/bds/posts.ex | grep -v "defp"
+```
+
+### 4b. Schema fields not in any spec entity
+```bash
+# List all fields
+grep -n "field :" lib/bds/posts/post.ex
+```
+
+### 4c. Side effects not in engine_side_effects.allium
+```bash
+# Check what happens after CRUD operations
+grep -n "sync_post\|sync_media\|Search\.\|Embeddings\.\|AutoTranslation" lib/bds/posts.ex lib/bds/media.ex
+```
+
+### 4d. UI features not in any editor spec
+```bash
+# Check HEEx templates for UI elements
+grep -n "phx-click\|data-phx-" lib/bds/desktop/post_editor_html/post_editor.html.heex
+```
+
+## Step 5: Compare Spec Claims Against Tests
+
+For each invariant, rule, and guarantee, search for a test that verifies it:
+
+### 5a. Direct test search
+```bash
+# Search test names and bodies
+grep -rn "test \"" test/bds/posts_test.exs | head -30
+grep -rn "test \"" test/bds/media_test.exs | head -30
+```
+
+### 5b. Invariant coverage check
+For each invariant, determine:
+- **YES**: Test explicitly verifies the invariant (creates violation, expects rejection)
+- **PARTIAL**: Test verifies the happy path but not violation scenarios
+- **NO**: No test exists
+
+### 5c. Rule coverage check
+For each rule, determine:
+- **YES**: Test exercises `requires` precondition and `ensures` postcondition
+- **PARTIAL**: Test exercises the happy path but not preconditions or all postconditions
+- **NO**: No test exists
+
+### 5d. Side-effect chain coverage
+For each side-effect rule in `engine_side_effects.allium`, check whether a test verifies ALL `ensures` clauses fire together (not just individually).
+
+## Step 6: Classify Findings
+
+Each gap falls into one of these categories with a recommended action:
+
+| Category | Direction | Action |
+|---|---|---|
+| **Spec correct, code wrong** | Spec → Code | Fix the code |
+| **Code correct, spec drifted** | Code → Spec | Update the spec |
+| **Code behavior, no spec** | Code → Spec | Distill into spec |
+| **Spec claim, no test** | Spec → Test | Write test |
+| **Internal spec inconsistency** | Spec → Spec | Align specs |
+| **Decision needed** | Both | Resolve with stakeholder |
+
+## Step 7: Produce SPECGAPS.md
+
+Consolidate all findings into `SPECGAPS.md` with:
+- Gap ID for tracking
+- Clear description of the gap
+- Which spec file and line
+- Which code file and line
+- Recommended path (fix code / update spec / write test / decide)
+- Priority (HIGH/MEDIUM/LOW)
+
+## Step 8: Validate
+
+After making changes:
+```bash
+# Run full test suite
+mix test
+
+# Run dialyzer
+mix dialyzer
+
+# Validate allium specs (if tool available)
+# Use the allium CLI to validate spec files
+```
+
+## Re-running the Audit
+
+1. Start from Step 2 — re-extract claims from updated specs
+2. Run Steps 3-5 against current code and tests
+3. Compare against previous SPECGAPS.md to identify resolved and new gaps
+4. Update SPECGAPS.md
+
+The audit should be re-run after:
+- Adding new spec files or significant spec changes
+- Adding new features or refactoring code
+- Adding new test files
+- Before any release milestone
--- a/SPECGAPS.md
+++ b/SPECGAPS.md
@@ -0,0 +1,196 @@
+# Spec Gaps — Allium Specs vs Code vs Tests
+
+Gap categories: **SC** = spec correct, fix code | **CS** = code correct, update spec | **ST** = write test | **SD** = decide | **SI** = fix internal spec inconsistency
+
+---
+
+## A. Spec Claims Not Fulfilled by Code
+
+### A1. Code Must Change (spec is normative)
+
+| ID | Gap | Spec | Code | Path |
+|---|---|---|---|---|
+| A1-1 | No `archived→draft` or `archived→published` transition | post.allium:121-122 | No code path to unarchive | Fix code, or restrict the transitions in the spec |
+| A1-2 | `DeletePost` must delete translations + translation files | post.allium:209-212 | `delete_post/1` skips translation cleanup | Fix code: delete PostTranslation rows + files |
+| A1-3 | Publish must delete old file when path changes | engine_side_effects.allium:73-74 | `publish_post` does not delete old file | Fix code: add old file deletion on path change |
+| A1-4 | `doNotTranslate: false` written to frontmatter despite "only when true" | frontmatter.allium:398 | `lib/bds/frontmatter.ex:38-39` writes false | Fix code: omit `doNotTranslate` when false |
+
+### A2. Spec Should Update (code is normative)
+
+| ID | Gap | Spec | Code | Path |
+|---|---|---|---|---|
+| A2-1 | WYSIWYG/visual editor mode (3 modes) | editor_post.allium:159-164 | Only markdown+preview; visual normalizes to markdown | Drop from spec or mark future |
+| A2-2 | Auto-save after 3000ms idle | editor_post.allium:183-188 | No auto-save timer | Drop from spec or mark future |
+| A2-3 | On-demand rendering in preview | preview.allium:53-93 | Static file serving from generated output | Update spec: preview serves pre-generated files |
+| A2-4 | Template/Script are global entities | template.allium, script.allium | Both have `project_id`, per-project uniqueness | Update spec to per-project scoping |
+| A2-5 | TagsFile uses `{tags: [...]}` wrapper | frontmatter.allium:255-273 | Code writes bare array `[...]` | Update spec |
+| A2-6 | Sidecar is "YAML-like, not gray-matter" | frontmatter.allium:174 | Code wraps with `---` delimiters | Update spec to gray-matter style |
+| A2-7 | Translation frontmatter omits status/timestamps | frontmatter.allium:107-117 | Code writes status, createdAt, updatedAt, publishedAt | Update spec to match written fields |
+| A2-8 | Search index has single `stemmed_content` | search.allium:40-54 | FTS5 per-field stemmed columns | Update spec to per-field model |
+| A2-9 | Tag archives are single-page | generation.allium:142-147 | Code paginates | Update spec |
+| A2-10 | Date archives year+month only | generation.allium:151-159 | Code also generates day-level | Update spec |
+| A2-11 | Menu is DB entity | menu.allium:20-26 | Purely file-based OPML, no DB table | Update spec to file-only model |
+| A2-12 | Panel tabs: problems, terminal | layout.allium:235-240 | `[:tasks, :output, :post_links, :git_log]` | Update spec |
+| A2-13 | Template lookup 4 levels (post→tag→category→default) | template_context.allium:267-277 | Only levels 1 and 4 implemented | Drop levels 2-3 or implement |
+| A2-14 | `ValidateLiquid`/`ValidateScript` before publish | template.allium:110, script.allium:165 | No validation gate before publish | Add to code or drop from spec |
+| A2-15 | Graceful shutdown with inflight tracking | preview.allium:47-48 | Kills acceptor, no inflight tracking | Drop from spec |
+| A2-16 | Pagefind as real library | generation.allium:208 | Simplified JSON-based mock | Update spec to mock model |
+| A2-17 | 24 Snowball stemmers all with algorithms | search.allium:26-31 | Only 15/24 have algorithms; 9 pass through unstemmed | Update spec: 15 stemmed + 9 passthrough |
+| A2-18 | Git sidebar: commit input, history, push/pull | sidebar_views.allium | Only "Working tree" item | Mark as partial/TODO in spec |
+| A2-19 | 17 preset colors in tag picker | editor_tags.allium | Native `<input type="color">`, no preset palette | Update spec |
+| A2-20 | Slug timestamp fallback after 999 | post.allium:21 | Unbounded numeric suffix | Update spec or fix code |
+| A2-21 | Thumbnail generation is async | engine_side_effects.allium:117 | Synchronous | Update spec or fix code |
+
+### A3. Decisions Needed
+
+| ID | Gap | Spec | Code | Path |
+|---|---|---|---|---|
+| A3-1 | Template file written on create | engine_side_effects.allium:151-153 | Draft templates have `file_path=""` | Decide: write file on create, or update spec |
+| A3-2 | `provider_package_ref` on AiModel | schema.allium:282 | Not in code | Decide: add field or drop from spec |
+| A3-3 | AiModelModality: :video vs :file/:tool | schema.allium:291 | Code has `:file`, `:tool` instead of `:video` | Decide: which modalities are correct |
+| A3-4 | JSON key convention: snake_case vs camelCase | frontmatter.allium values | Code uses camelCase for all metadata JSON | Decide normative convention |
+
+---
+
+## B. Code Behavior Not in Spec
+
+### B1. Must Add to Spec (domain-level, affects behavior)
+
+| ID | Behavior | Code Location | Path |
+|---|---|---|---|
+| B1-1 | Chat inline surfaces (9 types: card, chart, form, list, metric, mindmap, table, tabs, text/json) | `lib/bds/ui/chat/tool_surfaces.ex:6-15` | Distill into spec |
+| B1-2 | Auto-translation system (AutoTranslation.maybe_schedule, media cascade, batch fill) | `lib/bds/posts/auto_translation.ex` | Distill into spec |
+| B1-3 | 3 extra settings sections (Technology, MCP, Data Maintenance) | `lib/bds/ui/settings_editor/` | Distill into spec |
+| B1-4 | Style/Theme as separate tab (`:style`), not settings section | `lib/bds/ui/style_editor.ex` | Distill into spec |
+| B1-5 | `published_*` snapshot fields on Post for diffing | `lib/bds/posts/post.ex:61-65` | Add to post.allium entity |
+| B1-6 | Full rendering subsystem (Liquex, Filters, Labels, LinksAndLanguages, PostRendering) | `lib/bds/rendering/` | Distill into spec |
+| B1-7 | 404.html generation | `lib/bds/generation/outputs.ex:344-345` | Add to generation.allium |
+| B1-8 | `linkedPostIds` in media sidecar | `lib/bds/media/sidecars.ex:42` | Add to frontmatter.allium MediaSidecar |
+| B1-9 | `projectId` in template/script frontmatter | `templates.ex:337`, `scripts.ex:268` | Add to frontmatter.allium |
+| B1-10 | Media translation editing modal | `media_editor.html.heex:275-303` | Add to editor_media.allium |
+| B1-11 | Menu editor drag-drop, indent/unindent/move | `lib/bds/desktop/menu_editor/tree_ops.ex` | Add to editor_misc.allium |
+| B1-12 | `:language_picker` overlay with flag emojis | `shell_overlay.html.heex:116-139` | Add to modals.allium |
+| B1-13 | `:confirm_dialog` generic confirmation | `shell_overlay.html.heex:171-187` | Add to modals.allium |
+| B1-14 | Publish actions for scripts and templates | `script_editor.html.heex:10-12`, `template_editor.html.heex:10-12` | Add to editor_script.allium, editor_template.allium |
+| B1-15 | `:import` as full editor tab | `lib/bds/ui/import_editor.ex` | Add to tabs.allium |
+| B1-16 | `:documentation`/`:api_documentation` tab types | `lib/bds/desktop/misc_editor/` | Add to tabs.allium |
+| B1-17 | Metadata diff covers embedding, media_translation, post_translation as entity types | `lib/bds/maintenance/repair.ex` | Add to metadata_diff.allium |
+| B1-18 | Finished task TTL eviction (1h, keep last 10) | `lib/bds/tasks.ex:365-386` | Add to task.allium |
+| B1-19 | `discard_post_changes/1` | `lib/bds/posts.ex:201-227` | Add to post.allium |
+| B1-20 | `replace_media_file/2` with checksum/backup | `lib/bds/media.ex:288-337` | Add to media.allium |
+
+### B2. Lower Priority (implementation detail or minor)
+
+| ID | Behavior | Code Location |
+|---|---|---|
+| B2-1 | `editor_body/1` content resolver | `lib/bds/posts.ex:229-252` |
+| B2-2 | `sync_post_from_file/1` single-post reimport | `lib/bds/posts.ex:254-279` |
+| B2-3 | `import_orphan_post_file/1` | `lib/bds/posts.ex:289-291` |
+| B2-4 | `dashboard_stats/1`, `post_counts_by_year_month/1` | `lib/bds/posts.ex:378-413` |
+| B2-5 | `regenerate_missing_thumbnails/2` | `lib/bds/media.ex:47-48` |
+| B2-6 | Cache dir computation | `lib/bds/projects.ex:101-106` |
+| B2-7 | `remove_stale_published_templates` | `lib/bds/templates.ex:524-552` |
+| B2-8 | Rendering Labels module (30+ i18n strings) | `lib/bds/rendering/labels.ex` |
+| B2-9 | Progress reporting during reindex | `lib/bds/generation/progress.ex` |
+
+---
+
+## C. Internal Spec Inconsistencies
+
+| ID | Conflict | Location | Path |
+|---|---|---|---|
+| C-1 | schema.allium ChatMessage has no cache tokens; ai.allium ChatMessage has `cache_read_tokens`/`cache_write_tokens` | schema.allium:235-243 vs ai.allium:147-156 | Align schema.allium with ai.allium (code matches ai.allium) |
+| C-2 | media.allium SidecarFile mentions `linkedPostIds`; frontmatter.allium MediaSidecar does NOT list it | media.allium:28 vs frontmatter.allium:171-190 | Add `linkedPostIds` to frontmatter.allium |
+| C-3 | translation.allium says status/timestamps omitted from translation files; frontmatter.allium TranslationFrontmatter defines only 5 fields; code writes 8+ fields | translation.allium:67-74, frontmatter.allium:107-117 | Reconcile: either update spec or fix code |
+
+---
+
+## D. Spec Claims Not Covered by Tests
+
+### D1. No Test Coverage (HIGH priority — invariants/guarantees)
+
+| ID | Claim | Spec | Path |
+|---|---|---|---|
+| D1-1 | UniqueMediaTranslation invariant | media.allium:108 | Write test: create duplicate media translation, expect rejection |
+| D1-2 | UniqueTranslationPerLanguage invariant | translation.allium:94 | Write test: create duplicate post translation, expect rejection |
+| D1-3 | BundledDefaultTemplatesExistOutsideProjectData | template.allium:65 | Write test: render with no Template rows, bundled template found |
+| D1-4 | UserTemplateDirectoryOverridesBundledDefaults | template.allium:75 | Write test: project template overrides bundled same-slug |
+| D1-5 | LiquidTagSubset (5 tags only) | template.allium:179 | Write test: unsupported tag raises error |
+| D1-6 | LiquidFilterSubset (4 standard + 2 custom) | template.allium:191 | Write test: unsupported filter raises error |
+| D1-7 | LiquidOperatorSubset | template.allium:210 | Write test: unsupported operator raises error |
+| D1-8 | MacroTimeout guarantee | script.allium:94-95 | Write test: macro times out within budget |
+| D1-9 | ExecuteTransform rule (pipeline, ordering, toast budget) | script.allium:229-263 | Write test: transform pipeline executes in order, toast budget enforced |
+| D1-10 | TransformPipelineContinuation | script.allium:247-249 | Write test: error in transform doesn't halt pipeline |
+| D1-11 | ChatContextTruncation invariant | ai.allium:375-379 | Write test: long chat history trimmed to context window |
+| D1-12 | BoundedToolLoop enforcement | ai.allium:381-385 | Write test: tool rounds bounded by chat_max_tool_rounds |
+| D1-13 | DiscardPostChangesSideEffects | engine_side_effects.allium:99-104 | Write test: FTS updated after discard |
+| D1-14 | ReplaceMediaFileSideEffects | engine_side_effects.allium:128-134 | Write test: file replaced, thumbnails regenerated |
+| D1-15 | Drag-and-drop image chain | action_patterns.allium:84-103 | Write integration test |
+| D1-16 | DebouncedPersistence (5s) | embedding.allium:204-208 | Write test: index persistence debounced |
+| D1-17 | Protected categories cannot be deleted | editor_settings.allium:81-84 | Write test: article/aside/page/picture deletion rejected |
+| D1-18 | HomeItemProtection (menu) | editor_misc.allium:206-209 | Write test: cannot move/reorder/delete Home |
+
+### D2. No Test Coverage (MEDIUM priority — rules/behaviors)
+
+| ID | Claim | Spec | Path |
+|---|---|---|---|
+| D2-1 | RemoveCategory rule | metadata.allium:100 | Write test: remove category, verify list+settings+JSON updated |
+| D2-2 | CreateAndPublishTemplate rule | template.allium:105 | Write test: create+publish in one step |
+| D2-3 | CreateAndPublishScript rule | script.allium:160 | Write test: create+publish in one step |
+| D2-4 | UniqueScriptSlug dedup | script.allium:115 | Write test: two scripts same title → dedup slug |
+| D2-5 | FrontmatterRoundtrip invariant | post.allium:223 | Write test: write file, read back, assert all DB fields match |
+| D2-6 | SidecarRoundtrip invariant | media.allium:198 | Write test: write sidecar, read back, assert all fields match |
+| D2-7 | ConditionalPostFields: nil fields absent from frontmatter | frontmatter.allium:398 | Write test: post with nil excerpt/author/language → fields not in file |
+| D2-8 | ConditionalMediaFields: nil fields absent from sidecar | frontmatter.allium:417 | Write test: media with nil title/alt → fields not in sidecar |
+| D2-9 | max_posts_per_page 1..500 constraint | metadata.allium:75-77 | Write test: values outside range rejected |
+| D2-10 | SandboxedExecution: restricted capabilities blocked | script.allium:84-88 | Write test: filesystem/process/package loading blocked |
+| D2-11 | TransformToastBudget enforcement | script.allium:251-258 | Write test: per-script and total toast limits enforced |
+| D2-12 | ProgressThrottled: 250ms throttle | task.allium:110-113 | Write test: rapid progress reports throttled |
+| D2-13 | archived→draft transition | post.allium:121 | Write test: unarchive post → draft |
+| D2-14 | archived→published transition | post.allium:122 | Write test: unarchive post → published |
+| D2-15 | AppNoopNotifier: app writes don't produce notification rows | cli_sync.allium:64-68 | Write test: app mutation produces no notification row |
+| D2-16 | ValidateMedia rule | media_processing.allium:318-343 | Write test: missing/corrupted/orphan media detected |
+| D2-17 | ContentHashSkipsUnchanged during reindex | embedding.allium:199-202 | Write test: unchanged content_hash skips re-embedding |
+
+### D3. Partial Test Coverage (needs expansion)
+
+| ID | Claim | Spec | Gap | Path |
+|---|---|---|---|---|
+| D3-1 | PublishPost: content=null after publish | post.allium:186 | Not explicitly tested | Add assertion |
+| D3-2 | PublishPost: old file deleted on path change | engine_side_effects.allium:73-74 | Not tested | Add test |
+| D3-3 | UpsertPostTranslation: do_not_translate guard | translation.allium:113 | Indirectly covered only | Add direct test |
+| D3-4 | PublishTemplate: Liquid validation prerequisite | template.allium:139 | Not tested as publish gate | Add test |
+| D3-5 | PublishScript: validation prerequisite | script.allium:181 | Not tested as publish gate | Add test |
+| D3-6 | ExecuteMacro failure degrades to empty | script.allium:199 | Returns error tuple, not empty | Fix code or update spec |
+| D3-7 | TemplateFrontmatter roundtrip | template.allium:53 | Slug verified, no full parse-back | Add roundtrip test |
+| D3-8 | DefaultCategories for fresh project | metadata.allium:60 | Defaults present after add, not verified fresh | Add fresh-project test |
+| D3-9 | FtsIncludesTranslations | translation.allium:178 | Tested for one language; expand | Test all stemmer languages |
+| D3-10 | PostCanonicalUrl format | post.allium:33-40 | Constructed in links test, not asserted as invariant | Add format assertion |
+| D3-11 | Slug generation: German transliteration | post.allium:14-22 | "Föö Bär" → "foo-bar-blog" tested; expand ä/ö/ü/ß/ÄÖÜ | Expand test |
+
+### D4. UI Test Coverage Gaps (whole-editor specs)
+
+| ID | Spec | Covered | Not Covered |
+|---|---|---|---|
+| D4-1 | editor_media.allium | AI analysis, delete | Translate, replace file, link-to-post, translation CRUD, detect language |
+| D4-2 | editor_settings.allium | AI endpoints, airplane toggle, rebuild | Protected categories, MCP agents, style/theme, search filter, categories CRUD |
+| D4-3 | editor_chat.allium | Chat creation, pinned tab | API key screen, message rendering, input area, model selector, inline surfaces |
+| D4-4 | editor_script.allium | Editor layout, create defaults | Save, syntax check, run, delete |
+| D4-5 | editor_template.allium | Editor layout, create defaults | Save with validation, validate, delete with references |
+| D4-6 | editor_tags.allium | Sync/discover, merge | Cloud sizing, color picker, delete confirmation, create form |
+| D4-7 | editor_misc.allium | Menu add/save, metadata diff, validation | Menu protection, import analysis, translation fix, duplicate dismiss, git diff |
+
+---
+
+## Priority Order for Resolution
+
+1. **A1-1 through A1-4** — code bugs (spec is correct)
+2. **D1-1 through D1-18** — untested invariants/guarantees
+3. **C-1 through C-3** — internal spec inconsistencies
+4. **B1-1 through B1-6** — major code behaviors missing from spec
+5. **A2-1 through A2-21** — spec drift (code is normative)
+6. **D2-1 through D2-17** — untested rules
+7. **D3-1 through D3-11** — partial test coverage
+8. **B1-7 through B1-20** — minor code behaviors missing from spec
+9. **D4-1 through D4-7** — UI test coverage
+10. **A3-1 through A3-4** — decisions needed
--- a/TESTAUDIT.md
+++ b/TESTAUDIT.md
@@ -0,0 +1,87 @@
+# Test Audit Procedure
+
+Periodic review of the unit test suite to ensure every test exercises production
+code against real assumptions and behavior.
+
+## Scope
+
+All `*_test.exs` files under `test/`.
+
+## What counts as a valid unit test
+
+A valid unit test **calls at least one production function** from `lib/bds/` and
+**asserts on its return value, side effects, or observable behavior**.
+
+Acceptable patterns:
+
+- Calling a production function and asserting its return value.
+- Calling a production function with injected test doubles (fake HTTP clients,
+  fake runtimes) and asserting the production code's orchestration logic.
+- Mounting a LiveView or rendering a LiveComponent and asserting HTML output
+  or database state after interactions.
+- Sending events to a GenServer and asserting state transitions.
+
+### Source-property tests (acceptable, not flagged)
+
+Tests that verify structural properties of source code are acceptable and should
+not be flagged during this audit. Examples:
+
+- Checking that all public functions have `@spec` annotations (AST parsing).
+- Asserting absence of `String.to_atom` or `cond do` in specific files.
+- Verifying CSS/JS/template assets contain expected class names or imports.
+- Checking that `API.md` matches the output of a documentation generator.
+- Verifying database indexes exist via `EXPLAIN QUERY PLAN`.
+- Asserting `.allium` spec files have consistent parameter signatures.
+- Checking config files for expected values.
+- Verifying function decomposition patterns in source.
+
+These are linting/contract/consistency checks. They serve a purpose but are
+distinct from behavioral tests.
+
+## What gets flagged
+
+1. **Export-existence-only tests** — tests that call `function_exported?/3` or
+   `Code.ensure_loaded?/1` without ever invoking the function. These verify
+   compilation, not behavior. They are redundant when the same module is already
+   tested via rendering or direct calls in another test file.
+
+2. **Mock-only tests** — tests that define a fake/stub module and only assert
+   on that fake's behavior without routing through any production code path.
+
+3. **Trivially-passing tests** — tests whose assertions succeed regardless of
+   whether the production code is correct (e.g., asserting on a hardcoded value
+   that never touches production logic).
+
+## How to run the audit
+
+Ask Claude Code to:
+
+> Analyse the unit tests of the project and check if all of them actually call
+> proper production code or if there are tests that essentially only test
+> scaffolds, mocks and helper functions. Every unit test must test proper
+> production code against assumptions and behaviour. Source-property tests
+> (structure, @spec, asset presence, schema verification, doc staleness) are
+> acceptable and should not be flagged.
+
+The audit should:
+
+1. Read every `*_test.exs` file under `test/` in full.
+2. For each test block, identify which production function (if any) is called.
+3. Flag any test that falls into the categories above.
+4. Report flagged tests with file path, line number, and explanation.
+
+## Audit log
+
+### 2026-05-11
+
+Reviewed all 71 test files (69 after cleanup). Found 2 redundant files:
+
+- `test/bds/desktop/shell_live/chat_editor_test.exs` — single test only called
+  `function_exported?` for `ChatEditor`. The component was already fully tested
+  via `render_component` in `shell_live_test.exs`. **Deleted.**
+
+- `test/bds/desktop/shell_live/import_editor_test.exs` — single test only called
+  `Code.ensure_loaded?` + `function_exported?` for `ImportEditor`. The component
+  was already exercised in `import_shell_live_test.exs`. **Deleted.**
+
+Result after cleanup: 646 tests, 0 failures, 4 skipped.
--- a/test/bds/desktop/shell_live/chat_editor_test.exs
+++ b/test/bds/desktop/shell_live/chat_editor_test.exs
@@ -1,9 +0,0 @@
-defmodule BDS.Desktop.ShellLive.ChatEditorTest do
-  use ExUnit.Case, async: false
-
-  test "ChatEditor exports LiveComponent callbacks" do
-    assert function_exported?(BDS.Desktop.ShellLive.ChatEditor, :update, 2)
-    assert function_exported?(BDS.Desktop.ShellLive.ChatEditor, :handle_event, 3)
-    assert function_exported?(BDS.Desktop.ShellLive.ChatEditor, :render, 1)
-  end
-end
--- a/test/bds/desktop/shell_live/import_editor_test.exs
+++ b/test/bds/desktop/shell_live/import_editor_test.exs
@@ -1,11 +0,0 @@
-defmodule BDS.Desktop.ShellLive.ImportEditorTest do
-  use ExUnit.Case, async: false
-
-  test "ImportEditor exports LiveComponent callbacks" do
-    module = BDS.Desktop.ShellLive.ImportEditor
-    assert Code.ensure_loaded?(module)
-    assert function_exported?(module, :update, 2)
-    assert function_exported?(module, :handle_event, 3)
-    assert function_exported?(module, :render, 1)
-  end
-end