chore: removed old todo document

2026-06-12 15:26:45 +02:00
parent 4dd4781c5a
commit 8f073bbebd
1 changed files with 0 additions and 949 deletions
--- a/TECHDEBTS.md
+++ b/TECHDEBTS.md
@@ -1,949 +0,0 @@
 # Technical Debt Backlog
 Findings from a full codebase + stack review (2026-06-11) covering BEAM/OTP
 antipatterns, build-vs-buy decisions, and architecture. Tasks are ordered in
 phases that make sense to execute sequentially; within a phase, tasks are
 independent unless noted. Phase 1 contains the top five findings.
 **Revalidated 2026-06-12:** every entry's implementation claims were re-checked
 against the code, and all quality gates re-run clean (compile
 --warnings-as-errors, full test suite 1191 passed / 0 failed, credo --strict,
 deps.audit, dialyzer). All 25 entries confirmed done. One adjacent observation
 outside any entry's scope: `lib/bds/menu.ex` still parsed OPML with
 `:xmerl_scan` (same atom-creation class as TD-05's WXR finding, lower exposure
 since the file is project-local) — fixed 2026-06-12 by rewriting the OPML
 parsing on Saxy with atom-safety tests, mirroring TD-05's approach.
 ## How to work these tasks
 Every task must follow the project rules in `AGENTS.md`:
 - **Test-first**: write a failing test before the fix, against the real module.
 - **Clean gates**: `mix compile --warnings-as-errors`, full `mix test`
  (use `xvfb-run mix test` on headless Linux), and `mix dialyzer` must all pass.
 - **Delete dead code completely** — no commented-out remnants.
 - **Specs**: check `specs/` for an allium spec covering the touched area; if the
  spec changes, validate it with `allium check <file>` (must exit 0).
 - **No new CDN/external assets** in preview or generated HTML.
 - New user-facing strings go through gettext with translations for de/fr/it/es.
 - If a task touches an app API surface, update the script bridge and `API.md`
  in the same change.
 Each task lists: context (why it matters), affected files, suggested approach,
 and acceptance criteria. Re-verify line numbers before editing — they reflect
 the review snapshot (commit `bf93403`).
 ---
 ## Phase 1 — Top five (security & correctness)
 ### TD-01: Move the AI secret encryption key out of the repo ✅ DONE (2026-06-11)
 **Severity: High (security).**
 **Status: implemented.** `BDS.AI.SecretKey` resolves the master key from the
 macOS Keychain (`security` CLI) with a 0600 key-file fallback under the
 private app dir; the deterministic node-name fallback is gone (operations
 return `{:error, :secret_key_unavailable}`); `BDS.AI.SecretMigration`
 re-encrypts legacy rows at every boot from `BDS.RepoBootstrap`; the endpoint
 `secret_key_base` is generated per boot. The legacy repo key literal remains
 in `SecretBackend`/tests **only** to decrypt and migrate existing user
 databases — remove it together with `SecretMigration` in a future release.
 The test env pins a deterministic `:ai_secret_key` in `config/test.exs` so
 the suite never touches the keyring; that string protects nothing.
 Rode along (mandated clean-gates): fixed all 10 pre-existing compiler type
 warnings surfaced by the full recompile and the one dialyzer finding
 (MapSet opacity in `mac_bundle/dylibs.ex`), so `mix compile
 --warnings-as-errors --force` and `mix dialyzer` are now clean baselines.
 **Context.** `BDS.AI.SecretBackend` encrypts AI provider API keys at rest with
 AES-256-GCM, but the key is the hardcoded string in `config/config.exs`
 (`config :bds, :ai_secret_key, "bds_desktop_shell_secret_key_base_..."`), which
 is committed to the repository and **never overridden in
 `config/runtime.exs`**, so production uses it too. Anyone with the user's
 SQLite database file plus the public source can decrypt their API keys. The
 fallback path is worse: `sha256(Atom.to_string(node()) <> ":bds:ai")`, and the
 node name is effectively always `nonode@nohost`, making the fallback key a
 constant. The same hardcoded string also serves as the prod `secret_key_base`
 for `BDS.Desktop.Endpoint`.
 **Files.**
 - `lib/bds/ai/secret_backend.ex` (`secret_key/0`, lines ~35–41)
 - `config/config.exs` (lines ~22–26: `:desktop` secret_key_base and `:ai_secret_key`)
 - `config/runtime.exs` (no override exists today)
 **Approach.**
 1. On macOS, store the key in the Keychain via `System.cmd("security", ["add-generic-password", ...])` /
   `find-generic-password`. On other platforms (and as a portable fallback),
   generate a random 32-byte key on first launch and persist it with `0600`
   permissions under the app's private data dir (`~/Library/Application
   Support/bds` on macOS — see project conventions; never the repo or the
   project.json folder).
 2. Remove the deterministic node-name fallback entirely; if no key can be
   obtained, fail loudly rather than silently degrade to obfuscation.
 3. Migration: on startup, if secrets were encrypted with the legacy hardcoded
   key, decrypt with the old key and re-encrypt with the new one (one-time,
   then remove legacy key knowledge in a follow-up release).
 4. Generate the endpoint `secret_key_base` at runtime (random per boot is fine
   for a loopback-only desktop endpoint) instead of shipping it in config.
 **Acceptance.** No secret material in the repo; `grep -r "secret_key_base_64\|ai_secret_key" config/` shows no literal keys; existing encrypted secrets still decrypt after upgrade (covered by a migration test); SecretBackend tests cover missing-key failure mode.
 ---
 ### TD-02: Replace the hand-rolled `:httpc` client with Req ✅ DONE (2026-06-11)
 **Severity: High (reliability), enables TD-06 (streaming).**
 **Status: implemented.** `BDS.AI.HttpClient` is a Req wrapper with explicit
 connect/receive timeouts and constant-delay transient retries for GETs only
 (POST completions are never retried), configurable under
 `config :bds, BDS.AI.HttpClient` (`:connect_timeout_ms`,
 `:receive_timeout_ms`, `:get_max_retries`, `:retry_delay_ms`). The legacy
 `{:ok, %{status, headers, body}}` contract is preserved (raw binary body,
 downcased single-valued headers), so runtime/catalog callers are unchanged.
 The automation driver's health check also moved to Req, `:inets` left
 `extra_applications`, and no `:httpc` remains under `lib/`
 (test files still use `:httpc` as a client against local servers — they start
 `:inets` themselves). TD-06 (SSE streaming via Req `into:`) is now unblocked.
 **Context.** `BDS.AI.HttpClient` wraps `:httpc` with empty http options — the
 default timeout is **infinity**, so a hung LLM endpoint (Ollama, LM Studio,
 any OpenAI-compatible server) blocks the chat task forever (see TD-03 for the
 blocking chain above it). There is no retry, no connection pooling, no pinned
 TLS verification (behavior depends on the OTP version's `:httpc` defaults),
 and `:inets.start()`/`:ssl.start()` are called redundantly on every request
 (both are already in `extra_applications`). Req is already an optional
 dependency of the `image` package, so the dependency tree barely grows.
 **Files.**
 - `lib/bds/ai/http_client.ex` (delete or reduce to a thin Req wrapper)
 - `lib/bds/ai/openai_compatible_runtime.ex` (calls `HttpClient.get/post`)
 - `lib/bds/desktop/automation.ex` (also uses `:httpc`; test-automation server, lower priority but should follow)
 - `mix.exs` (add `{:req, "~> 0.5"}`)
 **Approach.** Add Req with explicit `connect_options` (timeout), `receive_timeout`
 (generous but finite, e.g. 120s for LLM responses; make it config-driven under
 `config :bds, :ai`), `retry: :transient` for idempotent GETs only (do NOT
 auto-retry chat completions), and default TLS verification. Keep the existing
 `{:ok, %{status, headers, body}} | {:error, reason}` contract so the runtime
 and its tests change minimally, or inject Req as the `:http_client` the way
 tests already do.
 **Acceptance.** No `:httpc` calls remain under `lib/bds/ai/`; a test proves a
 slow/hung endpoint produces `{:error, %{kind: :http_error, reason: :timeout}}`
 within the configured budget instead of hanging; dialyzer clean.
 ---
 ### TD-03: Fix `BDS.AI.InFlight` ETS ownership and creation race ✅ DONE (2026-06-11)
 **Severity: High (correctness).**
 **Status: implemented.** `BDS.AI.InFlight` is now a minimal GenServer whose
 `init/1` creates the named table (`:named_table, :public, :set,
 read_concurrency: true`); it is supervised in `BDS.Application` (before
 anything that uses chat), so the table lives for the VM's lifetime and the
 concurrent-first-use race is impossible by construction. The lazy `table/0`
 creation path is deleted; `register/unregister/lookup` reference the named
 table directly. `test/bds/ai/in_flight_test.exs` proves registrations survive
 the death of the registering process and that the supervised process owns the
 table.
 **Context.** `lib/bds/ai/in_flight.ex` creates its named ETS table lazily in
 whichever process first calls `table/0`. Two defects: (1) the table is owned
 by that first caller — typically a transient LiveView or chat task — so when
 that process exits, the table and all in-flight chat registrations vanish,
 breaking `cancel_chat/1`; (2) two concurrent first-callers race on
 `:ets.new/2` with `:named_table`, and the loser crashes with `badarg`.
 **Files.**
 - `lib/bds/ai/in_flight.ex`
 - `lib/bds/application.ex` (supervision tree)
 **Approach.** Create the table from a stable owner. Simplest options:
 - create it directly in `BDS.Application.start/2` (the application process
  lives for the VM's lifetime), or
 - give InFlight a minimal GenServer whose `init/1` creates the table
  (`:named_table, :public, :set, read_concurrency: true`) and add it to the
  supervision tree before anything that uses chat.
 Remove the lazy `table/0` creation path entirely; `register/unregister/lookup`
 should assume the table exists.
 **Acceptance.** A test demonstrates registrations survive the death of the
 registering process; no code path calls `:ets.new` outside the owner's init;
 concurrent-first-use race is impossible by construction.
 ---
 ### TD-04: Flush embedding indexes on shutdown (or delete the dead `flush_all`) ✅ DONE (2026-06-11)
 **Severity: Medium (perf/contract), High confidence.**
 **Status: implemented.** `Shutdown.persist_safely/0` now calls
 `BDS.Embeddings.Index.flush_all()` next to `MainWindow.persist_now()`; each
 persist step is hardened individually (own rescue/catch) so one failure never
 blocks quit or skips the other step. `terminate/2` stays as defense-in-depth
 for supervised restarts. A test proves a debounced (unsaved) index reaches
 disk through the real shutdown path before the hard quit fires. The
 `terminate/2` audit found no other graceful-shutdown dependency:
 `job_runner.ex` only detaches in-memory state (moot under SIGKILL),
 `automation.ex` is the test-automation harness whose ports die with the VM,
 and `main_window.ex` bounds persistence was already covered by
 `MainWindow.persist_now()` in the shutdown path. The code now matches the
 spec's DebouncedPersistence invariant (`specs/embedding.allium:216`).
 **Context.** App shutdown SIGKILLs the BEAM (`BDS.Desktop.Shutdown.quit/0` —
 a documented and legitimate workaround for a wxWidgets static-destructor
 segfault on macOS). Consequence: **no `terminate/2` callback in the whole app
 ever runs on quit.** `BDS.Embeddings.Index` traps exits and relies on
 `terminate/2` to persist debounced HNSW index saves, and its moduledoc
 promises "force-saved on project switch / shutdown". Meanwhile
 `Embeddings.Index.flush_all/0` exists but has **zero production callers**
 (dead code per the AGENTS.md mandate). The lazy-reload-from-DB fallback makes
 this a startup-performance bug (index rebuilt unnecessarily) rather than data
 loss, but the code contradicts its own contract.
 **Files.**
 - `lib/bds/desktop/shutdown.ex` (`persist_safely/0`, lines ~92–99)
 - `lib/bds/embeddings/index.ex` (`flush_all/0` line ~87, `terminate/2` line ~167, moduledoc)
 **Approach.** Call `BDS.Embeddings.Index.flush_all()` inside
 `Shutdown.persist_safely/0` next to `MainWindow.persist_now()` (same
 rescue/catch hardening). Keep `terminate/2` as defense-in-depth for supervised
 restarts. Audit for other state that assumed graceful shutdown (search the
 tree for `terminate/2` implementations and check each one's expectations
 against the SIGKILL path). If instead the decision is that lazy rebuild is the
 intended behavior, delete `flush_all/0` and fix the moduledoc — pick one,
 don't keep both.
 **Acceptance.** Either `flush_all` is wired into the shutdown path with a test
 (injectable shutdown module already exists: `:desktop_shutdown_module` env),
 or it is deleted and the moduledoc no longer claims shutdown saving. No other
 `terminate/2` in the codebase silently depends on graceful shutdown for
 correctness.
 ---
 ### TD-05: Replace xmerl with Saxy in the WXR importer; add import transactions ✅ DONE (2026-06-12)
 **Severity: Medium-High (DoS + integrity on user-supplied files).**
 **Status: implemented.** `BDS.WxrParser` now parses WXR with `Saxy.parse_stream/3`
 for files and `Saxy.parse_string/3` for in-memory XML, keeping element names as
 binaries instead of interning atoms and preserving the existing result shape.
 Both import write paths now batch work in `Repo.transaction` chunks of 500
 (`BDS.ImportExecution` and `BDS.Posts.RebuildFromFiles`), so mid-batch failures
 roll back cleanly instead of leaving partial imports behind. Acceptance proof now
 includes a bounded atom-growth parser test with many unique element names,
 existing import fixture tests, rollback tests for both import and rebuild, and a
 local SQLite benchmark showing the batching win (`1000` inserts: `183ms`
 per-row transactions vs `83ms` in `500`-row chunks, `2.2x` faster).
 **Context.** `BDS.WxrParser.parse_xml/1` uses `:xmerl_scan.string/1`, which
 **creates atoms from element and attribute names** in the parsed document. WXR
 files are user-supplied imports, so a malicious or merely huge/weird file can
 grow the atom table (atoms are never GC'd → eventual VM crash). It also reads
 the entire file into memory (`File.read!` + full DOM); real WordPress exports
 reach hundreds of MB. Separately, the import/rebuild write loops run one
 autocommit transaction per row on SQLite — slow (one fsync per post) and a
 failure mid-way leaves a half-imported database.
 **Files.**
 - `lib/bds/wxr_parser.ex` (full rewrite of the parsing layer; keep the output map shape)
 - `lib/bds/import_execution.ex` (line ~359: `Enum.each(post_ids, ...)` write loop)
 - `lib/bds/posts/rebuild_from_files.ex` (lines ~37–56: per-file upsert loop)
 - `mix.exs` (add `{:saxy, "~> 1.6"}`)
 **Approach.**
 1. Rewrite WxrParser on Saxy (SAX or its simple-form DOM builder). Saxy keeps
   names as binaries (no atom creation) and supports streaming via
   `Saxy.parse_stream/3` with `File.stream!`. Preserve the existing public
   contract (`parse_file/1`, `parse_xml/1` returning
   `%{site:, posts:, pages:, media:, categories:, tags:}`) so
   `import_analysis.ex`/`import_execution.ex` are untouched; the existing
   parser tests become the safety net.
   Note: `sweet_xml` is already in the dep tree via `image`, but it is
   xmerl-based and inherits the atom problem — don't use it for this.
 2. Wrap import and rebuild write loops in `Repo.transaction` (chunked, e.g.
   500 rows per transaction, to keep WAL size and progress reporting sane).
   Consider `Repo.insert_all` in chunks for plain inserts — there is currently
   not a single `insert_all` in the codebase.
 **Acceptance.** A test feeds XML with many unique element names and asserts
 `:erlang.system_info(:atom_count)` does not grow proportionally; existing
 import fixture tests pass unchanged; a failure injected mid-import leaves the
 DB unchanged (rolled back chunk); large-import benchmark shows the transaction
 batching speedup.
 ---
 ## Phase 2 — Unbounded blocking & cancellation
 ### TD-06: Real SSE streaming for chat ✅ DONE (2026-06-11)
 **Depends on TD-02 (Req).**
 **Status: implemented.** Chat requests now send `"stream": true` (+
 `stream_options.include_usage`) and consume the SSE response incrementally
 via `HttpClient.post_stream/5` (Req `into:`). `BDS.AI.SSE` assembles content
 deltas, tool-call fragments, and usage, emitting **cumulative content
 snapshots** throttled to `stream_emit_interval_ms` (default 100ms) — replace
 semantics, so the chat editor needed no changes and tool rounds reset
 naturally. Streaming applies only to `operation: :chat` with an `:on_stream`
 callback, can be disabled via `config :bds, :chat, streaming: false`, and
 providers that ignore the stream flag are auto-detected by content-type and
 parsed as plain JSON. Cancellation kills the chat task, which aborts the
 underlying connection (server-observed in tests). Persistence semantics are
 unchanged (one assistant row per round, same usage normalization).
 **Context.** `OpenAICompatibleRuntime.generate/3` never sets `"stream": true`;
 the UI's `{:chat_streaming_content, ...}` event fires exactly once with the
 complete response, i.e. streaming is fake. For local models this is the
 single biggest perceived-latency win available.
 **Files.**
 - `lib/bds/ai/openai_compatible_runtime.ex`
 - `lib/bds/ai/chat.ex` (`notify_chat_event` plumbing, `chat_round/9`)
 - `lib/bds/desktop/shell_live/chat_editor.ex` (consume incremental events)
 **Approach.** Use Req's `into:` option (fun or `:self`) to consume
 `text/event-stream` chunks, parse `data:` lines incrementally, emit
 `{:chat_streaming_content, conversation_id, delta}` per token-batch, and
 accumulate the full message for persistence (keep persistence semantics
 identical: one assistant row per round, tool_calls assembled from streamed
 fragments). Gate on a config flag so non-streaming providers still work.
 Remember the AGENTS.md AI rule: all automatic AI activity stays gated by
 airplane mode / local model / toast.
 **Acceptance.** A mock SSE server test asserts multiple incremental content
 events arrive before the final `{:ok, reply}`; tool-call rounds still work;
 cancellation mid-stream (TD-07) aborts the HTTP request.
 ### TD-07: Bound the chat await chain; end-to-end timeout & cancellation ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.AI.Chat.send_chat_message/3` no longer waits
 unboundedly on the supervised chat task: `await_chat_task/2` now applies a
 global deadline derived from the configured per-request HTTP budget and the
 bounded tool loop (`BDS.AI.HttpClient.request_timeout_ms() *
 (chat_max_tool_rounds + 1) + config :bds, :chat, :await_timeout_margin_ms`). On
 deadline expiry it returns `{:error, :chat_timeout}` and shuts the task down via
 `Task.shutdown/2`, so the caller is released even if the runtime wedges.
 `config/config.exs` now exposes `:await_timeout_margin_ms` under `:chat`. The
 acceptance proof is a shutdown-aware blocking runtime test that asserts the
 timeout result, verifies the task receives shutdown, and confirms the
 conversation persists only the user message on timeout; existing cancellation
 and streaming tests remain green.
 **Context.** `BDS.AI.Chat.send_chat_message/3` blocks the caller (a LiveView
 process) on a hand-rolled `await_chat_task/1` — a raw `receive` with **no
 `after` clause**. Combined with the infinite HTTP timeout (TD-02) the whole
 chain is unbounded. Even after TD-02, a defense-in-depth deadline belongs
 here.
 **Files.**
 - `lib/bds/ai/chat.ex` (`send_chat_message/3` lines ~185–219, `await_chat_task/1` lines ~855–882)
 **Approach.** Give `await_chat_task` an `after` deadline derived from config
 (request timeout × max_tool_rounds + margin), returning
 `{:error, :chat_timeout}` and terminating the task via
 `Task.Supervisor.terminate_child` (the path `cancel_chat/1` already uses).
 Alternatively restructure to fully async: run the chat task fire-and-forget
 and deliver results to the LiveView via the existing `event_target` /
 PubSub mechanism, so no process ever blocks. The async restructure is the
 better end state; the `after` clause is the cheap immediate fix.
 **Acceptance.** A stalled runtime stub cannot hang the caller past the
 deadline; cancel during a round kills the HTTP request and leaves the
 conversation in a consistent persisted state.
 ### TD-08: Remove test-sandbox scaffolding from production chat code ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.AI.Chat.send_chat_message/3` now starts the
 chat worker directly with `Task.Supervisor.async_nolink/2`; the temporary
 `:sandbox_ready` receive/send barrier and `allow_repo_sandbox/1` are deleted
 from production code. The repo is on `ecto_sql 3.13.5`, and the supervised
 chat task works correctly through sandbox `$callers` propagation in the
 project's existing test modes (manual checkout plus the suite's ownership /
 shared-mode callers). `test/bds/ai/chat_sandbox_cleanup_test.exs` now proves
 both acceptance points: the production module has zero sandbox references, and
 `send_chat_message/3` still persists its user/assistant messages through the
 supervised task without explicit sandbox allowance. Validation gates are clean:
 `mix compile --warnings-as-errors --force`, full `mix test`, and
 `mix dialyzer` all pass.
 **Context.** `chat.ex` contains a `:sandbox_ready` send/receive handshake and
 `allow_repo_sandbox/1` (with `Code.ensure_loaded?` + blanket `rescue`) purely
 so the Ecto SQL sandbox works for the `async_nolink` chat task in tests. Since
 ecto_sql 3.4 the sandbox automatically follows `$callers` for processes
 spawned via `Task`/`Task.Supervisor`, making all of this unnecessary —
 **verify this against the project's ecto_sql version and test modes first**
 (it holds for the default `:shared`/ownership modes when the caller chain is
 intact).
 **Files.**
 - `lib/bds/ai/chat.ex` (lines ~197–214, ~938–950)
 **Approach.** Delete the handshake (`receive :sandbox_ready`, the
 `send(task.pid, :sandbox_ready)`, and `allow_repo_sandbox/1`); run the full
 test suite to confirm `$callers` propagation covers it. If some test setup
 genuinely needs explicit allowance, move that into the test helper, not
 production code.
 **Acceptance.** Production module has zero sandbox references; full suite
 green.
 ### TD-09: Graceful task cancellation in `BDS.Tasks` ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.Tasks.cancel_task/1` and
 `BDS.Scripting.JobRunner.handle_call(:cancel, ...)` now terminate supervised
 workers through their owning `Task.Supervisor` (`terminate_child/2`), so the
 worker receives `:shutdown` instead of an immediate `:kill`. Queue bookkeeping
 and `:cancelled` status semantics are unchanged, but cooperative cleanup now
 runs before the slot is freed. Coverage now includes a task worker that traps
 exits, confirms cleanup executes on cancellation, and proves the queued task is
 promoted afterward, plus a managed scripting job runtime test that traps exits
 and observes the same shutdown-driven cleanup path.
 **Context.** `Tasks.cancel_task/1` uses `Process.exit(pid, :kill)` — the
 worker gets no chance to clean up mid-upload or mid-file-write
 (`Persistence.atomic_write` mitigates file corruption but not e.g. remote
 half-states). `cancel_chat` already models the right approach.
 **Files.**
 - `lib/bds/tasks.ex` (line ~130)
 - `lib/bds/scripting/job_runner.ex` (`handle_call(:cancel, ...)` line ~67 — same pattern)
 **Approach.** Use `Task.Supervisor.terminate_child/2` (delivers `:shutdown`,
 escalates to kill after the child's shutdown timeout). Workers that need
 cleanup can trap exits locally. Keep the state bookkeeping identical
 (`:cancelled` status, queue promotion).
 **Acceptance.** A test worker with `trap_exit` observes `:shutdown` and runs
 its cleanup before dying; cancelled tasks still free a concurrency slot.
 ### TD-10: Timeouts for external `git` commands ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.Git` now applies config-driven timeout budgets
 to every shell-out (`config :bds, :git` with `:local_timeout_ms` defaulting to
 15s and `:network_timeout_ms` defaulting to 120s for fetch/pull/push). The
 default runner no longer uses unbounded `System.cmd/3`; it launches the child
 through `Port.open/2`, collects stdout/stderr, and on timeout closes the port
 and explicitly terminates the OS process if it is still alive. Timeout results
 surface as structured `%{kind: :timeout, operation:, timeout_ms:, message:}`
 errors, while existing structured auth guidance is preserved. Acceptance proof
 now includes a bounded runner-stub timeout test and a real-process cleanup test
 that launches `sleep`, times out, and asserts the spawned PID is gone.
 **Context.** `BDS.Git` shells out via `System.cmd`, which has **no timeout**.
 `GIT_TERMINAL_PROMPT=0` and SSH BatchMode prevent interactive hangs, but a
 network-level stall on `fetch`/`pull`/`push`/`lfs` blocks the calling task
 indefinitely. (Shelling out to git itself is the right call — current Elixir
 libgit2 bindings are not mature; keep that decision.)
 **Files.**
 - `lib/bds/git.ex` (`system_runner/3` line ~343, `run_git/3`)
 **Approach.** Wrap the command in `Task.async` + `Task.yield(task, timeout) || Task.shutdown(task, :brutal_kill)`
 inside `system_runner` (note: the OS process may need explicit killing —
 consider `MuonTrap` for proper external-process supervision, or spawn git via
 a port with `:kill_group`). Default timeout config-driven: generous for
 network ops (120s), short for local ops (15s). Map timeout to a structured
 error like the existing `%{kind: :auth, ...}` shape so the UI can toast it.
 **Acceptance.** A runner stub that sleeps past the deadline produces
 `{:error, %{kind: :timeout, ...}}`; no orphaned OS processes after timeout
 (test with a real `sleep` binary).
 ---
 ## Phase 3 — Process architecture (de-bottleneck singletons)
 ### TD-11: Move preview rendering out of the `BDS.Preview` GenServer ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.Preview` now keeps only preview-server lifecycle and
 drain coordination in the GenServer. The current server state lives in a public
 ETS table, so both direct preview calls and socket-served HTTP requests resolve
 against that fast state read and render outside the singleton owner process.
 Direct `request/2` and `preview_draft/3` calls now run in short-lived tracked
 tasks, while TCP request workers render inline after inheriting the test sandbox
 allowance; graceful stop flips `is_running` false in ETS before draining and
 waits for those tracked render workers to finish. Coverage now proves two real
 HTTP preview renders can be in flight at once and that `stop_preview/1` waits
 for a blocked render to complete before returning.
 **Context.** Every preview HTTP request (`Preview.request/2`,
 `preview_draft/3`) renders Markdown + Liquid **inside** the singleton
 `BDS.Preview` GenServer via `handle_call`. A browser loading one page fires
 parallel requests (page, CSS, images) that all serialize through this single
 process — the textbook "process used for code organization" bottleneck.
 **Files.**
 - `lib/bds/preview.ex` (`handle_call({:request, ...})` line ~93, `handle_call({:preview_draft, ...})` line ~102)
 - `lib/bds/preview/router.ex` (the Plug calling into Preview)
 **Approach.** Keep the GenServer for what actually needs serialization
 (server lifecycle: start/stop/ensure, current-project state, graceful drain).
 For requests, expose a fast state read (`:sys.get_state`-free — e.g. a public
 ETS table or a lightweight `handle_call(:current_project)`) and perform
 rendering in the Bandit request process itself. The drain logic
 (`@drain_timeout`) already tracks inflight requests — adapt it to count
 renderers via monitors or a counter.
 **Acceptance.** A test issues N concurrent slow renders and asserts they
 overlap (wall time « N × single render); stop_preview still drains correctly.
 ### TD-12: Move HNSW builds and duplicate scans out of `Embeddings.Index` handle_call ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.Embeddings.Index` now runs duplicate scans and
 HNSW rebuilds in supervised tasks instead of inside `handle_call`, while the
 GenServer keeps only the small serialized state surface (current index, label
 map, debounce timers, and flush coordination). Neighbor queries continue to hit
 the current index while a scan or rebuild is in flight; rebuild requests for a
 project coalesce onto the latest requested snapshot; `flush/1` and `flush_all/0`
 wait for in-flight rebuilds before persisting; and `forget/1` cancels pending
 index work cleanly. Acceptance proof now includes focused concurrency tests for
 both slow duplicate scans and slow rebuilds, and the existing debounced
 persistence coverage remains green.
 **Context.** `Embeddings.Index` (singleton) builds HNSW graphs and runs full
 duplicate scans inside `handle_call` with client timeout `:infinity`. A long
 duplicate scan head-of-line-blocks every neighbor query the UI makes. The
 state it guards (index refs, label maps, debounce timers) is small; the work
 is what's big.
 **Files.**
 - `lib/bds/embeddings/index.ex` (`handle_call({:put, ...})` ~105, `{:duplicate_pairs, ...}` ~128)
 **Approach.** For `duplicate_pairs` (read-only over the index ref): capture
 the entry in the GenServer, spawn the scan in a `Task` (HNSWLib index
 resources are usable from other processes — verify; they are NIF resources,
 confirm thread-safety of concurrent reads in hnswlib docs), and reply via
 `GenServer.reply/2`. For `put` (build): construct the graph in the caller or
 a Task and hand the finished index to the GenServer for state swap +
 debounced save. Queries during a rebuild keep hitting the old index.
 **Acceptance.** Neighbor queries return while a duplicate scan is running
 (test with a large synthetic index); debounce/flush semantics unchanged.
 ### TD-13: Slim down the `Publishing` GenServer call surface ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.Publishing` now keeps only SCP mtime state in its
 GenServer. Publish-job creation, lookup, and status updates run directly through
 `Repo`, while background-task job updates use a stable Repo caller so sandboxed
 tests still exercise the real async path. SCP uploads no longer round-trip
 through `should_upload_scp_file` / `mark_uploaded_scp_file` per file; each
 target now batches one filter call for changed files and one bulk record call
 for successfully uploaded mtimes. Coverage includes a focused batching test that
 proves a multi-file SCP publish keeps bookkeeping traffic bounded instead of
 scaling linearly with file count.
 **Context.** `Publishing` does Repo writes inside `handle_call`
 (`:upload_site`, `:update_job`) and the uploader makes per-file synchronous
 calls (`:should_upload_scp_file` / `:mark_uploaded_scp_file`) — chatty
 serialization through one process during uploads.
 **Files.**
 - `lib/bds/publishing.ex` (lines ~45–119)
 **Approach.** Job creation (Repo insert) can happen in the caller before the
 call; the GenServer keeps only `scp_uploads` mtime state. Batch the mtime
 check: one call returning the filtered upload list instead of two calls per
 file. Consider whether `scp_uploads` state should be ETS
 (`read_concurrency`) owned by the GenServer.
 **Acceptance.** Upload of N files makes O(1) GenServer calls for mtime
 bookkeeping, not O(N)·2; behavior identical for incremental uploads.
 ### TD-14: Replace polling with messaging (CliSync watcher + rebuild sequencing) ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.CliSync.Watcher` now gates notification-table
 work behind SQLite `PRAGMA data_version`, so unchanged databases no longer force
 repeated `db_notifications` queries at the 100 ms watch cadence; the watcher
 still performs a real notification fetch/prune pass on the first poll and on
 every external commit boundary. Rebuild sequencing in
 `BDS.Desktop.ShellCommands` no longer sleep-polls `Tasks.list_tasks/0`.
 `BDS.Tasks` now broadcasts terminal task states on `BDS.PubSub`, and
 `wait_for_group_phase/3` subscribes and waits on those messages with the same
 deadline semantics. Coverage now includes a watcher test that proves unchanged
 `data_version` skips notification queries, a tasks test that proves terminal
 state broadcasts, and a shell-command guard test that forbids `Process.sleep`
 polling in the rebuild wait path.
 **Context.** Two polling loops:
 1. `CliSync.Watcher` polls the SQLite notifications table every **100 ms
   forever** — ~10 queries/sec at idle in a battery-powered desktop app.
 2. `shell_commands.ex` `wait_for_group_phase/3` sleep-polls
   `Tasks.list_tasks()` every 50 ms to sequence rebuild steps.
 **Files.**
 - `lib/bds/cli_sync/watcher.ex` (`@default_poll_interval_ms 100`)
 - `lib/bds/desktop/shell_commands.ex` (`wait_for_group_phase/3` lines ~563–585)
 - `lib/bds/tasks.ex` (add completion broadcast)
 **Approach.**
 1. Watcher: cheapest meaningful fix is a `PRAGMA data_version` pre-check —
   a single integer read that changes only when *another connection* commits;
   only query the notifications table when it moves. (SQLite update hooks
   don't fire for external connections, so the CLI-writes use case genuinely
   needs polling — make it near-free instead of removing it.) Alternatively
   watch the `-wal` file mtime with the `file_system` package. Also consider
   lengthening the idle interval with backoff.
 2. Rebuild sequencing: have `BDS.Tasks` broadcast task terminal states on
   `BDS.PubSub` (it already sits next to PubSub in the supervision tree);
   `wait_for_group_phase` subscribes and `receive`s with a deadline instead
   of sleep-polling.
 **Acceptance.** Idle app issues no table queries when `data_version` is
 unchanged (or interval ≥ 1s with backoff); rebuild sequencing has no
 `Process.sleep`; CLI-sync round-trip latency stays ≤ current behavior.
 ### TD-15: `BDS.Tasks` housekeeping (queue type, eviction timers) ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.Tasks` now uses `:queue` for its pending work
 queue, so enqueue/dequeue on the hot path are O(1) and the FIFO behavior is
 unchanged. Finished-task cleanup now tracks a single live eviction timer ref
 instead of scheduling a fresh `send_after/3` on every terminal task; the timer
 fires, prunes expired finished tasks, and only reschedules itself if finished
 tasks still remain. Coverage now includes a focused task-state test proving
 multiple finished tasks share the same live eviction timer and a source guard
 that forbids `queue ++` churn.
 **Context.** Minor inefficiencies in `tasks.ex`: the pending queue is a list
 appended with `++` (O(n) per submit), and **every** finishing task schedules
 a fresh 1-hour `send_after` eviction timer — timers accumulate without bound
 (harmless, but sloppy).
 **Files.**
 - `lib/bds/tasks.ex` (`handle_call({:submit_task,...})` ~84, `schedule_finished_task_eviction/1` ~365)
 **Approach.** Use `:queue` for the pending queue. Replace per-finish timers
 with one periodic sweep (`send_after` rescheduled in `handle_info(:evict_finished_tasks, ...)`)
 or track a single timer ref. Also consider downgrading `report_progress` from
 `call` to `cast` (progress is fire-and-forget; the throttle already drops
 updates).
 **Acceptance.** One live eviction timer at most; queue operations O(1);
 existing task lifecycle tests green.
 ---
 ## Phase 4 — Build-vs-buy replacements
 ### TD-16: Frontmatter robustness — yaml_elixir/ymlr or harden the hand-rolled parser ✅ DONE (2026-06-12)
 **Status: implemented.** The project keeps the hand-rolled serializer/parser for
 byte-stable frontmatter output, but `BDS.Frontmatter` is now hardened for the
 known user-edited-file cases: `parse_document/1` normalizes CRLF and lone `\r`
 line endings before splitting the gray-matter envelope, and quoted scalar
 parsing now removes exactly one closing quote and unescapes content explicitly
 instead of `trim_trailing/2`, which previously corrupted strings whose content
 ended in a quote character. Coverage now includes focused parser tests for CRLF
 documents and quoted-string roundtrips with embedded quotes, backslashes, and
 trailing-quote content, plus adjacent post/template/maintenance frontmatter and
 serializer parity suites.
 **Context.** `BDS.Frontmatter` is a hand-rolled YAML subset with concrete
 bugs for user-edited files:
 - `parse_document` splits on `"\n---\n"` — **CRLF files never match** and are
  rejected as `:invalid_frontmatter`. External editors on Windows will produce
  these.
 - `parse_string` handles quotes by trimming the trailing quote char — a quoted
  string containing its own quote mid-string is corrupted.
 - No nesting support (may be intentional).
 Decision needed: adopt `yaml_elixir` (parse) + `ymlr` (emit), or keep the
 subset deliberately (round-trip stability for file diffs is a legitimate
 reason) and fix the bugs. Note the metadata sync rule: frontmatter is read by
 publishing, metadata-diff, and rebuild-from-files — all three must stay in
 agreement, which argues for keeping one serializer either way.
 **Files.**
 - `lib/bds/frontmatter.ex`
 - Callers: `lib/bds/posts/file_sync.ex`, `lib/bds/posts/rebuild_from_files.ex`, `lib/bds/document_fields.ex`
 **Approach (minimum).** Normalize `\r\n` → `\n` before parsing; rewrite
 `parse_string` to properly unescape (scan, don't trim); add property-style
 round-trip tests (`parse(serialize(x)) == x`) including quotes, newlines,
 unicode, CRLF input. If switching to yaml_elixir: keep `serialize_document`
 output byte-stable against current golden files to avoid noisy git diffs in
 user projects.
 **Acceptance.** CRLF fixture parses; round-trip property tests pass; golden
 serialization fixtures unchanged (if keeping custom serializer).
 ### TD-17: Language detection via `paasaa` (optional, low priority) ✅ DONE (2026-06-12)
 **Status: implemented without adding `paasaa`.** The originally reported
 misclassifications are not reproducible on the current code: the existing
 detector already classifies the relevant umlaut-free German and accent-free
 French fixtures correctly through its language-hint fallback, and new focused
 tests now lock that behavior down directly. Because the acceptance cases are now
 satisfied and the current implementation keeps dependency weight lower, the
 project does not add `paasaa` at this time. Revisit only if broader real-world
 fixtures start failing.
 **Context.** `Search.detect_language/1` uses diacritic regexes + tiny word
 lists; German text without umlauts (common in short posts) falls through to
 English, picking the wrong stemmer. `paasaa` is a pure-Elixir trigram
 detector.
 **Files.**
 - `lib/bds/search.ex` (lines ~60–89, `detect_language/1`, `@language_hints`)
 **Approach.** Add `{:paasaa, "~> 0.6"}`; use it for texts above a length
 threshold, keep the cheap heuristics as a short-text fallback; map ISO codes
 to `@stemmer_algorithms` keys; default `"en"` unchanged. Delete
 `@language_hints` if fully superseded (dead-code rule).
 **Acceptance.** Misclassification fixtures (umlaut-free German, accented-free
 French) detect correctly; stemmer selection unchanged for English.
 ### TD-18: Evaluate Oban (Lite engine) for durable jobs ✅ DONE (2026-06-12)
 **Status: evaluated; no-go for now.** `BDS.Publishing` already persists
 `PublishJob` rows and the suite proves those rows survive a
 `BDS.Publishing` restart; what it does **not** provide is durable mid-upload
 resume/retry, and the current desktop spec does not require that behavior.
 Moving publishing to Oban Lite now would add another coordination layer on top
 of `BDS.Tasks`, force a redesign of the in-memory SCP mtime cache in
 `BDS.Publishing`, and complicate the existing Ecto sandbox/release split for a
 benefit that is not yet part of the product contract. `BDS.Scripting.JobStore`
 and `BDS.Tasks` remain intentionally in-memory: they provide UI/runtime job
 tracking, not durable work queues. Re-open this only if real crash-recovery
 requirements emerge and a spike shows acceptable desktop binary/startup impact.
 **Context.** Three overlapping job systems exist: in-memory UI tasks
 (`BDS.Tasks`), DB-persisted publish jobs (`Publishing` + `PublishJob` rows),
 and scripting jobs (`JobStore`/`JobRunner`/`JobSupervisor`). The durable ones
 (publishing, possibly long script jobs) would get retries, uniqueness,
 crash-recovery-on-restart, and telemetry for free from Oban's SQLite `Lite`
 engine on the existing `ecto_sqlite3` setup. **This is an evaluation task,
 not a mandate** — the in-memory `BDS.Tasks` UI progress system should stay.
 **Files (for the evaluation).**
 - `lib/bds/publishing.ex`, `lib/bds/publishing/publish_job.ex`
 - `lib/bds/scripting/job_store.ex`, `job_runner.ex`, `job_supervisor.ex`
 - `lib/bds/tasks.ex` (stays; would wrap Oban job progress for the UI)
 **Approach.** Spike: model `upload_site` as an Oban worker with the Lite
 engine; check binary-size and startup cost impact for the desktop release;
 check interplay with the test sandbox and with the `bds_mcp` release. Write
 up a go/no-go with the spike branch. If no-go, document why in this file and
 close.
 **Acceptance.** Decision documented. Outcome: no-go for now; publishing keeps
 its current persisted-status model and the task closes without an Oban
 migration.
 ### TD-19: Add credo, mix_audit (and consider sobelow) to the quality gates ✅ DONE (2026-06-12)
 **Status: implemented.** `mix.exs` now adds `credo` and `mix_audit` as
 dev/test-only tooling, and `mix validate` runs in the test environment with
 all four gates wired in: tests, Credo, dependency audit, and dialyzer.
 `.credo.exs` deliberately scopes the initial baseline to stable, actionable
 checks and explicitly disables the repo-wide legacy style/refactor rules that
 would otherwise drown the signal. The dependency audit is clean except for one
 documented upstream exception in `.mix_audit.ignore`:
 `GHSA-rhv4-8758-jx7v` remains unavoidable while `bumblebee` depends on
 `progress_bar ~> 3.0`, which pins `decimal ~> 2.0`; the higher-signal Phoenix
 advisory was removed by updating the lockfile to `phoenix 1.8.8`.
 Sobelow was evaluated and not adopted: for this desktop app's loopback-only
 endpoint it would mostly duplicate checks already covered by compile/test,
 Credo, and dependency audit while adding little release signal.
 **Context.** The project enforces dialyzer + warnings-as-errors but has no
 style/consistency linter and no dependency CVE audit. Cheap, high-leverage
 additions.
 **Files.**
 - `mix.exs` (deps + `validate` alias), new `.credo.exs`
 **Approach.** Add `{:credo, "~> 1.7", only: [:dev, :test], runtime: false}`
 and `{:mix_audit, "~> 2.1", only: [:dev, :test], runtime: false}`. Extend the
 `validate` alias: `["test", "credo --strict", "deps.audit", "dialyzer"]`.
 Start credo in non-strict mode if the initial violation count is large; fix
 or explicitly disable rules rather than leaving noise. Sobelow is
 Phoenix-web-oriented; evaluate whether its checks add signal for a
 loopback-only desktop endpoint before adopting.
 **Acceptance.** `mix validate` runs all four gates clean; CI/agent
 instructions (AGENTS.md) updated to mention them.
 ---
 ## Phase 5 — Consistency, config & hygiene
 ### TD-20: Align SQLite pool configuration between dev and prod ✅ DONE (2026-06-12)
 **Status: implemented.** The effective mismatch was narrowed to one place:
 `config.exs`, `dev.exs`, `prod.exs`, and `test.exs` were already aligned on a
 modest pool of `5`, but `config/runtime.exs` still defaulted production to
 `POOL_SIZE=1`. The prod runtime default now also uses `5`, with an inline
 comment documenting the rationale: keep SQLite concurrency behavior consistent
 across dev and prod under the existing WAL + `busy_timeout` setup unless an
 operator explicitly overrides `POOL_SIZE`. A regression test reads
 `config/runtime.exs` in `:prod` and locks that default to `5`.
 **Context.** Dev runs `pool_size: 5` (config.exs), prod runs `pool_size: 1`
 (runtime.exs) — so dev and prod have **different concurrency semantics**: dev
 can hit `SQLITE_BUSY`/interleavings prod never sees; prod serializes every
 read behind every write. WAL mode + `busy_timeout: 15_000` are already set
 globally.
 **Files.**
 - `config/config.exs` (pool_size 5), `config/runtime.exs` (POOL_SIZE default "1")
 **Approach.** Pick one deliberate model and apply it to both envs. Options:
 (a) pool of 1 everywhere (simplest, fully serialized, fine for single-user);
 (b) modest pool (3–5) everywhere relying on WAL + busy_timeout; (c)
 ecto_sqlite3's documented separate read/write pool pattern. Document the
 choice in a comment. Run the suite and a manual smoke under the chosen prod
 setting.
 **Acceptance.** Same pool model in dev and prod; rationale comment in config;
 no busy-timeout regressions in tests.
 ### TD-21: Harden `Persistence.atomic_write` ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.Persistence.atomic_write/2` now writes through a
 per-call temp path (`.tmp.<unique_integer>`) instead of the fixed `.tmp`
 suffix, eliminating the concurrent-writer collision that previously produced
 `:enoent` races. The new test coverage proves two things: many simultaneous
 writers to the same target all complete successfully and the final file is one
 intact payload, and the post-rebuild glob still ignores atomic temp files
 because the temp suffix no longer matches the `*.md` pattern.
 **Context.** `atomic_write/2` uses a fixed `path <> ".tmp"` temp name — two
 concurrent writers to the same path corrupt each other's temp file before
 rename. No fsync before rename (crash can leave an empty/partial target on
 some filesystems); for blog-post files that durability trade-off is
 defensible, but the temp-name collision is not.
 **Files.**
 - `lib/bds/persistence.ex` (`atomic_write/2` lines ~70–82)
 **Approach.** Unique temp suffix:
 `path <> ".tmp." <> Integer.to_string(System.unique_integer([:positive]))`.
 Optionally `:file.sync` the temp file before rename behind an opt-in flag for
 critical writes. Clean up stray `*.tmp.*` files defensively where directories
 are scanned (check rebuild-from-files glob patterns ignore them).
 **Acceptance.** Concurrent-writer test produces two intact outcomes (last
 write wins, no corruption); rebuild file globs ignore temp files.
 ### TD-22: Wrap `delete_chat_conversation` in a transaction ✅ DONE (2026-06-12)
 **Status: implemented.** `BDS.AI.Chat.delete_chat_conversation/1` now performs
 message deletion and conversation deletion inside a single `Repo.transaction`.
 If the second step fails, the whole delete rolls back and both the
 conversation and its messages remain intact. The regression test injects a
 mid-delete failure through a local test hook and proves the rollback by
 asserting both rows still exist afterward. The nearby audit requested by this
 TD found no equivalent multi-statement write outlier in `lib/bds/ai/catalog.ex`
 or `lib/bds/scripts.ex`: catalog refresh already runs in a transaction and the
 script operations are single-row writes.
 **Context.** `chat.ex` deletes all messages (`Repo.delete_all`) then the
 conversation (`Repo.delete`) without a transaction — a failure in between
 strands orphan message rows. The codebase uses `Repo.transaction`
 conscientiously elsewhere; this is an outlier. (Check for siblings: any other
 multi-statement write without a transaction found during work on this task
 should be fixed in the same pass — e.g. audit `lib/bds/ai/catalog.ex`,
 `lib/bds/scripts.ex`.)
 **Files.**
 - `lib/bds/ai/chat.ex` (`delete_chat_conversation/1` lines ~102–117)
 **Approach.** `Repo.transaction(fn -> ... end)` or `Ecto.Multi`. Better:
 add `ON DELETE CASCADE` via an Ecto migration on the
 `chat_messages.conversation_id` FK and drop the manual `delete_all` (use
 `mix ecto.gen.migration` per project rules).
 **Acceptance.** Injected failure between the two deletes leaves both intact;
 no orphaned messages possible.
 ### TD-23: Sweep blanket `rescue`/`catch` blocks for silent failure ✅ DONE (2026-06-12)
 **Status: implemented.** The named sweep surface was tightened in the intended
 way instead of being rewritten wholesale. `ImportAnalysis` and
 `ImportExecution` now log contextual warnings when top-level import handling,
 repo-transaction wrappers, media linking, or progress callbacks fall back after
 exceptions; the tests now prove those progress-callback failures are no longer
 silent. In `ImportAnalysis.year_from/1`, the blanket integer-date rescue was
 narrowed to `ArgumentError`, which is the actual parsing failure path there.
 `OverlayComponents` kept its defensive UI fallbacks, but every rescue now logs
 why the fallback path was taken instead of silently swallowing the error.
 `MainWindow` needed no change in this pass: its catches are already scoped to
 expected `:exit`/wx failures rather than bare rescue-all blocks.
 **Context.** Several modules rescue *all* exceptions into `:ok`/fallbacks
 (`shutdown.ex` defensibly — it must never block quit; but also
 `overlay_components.ex` (11 rescues), `import_execution.ex`,
 `import_analysis.ex`, `main_window.ex`, `Repo.ready?`). Blanket rescues hide
 real bugs (typos become silent no-ops) and dialyzer can't see through them.
 **Files.** Start with: `lib/bds/desktop/shell_live/overlay_components.ex`,
 `lib/bds/import_execution.ex`, `lib/bds/import_analysis.ex`,
 `lib/bds/desktop/main_window.ex`.
 **Approach.** For each rescue: narrow to the specific exception(s) actually
 expected, add a `Logger.warning` with context where swallowing is the right
 call, and delete rescues that guard nothing. Shutdown.ex's rescues stay as-is
 (documented intent: never block quit).
 **Acceptance.** No bare `rescue _ ->` without either a narrow exception match
 or a logged justification; suite green.
 ### TD-24: Fix stale `npm test` reference in AGENTS.md ✅ DONE (2026-06-12)
 **Status: implemented.** `AGENTS.md` now tells agents to run `mix test`
 instead of the stale `npm test`, and the nearby UI i18n wording was updated to
 refer to LiveView/HEEx components rather than leftover React-era terminology.
 **Context.** AGENTS.md "Fix All Test Failures" section says "Run the full
 test suite (`npm test`)" — stale from the pre-Elixir rewrite. It's an
 instruction file for agents; wrong commands actively mislead.
 **Files.** `AGENTS.md`
 **Approach.** Replace with `mix test` (and the `xvfb-run mix test` headless
 note already present elsewhere in the file). While in there, scan for other
 JS-era references (React components are mentioned in the i18n section —
 verify whether that section should say LiveView/HEEx).
 **Acceptance.** AGENTS.md commands all match the Elixir toolchain.
 ### TD-25: Generate the desktop endpoint secret at runtime ✅ DONE (2026-06-11, shipped with TD-01)
 **Context.** The Phoenix endpoint `secret_key_base` (session/LiveView
 signing) was a hardcoded repo string in all envs.
 **Status: implemented as part of TD-01.** `BDS.Application.desktop_secret_key_base/0`
 now generates `Base.encode64(:crypto.strong_rand_bytes(48))` per boot when no
 explicit config is set; the static value was removed from `config/config.exs`.
 ---
 ## Explicit non-issues (decisions reviewed and endorsed)
 Recorded so future reviews don't re-litigate them:
 - **Shelling out to `git`** instead of libgit2 NIF bindings — correct;
  bindings are immature. (Timeout gap tracked as TD-10.)
 - **SIGKILL shutdown** for the wx static-destructor segfault — legitimate,
  well-documented workaround. (Terminate-callback consequence tracked as TD-04.)
 - **`BDS.BoundedAtoms`** allow-list atom conversion — good practice, keep.
 - **Luerl with reduction/time caps** for user scripting — right tool, sound
  sandbox config.
 - **SQLite FTS5 + bm25** for search, **HNSWLib** for ANN, **Liquex/Earmark**
  for rendering — all appropriate.
 - **Integer-millisecond timestamps** (`Persistence.now_ms`) over
  `:utc_datetime_usec` — committed design choice; consistent; keep
  `parse_timestamp`'s seconds-vs-ms heuristic under test, don't migrate.
 - **`elixir-desktop` dependency risk** (niche, lightly maintained) —
  acknowledged; mitigation is the existing thin `BDS.Desktop.*` boundary, not
  replacement.
 - **No telemetry pipeline** — reasonable cut for a desktop app; revisit only
  if support/debugging demands it.