Files
bDS2/CODESMELL.md
2026-05-01 10:47:25 +02:00

501 lines
34 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Elixir Code Smell Analysis — bDS2
> Generated by static analysis of the codebase. Do not delete; use as a reference for refactoring sprints.
>
> **Updated 2026-04-30:** Each claim re-verified against current code. See "Verification Summary" and "Priority Order" sections at the bottom.
---
## Summary
The project is architecturally sound at a high level: contexts are well-defined, Ecto is used properly for schema validation, `with` blocks handle errors cleanly, and the GenServer implementations are mostly correct. However, **the codebase suffers from severe module bloat, excessive duplication of helper functions, risky process-dictionary usage, and several side-effect-in-transaction anti-patterns** that become maintenance liabilities in a desktop application expected to run for long sessions.
---
## Critical Anti-Patterns
### 1. Massive Module Bloat (Most Serious)
Several modules are far too large and violate the single-responsibility principle.
| Module | Lines | Issue |
|--------|-------|-------|
| `BDS.Desktop.ShellLive` | 2,607 | Handles UI events, routing, overlays, menus, project switching, tab management, translations, task polling, and native menu integration |
| `BDS.Posts` | 1,709 | Handles CRUD, publishing, file I/O, translations, auto-translation scheduling, embeddings sync, search sync, link rebuilding, and validation |
| `BDS.Generation` | 2,624 | Site generation, validation, sitemap building, archive pagination, feed rendering, and file hashing (verified larger than originally documented) |
| `BDS.MCP` | 670 | MCP tools, resources, proposals, post detail serialization, search, and category counting |
| `BDS.Media` | 939 | Media CRUD, thumbnail generation, sidecar I/O, post-media join management |
**Why this is an anti-pattern:** Giant modules reduce compiler concurrency, make refactoring dangerous, obscure test coverage, and increase merge conflicts. Elixir/OTP best practice is to keep modules under ~400 lines and split by responsibility.
**Fix:** Extract sub-domains into dedicated modules (e.g., `BDS.Posts.Publisher`, `BDS.Posts.Translations`, `BDS.Generation.SitemapBuilder`, `BDS.Desktop.ShellLive.EventRouter`).
---
### 2. Process Dictionary for I18n State
(and 13 other call sites across all `*_editor.ex` modules — 15 total)
In `BDS.Desktop.ShellLive`:
```elixir
def render(assigns) do
Process.put(:bds_ui_locale, assigns.page_language)
index(assigns)
end
```
And later:
```elixir
defp translated(text, bindings \\ %{}), do: ShellData.translate(text, bindings, Process.get(:bds_ui_locale))
```
**Why this is an anti-pattern:** The process dictionary is implicit global state. It makes functions harder to test in isolation and can leak state between concurrent operations in the same process. Elixir strongly discourages `Process.put/get` for business logic.
**Fix:** Pass `locale` explicitly through `assigns` to all translation call sites, or use a dedicated context struct.
---
### 3. Side Effects Inside Database Transactions
(7 transactions in `lib/bds/media.ex` follow this pattern)
In `BDS.Media.import_media/1`:
```elixir
Repo.transaction(fn ->
media = %Media{} |> Media.changeset(...) |> Repo.insert!()
:ok = File.mkdir_p(Path.dirname(destination))
:ok = File.cp(source_path, destination)
:ok = write_sidecar(project, media)
:ok = ensure_thumbnails(project, media)
:ok = Search.sync_media(media)
media
end)
```
**Why this is an anti-pattern:** If the transaction retries (or the DB connection times out in SQLite), filesystem and thumbnail side effects are duplicated or orphaned. Ecto transactions should contain **only** database operations.
**Fix:** Perform the DB insert in the transaction, then do filesystem work after the transaction succeeds.
---
### 4. `String.to_atom/1` on Untrusted Input
In `BDS.MCP`:
```elixir
defp atomize_keys(map) when is_map(map) do
Map.new(map, fn {key, value} -> {String.to_atom(key), value} end)
end
```
**Why this is an anti-pattern:** Atoms are not garbage collected. If MCP receives large or arbitrary JSON keys, the VM atom table can fill up and crash the BEAM.
**Fix:** Use `String.to_existing_atom/1` (with a safe fallback) or keep keys as strings.
Note: 10 other `String.to_atom` call sites exist, but they operate on bounded enums (modality, platform, view id) and are lower risk.
---
### 5. Bang Functions (`File.read!`) in Business Logic Without Rescue
Found in `BDS.Posts`, `BDS.Media`, `BDS.MCP`, `BDS.Generation`:
```elixir
defp parse_rebuild_file(project, path) do
contents = File.read!(path)
{:ok, %{fields: fields}} = Frontmatter.parse_document(contents)
...
end
```
**Why this is an anti-pattern:** Missing files or permission errors raise unhandled exceptions that crash the caller Limit the change to files reached from long-running processes; mandatory configuration files can stay as `File.read!`.. In a GenServer context (e.g., background tasks), this brings down the worker.
**Fix:** Use `File.read/1` and handle `{:error, reason}` explicitly, returning `{:error, reason}` tuples to callers.
---
### 6. Duplicate Helper Functions Across Contexts
The same private helpers are copy-pasted in `BDS.Posts`, `BDS.Media`, `BDS.Search`, `BDS.Generation`, `BDS.Publishing`, `BDS.MCP`:
- `attr/2` (atom/string key normalization)
- `maybe_put/3`
- `blank_to_nil/1`
- `progress_callback/1`
- `report_rebuild_started/3`, `report_rebuild_progress/4`
**Why this is an anti-pattern:** Violates DRY. When behavior needs to change (e.g., adding float progress support), every copy must be updated.
**Fix:** Extract to a shared utility module such as `BDS.MapUtils`, `BDS.AttrUtils`, or `BDS.ProgressReporter`.
---
### 7. Direct Repo Access in LiveView (10 sites verified)
`BDS.Desktop.ShellLive` directly queries the repo:
```elixir
case Repo.get(Post, post_id) do
%Post{} = post -> ...
end
```
**Why this is an anti-pattern:** LiveViews are the web layer; they should call context functions (`BDS.Posts.get_post/1`). Direct Repo access leaks persistence details into the UI and makes testing harder.
**Fix:** Replace all `Repo.get` calls in `ShellLive` with context function calls.
---
### 8. `String.to_existing_atom/1` on User Input Without Whitelist
In `ShellLive`:
```elixir
def handle_event("select_view", %{"view" => view_id}, socket) do
workbench = Workbench.click_activity(socket.assigns.workbench, String.to_existing_atom(view_id))
...
end
```
**Why this is an anti-pattern:** If a client sends a non-existent atom string, it raises `ArgumentError`. While safer than `to_atom`, it is still a crash vector.
**Fix:** Validate against a whitelist or map of known string-to-atom conversions before conversion.
---
### 9. Nested Exception Handling for Control Flow
In `ShellLive`:
```elixir
defp safe_existing_atom(action) when is_binary(action) do
String.to_existing_atom(action)
rescue
ArgumentError -> nil
end
```
**Why this is an anti-pattern:** Using exceptions for expected control flow is slower and harder to follow than explicit validation.
**Fix:** Maintain an explicit mapping of allowed actions, or use a `case` with `:erlang.binary_to_existing_atom/2` and catch the error tuple.
---
### 10. `Jason.decode!` Without Err and `BDS.AI` (HTTP-response decoding)or Handling
In `BDS.AI.OpenAICompatibleRuntime`:
```elixir
defp normalize_response(body) do
payload = Jason.decode!(body)
...
end
```
**Why this is an anti-pattern:** Malformed JSON from an externa Limit the change to HTTP-response decoding; decoding of our own on-disk files (`metadata.ex`, `embeddings/index.ex`, etc.) is acceptable.
---
## Moderate Concerns
### 11. Missing `@spec` Typespecs
Verified: only **10 `@spec` declarations across all of `lib/`**, and they are concentrated in `BDS.Scripting` / `BDS.Scripting.Lua`. Every other public context function relies on Dialyzer inference. For a project of this size, adding `@spec` to public APIs is the single highest-ROI type-safety improvement available — and pairs well with the existing `mix dialyzer --format short` check.
### 12. Tests Run Synchronously
Most tests use `use ExUnit.Case, async: false`. Many tests share the SQLite repo and named GenServers, so they cannot realistically be async. Limit conversion to pure-logic test files (parsers, formatters, etc.).
### 13. Raw SQL Overuse
19 `Repo.query` sites split into two groups:
1. **FTS5 virtual tables** in `BDS.Search` (`posts_fts`, `media_fts`, `MATCH`, `bm25`, `rank`). Ecto cannot express FTS5 queries cleanly. **Keep these raw.**
2. **`post_media` join table** queried by hand in `BDS.Posts`, `BDS.Media`, `BDS.Desktop.ShellLive.OverlayComponents`, `BDS.Desktop.ShellLive.PostEditor`. There is no `BDS.PostMedia` schema. This is the real type-safety hole; introducing the schema replaces 68 raw queries with Ecto
While SQLite FTS necessitates some raw SQL, many queries in `BDS.Search`, `BDS.Media`, and `BDS.Posts` use `Repo.query!/2` for standard operations that Ecto could express portably (e.g., `DELETE FROM ... WHERE ...`). This reduces type safety and database portability.
### 14. Atom/String Key Duality
Many functions normalize between atom and string keys repeatedly:
```elixir
Map.get(assigns, :language, Map.get(assigns, "language", ...))
```
This suggests data isn't normalized at boundaries. Prefer atoms for internal structs and strings only at external boundaries (JSON, HTTP params).
### 15. GenServer State Growth (Tasks)
`BDS.Tasks` stores all tasks forever in a single map. The `clear_completed`/`clear_finished` functions exist, but if the UI doesn't trigger them regularly in a long-running desktop app, memory grows unbounded. Consider TTL-based eviction.
---
## What the Project Does Well
- **Contexts are clearly separated** (`Posts`, `Media`, `Projects`, `Search`, etc.).
- **Ecto changesets** are used consistently for validation.
- **`with` blocks** handle multi-step error flows cleanly.
- **Pattern matching** is idiomatic and pervasive.
- **Task.Supervisor** is used correctly for background work.
- *Verification Summary (2026-04-30)
| # | Claim | Verified | Notes |
|---|---|---|---|
| 1 | Module bloat | ✅ | `Generation` is 2624 lines, worse than originally documented. |
| 2 | `Process.put(:bds_ui_locale)` | ✅ | 15 call sites across `shell_live.ex` and every `*_editor.ex`. |
| 3 | Side effects in transactions | ✅ | 7 transactions in `lib/bds/media.ex` wrap filesystem + Search side effects. |
| 4 | `String.to_atom` on untrusted input | ⚠️ Partly | Only `MCP.atomize_keys` is on arbitrary external JSON. Other 10 sites are bounded enums. |
| 5 | `File.read!` | ⚠️ Partly | Most sites read mandatory config; only loop/background paths are real risk. |
| 6 | Duplicated helpers | ✅ | Real DRY violation across contexts. |
| 7 | `Repo.get` in `ShellLive` | ✅ | 10 direct calls. |
| 8/9 | `to_existing_atom` + rescue | ⚠️ | The rescue effectively *is* the whitelist; cosmetic priority. |
| 10 | `Jason.decode!` on external responses | ⚠️ Partly | Only 2 of 14 sites decode HTTP responses; the rest decode our own files. |
| 11 | Missing `@spec` | ✅ | 10 specs in entire `lib/`. Highest type-safety ROI. |
| 12 | Tests synchronous | ✅ | Limited convertibility — most tests share repo/GenServers. |
| 13 | Raw SQL | ⚠️ Partly | FTS5 must stay raw; `post_media` table is the real fix target. |
---
## Priority Order
1. **Add `@spec` to public APIs of core contexts.****DONE 2026-04-30.** See "Priority #1 Completion" section below.
2. **Extract filesystem / Search side effects out of `Repo.transaction` in `BDS.Media`.****DONE 2026-04-30.** See "Priority #2 Completion" section below.
3. **Fix `MCP.atomize_keys`** to use `String.to_existing_atom/1` with a string-fallback. ✅ **DONE 2026-04-30.** See "Priority #3 Completion" section below.
4. **Introduce `BDS.PostMedia` Ecto schema** and migrate the 68 raw `post_media` queries. ✅ **DONE 2026-04-30.** See "Priority #4 Completion" section below.
5. **Module split.** `BDS.Generation` (2624) and `BDS.Desktop.ShellLive` (2607) first, then `BDS.AI` (1700+) and `BDS.Posts`. ✅ **PARTIAL 2026-04-30 / 2026-05-01.** `BDS.Generation` reduced 2651 → 647 (76%). See "Priority #5 Progress" section below.
6. **Replace `Repo.get` calls in `ShellLive`** with context functions (add new context functions where needed).
7. **Move locale from `Process.put` into assigns**, then ban `Process.put` via Credo.
8. **Extract shared helpers** (`attr/2`, `maybe_put/3`, `blank_to_nil/1`, `progress_callback/1`, rebuild progress reporters) into `BDS.MapUtils` / `BDS.ProgressReporter`.
9. **Wrap external `Jason.decode!` calls** in `BDS.AI.OpenAICompatibleRuntime` and `BDS.AI` with `Jason.decode/1` + `{:error, _}` propagation.
### Skipped / downgraded
- #5 in bulk — convert only paths reached from long-running processes.
- #8 / #9 — cosmetic; the rescue functions as a whitelist today.
- #12 — leave `async: false` where the repo or named GenServers are involved.
- #15 — bounded in practice; ensure UI triggers `clear_finished` periodically.
---
## Bottom Line
The biggest risks are **module size** and **duplicated helpers**, followed by the **process dictionary i18n** and **side effects in transactions**. The single highest-ROI improvement for type safety is **adding `@spec` to public context APIs** so Dialyzer can do its job — that is the starting point of the priority order above
---
## Priority #1 Completion (2026-04-30)
Added `@spec` and `@type` declarations to the public APIs of all core contexts and their schemas. Validation: `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
**Files modified (specs added):**
- `lib/bds/projects.ex`, `lib/bds/projects/project.ex`
- `lib/bds/posts.ex`, `lib/bds/posts/post.ex`, `lib/bds/posts/translation.ex`, `lib/bds/posts/link.ex`
- `lib/bds/media.ex`, `lib/bds/media/media.ex`, `lib/bds/media/translation.ex`
- `lib/bds/search.ex`
- `lib/bds/publishing.ex`, `lib/bds/publishing/publish_job.ex`
- `lib/bds/generation.ex`
- `lib/bds/metadata.ex`
- `lib/bds/mcp.ex`
- `lib/bds/ai.ex`
**Bugs surfaced and fixed by Dialyzer once specs were in place:**
- `BDS.Search.list_stemmer_languages/0` returns `[String.t()]`, not `[{String.t(), [String.t()]}]`.
- `BDS.Media.sync_media_sidecar/1` returns `:ok` (not `{:ok, t()}`); the `posts.ex` caller was already pattern-matching on `:ok`.
- `BDS.Media.replace_media_file/2` can return `{:ok, nil}` when the new file's checksum is unchanged.
- Removed unreachable `other -> {:error, other}` fall-through clauses in the auto-translate cascades in `BDS.Posts` (the preceding `{:error, reason}` pattern already covers the only remaining return shape).
- Removed unreachable `defp blank?(_value)` clause in `BDS.Desktop.ShellLive.ChatEditor` (prior clauses already covered `binary` and `nil`, no other types reach the function).
**Conventions established for future spec work:**
- Ecto schemas need explicit `@type t` — Ecto does not generate one.
- Use `term()` for `belongs_to` / `has_many` association fields to avoid circular type dependencies between sibling schemas.
- Use `@typedoc` + named types (e.g. `attrs`, `metadata_state`, `search_filters`, `reindex_opts`) to avoid repeating large map shapes across many specs in the same module.
- For `update_*` / attrs-style maps, the canonical type is `%{optional(atom()) => term(), optional(String.t()) => term()}` because both renderer-supplied string keys and Elixir atom keys flow into them.
**Not yet specced (intentional, out of scope of priority #1):**
- LiveView modules under `lib/bds/desktop/shell_live/`. These are Phoenix LiveView callbacks (`mount/3`, `handle_event/3`, `handle_info/2`) whose specs are inherited from the behaviour. Public helper functions in those modules can be specced as part of the eventual ShellLive module split (priority #9).
- Internal helpers in `BDS.MCP` and `BDS.AI` that are private (`defp`) — Dialyzer infers these.
- `BDS.Tags`, `BDS.Templates`, `BDS.Scripts`, `BDS.PostLinks` — smaller contexts; queue for a follow-up pass if Dialyzer surfaces issues.
---
## Priority #2 Completion (2026-04-30)
Refactored every `Repo.transaction/1` block in [lib/bds/media.ex](lib/bds/media.ex) so that only DB writes run inside the transaction. Filesystem writes (`File.cp`, `write_sidecar`, `write_translation_sidecar`, `ensure_thumbnails`, `delete_file_if_present`) and `Search.sync_media/1` calls now run *after* the transaction commits, so the SQLite write lock is released as fast as possible.
**Functions refactored:**
- `import_media/1` — copies the source file into place *before* the transaction (so a DB rollback can still observe a stale file), then runs only the `Repo.insert!` inside the transaction, then writes sidecar / thumbnails / search index. On DB failure the copied file is removed.
- `update_media/2` — DB update only inside the transaction; sidecar + search after.
- `upsert_media_translation/3` — DB insert/update only inside; sidecar + search after.
- `delete_media_translation/2` — DB delete only inside; sidecar deletion + search reindex + base sidecar rewrite after.
- `replace_media_file/2` — moves the existing destination to a `.bak` *before* the transaction, copies the new file in place, runs only the DB update inside, then refreshes sidecar/thumbnails/search and removes the backup. On DB failure the original file is restored from the backup.
- `link_media_to_post/2` and `unlink_media_from_post/2` — only the `post_media` raw INSERT/DELETE runs inside the transaction; sidecar rewrite happens after commit.
**Why this matters:**
SQLite has a single global write lock. Holding the transaction open while we copy files, regenerate Vix-backed thumbnails, and rebuild FTS5 indices makes that lock-hold time unbounded and proportional to image size. Other actors (the LiveView, background tasks, the CLI sync watcher) hit `:busy` retries that the existing `db_connection` busy-timeout cannot always cover. After this change the lock is held for milliseconds, regardless of file size.
**Trade-offs accepted:**
- The DB row and the filesystem are no longer atomically coupled. If the BEAM crashes between `Repo.insert!` and `write_sidecar/2`, the row exists without a sidecar. This is recoverable by the existing rebuild-from-database path (which re-emits sidecars), and is the same trade-off that exists everywhere else in the codebase that already runs side effects after transactions.
- `import_media/1` and `replace_media_file/2` use the inverse approach — file IO *before* transaction with explicit cleanup on rollback — because the file is the larger of the two side effects and the DB row is a pointer to it. This keeps the destination consistent on rollback.
**Spec correction surfaced by Dialyzer:**
- `BDS.Media.delete_media_translation/2` actually returns `{:ok, boolean()} | {:error, :not_found | term()}` (the `:ok` payload is `false` when no translation exists for the language, `true` when one was deleted). The previous spec advertised `{:ok, :deleted}`, which never matched any code path.
**Validation:** `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
---
## Priority #3 Completion (2026-04-30)
Removed `BDS.MCP.atomize_keys/1` entirely instead of just narrowing it to `String.to_existing_atom/1`. The function existed only to convert MCP-proposal JSON keys (attacker-controlled strings) into atoms before passing them to `Media.update_media/2` and `Posts.update_post/2`. Both contexts already accept string-keyed maps natively through their `attr/2` helper (`%{optional(atom()) => term(), optional(String.t()) => term()}`), so the conversion was both unnecessary and a textbook unbounded-atom-table DoS vector.
**Change:**
- Deleted `defp atomize_keys/1` from [lib/bds/mcp.ex](lib/bds/mcp.ex).
- The two `accept_proposal` call sites now pass `proposal.data["changes"] || %{}` straight through to `Media.update_media/2` and `Posts.update_post/2`.
**Why removal beats whitelisting:**
- `String.to_existing_atom/1` would still need a whitelist or a `try/rescue` wrapper, both of which add code without buying type safety — the downstream contexts already key on either form.
- Removing the function eliminates the only place in MCP that converts external JSON to atoms; the surface area for atom-table attacks via the MCP tool API is now zero.
- The `attr/2` helpers in `BDS.Posts` and `BDS.Media` are the *single* canonical place where attribute key normalization happens, which is exactly the invariant we want.
**Other `String.to_atom/1` call sites considered:**
The codebase has nine other `String.to_atom/1` call sites; all of them operate on bounded inputs (`Workbench` type names, release platforms, route IDs from the router config, `:supports_attachment` / `:supports_tool_calls` capability keys we ourselves wrote, `parse_modality` from a fixed AI catalog). None of them are reachable from attacker-controlled JSON the way `MCP.accept_proposal` was. They are safe to leave as a follow-up if a stricter sweep is wanted later.
**Validation:** `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
---
## Priority #4 Completion (2026-04-30)
Introduced [lib/bds/posts/post_media.ex](lib/bds/posts/post_media.ex) — a proper `Ecto.Schema` for the `post_media` join table — and migrated every raw SQL / string-table query in the codebase to use it.
**New schema:** `BDS.Posts.PostMedia` with fields `id`, `project_id`, `post_id`, `media_id`, `sort_order`, `created_at`, `belongs_to :post` and `belongs_to :media` associations, full `@type t`, and a `changeset/2` enforcing `validate_required` + `foreign_key_constraint` + `unique_constraint([:post_id, :media_id])`.
**Call sites migrated:**
- [lib/bds/media.ex](lib/bds/media.ex)
- `list_linked_posts/1` — replaced `join: post_media in "post_media"` string-table join with `join: pm in PostMedia`.
- `link_media_to_post/2` — replaced `Repo.query("SELECT 1 FROM post_media …")` with `Repo.exists?(from pm in PostMedia, …)` and `Repo.query("INSERT INTO post_media …")` with `%PostMedia{} |> changeset() |> Repo.insert!()` (uniqueness now enforced through the schema constraint).
- `unlink_media_from_post/2` — replaced `Repo.query("DELETE FROM post_media …")` with `Repo.delete_all(from pm in PostMedia, …)`.
- `linked_post_ids/1` — replaced raw `SELECT post_id` with `Repo.all(from pm in PostMedia, select: pm.post_id)`.
- `next_sort_order/1` — replaced raw `SELECT COALESCE(MAX(sort_order), -1)` with `Repo.one(from pm in PostMedia, select: max(pm.sort_order))` and a `value when is_integer(value)` guard for the empty-table case.
- [lib/bds/posts.ex](lib/bds/posts.ex)
- `linked_media_ids/1` — replaced raw `SELECT media_id` with `Repo.all(from pm in PostMedia, select: pm.media_id)`.
- [lib/bds/desktop/shell_live/post_editor.ex](lib/bds/desktop/shell_live/post_editor.ex)
- `linked_media/1` — replaced raw `SELECT media_id, sort_order` with a `Repo.all(from pm in PostMedia, select: {pm.media_id, pm.sort_order})` query.
- [lib/bds/desktop/shell_live/overlay_components.ex](lib/bds/desktop/shell_live/overlay_components.ex)
- `post_media_ids/1` — replaced raw `SELECT media_id` with `Repo.all(from pm in PostMedia, select: pm.media_id)`.
- `delete_details/2` — replaced the raw `SELECT posts.title FROM posts JOIN post_media ON …` with a typed Ecto join query (`from post in Post, join: pm in PostMedia, on: pm.post_id == post.id, …`).
**Why this matters:**
- All `post_media` access now goes through a typed schema, so every column reference is checked at compile time by Ecto and any future migration that renames a column will produce a `Postgrex/Sqlite` error at compile or test time instead of silently breaking at runtime.
- The unique `(post_id, media_id)` constraint is now enforceable via `Ecto.Changeset.unique_constraint/2`, which would let the next refactor turn `link_media_to_post/2` into an idempotent upsert without losing the protection.
- `Repo.query/2` is reserved for the few remaining cases that genuinely need raw SQL (FTS5 virtual tables in `BDS.Search`, which are not Ecto-mappable).
**Validation:** `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
---
## Priority #5 Progress (2026-04-30 / 2026-05-01)
**Goal:** Split god modules. Started with the worst offender, `BDS.Generation` (2651 lines).
**Result:** `lib/bds/generation.ex` reduced **2651 → 647 lines (76%)** by extracting nine cohesive submodules under `lib/bds/generation/`:
| Module | Lines | Responsibility |
|---|---|---|
| `BDS.Generation.Outputs` | 490 | All `build_*_outputs/*` builders + `*_route_paths` + `additional_languages`, `route_post_output_path`, `suppress_subtree_translation_variants` |
| `BDS.Generation.Validation` | 445 | `compare_sitemap_to_html`, `plan_validation_paths`, `build_targeted_validation_plan`, `targeted_output?`, `prune_empty_parent_dirs`, post/lang timestamp checks |
| `BDS.Generation.Data` | 352 | `generation_data/2`, snapshot loaders, post-index builders, translation-lookup helpers |
| `BDS.Generation.Sitemap` | 280 | sitemap.xml, RSS/Atom feeds, calendar feed, hreflang link assembly |
| `BDS.Generation.Paths` | 262 | URL/route/path helpers, language prefixing, pagination math, archive routing |
| `BDS.Generation.Renderers` | 227 | Liquid template rendering wrappers (home, post, archive, date, list, 404) |
| `BDS.Generation.Progress` | 96 | Generation/validation progress callback helpers |
| `BDS.Generation.Pagefind` | 70 | Pagefind search-index input file emission |
| `BDS.Generation.GeneratedFileHash` | 23 | (pre-existing) hash-tracking schema |
Total: 2245 lines now live in focused submodules; the remaining 647 in `BDS.Generation` is the orchestrating `plan_generation/2`, `apply_validation/2`, `validate_site/3`, `write_generated_file`, and `delete_extra_validation_paths` — small enough to manage as a single coordinator.
**Refactor pattern used:** `import BDS.Generation.X, only: [...]` (or `except: [...]`) at the head of `BDS.Generation` so the hundreds of internal call sites needed no changes; `defdelegate` for any function that had to remain reachable through the public `BDS.Generation` namespace (e.g. `post_output_path/1,2`).
**Validation after each extraction:** `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
**Date-bug side fix:** `test/bds/maintenance_test.exs` had hardcoded `posts/2026/04/...` paths that worked only when the published-post setup happened to create the same year/month directory. With today's date in May, the orphan writes failed; added explicit `File.mkdir_p!` calls for the hardcoded fixture paths.
**Remaining work in this priority:**
-`BDS.Generation` — done (76% reduction, 647 lines remaining is acceptable for a coordinator).
- 🔄 `BDS.Desktop.ShellLive` (2607 → 1545, 41% reduction). Submodules extracted under `lib/bds/desktop/shell_live/`:
| Module | Lines | Responsibility |
|---|---|---|
| `TitlebarMenu` | 181 | Menu group definition, dropdown items, open/close/hover/keydown |
| `CliSync` | 133 | CLI watcher entity-change application + tab refresh |
| `PanelRenderer` | 290 | Tasks/output/post-links/git-log panel rendering + editor toolbar |
| `TabHelpers` | 99 | Tab title/subtitle/icon, route atom mapping, post/media labels |
| `TaskLocalization` | 80 | Task status localization, editor-meta translation |
| `ChatSurface` | 233 | Chat-surface action dispatch, assistant message helpers |
| `SidebarCreate` | 131 | New post/media/script/template/import sidebar creation |
| `Layout` | 53 | Sync-layout, resize-panel, parse-width, ignore-shortcut |
| `ShellCommandRunner` | 95 | `apply_shell_command` + `apply_result` dispatch |
| `SessionUtil` | 49 | Workbench-session restore, project-name picker, task-result tracking |
Coordinator (`shell_live.ex`) now 1545 lines containing only `mount/3`, `render/1`, `handle_event/3`, `handle_info/2` clauses, plus thin dispatchers and small editor-assign helpers.
-`BDS.Posts` (1781 → 569, 68% reduction). Submodules extracted under `lib/bds/posts/`:
| Module | Lines | Responsibility |
|---|---|---|
| `Slugs` | 86 | `slug_available/3`, `unique_slug_for_title/3`, `unique`, `unique_for_import`, `default_source` |
| `AutoTranslation` | 176 | `maybe_schedule/1`, missing-language detection, post + cascading media auto-translate task scheduling |
| `FileSync` | 146 | Post/translation relative-path computation, frontmatter serialization, body extraction, on-disk delete |
| `TranslationValidation` | 464 | `validate/2`, `fix_invalid/1`, invalid DB/FS issue classification, legacy report fields, canonical-language helpers, markdown-file recursion |
| `RebuildFromFiles` | 320 | `rebuild_posts_from_files/2`, `import_orphan_post_file/2`, `import_orphan_post_translation_file/2`, `parse_rebuild_file`, `upsert_post_from_file`, `upsert_post_from_rebuild_file`, `upsert_post_translation_from_rebuild_file`, `progress_callback/1`, `report_rebuild_started/3`, `report_rebuild_progress/4`, `parse_post_status`, `parse_translation_status` |
| `Translations` | 279 | `publish_post_translation/2`, `list_post_translations/1`, `upsert_post_translation/3`, `delete_post_translation/1`, `sync_post_translation_from_file/1`, `rewrite_published_post_translation/1`, `publish_translation/2`, `publish_post_translations/1`, `normalize_translation_updates`, `maybe_reopen_source_post_for_manual_translation` |
Public API on `BDS.Posts` preserved via `defdelegate` for: `slug_available/3`, `unique_slug_for_title/3`, `validate_translations/2`, `fix_invalid_translations/1`, `rebuild_posts_from_files/2`, `import_orphan_post_file/2`, `import_orphan_post_translation_file/2`, `publish_post_translation/2`, `list_post_translations/1`, `upsert_post_translation/3`, `delete_post_translation/1`, `sync_post_translation_from_file/1`, `rewrite_published_post_translation/1`. Remaining clusters in posts.ex are core CRUD (`create_post`, `update_post`, `publish_post`, `delete_post`, `archive_post`, `discard_post_changes`, `sync_post_from_file`, `rewrite_published_post`, `editor_body`), small stats (`dashboard_stats`, `post_counts_by_year_month`, ~40 lines extractable), and `rebuild_post_links` (~22 lines). Stats could be split next, but ~569 lines is a reasonable steady state.
-`BDS.AI` (1711 → 168, **90% reduction**). Submodules extracted under `lib/bds/ai/`:
| Module | Lines | Responsibility |
|---|---|---|
| `Chat` | 597 | Public chat API (`start_chat`, `list_chat_conversations`, `available_chat_models`, `set_conversation_model`, `list_chat_messages`, `send_chat_message`, `cancel_chat`) + chat round/tool-call orchestration, request building, message truncation, system prompt + project stats summary, conversation/message persistence, `count_distinct_string_list/3`, `normalize_usage/1` |
| `OneShot` | 382 | One-shot operations (`detect_language`, `analyze_taxonomy`, `analyze_import_taxonomy`, `analyze_post`, `translate_post`, `analyze_image`, `translate_media`) + per-op system/user prompt builders, JSON response extraction, post/media input normalization, taxonomy mapping filtering |
| `Catalog` | 306 | Model catalog API (`list_endpoint_models`, `refresh_model_catalog`, `list_catalog_providers`, `get_catalog_model`, `catalog_meta`, `put_model_capabilities`, `format_model`, `model_capabilities`, `decode_nullable_json`) + models.dev fetch, persistence, modality parsing |
| `ChatTools` | 271 | Chat tool dispatch (`execute/3`) for blog_stats, list_posts, list_media, render_table/chart/form/card/metric/list/tabs/mindmap; tool spec generation (`available_specs/2`) |
| `Runtime` | 100 | `model_preference_keys/0`, `resolve_target/2`, `validate_target/3`, `endpoint_with_model/2`, per-operation airplane/online resolution |
| `SettingsStore` | 78 | Setting/secret/catalog-meta storage helpers (`get_setting`, `put_setting`, `delete_setting`, `put_secret`, `get_secret`, `encrypted_key`, `secret_backend`, catalog meta) |
Public `BDS.AI` API preserved via `defdelegate` for all extracted operations. Remaining 168 lines hold endpoint storage (`put_endpoint`, `get_endpoint`, `delete_endpoint`), airplane mode (`set_airplane_mode`, `airplane_mode?`), model preferences (`put_model_preference`, `get_model_preference`), and the defdelegate facade.
-`BDS.Scripting.Capabilities` (1715 → 194, **89% reduction**). Submodules extracted under `lib/bds/scripting/capabilities/`:
| Module | Lines | Responsibility |
|---|---|---|
| `Util` | 301 | Sanitization, normalization, arity wrappers, optional-key map builders, datetime parsing, project-path lookup, shell-open helpers |
| `Posts` | 270 | All `posts.*` capabilities (CRUD, publishing, body/cover/excerpt, search, tags, categories, archive, restore, preview path, names-with-counts) |
| `Media` | 254 | All `media.*` capabilities (CRUD, upload, thumbnails, metadata, translations, search) |
| `Crud` | 284 | `scripts.*`, `templates.*`, `tags.*`, `tasks.*` CRUD/search/exec |
| `Projects` | 204 | Project CRUD, metadata read/write, sync-meta-on-startup, data paths, project-for-folder |
| `AppShell` | 134 | Clipboard, bookmarklet, title-bar metrics, renderer-ready, open/select folder, show-in-folder, trigger-menu-action, preview-target, test-mode/env detection |
| `Bridges` | 176 | Sync availability, repo state/status/history/fetch/pull/push/commit-all, upload-site, AI detect/analyze/translate (post + media), embeddings progress/find-similar/compute-similarities/suggest-tags/find-duplicates/dismiss-pair/index-unindexed |
Public `BDS.Scripting.Capabilities.for_project/2` contract preserved unchanged. Main file (194 lines) now holds only the capability-map assembly using `import` of all submodules.
-`BDS.MCP` (677).
---
## Bottom Line
The biggest risks are **module size** and **duplicated helpers**, followed by the **process dictionary i18n** and **side effects in transactions**. Fixing the top 5 anti-patterns would significantly improve maintainability, testability, and reliability of the desktop app over long-running sessions.