443 lines
26 KiB
Markdown
443 lines
26 KiB
Markdown
# Elixir Code Smell Analysis — bDS2
|
||
|
||
> Generated by static analysis of the codebase. Do not delete; use as a reference for refactoring sprints.
|
||
>
|
||
> **Updated 2026-04-30:** Each claim re-verified against current code. See "Verification Summary" and "Priority Order" sections at the bottom.
|
||
|
||
---
|
||
|
||
## Summary
|
||
|
||
The project is architecturally sound at a high level: contexts are well-defined, Ecto is used properly for schema validation, `with` blocks handle errors cleanly, and the GenServer implementations are mostly correct. However, **the codebase suffers from severe module bloat, excessive duplication of helper functions, risky process-dictionary usage, and several side-effect-in-transaction anti-patterns** that become maintenance liabilities in a desktop application expected to run for long sessions.
|
||
|
||
---
|
||
|
||
## Critical Anti-Patterns
|
||
|
||
### 1. Massive Module Bloat (Most Serious)
|
||
|
||
Several modules are far too large and violate the single-responsibility principle.
|
||
|
||
| Module | Lines | Issue |
|
||
|--------|-------|-------|
|
||
| `BDS.Desktop.ShellLive` | 2,607 | Handles UI events, routing, overlays, menus, project switching, tab management, translations, task polling, and native menu integration |
|
||
| `BDS.Posts` | 1,709 | Handles CRUD, publishing, file I/O, translations, auto-translation scheduling, embeddings sync, search sync, link rebuilding, and validation |
|
||
| `BDS.Generation` | 2,624 | Site generation, validation, sitemap building, archive pagination, feed rendering, and file hashing (verified larger than originally documented) |
|
||
| `BDS.MCP` | 670 | MCP tools, resources, proposals, post detail serialization, search, and category counting |
|
||
| `BDS.Media` | 939 | Media CRUD, thumbnail generation, sidecar I/O, post-media join management |
|
||
|
||
**Why this is an anti-pattern:** Giant modules reduce compiler concurrency, make refactoring dangerous, obscure test coverage, and increase merge conflicts. Elixir/OTP best practice is to keep modules under ~400 lines and split by responsibility.
|
||
|
||
**Fix:** Extract sub-domains into dedicated modules (e.g., `BDS.Posts.Publisher`, `BDS.Posts.Translations`, `BDS.Generation.SitemapBuilder`, `BDS.Desktop.ShellLive.EventRouter`).
|
||
|
||
---
|
||
|
||
### 2. Process Dictionary for I18n State
|
||
(and 13 other call sites across all `*_editor.ex` modules — 15 total)
|
||
In `BDS.Desktop.ShellLive`:
|
||
|
||
```elixir
|
||
def render(assigns) do
|
||
Process.put(:bds_ui_locale, assigns.page_language)
|
||
index(assigns)
|
||
end
|
||
```
|
||
|
||
And later:
|
||
|
||
```elixir
|
||
defp translated(text, bindings \\ %{}), do: ShellData.translate(text, bindings, Process.get(:bds_ui_locale))
|
||
```
|
||
|
||
**Why this is an anti-pattern:** The process dictionary is implicit global state. It makes functions harder to test in isolation and can leak state between concurrent operations in the same process. Elixir strongly discourages `Process.put/get` for business logic.
|
||
|
||
**Fix:** Pass `locale` explicitly through `assigns` to all translation call sites, or use a dedicated context struct.
|
||
|
||
---
|
||
|
||
### 3. Side Effects Inside Database Transactions
|
||
(7 transactions in `lib/bds/media.ex` follow this pattern)
|
||
In `BDS.Media.import_media/1`:
|
||
|
||
```elixir
|
||
Repo.transaction(fn ->
|
||
media = %Media{} |> Media.changeset(...) |> Repo.insert!()
|
||
:ok = File.mkdir_p(Path.dirname(destination))
|
||
:ok = File.cp(source_path, destination)
|
||
:ok = write_sidecar(project, media)
|
||
:ok = ensure_thumbnails(project, media)
|
||
:ok = Search.sync_media(media)
|
||
media
|
||
end)
|
||
```
|
||
|
||
**Why this is an anti-pattern:** If the transaction retries (or the DB connection times out in SQLite), filesystem and thumbnail side effects are duplicated or orphaned. Ecto transactions should contain **only** database operations.
|
||
|
||
**Fix:** Perform the DB insert in the transaction, then do filesystem work after the transaction succeeds.
|
||
|
||
---
|
||
|
||
### 4. `String.to_atom/1` on Untrusted Input
|
||
|
||
In `BDS.MCP`:
|
||
|
||
```elixir
|
||
defp atomize_keys(map) when is_map(map) do
|
||
Map.new(map, fn {key, value} -> {String.to_atom(key), value} end)
|
||
end
|
||
```
|
||
|
||
**Why this is an anti-pattern:** Atoms are not garbage collected. If MCP receives large or arbitrary JSON keys, the VM atom table can fill up and crash the BEAM.
|
||
|
||
**Fix:** Use `String.to_existing_atom/1` (with a safe fallback) or keep keys as strings.
|
||
Note: 10 other `String.to_atom` call sites exist, but they operate on bounded enums (modality, platform, view id) and are lower risk.
|
||
|
||
|
||
---
|
||
|
||
### 5. Bang Functions (`File.read!`) in Business Logic Without Rescue
|
||
|
||
Found in `BDS.Posts`, `BDS.Media`, `BDS.MCP`, `BDS.Generation`:
|
||
|
||
```elixir
|
||
defp parse_rebuild_file(project, path) do
|
||
contents = File.read!(path)
|
||
{:ok, %{fields: fields}} = Frontmatter.parse_document(contents)
|
||
...
|
||
end
|
||
```
|
||
|
||
**Why this is an anti-pattern:** Missing files or permission errors raise unhandled exceptions that crash the caller Limit the change to files reached from long-running processes; mandatory configuration files can stay as `File.read!`.. In a GenServer context (e.g., background tasks), this brings down the worker.
|
||
|
||
**Fix:** Use `File.read/1` and handle `{:error, reason}` explicitly, returning `{:error, reason}` tuples to callers.
|
||
|
||
---
|
||
|
||
### 6. Duplicate Helper Functions Across Contexts
|
||
|
||
The same private helpers are copy-pasted in `BDS.Posts`, `BDS.Media`, `BDS.Search`, `BDS.Generation`, `BDS.Publishing`, `BDS.MCP`:
|
||
|
||
- `attr/2` (atom/string key normalization)
|
||
- `maybe_put/3`
|
||
- `blank_to_nil/1`
|
||
- `progress_callback/1`
|
||
- `report_rebuild_started/3`, `report_rebuild_progress/4`
|
||
|
||
**Why this is an anti-pattern:** Violates DRY. When behavior needs to change (e.g., adding float progress support), every copy must be updated.
|
||
|
||
**Fix:** Extract to a shared utility module such as `BDS.MapUtils`, `BDS.AttrUtils`, or `BDS.ProgressReporter`.
|
||
|
||
---
|
||
|
||
### 7. Direct Repo Access in LiveView (10 sites verified)
|
||
|
||
`BDS.Desktop.ShellLive` directly queries the repo:
|
||
|
||
```elixir
|
||
case Repo.get(Post, post_id) do
|
||
%Post{} = post -> ...
|
||
end
|
||
```
|
||
|
||
**Why this is an anti-pattern:** LiveViews are the web layer; they should call context functions (`BDS.Posts.get_post/1`). Direct Repo access leaks persistence details into the UI and makes testing harder.
|
||
|
||
**Fix:** Replace all `Repo.get` calls in `ShellLive` with context function calls.
|
||
|
||
---
|
||
|
||
### 8. `String.to_existing_atom/1` on User Input Without Whitelist
|
||
|
||
In `ShellLive`:
|
||
|
||
```elixir
|
||
def handle_event("select_view", %{"view" => view_id}, socket) do
|
||
workbench = Workbench.click_activity(socket.assigns.workbench, String.to_existing_atom(view_id))
|
||
...
|
||
end
|
||
```
|
||
|
||
**Why this is an anti-pattern:** If a client sends a non-existent atom string, it raises `ArgumentError`. While safer than `to_atom`, it is still a crash vector.
|
||
|
||
**Fix:** Validate against a whitelist or map of known string-to-atom conversions before conversion.
|
||
|
||
---
|
||
|
||
### 9. Nested Exception Handling for Control Flow
|
||
|
||
In `ShellLive`:
|
||
|
||
```elixir
|
||
defp safe_existing_atom(action) when is_binary(action) do
|
||
String.to_existing_atom(action)
|
||
rescue
|
||
ArgumentError -> nil
|
||
end
|
||
```
|
||
|
||
**Why this is an anti-pattern:** Using exceptions for expected control flow is slower and harder to follow than explicit validation.
|
||
|
||
**Fix:** Maintain an explicit mapping of allowed actions, or use a `case` with `:erlang.binary_to_existing_atom/2` and catch the error tuple.
|
||
|
||
---
|
||
|
||
### 10. `Jason.decode!` Without Err and `BDS.AI` (HTTP-response decoding)or Handling
|
||
|
||
In `BDS.AI.OpenAICompatibleRuntime`:
|
||
|
||
```elixir
|
||
defp normalize_response(body) do
|
||
payload = Jason.decode!(body)
|
||
...
|
||
end
|
||
```
|
||
|
||
**Why this is an anti-pattern:** Malformed JSON from an externa Limit the change to HTTP-response decoding; decoding of our own on-disk files (`metadata.ex`, `embeddings/index.ex`, etc.) is acceptable.
|
||
|
||
---
|
||
|
||
## Moderate Concerns
|
||
|
||
### 11. Missing `@spec` Typespecs
|
||
|
||
Verified: only **10 `@spec` declarations across all of `lib/`**, and they are concentrated in `BDS.Scripting` / `BDS.Scripting.Lua`. Every other public context function relies on Dialyzer inference. For a project of this size, adding `@spec` to public APIs is the single highest-ROI type-safety improvement available — and pairs well with the existing `mix dialyzer --format short` check.
|
||
|
||
### 12. Tests Run Synchronously
|
||
|
||
Most tests use `use ExUnit.Case, async: false`. Many tests share the SQLite repo and named GenServers, so they cannot realistically be async. Limit conversion to pure-logic test files (parsers, formatters, etc.).
|
||
|
||
### 13. Raw SQL Overuse
|
||
|
||
19 `Repo.query` sites split into two groups:
|
||
|
||
1. **FTS5 virtual tables** in `BDS.Search` (`posts_fts`, `media_fts`, `MATCH`, `bm25`, `rank`). Ecto cannot express FTS5 queries cleanly. **Keep these raw.**
|
||
2. **`post_media` join table** queried by hand in `BDS.Posts`, `BDS.Media`, `BDS.Desktop.ShellLive.OverlayComponents`, `BDS.Desktop.ShellLive.PostEditor`. There is no `BDS.PostMedia` schema. This is the real type-safety hole; introducing the schema replaces 6–8 raw queries with Ecto
|
||
|
||
While SQLite FTS necessitates some raw SQL, many queries in `BDS.Search`, `BDS.Media`, and `BDS.Posts` use `Repo.query!/2` for standard operations that Ecto could express portably (e.g., `DELETE FROM ... WHERE ...`). This reduces type safety and database portability.
|
||
|
||
### 14. Atom/String Key Duality
|
||
|
||
Many functions normalize between atom and string keys repeatedly:
|
||
|
||
```elixir
|
||
Map.get(assigns, :language, Map.get(assigns, "language", ...))
|
||
```
|
||
|
||
This suggests data isn't normalized at boundaries. Prefer atoms for internal structs and strings only at external boundaries (JSON, HTTP params).
|
||
|
||
### 15. GenServer State Growth (Tasks)
|
||
|
||
`BDS.Tasks` stores all tasks forever in a single map. The `clear_completed`/`clear_finished` functions exist, but if the UI doesn't trigger them regularly in a long-running desktop app, memory grows unbounded. Consider TTL-based eviction.
|
||
|
||
---
|
||
|
||
## What the Project Does Well
|
||
|
||
- **Contexts are clearly separated** (`Posts`, `Media`, `Projects`, `Search`, etc.).
|
||
- **Ecto changesets** are used consistently for validation.
|
||
- **`with` blocks** handle multi-step error flows cleanly.
|
||
- **Pattern matching** is idiomatic and pervasive.
|
||
- **Task.Supervisor** is used correctly for background work.
|
||
- *Verification Summary (2026-04-30)
|
||
|
||
| # | Claim | Verified | Notes |
|
||
|---|---|---|---|
|
||
| 1 | Module bloat | ✅ | `Generation` is 2624 lines, worse than originally documented. |
|
||
| 2 | `Process.put(:bds_ui_locale)` | ✅ | 15 call sites across `shell_live.ex` and every `*_editor.ex`. |
|
||
| 3 | Side effects in transactions | ✅ | 7 transactions in `lib/bds/media.ex` wrap filesystem + Search side effects. |
|
||
| 4 | `String.to_atom` on untrusted input | ⚠️ Partly | Only `MCP.atomize_keys` is on arbitrary external JSON. Other 10 sites are bounded enums. |
|
||
| 5 | `File.read!` | ⚠️ Partly | Most sites read mandatory config; only loop/background paths are real risk. |
|
||
| 6 | Duplicated helpers | ✅ | Real DRY violation across contexts. |
|
||
| 7 | `Repo.get` in `ShellLive` | ✅ | 10 direct calls. |
|
||
| 8/9 | `to_existing_atom` + rescue | ⚠️ | The rescue effectively *is* the whitelist; cosmetic priority. |
|
||
| 10 | `Jason.decode!` on external responses | ⚠️ Partly | Only 2 of 14 sites decode HTTP responses; the rest decode our own files. |
|
||
| 11 | Missing `@spec` | ✅ | 10 specs in entire `lib/`. Highest type-safety ROI. |
|
||
| 12 | Tests synchronous | ✅ | Limited convertibility — most tests share repo/GenServers. |
|
||
| 13 | Raw SQL | ⚠️ Partly | FTS5 must stay raw; `post_media` table is the real fix target. |
|
||
|
||
---
|
||
|
||
## Priority Order
|
||
|
||
1. **Add `@spec` to public APIs of core contexts.** ✅ **DONE 2026-04-30.** See "Priority #1 Completion" section below.
|
||
2. **Extract filesystem / Search side effects out of `Repo.transaction` in `BDS.Media`.** ✅ **DONE 2026-04-30.** See "Priority #2 Completion" section below.
|
||
3. **Fix `MCP.atomize_keys`** to use `String.to_existing_atom/1` with a string-fallback. ✅ **DONE 2026-04-30.** See "Priority #3 Completion" section below.
|
||
4. **Introduce `BDS.PostMedia` Ecto schema** and migrate the 6–8 raw `post_media` queries. ✅ **DONE 2026-04-30.** See "Priority #4 Completion" section below.
|
||
5. **Module split.** `BDS.Generation` (2624) and `BDS.Desktop.ShellLive` (2607) first, then `BDS.AI` (1700+) and `BDS.Posts`. ✅ **PARTIAL 2026-04-30.** `BDS.Generation` reduced 2651 → 1873 (29%). See "Priority #5 Progress" section below.
|
||
6. **Replace `Repo.get` calls in `ShellLive`** with context functions (add new context functions where needed).
|
||
7. **Move locale from `Process.put` into assigns**, then ban `Process.put` via Credo.
|
||
8. **Extract shared helpers** (`attr/2`, `maybe_put/3`, `blank_to_nil/1`, `progress_callback/1`, rebuild progress reporters) into `BDS.MapUtils` / `BDS.ProgressReporter`.
|
||
9. **Wrap external `Jason.decode!` calls** in `BDS.AI.OpenAICompatibleRuntime` and `BDS.AI` with `Jason.decode/1` + `{:error, _}` propagation.
|
||
|
||
### Skipped / downgraded
|
||
|
||
- #5 in bulk — convert only paths reached from long-running processes.
|
||
- #8 / #9 — cosmetic; the rescue functions as a whitelist today.
|
||
- #12 — leave `async: false` where the repo or named GenServers are involved.
|
||
- #15 — bounded in practice; ensure UI triggers `clear_finished` periodically.
|
||
|
||
---
|
||
|
||
## Bottom Line
|
||
|
||
The biggest risks are **module size** and **duplicated helpers**, followed by the **process dictionary i18n** and **side effects in transactions**. The single highest-ROI improvement for type safety is **adding `@spec` to public context APIs** so Dialyzer can do its job — that is the starting point of the priority order above
|
||
|
||
---
|
||
|
||
## Priority #1 Completion (2026-04-30)
|
||
|
||
Added `@spec` and `@type` declarations to the public APIs of all core contexts and their schemas. Validation: `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
|
||
|
||
**Files modified (specs added):**
|
||
|
||
- `lib/bds/projects.ex`, `lib/bds/projects/project.ex`
|
||
- `lib/bds/posts.ex`, `lib/bds/posts/post.ex`, `lib/bds/posts/translation.ex`, `lib/bds/posts/link.ex`
|
||
- `lib/bds/media.ex`, `lib/bds/media/media.ex`, `lib/bds/media/translation.ex`
|
||
- `lib/bds/search.ex`
|
||
- `lib/bds/publishing.ex`, `lib/bds/publishing/publish_job.ex`
|
||
- `lib/bds/generation.ex`
|
||
- `lib/bds/metadata.ex`
|
||
- `lib/bds/mcp.ex`
|
||
- `lib/bds/ai.ex`
|
||
|
||
**Bugs surfaced and fixed by Dialyzer once specs were in place:**
|
||
|
||
- `BDS.Search.list_stemmer_languages/0` returns `[String.t()]`, not `[{String.t(), [String.t()]}]`.
|
||
- `BDS.Media.sync_media_sidecar/1` returns `:ok` (not `{:ok, t()}`); the `posts.ex` caller was already pattern-matching on `:ok`.
|
||
- `BDS.Media.replace_media_file/2` can return `{:ok, nil}` when the new file's checksum is unchanged.
|
||
- Removed unreachable `other -> {:error, other}` fall-through clauses in the auto-translate cascades in `BDS.Posts` (the preceding `{:error, reason}` pattern already covers the only remaining return shape).
|
||
- Removed unreachable `defp blank?(_value)` clause in `BDS.Desktop.ShellLive.ChatEditor` (prior clauses already covered `binary` and `nil`, no other types reach the function).
|
||
|
||
**Conventions established for future spec work:**
|
||
|
||
- Ecto schemas need explicit `@type t` — Ecto does not generate one.
|
||
- Use `term()` for `belongs_to` / `has_many` association fields to avoid circular type dependencies between sibling schemas.
|
||
- Use `@typedoc` + named types (e.g. `attrs`, `metadata_state`, `search_filters`, `reindex_opts`) to avoid repeating large map shapes across many specs in the same module.
|
||
- For `update_*` / attrs-style maps, the canonical type is `%{optional(atom()) => term(), optional(String.t()) => term()}` because both renderer-supplied string keys and Elixir atom keys flow into them.
|
||
|
||
**Not yet specced (intentional, out of scope of priority #1):**
|
||
|
||
- LiveView modules under `lib/bds/desktop/shell_live/`. These are Phoenix LiveView callbacks (`mount/3`, `handle_event/3`, `handle_info/2`) whose specs are inherited from the behaviour. Public helper functions in those modules can be specced as part of the eventual ShellLive module split (priority #9).
|
||
- Internal helpers in `BDS.MCP` and `BDS.AI` that are private (`defp`) — Dialyzer infers these.
|
||
- `BDS.Tags`, `BDS.Templates`, `BDS.Scripts`, `BDS.PostLinks` — smaller contexts; queue for a follow-up pass if Dialyzer surfaces issues.
|
||
|
||
---
|
||
|
||
## Priority #2 Completion (2026-04-30)
|
||
|
||
Refactored every `Repo.transaction/1` block in [lib/bds/media.ex](lib/bds/media.ex) so that only DB writes run inside the transaction. Filesystem writes (`File.cp`, `write_sidecar`, `write_translation_sidecar`, `ensure_thumbnails`, `delete_file_if_present`) and `Search.sync_media/1` calls now run *after* the transaction commits, so the SQLite write lock is released as fast as possible.
|
||
|
||
**Functions refactored:**
|
||
|
||
- `import_media/1` — copies the source file into place *before* the transaction (so a DB rollback can still observe a stale file), then runs only the `Repo.insert!` inside the transaction, then writes sidecar / thumbnails / search index. On DB failure the copied file is removed.
|
||
- `update_media/2` — DB update only inside the transaction; sidecar + search after.
|
||
- `upsert_media_translation/3` — DB insert/update only inside; sidecar + search after.
|
||
- `delete_media_translation/2` — DB delete only inside; sidecar deletion + search reindex + base sidecar rewrite after.
|
||
- `replace_media_file/2` — moves the existing destination to a `.bak` *before* the transaction, copies the new file in place, runs only the DB update inside, then refreshes sidecar/thumbnails/search and removes the backup. On DB failure the original file is restored from the backup.
|
||
- `link_media_to_post/2` and `unlink_media_from_post/2` — only the `post_media` raw INSERT/DELETE runs inside the transaction; sidecar rewrite happens after commit.
|
||
|
||
**Why this matters:**
|
||
|
||
SQLite has a single global write lock. Holding the transaction open while we copy files, regenerate Vix-backed thumbnails, and rebuild FTS5 indices makes that lock-hold time unbounded and proportional to image size. Other actors (the LiveView, background tasks, the CLI sync watcher) hit `:busy` retries that the existing `db_connection` busy-timeout cannot always cover. After this change the lock is held for milliseconds, regardless of file size.
|
||
|
||
**Trade-offs accepted:**
|
||
|
||
- The DB row and the filesystem are no longer atomically coupled. If the BEAM crashes between `Repo.insert!` and `write_sidecar/2`, the row exists without a sidecar. This is recoverable by the existing rebuild-from-database path (which re-emits sidecars), and is the same trade-off that exists everywhere else in the codebase that already runs side effects after transactions.
|
||
- `import_media/1` and `replace_media_file/2` use the inverse approach — file IO *before* transaction with explicit cleanup on rollback — because the file is the larger of the two side effects and the DB row is a pointer to it. This keeps the destination consistent on rollback.
|
||
|
||
**Spec correction surfaced by Dialyzer:**
|
||
|
||
- `BDS.Media.delete_media_translation/2` actually returns `{:ok, boolean()} | {:error, :not_found | term()}` (the `:ok` payload is `false` when no translation exists for the language, `true` when one was deleted). The previous spec advertised `{:ok, :deleted}`, which never matched any code path.
|
||
|
||
**Validation:** `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
|
||
|
||
---
|
||
|
||
## Priority #3 Completion (2026-04-30)
|
||
|
||
Removed `BDS.MCP.atomize_keys/1` entirely instead of just narrowing it to `String.to_existing_atom/1`. The function existed only to convert MCP-proposal JSON keys (attacker-controlled strings) into atoms before passing them to `Media.update_media/2` and `Posts.update_post/2`. Both contexts already accept string-keyed maps natively through their `attr/2` helper (`%{optional(atom()) => term(), optional(String.t()) => term()}`), so the conversion was both unnecessary and a textbook unbounded-atom-table DoS vector.
|
||
|
||
**Change:**
|
||
|
||
- Deleted `defp atomize_keys/1` from [lib/bds/mcp.ex](lib/bds/mcp.ex).
|
||
- The two `accept_proposal` call sites now pass `proposal.data["changes"] || %{}` straight through to `Media.update_media/2` and `Posts.update_post/2`.
|
||
|
||
**Why removal beats whitelisting:**
|
||
|
||
- `String.to_existing_atom/1` would still need a whitelist or a `try/rescue` wrapper, both of which add code without buying type safety — the downstream contexts already key on either form.
|
||
- Removing the function eliminates the only place in MCP that converts external JSON to atoms; the surface area for atom-table attacks via the MCP tool API is now zero.
|
||
- The `attr/2` helpers in `BDS.Posts` and `BDS.Media` are the *single* canonical place where attribute key normalization happens, which is exactly the invariant we want.
|
||
|
||
**Other `String.to_atom/1` call sites considered:**
|
||
|
||
The codebase has nine other `String.to_atom/1` call sites; all of them operate on bounded inputs (`Workbench` type names, release platforms, route IDs from the router config, `:supports_attachment` / `:supports_tool_calls` capability keys we ourselves wrote, `parse_modality` from a fixed AI catalog). None of them are reachable from attacker-controlled JSON the way `MCP.accept_proposal` was. They are safe to leave as a follow-up if a stricter sweep is wanted later.
|
||
|
||
**Validation:** `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
|
||
|
||
---
|
||
|
||
## Priority #4 Completion (2026-04-30)
|
||
|
||
Introduced [lib/bds/posts/post_media.ex](lib/bds/posts/post_media.ex) — a proper `Ecto.Schema` for the `post_media` join table — and migrated every raw SQL / string-table query in the codebase to use it.
|
||
|
||
**New schema:** `BDS.Posts.PostMedia` with fields `id`, `project_id`, `post_id`, `media_id`, `sort_order`, `created_at`, `belongs_to :post` and `belongs_to :media` associations, full `@type t`, and a `changeset/2` enforcing `validate_required` + `foreign_key_constraint` + `unique_constraint([:post_id, :media_id])`.
|
||
|
||
**Call sites migrated:**
|
||
|
||
- [lib/bds/media.ex](lib/bds/media.ex)
|
||
- `list_linked_posts/1` — replaced `join: post_media in "post_media"` string-table join with `join: pm in PostMedia`.
|
||
- `link_media_to_post/2` — replaced `Repo.query("SELECT 1 FROM post_media …")` with `Repo.exists?(from pm in PostMedia, …)` and `Repo.query("INSERT INTO post_media …")` with `%PostMedia{} |> changeset() |> Repo.insert!()` (uniqueness now enforced through the schema constraint).
|
||
- `unlink_media_from_post/2` — replaced `Repo.query("DELETE FROM post_media …")` with `Repo.delete_all(from pm in PostMedia, …)`.
|
||
- `linked_post_ids/1` — replaced raw `SELECT post_id` with `Repo.all(from pm in PostMedia, select: pm.post_id)`.
|
||
- `next_sort_order/1` — replaced raw `SELECT COALESCE(MAX(sort_order), -1)` with `Repo.one(from pm in PostMedia, select: max(pm.sort_order))` and a `value when is_integer(value)` guard for the empty-table case.
|
||
- [lib/bds/posts.ex](lib/bds/posts.ex)
|
||
- `linked_media_ids/1` — replaced raw `SELECT media_id` with `Repo.all(from pm in PostMedia, select: pm.media_id)`.
|
||
- [lib/bds/desktop/shell_live/post_editor.ex](lib/bds/desktop/shell_live/post_editor.ex)
|
||
- `linked_media/1` — replaced raw `SELECT media_id, sort_order` with a `Repo.all(from pm in PostMedia, select: {pm.media_id, pm.sort_order})` query.
|
||
- [lib/bds/desktop/shell_live/overlay_components.ex](lib/bds/desktop/shell_live/overlay_components.ex)
|
||
- `post_media_ids/1` — replaced raw `SELECT media_id` with `Repo.all(from pm in PostMedia, select: pm.media_id)`.
|
||
- `delete_details/2` — replaced the raw `SELECT posts.title FROM posts JOIN post_media ON …` with a typed Ecto join query (`from post in Post, join: pm in PostMedia, on: pm.post_id == post.id, …`).
|
||
|
||
**Why this matters:**
|
||
|
||
- All `post_media` access now goes through a typed schema, so every column reference is checked at compile time by Ecto and any future migration that renames a column will produce a `Postgrex/Sqlite` error at compile or test time instead of silently breaking at runtime.
|
||
- The unique `(post_id, media_id)` constraint is now enforceable via `Ecto.Changeset.unique_constraint/2`, which would let the next refactor turn `link_media_to_post/2` into an idempotent upsert without losing the protection.
|
||
- `Repo.query/2` is reserved for the few remaining cases that genuinely need raw SQL (FTS5 virtual tables in `BDS.Search`, which are not Ecto-mappable).
|
||
|
||
**Validation:** `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
|
||
|
||
---
|
||
|
||
## Priority #5 Progress (2026-04-30)
|
||
|
||
**Goal:** Split god modules. Started with the worst offender, `BDS.Generation` (2651 lines).
|
||
|
||
**Result:** `lib/bds/generation.ex` reduced **2651 → 1873 lines (29%)** by extracting six cohesive submodules under `lib/bds/generation/`:
|
||
|
||
| Module | Lines | Responsibility |
|
||
|---|---|---|
|
||
| `BDS.Generation.Paths` | 262 | URL/route/path helpers, language prefixing, pagination math, archive routing |
|
||
| `BDS.Generation.Sitemap` | 280 | sitemap.xml, RSS/Atom feeds, calendar feed, hreflang link assembly |
|
||
| `BDS.Generation.Renderers` | 227 | Liquid template rendering wrappers (home, post, archive, date, list, 404) |
|
||
| `BDS.Generation.Progress` | 96 | Generation/validation progress callback helpers |
|
||
| `BDS.Generation.Pagefind` | 70 | Pagefind search-index input file emission |
|
||
| `BDS.Generation.GeneratedFileHash` | 23 | (pre-existing) hash-tracking schema |
|
||
|
||
Total: 958 lines now live in focused submodules; the remaining 1873 in `BDS.Generation` is mostly the validation engine, output builders, and snapshot/data assembly — candidates for the next iteration.
|
||
|
||
**Refactor pattern used:** `import BDS.Generation.X, only: [...]` (or `except: [...]`) at the head of `BDS.Generation` so the hundreds of internal call sites needed no changes; `defdelegate` for any function that had to remain reachable through the public `BDS.Generation` namespace (e.g. `post_output_path/1,2`).
|
||
|
||
**Validation after each extraction:** `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
|
||
|
||
**Remaining work in this priority** (in suggested order of decreasing isolation):
|
||
|
||
1. `BDS.Generation.Outputs` — extract the `build_*_outputs/*` family and `build_validation_route_paths` (~600 lines).
|
||
2. `BDS.Generation.Data` — extract `generation_data/2`, snapshot loaders, post-index builders (~300 lines).
|
||
3. `BDS.Generation.Validation` — extract `compare_sitemap_to_html`, `classify_validation_path`, `build_targeted_validation_plan`, `delete_extra_validation_paths`, `write_ancillary_validation_outputs` (~600 lines). Most coupled — do last.
|
||
4. After `BDS.Generation`, repeat the pattern on `BDS.Desktop.ShellLive` (2607), `BDS.Posts` (1781), `BDS.AI` (1711), `BDS.MCP` (677).
|
||
|
||
---
|
||
|
||
## Bottom Line
|
||
|
||
The biggest risks are **module size** and **duplicated helpers**, followed by the **process dictionary i18n** and **side effects in transactions**. Fixing the top 5 anti-patterns would significantly improve maintainability, testability, and reliability of the desktop app over long-running sessions.
|