chore: working on code smells

This commit is contained in:
2026-04-30 17:46:05 +02:00
parent 8358f9000e
commit a80ce7c845
18 changed files with 513 additions and 19 deletions

View File

@@ -1,6 +1,8 @@
# Elixir Code Smell Analysis — bDS2
> Generated by static analysis of the codebase. Do not delete; use as a reference for refactoring sprints.
>
> **Updated 2026-04-30:** Each claim re-verified against current code. See "Verification Summary" and "Priority Order" sections at the bottom.
---
@@ -18,10 +20,11 @@ Several modules are far too large and violate the single-responsibility principl
| Module | Lines | Issue |
|--------|-------|-------|
| `BDS.Desktop.ShellLive` | ~2,600 | Handles UI events, routing, overlays, menus, project switching, tab management, translations, task polling, and native menu integration |
| `BDS.Posts` | ~1,700 | Handles CRUD, publishing, file I/O, translations, auto-translation scheduling, embeddings sync, search sync, link rebuilding, and validation |
| `BDS.Generation` | ~1,500 | Site generation, validation, sitemap building, archive pagination, feed rendering, and file hashing |
| `BDS.MCP` | ~670 | MCP tools, resources, proposals, post detail serialization, search, and category counting |
| `BDS.Desktop.ShellLive` | 2,607 | Handles UI events, routing, overlays, menus, project switching, tab management, translations, task polling, and native menu integration |
| `BDS.Posts` | 1,709 | Handles CRUD, publishing, file I/O, translations, auto-translation scheduling, embeddings sync, search sync, link rebuilding, and validation |
| `BDS.Generation` | 2,624 | Site generation, validation, sitemap building, archive pagination, feed rendering, and file hashing (verified larger than originally documented) |
| `BDS.MCP` | 670 | MCP tools, resources, proposals, post detail serialization, search, and category counting |
| `BDS.Media` | 939 | Media CRUD, thumbnail generation, sidecar I/O, post-media join management |
**Why this is an anti-pattern:** Giant modules reduce compiler concurrency, make refactoring dangerous, obscure test coverage, and increase merge conflicts. Elixir/OTP best practice is to keep modules under ~400 lines and split by responsibility.
@@ -30,7 +33,7 @@ Several modules are far too large and violate the single-responsibility principl
---
### 2. Process Dictionary for I18n State
(and 13 other call sites across all `*_editor.ex` modules — 15 total)
In `BDS.Desktop.ShellLive`:
```elixir
@@ -53,7 +56,7 @@ defp translated(text, bindings \\ %{}), do: ShellData.translate(text, bindings,
---
### 3. Side Effects Inside Database Transactions
(7 transactions in `lib/bds/media.ex` follow this pattern)
In `BDS.Media.import_media/1`:
```elixir
@@ -87,6 +90,8 @@ end
**Why this is an anti-pattern:** Atoms are not garbage collected. If MCP receives large or arbitrary JSON keys, the VM atom table can fill up and crash the BEAM.
**Fix:** Use `String.to_existing_atom/1` (with a safe fallback) or keep keys as strings.
Note: 10 other `String.to_atom` call sites exist, but they operate on bounded enums (modality, platform, view id) and are lower risk.
---
@@ -102,7 +107,7 @@ defp parse_rebuild_file(project, path) do
end
```
**Why this is an anti-pattern:** Missing files or permission errors raise unhandled exceptions that crash the caller. In a GenServer context (e.g., background tasks), this brings down the worker.
**Why this is an anti-pattern:** Missing files or permission errors raise unhandled exceptions that crash the caller Limit the change to files reached from long-running processes; mandatory configuration files can stay as `File.read!`.. In a GenServer context (e.g., background tasks), this brings down the worker.
**Fix:** Use `File.read/1` and handle `{:error, reason}` explicitly, returning `{:error, reason}` tuples to callers.
@@ -124,7 +129,7 @@ The same private helpers are copy-pasted in `BDS.Posts`, `BDS.Media`, `BDS.Searc
---
### 7. Direct Repo Access in LiveView
### 7. Direct Repo Access in LiveView (10 sites verified)
`BDS.Desktop.ShellLive` directly queries the repo:
@@ -175,7 +180,7 @@ end
---
### 10. `Jason.decode!` Without Error Handling
### 10. `Jason.decode!` Without Err and `BDS.AI` (HTTP-response decoding)or Handling
In `BDS.AI.OpenAICompatibleRuntime`:
@@ -186,9 +191,7 @@ defp normalize_response(body) do
end
```
**Why this is an anti-pattern:** Malformed JSON from an external API will crash the caller.
**Fix:** Use `Jason.decode/1` and propagate `{:error, reason}`.
**Why this is an anti-pattern:** Malformed JSON from an externa Limit the change to HTTP-response decoding; decoding of our own on-disk files (`metadata.ex`, `embeddings/index.ex`, etc.) is acceptable.
---
@@ -196,14 +199,19 @@ end
### 11. Missing `@spec` Typespecs
Very few public functions have `@spec` declarations. For a project of this size, this makes Dialyzer less effective and the API surface harder to understand for new contributors.
Verified: only **10 `@spec` declarations across all of `lib/`**, and they are concentrated in `BDS.Scripting` / `BDS.Scripting.Lua`. Every other public context function relies on Dialyzer inference. For a project of this size, adding `@spec` to public APIs is the single highest-ROI type-safety improvement available — and pairs well with the existing `mix dialyzer --format short` check.
### 12. Tests Run Synchronously
All tests visible use `use ExUnit.Case, async: false`. For a large suite, this unnecessarily slows CI and local feedback loops. Use `async: true` where tests are isolated.
Most tests use `use ExUnit.Case, async: false`. Many tests share the SQLite repo and named GenServers, so they cannot realistically be async. Limit conversion to pure-logic test files (parsers, formatters, etc.).
### 13. Raw SQL Overuse
19 `Repo.query` sites split into two groups:
1. **FTS5 virtual tables** in `BDS.Search` (`posts_fts`, `media_fts`, `MATCH`, `bm25`, `rank`). Ecto cannot express FTS5 queries cleanly. **Keep these raw.**
2. **`post_media` join table** queried by hand in `BDS.Posts`, `BDS.Media`, `BDS.Desktop.ShellLive.OverlayComponents`, `BDS.Desktop.ShellLive.PostEditor`. There is no `BDS.PostMedia` schema. This is the real type-safety hole; introducing the schema replaces 68 raw queries with Ecto
While SQLite FTS necessitates some raw SQL, many queries in `BDS.Search`, `BDS.Media`, and `BDS.Posts` use `Repo.query!/2` for standard operations that Ecto could express portably (e.g., `DELETE FROM ... WHERE ...`). This reduces type safety and database portability.
### 14. Atom/String Key Duality
@@ -229,8 +237,88 @@ This suggests data isn't normalized at boundaries. Prefer atoms for internal str
- **`with` blocks** handle multi-step error flows cleanly.
- **Pattern matching** is idiomatic and pervasive.
- **Task.Supervisor** is used correctly for background work.
- **Atomic file writes** (`Persistence.atomic_write`) show good filesystem hygiene.
- **PubSub** is used appropriately for CLI sync notifications.
- *Verification Summary (2026-04-30)
| # | Claim | Verified | Notes |
|---|---|---|---|
| 1 | Module bloat | ✅ | `Generation` is 2624 lines, worse than originally documented. |
| 2 | `Process.put(:bds_ui_locale)` | ✅ | 15 call sites across `shell_live.ex` and every `*_editor.ex`. |
| 3 | Side effects in transactions | ✅ | 7 transactions in `lib/bds/media.ex` wrap filesystem + Search side effects. |
| 4 | `String.to_atom` on untrusted input | ⚠️ Partly | Only `MCP.atomize_keys` is on arbitrary external JSON. Other 10 sites are bounded enums. |
| 5 | `File.read!` | ⚠️ Partly | Most sites read mandatory config; only loop/background paths are real risk. |
| 6 | Duplicated helpers | ✅ | Real DRY violation across contexts. |
| 7 | `Repo.get` in `ShellLive` | ✅ | 10 direct calls. |
| 8/9 | `to_existing_atom` + rescue | ⚠️ | The rescue effectively *is* the whitelist; cosmetic priority. |
| 10 | `Jason.decode!` on external responses | ⚠️ Partly | Only 2 of 14 sites decode HTTP responses; the rest decode our own files. |
| 11 | Missing `@spec` | ✅ | 10 specs in entire `lib/`. Highest type-safety ROI. |
| 12 | Tests synchronous | ✅ | Limited convertibility — most tests share repo/GenServers. |
| 13 | Raw SQL | ⚠️ Partly | FTS5 must stay raw; `post_media` table is the real fix target. |
---
## Priority Order
1. **Add `@spec` to public APIs of core contexts.****DONE 2026-04-30.** See "Priority #1 Completion" section below.
2. **Extract filesystem / Search side effects out of `Repo.transaction` in `BDS.Media`.** Real correctness bug under SQLite busy retries.
3. **Fix `MCP.atomize_keys`** to use `String.to_existing_atom/1` with a string-fallback. Removes a real atom-table DoS vector.
4. **Introduce `BDS.PostMedia` Ecto schema** and migrate the 68 raw `post_media` queries. Direct type-safety win.
5. **Replace `Repo.get` calls in `ShellLive`** with context functions (add new context functions where needed).
6. **Move locale from `Process.put` into assigns**, then ban `Process.put` via Credo.
7. **Extract shared helpers** (`attr/2`, `maybe_put/3`, `blank_to_nil/1`, `progress_callback/1`, rebuild progress reporters) into `BDS.MapUtils` / `BDS.ProgressReporter`.
8. **Wrap external `Jason.decode!` calls** in `BDS.AI.OpenAICompatibleRuntime` and `BDS.AI` with `Jason.decode/1` + `{:error, _}` propagation.
9. **Module split.** `BDS.Generation` (2624) and `BDS.Desktop.ShellLive` (2607) first; schedule each as its own sprint.
### Skipped / downgraded
- #5 in bulk — convert only paths reached from long-running processes.
- #8 / #9 — cosmetic; the rescue functions as a whitelist today.
- #12 — leave `async: false` where the repo or named GenServers are involved.
- #15 — bounded in practice; ensure UI triggers `clear_finished` periodically.
---
## Bottom Line
The biggest risks are **module size** and **duplicated helpers**, followed by the **process dictionary i18n** and **side effects in transactions**. The single highest-ROI improvement for type safety is **adding `@spec` to public context APIs** so Dialyzer can do its job — that is the starting point of the priority order above
---
## Priority #1 Completion (2026-04-30)
Added `@spec` and `@type` declarations to the public APIs of all core contexts and their schemas. Validation: `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
**Files modified (specs added):**
- `lib/bds/projects.ex`, `lib/bds/projects/project.ex`
- `lib/bds/posts.ex`, `lib/bds/posts/post.ex`, `lib/bds/posts/translation.ex`, `lib/bds/posts/link.ex`
- `lib/bds/media.ex`, `lib/bds/media/media.ex`, `lib/bds/media/translation.ex`
- `lib/bds/search.ex`
- `lib/bds/publishing.ex`, `lib/bds/publishing/publish_job.ex`
- `lib/bds/generation.ex`
- `lib/bds/metadata.ex`
- `lib/bds/mcp.ex`
- `lib/bds/ai.ex`
**Bugs surfaced and fixed by Dialyzer once specs were in place:**
- `BDS.Search.list_stemmer_languages/0` returns `[String.t()]`, not `[{String.t(), [String.t()]}]`.
- `BDS.Media.sync_media_sidecar/1` returns `:ok` (not `{:ok, t()}`); the `posts.ex` caller was already pattern-matching on `:ok`.
- `BDS.Media.replace_media_file/2` can return `{:ok, nil}` when the new file's checksum is unchanged.
- Removed unreachable `other -> {:error, other}` fall-through clauses in the auto-translate cascades in `BDS.Posts` (the preceding `{:error, reason}` pattern already covers the only remaining return shape).
- Removed unreachable `defp blank?(_value)` clause in `BDS.Desktop.ShellLive.ChatEditor` (prior clauses already covered `binary` and `nil`, no other types reach the function).
**Conventions established for future spec work:**
- Ecto schemas need explicit `@type t` — Ecto does not generate one.
- Use `term()` for `belongs_to` / `has_many` association fields to avoid circular type dependencies between sibling schemas.
- Use `@typedoc` + named types (e.g. `attrs`, `metadata_state`, `search_filters`, `reindex_opts`) to avoid repeating large map shapes across many specs in the same module.
- For `update_*` / attrs-style maps, the canonical type is `%{optional(atom()) => term(), optional(String.t()) => term()}` because both renderer-supplied string keys and Elixir atom keys flow into them.
**Not yet specced (intentional, out of scope of priority #1):**
- LiveView modules under `lib/bds/desktop/shell_live/`. These are Phoenix LiveView callbacks (`mount/3`, `handle_event/3`, `handle_info/2`) whose specs are inherited from the behaviour. Public helper functions in those modules can be specced as part of the eventual ShellLive module split (priority #9).
- Internal helpers in `BDS.MCP` and `BDS.AI` that are private (`defp`) — Dialyzer infers these.
- `BDS.Tags`, `BDS.Templates`, `BDS.Scripts`, `BDS.PostLinks` — smaller contexts; queue for a follow-up pass if Dialyzer surfaces issues.
---