fix: work on step 12

This commit is contained in:
2026-04-30 16:55:00 +02:00
parent a6033cb86a
commit 8358f9000e
15 changed files with 893 additions and 192 deletions

239
CODESMELL.md Normal file
View File

@@ -0,0 +1,239 @@
# Elixir Code Smell Analysis — bDS2
> Generated by static analysis of the codebase. Do not delete; use as a reference for refactoring sprints.
---
## Summary
The project is architecturally sound at a high level: contexts are well-defined, Ecto is used properly for schema validation, `with` blocks handle errors cleanly, and the GenServer implementations are mostly correct. However, **the codebase suffers from severe module bloat, excessive duplication of helper functions, risky process-dictionary usage, and several side-effect-in-transaction anti-patterns** that become maintenance liabilities in a desktop application expected to run for long sessions.
---
## Critical Anti-Patterns
### 1. Massive Module Bloat (Most Serious)
Several modules are far too large and violate the single-responsibility principle.
| Module | Lines | Issue |
|--------|-------|-------|
| `BDS.Desktop.ShellLive` | ~2,600 | Handles UI events, routing, overlays, menus, project switching, tab management, translations, task polling, and native menu integration |
| `BDS.Posts` | ~1,700 | Handles CRUD, publishing, file I/O, translations, auto-translation scheduling, embeddings sync, search sync, link rebuilding, and validation |
| `BDS.Generation` | ~1,500 | Site generation, validation, sitemap building, archive pagination, feed rendering, and file hashing |
| `BDS.MCP` | ~670 | MCP tools, resources, proposals, post detail serialization, search, and category counting |
**Why this is an anti-pattern:** Giant modules reduce compiler concurrency, make refactoring dangerous, obscure test coverage, and increase merge conflicts. Elixir/OTP best practice is to keep modules under ~400 lines and split by responsibility.
**Fix:** Extract sub-domains into dedicated modules (e.g., `BDS.Posts.Publisher`, `BDS.Posts.Translations`, `BDS.Generation.SitemapBuilder`, `BDS.Desktop.ShellLive.EventRouter`).
---
### 2. Process Dictionary for I18n State
In `BDS.Desktop.ShellLive`:
```elixir
def render(assigns) do
Process.put(:bds_ui_locale, assigns.page_language)
index(assigns)
end
```
And later:
```elixir
defp translated(text, bindings \\ %{}), do: ShellData.translate(text, bindings, Process.get(:bds_ui_locale))
```
**Why this is an anti-pattern:** The process dictionary is implicit global state. It makes functions harder to test in isolation and can leak state between concurrent operations in the same process. Elixir strongly discourages `Process.put/get` for business logic.
**Fix:** Pass `locale` explicitly through `assigns` to all translation call sites, or use a dedicated context struct.
---
### 3. Side Effects Inside Database Transactions
In `BDS.Media.import_media/1`:
```elixir
Repo.transaction(fn ->
media = %Media{} |> Media.changeset(...) |> Repo.insert!()
:ok = File.mkdir_p(Path.dirname(destination))
:ok = File.cp(source_path, destination)
:ok = write_sidecar(project, media)
:ok = ensure_thumbnails(project, media)
:ok = Search.sync_media(media)
media
end)
```
**Why this is an anti-pattern:** If the transaction retries (or the DB connection times out in SQLite), filesystem and thumbnail side effects are duplicated or orphaned. Ecto transactions should contain **only** database operations.
**Fix:** Perform the DB insert in the transaction, then do filesystem work after the transaction succeeds.
---
### 4. `String.to_atom/1` on Untrusted Input
In `BDS.MCP`:
```elixir
defp atomize_keys(map) when is_map(map) do
Map.new(map, fn {key, value} -> {String.to_atom(key), value} end)
end
```
**Why this is an anti-pattern:** Atoms are not garbage collected. If MCP receives large or arbitrary JSON keys, the VM atom table can fill up and crash the BEAM.
**Fix:** Use `String.to_existing_atom/1` (with a safe fallback) or keep keys as strings.
---
### 5. Bang Functions (`File.read!`) in Business Logic Without Rescue
Found in `BDS.Posts`, `BDS.Media`, `BDS.MCP`, `BDS.Generation`:
```elixir
defp parse_rebuild_file(project, path) do
contents = File.read!(path)
{:ok, %{fields: fields}} = Frontmatter.parse_document(contents)
...
end
```
**Why this is an anti-pattern:** Missing files or permission errors raise unhandled exceptions that crash the caller. In a GenServer context (e.g., background tasks), this brings down the worker.
**Fix:** Use `File.read/1` and handle `{:error, reason}` explicitly, returning `{:error, reason}` tuples to callers.
---
### 6. Duplicate Helper Functions Across Contexts
The same private helpers are copy-pasted in `BDS.Posts`, `BDS.Media`, `BDS.Search`, `BDS.Generation`, `BDS.Publishing`, `BDS.MCP`:
- `attr/2` (atom/string key normalization)
- `maybe_put/3`
- `blank_to_nil/1`
- `progress_callback/1`
- `report_rebuild_started/3`, `report_rebuild_progress/4`
**Why this is an anti-pattern:** Violates DRY. When behavior needs to change (e.g., adding float progress support), every copy must be updated.
**Fix:** Extract to a shared utility module such as `BDS.MapUtils`, `BDS.AttrUtils`, or `BDS.ProgressReporter`.
---
### 7. Direct Repo Access in LiveView
`BDS.Desktop.ShellLive` directly queries the repo:
```elixir
case Repo.get(Post, post_id) do
%Post{} = post -> ...
end
```
**Why this is an anti-pattern:** LiveViews are the web layer; they should call context functions (`BDS.Posts.get_post/1`). Direct Repo access leaks persistence details into the UI and makes testing harder.
**Fix:** Replace all `Repo.get` calls in `ShellLive` with context function calls.
---
### 8. `String.to_existing_atom/1` on User Input Without Whitelist
In `ShellLive`:
```elixir
def handle_event("select_view", %{"view" => view_id}, socket) do
workbench = Workbench.click_activity(socket.assigns.workbench, String.to_existing_atom(view_id))
...
end
```
**Why this is an anti-pattern:** If a client sends a non-existent atom string, it raises `ArgumentError`. While safer than `to_atom`, it is still a crash vector.
**Fix:** Validate against a whitelist or map of known string-to-atom conversions before conversion.
---
### 9. Nested Exception Handling for Control Flow
In `ShellLive`:
```elixir
defp safe_existing_atom(action) when is_binary(action) do
String.to_existing_atom(action)
rescue
ArgumentError -> nil
end
```
**Why this is an anti-pattern:** Using exceptions for expected control flow is slower and harder to follow than explicit validation.
**Fix:** Maintain an explicit mapping of allowed actions, or use a `case` with `:erlang.binary_to_existing_atom/2` and catch the error tuple.
---
### 10. `Jason.decode!` Without Error Handling
In `BDS.AI.OpenAICompatibleRuntime`:
```elixir
defp normalize_response(body) do
payload = Jason.decode!(body)
...
end
```
**Why this is an anti-pattern:** Malformed JSON from an external API will crash the caller.
**Fix:** Use `Jason.decode/1` and propagate `{:error, reason}`.
---
## Moderate Concerns
### 11. Missing `@spec` Typespecs
Very few public functions have `@spec` declarations. For a project of this size, this makes Dialyzer less effective and the API surface harder to understand for new contributors.
### 12. Tests Run Synchronously
All tests visible use `use ExUnit.Case, async: false`. For a large suite, this unnecessarily slows CI and local feedback loops. Use `async: true` where tests are isolated.
### 13. Raw SQL Overuse
While SQLite FTS necessitates some raw SQL, many queries in `BDS.Search`, `BDS.Media`, and `BDS.Posts` use `Repo.query!/2` for standard operations that Ecto could express portably (e.g., `DELETE FROM ... WHERE ...`). This reduces type safety and database portability.
### 14. Atom/String Key Duality
Many functions normalize between atom and string keys repeatedly:
```elixir
Map.get(assigns, :language, Map.get(assigns, "language", ...))
```
This suggests data isn't normalized at boundaries. Prefer atoms for internal structs and strings only at external boundaries (JSON, HTTP params).
### 15. GenServer State Growth (Tasks)
`BDS.Tasks` stores all tasks forever in a single map. The `clear_completed`/`clear_finished` functions exist, but if the UI doesn't trigger them regularly in a long-running desktop app, memory grows unbounded. Consider TTL-based eviction.
---
## What the Project Does Well
- **Contexts are clearly separated** (`Posts`, `Media`, `Projects`, `Search`, etc.).
- **Ecto changesets** are used consistently for validation.
- **`with` blocks** handle multi-step error flows cleanly.
- **Pattern matching** is idiomatic and pervasive.
- **Task.Supervisor** is used correctly for background work.
- **Atomic file writes** (`Persistence.atomic_write`) show good filesystem hygiene.
- **PubSub** is used appropriately for CLI sync notifications.
---
## Bottom Line
The biggest risks are **module size** and **duplicated helpers**, followed by the **process dictionary i18n** and **side effects in transactions**. Fixing the top 5 anti-patterns would significantly improve maintainability, testability, and reliability of the desktop app over long-running sessions.