fix: added things around project folder pollution from program runs
This commit is contained in:
@@ -27,6 +27,7 @@ Gap categories: **SC** = spec correct, fix code | **CS** = code correct, update
|
||||
| A1-14b | ~~USearch HNSW ANN index + debounced persistence not implemented~~ | embedding.allium config/FindSimilar/DebouncedPersistence | `Embeddings.Index` is now an HNSW (hnswlib) ANN index with debounced persistence | **Resolved:** rewrote `Embeddings.Index` as a DB-free GenServer wrapping an hnswlib HNSW graph (cosine, M=16, efConstruction=128, efSearch=64) — O(n·log n) build, O(log n) queries, replacing the O(n²) JSON cosine snapshot; per-project in-memory index + `label→post_id` map; 5s debounced `save_index` + `.meta.json` sidecar, force-save on project switch (`set_active_project`) and shutdown (`terminate`), `forget/1` on project delete; lazy reload from disk with rebuild-from-DB self-heal on miss; `find_similar`/`find_duplicates`/`compute_similarities` rewired (no brute-force fallback); USearch has no Elixir binding so hnswlib provides the identical HNSW algorithm/params (spec reconciled); supervision + dialyzer PLT updated; tests updated for debounced/binary persistence + self-heal. Follow-up hardening: explicit rebuild now forces re-embedding regardless of content_hash (ReindexAll), and model-unavailable errors propagate cleanly (post saves degrade to unindexed + log; rebuild/index return `{:error, reason}` surfaced as a failed task with a user-facing message instead of crashing). |
|
||||
| A1-14c | ~~Embedding model runs on CPU only; no Apple GPU acceleration~~ | embedding.allium invariant NativeAcceleratedExecution | `Backends.Neural` now selects the defn compiler at serving-build time: Apple GPU via EMLX (MLX/Metal) on arm64 macOS, EXLA-CPU elsewhere | **Resolved:** added `{:emlx, "~> 0.2.0"}` dep (ships precompiled MLX binaries; EMLX 0.2.0 implements both `EMLX.Backend` and the `Nx.Defn.Compiler` behaviour, GPU-default); `Backends.Neural` gained a pure `select_accelerator/3` policy (`:auto` prefers EMLX only when available **and** on Apple Silicon; explicit `:emlx`/`:exla` honoured; forced `:emlx` degrades to EXLA when unavailable so misconfigured hosts still run), `current_accelerator/0`, and `defn_options/1`; `build_serving` places params on `{EMLX.Backend, device: :gpu}` and compiles with `EMLX` for the EMLX path, keeps `EXLA` otherwise; new `accelerator: :auto` config key; spec `NativeAcceleratedExecution` + `EmbeddingModel` updated; PLT app added; 7 tests added (offline — test config still uses the InApp stub). |
|
||||
| A1-15 | ~~Preview vs generation content source strategy undocumented~~ | preview.allium (no invariant), generation.allium (no invariant) | Generation uses only published .md file content (`Generation.Data` snapshots set `content: nil`); preview includes published+draft posts and prefers DB content over file (`Preview.Router` queries `:published`/`:draft`, uses `editor_body`) | **Resolved:** added `PreviewDraftOverlay` invariant to preview.allium and `GenerationPublishedOnly` invariant to generation.allium; both cross-reference each other; code already correct, 3 tests added for draft-in-preview behavior |
|
||||
| A1-16 | Public project content + data_path discovery not compliant with storage-location spec | project.allium `PublicContentLivesInProjectFolder` / `PrivateArtifactsLiveInOsAppDir` / `DataPathNotPersistedInProjectJson` / `DiscoverProjectDataPath` (newly added) | **Private side done:** `Projects.project_cache_root/0` now falls back to the OS private app dir (`:filename.basedir(:user_config, "bds")` → macOS `~/Library/Application Support/bds`) instead of `priv/data`, so the embeddings index no longer lands in the repo. **Still non-compliant (public side):** `project_data_dir/0` (projects.ex:97-99) falls back to `priv/data/projects/<id>` when `data_path` is nil, so the default project's *public* content (posts, media, templates, scripts, `meta/`, generated `html/`) is written into the application repo; there is no discovery of `data_path` from the `meta/project.json` location, and the `default` project is created with `data_path: nil` (projects.ex:80). | Implement project-folder discovery: `data_path` := the folder containing `meta/project.json` (never stored in project.json, keeping projects movable — `DiscoverProjectDataPath`); create the default project's folder at a per-user default content location on first launch (never in repo/private_dir); drop the `priv/data/projects/<id>` fallback in `project_data_dir/0`; persist the current project-folder location as a machine-local pointer (project registry) under `private_dir`. Migrate the committed `priv/data/projects/default/` content out of the repo. |
|
||||
|
||||
### A2. Spec Should Update (code is normative)
|
||||
|
||||
@@ -186,7 +187,8 @@ All reconciled to follow code. Specs must be self-consistent and match code.
|
||||
|
||||
## Priority Order for Resolution
|
||||
|
||||
1. ~~**A1-1 through A1-14c**~~ — all resolved: auto-save, on-demand preview, template lookup, validation gates, real Pagefind, graceful shutdown, real embedding model, HNSW ANN index, and Apple GPU/EMLX acceleration (A1-14c)
|
||||
1. ~~**A1-1 through A1-15**~~ — all resolved: auto-save, on-demand preview, template lookup, validation gates, real Pagefind, graceful shutdown, real embedding model, HNSW ANN index, Apple GPU/EMLX acceleration (A1-14c), and preview/generation content strategy (A1-15)
|
||||
1b. **A1-16** — storage-location compliance: private side done (embeddings index → OS app dir); public side open (data_path discovery from meta/project.json, drop the `priv/data/projects/<id>` fallback, migrate committed default project out of repo)
|
||||
2. **D1-1 through D1-18** — untested invariants/guarantees
|
||||
3. **C-1 through C-3** — internal spec inconsistencies (reconcile to code)
|
||||
4. **B1-1 through B1-6** — major code behaviors missing from spec
|
||||
|
||||
Reference in New Issue
Block a user