test: D1-11 cover ChatContextTruncation invariant in chat requests

2026-05-30 09:08:51 +02:00
parent 8db7bcf357
commit d688c61b0e
2 changed files with 105 additions and 1 deletion
--- a/SPECGAPS.md
+++ b/SPECGAPS.md
@@ -125,7 +125,7 @@ All reconciled to follow code. Specs must be self-consistent and match code.
 | D1-8 | ~~MacroTimeout guarantee~~ | script.allium:94-95 | **Resolved:** added test in `api_test.exs` — an infinite-loop `render()` macro run with `max_reductions: :none` (forces the luerl sandbox onto its wall-clock path) and a 150ms `timeout` returns `{:error, :timeout}` and terminates within budget (<2s), proving the macro is killed near its budget rather than the default multi-minute script timeout |
 | D1-9 | ~~ExecuteTransform rule (pipeline, ordering, toast budget)~~ | script.allium:229-263 | **Resolved:** the `ExecuteTransform` rule had no engine — added `BDS.Scripts.Transforms.run/3` (+ `Scripts.list_transform_scripts/1` ordered by updated_at→slug→id and `Scripts.resolved_content/1`). The pipeline runs enabled project transforms sequentially on the blogmark candidate with a `{source="blogmark", url}` context, captures per-script errors without rolling back the last valid candidate (TransformPipelineContinuation), and enforces the toast budget (`transform_max_toasts_per_script`/`transform_max_toasts_total`/`transform_max_toast_length`, new config keys). 6 tests added (ordering, project/disabled scoping, continuation, context, per-script + total toast caps with truncation). Deep-link OS routing into this engine remains future work. |
 | D1-10 | ~~TransformPipelineContinuation~~ | script.allium:247-249 | **Resolved:** added focused test in `transforms_test.exs` — a failing *first* transform (no prior valid state) does not halt the pipeline: the original input survives, a later enabled transform still runs against it, and every failure is captured per-script in pipeline order tagged with its slug |
-| D1-11 | ChatContextTruncation invariant | ai.allium:375-379 | Write test: long chat history trimmed to context window |
+| D1-11 | ~~ChatContextTruncation invariant~~ | ai.allium:375-379 | **Resolved:** test added in `ai_test.exs` — a catalog model with a 2,000-token context window plus 40 large seeded turns forces truncation; the captured chat request keeps the system prompt as the first message, drops the oldest pairs first (surviving markers form a contiguous newest suffix, oldest absent), and always retains the newest user turn |
 | D1-12 | BoundedToolLoop enforcement | ai.allium:381-385 | Write test: tool rounds bounded by chat_max_tool_rounds |
 | D1-13 | DiscardPostChangesSideEffects | engine_side_effects.allium:99-104 | Write test: FTS updated after discard |
 | D1-14 | ReplaceMediaFileSideEffects | engine_side_effects.allium:128-134 | Write test: file replaced, thumbnails regenerated |