feat: added gemma 4 (not supported yet in mlx-swift-lm, though)

This commit is contained in:
2026-04-28 21:52:32 +02:00
parent d5b9ae15cc
commit 4ad46ec1ea
4 changed files with 208 additions and 31 deletions

View File

@@ -141,7 +141,7 @@ MLXServer/
│ ├── ToolCallParser.swift — Parses tool calls from model output
│ └── ToolPromptBuilder.swift — Model-specific tool prompt formatting
└── Utilities/
├── LocalModelResolver.swift — Offline-first HuggingFace cache resolution (sandbox + system)
├── LocalModelResolver.swift — Offline-first HuggingFace cache resolution
├── ChatExporter.swift — Export conversations to Markdown or RTF
├── FocusedValues.swift — FocusedValue keys for menu bar integration
└── Preferences.swift — UserDefaults wrapper, including scene persistence
@@ -153,7 +153,7 @@ build.sh — One-command build script (xcodegen + xcodebuild)
## Key Design Decisions
- Uses `mlx-swift-lm` for inference — `VLMModelFactory` for vision models and `LLMModelFactory` for text-only models
- **Offline-first**: `LocalModelResolver` checks both the sandboxed app container and `~/.cache/huggingface/hub/` for locally-cached models before downloading
- **Offline-first**: `LocalModelResolver` checks `~/.cache/huggingface/hub/` for locally-cached models before downloading
- **No duplicate storage**: custom `HubApi` with blob cache disabled — models are stored once in the snapshot cache
- **KV cache reuse** across API requests — reuses `ChatSession` when conversation history prefix matches
- **Thinking mode**: `enable_thinking` passed via Jinja template context; `<think>` tags parsed in real-time during streaming