feat: added gemma 4 (not supported yet in mlx-swift-lm, though)
@@ -141,7 +141,7 @@ MLXServer/
 │ ├── ToolCallParser.swift — Parses tool calls from model output
 │ └── ToolPromptBuilder.swift — Model-specific tool prompt formatting
 └── Utilities/
- ├── LocalModelResolver.swift — Offline-first HuggingFace cache resolution (sandbox + system)
+ ├── LocalModelResolver.swift — Offline-first HuggingFace cache resolution
 ├── ChatExporter.swift — Export conversations to Markdown or RTF
 ├── FocusedValues.swift — FocusedValue keys for menu bar integration
 └── Preferences.swift — UserDefaults wrapper, including scene persistence
@@ -153,7 +153,7 @@ build.sh — One-command build script (xcodegen + xcodebuild)
 ## Key Design Decisions

 - Uses `mlx-swift-lm` for inference — `VLMModelFactory` for vision models and `LLMModelFactory` for text-only models
-- **Offline-first**: `LocalModelResolver` checks both the sandboxed app container and `~/.cache/huggingface/hub/` for locally-cached models before downloading
+- **Offline-first**: `LocalModelResolver` checks `~/.cache/huggingface/hub/` for locally-cached models before downloading
 - **No duplicate storage**: custom `HubApi` with blob cache disabled — models are stored once in the snapshot cache
 - **KV cache reuse** across API requests — reuses `ChatSession` when conversation history prefix matches
 - **Thinking mode**: `enable_thinking` passed via Jinja template context; `<think>` tags parsed in real-time during streaming
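
The "KV cache reuse" bullet describes a prefix match on conversation history. The app's actual `ChatSession` reuse logic is not part of this diff, so the following is only a minimal sketch of that idea. The `Message` type and the `reusableSuffix` function are hypothetical names introduced for illustration.

```swift
// Hypothetical message type; the app's real types are not shown in this diff.
struct Message: Equatable {
    let role: String
    let content: String
}

/// Returns the messages that still need to be fed to the model when the
/// cached history is a prefix of the incoming history, or nil when the
/// histories diverge and the cache cannot be reused.
func reusableSuffix(cached: [Message], incoming: [Message]) -> ArraySlice<Message>? {
    // An empty cache carries no state worth reusing.
    guard !cached.isEmpty else { return nil }
    // `starts(with:)` is false if `cached` is longer than `incoming`
    // or if any element differs.
    guard incoming.starts(with: cached) else { return nil }
    return incoming.dropFirst(cached.count)
}
```

Under this sketch, a `nil` result would mean starting a fresh session, while a non-nil result means only the returned tail has to be processed on top of the existing cache.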
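
The "Thinking mode" bullet mentions parsing `<think>` tags while tokens stream in. One difficulty with that is a tag arriving split across two chunks. The sketch below shows one way to handle it, assuming plain `<think>` and `</think>` delimiters with no nesting. `ThinkTagSplitter` and `partialTagSuffixLength` are hypothetical names, not the app's parser.

```swift
import Foundation

/// Length of the longest suffix of `text` that is a proper prefix of `tag`.
func partialTagSuffixLength(of text: String, tag: String) -> Int {
    let maxLen = min(text.count, tag.count - 1)
    guard maxLen > 0 else { return 0 }
    for len in stride(from: maxLen, through: 1, by: -1) {
        if tag.hasPrefix(String(text.suffix(len))) { return len }
    }
    return 0
}

/// Separates streamed text into thinking and answer parts (illustrative only).
struct ThinkTagSplitter {
    private(set) var thinking = ""
    private(set) var answer = ""
    private var buffer = ""
    private var insideThink = false

    mutating func feed(_ chunk: String) {
        buffer += chunk
        while true {
            let tag = insideThink ? "</think>" : "<think>"
            if let range = buffer.range(of: tag) {
                // Emit everything before the tag, then switch mode.
                emit(String(buffer[..<range.lowerBound]))
                buffer = String(buffer[range.upperBound...])
                insideThink.toggle()
            } else {
                // Hold back a trailing fragment that might be the start of a tag.
                let keep = partialTagSuffixLength(of: buffer, tag: tag)
                let cut = buffer.index(buffer.endIndex, offsetBy: -keep)
                emit(String(buffer[..<cut]))
                buffer = String(buffer[cut...])
                return
            }
        }
    }

    /// Flushes any held-back text once the stream has ended.
    mutating func finish() {
        emit(buffer)
        buffer = ""
    }

    private mutating func emit(_ text: String) {
        if insideThink { thinking += text } else { answer += text }
    }
}
```

Holding back only the possible tag fragment keeps output latency low: everything that cannot be part of a tag is emitted as soon as its chunk arrives.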