feat: first attempts at a save dialog (not working yet)
33
CLAUDE.md
@@ -1,6 +1,6 @@
 # MLX Server
 
-Native macOS SwiftUI app for local LLMs on Apple Silicon via MLX. Provides a chat UI and an embedded OpenAI-compatible API server. Supports vision and tool use.
+Native macOS SwiftUI app for local LLMs on Apple Silicon via MLX. Provides a chat UI and an embedded OpenAI-compatible API server. Supports vision, tool use, and thinking mode.
 
 ## Quick Start
 
@@ -14,18 +14,24 @@ open "build/Debug/MLX Server.app"
 
 ## Project Structure
 
-- `MLXServer/MLXServerApp.swift` — App entry point, GPU cache config
-- `MLXServer/ContentView.swift` — Main layout, toolbar, keyboard shortcuts
+- `MLXServer/MLXServerApp.swift` — App entry point, GPU cache config, menu commands
+- `MLXServer/ContentView.swift` — Main layout, toolbar, keyboard shortcuts, focused values
 - `MLXServer/Models/ModelConfig.swift` — Model definitions (alias, repoId, contextLength), resolution
-- `MLXServer/Models/ChatMessage.swift` — Chat message data model
-- `MLXServer/ViewModels/ModelManager.swift` — Model loading/switching via VLMModelFactory, offline-first resolution
+- `MLXServer/Models/ChatMessage.swift` — Chat message data model, `<think>` tag parsing
+- `MLXServer/ViewModels/ModelManager.swift` — Model loading/switching via VLMModelFactory, download tracking, idle unload
 - `MLXServer/ViewModels/ChatViewModel.swift` — Chat state, ChatSession management, API server lifecycle
 - `MLXServer/Server/APIServer.swift` — NWListener HTTP server, SSE streaming, KV cache reuse, vision, tool call handling
 - `MLXServer/Server/APIModels.swift` — OpenAI-compatible Codable structs
 - `MLXServer/Server/ToolCallParser.swift` — Parses tool calls from model output (Gemma tool_code, Qwen XML tags)
 - `MLXServer/Server/ToolPromptBuilder.swift` — Model-specific tool prompt formatting
-- `MLXServer/Utilities/LocalModelResolver.swift` — Resolves HF repo IDs to ~/.cache/huggingface/hub/ snapshots
-- `MLXServer/Utilities/Preferences.swift` — UserDefaults wrapper
+- `MLXServer/Views/DownloadModalView.swift` — Modal overlay for model download progress
+- `MLXServer/Views/ChatMessagesView.swift` — Message bubbles with markdown rendering and collapsible thinking blocks
+- `MLXServer/Views/ChatInputView.swift` — Text input, image attach (file picker, drag & drop, Finder copy-paste)
+- `MLXServer/Commands/SaveChatCommands.swift` — File > Export Chat menu command
+- `MLXServer/Utilities/LocalModelResolver.swift` — Resolves HF repo IDs to local snapshots (sandbox + system cache + flat layouts)
+- `MLXServer/Utilities/ChatExporter.swift` — Export conversations to Markdown or RTF (Pages-compatible)
+- `MLXServer/Utilities/FocusedValues.swift` — FocusedValue keys for menu bar integration
+- `MLXServer/Utilities/Preferences.swift` — UserDefaults wrapper (model, thinking mode, API, idle timeout)
 - `project.yml` — xcodegen project spec
 - `build.sh` — Build script (xcodegen + xcodebuild)
 
@@ -35,6 +41,9 @@ open "build/Debug/MLX Server.app"
 |-------|---------------|-------|
 | `gemma` | `mlx-community/gemma-3-4b-it-4bit` | Vision + tool use via `tool_code` blocks (128k context) |
 | `qwen` | `mlx-community/Qwen3-VL-4B-Instruct-4bit` | Vision + tool use via `<tool_call>` tags (256k context) |
+| `qwen3.5-9b` | `mlx-community/Qwen3.5-9B-4bit` | Thinking mode, tool use (256k context) |
 
+Any model in MLX format on HuggingFace can be added — no restriction on uploader or architecture.
+
 ## Critical Performance Rule
 
@@ -47,9 +56,15 @@ open "build/Debug/MLX Server.app"
 
 ## Key Design Decisions
 
-- Uses `mlx-swift-lm` (`MLXVLM` / `VLMModelFactory`) as the inference backend — supports both text and vision in a single model load
+- Uses `mlx-swift-lm` (`MLXVLM` / `VLMModelFactory`) as the inference backend — loads any MLX-format model from HuggingFace
 - Model-specific prompt formatting: Gemma uses `tool_code` blocks; Qwen uses `<tool_call>` XML tags
-- Offline-first: if the model is already cached locally (~/.cache/huggingface/hub/), `LocalModelResolver` resolves the local snapshot path directly — no network requests
+- **Offline-first**: `LocalModelResolver` checks the sandboxed app container, system `~/.cache/huggingface/hub/`, and flat download layouts — no network requests if model is cached
+- **No duplicate storage**: custom `HubApi(cache: nil)` with explicit `downloadBase` — models stored once in the snapshot cache, not duplicated across blob cache and snapshots
+- **Thinking mode**: `enable_thinking` passed to Jinja template context via `additionalContext`; `<think>...</think>` tags parsed in real-time during streaming and shown in collapsible UI blocks. Toggleable in Settings.
+- **Download progress**: separate `isDownloading` state from `isLoading`; modal overlay shows file count, percentage, speed
+- **Idle unload**: timer resets on both user input and model generation completion (not just request start)
+- **Chat export**: Markdown (user messages as blockquotes) and RTF (Pages-compatible with formatted markdown)
+- **Finder paste**: local event monitor intercepts Cmd+V to check pasteboard for image file URLs before TextField handles it
 - HTTP server built on `Network.framework` (`NWListener`) — no third-party server dependencies
 - KV cache reuse across API requests — reuses `ChatSession` when conversation history prefix matches
 - GPU cache limit set to 20 MB; cache cleared on model unload
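The thinking-mode entry in the CLAUDE.md changes says that `<think>...</think>` tags are parsed while the response streams. As a rough illustration only (per CLAUDE.md the app's real parser lives in `ChatMessage.swift`, which is not part of this excerpt), the sketch below re-splits the accumulated output after every chunk. The function name `splitThinking` and its return shape are assumptions made for this example.

```swift
import Foundation

/// Hypothetical helper, not the app's actual parser: splits the output
/// accumulated so far into reasoning text and visible answer text.
func splitThinking(_ text: String) -> (thinking: String, answer: String) {
    let openTag = "<think>"
    let closeTag = "</think>"
    var thinking = ""
    var answer = ""
    var rest = Substring(text)

    while let start = rest.range(of: openTag) {
        // Everything before the opening tag belongs to the visible answer.
        answer += rest[..<start.lowerBound]
        let afterOpen = rest[start.upperBound...]
        if let end = afterOpen.range(of: closeTag) {
            thinking += afterOpen[..<end.lowerBound]
            rest = afterOpen[end.upperBound...]
        } else {
            // Block not closed yet: treat the tail as reasoning in progress.
            thinking += afterOpen
            rest = ""
        }
    }
    answer += rest
    return (thinking, answer)
}
```

Calling this on the whole buffer after each chunk keeps the logic stateless, and an unterminated `<think>` is treated as reasoning still in progress. A tag cut in half by a chunk boundary (for example `<thi`) would show up as answer text until the next chunk arrives, which a production parser would likely want to guard against.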
@@ -10,9 +10,12 @@
 0168AEE16009097901363E16 /* ModelManager.swift in Sources */ = {isa = PBXBuildFile; fileRef = 922CBDC9206737BD04AF2874 /* ModelManager.swift */; };
 165E8AB6ADAE1D59B1A86420 /* Preferences.swift in Sources */ = {isa = PBXBuildFile; fileRef = 145B888FBDD4F931512C5473 /* Preferences.swift */; };
 189362AAE2CDE5D4B3428334 /* ToolCallParser.swift in Sources */ = {isa = PBXBuildFile; fileRef = E73B165A1822729C907791AE /* ToolCallParser.swift */; };
+29879D696584B96CC56560DF /* ChatExporter.swift in Sources */ = {isa = PBXBuildFile; fileRef = D7C9BAD674E29688ACE53B0B /* ChatExporter.swift */; };
 2CAAF7129F7CC45200FA9F6B /* ModelPickerView.swift in Sources */ = {isa = PBXBuildFile; fileRef = C3C3A76C02AF70A9D8F868FC /* ModelPickerView.swift */; };
 2D08769282BD71C170DB0943 /* InferenceStats.swift in Sources */ = {isa = PBXBuildFile; fileRef = E35452B166893B25E765FF70 /* InferenceStats.swift */; };
+4158FA884D981D73288FB74C /* SaveChatCommands.swift in Sources */ = {isa = PBXBuildFile; fileRef = 2E2FCA55CEBEBCED78D9479A /* SaveChatCommands.swift */; };
 4CB13DC1AC7A500DDBB443EC /* ChatInputView.swift in Sources */ = {isa = PBXBuildFile; fileRef = E5E6AD02CDF23BDAB64700A7 /* ChatInputView.swift */; };
+4DC033E45880B2948B47DEB1 /* FocusedValues.swift in Sources */ = {isa = PBXBuildFile; fileRef = EF518FEBF3A38E830E3CE1A5 /* FocusedValues.swift */; };
 50B6861FF8610B3ED4FFAD9D /* MLXServerApp.swift in Sources */ = {isa = PBXBuildFile; fileRef = C67742651DB486871CEF1612 /* MLXServerApp.swift */; };
 50DD129CCF2843482DEC3B96 /* APIServer.swift in Sources */ = {isa = PBXBuildFile; fileRef = 3D08828E16B17EF02C14243E /* APIServer.swift */; };
 5946258F1DE88CE904584E0B /* ContentView.swift in Sources */ = {isa = PBXBuildFile; fileRef = 944C699FBB76C734C9DF2F2E /* ContentView.swift */; };
@@ -38,6 +41,7 @@
 145B888FBDD4F931512C5473 /* Preferences.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = Preferences.swift; sourceTree = "<group>"; };
 16AE82A64D1D07AE3CD8D33A /* ToolPromptBuilder.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ToolPromptBuilder.swift; sourceTree = "<group>"; };
 2DC8C86D397B1FCA08E07CBD /* DownloadModalView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = DownloadModalView.swift; sourceTree = "<group>"; };
+2E2FCA55CEBEBCED78D9479A /* SaveChatCommands.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SaveChatCommands.swift; sourceTree = "<group>"; };
 38DFC212AF4359A45FBE22BA /* ModelConfig.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ModelConfig.swift; sourceTree = "<group>"; };
 3AF462805202797F61422AEE /* MLXServer.entitlements */ = {isa = PBXFileReference; lastKnownFileType = text.plist.entitlements; path = MLXServer.entitlements; sourceTree = "<group>"; };
 3D08828E16B17EF02C14243E /* APIServer.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = APIServer.swift; sourceTree = "<group>"; };
@@ -53,10 +57,12 @@
 C3C3A76C02AF70A9D8F868FC /* ModelPickerView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ModelPickerView.swift; sourceTree = "<group>"; };
 C67742651DB486871CEF1612 /* MLXServerApp.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = MLXServerApp.swift; sourceTree = "<group>"; };
 D733A0D1D4AC25DDDA6C8684 /* LocalModelResolver.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = LocalModelResolver.swift; sourceTree = "<group>"; };
+D7C9BAD674E29688ACE53B0B /* ChatExporter.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ChatExporter.swift; sourceTree = "<group>"; };
 DB1A5E8B1C9F2BC4D262C53A /* ChatMessagesView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ChatMessagesView.swift; sourceTree = "<group>"; };
 E35452B166893B25E765FF70 /* InferenceStats.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = InferenceStats.swift; sourceTree = "<group>"; };
 E5E6AD02CDF23BDAB64700A7 /* ChatInputView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ChatInputView.swift; sourceTree = "<group>"; };
 E73B165A1822729C907791AE /* ToolCallParser.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ToolCallParser.swift; sourceTree = "<group>"; };
+EF518FEBF3A38E830E3CE1A5 /* FocusedValues.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = FocusedValues.swift; sourceTree = "<group>"; };
 F1A52E2C9964ADA9D841A89B /* APIModels.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = APIModels.swift; sourceTree = "<group>"; };
 /* End PBXFileReference section */
 
@@ -78,6 +84,8 @@
 05B1BAE308E64D2FB2E73823 /* Utilities */ = {
 isa = PBXGroup;
 children = (
+D7C9BAD674E29688ACE53B0B /* ChatExporter.swift */,
+EF518FEBF3A38E830E3CE1A5 /* FocusedValues.swift */,
 D733A0D1D4AC25DDDA6C8684 /* LocalModelResolver.swift */,
 145B888FBDD4F931512C5473 /* Preferences.swift */,
 );
@@ -99,6 +107,7 @@
 944C699FBB76C734C9DF2F2E /* ContentView.swift */,
 3AF462805202797F61422AEE /* MLXServer.entitlements */,
 C67742651DB486871CEF1612 /* MLXServerApp.swift */,
+B459409ED6FD8797FDD81E94 /* Commands */,
 BD0E350482D91238B4B59721 /* Models */,
 E13C1AAA0C49D0ED85EFD94D /* Server */,
 05B1BAE308E64D2FB2E73823 /* Utilities */,
@@ -122,6 +131,14 @@
 path = Views;
 sourceTree = "<group>";
 };
+B459409ED6FD8797FDD81E94 /* Commands */ = {
+isa = PBXGroup;
+children = (
+2E2FCA55CEBEBCED78D9479A /* SaveChatCommands.swift */,
+);
+path = Commands;
+sourceTree = "<group>";
+};
 BD0E350482D91238B4B59721 /* Models */ = {
 isa = PBXGroup;
 children = (
@@ -238,12 +255,14 @@
 files = (
 D96DDE66F76FDDA642629E17 /* APIModels.swift in Sources */,
 50DD129CCF2843482DEC3B96 /* APIServer.swift in Sources */,
+29879D696584B96CC56560DF /* ChatExporter.swift in Sources */,
 4CB13DC1AC7A500DDBB443EC /* ChatInputView.swift in Sources */,
 FAF7D4714AC6D02674920208 /* ChatMessage.swift in Sources */,
 5C1E8FE1C521914CEF98D3AA /* ChatMessagesView.swift in Sources */,
 B5AA6E3B4BE21676226B342B /* ChatViewModel.swift in Sources */,
 5946258F1DE88CE904584E0B /* ContentView.swift in Sources */,
 C07A377244DCD67F4FE709FE /* DownloadModalView.swift in Sources */,
+4DC033E45880B2948B47DEB1 /* FocusedValues.swift in Sources */,
 2D08769282BD71C170DB0943 /* InferenceStats.swift in Sources */,
 6828CCA8B78AB40906F87CAB /* LocalModelResolver.swift in Sources */,
 50B6861FF8610B3ED4FFAD9D /* MLXServerApp.swift in Sources */,
@@ -252,6 +271,7 @@
 2CAAF7129F7CC45200FA9F6B /* ModelPickerView.swift in Sources */,
 B1D9BC407DB7DB1489230C20 /* MonitorView.swift in Sources */,
 165E8AB6ADAE1D59B1A86420 /* Preferences.swift in Sources */,
+4158FA884D981D73288FB74C /* SaveChatCommands.swift in Sources */,
 D666A311788375E8A061C832 /* SettingsView.swift in Sources */,
 621B7E4382199AC1378F5F9C /* StatusBarView.swift in Sources */,
 189362AAE2CDE5D4B3428334 /* ToolCallParser.swift in Sources */,
@@ -399,7 +419,7 @@
 );
 MACOSX_DEPLOYMENT_TARGET = 15.0;
 MARKETING_VERSION = 1.0.0;
-PRODUCT_BUNDLE_IDENTIFIER = com.mlxserver.app;
+PRODUCT_BUNDLE_IDENTIFIER = de.rfc1437.mlxserver;
 PRODUCT_NAME = "MLX Server";
 SDKROOT = macosx;
 SWIFT_VERSION = 6.0;
@@ -424,7 +444,7 @@
 );
 MACOSX_DEPLOYMENT_TARGET = 15.0;
 MARKETING_VERSION = 1.0.0;
-PRODUCT_BUNDLE_IDENTIFIER = com.mlxserver.app;
+PRODUCT_BUNDLE_IDENTIFIER = de.rfc1437.mlxserver;
 PRODUCT_NAME = "MLX Server";
 SDKROOT = macosx;
 SWIFT_VERSION = 6.0;
16
MLXServer/Commands/SaveChatCommands.swift
Normal file
@@ -0,0 +1,16 @@
+import SwiftUI
+
+/// Adds "Export Chat…" to the File menu.
+struct SaveChatCommands: Commands {
+    @FocusedBinding(\.exportTrigger) var isExporting
+
+    var body: some Commands {
+        CommandGroup(after: .saveItem) {
+            Button("Export Chat…") {
+                isExporting = true
+            }
+            .keyboardShortcut("e", modifiers: [.command, .shift])
+            .disabled(isExporting == nil)
+        }
+    }
+}
@@ -1,10 +1,12 @@
 import SwiftUI
+import UniformTypeIdentifiers
 
 struct ContentView: View {
     @Environment(ModelManager.self) private var modelManager
     @State private var chatVM: ChatViewModel?
     @State private var showLoadError = false
     @State private var showMonitor = false
+    @State private var isExporting = false
 
     var body: some View {
         mainContent
@@ -52,6 +54,21 @@ struct ContentView: View {
             .background {
                 modelSwitchShortcuts
             }
+            // Expose export trigger to menu bar command
+            .focusedSceneValue(\.exportTrigger, $isExporting)
+            .fileExporter(
+                isPresented: $isExporting,
+                document: ChatExportDocument(
+                    messages: chatVM?.conversation.messages ?? [],
+                    modelName: modelManager.currentModel?.displayName
+                ),
+                contentTypes: ChatExportDocument.writableContentTypes,
+                defaultFilename: "chat"
+            ) { result in
+                if case .failure(let error) = result {
+                    print("[Export] Failed: \(error.localizedDescription)")
+                }
+            }
     }
 
     @ViewBuilder
@@ -23,6 +23,9 @@ struct MLXServerApp: App {
         }
         .windowStyle(.titleBar)
         .defaultSize(width: 800, height: 700)
+        .commands {
+            SaveChatCommands()
+        }
 
         #if os(macOS)
         Settings {
290
MLXServer/Utilities/ChatExporter.swift
Normal file
@@ -0,0 +1,290 @@
+import AppKit
+import Foundation
+import SwiftUI
+import UniformTypeIdentifiers
+
+/// A FileDocument that exports a chat conversation as Markdown or RTF.
+struct ChatExportDocument: FileDocument {
+    static var readableContentTypes: [UTType] { [.plainText] }
+    static var writableContentTypes: [UTType] {
+        [UTType(filenameExtension: "md") ?? .plainText, .rtf]
+    }
+
+    let messages: [ChatMessage]
+    let modelName: String?
+
+    init(messages: [ChatMessage], modelName: String?) {
+        self.messages = messages
+        self.modelName = modelName
+    }
+
+    init(configuration: ReadConfiguration) throws {
+        self.messages = []
+        self.modelName = nil
+    }
+
+    func fileWrapper(configuration: WriteConfiguration) throws -> FileWrapper {
+        let contentType = configuration.contentType
+
+        if contentType == .rtf, let data = ChatExporter.exportRTF(messages: messages, modelName: modelName) {
+            return FileWrapper(regularFileWithContents: data)
+        } else {
+            let md = ChatExporter.exportMarkdown(messages: messages, modelName: modelName)
+            return FileWrapper(regularFileWithContents: Data(md.utf8))
+        }
+    }
+}
+
+/// Exports a chat conversation to Markdown or RTF (Pages-compatible) format.
+enum ChatExporter {
+
+    // MARK: - Markdown export
+
+    static func exportMarkdown(messages: [ChatMessage], modelName: String?) -> String {
+        var lines: [String] = []
+
+        // Header
+        lines.append("# Chat Session")
+        if let modelName {
+            lines.append("**Model:** \(modelName)")
+        }
+        let formatter = DateFormatter()
+        formatter.dateStyle = .long
+        formatter.timeStyle = .short
+        if let first = messages.first {
+            lines.append("**Date:** \(formatter.string(from: first.timestamp))")
+        }
+        lines.append("")
+        lines.append("---")
+        lines.append("")
+
+        for message in messages {
+            guard message.role != .system else { continue }
+
+            if message.role == .user {
+                // User messages as blockquotes
+                lines.append("**You:**")
+                lines.append("")
+                for line in message.content.components(separatedBy: "\n") {
+                    lines.append("> \(line)")
+                }
+            } else {
+                // Assistant messages: carry over original markdown
+                lines.append("**Assistant:**")
+                lines.append("")
+                lines.append(message.content)
+            }
+
+            lines.append("")
+            lines.append("---")
+            lines.append("")
+        }
+
+        return lines.joined(separator: "\n")
+    }
+
+    // MARK: - RTF export
+
+    static func exportRTF(messages: [ChatMessage], modelName: String?) -> Data? {
+        let doc = NSMutableAttributedString()
+
+        let bodyFont = NSFont.systemFont(ofSize: 13)
+        let bodyBoldFont = NSFont.boldSystemFont(ofSize: 13)
+        let titleFont = NSFont.boldSystemFont(ofSize: 20)
+        let metaFont = NSFont.systemFont(ofSize: 11)
+        let codeFont = NSFont.monospacedSystemFont(ofSize: 12, weight: .regular)
+
+        let bodyParagraph = NSMutableParagraphStyle()
+        bodyParagraph.paragraphSpacing = 8
+        bodyParagraph.lineSpacing = 2
+
+        let userParagraph = NSMutableParagraphStyle()
+        userParagraph.paragraphSpacing = 8
+        userParagraph.lineSpacing = 2
+        userParagraph.headIndent = 20
+        userParagraph.firstLineHeadIndent = 20
+
+        // Title
+        doc.append(NSAttributedString(
+            string: "Chat Session\n",
+            attributes: [.font: titleFont, .paragraphStyle: bodyParagraph]
+        ))
+
+        // Metadata
+        let formatter = DateFormatter()
+        formatter.dateStyle = .long
+        formatter.timeStyle = .short
+        var metaText = ""
+        if let modelName { metaText += "Model: \(modelName) " }
+        if let first = messages.first {
+            metaText += "Date: \(formatter.string(from: first.timestamp))"
+        }
+        if !metaText.isEmpty {
+            doc.append(NSAttributedString(
+                string: metaText + "\n\n",
+                attributes: [.font: metaFont, .foregroundColor: NSColor.secondaryLabelColor]
+            ))
+        }
+
+        for message in messages {
+            guard message.role != .system else { continue }
+
+            if message.role == .user {
+                doc.append(NSAttributedString(
+                    string: "You\n",
+                    attributes: [
+                        .font: bodyBoldFont,
+                        .foregroundColor: NSColor.systemBlue,
+                    ]
+                ))
+                doc.append(NSAttributedString(
+                    string: message.content + "\n\n",
+                    attributes: [
+                        .font: bodyFont,
+                        .paragraphStyle: userParagraph,
+                        .foregroundColor: NSColor.labelColor,
+                    ]
+                ))
+            } else {
+                doc.append(NSAttributedString(
+                    string: "Assistant\n",
+                    attributes: [
+                        .font: bodyBoldFont,
+                        .foregroundColor: NSColor.labelColor,
+                    ]
+                ))
+                let rendered = renderMarkdown(message.content, bodyFont: bodyFont, codeFont: codeFont, paragraph: bodyParagraph)
+                doc.append(rendered)
+                doc.append(NSAttributedString(string: "\n\n"))
+            }
+
+            doc.append(NSAttributedString(
+                string: "\n",
+                attributes: [
+                    .strikethroughStyle: NSUnderlineStyle.single.rawValue,
+                    .strikethroughColor: NSColor.separatorColor,
+                    .font: NSFont.systemFont(ofSize: 4),
+                ]
+            ))
+        }
+
+        return doc.rtf(from: NSRange(location: 0, length: doc.length), documentAttributes: [
+            .documentType: NSAttributedString.DocumentType.rtf,
+        ])
+    }
+
+    // MARK: - Markdown → NSAttributedString (basic)
+
+    private static func renderMarkdown(
+        _ text: String,
+        bodyFont: NSFont,
+        codeFont: NSFont,
+        paragraph: NSParagraphStyle
+    ) -> NSAttributedString {
+        let result = NSMutableAttributedString()
+        let lines = text.components(separatedBy: "\n")
+        var inCodeBlock = false
+        var codeBlockLines: [String] = []
+
+        for line in lines {
+            if line.hasPrefix("```") {
+                if inCodeBlock {
+                    let code = codeBlockLines.joined(separator: "\n")
+                    let codePara = NSMutableParagraphStyle()
+                    codePara.paragraphSpacing = 4
+                    codePara.headIndent = 12
+                    codePara.firstLineHeadIndent = 12
+                    result.append(NSAttributedString(
+                        string: code + "\n",
+                        attributes: [
+                            .font: codeFont,
+                            .foregroundColor: NSColor.secondaryLabelColor,
+                            .backgroundColor: NSColor.quaternaryLabelColor,
+                            .paragraphStyle: codePara,
+                        ]
+                    ))
+                    codeBlockLines = []
+                    inCodeBlock = false
+                } else {
+                    inCodeBlock = true
+                }
+                continue
+            }
+
+            if inCodeBlock {
+                codeBlockLines.append(line)
+                continue
+            }
+
+            if line.hasPrefix("### ") {
+                result.append(NSAttributedString(
+                    string: String(line.dropFirst(4)) + "\n",
+                    attributes: [.font: NSFont.boldSystemFont(ofSize: 14), .paragraphStyle: paragraph]
+                ))
+            } else if line.hasPrefix("## ") {
+                result.append(NSAttributedString(
+                    string: String(line.dropFirst(3)) + "\n",
+                    attributes: [.font: NSFont.boldSystemFont(ofSize: 15), .paragraphStyle: paragraph]
+                ))
+            } else if line.hasPrefix("# ") {
+                result.append(NSAttributedString(
+                    string: String(line.dropFirst(2)) + "\n",
+                    attributes: [.font: NSFont.boldSystemFont(ofSize: 17), .paragraphStyle: paragraph]
+                ))
+            } else {
+                let styled = applyInlineFormatting(line, bodyFont: bodyFont, codeFont: codeFont)
+                result.append(styled)
+                result.append(NSAttributedString(string: "\n", attributes: [.font: bodyFont]))
+            }
+        }
+
+        return result
+    }
|
|
||||||
|
private static func applyInlineFormatting(
|
||||||
|
_ text: String,
|
||||||
|
bodyFont: NSFont,
|
||||||
|
codeFont: NSFont
|
||||||
|
) -> NSAttributedString {
|
||||||
|
let result = NSMutableAttributedString()
|
||||||
|
var remaining = text[text.startIndex...]
|
||||||
|
|
||||||
|
while !remaining.isEmpty {
|
||||||
|
if remaining.hasPrefix("`"), let end = remaining.dropFirst().firstIndex(of: "`") {
|
||||||
|
let code = String(remaining[remaining.index(after: remaining.startIndex)..<end])
|
||||||
|
result.append(NSAttributedString(
|
||||||
|
string: code,
|
||||||
|
attributes: [
|
||||||
|
.font: codeFont,
|
||||||
|
.foregroundColor: NSColor.secondaryLabelColor,
|
||||||
|
.backgroundColor: NSColor.quaternaryLabelColor,
|
||||||
|
]
|
||||||
|
))
|
||||||
|
remaining = remaining[remaining.index(after: end)...]
|
||||||
|
} else if remaining.hasPrefix("**"), let end = remaining.dropFirst(2).range(of: "**") {
|
||||||
|
let bold = String(remaining[remaining.index(remaining.startIndex, offsetBy: 2)..<end.lowerBound])
|
||||||
|
result.append(NSAttributedString(
|
||||||
|
string: bold,
|
||||||
|
attributes: [.font: NSFont.boldSystemFont(ofSize: bodyFont.pointSize)]
|
||||||
|
))
|
||||||
|
remaining = remaining[end.upperBound...]
|
||||||
|
} else if remaining.hasPrefix("*"), let end = remaining.dropFirst().firstIndex(of: "*") {
|
||||||
|
let italic = String(remaining[remaining.index(after: remaining.startIndex)..<end])
|
||||||
|
result.append(NSAttributedString(
|
||||||
|
string: italic,
|
||||||
|
attributes: [.font: NSFontManager.shared.convert(bodyFont, toHaveTrait: .italicFontMask)]
|
||||||
|
))
|
||||||
|
remaining = remaining[remaining.index(after: end)...]
|
||||||
|
} else {
|
||||||
|
let ch = remaining[remaining.startIndex]
|
||||||
|
result.append(NSAttributedString(
|
||||||
|
string: String(ch),
|
||||||
|
attributes: [.font: bodyFont]
|
||||||
|
))
|
||||||
|
remaining = remaining[remaining.index(after: remaining.startIndex)...]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return result
|
||||||
|
}
|
||||||
|
}
|
||||||
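The inline pass in `applyInlineFormatting` above can be read as a small left-to-right tokenizer. The sketch below is not part of this commit. It reproduces the same branch order (backtick code, then `**bold**`, then `*italic*`, else one plain character) on plain Swift values so the scanning logic can be exercised without AppKit. `InlineSpan` and `tokenizeInline` are hypothetical names, and unlike the diff it batches consecutive plain characters into a single span instead of emitting one attributed string per character.

```swift
import Foundation

/// Hypothetical span type, used only for this illustration.
enum InlineSpan: Equatable {
    case plain(String)
    case code(String)
    case bold(String)
    case italic(String)
}

/// Scans `text` left to right using the same branch order as the diff.
/// An opening marker without a matching close falls through to the plain branch.
func tokenizeInline(_ text: String) -> [InlineSpan] {
    var spans: [InlineSpan] = []
    var plain = ""
    var rest = Substring(text)

    // Emit any buffered plain text before a styled span or at the end.
    func flushPlain() {
        if !plain.isEmpty {
            spans.append(.plain(plain))
            plain = ""
        }
    }

    while !rest.isEmpty {
        if rest.hasPrefix("`"), let end = rest.dropFirst().firstIndex(of: "`") {
            flushPlain()
            spans.append(.code(String(rest[rest.index(after: rest.startIndex)..<end])))
            rest = rest[rest.index(after: end)...]
        } else if rest.hasPrefix("**"), let end = rest.dropFirst(2).range(of: "**") {
            flushPlain()
            let start = rest.index(rest.startIndex, offsetBy: 2)
            spans.append(.bold(String(rest[start..<end.lowerBound])))
            rest = rest[end.upperBound...]
        } else if rest.hasPrefix("*"), let end = rest.dropFirst().firstIndex(of: "*") {
            flushPlain()
            spans.append(.italic(String(rest[rest.index(after: rest.startIndex)..<end])))
            rest = rest[rest.index(after: end)...]
        } else {
            // No marker (or no closing marker): keep the character as plain text.
            plain.append(rest.removeFirst())
        }
    }
    flushPlain()
    return spans
}
```

Batching plain characters keeps the span list short; mapping each span to an `NSAttributedString` with the matching font would then be a separate, AppKit-only step.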
13
MLXServer/Utilities/FocusedValues.swift
Normal file
@@ -0,0 +1,13 @@
|
|||||||
|
import SwiftUI
|
||||||
|
|
||||||
|
/// Focused value key for triggering chat export from the menu bar.
|
||||||
|
struct FocusedExportTriggerKey: FocusedValueKey {
|
||||||
|
typealias Value = Binding<Bool>
|
||||||
|
}
|
||||||
|
|
||||||
|
extension FocusedValues {
|
||||||
|
var exportTrigger: Binding<Bool>? {
|
||||||
|
get { self[FocusedExportTriggerKey.self] }
|
||||||
|
set { self[FocusedExportTriggerKey.self] = newValue }
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -1,75 +1,28 @@
|
|||||||
import Foundation
|
import Foundation
|
||||||
|
|
||||||
/// Resolves HuggingFace model repos to local snapshot directories,
|
/// Resolves HuggingFace model repos to local directories.
|
||||||
/// matching the cache layout used by Python's `huggingface_hub`.
|
|
||||||
///
|
///
|
||||||
/// Checks two locations:
|
/// HubApi(downloadBase: .cachesDirectory, cache: nil) downloads models to:
|
||||||
/// 1. App sandbox container: ~/Library/Containers/com.mlxserver.app/.../huggingface/hub/
|
/// ~/Library/Containers/de.rfc1437.mlxserver/Data/Library/Caches/models/{org}/{name}/
|
||||||
/// 2. System-wide cache: ~/.cache/huggingface/hub/ (shared with Python tools)
|
|
||||||
///
|
|
||||||
/// Cache structure:
|
|
||||||
/// .../huggingface/hub/models--{org}--{name}/snapshots/{hash}/
|
|
||||||
enum LocalModelResolver {
|
enum LocalModelResolver {
|
||||||
|
|
||||||
/// All HuggingFace cache directories to search, in priority order.
|
/// Base directory where HubApi stores downloaded models.
|
||||||
/// The sandboxed container path is checked first (where the app downloads to),
|
private static let modelsBase: URL? = {
|
||||||
/// then the system-wide Python cache (for models downloaded via huggingface-cli).
|
FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask).first?
|
||||||
private static let cacheBases: [URL] = {
|
.appendingPathComponent("models", isDirectory: true)
|
||||||
var bases: [URL] = []
|
|
||||||
|
|
||||||
// 1. Sandboxed app container cache (where swift-transformers Hub downloads to)
|
|
||||||
let containerCache = FileManager.default.homeDirectoryForCurrentUser
|
|
||||||
.appendingPathComponent("Library/Caches/huggingface/hub", isDirectory: true)
|
|
||||||
bases.append(containerCache)
|
|
||||||
|
|
||||||
// 2. System-wide ~/.cache/huggingface/hub/ (Python huggingface_hub)
|
|
||||||
// When sandboxed, homeDirectory points to the container, so construct the real path.
|
|
||||||
let realHome = URL(fileURLWithPath: NSHomeDirectory())
|
|
||||||
let systemCache = realHome
|
|
||||||
.appendingPathComponent(".cache/huggingface/hub", isDirectory: true)
|
|
||||||
// Avoid duplicate if they resolve to the same path
|
|
||||||
if systemCache.path != containerCache.path {
|
|
||||||
bases.append(systemCache)
|
|
||||||
}
|
|
||||||
|
|
||||||
// 3. Also try the unsandboxed home directory path
|
|
||||||
let globalHome = FileManager.default.homeDirectoryForCurrentUser
|
|
||||||
.appendingPathComponent(".cache/huggingface/hub", isDirectory: true)
|
|
||||||
if globalHome.path != containerCache.path && globalHome.path != systemCache.path {
|
|
||||||
bases.append(globalHome)
|
|
||||||
}
|
|
||||||
|
|
||||||
return bases
|
|
||||||
}()
|
}()
|
||||||
|
|
||||||
/// Resolve a HuggingFace repo ID (e.g. "mlx-community/gemma-3-4b-it-4bit")
|
/// Resolve a HuggingFace repo ID (e.g. "mlx-community/gemma-3-4b-it-4bit")
|
||||||
/// to its local snapshot directory, if it exists.
|
/// to its local directory, if it exists.
|
||||||
///
|
///
|
||||||
/// Returns `nil` if the model hasn't been downloaded yet.
|
/// Returns `nil` if the model hasn't been downloaded yet.
|
||||||
static func resolve(repoId: String) -> URL? {
|
static func resolve(repoId: String) -> URL? {
|
||||||
let dirName = "models--" + repoId.replacingOccurrences(of: "/", with: "--")
|
guard let base = modelsBase else { return nil }
|
||||||
|
let modelDir = base.appendingPathComponent(repoId, isDirectory: true)
|
||||||
for cacheBase in cacheBases {
|
var isDir: ObjCBool = false
|
||||||
let snapshotsDir = cacheBase
|
if FileManager.default.fileExists(atPath: modelDir.path, isDirectory: &isDir), isDir.boolValue {
|
||||||
.appendingPathComponent(dirName, isDirectory: true)
|
return modelDir
|
||||||
.appendingPathComponent("snapshots", isDirectory: true)
|
|
||||||
|
|
||||||
guard let contents = try? FileManager.default.contentsOfDirectory(
|
|
||||||
at: snapshotsDir,
|
|
||||||
includingPropertiesForKeys: [.isDirectoryKey],
|
|
||||||
options: [.skipsHiddenFiles]
|
|
||||||
) else {
|
|
||||||
continue
|
|
||||||
}
|
|
||||||
|
|
||||||
if let snapshot = contents
|
|
||||||
.filter({ (try? $0.resourceValues(forKeys: [.isDirectoryKey]).isDirectory) == true })
|
|
||||||
.sorted(by: { $0.lastPathComponent < $1.lastPathComponent })
|
|
||||||
.last {
|
|
||||||
return snapshot
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -79,39 +32,18 @@ enum LocalModelResolver {
|
|||||||
}
|
}
|
||||||
|
|
||||||
/// Delete the local cache for a model so it will be re-downloaded next time.
|
/// Delete the local cache for a model so it will be re-downloaded next time.
|
||||||
/// Removes from all cache locations.
|
|
||||||
/// Returns true if something was deleted.
|
|
||||||
@discardableResult
|
@discardableResult
|
||||||
static func deleteLocal(repoId: String) -> Bool {
|
static func deleteLocal(repoId: String) -> Bool {
|
||||||
let dirName = "models--" + repoId.replacingOccurrences(of: "/", with: "--")
|
guard let base = modelsBase else { return false }
|
||||||
var deleted = false
|
let modelDir = base.appendingPathComponent(repoId, isDirectory: true)
|
||||||
|
guard FileManager.default.fileExists(atPath: modelDir.path) else { return false }
|
||||||
for cacheBase in cacheBases {
|
do {
|
||||||
let modelDir = cacheBase.appendingPathComponent(dirName, isDirectory: true)
|
try FileManager.default.removeItem(at: modelDir)
|
||||||
guard FileManager.default.fileExists(atPath: modelDir.path) else { continue }
|
print("[LocalModelResolver] Deleted \(modelDir.path)")
|
||||||
do {
|
return true
|
||||||
try FileManager.default.removeItem(at: modelDir)
|
} catch {
|
||||||
print("[LocalModelResolver] Deleted \(modelDir.path)")
|
print("[LocalModelResolver] Failed to delete \(modelDir.path): \(error)")
|
||||||
deleted = true
|
return false
|
||||||
} catch {
|
|
||||||
print("[LocalModelResolver] Failed to delete \(modelDir.path): \(error)")
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// Also clean up the per-model cache in the container (used by swift-transformers)
|
|
||||||
let containerModelsDir = FileManager.default.homeDirectoryForCurrentUser
|
|
||||||
.appendingPathComponent("Library/Caches/models", isDirectory: true)
|
|
||||||
.appendingPathComponent(repoId, isDirectory: true)
|
|
||||||
if FileManager.default.fileExists(atPath: containerModelsDir.path) {
|
|
||||||
do {
|
|
||||||
try FileManager.default.removeItem(at: containerModelsDir)
|
|
||||||
print("[LocalModelResolver] Deleted \(containerModelsDir.path)")
|
|
||||||
deleted = true
|
|
||||||
} catch {
|
|
||||||
print("[LocalModelResolver] Failed to delete \(containerModelsDir.path): \(error)")
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return deleted
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
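The resolver change above swaps one on-disk layout for another. As a rough illustration only, the two mappings from a repo ID to a directory can be written as pure functions. The function names and the `/tmp/Caches` base are assumptions made for this sketch; the real base comes from `FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask)` as shown in the diff.

```swift
import Foundation

/// Old layout (removed side of the diff):
/// .../huggingface/hub/models--{org}--{name}/snapshots/{hash}/
/// Returns only the `models--{org}--{name}` directory name.
func hubCacheDirectoryName(for repoId: String) -> String {
    "models--" + repoId.replacingOccurrences(of: "/", with: "--")
}

/// New layout (added side of the diff):
/// {cachesDirectory}/models/{org}/{name}/
func modelDirectory(for repoId: String, cachesDirectory: URL) -> URL {
    cachesDirectory
        .appendingPathComponent("models", isDirectory: true)
        .appendingPathComponent(repoId, isDirectory: true)
}

// Example with a placeholder caches directory (an assumption for illustration).
let exampleCaches = URL(fileURLWithPath: "/tmp/Caches", isDirectory: true)
let exampleDir = modelDirectory(
    for: "mlx-community/gemma-3-4b-it-4bit",
    cachesDirectory: exampleCaches
)
print(exampleDir.path)
```

Keeping the mapping free of file-system access makes it easy to check in isolation; the existence check with `fileExists(atPath:isDirectory:)` stays in the caller, as in the diff.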
@@ -12,7 +12,12 @@ final class ModelManager {
|
|||||||
/// HubApi with blob cache disabled to avoid storing every model twice.
|
/// HubApi with blob cache disabled to avoid storing every model twice.
|
||||||
/// swift-huggingface defaults to caching in both huggingface/hub/ (snapshots)
|
/// swift-huggingface defaults to caching in both huggingface/hub/ (snapshots)
|
||||||
/// AND models/ (content-addressed blobs). We only need the snapshots.
|
/// AND models/ (content-addressed blobs). We only need the snapshots.
|
||||||
private static let hub = HubApi(cache: nil)
|
/// Must use the same downloadBase as defaultHubApi (.cachesDirectory) so
|
||||||
|
/// LocalModelResolver can find downloaded models.
|
||||||
|
private static let hub: HubApi = {
|
||||||
|
let cachesDir = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask).first
|
||||||
|
return HubApi(downloadBase: cachesDir, cache: nil)
|
||||||
|
}()
|
||||||
var currentModel: ModelConfig?
|
var currentModel: ModelConfig?
|
||||||
var modelContainer: ModelContainer?
|
var modelContainer: ModelContainer?
|
||||||
var isLoading = false
|
var isLoading = false
|
||||||
@@ -52,7 +57,6 @@ final class ModelManager {
|
|||||||
}
|
}
|
||||||
|
|
||||||
do {
|
do {
|
||||||
let container: ModelContainer
|
|
||||||
let progressHandler: @Sendable (Progress) -> Void = { progress in
|
let progressHandler: @Sendable (Progress) -> Void = { progress in
|
||||||
Task { @MainActor in
|
Task { @MainActor in
|
||||||
self.downloadProgress = progress.fractionCompleted
|
self.downloadProgress = progress.fractionCompleted
|
||||||
@@ -73,7 +77,7 @@ final class ModelManager {
|
|||||||
configuration = config.modelConfiguration
|
configuration = config.modelConfiguration
|
||||||
}
|
}
|
||||||
|
|
||||||
container = try await VLMModelFactory.shared.loadContainer(
|
let container = try await VLMModelFactory.shared.loadContainer(
|
||||||
hub: Self.hub,
|
hub: Self.hub,
|
||||||
configuration: configuration,
|
configuration: configuration,
|
||||||
progressHandler: progressHandler
|
progressHandler: progressHandler
|
||||||
|
|||||||
51
README.md
@@ -1,6 +1,6 @@
|
|||||||
# MLX Server
|
# MLX Server
|
||||||
|
|
||||||
Native macOS app for running local LLMs on Apple Silicon via [MLX](https://github.com/ml-explore/mlx). Built with SwiftUI, it provides both a **chat UI** and an embedded **OpenAI-compatible API server**. Supports vision and tool use with automatic model swapping.
|
Native macOS app for running local LLMs on Apple Silicon via [MLX](https://github.com/ml-explore/mlx). Built with SwiftUI, it provides both a **chat UI** and an embedded **OpenAI-compatible API server**. Supports vision, tool use, and thinking mode.
|
||||||
|
|
||||||
## Supported Models
|
## Supported Models
|
||||||
|
|
||||||
@@ -8,6 +8,9 @@ Native macOS app for running local LLMs on Apple Silicon via [MLX](https://githu
|
|||||||
|-------|-------|---------|-------------|
|
|-------|-------|---------|-------------|
|
||||||
| `gemma` | `mlx-community/gemma-3-4b-it-4bit` | 128k | Vision, tool use (`tool_code` blocks) |
|
| `gemma` | `mlx-community/gemma-3-4b-it-4bit` | 128k | Vision, tool use (`tool_code` blocks) |
|
||||||
| `qwen` | `mlx-community/Qwen3-VL-4B-Instruct-4bit` | 256k | Vision, tool use (`<tool_call>` tags) |
|
| `qwen` | `mlx-community/Qwen3-VL-4B-Instruct-4bit` | 256k | Vision, tool use (`<tool_call>` tags) |
|
||||||
|
| `qwen3.5-9b` | `mlx-community/Qwen3.5-9B-4bit` | 256k | Thinking mode, tool use |
|
||||||
|
|
||||||
|
Any model in MLX format on HuggingFace can be added — there is no restriction on uploader or architecture.
|
||||||
|
|
||||||
## Quick Start
|
## Quick Start
|
||||||
|
|
||||||
@@ -20,12 +23,16 @@ open "build/Debug/MLX Server.app"
|
|||||||
|
|
||||||
## App Features
|
## App Features
|
||||||
|
|
||||||
- **Chat interface** with markdown rendering, image attachments (file picker, drag & drop, clipboard paste)
|
- **Chat interface** with markdown rendering, image attachments (file picker, drag & drop, clipboard paste, Finder copy-paste)
|
||||||
- **Model picker** in toolbar with local/download status indicators
|
- **Model picker** in toolbar with local/download status indicators and re-download button
|
||||||
|
- **Download progress modal** — shows file progress, percentage, and speed when downloading a new model
|
||||||
|
- **Thinking mode** — models like Qwen3.5 can reason internally before responding; thinking content appears in a collapsible box. Toggle on/off in Settings.
|
||||||
- **Streaming responses** with live token display
|
- **Streaming responses** with live token display
|
||||||
|
- **Export chat** — File > Export Chat (Cmd+Shift+S) saves conversations as Markdown or RTF (Pages-compatible)
|
||||||
- **Status bar** showing model name, context window, tokens/sec, token counts, GPU memory, API server status
|
- **Status bar** showing model name, context window, tokens/sec, token counts, GPU memory, API server status
|
||||||
- **Keyboard shortcuts**: `Cmd+N` (new chat), `Cmd+Return` (send), `Escape` (stop), `Cmd+1/2/3` (switch models)
|
- **Keyboard shortcuts**: `Cmd+N` (new chat), `Cmd+Return` (send), `Escape` (stop), `Cmd+1/2/3/4` (switch models), `Cmd+Shift+S` (export)
|
||||||
- **Settings** (`Cmd+,`): system prompt, API port, API auto-start
|
- **Settings** (`Cmd+,`): default model, thinking mode toggle, system prompt, API port, API auto-start, idle unload timeout
|
||||||
|
- **Idle auto-unload** — model is unloaded after configurable idle time (resets on both user input and model output), reloaded on next request
|
||||||
|
|
||||||
## API Server
|
## API Server
|
||||||
|
|
||||||
@@ -74,23 +81,29 @@ MLXServer/
|
|||||||
├── ContentView.swift — Main layout, toolbar, keyboard shortcuts
|
├── ContentView.swift — Main layout, toolbar, keyboard shortcuts
|
||||||
├── Models/
|
├── Models/
|
||||||
│ ├── ModelConfig.swift — Model definitions, alias/repoId resolution
|
│ ├── ModelConfig.swift — Model definitions, alias/repoId resolution
|
||||||
│ └── ChatMessage.swift — Chat message data model
|
│ └── ChatMessage.swift — Chat message data model, thinking tag parser
|
||||||
├── ViewModels/
|
├── ViewModels/
|
||||||
│ ├── ModelManager.swift — Model loading/switching via VLMModelFactory
|
│ ├── ModelManager.swift — Model loading/switching, download tracking, idle unload
|
||||||
│ └── ChatViewModel.swift — Chat state, ChatSession, API server lifecycle
|
│ └── ChatViewModel.swift — Chat state, ChatSession, API server lifecycle
|
||||||
├── Views/
|
├── Views/
|
||||||
│ ├── ModelPickerView.swift — Toolbar model selector
|
│ ├── ModelPickerView.swift — Toolbar model selector with re-download
|
||||||
│ ├── ChatMessagesView.swift — Scrollable message list with markdown
|
│ ├── ChatMessagesView.swift — Scrollable message list with markdown + thinking blocks
|
||||||
│ ├── ChatInputView.swift — Text input + image attach
|
│ ├── ChatInputView.swift — Text input + image attach (paste, drag, picker)
|
||||||
|
│ ├── DownloadModalView.swift — Model download progress overlay
|
||||||
│ ├── StatusBarView.swift — Model info, tok/s, GPU memory, API status
|
│ ├── StatusBarView.swift — Model info, tok/s, GPU memory, API status
|
||||||
│ └── SettingsView.swift — System prompt + API settings
|
│ ├── MonitorView.swift — Inference statistics monitor
|
||||||
|
│ └── SettingsView.swift — System prompt, thinking mode, API, idle settings
|
||||||
|
├── Commands/
|
||||||
|
│ └── SaveChatCommands.swift — File menu export command
|
||||||
├── Server/
|
├── Server/
|
||||||
│ ├── APIServer.swift — NWListener HTTP server, SSE streaming, KV cache reuse
|
│ ├── APIServer.swift — NWListener HTTP server, SSE streaming, KV cache reuse
|
||||||
│ ├── APIModels.swift — OpenAI-compatible Codable structs
|
│ ├── APIModels.swift — OpenAI-compatible Codable structs
|
||||||
│ ├── ToolCallParser.swift — Parses tool calls from model output
|
│ ├── ToolCallParser.swift — Parses tool calls from model output
|
||||||
│ └── ToolPromptBuilder.swift — Model-specific tool prompt formatting
|
│ └── ToolPromptBuilder.swift — Model-specific tool prompt formatting
|
||||||
└── Utilities/
|
└── Utilities/
|
||||||
├── LocalModelResolver.swift — Offline-first HuggingFace cache resolution
|
├── LocalModelResolver.swift — Offline-first HuggingFace cache resolution (sandbox + system)
|
||||||
|
├── ChatExporter.swift — Export conversations to Markdown or RTF
|
||||||
|
├── FocusedValues.swift — FocusedValue keys for menu bar integration
|
||||||
└── Preferences.swift — UserDefaults wrapper
|
└── Preferences.swift — UserDefaults wrapper
|
||||||
|
|
||||||
project.yml — xcodegen project spec (dependencies, settings, deployment target)
|
project.yml — xcodegen project spec (dependencies, settings, deployment target)
|
||||||
@@ -99,17 +112,11 @@ build.sh — One-command build script (xcodegen + xcodebuild)
|
|||||||
|
|
||||||
## Key Design Decisions
|
## Key Design Decisions
|
||||||
|
|
||||||
- Uses `mlx-swift-lm` (`MLXVLM` / `VLMModelFactory`) for inference — supports both text and vision in a single model load
|
- Uses `mlx-swift-lm` (`MLXVLM` / `VLMModelFactory`) for inference — loads any MLX-format model from HuggingFace
|
||||||
- **Offline-first**: `LocalModelResolver` checks `~/.cache/huggingface/hub/` for locally-cached snapshots before downloading
|
- **Offline-first**: `LocalModelResolver` checks both the sandboxed app container and `~/.cache/huggingface/hub/` for locally-cached models before downloading
|
||||||
|
- **No duplicate storage**: custom `HubApi` with blob cache disabled — models are stored once in the snapshot cache
|
||||||
- **KV cache reuse** across API requests — reuses `ChatSession` when conversation history prefix matches
|
- **KV cache reuse** across API requests — reuses `ChatSession` when conversation history prefix matches
|
||||||
|
- **Thinking mode**: `enable_thinking` passed via Jinja template context; `<think>` tags parsed in real-time during streaming
|
||||||
- HTTP server built on `Network.framework` (`NWListener`) — no third-party server dependencies
|
- HTTP server built on `Network.framework` (`NWListener`) — no third-party server dependencies
|
||||||
- Model-specific prompt formatting: Gemma uses `tool_code` blocks, Qwen uses `<tool_call>` XML tags
|
- Model-specific prompt formatting: Gemma uses `tool_code` blocks, Qwen uses `<tool_call>` XML tags
|
||||||
- GPU cache limit set to 20 MB; cache cleared on model unload
|
- GPU cache limit set to 20 MB; cache cleared on model unload
|
||||||
|
|
||||||
## Design Notes
|
|
||||||
|
|
||||||
- Uses `mlx_vlm` (not `mlx_lm`) as the backend — supports both text and vision in a single model load
|
|
||||||
- Offline-first: if the model is cached locally (`~/.cache/huggingface/hub/`), no network requests are made
|
|
||||||
- Thread lock on generation — MLX models aren't safe for concurrent generation
|
|
||||||
- KV prefix caching for multi-turn conversations
|
|
||||||
- Context window read from each model's config (Gemma 3 4B: 128k, Qwen3-VL 4B: 256k) with automatic summarization fallback
|
|
||||||
|
|||||||
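The README hunk above mentions that `<think>` tags are parsed during streaming. The parser itself is not shown in this part of the diff, so the following is only a minimal sketch of the idea on a complete (or partially received) string, not the app's implementation. `splitThinking` is a hypothetical name, and treating a missing `</think>` as "still thinking" is an assumption.

```swift
import Foundation

/// Splits model output into optional reasoning text and the visible answer.
/// If the closing tag has not arrived yet, everything after `<think>` is
/// treated as reasoning and the answer contains only the text before the tag.
func splitThinking(_ output: String) -> (thinking: String?, answer: String) {
    guard let open = output.range(of: "<think>") else {
        return (nil, output)
    }
    let before = String(output[..<open.lowerBound])
    let afterOpen = String(output[open.upperBound...])

    if let close = afterOpen.range(of: "</think>") {
        let thinking = String(afterOpen[..<close.lowerBound])
        let answer = before + String(afterOpen[close.upperBound...])
        return (
            thinking.trimmingCharacters(in: .whitespacesAndNewlines),
            answer.trimmingCharacters(in: .whitespacesAndNewlines)
        )
    }

    // No closing tag yet: the model is presumably still reasoning.
    return (
        afterOpen.trimmingCharacters(in: .whitespacesAndNewlines),
        before.trimmingCharacters(in: .whitespacesAndNewlines)
    )
}
```

A true streaming parser would additionally have to buffer a tag that is split across two chunks; this sketch sidesteps that by re-splitting the accumulated text on each update.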
@@ -22,7 +22,7 @@ targets:
|
|||||||
- MLXServer
|
- MLXServer
|
||||||
settings:
|
settings:
|
||||||
base:
|
base:
|
||||||
PRODUCT_BUNDLE_IDENTIFIER: com.mlxserver.app
|
PRODUCT_BUNDLE_IDENTIFIER: de.rfc1437.mlxserver
|
||||||
PRODUCT_NAME: MLX Server
|
PRODUCT_NAME: MLX Server
|
||||||
MARKETING_VERSION: "1.0.0"
|
MARKETING_VERSION: "1.0.0"
|
||||||
CURRENT_PROJECT_VERSION: "1"
|
CURRENT_PROJECT_VERSION: "1"
|
||||||
|
|||||||