feat: first tries at save dialog, so far failing
33
CLAUDE.md
@@ -1,6 +1,6 @@
 # MLX Server
 
-Native macOS SwiftUI app for local LLMs on Apple Silicon via MLX. Provides a chat UI and an embedded OpenAI-compatible API server. Supports vision and tool use.
+Native macOS SwiftUI app for local LLMs on Apple Silicon via MLX. Provides a chat UI and an embedded OpenAI-compatible API server. Supports vision, tool use, and thinking mode.
 
 ## Quick Start
 
@@ -14,18 +14,24 @@ open "build/Debug/MLX Server.app"
 
 ## Project Structure
 
-- `MLXServer/MLXServerApp.swift` — App entry point, GPU cache config
-- `MLXServer/ContentView.swift` — Main layout, toolbar, keyboard shortcuts
+- `MLXServer/MLXServerApp.swift` — App entry point, GPU cache config, menu commands
+- `MLXServer/ContentView.swift` — Main layout, toolbar, keyboard shortcuts, focused values
 - `MLXServer/Models/ModelConfig.swift` — Model definitions (alias, repoId, contextLength), resolution
-- `MLXServer/Models/ChatMessage.swift` — Chat message data model
-- `MLXServer/ViewModels/ModelManager.swift` — Model loading/switching via VLMModelFactory, offline-first resolution
+- `MLXServer/Models/ChatMessage.swift` — Chat message data model, `<think>` tag parsing
+- `MLXServer/ViewModels/ModelManager.swift` — Model loading/switching via VLMModelFactory, download tracking, idle unload
 - `MLXServer/ViewModels/ChatViewModel.swift` — Chat state, ChatSession management, API server lifecycle
 - `MLXServer/Server/APIServer.swift` — NWListener HTTP server, SSE streaming, KV cache reuse, vision, tool call handling
 - `MLXServer/Server/APIModels.swift` — OpenAI-compatible Codable structs
 - `MLXServer/Server/ToolCallParser.swift` — Parses tool calls from model output (Gemma tool_code, Qwen XML tags)
 - `MLXServer/Server/ToolPromptBuilder.swift` — Model-specific tool prompt formatting
-- `MLXServer/Utilities/LocalModelResolver.swift` — Resolves HF repo IDs to ~/.cache/huggingface/hub/ snapshots
-- `MLXServer/Utilities/Preferences.swift` — UserDefaults wrapper
+- `MLXServer/Views/DownloadModalView.swift` — Modal overlay for model download progress
+- `MLXServer/Views/ChatMessagesView.swift` — Message bubbles with markdown rendering and collapsible thinking blocks
+- `MLXServer/Views/ChatInputView.swift` — Text input, image attach (file picker, drag & drop, Finder copy-paste)
+- `MLXServer/Commands/SaveChatCommands.swift` — File > Export Chat menu command
+- `MLXServer/Utilities/LocalModelResolver.swift` — Resolves HF repo IDs to local snapshots (sandbox + system cache + flat layouts)
+- `MLXServer/Utilities/ChatExporter.swift` — Export conversations to Markdown or RTF (Pages-compatible)
+- `MLXServer/Utilities/FocusedValues.swift` — FocusedValue keys for menu bar integration
+- `MLXServer/Utilities/Preferences.swift` — UserDefaults wrapper (model, thinking mode, API, idle timeout)
 - `project.yml` — xcodegen project spec
 - `build.sh` — Build script (xcodegen + xcodebuild)
 
@@ -35,6 +41,9 @@ open "build/Debug/MLX Server.app"
 |-------|---------------|-------|
 | `gemma` | `mlx-community/gemma-3-4b-it-4bit` | Vision + tool use via `tool_code` blocks (128k context) |
 | `qwen` | `mlx-community/Qwen3-VL-4B-Instruct-4bit` | Vision + tool use via `<tool_call>` tags (256k context) |
+| `qwen3.5-9b` | `mlx-community/Qwen3.5-9B-4bit` | Thinking mode, tool use (256k context) |
 
+Any model in MLX format on HuggingFace can be added — no restriction on uploader or architecture.
+
 ## Critical Performance Rule
 
@@ -47,9 +56,15 @@ open "build/Debug/MLX Server.app"
 
 ## Key Design Decisions
 
-- Uses `mlx-swift-lm` (`MLXVLM` / `VLMModelFactory`) as the inference backend — supports both text and vision in a single model load
+- Uses `mlx-swift-lm` (`MLXVLM` / `VLMModelFactory`) as the inference backend — loads any MLX-format model from HuggingFace
 - Model-specific prompt formatting: Gemma uses `tool_code` blocks; Qwen uses `<tool_call>` XML tags
-- Offline-first: if the model is already cached locally (~/.cache/huggingface/hub/), `LocalModelResolver` resolves the local snapshot path directly — no network requests
+- **Offline-first**: `LocalModelResolver` checks the sandboxed app container, system `~/.cache/huggingface/hub/`, and flat download layouts — no network requests if model is cached
+- **No duplicate storage**: custom `HubApi(cache: nil)` with explicit `downloadBase` — models stored once in the snapshot cache, not duplicated across blob cache and snapshots
+- **Thinking mode**: `enable_thinking` passed to Jinja template context via `additionalContext`; `<think>...</think>` tags parsed in real-time during streaming and shown in collapsible UI blocks. Toggleable in Settings.
+- **Download progress**: separate `isDownloading` state from `isLoading`; modal overlay shows file count, percentage, speed
+- **Idle unload**: timer resets on both user input and model generation completion (not just request start)
+- **Chat export**: Markdown (user messages as blockquotes) and RTF (Pages-compatible with formatted markdown)
+- **Finder paste**: local event monitor intercepts Cmd+V to check pasteboard for image file URLs before TextField handles it
 - HTTP server built on `Network.framework` (`NWListener`) — no third-party server dependencies
 - KV cache reuse across API requests — reuses `ChatSession` when conversation history prefix matches
 - GPU cache limit set to 20 MB; cache cleared on model unload
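The thinking-mode entry in CLAUDE.md describes `<think>...</think>` tags being separated from the visible answer. As a rough illustration only, here is a minimal sketch of that split on a complete or partially streamed response string. `ParsedResponse` and `splitThinking` are hypothetical names chosen for this example; the app's actual parser in `ChatMessage.swift` is not part of this diff and may work differently.

```swift
import Foundation

// Hypothetical type for illustration; not taken from the app's sources.
struct ParsedResponse: Equatable {
    var thinking: String  // text found inside <think>...</think>
    var answer: String    // text that would be shown as the reply
}

/// Splits raw model output into reasoning and answer.
/// Assumes at most one <think> block per response.
func splitThinking(_ raw: String) -> ParsedResponse {
    func trim(_ s: Substring) -> String {
        s.trimmingCharacters(in: .whitespacesAndNewlines)
    }
    guard let open = raw.range(of: "<think>") else {
        // No thinking block at all: everything is the answer.
        return ParsedResponse(thinking: "", answer: trim(raw[...]))
    }
    let before = raw[..<open.lowerBound]
    let rest = raw[open.upperBound...]
    guard let close = rest.range(of: "</think>") else {
        // Mid-stream: the closing tag has not arrived yet, so treat
        // everything after <think> as reasoning.
        return ParsedResponse(thinking: trim(rest), answer: trim(before))
    }
    let answer = String(before) + String(rest[close.upperBound...])
    return ParsedResponse(
        thinking: trim(rest[..<close.lowerBound]),
        answer: trim(answer[...])
    )
}
```

Calling the function again on each growing prefix of a stream would keep the collapsible block and the answer in sync, at the cost of rescanning the text; an incremental parser would avoid that.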
@@ -10,9 +10,12 @@
 		0168AEE16009097901363E16 /* ModelManager.swift in Sources */ = {isa = PBXBuildFile; fileRef = 922CBDC9206737BD04AF2874 /* ModelManager.swift */; };
 		165E8AB6ADAE1D59B1A86420 /* Preferences.swift in Sources */ = {isa = PBXBuildFile; fileRef = 145B888FBDD4F931512C5473 /* Preferences.swift */; };
 		189362AAE2CDE5D4B3428334 /* ToolCallParser.swift in Sources */ = {isa = PBXBuildFile; fileRef = E73B165A1822729C907791AE /* ToolCallParser.swift */; };
+		29879D696584B96CC56560DF /* ChatExporter.swift in Sources */ = {isa = PBXBuildFile; fileRef = D7C9BAD674E29688ACE53B0B /* ChatExporter.swift */; };
 		2CAAF7129F7CC45200FA9F6B /* ModelPickerView.swift in Sources */ = {isa = PBXBuildFile; fileRef = C3C3A76C02AF70A9D8F868FC /* ModelPickerView.swift */; };
 		2D08769282BD71C170DB0943 /* InferenceStats.swift in Sources */ = {isa = PBXBuildFile; fileRef = E35452B166893B25E765FF70 /* InferenceStats.swift */; };
+		4158FA884D981D73288FB74C /* SaveChatCommands.swift in Sources */ = {isa = PBXBuildFile; fileRef = 2E2FCA55CEBEBCED78D9479A /* SaveChatCommands.swift */; };
 		4CB13DC1AC7A500DDBB443EC /* ChatInputView.swift in Sources */ = {isa = PBXBuildFile; fileRef = E5E6AD02CDF23BDAB64700A7 /* ChatInputView.swift */; };
+		4DC033E45880B2948B47DEB1 /* FocusedValues.swift in Sources */ = {isa = PBXBuildFile; fileRef = EF518FEBF3A38E830E3CE1A5 /* FocusedValues.swift */; };
 		50B6861FF8610B3ED4FFAD9D /* MLXServerApp.swift in Sources */ = {isa = PBXBuildFile; fileRef = C67742651DB486871CEF1612 /* MLXServerApp.swift */; };
 		50DD129CCF2843482DEC3B96 /* APIServer.swift in Sources */ = {isa = PBXBuildFile; fileRef = 3D08828E16B17EF02C14243E /* APIServer.swift */; };
 		5946258F1DE88CE904584E0B /* ContentView.swift in Sources */ = {isa = PBXBuildFile; fileRef = 944C699FBB76C734C9DF2F2E /* ContentView.swift */; };
@@ -38,6 +41,7 @@
 		145B888FBDD4F931512C5473 /* Preferences.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = Preferences.swift; sourceTree = "<group>"; };
 		16AE82A64D1D07AE3CD8D33A /* ToolPromptBuilder.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ToolPromptBuilder.swift; sourceTree = "<group>"; };
 		2DC8C86D397B1FCA08E07CBD /* DownloadModalView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = DownloadModalView.swift; sourceTree = "<group>"; };
+		2E2FCA55CEBEBCED78D9479A /* SaveChatCommands.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SaveChatCommands.swift; sourceTree = "<group>"; };
 		38DFC212AF4359A45FBE22BA /* ModelConfig.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ModelConfig.swift; sourceTree = "<group>"; };
 		3AF462805202797F61422AEE /* MLXServer.entitlements */ = {isa = PBXFileReference; lastKnownFileType = text.plist.entitlements; path = MLXServer.entitlements; sourceTree = "<group>"; };
 		3D08828E16B17EF02C14243E /* APIServer.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = APIServer.swift; sourceTree = "<group>"; };
@@ -53,10 +57,12 @@
 		C3C3A76C02AF70A9D8F868FC /* ModelPickerView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ModelPickerView.swift; sourceTree = "<group>"; };
 		C67742651DB486871CEF1612 /* MLXServerApp.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = MLXServerApp.swift; sourceTree = "<group>"; };
 		D733A0D1D4AC25DDDA6C8684 /* LocalModelResolver.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = LocalModelResolver.swift; sourceTree = "<group>"; };
+		D7C9BAD674E29688ACE53B0B /* ChatExporter.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ChatExporter.swift; sourceTree = "<group>"; };
 		DB1A5E8B1C9F2BC4D262C53A /* ChatMessagesView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ChatMessagesView.swift; sourceTree = "<group>"; };
 		E35452B166893B25E765FF70 /* InferenceStats.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = InferenceStats.swift; sourceTree = "<group>"; };
 		E5E6AD02CDF23BDAB64700A7 /* ChatInputView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ChatInputView.swift; sourceTree = "<group>"; };
 		E73B165A1822729C907791AE /* ToolCallParser.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ToolCallParser.swift; sourceTree = "<group>"; };
+		EF518FEBF3A38E830E3CE1A5 /* FocusedValues.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = FocusedValues.swift; sourceTree = "<group>"; };
 		F1A52E2C9964ADA9D841A89B /* APIModels.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = APIModels.swift; sourceTree = "<group>"; };
 /* End PBXFileReference section */
 
@@ -78,6 +84,8 @@
 		05B1BAE308E64D2FB2E73823 /* Utilities */ = {
 			isa = PBXGroup;
 			children = (
+				D7C9BAD674E29688ACE53B0B /* ChatExporter.swift */,
+				EF518FEBF3A38E830E3CE1A5 /* FocusedValues.swift */,
 				D733A0D1D4AC25DDDA6C8684 /* LocalModelResolver.swift */,
 				145B888FBDD4F931512C5473 /* Preferences.swift */,
 			);
@@ -99,6 +107,7 @@
 				944C699FBB76C734C9DF2F2E /* ContentView.swift */,
 				3AF462805202797F61422AEE /* MLXServer.entitlements */,
 				C67742651DB486871CEF1612 /* MLXServerApp.swift */,
+				B459409ED6FD8797FDD81E94 /* Commands */,
 				BD0E350482D91238B4B59721 /* Models */,
 				E13C1AAA0C49D0ED85EFD94D /* Server */,
 				05B1BAE308E64D2FB2E73823 /* Utilities */,
@@ -122,6 +131,14 @@
 			path = Views;
 			sourceTree = "<group>";
 		};
+		B459409ED6FD8797FDD81E94 /* Commands */ = {
+			isa = PBXGroup;
+			children = (
+				2E2FCA55CEBEBCED78D9479A /* SaveChatCommands.swift */,
+			);
+			path = Commands;
+			sourceTree = "<group>";
+		};
 		BD0E350482D91238B4B59721 /* Models */ = {
 			isa = PBXGroup;
 			children = (
@@ -238,12 +255,14 @@
 			files = (
 				D96DDE66F76FDDA642629E17 /* APIModels.swift in Sources */,
 				50DD129CCF2843482DEC3B96 /* APIServer.swift in Sources */,
+				29879D696584B96CC56560DF /* ChatExporter.swift in Sources */,
 				4CB13DC1AC7A500DDBB443EC /* ChatInputView.swift in Sources */,
 				FAF7D4714AC6D02674920208 /* ChatMessage.swift in Sources */,
 				5C1E8FE1C521914CEF98D3AA /* ChatMessagesView.swift in Sources */,
 				B5AA6E3B4BE21676226B342B /* ChatViewModel.swift in Sources */,
 				5946258F1DE88CE904584E0B /* ContentView.swift in Sources */,
 				C07A377244DCD67F4FE709FE /* DownloadModalView.swift in Sources */,
+				4DC033E45880B2948B47DEB1 /* FocusedValues.swift in Sources */,
 				2D08769282BD71C170DB0943 /* InferenceStats.swift in Sources */,
 				6828CCA8B78AB40906F87CAB /* LocalModelResolver.swift in Sources */,
 				50B6861FF8610B3ED4FFAD9D /* MLXServerApp.swift in Sources */,
@@ -252,6 +271,7 @@
 				2CAAF7129F7CC45200FA9F6B /* ModelPickerView.swift in Sources */,
 				B1D9BC407DB7DB1489230C20 /* MonitorView.swift in Sources */,
 				165E8AB6ADAE1D59B1A86420 /* Preferences.swift in Sources */,
+				4158FA884D981D73288FB74C /* SaveChatCommands.swift in Sources */,
 				D666A311788375E8A061C832 /* SettingsView.swift in Sources */,
 				621B7E4382199AC1378F5F9C /* StatusBarView.swift in Sources */,
 				189362AAE2CDE5D4B3428334 /* ToolCallParser.swift in Sources */,
@@ -399,7 +419,7 @@
 				);
 				MACOSX_DEPLOYMENT_TARGET = 15.0;
 				MARKETING_VERSION = 1.0.0;
-				PRODUCT_BUNDLE_IDENTIFIER = com.mlxserver.app;
+				PRODUCT_BUNDLE_IDENTIFIER = de.rfc1437.mlxserver;
 				PRODUCT_NAME = "MLX Server";
 				SDKROOT = macosx;
 				SWIFT_VERSION = 6.0;
@@ -424,7 +444,7 @@
 				);
 				MACOSX_DEPLOYMENT_TARGET = 15.0;
 				MARKETING_VERSION = 1.0.0;
-				PRODUCT_BUNDLE_IDENTIFIER = com.mlxserver.app;
+				PRODUCT_BUNDLE_IDENTIFIER = de.rfc1437.mlxserver;
 				PRODUCT_NAME = "MLX Server";
 				SDKROOT = macosx;
 				SWIFT_VERSION = 6.0;
16
MLXServer/Commands/SaveChatCommands.swift
Normal file
@@ -0,0 +1,16 @@
+import SwiftUI
+
+/// Adds "Export Chat…" to the File menu.
+struct SaveChatCommands: Commands {
+    @FocusedBinding(\.exportTrigger) var isExporting
+
+    var body: some Commands {
+        CommandGroup(after: .saveItem) {
+            Button("Export Chat…") {
+                isExporting = true
+            }
+            .keyboardShortcut("e", modifiers: [.command, .shift])
+            .disabled(isExporting == nil)
+        }
+    }
+}
@@ -1,10 +1,12 @@
 import SwiftUI
+import UniformTypeIdentifiers
 
 struct ContentView: View {
     @Environment(ModelManager.self) private var modelManager
     @State private var chatVM: ChatViewModel?
     @State private var showLoadError = false
     @State private var showMonitor = false
+    @State private var isExporting = false
 
     var body: some View {
         mainContent
@@ -52,6 +54,21 @@ struct ContentView: View {
             .background {
                 modelSwitchShortcuts
             }
+            // Expose export trigger to menu bar command
+            .focusedSceneValue(\.exportTrigger, $isExporting)
+            .fileExporter(
+                isPresented: $isExporting,
+                document: ChatExportDocument(
+                    messages: chatVM?.conversation.messages ?? [],
+                    modelName: modelManager.currentModel?.displayName
+                ),
+                contentTypes: ChatExportDocument.writableContentTypes,
+                defaultFilename: "chat"
+            ) { result in
+                if case .failure(let error) = result {
+                    print("[Export] Failed: \(error.localizedDescription)")
+                }
+            }
     }
 
     @ViewBuilder
@@ -23,6 +23,9 @@ struct MLXServerApp: App {
         }
         .windowStyle(.titleBar)
         .defaultSize(width: 800, height: 700)
+        .commands {
+            SaveChatCommands()
+        }
 
         #if os(macOS)
         Settings {
290
MLXServer/Utilities/ChatExporter.swift
Normal file
@@ -0,0 +1,290 @@
+import AppKit
+import Foundation
+import SwiftUI
+import UniformTypeIdentifiers
+
+/// A FileDocument that exports a chat conversation as Markdown or RTF.
+struct ChatExportDocument: FileDocument {
+    static var readableContentTypes: [UTType] { [.plainText] }
+    static var writableContentTypes: [UTType] {
+        [UTType(filenameExtension: "md") ?? .plainText, .rtf]
+    }
+
+    let messages: [ChatMessage]
+    let modelName: String?
+
+    init(messages: [ChatMessage], modelName: String?) {
+        self.messages = messages
+        self.modelName = modelName
+    }
+
+    init(configuration: ReadConfiguration) throws {
+        self.messages = []
+        self.modelName = nil
+    }
+
+    func fileWrapper(configuration: WriteConfiguration) throws -> FileWrapper {
+        let contentType = configuration.contentType
+
+        if contentType == .rtf, let data = ChatExporter.exportRTF(messages: messages, modelName: modelName) {
+            return FileWrapper(regularFileWithContents: data)
+        } else {
+            let md = ChatExporter.exportMarkdown(messages: messages, modelName: modelName)
+            return FileWrapper(regularFileWithContents: Data(md.utf8))
+        }
+    }
+}
+
+/// Exports a chat conversation to Markdown or RTF (Pages-compatible) format.
+enum ChatExporter {
+
+    // MARK: - Markdown export
+
+    static func exportMarkdown(messages: [ChatMessage], modelName: String?) -> String {
+        var lines: [String] = []
+
+        // Header
+        lines.append("# Chat Session")
+        if let modelName {
+            lines.append("**Model:** \(modelName)")
+        }
+        let formatter = DateFormatter()
+        formatter.dateStyle = .long
+        formatter.timeStyle = .short
+        if let first = messages.first {
+            lines.append("**Date:** \(formatter.string(from: first.timestamp))")
+        }
+        lines.append("")
+        lines.append("---")
+        lines.append("")
+
+        for message in messages {
+            guard message.role != .system else { continue }
+
+            if message.role == .user {
+                // User messages as blockquotes
+                lines.append("**You:**")
+                lines.append("")
+                for line in message.content.components(separatedBy: "\n") {
+                    lines.append("> \(line)")
+                }
+            } else {
+                // Assistant messages: carry over original markdown
+                lines.append("**Assistant:**")
+                lines.append("")
+                lines.append(message.content)
+            }
+
+            lines.append("")
+            lines.append("---")
+            lines.append("")
+        }
+
+        return lines.joined(separator: "\n")
+    }
+
+    // MARK: - RTF export
+
+    static func exportRTF(messages: [ChatMessage], modelName: String?) -> Data? {
+        let doc = NSMutableAttributedString()
+
+        let bodyFont = NSFont.systemFont(ofSize: 13)
+        let bodyBoldFont = NSFont.boldSystemFont(ofSize: 13)
+        let titleFont = NSFont.boldSystemFont(ofSize: 20)
+        let metaFont = NSFont.systemFont(ofSize: 11)
+        let codeFont = NSFont.monospacedSystemFont(ofSize: 12, weight: .regular)
+
+        let bodyParagraph = NSMutableParagraphStyle()
+        bodyParagraph.paragraphSpacing = 8
+        bodyParagraph.lineSpacing = 2
+
+        let userParagraph = NSMutableParagraphStyle()
+        userParagraph.paragraphSpacing = 8
+        userParagraph.lineSpacing = 2
+        userParagraph.headIndent = 20
+        userParagraph.firstLineHeadIndent = 20
+
+        // Title
+        doc.append(NSAttributedString(
+            string: "Chat Session\n",
+            attributes: [.font: titleFont, .paragraphStyle: bodyParagraph]
+        ))
+
+        // Metadata
+        let formatter = DateFormatter()
+        formatter.dateStyle = .long
+        formatter.timeStyle = .short
+        var metaText = ""
+        if let modelName { metaText += "Model: \(modelName) " }
+        if let first = messages.first {
+            metaText += "Date: \(formatter.string(from: first.timestamp))"
+        }
+        if !metaText.isEmpty {
+            doc.append(NSAttributedString(
+                string: metaText + "\n\n",
+                attributes: [.font: metaFont, .foregroundColor: NSColor.secondaryLabelColor]
+            ))
+        }
+
+        for message in messages {
+            guard message.role != .system else { continue }
+
+            if message.role == .user {
+                doc.append(NSAttributedString(
+                    string: "You\n",
+                    attributes: [
+                        .font: bodyBoldFont,
+                        .foregroundColor: NSColor.systemBlue,
+                    ]
+                ))
+                doc.append(NSAttributedString(
+                    string: message.content + "\n\n",
+                    attributes: [
+                        .font: bodyFont,
+                        .paragraphStyle: userParagraph,
+                        .foregroundColor: NSColor.labelColor,
+                    ]
+                ))
+            } else {
+                doc.append(NSAttributedString(
+                    string: "Assistant\n",
+                    attributes: [
+                        .font: bodyBoldFont,
+                        .foregroundColor: NSColor.labelColor,
+                    ]
+                ))
+                let rendered = renderMarkdown(message.content, bodyFont: bodyFont, codeFont: codeFont, paragraph: bodyParagraph)
+                doc.append(rendered)
+                doc.append(NSAttributedString(string: "\n\n"))
+            }
+
+            doc.append(NSAttributedString(
+                string: "\n",
+                attributes: [
+                    .strikethroughStyle: NSUnderlineStyle.single.rawValue,
+                    .strikethroughColor: NSColor.separatorColor,
+                    .font: NSFont.systemFont(ofSize: 4),
+                ]
+            ))
+        }
+
+        return doc.rtf(from: NSRange(location: 0, length: doc.length), documentAttributes: [
+            .documentType: NSAttributedString.DocumentType.rtf,
+        ])
+    }
+
+    // MARK: - Markdown → NSAttributedString (basic)
+
+    private static func renderMarkdown(
+        _ text: String,
+        bodyFont: NSFont,
+        codeFont: NSFont,
+        paragraph: NSParagraphStyle
+    ) -> NSAttributedString {
+        let result = NSMutableAttributedString()
+        let lines = text.components(separatedBy: "\n")
+        var inCodeBlock = false
+        var codeBlockLines: [String] = []
+
+        for line in lines {
+            if line.hasPrefix("```") {
+                if inCodeBlock {
+                    let code = codeBlockLines.joined(separator: "\n")
+                    let codePara = NSMutableParagraphStyle()
+                    codePara.paragraphSpacing = 4
+                    codePara.headIndent = 12
+                    codePara.firstLineHeadIndent = 12
+                    result.append(NSAttributedString(
+                        string: code + "\n",
+                        attributes: [
+                            .font: codeFont,
+                            .foregroundColor: NSColor.secondaryLabelColor,
+                            .backgroundColor: NSColor.quaternaryLabelColor,
+                            .paragraphStyle: codePara,
+                        ]
+                    ))
+                    codeBlockLines = []
+                    inCodeBlock = false
+                } else {
+                    inCodeBlock = true
+                }
+                continue
+            }
+
+            if inCodeBlock {
+                codeBlockLines.append(line)
+                continue
+            }
+
+            if line.hasPrefix("### ") {
+                result.append(NSAttributedString(
+                    string: String(line.dropFirst(4)) + "\n",
+                    attributes: [.font: NSFont.boldSystemFont(ofSize: 14), .paragraphStyle: paragraph]
+                ))
+            } else if line.hasPrefix("## ") {
+                result.append(NSAttributedString(
+                    string: String(line.dropFirst(3)) + "\n",
+                    attributes: [.font: NSFont.boldSystemFont(ofSize: 15), .paragraphStyle: paragraph]
+                ))
+            } else if line.hasPrefix("# ") {
+                result.append(NSAttributedString(
+                    string: String(line.dropFirst(2)) + "\n",
+                    attributes: [.font: NSFont.boldSystemFont(ofSize: 17), .paragraphStyle: paragraph]
+                ))
+            } else {
+                let styled = applyInlineFormatting(line, bodyFont: bodyFont, codeFont: codeFont)
+                result.append(styled)
+                result.append(NSAttributedString(string: "\n", attributes: [.font: bodyFont]))
+            }
+        }
+
+        return result
+    }
+
+    private static func applyInlineFormatting(
+        _ text: String,
+        bodyFont: NSFont,
+        codeFont: NSFont
+    ) -> NSAttributedString {
+        let result = NSMutableAttributedString()
+        var remaining = text[text.startIndex...]
+
+        while !remaining.isEmpty {
+            if remaining.hasPrefix("`"), let end = remaining.dropFirst().firstIndex(of: "`") {
+                let code = String(remaining[remaining.index(after: remaining.startIndex)..<end])
+                result.append(NSAttributedString(
+                    string: code,
+                    attributes: [
+                        .font: codeFont,
+                        .foregroundColor: NSColor.secondaryLabelColor,
+                        .backgroundColor: NSColor.quaternaryLabelColor,
+                    ]
+                ))
+                remaining = remaining[remaining.index(after: end)...]
+            } else if remaining.hasPrefix("**"), let end = remaining.dropFirst(2).range(of: "**") {
+                let bold = String(remaining[remaining.index(remaining.startIndex, offsetBy: 2)..<end.lowerBound])
+                result.append(NSAttributedString(
+                    string: bold,
+                    attributes: [.font: NSFont.boldSystemFont(ofSize: bodyFont.pointSize)]
+                ))
+                remaining = remaining[end.upperBound...]
+            } else if remaining.hasPrefix("*"), let end = remaining.dropFirst().firstIndex(of: "*") {
+                let italic = String(remaining[remaining.index(after: remaining.startIndex)..<end])
+                result.append(NSAttributedString(
+                    string: italic,
+                    attributes: [.font: NSFontManager.shared.convert(bodyFont, toHaveTrait: .italicFontMask)]
+                ))
+                remaining = remaining[remaining.index(after: end)...]
+            } else {
+                let ch = remaining[remaining.startIndex]
+                result.append(NSAttributedString(
+                    string: String(ch),
+                    attributes: [.font: bodyFont]
+                ))
+                remaining = remaining[remaining.index(after: remaining.startIndex)...]
+            }
+        }
+
+        return result
+    }
+}
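The inline scanner in `applyInlineFormatting` is tied to AppKit fonts, which makes its matching rules awkward to exercise in isolation. The sketch below mirrors the same left-to-right scan (backtick code, then `**bold**`, then `*italic*`, otherwise literal text) but emits plain values instead of attributed strings. `InlineSpan` and `tokenizeInline` are hypothetical names introduced here, and unlike the code in this commit the sketch merges consecutive literal characters into one span.

```swift
import Foundation

// Hypothetical span type for illustration; the commit builds NSAttributedString instead.
enum InlineSpan: Equatable {
    case text(String)
    case code(String)
    case bold(String)
    case italic(String)
}

/// Scans one line left to right, using the same precedence as the diff above.
/// A marker without a closing partner is kept as literal text.
func tokenizeInline(_ line: String) -> [InlineSpan] {
    var spans: [InlineSpan] = []
    var plain = ""
    var remaining = Substring(line)

    // Emit any buffered literal text before a styled span.
    func flushPlain() {
        if !plain.isEmpty {
            spans.append(.text(plain))
            plain = ""
        }
    }

    while !remaining.isEmpty {
        let afterFirst = remaining.index(after: remaining.startIndex)
        if remaining.hasPrefix("`"), let end = remaining.dropFirst().firstIndex(of: "`") {
            flushPlain()
            spans.append(.code(String(remaining[afterFirst..<end])))
            remaining = remaining[remaining.index(after: end)...]
        } else if remaining.hasPrefix("**"), let end = remaining.dropFirst(2).range(of: "**") {
            flushPlain()
            let start = remaining.index(remaining.startIndex, offsetBy: 2)
            spans.append(.bold(String(remaining[start..<end.lowerBound])))
            remaining = remaining[end.upperBound...]
        } else if remaining.hasPrefix("*"), let end = remaining.dropFirst().firstIndex(of: "*") {
            flushPlain()
            spans.append(.italic(String(remaining[afterFirst..<end])))
            remaining = remaining[remaining.index(after: end)...]
        } else {
            plain.append(remaining.removeFirst())
        }
    }
    flushPlain()
    return spans
}
```

One behaviour worth knowing when reading the exporter: because the bold check falls through to the italic check, an unclosed `**` is consumed as an empty italic span rather than printed literally.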
13
MLXServer/Utilities/FocusedValues.swift
Normal file
@@ -0,0 +1,13 @@
+import SwiftUI
+
+/// Focused value key for triggering chat export from the menu bar.
+struct FocusedExportTriggerKey: FocusedValueKey {
+    typealias Value = Binding<Bool>
+}
+
+extension FocusedValues {
+    var exportTrigger: Binding<Bool>? {
+        get { self[FocusedExportTriggerKey.self] }
+        set { self[FocusedExportTriggerKey.self] = newValue }
+    }
+}
@@ -1,75 +1,28 @@
|
||||
import Foundation
|
||||
|
||||
/// Resolves HuggingFace model repos to local snapshot directories,
|
||||
/// matching the cache layout used by Python's `huggingface_hub`.
|
||||
/// Resolves HuggingFace model repos to local directories.
|
||||
///
|
||||
/// Checks two locations:
|
||||
/// 1. App sandbox container: ~/Library/Containers/com.mlxserver.app/.../huggingface/hub/
|
||||
/// 2. System-wide cache: ~/.cache/huggingface/hub/ (shared with Python tools)
|
||||
///
|
||||
/// Cache structure:
|
||||
/// .../huggingface/hub/models--{org}--{name}/snapshots/{hash}/
|
||||
/// HubApi(downloadBase: .cachesDirectory, cache: nil) downloads models to:
|
||||
/// ~/Library/Containers/de.rfc1437.mlxserver/Data/Library/Caches/models/{org}/{name}/
|
||||
enum LocalModelResolver {
|
||||
|
||||
/// All HuggingFace cache directories to search, in priority order.
|
||||
/// The sandboxed container path is checked first (where the app downloads to),
|
||||
/// then the system-wide Python cache (for models downloaded via huggingface-cli).
|
||||
private static let cacheBases: [URL] = {
|
||||
var bases: [URL] = []
|
||||
|
||||
// 1. Sandboxed app container cache (where swift-transformers Hub downloads to)
|
||||
let containerCache = FileManager.default.homeDirectoryForCurrentUser
|
||||
.appendingPathComponent("Library/Caches/huggingface/hub", isDirectory: true)
|
||||
bases.append(containerCache)
|
||||
|
||||
// 2. System-wide ~/.cache/huggingface/hub/ (Python huggingface_hub)
|
||||
// When sandboxed, homeDirectory points to the container, so construct the real path.
|
||||
let realHome = URL(fileURLWithPath: NSHomeDirectory())
|
||||
let systemCache = realHome
|
||||
.appendingPathComponent(".cache/huggingface/hub", isDirectory: true)
|
||||
// Avoid duplicate if they resolve to the same path
|
||||
if systemCache.path != containerCache.path {
|
||||
bases.append(systemCache)
|
||||
}
|
||||
|
||||
// 3. Also try the unsandboxed home directory path
|
||||
let globalHome = FileManager.default.homeDirectoryForCurrentUser
|
||||
.appendingPathComponent(".cache/huggingface/hub", isDirectory: true)
|
||||
if globalHome.path != containerCache.path && globalHome.path != systemCache.path {
|
||||
bases.append(globalHome)
|
||||
}
|
||||
|
||||
return bases
|
||||
/// Base directory where HubApi stores downloaded models.
|
||||
private static let modelsBase: URL? = {
|
||||
FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask).first?
|
||||
.appendingPathComponent("models", isDirectory: true)
|
||||
}()
|
||||
|
||||
/// Resolve a HuggingFace repo ID (e.g. "mlx-community/gemma-3-4b-it-4bit")
|
||||
/// to its local snapshot directory, if it exists.
|
||||
/// to its local directory, if it exists.
|
||||
///
|
||||
/// Returns `nil` if the model hasn't been downloaded yet.
|
||||
static func resolve(repoId: String) -> URL? {
|
||||
let dirName = "models--" + repoId.replacingOccurrences(of: "/", with: "--")
|
||||
|
||||
for cacheBase in cacheBases {
|
||||
let snapshotsDir = cacheBase
|
||||
.appendingPathComponent(dirName, isDirectory: true)
|
||||
.appendingPathComponent("snapshots", isDirectory: true)
|
||||
|
||||
guard let contents = try? FileManager.default.contentsOfDirectory(
|
||||
at: snapshotsDir,
|
||||
includingPropertiesForKeys: [.isDirectoryKey],
|
||||
options: [.skipsHiddenFiles]
|
||||
) else {
|
||||
continue
|
||||
}
|
||||
|
||||
if let snapshot = contents
|
||||
.filter({ (try? $0.resourceValues(forKeys: [.isDirectoryKey]).isDirectory) == true })
|
||||
.sorted(by: { $0.lastPathComponent < $1.lastPathComponent })
|
||||
.last {
|
||||
return snapshot
|
||||
}
|
||||
guard let base = modelsBase else { return nil }
|
||||
let modelDir = base.appendingPathComponent(repoId, isDirectory: true)
|
||||
var isDir: ObjCBool = false
|
||||
if FileManager.default.fileExists(atPath: modelDir.path, isDirectory: &isDir), isDir.boolValue {
|
||||
return modelDir
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
@@ -79,39 +32,18 @@ enum LocalModelResolver {
|
||||
}
|
||||
|
||||
/// Delete the local cache for a model so it will be re-downloaded next time.
|
||||
/// Removes from all cache locations.
|
||||
/// Returns true if something was deleted.
|
||||
@discardableResult
|
||||
static func deleteLocal(repoId: String) -> Bool {
|
||||
let dirName = "models--" + repoId.replacingOccurrences(of: "/", with: "--")
|
||||
var deleted = false
|
||||
|
||||
for cacheBase in cacheBases {
|
||||
let modelDir = cacheBase.appendingPathComponent(dirName, isDirectory: true)
|
||||
guard FileManager.default.fileExists(atPath: modelDir.path) else { continue }
|
||||
do {
|
||||
try FileManager.default.removeItem(at: modelDir)
|
||||
print("[LocalModelResolver] Deleted \(modelDir.path)")
|
||||
deleted = true
|
||||
} catch {
|
||||
print("[LocalModelResolver] Failed to delete \(modelDir.path): \(error)")
|
||||
}
|
||||
guard let base = modelsBase else { return false }
|
||||
let modelDir = base.appendingPathComponent(repoId, isDirectory: true)
|
||||
guard FileManager.default.fileExists(atPath: modelDir.path) else { return false }
|
||||
do {
|
||||
try FileManager.default.removeItem(at: modelDir)
|
||||
print("[LocalModelResolver] Deleted \(modelDir.path)")
|
||||
return true
|
||||
} catch {
|
||||
print("[LocalModelResolver] Failed to delete \(modelDir.path): \(error)")
return false
}
// Also clean up the per-model cache in the container (used by swift-transformers)
let containerModelsDir = FileManager.default.homeDirectoryForCurrentUser
.appendingPathComponent("Library/Caches/models", isDirectory: true)
.appendingPathComponent(repoId, isDirectory: true)
if FileManager.default.fileExists(atPath: containerModelsDir.path) {
do {
try FileManager.default.removeItem(at: containerModelsDir)
print("[LocalModelResolver] Deleted \(containerModelsDir.path)")
deleted = true
} catch {
print("[LocalModelResolver] Failed to delete \(containerModelsDir.path): \(error)")
}
}
return deleted
}
}
@@ -12,7 +12,12 @@ final class ModelManager {
/// HubApi with blob cache disabled to avoid storing every model twice.
/// swift-huggingface defaults to caching in both huggingface/hub/ (snapshots)
/// AND models/ (content-addressed blobs). We only need the snapshots.
private static let hub = HubApi(cache: nil)
/// Must use the same downloadBase as defaultHubApi (.cachesDirectory) so
/// LocalModelResolver can find downloaded models.
private static let hub: HubApi = {
let cachesDir = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask).first
return HubApi(downloadBase: cachesDir, cache: nil)
}()
var currentModel: ModelConfig?
var modelContainer: ModelContainer?
var isLoading = false
@@ -52,7 +57,6 @@ final class ModelManager {
}
do {
let container: ModelContainer
let progressHandler: @Sendable (Progress) -> Void = { progress in
Task { @MainActor in
self.downloadProgress = progress.fractionCompleted
@@ -73,7 +77,7 @@ final class ModelManager {
configuration = config.modelConfiguration
}
container = try await VLMModelFactory.shared.loadContainer(
let container = try await VLMModelFactory.shared.loadContainer(
hub: Self.hub,
configuration: configuration,
progressHandler: progressHandler
51
README.md
@@ -1,6 +1,6 @@
# MLX Server
Native macOS app for running local LLMs on Apple Silicon via [MLX](https://github.com/ml-explore/mlx). Built with SwiftUI, it provides both a **chat UI** and an embedded **OpenAI-compatible API server**. Supports vision and tool use with automatic model swapping.
Native macOS app for running local LLMs on Apple Silicon via [MLX](https://github.com/ml-explore/mlx). Built with SwiftUI, it provides both a **chat UI** and an embedded **OpenAI-compatible API server**. Supports vision, tool use, and thinking mode.
## Supported Models
@@ -8,6 +8,9 @@ Native macOS app for running local LLMs on Apple Silicon via [MLX](https://githu
|-------|-------|---------|-------------|
| `gemma` | `mlx-community/gemma-3-4b-it-4bit` | 128k | Vision, tool use (`tool_code` blocks) |
| `qwen` | `mlx-community/Qwen3-VL-4B-Instruct-4bit` | 256k | Vision, tool use (`<tool_call>` tags) |
| `qwen3.5-9b` | `mlx-community/Qwen3.5-9B-4bit` | 256k | Thinking mode, tool use |
Any model in MLX format on HuggingFace can be added — there is no restriction on uploader or architecture.
## Quick Start
@@ -20,12 +23,16 @@ open "build/Debug/MLX Server.app"
## App Features
- **Chat interface** with markdown rendering, image attachments (file picker, drag & drop, clipboard paste)
- **Model picker** in toolbar with local/download status indicators
- **Chat interface** with markdown rendering, image attachments (file picker, drag & drop, clipboard paste, Finder copy-paste)
- **Model picker** in toolbar with local/download status indicators and re-download button
- **Download progress modal** — shows file progress, percentage, and speed when downloading a new model
- **Thinking mode** — models like Qwen3.5 can reason internally before responding; thinking content appears in a collapsible box. Toggle on/off in Settings.
- **Streaming responses** with live token display
- **Export chat** — File > Export Chat (Cmd+Shift+S) saves conversations as Markdown or RTF (Pages-compatible)
- **Status bar** showing model name, context window, tokens/sec, token counts, GPU memory, API server status
- **Keyboard shortcuts**: `Cmd+N` (new chat), `Cmd+Return` (send), `Escape` (stop), `Cmd+1/2/3` (switch models)
- **Settings** (`Cmd+,`): system prompt, API port, API auto-start
- **Keyboard shortcuts**: `Cmd+N` (new chat), `Cmd+Return` (send), `Escape` (stop), `Cmd+1/2/3/4` (switch models), `Cmd+Shift+S` (export)
- **Settings** (`Cmd+,`): default model, thinking mode toggle, system prompt, API port, API auto-start, idle unload timeout
- **Idle auto-unload** — model is unloaded after configurable idle time (resets on both user input and model output), reloaded on next request
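
The idle auto-unload rule in the list above can be sketched as a small value type. This is a minimal sketch rather than the app's actual code: `IdleTracker`, `touch(at:)` and `shouldUnload(at:)` are hypothetical names, and treating a timeout of 0 as "never unload" is an assumption.

```swift
import Foundation

/// Hypothetical helper (not from the repository): remembers the last
/// activity time and reports when the configured idle timeout has elapsed.
struct IdleTracker {
    /// Idle time in seconds; 0 disables unloading (assumption).
    let timeout: TimeInterval
    private(set) var lastActivity: Date

    init(timeout: TimeInterval, now: Date = Date()) {
        self.timeout = timeout
        self.lastActivity = now
    }

    /// Call on user input and on each chunk of model output, so that
    /// both kinds of activity reset the idle clock.
    mutating func touch(at now: Date = Date()) {
        lastActivity = now
    }

    /// True once the model has been idle for at least `timeout` seconds.
    func shouldUnload(at now: Date = Date()) -> Bool {
        timeout > 0 && now.timeIntervalSince(lastActivity) >= timeout
    }
}
```

Passing the current time in as a parameter keeps the rule testable without real timers; a periodic task could call `shouldUnload()` and then release the model container.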
## API Server
@@ -74,23 +81,29 @@ MLXServer/
├── ContentView.swift — Main layout, toolbar, keyboard shortcuts
├── Models/
│ ├── ModelConfig.swift — Model definitions, alias/repoId resolution
│ └── ChatMessage.swift — Chat message data model
│ └── ChatMessage.swift — Chat message data model, thinking tag parser
├── ViewModels/
│ ├── ModelManager.swift — Model loading/switching via VLMModelFactory
│ ├── ModelManager.swift — Model loading/switching, download tracking, idle unload
│ └── ChatViewModel.swift — Chat state, ChatSession, API server lifecycle
├── Views/
│ ├── ModelPickerView.swift — Toolbar model selector
│ ├── ChatMessagesView.swift — Scrollable message list with markdown
│ ├── ChatInputView.swift — Text input + image attach
│ ├── ModelPickerView.swift — Toolbar model selector with re-download
│ ├── ChatMessagesView.swift — Scrollable message list with markdown + thinking blocks
│ ├── ChatInputView.swift — Text input + image attach (paste, drag, picker)
│ ├── DownloadModalView.swift — Model download progress overlay
│ ├── StatusBarView.swift — Model info, tok/s, GPU memory, API status
│ └── SettingsView.swift — System prompt + API settings
│ ├── MonitorView.swift — Inference statistics monitor
│ └── SettingsView.swift — System prompt, thinking mode, API, idle settings
├── Commands/
│ └── SaveChatCommands.swift — File menu export command
├── Server/
│ ├── APIServer.swift — NWListener HTTP server, SSE streaming, KV cache reuse
│ ├── APIModels.swift — OpenAI-compatible Codable structs
│ ├── ToolCallParser.swift — Parses tool calls from model output
│ └── ToolPromptBuilder.swift — Model-specific tool prompt formatting
└── Utilities/
├── LocalModelResolver.swift — Offline-first HuggingFace cache resolution
├── LocalModelResolver.swift — Offline-first HuggingFace cache resolution (sandbox + system)
├── ChatExporter.swift — Export conversations to Markdown or RTF
├── FocusedValues.swift — FocusedValue keys for menu bar integration
└── Preferences.swift — UserDefaults wrapper
project.yml — xcodegen project spec (dependencies, settings, deployment target)
@@ -99,17 +112,11 @@ build.sh — One-command build script (xcodegen + xcodebuild)
## Key Design Decisions
- Uses `mlx-swift-lm` (`MLXVLM` / `VLMModelFactory`) for inference — supports both text and vision in a single model load
- **Offline-first**: `LocalModelResolver` checks `~/.cache/huggingface/hub/` for locally-cached snapshots before downloading
- Uses `mlx-swift-lm` (`MLXVLM` / `VLMModelFactory`) for inference — loads any MLX-format model from HuggingFace
- **Offline-first**: `LocalModelResolver` checks both the sandboxed app container and `~/.cache/huggingface/hub/` for locally-cached models before downloading
- **No duplicate storage**: custom `HubApi` with blob cache disabled — models are stored once in the snapshot cache
- **KV cache reuse** across API requests — reuses `ChatSession` when conversation history prefix matches
- **Thinking mode**: `enable_thinking` passed via Jinja template context; `<think>` tags parsed in real-time during streaming
- HTTP server built on `Network.framework` (`NWListener`) — no third-party server dependencies
- Model-specific prompt formatting: Gemma uses `tool_code` blocks, Qwen uses `<tool_call>` XML tags
- GPU cache limit set to 20 MB; cache cleared on model unload
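
The thinking-mode bullet above mentions parsing `<think>` tags while tokens are still streaming. A minimal sketch of that idea follows, assuming at most one `<think>` block per reply; `splitThinking` and its return labels are hypothetical names, not the parser in `ChatMessage.swift`.

```swift
import Foundation

/// Hypothetical helper (not from the repository): separates `<think>…</think>`
/// reasoning from the visible reply. An unclosed `<think>` is treated as
/// reasoning that is still streaming in.
func splitThinking(_ text: String) -> (thinking: String, answer: String, isThinking: Bool) {
    // No opening tag: the whole text is the visible answer.
    guard let open = text.range(of: "<think>") else {
        return ("", text, false)
    }
    let before = String(text[..<open.lowerBound])
    let rest = text[open.upperBound...]

    // Closing tag not received yet: everything after <think> is reasoning.
    guard let close = rest.range(of: "</think>") else {
        return (String(rest).trimmingCharacters(in: .whitespacesAndNewlines),
                before.trimmingCharacters(in: .whitespacesAndNewlines),
                true)
    }

    // Both tags present: reasoning sits between them, the answer around them.
    let thinking = String(rest[..<close.lowerBound])
    let answer = before + String(rest[close.upperBound...])
    return (thinking.trimmingCharacters(in: .whitespacesAndNewlines),
            answer.trimmingCharacters(in: .whitespacesAndNewlines),
            false)
}
```

Calling this on the accumulated text after every streamed chunk lets a view show the collapsible thinking box while `isThinking` is true and switch to the answer once the closing tag arrives.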
## Design Notes
- Uses `mlx_vlm` (not `mlx_lm`) as the backend — supports both text and vision in a single model load
- Offline-first: if the model is cached locally (`~/.cache/huggingface/hub/`), no network requests are made
- Thread lock on generation — MLX models aren't safe for concurrent generation
- KV prefix caching for multi-turn conversations
- Context window read from each model's config (Gemma 3 4B: 128k, Qwen3-VL 4B: 256k) with automatic summarization fallback
@@ -22,7 +22,7 @@ targets:
- MLXServer
settings:
base:
PRODUCT_BUNDLE_IDENTIFIER: com.mlxserver.app
PRODUCT_BUNDLE_IDENTIFIER: de.rfc1437.mlxserver
PRODUCT_NAME: MLX Server
MARKETING_VERSION: "1.0.0"
CURRENT_PROJECT_VERSION: "1"