feat: added gemma 3n E4B as another model for fast response

2026-03-17 13:24:43 +01:00
parent cc4f937d9a
commit c80fe97f41
4 changed files with 23 additions and 4 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -29,8 +29,9 @@ python -m mlx_server.main --model mlx-community/Qwen3-VL-4B-Instruct-4bit --port

 | Alias | HuggingFace ID | Notes |
 |-------|---------------|-------|
-| `gemma` | `mlx-community/gemma-3-4b-it-4bit` | Vision + tool use via `tool_code` blocks |
-| `qwen` | `mlx-community/Qwen3-VL-4B-Instruct-4bit` | Vision + tool use via `<tool_call>` tags |
+| `gemma` | `mlx-community/gemma-3-4b-it-4bit` | Vision + tool use via `tool_code` blocks (128k context) |
+| `gemma3n` | `mlx-community/gemma-3n-E4B-it-4bit` | Vision/audio/video + tool use via `tool_code` blocks (32k context, ~1.5x faster) |
+| `qwen` | `mlx-community/Qwen3-VL-4B-Instruct-4bit` | Vision + tool use via `<tool_call>` tags (256k context) |

 ## Key Design Decisions