Models
Deploy a Model
Choose a preset model or enter any HuggingFace model path to deploy an inference server.
Choose Model
| Model Family | Version | Specialized Task | GPU | VRAM | Tags |
|---|---|---|---|---|---|
| Qwen3-Coder | 30B MoE | Code Generation | H200:1 | 46 GB | Tool calling, fp8 |
| Qwen3-Coder | 30B MoE FP8 | Code Generation | H200:1 | 48 GB | Tool calling, fp8 |
| Qwen3 | 4B Instruct 2507 | Text Generation | RTX5090:1 | 24 GB | Tool calling, fp8 |
| Qwen3 | 0.6B | Text Generation | RTX5090:1 | 16 GB | Tool calling |
| Qwen3 | 1.7B | Text Generation | RTX5090:1 | 16 GB | Tool calling |
| Qwen3 | 4B | Text Generation | RTX5090:1 | 16 GB | Tool calling |
| Qwen3 | 8B | Text Generation | RTX5090:1 | 24 GB | Tool calling |
| Qwen3 | 3.5B | Text Generation | RTX5090:1 | 24 GB | Tool calling |
23 models · page 1 of 3