Active filters: 8-bit
MaziyarPanahi/Phi-3.5-mini-instruct-GGUF
Text Generation
• 4B • Updated • 549k
• 27
HF1BitLLM/Llama3-8B-1.58-Linear-10B-tokens
Text Generation
• 3B • Updated • 29
• 11
HF1BitLLM/Llama3-8B-1.58-Sigmoid-k100-10B-tokens
Text Generation
• 3B • Updated • 18
• 10
HF1BitLLM/Llama3-8B-1.58-100B-tokens
Text Generation
• 3B • Updated • 2.59k
• 208
MaziyarPanahi/L3-8B-Sunfall-v0.5-Stheno-v3.2-GGUF
Text Generation
• 8B • Updated • 322
• 5
Text Generation
• 15B • Updated • 113k
• 8
roleplaiapp/Mistral-MOE-4X7B-Dark-MultiVerse-Uncensored-Enhanced32-24B-gguf-Q8_0-GGUF
Text Generation
• 24B • Updated • 56
• 1
Text Generation
• 397B • Updated • 10.5k
• 273
nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4
56B • Updated • 61.3k
• 28
MaziyarPanahi/Qwen3-0.6B-GGUF
Text Generation
• 0.8B • Updated • 174k
• 11
MaziyarPanahi/Qwen3-14B-GGUF
Text Generation
• 15B • Updated • 172k
• 9
nvidia/DeepSeek-V3-0324-NVFP4
Text Generation
• 397B • Updated • 61k
• 15
miike-ai/DeepSeek-R1-Distill-Llama-70B-FP4
41B • Updated • 149
• 1
lmstudio-community/DeepSeek-R1-0528-Qwen3-8B-MLX-8bit
Text Generation
• 2B • Updated • 334k
• 14
nvidia/Qwen3-30B-A3B-NVFP4
Text Generation
• 16B • Updated • 149k
• 26
mlx-community/Qwen3-Coder-30B-A3B-Instruct-8bit
Text Generation
• Updated • 952
• 4
NVFP4/Qwen3-30B-A3B-Instruct-2507-FP4
Text Generation
• 16B • Updated • 75.4k
• 12
Text Generation
• 0.4B • Updated • 284
• 1
3B • Updated • 180
• 1
nvidia/Phi-4-reasoning-plus-NVFP4
8B • Updated • 1.08k
• 7
nightmedia/Qwen3-Next-80B-A3B-Instruct-qx86-hi-mlx
Text Generation
• 80B • Updated • 18
• 2
txn545/Qwen3-Coder-30B-A3B-Instruct-NVFP4
16B • Updated • 134
• 1
mlx-community/DeepSeek-OCR-8bit
Image-Text-to-Text
• 1B • Updated • 1.78k
• 35
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-NVFP4
Text Generation
• 26B • Updated • 7.66k
• 16
Text Generation
• 22B • Updated • 20
• 1
nvidia/DeepSeek-V3.1-NVFP4
Text Generation
• 394B • Updated • 9.51k
• 14
DataSnake/Wayfarer-2-12B-NVFP4
Text Generation
• 7B • Updated • 3
• 2
Firworks/GLM-4.5-Air-nvfp4
61B • Updated • 205
• 4
introvoyz041/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Lite-Preview-Distill-qx86-hi-mlx-mlx-4Bit
Text Generation
• 1B • Updated • 32
• 1
Firworks/aquif-3.5-Max-42B-A3B-nvfp4
24B • Updated • 2
• 1