{
  "bos_token_id": 248044,
  "do_sample": true,
  "eos_token_id": [
    248046,
    248044
  ],
  "pad_token_id": 248044,
  "temperature": 1.0,
  "top_k": 20,
  "top_p": 0.95,
  "transformers_version": "4.57.0.dev0"
}
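
To illustrate how the sampling fields above interact, here is a minimal standard-library sketch of one decoding step. It is a simplified illustration, not the Transformers implementation: the helper names `filter_distribution`, `sample_next_token`, and `is_stop_token` are hypothetical, the toy logits in the usage note are made up, and applying top-k before top-p is an assumption about ordering. Only the numeric values are taken from the config.

```python
import json
import math
import random

# Sampling-related values copied from the config above.
GENERATION_CONFIG = json.loads("""
{
  "do_sample": true,
  "eos_token_id": [248046, 248044],
  "temperature": 1.0,
  "top_k": 20,
  "top_p": 0.95
}
""")


def filter_distribution(logits, temperature, top_k, top_p):
    """Return {token_id: probability} after temperature, top-k, and top-p filtering."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]

    # Top-k: keep the k highest-weight tokens (all of them if the vocab is smaller).
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:top_k]
    top_k_mass = sum(weights[i] for i in ranked)

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token_id in ranked:
        kept.append(token_id)
        cumulative += weights[token_id] / top_k_mass
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens so the result sums to 1.
    kept_mass = sum(weights[i] for i in kept)
    return {i: weights[i] / kept_mass for i in kept}


def sample_next_token(logits, config, rng):
    """Sample one token id, or take the argmax when do_sample is false."""
    if not config["do_sample"]:
        return max(range(len(logits)), key=lambda i: logits[i])
    dist = filter_distribution(
        logits, config["temperature"], config["top_k"], config["top_p"]
    )
    token_ids = list(dist)
    return rng.choices(token_ids, weights=[dist[i] for i in token_ids], k=1)[0]


def is_stop_token(token_id, config):
    """Generation stops when any of the listed eos_token_id values is produced."""
    return token_id in config["eos_token_id"]
```

As a usage example, `sample_next_token([2.0, 1.0, 0.0, -1.0, -5.0], GENERATION_CONFIG, random.Random(0))` draws from a five-token toy vocabulary. With `temperature` at 1.0 the logits are left unscaled, `top_k` of 20 has no effect on a vocabulary this small, and `top_p` of 0.95 drops the low-probability tail before sampling.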