How to use Fate-Zero/Archer-Code-1.5B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Fate-Zero/Archer-Code-1.5B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Fate-Zero/Archer-Code-1.5B")
model = AutoModelForCausalLM.from_pretrained("Fate-Zero/Archer-Code-1.5B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
How to use Fate-Zero/Archer-Code-1.5B with vLLM:
Install from pip and serve model

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Fate-Zero/Archer-Code-1.5B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Fate-Zero/Archer-Code-1.5B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker

```shell
docker model run hf.co/Fate-Zero/Archer-Code-1.5B
```
How to use Fate-Zero/Archer-Code-1.5B with SGLang:
Install from pip and serve model

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Fate-Zero/Archer-Code-1.5B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Fate-Zero/Archer-Code-1.5B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```

Use Docker images

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Fate-Zero/Archer-Code-1.5B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Fate-Zero/Archer-Code-1.5B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
How to use Fate-Zero/Archer-Code-1.5B with Docker Model Runner:
```shell
docker model run hf.co/Fate-Zero/Archer-Code-1.5B
```
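The vLLM and SGLang servers described above expose an OpenAI-compatible `/v1/chat/completions` endpoint, so the same call made with curl can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_chat_request` and `send_chat` are hypothetical, and the default port (8000, as in the vLLM example; the SGLang example uses 30000) is an assumption you may need to change.

```python
import json
import urllib.request

# Assumed defaults taken from the vLLM example; use port 30000 for SGLang.
BASE_URL = "http://localhost:8000"
MODEL = "Fate-Zero/Archer-Code-1.5B"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) a POST request for a single-turn chat."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(prompt, base_url=BASE_URL, model=MODEL, timeout=60):
    """Send the request and return the assistant's reply text.

    Requires a running server; assumes the standard OpenAI-style
    response layout (choices[0].message.content).
    """
    request = build_chat_request(prompt, base_url, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `send_chat("What is the capital of France?")` would return the model's reply as a string. For the SGLang server, pass `base_url="http://localhost:30000"`.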
Overview
The Archer series studies RL algorithms and training for small and medium-scale models, aiming to deepen the community's understanding of the fundamental principles of reinforcement learning (RL) on large language models (LLMs). All released content will be fully open-sourced to support community research.
Archer significantly improves reasoning performance over DAPO and outperforms previous state-of-the-art (SOTA) reasoning models at the 1.5B scale.
Archer is an open-source initiative for enhancing reasoning in large language models through scalable, rule-governed reinforcement learning. We provide full-stack reproducibility, including:
- Training code and pipelines
- Curated datasets
- Trained models
- Complete training logs
Current Models:
- Archer-Code-1.5B - SOTA among similarly-sized models.
Evaluation
We evaluate on both mathematical and coding benchmarks. Because the outputs of reasoning models have high variance, we report avg@K (pass@1 performance averaged over K outputs) and pass@K for each benchmark. The Avg. column is the mean of the avg@K scores across the benchmarks in that table. The detailed results are shown in the tables below.
| Method | AIME24 avg@64 | AIME24 pass@64 | AIME25 avg@64 | AIME25 pass@64 | AMC23 avg@64 | AMC23 pass@64 | MATH-500 avg@4 | MATH-500 pass@4 | Minerva avg@8 | Minerva pass@8 | Olympiad avg@4 | Olympiad pass@4 | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepSeek-R1-1.5B | 30.6 | 80.0 | 23.5 | 63.3 | 70.7 | 100.0 | 83.6 | 92.4 | 27.6 | 48.2 | 44.6 | 59.4 | 46.8 |
| DAPO | 42.1 | 80.0 | 28.6 | 56.7 | 80.3 | 97.5 | 87.6 | 94.6 | 29.2 | 46.3 | 53.2 | 65.8 | 53.5 |
| DeepScaleR-1.5B | 42.0 | 83.3 | 29.0 | 63.3 | 81.3 | 100.0 | 87.7 | 93.6 | 30.3 | 51.1 | 50.7 | 61.0 | 53.5 |
| FastCuRL-1.5B-V3 | 48.1 | 80.0 | 32.7 | 60.0 | 86.4 | 95.0 | 89.8 | 94.0 | 33.6 | 50.0 | 55.3 | 64.3 | 57.7 |
| Nemotron-1.5B | 48.0 | 76.7 | 33.1 | 60.0 | 86.1 | 97.5 | 90.6 | 93.6 | 35.3 | 47.8 | 59.2 | 66.8 | 58.7 |
| Archer-Math-1.5B | 48.7 | 83.3 | 33.8 | 70.0 | 86.0 | 97.5 | 90.8 | 94.4 | 35.7 | 51.1 | 59.3 | 67.1 | 59.1 |

| Method | LCB v5 (2024.08.01–2025.02.01) avg@8 | LCB v5 (2024.08.01–2025.02.01) pass@8 | LCB v6 (2025.02.01–2025.05.01) avg@16 | LCB v6 (2025.02.01–2025.05.01) pass@16 | Avg. |
|---|---|---|---|---|---|
| DeepSeek-R1-1.5B | 16.7 | 29.0 | 17.2 | 34.4 | 17.0 |
| DAPO | 26.0 | 40.5 | 27.6 | 43.5 | 26.8 |
| DeepCoder-1.5B | 23.3 | 39.1 | 22.6 | 42.0 | 23.0 |
| Nemotron-1.5B | 26.1 | 35.5 | 29.5 | 42.8 | 27.8 |
| Archer-Code-1.5B | 29.4 | 43.7 | 30.2 | 45.8 | 29.8 |
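
To make the reported metrics concrete, here is a minimal sketch of how avg@K and pass@K could be computed from per-sample correctness flags. It is not the authors' evaluation code. avg@K follows the definition given above (pass@1 averaged over K outputs); pass@K is assumed here to mean the share of problems solved by at least one of the K samples, which the text above does not spell out. The function names and the toy data are hypothetical.

```python
from statistics import mean


def avg_at_k(results):
    """Mean per-sample accuracy, in percent.

    `results` holds one list per problem, each with K booleans
    (True when that sampled output was judged correct).
    """
    per_problem = [mean(1.0 if ok else 0.0 for ok in samples) for samples in results]
    return 100.0 * mean(per_problem)


def pass_at_k(results):
    """Share of problems with at least one correct sample, in percent.

    Assumption: pass@K is computed directly over the K drawn samples.
    """
    return 100.0 * mean(1.0 if any(samples) else 0.0 for samples in results)


# Toy data: 3 problems, K = 4 samples each (illustrative only).
toy_results = [
    [True, False, False, False],
    [False, False, False, False],
    [True, True, True, True],
]
toy_avg = avg_at_k(toy_results)
toy_pass = pass_at_k(toy_results)

# The "Avg." column can be reproduced from the avg@K columns of a row,
# here the Archer-Code-1.5B row of the coding table.
archer_code_avg = round(mean([29.4, 30.2]), 1)
```

The gap between the two metrics on the toy data illustrates why both are reported: a problem solved only once in K attempts counts fully toward pass@K but contributes only 1/K to avg@K.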
Technical Report
Stabilizing Knowledge, Promoting Reasoning: Dual-Token Constraints for RLVR
Acknowledgements
- We build our model upon DeepSeek-R1-Distill-Qwen-1.5B.
- Training was carried out with a modified version of verl.
Citation
Please cite the following:
```bibtex
@article{wang2025stabilizing,
  title={Stabilizing Knowledge, Promoting Reasoning: Dual-Token Constraints for RLVR},
  author={Wang, Jiakang and Liu, Runze and Zhang, Fuzheng and Li, Xiu and Zhou, Guorui},
  journal={arXiv preprint arXiv:2507.15778},
  year={2025}
}
```