Instructions to use MiniMaxAI/MiniMax-M2.1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use MiniMaxAI/MiniMax-M2.1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="MiniMaxAI/MiniMax-M2.1", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MiniMaxAI/MiniMax-M2.1", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("MiniMaxAI/MiniMax-M2.1", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use MiniMaxAI/MiniMax-M2.1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "MiniMaxAI/MiniMax-M2.1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MiniMaxAI/MiniMax-M2.1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
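
The same request can be built from Python instead of curl. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000. The helper names `build_chat_request` and `send_chat_request` are hypothetical and are not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="MiniMaxAI/MiniMax-M2.1",
                       base_url="http://localhost:8000"):
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=120):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network, so it is safe to inspect.
request = build_chat_request("What is the capital of France?")
print(request.get_method(), request.full_url)
```

Once the server is up, `send_chat_request(request)` should return the parsed JSON response. Keeping the build step separate from the send step makes the payload easy to inspect before anything goes over the network.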

Use Docker

docker model run hf.co/MiniMaxAI/MiniMax-M2.1

SGLang

How to use MiniMaxAI/MiniMax-M2.1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "MiniMaxAI/MiniMax-M2.1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MiniMaxAI/MiniMax-M2.1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
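
The curl call returns a JSON document. As a rough sketch, and assuming the server follows the usual OpenAI chat-completions response layout (`choices[0].message.content`), the assistant text can be pulled out as shown below. The helper `extract_reply` is hypothetical, and the sample body is made up for illustration rather than captured from the model.

```python
import json


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completion body."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Illustrative response body (invented for this sketch, not real model output).
sample = json.dumps({
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ]
})
print(extract_reply(sample))  # prints "Paris."
```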

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "MiniMaxAI/MiniMax-M2.1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MiniMaxAI/MiniMax-M2.1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
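
A container serving a large model may not accept requests immediately after `docker run` returns. The sketch below polls the server before sending the first request. It assumes the server answers the OpenAI-style `GET /v1/models` route on port 30000 once it is ready. The function names, retry count, and delay are hypothetical choices for illustration.

```python
import time
import urllib.error
import urllib.request


def is_ready(base_url="http://localhost:30000", timeout=5):
    """Return True if GET {base_url}/v1/models answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False


def wait_until_ready(check=is_ready, attempts=60, delay=10.0, sleep=time.sleep):
    """Poll `check` until it returns True or `attempts` run out."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no sleep after the final failed attempt
    return False
```

Calling `wait_until_ready()` before the first chat request avoids connection errors while the weights are still loading. The `check` and `sleep` parameters are injectable so the loop can be exercised without a live server.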

Docker Model Runner
How to use MiniMaxAI/MiniMax-M2.1 with Docker Model Runner:
```
docker model run hf.co/MiniMaxAI/MiniMax-M2.1
```

MiniMax-M2.1

Commit History

Add SWE-bench Verified evaluation result (74.0%)

742e8cc
verified

SaylorTwift HF Staff committed on Mar 17

Update README.md (#28)

cd97f59

UnicornChan committed on Feb 13

Clarify supported context length in docs (#21)

69af314
verified

windniw committed on Jan 28

add mlx (#8)

927ea2b
verified

rogeryoungh committed on Jan 23

update bench.png (#7)

17f852d
verified

MiniMax-AI committed on Dec 27, 2025

update

2713e50

xuebi committed on Dec 26, 2025

update

1ccbe32

xuebi committed on Dec 26, 2025

Update README.md

8d70a8a
verified

rogeryoungh committed on Dec 23, 2025

Update README.md

2bdeb8c
verified

rogeryoungh committed on Dec 23, 2025

Update chat_template.jinja

c6b717a
verified

rogeryoungh committed on Dec 23, 2025

Upload bench.png

20f0649
verified

rogeryoungh committed on Dec 23, 2025

update: README

1aeff0e

xuebi committed on Dec 23, 2025

update: use nightly sglang

0469488

xuebi committed on Dec 22, 2025

update: add toolcall docs

5b50d0e

xuebi committed on Dec 22, 2025

udpate: guide use nightly vllm

a423be8

xuebi committed on Dec 22, 2025

Create generation_config.json

2e162cb
verified

rogeryoungh committed on Dec 22, 2025

update: update model_identity in chat_template.jinja

7aef555
verified

rogeryoungh committed on Dec 22, 2025

update: support transformers

e551b49

xuebi committed on Dec 21, 2025

update: add config.json

500a52e

xuebi committed on Dec 21, 2025