Instructions to use stabilityai/stablecode-instruct-alpha-3b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use stabilityai/stablecode-instruct-alpha-3b with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="stabilityai/stablecode-instruct-alpha-3b")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablecode-instruct-alpha-3b")
model = AutoModelForCausalLM.from_pretrained("stabilityai/stablecode-instruct-alpha-3b")
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use stabilityai/stablecode-instruct-alpha-3b with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "stabilityai/stablecode-instruct-alpha-3b"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stabilityai/stablecode-instruct-alpha-3b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker
```shell
docker model run hf.co/stabilityai/stablecode-instruct-alpha-3b
```
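The curl call above sends a JSON body to the OpenAI-compatible `/v1/completions` endpoint. As a sketch (the helper name here is mine, not part of vLLM), the same payload can be built in Python and posted with any HTTP client:

```python
import json

# Hypothetical helper mirroring the JSON body in the curl example above.
def completion_payload(model, prompt, max_tokens=512, temperature=0.5):
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

payload = completion_payload(
    "stabilityai/stablecode-instruct-alpha-3b", "Once upon a time,"
)
body = json.dumps(payload)
# POST `body` to http://localhost:8000/v1/completions with
# Content-Type: application/json (requires a running vLLM server).
```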
- SGLang
How to use stabilityai/stablecode-instruct-alpha-3b with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "stabilityai/stablecode-instruct-alpha-3b" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stabilityai/stablecode-instruct-alpha-3b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images
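Both vLLM and SGLang return the OpenAI completion response shape from these endpoints. A sketch of pulling the generated text out of it (the response values below are made up for illustration):

```python
import json

# Made-up example of the JSON an OpenAI-compatible /v1/completions
# endpoint returns; only the fields used below are shown.
raw = json.dumps({
    "model": "stabilityai/stablecode-instruct-alpha-3b",
    "choices": [
        {"index": 0, "text": " there was a code model.", "finish_reason": "length"}
    ],
})

resp = json.loads(raw)
text = resp["choices"][0]["text"]  # the generated continuation
```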
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "stabilityai/stablecode-instruct-alpha-3b" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stabilityai/stablecode-instruct-alpha-3b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use stabilityai/stablecode-instruct-alpha-3b with Docker Model Runner:
```shell
docker model run hf.co/stabilityai/stablecode-instruct-alpha-3b
```
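The discussion below drives the model with StableCode's `###Instruction` / `###Response` prompt format. A small helper (the function name is hypothetical) that builds that string:

```python
# Hypothetical helper for the instruct prompt format used in the
# snippets below: "###Instruction\n<text>###Response\n".
def build_prompt(instruction: str) -> str:
    return f"###Instruction\n{instruction}###Response\n"

prompt = build_prompt("Generate a python function to find number of CPU cores")
```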
ValueError: `model_kwargs` are not used by the model
When running the sample snippet provided on the model page, it throws this error (after downloading the tokenizer, config, safetensors, etc.):

ValueError: The following `model_kwargs` are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)
This is from running the snippet copied directly from the documentation with no alterations. Python version 3.10.12, PyTorch version 2.1.0.dev20230705+cu121, running with CUDA on a 10 GB RTX 3080.
Full traceback:
```
ValueError Traceback (most recent call last)
Cell In[1], line 10
8 model.cuda()
9 inputs = tokenizer("###Instruction\nGenerate a python function to find number of CPU cores###Response\n", return_tensors="pt").to("cuda")
---> 10 tokens = model.generate(
11 **inputs,
12 max_new_tokens=48,
13 temperature=0.2,
14 do_sample=True,
15 )
16 print(tokenizer.decode(tokens[0], skip_special_tokens=True))
File ~/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)
File ~/.local/lib/python3.10/site-packages/transformers/generation/utils.py:1282, in GenerationMixin.generate(self, inputs, generation_config, logits_processor, stopping_criteria, prefix_allowed_tokens_fn, synced_gpus, assistant_model, streamer, **kwargs)
1280 model_kwargs = generation_config.update(**kwargs) # All unused kwargs must be model kwargs
1281 generation_config.validate()
-> 1282 self._validate_model_kwargs(model_kwargs.copy())
1284 # 2. Set generation parameters if not already defined
1285 logits_processor = logits_processor if logits_processor is not None else LogitsProcessorList()
File ~/.local/lib/python3.10/site-packages/transformers/generation/utils.py:1155, in GenerationMixin._validate_model_kwargs(self, model_kwargs)
1152 unused_model_args.append(key)
1154 if unused_model_args:
-> 1155 raise ValueError(
1156 f"The following model_kwargs are not used by the model: {unused_model_args} (note: typos in the"
1157 " generate arguments will also show up in this list)"
1158 )
ValueError: The following model_kwargs are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)
```
I have faced the same issue, using Colab.
Tell me, can you share existing code with it and ask it to debug? How? In the Colab there is weird behavior when changing the prompt and adding my own code. In oobabooga, the same thing. Can it only write code?
```python
inputs = tokenizer("###Instruction\nGenerate a python function to find number of CPU cores###Response\n", return_tensors="pt").to("cuda")
# Removing 'token_type_ids' from the inputs dictionary resolved the error
if "token_type_ids" in inputs:
    del inputs["token_type_ids"]
tokens = model.generate(**inputs, max_new_tokens=48, temperature=0.2, do_sample=True)
```
You can also use the fix provided here: https://huggingface.co/stabilityai/stablecode-instruct-alpha-3b/discussions/2#64d30f314eb2ea6d5d8e118a
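The same idea, sketched on a plain dict so it runs without the model (a real tokenizer returns a `BatchEncoding`, which also behaves like a dict): keep only the keys that `generate()` can forward to the model.

```python
# Stand-in for tokenizer(...) output; the real output holds tensors.
inputs = {
    "input_ids": [[7, 8, 9]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],  # the key this model rejects
}

# Keep only the keys the model's forward() accepts.
allowed = {"input_ids", "attention_mask"}
filtered = {k: v for k, v in inputs.items() if k in allowed}
```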
Same error on Colab. It speaks volumes about ease of use and user-friendliness when their proverbial "Hello world" gives errors as output. Such a difficult model or program causes user frustration and is bound to fail. I guess they are headed the Android Studio way!

