Microcoder 1.5B

Microcoder 1.5B is a code-focused language model fine-tuned from Qwen 2.5 Coder 1.5B Instruct using LoRA (Low-Rank Adaptation) on curated code datasets. It is designed for code generation, completion, and instruction-following tasks in a lightweight, efficient package.


Model Details

| Property | Value |
| --- | --- |
| Base Model | Qwen 2.5 Coder 1.5B Instruct |
| Fine-tuning | LoRA |
| Parameters | ~1.5B |
| License | BSD 3-Clause |
| Language | English (primary), multilingual code |
| Task | Code generation, completion, instruction following |

Benchmarks

| Benchmark | Metric | Score |
| --- | --- | --- |
| HumanEval | pass@1 | 59.15% |
| MBPP+ | pass@1 | 52.91% |

HumanEval and MBPP+ results were obtained using the model in GGUF format with Q5_K_M quantization. Results may vary slightly with other formats or quantization levels.
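For context, pass@1 is the fraction of problems solved by a single generated sample. The standard unbiased pass@k estimator (the formula commonly used by code-generation evaluation harnesses; this card does not specify which harness was used) can be sketched as:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate given n samples per problem, c of which pass."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(1, 1, 1))   # 1.0: the single sample passed
print(pass_at_k(10, 3, 1))  # ~0.3: 3 of 10 samples correct
```

With one sample per problem (n = k = 1), pass@1 reduces to the plain fraction of problems whose sample passes all unit tests, which is then averaged over the benchmark.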


Usage

**Important:** You must use `apply_chat_template` when formatting inputs. Passing raw text directly to the tokenizer will produce incorrect results.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "pedrodev2026/microcoder-1.5b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {
        "role": "user",
        "content": "Write a Python function that returns the nth Fibonacci number."
    }
]

# Wrap the conversation in the model's chat template before tokenizing;
# add_generation_prompt appends the assistant header so the model replies.
input_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
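To see why raw text fails: instruct-tuned Qwen models are trained on a ChatML-style turn format, and `apply_chat_template` wraps each message in special tokens. The sketch below is illustrative only (the authoritative template ships with the tokenizer, and the real one may also inject a default system message):

```python
def chatml_format(messages, add_generation_prompt=True):
    # Each turn is delimited by <|im_start|>/<|im_end|> markers; the trailing
    # assistant header cues the model to begin generating its reply.
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        prompt += "<|im_start|>assistant\n"
    return prompt

print(chatml_format([{"role": "user", "content": "Hi"}]))
```

A prompt without these markers falls outside the distribution the model was fine-tuned on, which is why unformatted input degrades output quality.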

Training Details

Microcoder 1.5B was fine-tuned using LoRA on top of Qwen 2.5 Coder 1.5B Instruct. The training focused on code-heavy datasets covering multiple programming languages and problem-solving scenarios, aiming to improve instruction-following and code correctness at a small model scale.
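The general shape of a LoRA update can be sketched as follows. The dimensions, rank, and scaling factor here are arbitrary illustrations; the card does not state the actual LoRA configuration (rank, alpha, or target modules) used for Microcoder:

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2   # hidden size and LoRA rank (r << d)
alpha = 4     # LoRA scaling hyperparameter

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-initialized

# Only A and B are trained: 2*d*r parameters instead of d*d.
W_eff = W + (alpha / r) * B @ A

# Because B starts at zero, the adapted model is initially
# identical to the base model.
assert np.allclose(W_eff, W)
```

After training, the low-rank update can be merged back into `W`, so inference costs the same as the base model.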


Credits

See MODEL_CREDITS.md and DATASET_CREDITS.md for attribution of the base model and the fine-tuning datasets.


License

The Microcoder 1.5B model weights and associated code in this repository are released under the BSD 3-Clause License. See LICENSE for details.

Note that the base model (Qwen 2.5 Coder 1.5B Instruct) and the datasets used for fine-tuning are subject to their own respective licenses, as detailed in the credit files above.


Notice

The documentation files in this repository (including README.md, MODEL_CREDITS.md, DATASET_CREDITS.md, and other .md files) were generated with the assistance of an AI language model.
