Text Generation
Transformers
PyTorch
llama
alpaca
cot
vicuna
uncensored
Merge
mix
text-generation-inference
Instructions to use CalderaAI/13B-BlueMethod with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use CalderaAI/13B-BlueMethod with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="CalderaAI/13B-BlueMethod")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("CalderaAI/13B-BlueMethod")
model = AutoModelForCausalLM.from_pretrained("CalderaAI/13B-BlueMethod")
```
- Local Apps
- vLLM
How to use CalderaAI/13B-BlueMethod with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "CalderaAI/13B-BlueMethod"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "CalderaAI/13B-BlueMethod",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker

```bash
docker model run hf.co/CalderaAI/13B-BlueMethod
```
- SGLang
How to use CalderaAI/13B-BlueMethod with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "CalderaAI/13B-BlueMethod" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "CalderaAI/13B-BlueMethod",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images

```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "CalderaAI/13B-BlueMethod" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "CalderaAI/13B-BlueMethod",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner

How to use CalderaAI/13B-BlueMethod with Docker Model Runner:

```bash
docker model run hf.co/CalderaAI/13B-BlueMethod
```
tags:
- llama
- alpaca
- cot
- vicuna
- uncensored
- merge
- mix
## 13B-BlueMethod

## Composition:
BlueMethod is a somewhat convoluted experiment in tiered merging.
To push the experimental nature of the merge further, the models were
combined with a custom script that randomized the percentage of each
layer merged from one model to the next. This is a warm-up for a larger
project.
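The custom merge script is not released, so the following is only a minimal sketch of what randomizing the merge percentage per layer could look like. It assumes both checkpoints share identical parameter names, that layers can be identified by a `layers.<n>.` segment in the name (as in Hugging Face LLaMA checkpoints), and that each layer is blended linearly. The function names, the uniform sampling range, and the toy list-based weights are all hypothetical; real checkpoints would hold tensors rather than Python lists.

```python
import random
import re


def layer_key(param_name):
    """Return the layer index found in a parameter name, or -1 if there is none."""
    match = re.search(r"layers\.(\d+)\.", param_name)
    return int(match.group(1)) if match else -1


def random_layer_merge(weights_a, weights_b, seed=0, low=0.0, high=1.0):
    """Blend two checkpoints, drawing one random ratio per layer (hypothetical scheme).

    A ratio r means the merged value is (1 - r) * a + r * b for that layer.
    Parameters outside any numbered layer share a single ratio stored under key -1.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("checkpoints must have identical parameter names")

    rng = random.Random(seed)
    ratios = {}
    merged = {}
    for name in weights_a:
        key = layer_key(name)
        if key not in ratios:
            ratios[key] = rng.uniform(low, high)
        r = ratios[key]
        merged[name] = [
            (1 - r) * a + r * b
            for a, b in zip(weights_a[name], weights_b[name])
        ]
    return merged, ratios


# Toy example: model A is all zeros, model B is all ones,
# so each merged value equals the ratio drawn for its layer.
toy_a = {"model.layers.0.q.weight": [0.0, 0.0], "model.layers.1.q.weight": [0.0]}
toy_b = {"model.layers.0.q.weight": [1.0, 1.0], "model.layers.1.q.weight": [1.0]}
toy_merged, toy_ratios = random_layer_merge(toy_a, toy_b, seed=42)
print(toy_ratios)
```

Drawing one ratio per layer, rather than per parameter, keeps the attention and feed-forward weights inside a layer blended consistently; whether the original script did the same is not stated in this card.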
[Tier One and Two Merges not released; internal naming convention]

Tier One Merges:

- 13B-Metharme + 13B-Nous-Hermes = 13B-Methermes
- 13B-Vicuna-cocktail + 13B-Manticore = 13B-Vicortia
- 13B-HyperMantis + 13B-Alpacino = 13B-PsychoMantis

Tier Two Merges:

- 13B-Methermes + 13B-Vicortia = 13B-Methphistopheles
- 13B-PsychoMantis + 13B-BlueMoonRP = 13B-BlueMantis

Tier Three Merge:

- 13B-Methphistopheles + 13B-BlueMantis = 13B-BlueMethod
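The tiers listed above form a small binary tree. As a sketch, the tree can be written down as a dictionary and walked to recover the leaf models and a valid build order. The model names come from the list above; the `MERGES` dictionary and both helper functions are illustrative names, not part of any released tooling.

```python
# Merge tree as listed in this card: merged model -> (first parent, second parent).
MERGES = {
    "13B-Methermes": ("13B-Metharme", "13B-Nous-Hermes"),
    "13B-Vicortia": ("13B-Vicuna-cocktail", "13B-Manticore"),
    "13B-PsychoMantis": ("13B-HyperMantis", "13B-Alpacino"),
    "13B-Methphistopheles": ("13B-Methermes", "13B-Vicortia"),
    "13B-BlueMantis": ("13B-PsychoMantis", "13B-BlueMoonRP"),
    "13B-BlueMethod": ("13B-Methphistopheles", "13B-BlueMantis"),
}


def leaf_models(name):
    """Return the models feeding into `name` that are not themselves merges here."""
    if name not in MERGES:
        return [name]
    first, second = MERGES[name]
    return leaf_models(first) + leaf_models(second)


def merge_order(name):
    """Return the merges needed to build `name`, with parents before children."""
    if name not in MERGES:
        return []
    first, second = MERGES[name]
    return merge_order(first) + merge_order(second) + [name]


print(leaf_models("13B-BlueMethod"))
print(merge_order("13B-BlueMethod"))
```

Walking the tree shows seven leaf models and six merge steps in total.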
## Use:
Multiple instruct models and model composites were combined to produce the final model.
It is highly open to experimental prompting; both Alpaca and Vicuna instruct formats can be used,
and either can produce interesting results.
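This card does not spell out exact prompt templates, so the sketch below uses the commonly circulated Alpaca (no-input) and Vicuna v1.1 wordings as an assumption. The constant names and the `build_prompt` helper are hypothetical; adjust the template text if a different variant works better for this merge.

```python
# Assumed template wordings; the model card only names the formats.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

VICUNA_TEMPLATE = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions. "
    "USER: {instruction} ASSISTANT:"
)

TEMPLATES = {"alpaca": ALPACA_TEMPLATE, "vicuna": VICUNA_TEMPLATE}


def build_prompt(instruction, style="alpaca"):
    """Wrap a user instruction in the chosen instruct format."""
    if style not in TEMPLATES:
        raise ValueError(f"unknown prompt style: {style!r}")
    return TEMPLATES[style].format(instruction=instruction)


print(build_prompt("Summarize the plot of Hamlet in two sentences.", style="alpaca"))
print(build_prompt("Summarize the plot of Hamlet in two sentences.", style="vicuna"))
```

The resulting string can be passed as the prompt to a text-generation pipeline or to a completions endpoint.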
## Language Models and LoRAs Used Credits:

13B-Metharme by PygmalionAI
https://www.huggingface.co/PygmalionAI/metharme-13b

13B-Nous-Hermes by NousResearch
https://www.huggingface.co/NousResearch/Nous-Hermes-13b

13B-Vicuna-cocktail by reeducator
https://www.huggingface.co/reeducator/vicuna-13b-cocktail

13B-Manticore by openaccess-ai-collective
https://www.huggingface.co/openaccess-ai-collective/manticore-13b

13B-HyperMantis and 13B-Alpacino by Digitous
https://huggingface.co/digitous/13B-HyperMantis
https://huggingface.co/digitous/Alpacino13b

Also thanks to Meta for LLaMA.

Each model and LoRA was hand-picked and considered for what it could contribute to this ensemble.
Thanks to each and every one of you for your incredible work developing some of the best things
to come out of this community.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_CalderaAI__13B-BlueMethod)

| Metric                | Value |
|-----------------------|-------|
| Avg.                  | 51.76 |
| ARC (25-shot)         | 59.64 |
| HellaSwag (10-shot)   | 82.07 |
| MMLU (5-shot)         | 50.34 |
| TruthfulQA (0-shot)   | 47.74 |
| Winogrande (5-shot)   | 77.11 |
| GSM8K (5-shot)        | 7.81  |
| DROP (3-shot)         | 37.62 |
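The reported average appears to be the plain arithmetic mean of the seven benchmark scores. This quick check uses only the numbers from the table; the variable names are illustrative.

```python
# Benchmark scores copied from the table above.
scores = {
    "ARC (25-shot)": 59.64,
    "HellaSwag (10-shot)": 82.07,
    "MMLU (5-shot)": 50.34,
    "TruthfulQA (0-shot)": 47.74,
    "Winogrande (5-shot)": 77.11,
    "GSM8K (5-shot)": 7.81,
    "DROP (3-shot)": 37.62,
}

# Unweighted mean across all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 51.76
```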