---
license: apache-2.0
base_model:
- sentence-transformers/all-MiniLM-L6-v2
---
**This model is a Neuron-compiled version of https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2.**


It was compiled with version 2.19.1 of the Neuron SDK. If you are using a different SDK version, you may need to run the compilation process again.

See https://huggingface.co/docs/optimum-neuron/en/inference_tutorials/sentence_transformers for more details.

For information on how to run on SageMaker: https://huggingface.co/docs/optimum-neuron/en/inference_tutorials/sentence_transformers

To run:

```
from optimum.neuron import NeuronModelForSentenceTransformers
from transformers import AutoTokenizer

# Precompiled model on the Hugging Face Hub
model_id = "jburtoft/all-MiniLM-L6-v2-neuron"

# If you compiled the model yourself, point to the local output directory instead
# model_id = "all-MiniLM-L6-v2-neuron"

model = NeuronModelForSentenceTransformers.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Run inference
prompt = "I like to eat apples"
encoded_input = tokenizer(prompt, return_tensors="pt")
outputs = model(**encoded_input)

token_embeddings = outputs.token_embeddings
sentence_embedding = outputs.sentence_embedding

print(f"token embeddings: {token_embeddings.shape}")  # torch.Size([1, 7, 384])
print(f"sentence_embedding: {sentence_embedding.shape}")  # torch.Size([1, 384])
```
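
Sentence embeddings are usually compared with cosine similarity. The following is a minimal sketch in plain Python, so it runs without Neuron hardware. `cosine_similarity` is a hypothetical helper written for this example, not part of optimum-neuron, and the vectors are made-up toy values standing in for real 384-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (assumed values, not real model output)
v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 4.0, 6.0]   # same direction as v1, so similarity is close to 1
v3 = [-3.0, 0.0, 1.0]  # orthogonal to v1, so similarity is close to 0

print(cosine_similarity(v1, v2))
print(cosine_similarity(v1, v3))
```

With the model from this card, each embedding can be converted with `sentence_embedding[0].tolist()` before it is passed to the helper. The compile command in this card uses `--batch_size 1`, so it is probably safest to encode one sentence per call.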

To compile:

```
optimum-cli export neuron -m sentence-transformers/all-MiniLM-L6-v2 --sequence_length 512 --batch_size 1 --task feature-extraction all-MiniLM-L6-v2-neuron
```
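
If you recompile regularly, for example after an SDK upgrade, the same command can be assembled and launched from Python. This is only a sketch: `build_export_command` and `recompile` are hypothetical helpers written for this example, and the flags are exactly those of the compile command in this card. Running `recompile` assumes optimum-neuron is installed on a Neuron instance.

```python
import shlex
import subprocess


def build_export_command(model_id, output_dir, sequence_length=512, batch_size=1):
    """Return the optimum-cli export command as an argument list."""
    return [
        "optimum-cli", "export", "neuron",
        "-m", model_id,
        "--sequence_length", str(sequence_length),
        "--batch_size", str(batch_size),
        "--task", "feature-extraction",
        output_dir,
    ]


def recompile(model_id, output_dir, **shapes):
    """Run the export (assumes optimum-cli is available on this machine)."""
    subprocess.run(build_export_command(model_id, output_dir, **shapes), check=True)


# Build the command only; nothing is executed here.
cmd = build_export_command(
    "sentence-transformers/all-MiniLM-L6-v2", "all-MiniLM-L6-v2-neuron"
)
print(shlex.join(cmd))
```

Calling `recompile(...)` with the same arguments should write the compiled model to the output directory, which can then be used as `model_id` in the inference example.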