e5-large-v2: requirements for training on non-English languages?
Friends, congratulations on the amazing work! My name is Wilfredo, and I would like to train this model for a non-English language. What further modifications are needed to achieve that?
Could you also please describe the hardware needed to train this model?
Hi @wilfoderek, thanks for your interest.
The vocabulary of this model is mostly English, so you would need to replace the backbone with a multilingual model (e.g., multilingual-bert or xlm-roberta). You would also need to curate a collection of multilingual datasets for training.
We have also released a multilingual model, which you may want to check out: https://huggingface.co/intfloat/multilingual-e5-base
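For anyone who wants to try the multilingual model before committing to a training run, here is a minimal sketch using sentence-transformers. The helper names `with_prefix` and `similarity_matrix` are made up for this example. The "query: " / "passage: " prefixes follow the input convention given on the E5 model cards as I understand it, so please double-check the card for your use case.

```python
MODEL_NAME = "intfloat/multilingual-e5-base"  # the model linked in the reply above


def with_prefix(texts, kind="query"):
    """Prepend the E5-style prefix ("query: " or "passage: ") to each text.

    Assumption: the prefix convention is taken from the E5 model cards.
    """
    if kind not in ("query", "passage"):
        raise ValueError("kind must be 'query' or 'passage'")
    return [f"{kind}: {text}" for text in texts]


def similarity_matrix(texts, kind="query"):
    """Embed the texts and return their pairwise similarity matrix.

    The import is done lazily so that the prefix helper can be used
    without sentence-transformers installed; the first call downloads
    the model weights.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_NAME)
    embeddings = model.encode(with_prefix(texts, kind), normalize_embeddings=True)
    return model.similarity(embeddings, embeddings)
```

Calling `similarity_matrix(["Esa es una persona feliz", "Hoy es un día soleado"])` should give a 2 x 2 matrix of similarity scores. If the scores look reasonable for your language, using or fine-tuning this model is likely far cheaper than retraining e5-large-v2 from scratch.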
For hardware requirements, as described in our paper, the large-size model requires 64 V100 GPUs for roughly 4 days.
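To put that figure in perspective, here is a rough back-of-the-envelope calculation. The 8-GPU scenario is purely hypothetical and assumes perfect linear scaling, which a real run would not achieve; fewer GPUs also usually means a smaller batch size per step, which may affect the result.

```python
def gpu_hours(num_gpus, days):
    """Total GPU-hours for a run on num_gpus devices lasting the given number of days."""
    return num_gpus * days * 24


# Figures quoted in the reply above: 64 V100 GPUs for roughly 4 days.
total = gpu_hours(num_gpus=64, days=4)
print(f"{total} V100 GPU-hours ({total // 24} GPU-days)")

# Hypothetical 8-GPU machine, assuming (optimistically) perfect linear scaling.
days_on_8 = total / (8 * 24)
print(f"About {days_on_8:.0f} days on 8 GPUs under that assumption")
```

So the quoted run amounts to roughly 6,144 V100 GPU-hours, which is why starting from an already released checkpoint is usually the more practical route.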
Thank you for your quick answer.