multilingual-e5-large

Open source text embedding model, ideal for multilingual applications.
Dimension: 1024 (size of a single vector supported by this model)
Distance Metric: cosine (used to measure similarity between vectors)
Max Seq. Length: 512 (number of tokens the model can process at once)
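
As a rough illustration of the cosine metric listed above, the sketch below computes cosine similarity in plain Python. The helper name `cosine_similarity` and the toy 3-dimensional vectors are assumptions made for this example; real embeddings from this model have 1024 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors for illustration only.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # close to 1
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])      # exactly 0
```

Because cosine similarity depends only on the angle between vectors, scaling a vector does not change its score.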

Overview

A strong multilingual choice when you want high performance from an open-source model. It works well on messy data and suits short queries that are expected to return medium-length passages of text (1-2 paragraphs).

Using the model

Passages/documents must be prefixed with "passage: " and queries with "query: ". See here for an example.
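
The prefix requirement can be sketched with two small helpers. The names `format_query` and `format_passages` and the sample strings are hypothetical, introduced only for this illustration; the prefixes themselves are the ones stated above.

```python
QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "


def format_query(text):
    """Prepend the query prefix to a search query (hypothetical helper)."""
    return QUERY_PREFIX + text.strip()


def format_passages(texts):
    """Prepend the passage prefix to each document (hypothetical helper)."""
    return [PASSAGE_PREFIX + t.strip() for t in texts]


# Sample inputs are made up for illustration.
query = format_query("What is a vector database?")
passages = format_passages([
    "A vector database stores embeddings and supports similarity search.",
    "Une base de données vectorielle stocke des embeddings.",
])
```

Keeping the prefixes in one place makes it harder to embed a query with the passage prefix, or the reverse, by accident.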

Installation:

Creating Embeddings:
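
The original snippet for this step is not present on the page. As a stand-in, here is a minimal sketch that assumes the open-source `sentence-transformers` package is installed and that the model is loaded from the Hugging Face Hub under the id `intfloat/multilingual-e5-large`; neither choice is confirmed by this page. The helper names `load_encoder` and `rank_passages` are hypothetical.

```python
def load_encoder(model_name="intfloat/multilingual-e5-large"):
    """Return a function that maps a list of strings to unit-length vectors.

    Assumption: the sentence-transformers package is installed. It is
    imported lazily so the ranking helper below works without it.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)

    def encode(texts):
        # normalize_embeddings=True makes the dot product equal cosine similarity.
        return model.encode(texts, normalize_embeddings=True)

    return encode


def rank_passages(query, passages, encode):
    """Embed a query and passages with the required prefixes, then rank by score."""
    inputs = ["query: " + query] + ["passage: " + p for p in passages]
    vectors = encode(inputs)
    query_vec, passage_vecs = vectors[0], vectors[1:]
    scored = [
        (float(sum(q * p for q, p in zip(query_vec, vec))), text)
        for vec, text in zip(passage_vecs, passages)
    ]
    # Highest similarity first.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Example usage (downloads the model on first run):
# encode = load_encoder()
# results = rank_passages("capital of France", ["Paris est la capitale.", "Bananas are yellow."], encode)
```

Passing `encode` in as a parameter keeps the ranking logic independent of how the embeddings are produced, so the same function could be used with a hosted embedding endpoint instead.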

Learn more about multilingual-e5-large