
e5-large-V2

A high-performing open-source text embedding model.
Dimension: 1024 (size of a single vector supported by this model)
Distance Metric: cosine (used to measure similarity between vectors)
Max Seq. Length: 512 (number of tokens the model can process at once)

Overview

A strong choice when you want high performance while staying with an open-source model. Works well on messy data. Well suited to short queries that are expected to return medium-length passages of text (1-2 paragraphs).

Passages/documents must be prefixed with "passage: " and queries with "query: ".
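The prefix rule can be captured in two small helpers. This is a minimal sketch, not code from the model's authors: the names `as_passage` and `as_query` are hypothetical, and skipping text that already carries the prefix is a convenience assumed here to avoid double-prefixing.

```python
PASSAGE_PREFIX = "passage: "
QUERY_PREFIX = "query: "


def as_passage(text: str) -> str:
    """Return text with the passage prefix, adding it only if missing."""
    return text if text.startswith(PASSAGE_PREFIX) else PASSAGE_PREFIX + text


def as_query(text: str) -> str:
    """Return text with the query prefix, adding it only if missing."""
    return text if text.startswith(QUERY_PREFIX) else QUERY_PREFIX + text
```

Apply `as_passage` to every document before indexing and `as_query` to every search string before embedding, so both sides of the comparison follow the same convention.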

Using the Model

Installation:

Creating Embeddings:
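The sketch below shows one possible way to create embeddings with this model. It makes several assumptions that are not stated on this page: that the `sentence-transformers` package is installed (for example with `pip install sentence-transformers`), that the model is loaded from the Hugging Face ID `intfloat/e5-large-v2`, and that the helper names `load_model`, `embed`, `cosine_scores`, and `main` are illustrative only. The sample passages and query are made up for the example.

```python
# Assumed setup (not from this page): pip install sentence-transformers
MODEL_ID = "intfloat/e5-large-v2"  # assumed Hugging Face model ID


def load_model():
    """Load the model once; the import is lazy so the helpers below work without it."""
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(MODEL_ID)


def embed(model, texts, prefix):
    """Prepend the required prefix and return one unit-length vector per text."""
    return model.encode([prefix + t for t in texts], normalize_embeddings=True)


def cosine_scores(query_vec, passage_vecs):
    """For unit-length vectors, the dot product equals the cosine similarity."""
    return [sum(q * p for q, p in zip(query_vec, vec)) for vec in passage_vecs]


def main():
    model = load_model()
    passages = [
        "Vector databases store embeddings for similarity search.",
        "Sourdough bread needs a long, slow fermentation.",
    ]
    passage_vecs = embed(model, passages, "passage: ")
    query_vec = embed(model, ["what is a vector database?"], "query: ")[0]
    # Print each passage with its cosine similarity to the query.
    for passage, score in zip(passages, cosine_scores(query_vec, passage_vecs)):
        print(f"{score:.3f}  {passage}")
```

Call `main()` to run the example end to end; the first call downloads the model weights. Normalizing the embeddings keeps the scoring consistent with the cosine distance metric listed above.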
