Overview

In the Pinecone docs, you'll find information on using Pinecone through a client or our REST APIs. Our quickstart guide explains how to get a production-ready similarity search service up and running in minutes. You'll also find troubleshooting help and answers to frequently asked questions.

Below, you'll find a summary of Pinecone: the key concepts, the workflow, example use cases, reasons to use Pinecone, supported indexes and metrics, and deployment options.

What is Pinecone?

Pinecone is a managed similarity search service that lets you add vector search to your applications, a task that traditional databases tend to handle poorly.

You can quickly search for objects, such as images, audio files and documents, that are similar to each other.

Pinecone indexes and searches vector representations of data to find items that are similar to the query. You can index billions of items in real-time and search for the closest matches, with millisecond latency.
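
To make the idea of "closest matches" concrete, here is a minimal sketch of similarity search as an exhaustive scan over a handful of vectors. The item names, the 3-dimensional vectors, and the `nearest_neighbors` helper are all hypothetical and chosen for illustration; this is not how Pinecone searches internally, and real embeddings usually have hundreds of dimensions.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query, items, top_k=2):
    """Return the top_k (id, distance) pairs closest to the query.

    Exhaustive scan over every stored vector, for illustration only.
    """
    scored = [(item_id, euclidean_distance(query, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_k]


# Hypothetical vectors standing in for image embeddings.
items = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "kitten.jpg": [0.8, 0.2, 0.1],
    "truck.jpg": [0.0, 0.9, 0.7],
}

matches = nearest_neighbors([1.0, 0.0, 0.0], items, top_k=2)
print([item_id for item_id, _ in matches])  # ['cat.jpg', 'kitten.jpg']
```

A scan like this compares the query against every item, so its cost grows with the size of the collection. That is the cost an index is meant to avoid.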

Key concepts

Learn more

Our Learn section explains the basics of vector databases and similarity search as a service.

Similarity search is a newer method of searching through large datasets. Unlike traditional search methods, it indexes and searches vector representations of data to find items in close proximity to the query.

Vector embeddings

Vector embeddings, or “vectors,” are ordered lists of floating-point numbers that represent objects, such as images and documents. They are often generated by machine learning (ML) models trained to capture the semantic similarity of objects. Most deep learning models operate on vectors internally.

Does it work with raw data or do I need vector embeddings?

You need vector embeddings. That means finding an embedding model and running it somewhere.
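
As a rough illustration of what "turning an object into a vector" means, the sketch below maps text to a fixed-length list of floats by counting words from a small vocabulary. The `toy_embed` function and the vocabulary are hypothetical stand-ins. A real embedding model is trained to capture meaning rather than word counts, but its output has the same shape: one fixed-length vector per object.

```python
def toy_embed(text, vocabulary):
    """Map text to a fixed-length list of floats by counting vocabulary words.

    A stand-in for a real embedding model, used only to show the data shape.
    """
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


# Hypothetical vocabulary; each term becomes one dimension of the vector.
vocabulary = ["vector", "search", "image", "audio"]

embedding = toy_embed("Vector search makes image search fast", vocabulary)
print(embedding)  # [1.0, 2.0, 1.0, 0.0]
```

Whatever model you choose, every vector you store in a given index should come from the same model, so that all vectors have the same length and distances between them are meaningful.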

Example use cases

Learn more

Want to start with working examples? See: Example Applications

Example use cases of similarity search include:

Build semantic text search into your applications.

After converting text data into vector embeddings using an NLP transformer (e.g., a sentence embedding model), you can store, index, and search through those vectors using Pinecone.

Create an image similarity search backend service.

You can transform image data into vector embeddings and build an index with Pinecone to store them. This enables you to send a new image as a query and retrieve similar images from the index.

Build an audio search application.

Vector embeddings of audio recordings are rich mathematical representations of the audio. They make it possible to measure how similar recordings are to one another.

This enables you to:

  • Find songs and metadata within a catalog, based on a sample
  • Find similar sounds in an audio library
  • Detect who's speaking in an audio file
  • Take some new (unseen) audio recordings and search through the index to find the most similar matches, along with their YouTube links.

Build a question answering application.

You can index a set of questions and retrieve the most similar stored questions for a new (unseen) question. This enables you to link a new question to answers you might already have.

Build a product recommendation engine.

You can generate product recommendations for ecommerce customers based on previous orders and trending items.

Overview of the workflow


The key steps are:

  1. Create an index
  2. Connect to an index
  3. Insert the data (and vectors) into the index
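
The three steps above can be sketched with a small in-memory stand-in. Everything here (`ToyIndex`, the `indexes` registry, and the `upsert` and `query` method names) is hypothetical and is not the Pinecone client API; see the quickstart guide for the real calls. The sketch only shows the order of operations, plus a query at the end to confirm the inserted data is searchable.

```python
import math


class ToyIndex:
    """In-memory stand-in for a vector index (hypothetical, not the Pinecone client)."""

    def __init__(self, dimension):
        self.dimension = dimension
        self._vectors = {}

    def upsert(self, items):
        """Insert or overwrite (id, vector) pairs."""
        for item_id, vector in items:
            if len(vector) != self.dimension:
                raise ValueError(f"expected {self.dimension} values, got {len(vector)}")
            self._vectors[item_id] = list(vector)

    def query(self, vector, top_k=1):
        """Return the top_k (id, distance) pairs, closest first."""
        scored = [(item_id, math.dist(vector, stored)) for item_id, stored in self._vectors.items()]
        return sorted(scored, key=lambda pair: pair[1])[:top_k]


indexes = {}  # hypothetical registry standing in for the managed service

# 1. Create an index
indexes["demo"] = ToyIndex(dimension=3)

# 2. Connect to the index
index = indexes["demo"]

# 3. Insert the data (ids and vectors) into the index
index.upsert([("a", [1.0, 0.0, 0.0]), ("b", [0.0, 1.0, 0.0])])

# Search the index for the stored vector closest to a new query vector.
results = index.query([0.9, 0.1, 0.0], top_k=1)
print(results[0][0])  # a
```

Rejecting vectors of the wrong length at insert time is a design choice in this sketch: every vector in an index must have the same dimension for distances to be defined.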

From there, you can:

Why use Pinecone?

Pinecone makes it easy to add vector search to production applications. With Pinecone, you no longer need to benchmark and tune algorithms or build and maintain infrastructure for vector search.

Key benefits:

  • Production-Ready: Go to production with a few lines of code, without any additional engineering or devops work.
  • Scale & High Performance: Search through billions of vectors in tens of milliseconds.
  • Fully Managed: We obsess over infrastructure, operations, and security so you don't have to.

What indexes and metrics are supported?

Note: You must declare the index type and distance metric when you create a new index.

Similarity metric

You can use any of the following similarity metrics in your vector index:

  • euclidean

    • This is the straight-line distance between two points in vector space. It is one of the most commonly used distance metrics. For an example, see our image similarity search example.
    • When you use metric='euclidean', the most similar results are those with the lowest score.
  • cosine

    • This is often used to find similarities between documents. It measures the angle between two vectors and ignores their magnitudes, so scores are bounded: cosine similarity always falls in the [-1, 1] range.
  • dotproduct

    • This is the sum of the products of the corresponding components of two vectors. It indicates how similar the two vectors are: the more positive the result, the more closely the vectors point in the same direction. Unlike cosine, the result also grows with the vectors' magnitudes.

Depending on your application, some metrics have better recall and precision performance than others. For more information, see: What is Vector Similarity Search?
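
To see how the three metrics differ, the sketch below computes each one by hand for a few small vectors. The function names and sample vectors are illustrative assumptions and are not part of any Pinecone API; in practice the index computes the metric for you.

```python
import math


def euclidean(a, b):
    """Straight-line distance; lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_product(a, b):
    """Sum of component-wise products; higher means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Dot product divided by both vector lengths; depends only on direction."""
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))


a = [1.0, 0.0]
b = [2.0, 0.0]   # same direction as a, twice the length
c = [-1.0, 0.0]  # opposite direction to a

print(euclidean(a, b))    # 1.0
print(dot_product(a, b))  # 2.0
print(cosine(a, b))       # 1.0
print(cosine(a, c))       # -1.0
```

Note that `a` and `b` point in exactly the same direction, so their cosine score is the maximum of 1.0, yet their euclidean distance is not zero. Which behavior you want depends on whether vector length carries meaning in your embeddings.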

Index types

Pinecone supports:

  • Approximate nearest neighbor search

    • The approximate engine uses approximate nearest neighbor search algorithms developed by Pinecone; it is fast and highly accurate.
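
Accuracy of an approximate search is commonly expressed as recall: the fraction of the true nearest neighbors that the approximate search also returns. The sketch below shows that calculation; the `recall_at_k` helper and the sample ids are hypothetical, and the numbers are made up for illustration rather than measured from Pinecone.

```python
def recall_at_k(approximate_ids, exact_ids):
    """Fraction of the exact top-k results that the approximate search also returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    exact = set(exact_ids)
    return len(exact & set(approximate_ids)) / len(exact)


# Hypothetical results for one query: an exhaustive search vs. an approximate one.
exact_top3 = ["a", "b", "c"]
approx_top3 = ["a", "c", "d"]

recall = recall_at_k(approx_top3, exact_top3)
print(f"recall@3 = {recall:.2f}")  # recall@3 = 0.67
```

Averaging this value over many queries gives a single number that can be weighed against query latency when comparing search configurations.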

Pricing and deployment options

Visit the pricing page for pricing and deployment options.

Get started with Pinecone

Go to the quickstart guide to get a production-ready similarity search service up and running in minutes.