Performance tuning

This section provides tips for getting the best performance out of Pinecone.

Reuse connections

We recommend you reuse the same pinecone.Index() instance when you are upserting and querying the same index.

This is because, under the hood, pinecone.Index connects to Pinecone by establishing a gRPC channel. Reusing that channel lets subsequent calls share the existing HTTP/2 connection instead of opening a new one.
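One way to follow this advice is to construct each index client once and hand out the same object everywhere. The sketch below is only an illustration and not part of the Pinecone client: `make_index_getter` is a hypothetical helper, and it assumes you pass `pinecone.Index` (or any callable that takes an index name) as the factory.

```python
from functools import lru_cache
from typing import Any, Callable


def make_index_getter(factory: Callable[[str], Any]) -> Callable[[str], Any]:
    """Wrap a client factory so each index name is constructed only once.

    `factory` is assumed to be something like `pinecone.Index`; this helper
    is hypothetical and not part of the Pinecone client.
    """

    @lru_cache(maxsize=None)
    def get_index(name: str) -> Any:
        # The first call per name builds the client; later calls return the
        # cached object, so the underlying connection is shared.
        return factory(name)

    return get_index


# Usage (assumes the pinecone client is installed and initialized):
#   import pinecone
#   get_index = make_index_getter(pinecone.Index)
#   index = get_index("index-name")  # same object on every call
```

Caching by name keeps upsert and query code paths on the same instance without passing the object around explicitly.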

Minimize latency

Pinecone Beta is deployed in the AWS US West (Oregon) region (us-west-2).

To minimize latency when you access Pinecone, consider deploying your application in the same US West (Oregon) region.

Contact us if you need a dedicated deployment in other regions. We currently support AWS and GCP.

Slow uploads or high latencies?

Pinecone supports very high throughput (10K+ vectors per second). If you experience slow uploads or high query latencies, it may be because you are accessing Pinecone from your home network.

To improve performance, switch to a cloud environment such as EC2, GCE, Google Colab, GCP AI Platform Notebook, or SageMaker Notebook.
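To check whether the network is the bottleneck, it can help to time the same call from your home network and from a cloud environment. The following is a minimal sketch: `measure_latency` is a hypothetical helper, and the callable you pass in (for example, a function that wraps one of your own queries) is an assumption.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(call: Callable[[], object], runs: int = 20) -> Dict[str, float]:
    """Time `call` repeatedly and return latency statistics in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        # Convert elapsed seconds to milliseconds.
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


# Usage: wrap whatever request you want to time in a zero-argument function.
#   stats = measure_latency(my_query)  # `my_query` is your own function
```

Comparing the median from both environments gives a rough sense of how much of the latency comes from the network path rather than from Pinecone.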

How to upload >1GB of data

When you have more than 1GB of data, we recommend you use more than 1 shard.

As a general guideline, add 1 shard to your index for every additional GB of data:

Python:
pinecone.create_index(name='index-name', metric='cosine', shards=num_shards)

curl:
curl -i -X POST \
  https://controller.beta.pinecone.io/databases \
  -H 'Content-Type: application/json' \
  -H 'Api-Key: YOUR_API_KEY_HERE' \
  -d '{
    "name": "example-index-name",
    "dimension": 128,
    "index_type": "approximated",
    "metric": "cosine",
    "replicas": 1,
    "shards": 1,
    "index_config": {
      "k_bits": 512,
      "hybrid": false
    }
  }'

See the Manage Indexes documentation for information on how to specify the number of shards for your index.
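The one-shard-per-GB guideline can be turned into a quick estimate. This sketch rests on assumptions that are not stated in this section: data size is taken as the raw vector payload (4-byte float32 values), 1 GB is treated as 2**30 bytes, and `estimate_shards` is a hypothetical helper name.

```python
import math

BYTES_PER_GB = 1024 ** 3  # assumption: 1 GB is treated as 2**30 bytes


def estimate_shards(num_vectors: int, dimension: int, bytes_per_value: int = 4) -> int:
    """Estimate a shard count from raw vector size: one shard per GB, at least one."""
    # Raw payload only; metadata and index overhead are not counted.
    total_bytes = num_vectors * dimension * bytes_per_value
    return max(1, math.ceil(total_bytes / BYTES_PER_GB))


num_shards = estimate_shards(num_vectors=10_000_000, dimension=128)
```

Under these assumptions, 10 million 128-dimensional vectors occupy roughly 4.8 GB, so the helper returns 5. Treat the result as a starting point rather than an exact requirement.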

How to increase throughput

To increase throughput (QPS), increase the number of replicas for your index:

Python:
pinecone.create_index(name='index-name', metric='cosine', replicas=num_replicas)

curl:
curl -i -X PATCH \
  https://controller.beta.pinecone.io/databases/indexName \
  -H 'Content-Type: application/json' \
  -H 'Api-Key: YOUR_API_KEY_HERE' \
  -d '{
    "replicas": 2
  }'
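If you prefer to issue the PATCH request from Python without curl, the same request can be built with the standard library. This is a sketch that mirrors the curl example above; `build_replica_request` is a hypothetical helper, and the URL, headers, and body are copied from that example rather than from a separate API reference.

```python
import json
import urllib.request

# Controller endpoint taken from the curl example above.
CONTROLLER_URL = "https://controller.beta.pinecone.io/databases"


def build_replica_request(index_name: str, replicas: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that sets the replica count."""
    body = json.dumps({"replicas": replicas}).encode("utf-8")
    return urllib.request.Request(
        url=f"{CONTROLLER_URL}/{index_name}",
        data=body,
        headers={"Content-Type": "application/json", "Api-Key": api_key},
        method="PATCH",
    )


# Usage: build the request, then send it with urllib.request.urlopen(request).
request = build_replica_request("indexName", 2, "YOUR_API_KEY_HERE")
```

Separating request construction from sending makes it easy to inspect the URL and payload before anything reaches the controller.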

Using the REST API

You can use the Pinecone REST API endpoints to manage a Pinecone index.

See the Pinecone API Reference documentation for more information on Pinecone's API endpoints and schemas.