Performance tuning

This section provides some tips for getting the best performance out of Pinecone.

Reuse connections

We recommend you reuse the same pinecone.Index() instance when you upsert to and query the same index, rather than creating a new instance for each operation.
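One way to follow this advice is to create the index handle once and hand out the same object everywhere. The sketch below is illustrative only: StubIndex is a hypothetical stand-in for pinecone.Index so the example runs without a Pinecone account, and get_index is a helper name invented here.

```python
from functools import lru_cache


class StubIndex:
    """Hypothetical stand-in for pinecone.Index, used so the sketch runs offline."""

    instances_created = 0

    def __init__(self, name):
        StubIndex.instances_created += 1
        self.name = name


@lru_cache(maxsize=None)
def get_index(name):
    # In a real application this line would be: return pinecone.Index(name)
    return StubIndex(name)


# Both calls return the same cached object, so only one instance is created.
upsert_handle = get_index("example-index")
query_handle = get_index("example-index")
print(upsert_handle is query_handle, StubIndex.instances_created)
```

Caching by index name keeps one handle per index, so code paths that upsert and code paths that query share it without passing the object around explicitly.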

Minimize latency

Pinecone is deployed in the GCP us-west1 region, US West (Oregon).

To minimize latency when you access Pinecone, consider deploying your application in the same US West (Oregon) region.

Contact us if you need a dedicated deployment in other regions. We currently support AWS and GCP.

Slow uploads or high latencies?

If you experience slow uploads or high query latencies, it may be because you are accessing Pinecone from your home network.

To improve performance, run your application in a cloud environment such as EC2, GCE, Google Colab, GCP AI Platform Notebook, or SageMaker Notebook.
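To check whether the network is the bottleneck, it can help to time the same call from each environment and compare. The following is a minimal sketch, not part of the Pinecone client: measure_latency is a hypothetical helper, and the time.sleep call stands in for a real request to your index.

```python
import statistics
import time


def measure_latency(call, repeats=20):
    """Run call() `repeats` times and return (median, worst) latency in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), max(samples)


# Stand-in workload; replace the lambda with a real query or upsert against your index.
median_ms, worst_ms = measure_latency(lambda: time.sleep(0.001), repeats=5)
print(f"median: {median_ms:.2f} ms, worst: {worst_ms:.2f} ms")
```

Running the same measurement from a home network and from a cloud instance in or near US West (Oregon) gives a direct comparison of round-trip times.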

Using the gRPC client to get higher upsert speeds

Pinecone has a gRPC flavor of the standard client that can provide higher upsert speeds for multi-node indexes. See instructions to install the gRPC client.

To connect to an index via the gRPC client:

index = pinecone.GRPCIndex("index-name")

The syntax for upsert, query, fetch, and delete with the gRPC client remains the same as with the standard client.

We recommend you use parallel upserts to get the best performance.

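import pinecone

# Assumes pinecone.init() has already been called and that `data` is a
# list of vectors to upsert, prepared earlier.
# We recommend you use the same number of pool_threads as the number of cores on the system.
index = pinecone.GRPCIndex('example-index')

def chunker(seq, batch_size):
    # Yield successive slices of seq, each at most batch_size items long.
    return (seq[pos:pos + batch_size] for pos in range(0, len(seq), batch_size))

# Send each batch without waiting for the previous one to finish.
async_results = [
    index.upsert(vectors=chunk, async_req=True)
    for chunk in chunker(data, batch_size=100)
]

# Wait for and retrieve responses (in case of error)
[async_result.result() for async_result in async_results]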
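To see how the batching step behaves on its own, here is a small offline sketch of the chunking helper. The sample data (250 made-up ID and vector pairs) is an assumption for illustration, and no Pinecone connection is involved.

```python
def chunker(seq, batch_size):
    """Yield consecutive slices of seq, each at most batch_size items long."""
    return (seq[pos:pos + batch_size] for pos in range(0, len(seq), batch_size))


# Made-up sample data: 250 (id, vector) pairs.
data = [(f"id-{i}", [0.1 * i, 0.2 * i]) for i in range(250)]

batches = list(chunker(data, batch_size=100))
print([len(batch) for batch in batches])  # [100, 100, 50]
```

The final batch is simply shorter when the data length is not a multiple of the batch size, so no vectors are dropped or padded.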

We recommend you use the gRPC client for multi-node indexes only, because the standard and gRPC clients perform similarly on a single-node index.

You may be write-throttled sooner when upserting with the gRPC client. If this happens often, we recommend you use a backoff algorithm while upserting.
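A backoff algorithm can take several forms; the sketch below shows one common choice, exponential backoff with random jitter. Everything in it is an assumption for illustration: upsert_with_backoff, Throttled, and flaky_upsert are hypothetical names, and the exception the client actually raises when throttled is not specified here, so pass the appropriate type through retry_on.

```python
import random
import time


def upsert_with_backoff(do_upsert, max_retries=5, base_delay=0.5, max_delay=30.0,
                        retry_on=(Exception,), sleep=time.sleep):
    """Call do_upsert(), retrying with exponentially growing, jittered delays."""
    for attempt in range(max_retries + 1):
        try:
            return do_upsert()
        except retry_on:
            if attempt == max_retries:
                raise  # Out of retries: surface the last error to the caller.
            # Double the delay ceiling on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


class Throttled(Exception):
    """Hypothetical stand-in for the error raised when writes are throttled."""


calls = {"count": 0}


def flaky_upsert():
    # Simulated upsert that is throttled twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise Throttled("write throttled")
    return "ok"


# sleep is replaced with a no-op so the demonstration finishes instantly.
result = upsert_with_backoff(flaky_upsert, retry_on=(Throttled,), sleep=lambda s: None)
print(result, calls["count"])
```

In real use, do_upsert would wrap a single batch upsert, for example lambda: index.upsert(vectors=chunk). The random jitter spreads retries out so that parallel workers do not all retry at the same moment.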

How to increase throughput

To increase throughput (QPS), increase the number of replicas for your index:

Python:

pinecone.create_index(name='index-name', metric='cosine', replicas=num_replicas)

curl:

curl -i -X PATCH https://controller.us-west1-gcp.pinecone.io/databases/indexName \
  -H 'Api-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "replicas": 2
  }'

See the Pinecone API Reference documentation for more information on Pinecone's API endpoints and schemas.