Tips and best practices for getting the most out of Pinecone.
Under the hood, pinecone.Index connects to Pinecone by establishing a gRPC channel. Reusing a gRPC channel lets RPC calls share the existing HTTP/2 connection. In other words, when upserting to and querying the same index, you should reuse the same pinecone.Index object.
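One way to make this reuse hard to get wrong is to create the client object in a single place and cache it. The sketch below illustrates the pattern only. StubIndex is a hypothetical stand-in for pinecone.Index (it just counts how many times it is constructed), and get_index is an illustrative helper name, not part of the Pinecone SDK.

```python
from functools import lru_cache


class StubIndex:
    """Hypothetical stand-in for pinecone.Index; counts how many times it is built."""

    channels_opened = 0

    def __init__(self, name: str) -> None:
        self.name = name
        # A real client would establish its gRPC channel at this point.
        StubIndex.channels_opened += 1


@lru_cache(maxsize=None)
def get_index(name: str) -> StubIndex:
    """Return one shared client object per index name, created on first use."""
    return StubIndex(name)


# Both the upsert path and the query path ask for the same cached object.
upsert_client = get_index("example-index")
query_client = get_index("example-index")
```

With the real client, the same idea would apply by constructing the index object inside get_index, assuming the object is safe to share across the code paths that use it.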
Pinecone Beta is deployed in AWS us-west-2, the US West (Oregon) region. To minimize latency when you access Pinecone, consider deploying your application in the same region.
Contact us if you need a dedicated deployment in other regions. We currently support AWS and GCP.
Slow Uploads or High Latencies?
Pinecone supports very high throughput (10K+ vectors per second). If you experience slow uploads or high query latencies, it may be because you are accessing Pinecone from your home network. Switch to a cloud environment such as EC2, GCE, Google Colab, GCP AI Platform Notebook, or SageMaker Notebook for significant performance improvements.
How to Upload >1GB of Data
When your data is more than 1GB, be sure to use more than 1 shard. As a general guideline, add 1 shard to your index for every additional GB of data. Refer to the documentation on how to specify the number of shards for your index.
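The guideline above can be turned into a small calculation. This is a rough sketch under one assumption: partial gigabytes are rounded up, so anything over 1GB gets at least 2 shards. The function name suggested_shards is illustrative and not part of the Pinecone SDK.

```python
import math


def suggested_shards(data_size_gb: float) -> int:
    """Roughly one shard per GB of data, rounded up, and never fewer than one."""
    if data_size_gb < 0:
        raise ValueError("data size must be non-negative")
    return max(1, math.ceil(data_size_gb))


# Example: shard counts for a few dataset sizes (in GB).
for size_gb in (0.5, 1.0, 1.2, 3.0):
    print(size_gb, "GB ->", suggested_shards(size_gb), "shard(s)")
```

Treat the result as a starting point; the right number for a given index may differ depending on vector dimensionality and metadata.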
How to Increase Throughput
To increase throughput (QPS), increase the number of replicas for your index. Refer to the SDK Reference for how to specify the number of replicas for your index.
Use the unary_* interface when you handle only one item at a time, or when you have many concurrent clients accessing the same Pinecone index. The unary_* interface sends unary gRPC requests under the hood. For example, if you scale your web application horizontally, you should use unary_query to send queries.