Pinecone Dedicated Read Nodes

Predictable speed and cost for billion-vector and high-QPS workloads

Watch the Dedicated Read Nodes webinar

Learn how teams are running mission-critical AI with hard SLOs, and what it takes to get there.

Billion-vector-scale semantic search

With strict latency requirements

High-QPS recommendations

That need steady, predictable throughput

Mission-critical AI services

With hard SLOs

Large enterprise or multi-tenant platforms

That require performance isolation

Lower, more predictable cost

Hourly per-node pricing is more cost-effective than per-request pricing for sustained, high-QPS workloads and makes spend easier to forecast.

Predictable costs

Pay a predictable hourly rate for Dedicated Read Nodes (DRN) instead of costs that fluctuate with query volume.

Easy to forecast

Tie node count directly to spend so you can model, budget, and adjust costs as traffic grows.
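As a rough illustration of modeling spend from node count, the sketch below assumes that the total node count equals shards multiplied by replicas and that every node bills at the same flat hourly rate. The function name and the $1.00 rate are hypothetical placeholders, not Pinecone pricing.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def monthly_node_cost(shards: int, replicas: int, hourly_rate_per_node: float) -> float:
    """Estimate monthly spend, assuming nodes = shards * replicas (an assumption)."""
    if shards < 1 or replicas < 1:
        raise ValueError("shards and replicas must be at least 1")
    if hourly_rate_per_node < 0:
        raise ValueError("hourly_rate_per_node must not be negative")
    nodes = shards * replicas
    return nodes * hourly_rate_per_node * HOURS_PER_MONTH


# Hypothetical rate of $1.00 per node-hour, for illustration only.
current = monthly_node_cost(shards=2, replicas=2, hourly_rate_per_node=1.00)
after_growth = monthly_node_cost(shards=2, replicas=3, hourly_rate_per_node=1.00)
print(f"current: ${current:,.2f} / after adding a replica: ${after_growth:,.2f}")
```

Swapping in your actual node rate turns the same function into a simple budgeting tool: each added shard or replica changes the forecast by a fixed, known amount.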

Efficient at high QPS

High-throughput workloads see a lower cost per query with DRN than with per-request pricing.
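To make the comparison concrete, here is a minimal break-even sketch. It simplifies per-request pricing to a single flat price per million queries, which is an assumption, and every rate used is a made-up placeholder rather than a published price.

```python
SECONDS_PER_HOUR = 3600


def break_even_qps(
    nodes: int,
    hourly_rate_per_node: float,
    price_per_million_queries: float,
) -> float:
    """Sustained QPS above which a fixed hourly node bill costs less than paying per query."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if price_per_million_queries <= 0:
        raise ValueError("price_per_million_queries must be positive")
    hourly_node_cost = nodes * hourly_rate_per_node
    cost_per_query = price_per_million_queries / 1_000_000
    # Number of queries per hour at which both pricing models cost the same.
    queries_per_hour = hourly_node_cost / cost_per_query
    return queries_per_hour / SECONDS_PER_HOUR


# Hypothetical inputs: 4 nodes at $1.00 per hour versus $10 per million queries.
threshold = break_even_qps(nodes=4, hourly_rate_per_node=1.00, price_per_million_queries=10.0)
print(f"Per-node pricing wins above roughly {threshold:.0f} sustained QPS")
```

Below the threshold, usage-based pricing is cheaper; above it, the fixed hourly bill is spread over more queries and the cost per query keeps falling.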

Predictable low latency and high throughput at scale

DRN powers workloads of 100M to 1B+ vectors at hundreds to thousands of QPS, delivering p50 latencies in the tens of milliseconds.

E-commerce marketplace. Recommendations. 1.4B vectors.

2.7k QPS, unfiltered: p50 60ms, p99 100ms

5.7k QPS, filtered (0.26% avg. selectivity): p50 26ms, p99 60ms

Design platform. Semantic search. 135M vectors.

600 QPS: p50 45ms, p99 96ms

Media company. Semantic search. 480M vectors.

380 QPS: p50 80ms, p99 170ms
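For readers who want to compare their own measurements against figures like these, the sketch below computes p50 and p99 from a list of latency samples. The page does not say how its percentiles were measured, so the nearest-rank definition used here is an assumption, and the sample latencies are invented.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Invented latency samples in milliseconds, for illustration only.
latencies_ms = [22, 24, 25, 26, 26, 27, 29, 31, 40, 58]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"p50={p50}ms p99={p99}ms")
```

With only a handful of samples, p99 is simply the slowest request, so collect thousands of measurements under steady load before drawing conclusions about tail latency.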

Scale for your largest workloads

DRN is built for billion-vector semantic search and high-QPS recommendation systems, so you can grow without re-architecting or migrating.

Click to scale

Add replicas to increase throughput and shards to grow storage, with no reindexing or manual tuning required.
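One way to think about the two dials is as a sizing calculation: shards follow data volume and replicas follow query load. The sketch below is a hypothetical planner, not part of the Pinecone SDK. The per-shard vector capacity and per-replica QPS figures are invented inputs you would replace with numbers from your own testing, and the returned dictionary mirrors the read_capacity shape used in the index-creation example on this page.

```python
import math


def plan_read_capacity(
    total_vectors: int,
    target_qps: float,
    vectors_per_shard: int,
    qps_per_replica: float,
    node_type: str = "b1",
) -> dict:
    """Derive shard and replica counts from assumed per-shard and per-replica capacity."""
    if vectors_per_shard <= 0 or qps_per_replica <= 0:
        raise ValueError("capacity figures must be positive")
    # Shards grow storage: enough shards to hold every vector.
    shards = max(1, math.ceil(total_vectors / vectors_per_shard))
    # Replicas increase throughput: enough replicas to serve the target QPS.
    replicas = max(1, math.ceil(target_qps / qps_per_replica))
    return {
        "mode": "Dedicated",
        "dedicated": {
            "node_type": node_type,
            "scaling": "Manual",
            "manual": {"shards": shards, "replicas": replicas},
        },
    }


# Invented capacity figures: 100M vectors per shard, 400 QPS per replica.
plan = plan_read_capacity(
    total_vectors=135_000_000,
    target_qps=600,
    vectors_per_shard=100_000_000,
    qps_per_replica=400,
)
print(plan["dedicated"]["manual"])
```

Keeping the two calculations separate reflects the claim above: growth in data changes only the shard count, and growth in traffic changes only the replica count.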

No migrations required

Pinecone moves data and adjusts read capacity behind the scenes, with no downtime or performance degradation, so you never have to plan or run migrations.

One API, two vector database modes

The combination of On-Demand and Dedicated Read Nodes powers a wide range of production workloads with the right price-performance for each.

On-Demand

Autoscaling for bursty or multi-tenant workloads with simple, usage-based pricing.

Dedicated Read Nodes

Dedicated, provisioned read nodes, a warm data path, and simple scaling with hourly per-node pricing for predictable speed and cost.

Deploy in seconds

Scale seamlessly.

search/pinecone.py

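from pinecone import Pinecone, ServerlessSpec

# Replace the placeholder with your own API key.
pc = Pinecone(api_key="<API KEY>")

# Placeholder index name; choose your own.
index_name = "example-index"

pc.create_index(
    name=index_name,
    dimension=1024,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1",
        # Request Dedicated Read Nodes for this index.
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "b1",
                "scaling": "Manual",
                # Shards grow storage; replicas increase throughput.
                "manual": {
                    "shards": 2,
                    "replicas": 2,
                },
            },
        },
    ),
)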

Start building knowledgeable AI today

Create your first index for free, then pay as you go when you're ready to scale.
