Migration Guide

Pinecone Pods vs. Serverless

Pinecone's pod-based architecture is legacy infrastructure. The current Pinecone architecture is serverless, built on object storage with decoupled storage and compute. This page explains the differences, why serverless is the recommended path forward, and how to migrate.

OVERVIEW

The Short Version

If you're starting a new project, use serverless. It's the current architecture, it's what Pinecone actively develops and optimizes, and it eliminates the operational overhead of managing pods.

If you have existing pod-based indexes, Pinecone provides migration paths to serverless. The rest of this page explains why you should migrate and how the two architectures differ.

LEGACY ARCHITECTURE

What Were Pods?

Pods were Pinecone's original deployment model. A pod was a pre-configured unit of compute and storage: essentially a fixed-size virtual machine running a Pinecone index. You chose a pod type (s1, p1, p2) based on your performance and storage needs, and you provisioned a specific number of pods.

Pods had several limitations inherent to their architecture:

Coupled storage and compute

If you needed more storage, you also paid for more compute (and vice versa). Scaling required provisioning more pods, even if only one dimension (storage or throughput) was the bottleneck.

Fixed capacity

You provisioned a specific number of pods and paid for them whether they were fully utilized or idle. Handling traffic spikes meant over-provisioning.

Manual scaling

Adding capacity required explicit pod count changes. There was no elastic auto-scaling.

Data size constraints

Each pod type had a maximum vector count. Scaling beyond that required adding pods and re-distributing data.

No background optimization

Indexing algorithms were fixed at pod creation time. Upgrading algorithms required manual re-indexing.

CURRENT ARCHITECTURE

What Is Serverless?

Pinecone's serverless architecture fundamentally rethinks how vector data is stored and queried. Instead of coupling compute and storage in fixed pods, serverless separates them:

Storage

Vectors are stored in immutable files called slabs on object storage (e.g., Amazon S3). Data is durable, distributed, and decoupled from the compute layer.

Compute

A fleet of stateless query executors caches slabs on local SSDs and processes queries in parallel. The executor pool scales dynamically based on demand.

Indexing

An asynchronous index builder processes writes into the slab structure, using a Write-Ahead Log for durability and a memtable for immediate query availability.
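
The write path described above (durable log first, a memtable for immediate reads, periodic flushes into immutable files) is a classic storage pattern. As a loose conceptual sketch only, and not Pinecone's actual implementation, it can be illustrated in a few lines of Python:

```python
import json

class ToyWritePath:
    """Toy illustration of a WAL + memtable write path.

    Conceptual sketch only; Pinecone's real index builder, slab format,
    and durability guarantees are far more involved.
    """

    def __init__(self):
        self.wal = []        # stand-in for a durable write-ahead log
        self.memtable = {}   # recent writes, immediately queryable
        self.slabs = []      # immutable files flushed from the memtable

    def upsert(self, vec_id, vector):
        # 1. Append to the log first so the write survives a crash.
        self.wal.append(json.dumps({"id": vec_id, "vector": vector}))
        # 2. Insert into the memtable so reads see it right away.
        self.memtable[vec_id] = vector

    def flush(self):
        # Periodically, the builder turns the memtable into an
        # immutable slab and clears it.
        if self.memtable:
            self.slabs.append(dict(self.memtable))
            self.memtable.clear()

    def fetch(self, vec_id):
        # Reads consult the memtable first, then the immutable slabs.
        if vec_id in self.memtable:
            return self.memtable[vec_id]
        for slab in reversed(self.slabs):
            if vec_id in slab:
                return slab[vec_id]
        return None

wp = ToyWritePath()
wp.upsert("a", [0.1, 0.2])
print(wp.fetch("a"))  # queryable immediately, before any flush
wp.flush()
print(wp.fetch("a"))  # still readable from the flushed slab
```

The key property this sketch captures is that a write is both durable (logged) and queryable (in the memtable) before any background indexing completes.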

This is not a minor upgrade; it is a fundamentally different architecture. For a deep technical explanation, see How Pinecone Works.

COMPARISON

Side-by-Side Comparison

A detailed comparison of the pod-based (legacy) and serverless (current) architectures across key capabilities.

Capability | Pods (Legacy) | Serverless (Current)
Architecture | Coupled compute + storage in fixed VMs | Decoupled: object storage + stateless compute
Scaling | Manual pod provisioning | Automatic elastic scaling (On-Demand) or replica-based (DRN)
Write latency | Varies by pod type and load | Under 100 ms acknowledgment
Write-to-query freshness | Seconds to minutes, depending on configuration | Seconds (via memtable)
Indexing algorithm | Fixed at creation time | Dynamically chosen per slab; upgraded transparently
Quantization | User-managed (or pod defaults) | Automatic, optimized during background compaction
Metadata filtering | Supported, can degrade performance | Accelerates queries via roaring bitmaps and adaptive pre/mid-filtering
Re-indexing required | Yes, for algorithm changes or major updates | Never; compaction handles optimization transparently
Storage efficiency | Full vectors in RAM per pod | Full-fidelity on object storage; optimized projections in executor cache
Max scale | Limited by pod count and type | Billions of vectors across distributed slabs
Pricing model | Per pod-hour (fixed capacity) | Per read unit (On-Demand) or per-shard flat fee (DRN)
Idle cost | Full pod cost even when idle | Zero queries = minimal cost (On-Demand)
Status | Legacy; supported but not actively enhanced | Current; actively developed and optimized

MIGRATION BENEFITS

Why Migrate to Serverless?

You Get Automatic Algorithm Upgrades

On pods, the indexing algorithm was fixed when the index was created. If Pinecone developed a better algorithm, you had to manually re-index to benefit.

On serverless, Pinecone's background compaction process continuously re-optimizes your data. When new algorithms are developed (like the recent Product-Quantized Fast Scan), they are applied to your existing indexes during compaction, without downtime, re-ingestion, or any action on your part.

You Stop Paying for Idle Capacity

Pods charge per hour regardless of utilization. If your traffic is bursty (heavy during business hours, quiet at night), you pay full price for idle pods.

On-Demand serverless charges per read unit consumed. Zero queries means minimal cost. For sustained high-throughput workloads, Dedicated Read Nodes offer flat-fee pricing without per-query charges.
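
To make the billing shapes concrete, here is a back-of-the-envelope comparison. All rates below are hypothetical, chosen only for illustration; see Pinecone's pricing page for real figures:

```python
# Hypothetical rates for illustration only; not Pinecone's actual pricing.
POD_COST_PER_HOUR = 0.10             # fixed cost, billed whether idle or busy
COST_PER_MILLION_READ_UNITS = 8.00   # on-demand, billed per work consumed

HOURS_PER_MONTH = 730

def pod_monthly_cost(num_pods):
    # Pods bill around the clock, regardless of traffic.
    return num_pods * POD_COST_PER_HOUR * HOURS_PER_MONTH

def on_demand_monthly_cost(read_units_per_month):
    # On-demand serverless bills only for read units actually consumed.
    return read_units_per_month / 1_000_000 * COST_PER_MILLION_READ_UNITS

# A bursty workload still pays full price for idle pods, but only for
# consumed reads on serverless.
print(f"2 pods, mostly idle:     ${pod_monthly_cost(2):.2f}/month")
print(f"5M read units on-demand: ${on_demand_monthly_cost(5_000_000):.2f}/month")
```

The crossover point depends entirely on your traffic pattern, which is why sustained high-throughput workloads may instead favor the flat-fee Dedicated Read Nodes model mentioned above.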

You Eliminate Operational Overhead

With pods, you had to choose pod types, manage pod counts, monitor utilization, and plan capacity. With serverless, Pinecone handles all of this. You interact with indexes and namespaces; the infrastructure is invisible.

You Get Better Metadata Filtering

Serverless indexes use roaring bitmap indexes for every metadata field and dynamically choose between pre-filtering and mid-scan filtering based on selectivity. This makes selective filters accelerate queries rather than slow them down.
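
As a conceptual sketch of why selective filters help, with plain Python sets standing in for roaring bitmaps (this is not Pinecone's implementation): intersecting per-value ID sets before the vector scan shrinks the candidate list, so the scan touches fewer vectors.

```python
# Each (field, value) pair maps to the set of matching vector ids,
# analogous to a bitmap posting list. Ids and fields are hypothetical.
postings = {
    ("genre", "documentary"): {1, 4, 7},
    ("genre", "drama"): {0, 2, 3, 5, 6, 8, 9},
    ("year", 2024): {3, 4, 9},
}

def prefilter(*terms):
    """Intersect the posting sets for all filter terms."""
    sets = [postings[t] for t in terms]
    return set.intersection(*sets)

candidates = prefilter(("genre", "documentary"), ("year", 2024))
print(candidates)  # only id 4 survives; the scan now touches 1 vector, not 10
```

The mid-scan alternative (checking metadata while scanning) wins when a filter matches most vectors; choosing between the two per query, based on selectivity, is what keeps filters from degrading performance.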

You Unlock Full-Fidelity Storage

Serverless stores your vectors at full precision on object storage and applies optimized quantization internally during compaction. You never sacrifice accuracy for scalability.

MIGRATION STEPS

How to Migrate from Pods to Serverless

Migration involves four straightforward steps. Your pod-based index remains available during the entire process, so there is no downtime.

STEP 1

Create a New Serverless Index

Create a new index using the serverless architecture. Specify the same metric (cosine, euclidean, or dotproduct) and dimensions as your pod-based index.

STEP 2

Export and Re-Ingest Your Data

Export vectors from your pod-based index and upsert them into the new serverless index. For large datasets, use batch operations and parallel processing. Your pod-based index remains available during migration.

STEP 3

Update Your Application

Point your application to the new serverless index. The query and upsert APIs are the same; no code changes are needed beyond updating the index name or host.

STEP 4

Decommission the Pod-Based Index

Once you've validated that the serverless index is serving correctly, delete the pod-based index to stop incurring pod charges.

Step 1: Create a serverless index (Python)

Create a new index using the serverless architecture. Specify the same metric (cosine, euclidean, or dotproduct) and dimensions as your pod-based index.

migration.py
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")

pc.create_index(
    name="my-index-serverless",
    dimension=1536,   # must match the pod-based index's dimension
    metric="cosine",  # must match the pod-based index's metric
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

Pinecone provides tooling and documentation for data migration. The key consideration is that this is a data copy operation: your pod-based index remains available throughout, so there is no downtime, and the query and upsert APIs are unchanged apart from the index name or host.
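
As an illustrative sketch of the export-and-re-ingest step (Step 2), assuming the Python client's fetch and upsert index methods. The index names and IDs here are hypothetical; enumerating IDs is left to your own system of record, and a production migration would add parallelism and retries:

```python
def chunked(items, size):
    """Split a list into fixed-size batches."""
    return [items[i : i + size] for i in range(0, len(items), size)]

def copy_vectors(source, target, vector_ids, batch_size=100):
    """Fetch vectors from the source index and upsert them into the target.

    `source` and `target` are index handles exposing fetch/upsert, as the
    Pinecone Python client does. `vector_ids` should come from your own
    records, since listing ids server-side is not available on every
    index type.
    """
    for id_batch in chunked(vector_ids, batch_size):
        fetched = source.fetch(ids=id_batch)
        vectors = [
            {"id": vid, "values": rec.values, "metadata": rec.metadata}
            for vid, rec in fetched.vectors.items()
        ]
        if vectors:
            target.upsert(vectors=vectors)

# Usage (requires a real API key and existing indexes):
#   from pinecone import Pinecone
#   pc = Pinecone(api_key="YOUR_API_KEY")
#   source = pc.Index("my-index-pods")        # existing pod-based index
#   target = pc.Index("my-index-serverless")  # new serverless index
#   copy_vectors(source, target, vector_ids=load_ids_from_your_db())
```

Because the source index keeps serving reads during the copy, the cutover in Step 3 can happen whenever you have validated the new index.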

Frequently Asked Questions

Are pods still supported?

Pods are Pinecone's legacy architecture. They are still supported for existing users, but all new indexes should use the serverless architecture. Pinecone's active development, algorithm improvements, and performance optimizations are focused on the serverless platform.

Should I use pods or serverless for a new project?

Always use serverless for new projects. It is Pinecone's current architecture; it offers automatic scaling, background algorithm optimization, and full-fidelity vector storage, and it eliminates the operational overhead of managing pods.

Can I migrate from pods to serverless without downtime?

Yes. Migration involves creating a new serverless index, exporting data from your pod-based index, and re-ingesting it into the serverless index. Your pod-based index remains available during migration, so there is no downtime. The query and upsert APIs are the same; only the index name or host changes.

What is the difference between pods and serverless?

Pods coupled compute and storage in fixed virtual machines that you provisioned and managed. Serverless decouples storage (object storage such as S3) from compute (stateless query executors), scales automatically, applies algorithm upgrades transparently, and eliminates manual infrastructure management. Serverless is the current architecture; pods are legacy.

Are pods deprecated?

Pods are legacy infrastructure. While existing pod-based indexes continue to be supported, Pinecone's active development and optimization efforts are focused on the serverless architecture. New users should start with serverless, and existing pod users are encouraged to migrate.

Get Started

For a complete technical explanation of how Pinecone's serverless architecture works, see How Pinecone Works. For migration assistance, contact our solutions team at pinecone.io/contact.

Start building knowledgeable AI today

Migrate from pods to serverless and unlock automatic scaling, algorithm upgrades, and full-fidelity vector storage, with zero operational overhead.