Manage indexes

An index is the highest-level organizational unit of vector data in Pinecone. It accepts and stores vectors, serves queries over the vectors it contains, and does other vector operations over its contents. In this section, we explain how you can get a list of your indexes, create an index, delete an index, and describe an index.

Warning

Indexes on the Starter (free) plan are deleted after 7 days of inactivity. To prevent this, send any API request or log in to the console; either counts as activity.

Getting information on your indexes

List all your Pinecone indexes:

Python:
pinecone.list_indexes()

curl:
curl -i https://controller.YOUR_ENVIRONMENT.pinecone.io/databases \
  -H 'Api-Key: YOUR_API_KEY'

Get the configuration and current status of an index named "example-index":

Python:
pinecone.describe_index("example-index")

curl:
curl -i -X GET https://controller.YOUR_ENVIRONMENT.pinecone.io/databases/example-index \
  -H 'Api-Key: YOUR_API_KEY'
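
If you want to call the REST endpoints from Python without the client library, the two curl commands above can be sketched with the standard library. This is a minimal sketch under stated assumptions: controller_request is a hypothetical helper (it is not part of the Pinecone client), and it only builds the request. Sending it requires a real environment name and API key.

```python
import json
import urllib.request


def controller_request(environment, api_key, path="", method="GET", body=None):
    """Build (but do not send) a request to the /databases endpoint.

    Hypothetical helper: it mirrors the URL and headers used in the
    curl examples and is not part of the Pinecone client library.
    """
    url = f"https://controller.{environment}.pinecone.io/databases{path}"
    headers = {"Api-Key": api_key}
    data = None
    if body is not None:
        # JSON bodies are only needed for POST and PATCH requests.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# List all indexes.
list_req = controller_request("YOUR_ENVIRONMENT", "YOUR_API_KEY")

# Describe a single index.
describe_req = controller_request(
    "YOUR_ENVIRONMENT", "YOUR_API_KEY", path="/example-index"
)

print(list_req.get_method(), list_req.full_url)
print(describe_req.get_method(), describe_req.full_url)

# To send a request, replace the placeholders and uncomment:
# with urllib.request.urlopen(list_req) as response:
#     print(response.read().decode("utf-8"))
```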

Creating an index

The simplest way to create an index is as follows. This gives you an index with a single pod that will perform approximate nearest neighbor (ANN) search using cosine similarity:

Python:
pinecone.create_index("example-index", dimension=128)

curl:
curl -i -X POST https://controller.YOUR_ENVIRONMENT.pinecone.io/databases \
  -H 'Api-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "example-index",
    "dimension": 128
  }'

A more complex index can be created as follows. This creates an index that measures similarity by Euclidean distance and runs on 4 s1 (storage-optimized) pods:

Python:
pinecone.create_index("example-index", dimension=128, metric="euclidean", pods=4, pod_type="s1")

curl:
curl -i -X POST https://controller.YOUR_ENVIRONMENT.pinecone.io/databases \
  -H 'Api-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "example-index",
    "dimension": 128,
    "metric": "euclidean",
    "pods": 4,
    "pod_type": "s1"
  }'
  }'

For the full list of parameters available to customize an index, see the create_index API reference.
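
The Python and curl examples above pass the same settings, which makes it easy for the two to drift apart. One way to keep them aligned is to build the JSON body in one place. The following is a minimal sketch: create_index_body is a hypothetical helper, and it only covers the parameters shown in this section.

```python
import json


def create_index_body(name, dimension, metric=None, pods=None, pod_type=None):
    """Return the JSON body for a create-index request as a dict.

    Hypothetical helper: optional settings are left out of the body
    when they are not given, as in the simplest example above.
    """
    body = {"name": name, "dimension": dimension}
    if metric is not None:
        body["metric"] = metric
    if pods is not None:
        body["pods"] = pods
    if pod_type is not None:
        body["pod_type"] = pod_type
    return body


# The simplest index: only a name and a dimension.
simple_body = create_index_body("example-index", 128)

# The more complex index: Euclidean distance on 4 s1 pods.
complex_body = create_index_body(
    "example-index", 128, metric="euclidean", pods=4, pod_type="s1"
)

print(json.dumps(simple_body, indent=2))
print(json.dumps(complex_body, indent=2))
```

The printed JSON can be passed to curl with -d, or the same keyword arguments can be passed to pinecone.create_index.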

Pods and pod types

Pods are pre-configured units of hardware for running a Pinecone service. Each index runs on one or more pods. Generally, more pods mean more storage capacity, lower latency, and higher throughput.

Use the usage estimator to calculate the minimum number of pods required for your index.

Users on the Starter (free) plan are limited to 1 p1 pod.

P1 pods

These performance-optimized pods provide very low query latencies, but hold fewer vectors per pod than s1 pods. They are ideal for applications with extremely low latency requirements (<200ms).

Each p1 pod has enough capacity for around 1M vectors of 768 dimensions.

S1 pods

These storage-optimized pods provide large storage capacity and lower overall costs, but have slightly higher query latencies than p1 pods. They are ideal for very large indexes with moderate or relaxed latency requirements.

Each s1 pod has enough capacity for around 5M vectors of 768 dimensions.

S1 pods are priced differently from p1 pods and require a Standard or Enterprise account. See pricing for more details.
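
As a rough illustration of the capacity figures above, the following sketch estimates a minimum pod count from a vector count. It is not the usage estimator: estimate_min_pods is a hypothetical helper, it assumes 768-dimensional vectors, and it ignores metadata size and any other factor the real estimator may take into account.

```python
import math

# Approximate per-pod capacities quoted above, for 768-dimensional vectors.
VECTORS_PER_POD_768 = {"p1": 1_000_000, "s1": 5_000_000}


def estimate_min_pods(num_vectors, pod_type="p1"):
    """Rough lower bound on pods needed to hold num_vectors (768 dims).

    Hypothetical helper: every index needs at least one pod, and a
    partially filled pod still counts as a whole pod.
    """
    capacity = VECTORS_PER_POD_768[pod_type]
    return max(1, math.ceil(num_vectors / capacity))


print(estimate_min_pods(3_500_000, "p1"))  # 4
print(estimate_min_pods(3_500_000, "s1"))  # 1
```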

Distance metrics

You can choose from different metrics when creating a vector index:

  • euclidean

    • This measures the straight-line distance between two data points. It is one of the most commonly used distance metrics. For an example, see our image similarity search example.
    • When you use metric='euclidean', the most similar results are those with the lowest score.
  • cosine

    • This is often used to find similarities between different documents. The advantage is that the scores are normalized to the [-1, 1] range.
  • dotproduct

    • This multiplies two vectors, and the result tells you how similar they are. The more positive the result, the closer the two vectors are in direction.

Depending on your application, some metrics have better recall and precision performance than others. For more information, see: What is Vector Similarity Search?
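
To make the three metrics concrete, here is a minimal pure-Python sketch of the textbook formulas on two-dimensional vectors. The function names and sample vectors are illustrative, and the scores Pinecone returns may be computed or scaled differently from these definitions.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; larger means more aligned."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Dot product of the two vectors after scaling each to unit length."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


query = [1.0, 0.0]
same_direction = [2.0, 0.0]  # points the same way as query, but longer
orthogonal = [0.0, 3.0]      # at a right angle to query

for name, vector in [("same_direction", same_direction), ("orthogonal", orthogonal)]:
    print(
        name,
        euclidean_distance(query, vector),
        cosine_similarity(query, vector),
        dot_product(query, vector),
    )
```

Note how cosine similarity ignores vector length (same_direction scores 1.0 even though it is twice as long as the query), while Euclidean distance and dot product are both affected by it.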

Replicas

You can increase the number of replicas for your index to increase throughput (QPS). All indexes start with replicas=1.

Python:
pinecone.scale_index("example-index", replicas=4)

curl:
curl -i -X PATCH https://controller.YOUR_ENVIRONMENT.pinecone.io/databases/example-index \
  -H 'Api-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "replicas": 4
  }'

See the scale_index API reference for more details.

See the Pinecone API Reference documentation for more information on Pinecone's API endpoints and schemas.

Selective metadata indexing

By default, Pinecone indexes all metadata. When you index metadata fields, you can filter vector search queries using those fields. When you store metadata fields without indexing them, you keep memory utilization low, especially when you have many unique metadata values, and therefore can fit more vectors per pod.

When you create a new index, you can specify which metadata fields to index using the metadata_config parameter.

Python:
metadata_config = {
    "indexed": ["metadata-field-name"]
}

pinecone.create_index("example-index", dimension=128,
                      metadata_config=metadata_config)

curl:
curl -i -X POST https://controller.YOUR_ENVIRONMENT.pinecone.io/databases \
  -H 'Api-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "example-index",
    "dimension": 128,
    "metadata_config": {
      "indexed": ["metadata-field-name"]
    }
  }'

The value for the metadata_config parameter is a JSON object containing the names of the metadata fields to index.

{
    "indexed": [
        "metadata-field-1",
        "metadata-field-2",
        "metadata-field-n"
    ]
}

When you provide a metadata_config object, Pinecone only indexes the metadata fields present in that object: any metadata fields absent from the metadata_config object are not indexed.

When a metadata field is indexed, you can filter your queries using that metadata field; if a metadata field is not indexed, metadata filtering ignores that field.

Examples

The following example creates an index that only indexes the genre metadata field. Queries against this index that filter for the genre metadata field may return results; queries that filter for other metadata fields behave as though those fields do not exist.

Python:
metadata_config = {
    "indexed": ["genre"]
}

pinecone.create_index("example-index", dimension=128,
                      metadata_config=metadata_config)

curl:
curl -i -X POST https://controller.YOUR_ENVIRONMENT.pinecone.io/databases \
  -H 'Api-Key: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "example-index",
    "dimension": 128,
    "metadata_config": {
      "indexed": ["genre"]
    }
  }'
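
The filtering behavior described above can be pictured with a small in-memory model. This is a toy sketch, not Pinecone's implementation: it supports equality filters only, the year field and the sample records are hypothetical, and it assumes that a filter on a non-indexed field matches nothing because the field is treated as absent.

```python
def matches(metadata, query_filter, indexed_fields):
    """Check an equality-only filter against the indexed metadata fields.

    Toy model: fields outside indexed_fields are hidden from filtering,
    so a filter on them behaves as though the field does not exist.
    """
    visible = {k: v for k, v in metadata.items() if k in indexed_fields}
    return all(
        field in visible and visible[field] == value
        for field, value in query_filter.items()
    )


def filter_ids(records, query_filter, indexed_fields):
    """Return the ids of records whose indexed metadata satisfies the filter."""
    return [
        record["id"]
        for record in records
        if matches(record["metadata"], query_filter, indexed_fields)
    ]


# Hypothetical records; only "genre" is indexed, as in the example above.
records = [
    {"id": "a", "metadata": {"genre": "drama", "year": 2020}},
    {"id": "b", "metadata": {"genre": "comedy", "year": 2020}},
]
indexed_fields = {"genre"}

print(filter_ids(records, {"genre": "drama"}, indexed_fields))  # ['a']
print(filter_ids(records, {"year": 2020}, indexed_fields))      # []
```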

Deleting an index

This operation will delete all of the data and the computing resources associated with the index.

Caution

When you create an index, it runs as a service until you delete it. Users are billed for running indexes, so we recommend you delete any indexes you're not using. This will minimize your costs.

Delete a Pinecone index named "example-index":

Python:
pinecone.delete_index("example-index")

curl:
curl -i -X DELETE https://controller.YOUR_ENVIRONMENT.pinecone.io/databases/example-index \
  -H 'Api-Key: YOUR_API_KEY'