Product Recommendation Engine

Learn how to build a product recommendation engine using collaborative filtering and Pinecone.

In this example, we will generate product recommendations for ecommerce customers based on previous orders and trending items. This example covers preparing the vector embeddings, creating and deploying the Pinecone service, writing data to Pinecone, and finally querying Pinecone to receive a ranked list of recommended products.

Data Preparation

Import Python Libraries

import time
import random

import numpy as np
import pandas as pd

import scipy.sparse as sparse

Load the (Example) Instacart Data

We are going to use the Instacart Market Basket Analysis dataset for this task.

The data used throughout this example is a set of files describing customers' orders over time. The main focus is on the orders.csv file, where each line represents a relation between a user and an order. In other words, each line has information on user_id (the user who made the order) and order_id. Note that there is no information about products in this table. Product information related to specific orders is stored in the order_products__*.csv files.

order_products_train = pd.read_csv('data/order_products__train.csv')
order_products_prior = pd.read_csv('data/order_products__prior.csv')
products = pd.read_csv('data/products.csv')
orders = pd.read_csv('data/orders.csv')

# DataFrame.append was removed in pandas 2.0; pd.concat works on older versions too
order_products = pd.concat([order_products_train, order_products_prior])

Preparing data for the model

The collaborative filtering model used in this example requires only users’ historical preferences on a set of items. As there is no explicit rating in the data we are using, the number of orders in which a user bought a product can represent a “confidence” in how strong the interaction between that user and that product was.

The data dataframe stores these per-user, per-product order counts and is the basis for the model.

customer_order_products = pd.merge(orders, order_products, how='inner',on='order_id')

# creating a table with "confidences"
data = customer_order_products.groupby(['user_id', 'product_id'])[['order_id']].count().reset_index()
data.columns=["user_id", "product_id", "total_orders"]
data.product_id = data.product_id.astype('int64')

# Create a lookup frame so we can get the product names back in readable form later.
products_lookup = products[['product_id', 'product_name']].drop_duplicates()
products_lookup['product_id'] = products_lookup.product_id.astype('int64')
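
To make the "confidence" idea concrete, here is a minimal sketch on a made-up toy table (the user, order, and product IDs are invented for illustration and do not come from the Instacart data). Counting the orders per (user, product) pair yields the same kind of total_orders column as above.

```python
import pandas as pd

# Toy stand-in for customer_order_products; all IDs are made up.
toy = pd.DataFrame({
    "user_id":    [1, 1, 1, 2, 2],
    "order_id":   [10, 11, 12, 20, 21],
    "product_id": [500, 500, 501, 500, 500],
})

# Count how many orders contain each product, per user.
toy_conf = (
    toy.groupby(["user_id", "product_id"])[["order_id"]]
       .count()
       .reset_index()
)
toy_conf.columns = ["user_id", "product_id", "total_orders"]
print(toy_conf)
```

User 1 bought product 500 in two orders and product 501 in one, so the first pair ends up with the higher confidence.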

We will create two prototype users here and add them to our data dataframe. Each user buys only specific products:

  • The first user will be buying only Mineral Water
  • The second user will be buying baby products: No More Tears Baby Shampoo and Baby Wash & Shampoo

These users will be used later for querying and for examining the model results.

data_new = pd.DataFrame([[data.user_id.max() + 1, 22802, 97],
                         [data.user_id.max() + 2, 26834, 89],
                         [data.user_id.max() + 2, 12590, 77]
                        ], columns=['user_id', 'product_id', 'total_orders'])
data_new
   user_id  product_id  total_orders
0   206210       22802            97
1   206211       26834            89
2   206211       12590            77
data = pd.concat([data, data_new]).reset_index(drop=True)
data.tail()
          user_id  product_id  total_orders
13863744   206209       48697             1
13863745   206209       48742             2
13863746   206210       22802            97
13863747   206211       26834            89
13863748   206211       12590            77

In the next step, we will first extract user and item unique ids, in order to create a CSR (Compressed Sparse Row) matrix.

users = list(np.sort(data.user_id.unique()))
items = list(np.sort(products.product_id.unique()))
purchases = list(data.total_orders)

# map zero-based index positions to user IDs
index_to_user = pd.Series(users)

# reverse mapping from user ID to matrix position; the sparse matrices below
# are indexed by raw user_id (which starts at 1), hence the + 1
user_to_index = pd.Series(data=index_to_user.index + 1, index=index_to_user.values)

# map zero-based index positions to item IDs
index_to_item = pd.Series(items)

# reverse mapping from item ID to index position
item_to_index = pd.Series(data=index_to_item.index, index=index_to_item.values)

# Get the rows and columns for our new matrix
products_rows = data.product_id.astype(int)
users_cols = data.user_id.astype(int)

# Create a sparse matrix for our users and products containing number of purchases
sparse_product_user = sparse.csr_matrix((purchases, (products_rows, users_cols)), shape=(len(items) + 1, len(users) + 1))
sparse_product_user.data = np.nan_to_num(sparse_product_user.data, copy=False)

sparse_user_product = sparse.csr_matrix((purchases, (users_cols, products_rows)), shape=(len(users) + 1, len(items) + 1))
sparse_user_product.data = np.nan_to_num(sparse_user_product.data, copy=False)
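
The (values, (rows, cols)) form of the csr_matrix constructor used above can be easier to follow on a tiny example. The sketch below uses invented IDs and counts; as in the code above, raw IDs serve directly as row and column positions, which is why each dimension gets one extra slot.

```python
import numpy as np
import scipy.sparse as sparse

# Made-up interactions: product_id, user_id, total_orders.
toy_products = np.array([1, 2, 2])
toy_users = np.array([1, 1, 3])
toy_counts = np.array([5, 2, 7])

# Raw IDs are used as positions, so each dimension needs max_id + 1 slots
# (position 0 stays empty).
toy_product_user = sparse.csr_matrix(
    (toy_counts, (toy_products, toy_users)), shape=(3, 4)
)

# The user-by-product view is simply the transpose.
toy_user_product = toy_product_user.T.tocsr()

print(toy_product_user.toarray())
print(toy_user_product[3, 2])  # orders of product 2 by user 3
```

Only the three nonzero entries are stored, which is what makes this representation practical for millions of user and product pairs.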

Implicit Model

In this section we will demonstrate how to create and train a recommender model using the implicit library. The recommendation model is based on the algorithms described in the paper "Collaborative Filtering for Implicit Feedback Datasets", with performance optimizations described in "Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering".

!pip install --quiet -U implicit
import implicit
from implicit import evaluation 

#split data into train and test sets
train_set, test_set = evaluation.train_test_split(sparse_product_user, train_percentage=0.9)



# initialize a model
model = implicit.als.AlternatingLeastSquares(factors=128, 
                                             regularization = 0.05, 
                                             iterations = 50,
                                             num_threads=1)

alpha_val = 15
train_set = (train_set * alpha_val).astype('double')


# train the model on a sparse matrix of item/user/confidence weights
model.fit(train_set, show_progress = True)
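
For orientation, the first paper cited in this section formulates the problem roughly as follows. Here r_ui is the raw interaction count (total_orders in this example), x_u and y_i are the user and item factor vectors, λ is the regularization weight, and α is the confidence scaling factor (the role played by alpha_val above). This is the paper's formulation; the library's internal weighting may differ in detail.

```latex
p_{ui} = \begin{cases} 1 & r_{ui} > 0 \\ 0 & r_{ui} = 0 \end{cases},
\qquad
c_{ui} = 1 + \alpha\, r_{ui}

\min_{x_*,\, y_*} \sum_{u,i} c_{ui}\,\bigl(p_{ui} - x_u^{\top} y_i\bigr)^2
  + \lambda \Bigl( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \Bigr)
```

Alternating least squares minimizes this by fixing the item vectors and solving for the user vectors, then fixing the user vectors and solving for the item vectors, and repeating.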

We will evaluate the model using the library's built-in ranking_metrics_at_k function:

test_set = (test_set * alpha_val).astype('double')
evaluation.ranking_metrics_at_k(model, train_set.T, test_set.T, K=100,
                         show_progress=True, num_threads=1)
{'auc': 0.6544273952545997,
 'map': 0.04470542133452368,
 'ndcg': 0.1442704183587715,
 'precision': 0.2737463275716011}
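
As a rough illustration of what a ranking metric such as precision at K measures, here is a simplified sketch. It is not the library's implementation (which may normalize differently), and the function name and item IDs are made up for this example.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are in the relevant set."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(top_k)

# Ranked recommendations for one user, and the items held out for testing.
recommended = [11, 42, 7, 99, 3]
held_out = {42, 3, 8}

print(precision_at_k(recommended, held_out, k=5))
```

Two of the five recommended items appear in the held-out set, so the value here is 0.4.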

This is what item and user factors look like. These vectors will be stored in our vector index later and used for recommendation.

model.item_factors[1:3]
array([[ 0.00622083,  0.00874168,  0.00020163,  0.01146342,  0.00097009,
         0.00164262,  0.02481926,  0.01256867,  0.00841584,  0.01532503,
        -0.00843535,  0.01044653, -0.01107129,  0.01580254,  0.02204889,
         0.00823004,  0.00148215, -0.00425257,  0.0006036 , -0.01217703,
         0.00103806,  0.01606779,  0.01002141,  0.00690033, -0.00114819,
        -0.01123063,  0.01372802, -0.00065527, -0.007907  ,  0.01966622,
         0.01133489,  0.0198267 ,  0.00584944,  0.00039154,  0.02089761,
        -0.00854403,  0.00150728,  0.01410823,  0.01639647,  0.00110211,
        -0.01929718,  0.00969881,  0.00126109,  0.00643161,  0.00043924,
         0.00280298,  0.00794108,  0.01625498, -0.00114213,  0.00724161,
        -0.00866053, -0.00145312,  0.01239553,  0.01141248, -0.00264722,
         0.00298474, -0.00605761, -0.01180423,  0.02440505,  0.02707466,
         0.00342609,  0.02222574, -0.01922864, -0.00498338,  0.0041606 ,
         0.00382998,  0.00388185,  0.01277138,  0.00294698,  0.01419339,
         0.00355842, -0.01140964, -0.0032904 ,  0.02107666, -0.01220581,
         0.00830553, -0.00360495,  0.00796246,  0.01355901,  0.01906696,
         0.01876887, -0.01135404,  0.0003751 ,  0.0067894 ,  0.00827718,
        -0.00656992,  0.00786064,  0.00747358,  0.0002797 ,  0.0191397 ,
         0.01021724,  0.00180425,  0.01149296,  0.01268673,  0.01666239,
         0.01081016,  0.00498641,  0.0114116 ,  0.00209432, -0.00268642,
         0.0058637 ,  0.00285215,  0.00899839,  0.01169245,  0.002553  ,
         0.02069743, -0.00726644,  0.00561864,  0.01457589,  0.01074686,
        -0.00163069, -0.01503158,  0.00594237,  0.01657531,  0.01839461,
         0.00363692,  0.00524   , -0.00506192,  0.00556242,  0.01142012,
         0.00242651, -0.01567049,  0.02473967,  0.01068216,  0.01182867,
         0.01121255,  0.00763299,  0.00207949],
       [ 0.00361272,  0.00541232,  0.00397553,  0.00553514,  0.00116621,
         0.00390807,  0.00730027,  0.00350796,  0.00229085,  0.00210324,
         0.00080144,  0.00127138, -0.00048949, -0.00078521,  0.00585358,
         0.00386293,  0.00446024,  0.00679846,  0.0049519 ,  0.00504748,
         0.00375845,  0.00152915,  0.00525329,  0.00219877,  0.00118681,
         0.00691467,  0.00149077,  0.0035372 ,  0.00329178,  0.00739743,
         0.00356595,  0.00461424,  0.00712844,  0.00159762,  0.00113318,
         0.00270729,  0.00356044,  0.00490391,  0.00446271,  0.00489495,
         0.00795775,  0.00176176,  0.00814594,  0.00773027,  0.00283073,
         0.00196886,  0.00400983, -0.00023961,  0.00470835,  0.00268707,
         0.00391872,  0.00046707,  0.00443491,  0.00776493,  0.00522574,
         0.00452722,  0.00431347,  0.00668673,  0.00875636,  0.00314672,
         0.002721  ,  0.00351624,  0.00649229,  0.00658003,  0.00503745,
         0.00323636,  0.00340225,  0.00443649,  0.00495935,  0.00396223,
         0.00621412,  0.00180953,  0.00259974,  0.00430223,  0.00351434,
         0.00204523,  0.00220418,  0.006047  ,  0.00161205,  0.00288573,
         0.00205453,  0.00407582,  0.00010012, -0.00087068,  0.00547411,
         0.0062386 ,  0.00430007,  0.0053917 ,  0.00213009,  0.00239013,
         0.00334436,  0.0088663 ,  0.00672663,  0.00554007,  0.00510628,
         0.00365183,  0.00409472,  0.00557497,  0.00680547,  0.00468006,
         0.00512921,  0.0024056 ,  0.00402409,  0.00530454,  0.00437682,
         0.00492994,  0.00342923,  0.00449692,  0.00633973,  0.00289778,
         0.00733405,  0.00450299,  0.00741144,  0.00301688,  0.00798721,
         0.00777207,  0.00523104,  0.00608047,  0.00483847,  0.0066071 ,
         0.00506872,  0.0056828 ,  0.00508885,  0.00714068,  0.00070709,
         0.00316286,  0.00627841,  0.00872395]], dtype=float32)
model.user_factors[1:3]
array([[ 2.05956388e+00,  1.03066587e+00,  2.27296278e-02,
         5.04728675e-01,  1.05790019e+00, -4.77940768e-01,
        -2.85392487e-03,  4.62690033e-02,  4.81819026e-02,
         4.50731069e-03,  1.64774302e-02,  5.73487103e-01,
         1.61625969e+00,  1.60310090e+00,  1.46418881e+00,
         9.76078287e-02, -1.82325274e-01, -4.96896893e-01,
        -8.85763824e-01, -1.12253749e+00,  5.43332458e-01,
         1.69883859e+00, -9.17470381e-02, -4.98051852e-01,
         5.55772662e-01, -9.08166468e-01,  7.83680826e-02,
         1.17396832e-01, -3.07566571e+00,  1.77771485e+00,
        -1.43003717e-01, -2.14999646e-01,  9.12185788e-01,
        -7.33593618e-03,  5.69309473e-01, -1.43774307e+00,
        -4.77365911e-01,  7.98923194e-01,  1.76864147e+00,
        -1.05115807e+00,  1.20567513e+00,  2.07274079e+00,
         8.38997602e-01, -3.73533100e-01, -7.20990598e-02,
         1.17872655e+00,  7.48169661e-01,  4.35547233e-01,
         5.08942008e-01, -8.25820863e-02, -1.05478811e+00,
         1.43872929e+00,  1.49440026e+00,  5.28426766e-01,
        -1.20120656e+00, -2.83354431e-01, -3.48546743e-01,
        -1.18344009e+00,  6.30475104e-01,  3.54865313e-01,
        -1.06830275e+00,  5.94896376e-01, -2.02387333e+00,
         3.43774796e-01, -7.86069930e-01, -1.40384817e+00,
        -1.34640768e-01, -6.90365553e-01,  1.28094208e+00,
         7.72092402e-01, -3.20080161e-01, -9.73542750e-01,
        -2.07741454e-01,  1.26082337e+00, -6.03238940e-01,
        -2.77531087e-01, -5.57969034e-01,  1.08366840e-01,
         1.29822493e-01,  9.93184268e-01,  1.07663655e+00,
        -8.59428227e-01, -5.90113223e-01, -1.29793453e+00,
         7.46172369e-01, -2.55118966e-01, -2.19995994e-03,
         1.21766829e+00, -1.28226900e+00, -5.42479813e-01,
         8.77720594e-01, -1.03940058e+00,  3.27266216e-01,
        -1.30056214e+00,  1.44443536e+00, -1.07019377e+00,
        -2.26064250e-01, -3.33775997e-01, -1.66298103e+00,
         8.44688237e-01, -4.06899989e-01,  5.30075312e-01,
        -5.71435332e-01, -3.81951898e-01,  6.83442056e-01,
         1.83441651e+00, -1.70589864e+00, -5.83450139e-01,
         6.37444973e-01,  8.04152310e-01,  2.90158331e-01,
        -3.22059810e-01, -1.33641481e+00,  1.09643951e-01,
         8.90391916e-02, -7.69055665e-01, -3.01431835e-01,
         4.35697548e-02, -6.48079455e-01,  8.60126317e-01,
        -8.62364709e-01, -2.83923388e-01,  1.72379971e+00,
        -1.20301694e-01,  9.27006781e-01, -8.79757777e-02,
        -3.04306567e-01, -3.34203660e-01],
       [ 2.51648211e+00,  7.42882729e-01, -2.40415782e-01,
         1.77723575e+00,  2.94037312e-01, -1.53213894e+00,
        -1.99559462e+00,  3.30232954e+00,  2.07880545e+00,
        -3.54836404e-01, -8.20263624e-01,  7.00577140e-01,
         3.86924952e-01,  4.87244844e-01,  5.76600611e-01,
         2.43876839e+00, -4.48292680e-03, -2.14092350e+00,
         6.12611115e-01,  1.11285603e+00, -2.22536993e+00,
         6.19186044e-01,  2.32936382e+00,  7.81414032e-01,
         2.20310807e+00,  1.70728886e+00,  9.98786509e-01,
         2.17993450e+00, -1.10910702e+00,  7.74712563e-01,
        -1.32214510e+00,  5.63690007e-01, -3.84416461e-01,
         8.27995241e-01, -4.83949006e-01, -8.21544886e-01,
        -2.12973142e+00,  7.56460242e-03, -9.60952818e-01,
        -9.00639892e-01, -5.32640398e-01, -1.54721355e+00,
         1.28844106e+00, -2.18590665e+00,  7.59613886e-02,
         1.41904652e+00,  2.00145459e+00, -5.61082037e-03,
        -8.48800004e-01, -3.39667296e+00,  4.72937316e-01,
         3.90724599e-01,  2.25583315e+00, -1.64154947e+00,
         1.09247971e+00,  1.22169948e+00, -1.77249923e-01,
        -1.72610426e+00, -1.00338519e+00,  1.57657874e+00,
         5.66755593e-01,  9.59587276e-01,  1.20562530e+00,
         5.56555271e-01, -1.73438549e-01,  1.37680650e+00,
         4.84442592e-01, -7.84019530e-01,  1.46761346e+00,
         2.41437510e-01, -1.45273006e+00,  2.36359453e+00,
        -7.23895729e-01, -5.48329949e-01,  1.63029408e+00,
         6.01357341e-01,  1.00652158e+00, -1.26761377e+00,
         1.69352138e+00, -1.16331351e+00,  1.76305699e+00,
         1.09777987e+00,  1.88910949e+00, -1.34760678e-01,
        -5.33491485e-02, -1.41164458e+00,  8.12173724e-01,
        -1.32523164e-01, -1.48728597e+00, -5.82327962e-01,
         9.99465764e-01, -6.27849877e-01, -3.76695085e+00,
        -1.78728449e+00,  7.25734711e-01, -8.02710533e-01,
         1.77324498e+00, -7.34931648e-01, -1.28798640e+00,
         2.06032351e-01, -1.44852668e-01, -7.73492306e-02,
         1.36506236e+00,  7.98806548e-02,  2.30095610e-01,
         7.99459219e-02,  9.00573730e-02, -2.27577433e-01,
        -4.09351349e+00,  5.53026319e-01,  1.14314580e+00,
         2.26423240e+00, -1.01395875e-01,  1.58043361e+00,
        -2.02454543e+00, -7.69232273e-01,  5.05997539e-01,
        -1.70164680e+00,  1.40148854e+00, -3.52646708e-02,
         1.68559182e+00,  2.71474093e-01,  1.80929244e+00,
        -1.39540589e+00,  8.89791310e-01, -5.15011787e-01,
        -6.14166200e-01, -7.38051891e-01]], dtype=float32)
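
Under the usual matrix-factorization convention, a user's predicted affinity for an item is the dot product of the two factor vectors. The sketch below illustrates this with tiny made-up vectors (3 dimensions rather than 128); the names and values are for illustration only.

```python
import numpy as np

# One made-up user vector and three made-up item vectors.
toy_user = np.array([1.0, 0.0, 2.0], dtype=np.float32)
toy_items = np.array([
    [0.5, 0.1, 0.0],   # item 0
    [0.0, 0.3, 1.0],   # item 1
    [1.0, 1.0, 1.0],   # item 2
], dtype=np.float32)

scores = toy_items @ toy_user      # one score per item
ranking = np.argsort(-scores)      # indices of items, best first

print(scores)
print(ranking)
```

Item 2 scores highest (3.0), followed by item 1 (2.0) and item 0 (0.5).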

Configure Pinecone

Install and set up Pinecone.

!pip install --quiet -U pinecone-client
import pinecone 

# Load Pinecone API key

api_key = '<YOUR API KEY HERE>'
pinecone.init(api_key=api_key)

Get a Pinecone API key if you don’t have one.

Create an Index

#Set a name for your index
index_name = 'shopping-cart-demo'
# check whether an index with the same name already exists
if index_name in pinecone.list_indexes():
    pinecone.delete_index(index_name)

pinecone.create_index(name=index_name,metric='cosine')
{'msg': '', 'success': True}

Connect to the new index

index = pinecone.Index(name=index_name)

Load Data

We now prepare all items (products that one can buy) for upload and display a few examples of products and their vector representations.

# Get all of the items
all_items = [title for title in products_lookup['product_name']]

# Transform items into factors
items_factors = model.item_factors

# Prepare item factors for upload
items_to_insert = list(zip(all_items, items_factors[1:]))
display(items_to_insert[:2])

[('Chocolate Sandwich Cookies',
  array([ 0.00622083,  0.00874168,  0.00020163,  0.01146342,  0.00097009,
          0.00164262,  0.02481926,  0.01256867,  0.00841584,  0.01532503,
         -0.00843535,  0.01044653, -0.01107129,  0.01580254,  0.02204889,
          0.00823004,  0.00148215, -0.00425257,  0.0006036 , -0.01217703,
          0.00103806,  0.01606779,  0.01002141,  0.00690033, -0.00114819,
         -0.01123063,  0.01372802, -0.00065527, -0.007907  ,  0.01966622,
          0.01133489,  0.0198267 ,  0.00584944,  0.00039154,  0.02089761,
         -0.00854403,  0.00150728,  0.01410823,  0.01639647,  0.00110211,
         -0.01929718,  0.00969881,  0.00126109,  0.00643161,  0.00043924,
          0.00280298,  0.00794108,  0.01625498, -0.00114213,  0.00724161,
         -0.00866053, -0.00145312,  0.01239553,  0.01141248, -0.00264722,
          0.00298474, -0.00605761, -0.01180423,  0.02440505,  0.02707466,
          0.00342609,  0.02222574, -0.01922864, -0.00498338,  0.0041606 ,
          0.00382998,  0.00388185,  0.01277138,  0.00294698,  0.01419339,
          0.00355842, -0.01140964, -0.0032904 ,  0.02107666, -0.01220581,
          0.00830553, -0.00360495,  0.00796246,  0.01355901,  0.01906696,
          0.01876887, -0.01135404,  0.0003751 ,  0.0067894 ,  0.00827718,
         -0.00656992,  0.00786064,  0.00747358,  0.0002797 ,  0.0191397 ,
          0.01021724,  0.00180425,  0.01149296,  0.01268673,  0.01666239,
          0.01081016,  0.00498641,  0.0114116 ,  0.00209432, -0.00268642,
          0.0058637 ,  0.00285215,  0.00899839,  0.01169245,  0.002553  ,
          0.02069743, -0.00726644,  0.00561864,  0.01457589,  0.01074686,
         -0.00163069, -0.01503158,  0.00594237,  0.01657531,  0.01839461,
          0.00363692,  0.00524   , -0.00506192,  0.00556242,  0.01142012,
          0.00242651, -0.01567049,  0.02473967,  0.01068216,  0.01182867,
          0.01121255,  0.00763299,  0.00207949], dtype=float32)),
 ('All-Seasons Salt',
  array([ 0.00361272,  0.00541232,  0.00397553,  0.00553514,  0.00116621,
          0.00390807,  0.00730027,  0.00350796,  0.00229085,  0.00210324,
          0.00080144,  0.00127138, -0.00048949, -0.00078521,  0.00585358,
          0.00386293,  0.00446024,  0.00679846,  0.0049519 ,  0.00504748,
          0.00375845,  0.00152915,  0.00525329,  0.00219877,  0.00118681,
          0.00691467,  0.00149077,  0.0035372 ,  0.00329178,  0.00739743,
          0.00356595,  0.00461424,  0.00712844,  0.00159762,  0.00113318,
          0.00270729,  0.00356044,  0.00490391,  0.00446271,  0.00489495,
          0.00795775,  0.00176176,  0.00814594,  0.00773027,  0.00283073,
          0.00196886,  0.00400983, -0.00023961,  0.00470835,  0.00268707,
          0.00391872,  0.00046707,  0.00443491,  0.00776493,  0.00522574,
          0.00452722,  0.00431347,  0.00668673,  0.00875636,  0.00314672,
          0.002721  ,  0.00351624,  0.00649229,  0.00658003,  0.00503745,
          0.00323636,  0.00340225,  0.00443649,  0.00495935,  0.00396223,
          0.00621412,  0.00180953,  0.00259974,  0.00430223,  0.00351434,
          0.00204523,  0.00220418,  0.006047  ,  0.00161205,  0.00288573,
          0.00205453,  0.00407582,  0.00010012, -0.00087068,  0.00547411,
          0.0062386 ,  0.00430007,  0.0053917 ,  0.00213009,  0.00239013,
          0.00334436,  0.0088663 ,  0.00672663,  0.00554007,  0.00510628,
          0.00365183,  0.00409472,  0.00557497,  0.00680547,  0.00468006,
          0.00512921,  0.0024056 ,  0.00402409,  0.00530454,  0.00437682,
          0.00492994,  0.00342923,  0.00449692,  0.00633973,  0.00289778,
          0.00733405,  0.00450299,  0.00741144,  0.00301688,  0.00798721,
          0.00777207,  0.00523104,  0.00608047,  0.00483847,  0.0066071 ,
          0.00506872,  0.0056828 ,  0.00508885,  0.00714068,  0.00070709,
          0.00316286,  0.00627841,  0.00872395], dtype=float32))]

Insert items into the index service.

print('Index size before upsert:', index.info())

acks = index.upsert(items=[(ii[:64],x) for ii,x in items_to_insert])

print('Index size after upsert:', index.info())
print()

print('Sample upsert responses:')
pd.DataFrame(acks[:3])
Index size before upsert: InfoResult(index_size=0)


Index size after upsert: InfoResult(index_size=49677)

Sample upsert responses:
   id
0  Chocolate Sandwich Cookies
1  All-Seasons Salt
2  Robust Golden Unsweetened Oolong Tea
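
The upsert above truncates each product name to 64 characters and uses the result as the vector ID. Two long names that share their first 64 characters would then map to the same ID, and one vector would presumably replace the other. A small check for this, sketched with made-up names (truncated_id_collisions is a hypothetical helper, not part of any library), could look like this:

```python
from collections import Counter

def truncated_id_collisions(names, max_len=64):
    """Return truncated IDs that are shared by more than one distinct name."""
    counts = Counter(name[:max_len] for name in set(names))
    return {tid: n for tid, n in counts.items() if n > 1}

# Two invented long names that differ only after character 64.
toy_names = ["A" * 70 + " Vanilla", "A" * 70 + " Chocolate", "Mineral Water"]

print(truncated_id_collisions(toy_names))
```

Running the same helper on all_items before the upsert would show whether any products collide after truncation.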

The following helper function is used later to analyse the recommendations. It returns the top N products that a user bought in the past, ranked by the number of orders.

def products_bought_by_user_in_the_past(user_id: int, top: int = 10):
    
    selected = data[data.user_id == user_id].sort_values(by=['total_orders'], ascending=False) 
    
    selected['product_name'] = selected['product_id'].map(products_lookup.set_index('product_id')['product_name'])
    selected = selected[['product_id', 'product_name', 'total_orders']].reset_index(drop=True)
    if selected.shape[0] < top:
        return selected
    
    return selected[:top]
data.tail()
          user_id  product_id  total_orders
13863744   206209       48697             1
13863745   206209       48742             2
13863746   206210       22802            97
13863747   206211       26834            89
13863748   206211       12590            77

Query for Recommendations

We now retrieve the user factors for the prototype users that we created manually for testing purposes, plus one randomly chosen existing user. We also display some of these factors so you can see what they look like.

user_ids = [206210, 206211, 103593]
user_factors = model.user_factors[user_to_index[user_ids]]

display(user_factors[1:])
array([[-1.6957028 ,  1.6948007 ,  1.1281109 , -1.8624253 , -0.01295529,
         0.41422075, -0.04887928, -1.1238362 ,  0.6461375 ,  1.0761728 ,
         0.48219234, -0.1158735 ,  0.19704771,  0.548042  , -1.9548368 ,
        -0.854441  ,  1.5604722 ,  0.65668327,  1.7775067 ,  0.14161143,
         0.12637897, -1.192585  , -0.03127463, -0.6064059 , -0.20703381,
        -1.1387925 ,  0.5587323 , -1.456463  , -0.15781799, -0.22153452,
        -3.0347443 ,  1.0725269 , -0.90342635,  0.93805426,  2.752009  ,
         1.2206806 ,  0.35675663, -0.3706993 , -0.17378844, -1.7452165 ,
         1.8028396 , -0.8711866 ,  0.29263344, -0.605011  ,  0.65558636,
        -1.4160249 , -0.43984538,  2.9496586 , -0.4095888 ,  0.91677475,
         0.26470307,  0.30461687, -1.1963339 ,  0.43830916, -1.2901427 ,
        -0.63025063, -1.8126433 ,  2.3717566 ,  0.7649209 , -1.6137354 ,
        -1.4110433 ,  0.02955647, -1.7383931 ,  2.426386  ,  0.31014746,
        -0.24071512,  0.6608062 ,  0.9864494 , -0.3657122 , -0.6880389 ,
        -1.3747276 ,  1.7448217 , -3.1429775 ,  1.667982  , -0.5088452 ,
        -0.6388357 , -0.6970774 ,  0.46273124, -0.56489336, -0.34625688,
         0.5281177 , -0.46833715,  0.88507086,  1.5959613 , -0.5450506 ,
         0.6233538 ,  0.1369354 ,  0.7783381 , -0.7929994 ,  0.7728444 ,
         1.3493261 , -1.1029574 , -1.1583933 , -1.0181646 , -1.4891635 ,
        -1.1943097 , -0.23805316, -0.531797  , -2.426462  ,  3.5465262 ,
         1.4920706 ,  1.0036546 ,  0.28052935,  1.726973  ,  0.639604  ,
         1.5242133 ,  0.5813025 ,  0.23365884, -1.293975  ,  1.2538519 ,
        -0.6764324 , -0.88901204, -1.0468957 ,  1.7073478 , -1.1485049 ,
        -0.6168143 ,  1.3466699 , -0.5989921 , -0.7310282 ,  0.77216476,
         2.8653984 ,  0.69503015, -1.2812102 , -0.15854032,  0.30414662,
        -0.7775853 ,  1.035563  ,  0.29556623],
       [-0.02376875,  1.7833768 , -0.8564922 ,  3.0098526 , -0.2536913 ,
        -0.92627114, -1.0936354 ,  2.5968904 , -0.89597887, -0.38435826,
         0.73669183,  1.1071353 , -0.32326367, -1.2109493 ,  1.2991474 ,
        -0.50840443,  0.08860902, -0.58359015,  0.7530021 , -1.5837436 ,
         0.13703083,  4.1062417 , -0.17467318, -0.08843535,  0.05084601,
         0.9489649 ,  0.75546736,  0.02652768, -0.5387976 ,  1.7493476 ,
        -0.80722845,  1.6132078 , -0.8338906 ,  1.4556283 ,  0.9582538 ,
         0.0205423 ,  1.8907573 , -0.35851976, -1.620731  ,  0.21423498,
         0.9960194 , -0.79507023,  0.64672726, -1.1162902 , -0.9142073 ,
         1.0978727 , -1.5249863 , -0.79998255,  1.711109  , -0.47507587,
         0.98168623,  0.54857165,  0.40612945, -0.5053595 , -0.45795304,
         1.7788986 ,  0.7139665 ,  0.17143697,  0.42983797,  1.1683003 ,
         0.22838737,  0.22731057,  0.57180977,  2.318902  ,  0.8294081 ,
        -1.6339854 , -0.7319715 , -0.62441677,  1.5005964 ,  1.823522  ,
        -0.87894535,  1.6501321 , -1.5699443 ,  2.0350382 ,  1.0813975 ,
        -0.4121969 ,  1.2346659 , -1.1027308 ,  0.52319163,  1.1409373 ,
         1.5197383 ,  1.2310604 ,  2.2276106 , -0.11142991, -1.1258705 ,
        -0.5721893 ,  2.521472  , -0.10450049, -0.06087149, -0.04528678,
        -0.4464207 , -1.6825993 ,  0.45609498, -2.1429644 ,  1.8785133 ,
        -0.4103479 , -0.637818  ,  0.26403964, -0.6523134 , -0.8116513 ,
        -0.22152117, -1.9043759 ,  0.23663412,  1.4579793 ,  0.7663014 ,
        -0.36981013, -1.3066893 ,  1.5400338 , -4.7338514 , -0.6610359 ,
        -0.8715579 ,  2.0064037 ,  0.6400188 , -0.87609607,  0.03797498,
        -1.806934  , -0.41281745, -1.8454045 ,  0.65532756,  0.48017284,
        -0.45883533, -0.33465794, -0.28919998, -3.144824  ,  0.27775276,
        -1.0185521 , -1.5696504 , -0.7269824 ]], dtype=float32)

Model recommendations

We will now retrieve recommendations from our model directly, just to have these results as a baseline.

print("Model recommendations\n")

start_time = time.process_time()
recommendations0 = model.recommend(userid=user_ids[0], user_items=sparse_user_product)
recommendations1 = model.recommend(userid=user_ids[1], user_items=sparse_user_product)
recommendations2 = model.recommend(userid=user_ids[2], user_items=sparse_user_product)
print("Time needed for retrieving recommended products: " + str(time.process_time() - start_time) + ' seconds.\n')

print('\nRecommendations for person 0:')
for recommendation in recommendations0:
    product_id = recommendation[0]
    print(products_lookup[products_lookup.product_id == product_id]['product_name'].values)
    
print('\nRecommendations for person 1:')
for recommendation in recommendations1:
    product_id = recommendation[0]
    print(products_lookup[products_lookup.product_id == product_id]['product_name'].values)
    
print('\nRecommendations for person 2:')
for recommendation in recommendations2:
    product_id = recommendation[0]
    print(products_lookup[products_lookup.product_id == product_id]['product_name'].values)
Model recommendations

Time needed for retrieving recommended products: 0.018081140000049345 seconds.


Recommendations for person 0:
['Sparkling Water']
['Soda']
['Sparkling Natural Mineral Water']
['Zero Calorie Cola']
['Drinking Water']
['Natural Artesian Water']
['Spring Water']
['Sparkling Mineral Water']
['Coconut Water']
['Orange & Lemon Flavor Variety Pack Sparkling Fruit Beverage']

Recommendations for person 1:
['Total Greek Strained Yogurt']
['Baby Wipes Sensitive']
['Organic Whole Milk']
['Strawberry Explosion/Banana Split Smoothie']
['Danimals Strawberry Explosion Flavored Smoothie']
['Natural California Raisins Mini Snack Boxes']
['Baby Wipes']
['Eggo Pancakes Minis']
['Whole Milk']
['White Buttermints']

Recommendations for person 2:
['Organic Red Delicious Apple']
['Watermelon Chunks']
['Black Seedless Grapes']
['Bag of Organic Bananas']
['Bartlett Pears']
['Organic Green Seedless Grapes']
['Large Grapefruit']
['Red Plum']
["Organic D'Anjou Pears"]
['Organic Blackberries']

Query the database

Let’s now query the index to check how quickly we retrieve results. Please note that query speed depends in part on your internet connection.

# Query by user factors

query_results = index.query(queries=user_factors[:-1], top_k=10)

for _id, res in zip(user_ids, query_results):
    print(f'user_id={_id}')
    df = pd.DataFrame({'products': res.ids, 'scores': res.scores})
    print("Recommendation: ")
    display(df)
    print("Top buys from the past: ")
    display(products_bought_by_user_in_the_past(_id, top=15))
user_id=206210
Recommendation: 
   products                                           scores
0  Mineral Water                                      0.901349
1  Zero Calorie Cola                                  0.644741
2  Orange & Lemon Flavor Variety Pack Sparkling F...  0.592524
3  Sparkling Water                                    0.580313
4  Popcorn                                            0.560757
5  Extra Fancy Unsalted Mixed Nuts                    0.542908
6  Drinking Water                                     0.540762
7  Milk Chocolate Almonds                             0.535276
8  Tall Kitchen Bag With Febreze Odor Shield          0.523282
9  Trail Mix                                          0.510975
Top buys from the past: 
   product_id  product_name   total_orders
0  22802       Mineral Water  97
user_id=206211
Recommendation: 
   products                                       scores
0  No More Tears Baby Shampoo                     0.689890
1  Baby Wash & Shampoo                            0.666703
2  Size 6 Baby Dry Diapers                        0.483426
3  Baby Wipes Sensitive                           0.455424
4  White Buttermints                              0.452916
5  Head-to-Toe Baby Wash                          0.444567
6  Size 5 Cruisers Diapers Super Pack             0.442469
7  Strawberry Explosion/Banana Split Smoothie     0.439390
8  Original Detergent                             0.421467
9  Grow & Gain Chocolate Shake Nutritional Drink  0.417840
Top buys from the past: 
   product_id  product_name                total_orders
0  26834       No More Tears Baby Shampoo  89
1  12590       Baby Wash & Shampoo         77
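
The index was created with the cosine metric, so each query returns the items whose vectors point in the direction most similar to the user's vector. Conceptually this corresponds to the brute-force search sketched below on toy 2-dimensional vectors (cosine_top_k is a hypothetical helper written for this illustration, not how the service is implemented). Differences from the model's own lists may stem partly from the metric and partly from the model call filtering out products the user has already bought, which appears to be its default behaviour.

```python
import numpy as np

def cosine_top_k(query, vectors, k):
    """Return (indices, scores) of the k vectors most cosine-similar to query."""
    query = query / np.linalg.norm(query)
    normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = normed @ query                # cosine similarity per vector
    order = np.argsort(-scores)[:k]        # best matches first
    return order, scores[order]

# Three made-up item vectors and one made-up query vector.
toy_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
idx, sims = cosine_top_k(np.array([2.0, 0.0]), toy_vectors, k=2)

print(idx)
print(sims)
```

The query's length does not matter, only its direction: vector 0 matches perfectly and vector 2 comes second.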

All that’s left to do is surface these recommendations on the shopping site, or feed them into other applications.

Cleanup

Delete the index if you no longer have any use for it.

pinecone.delete_index(index_name)
{'success': True}

Summary

In this example we used Pinecone to quickly build and deploy a product recommendation engine based on collaborative filtering.

Once deployed, the product recommendation engine can index new data, retrieve recommendations in milliseconds, and send results to production applications.