Recommendation engines are widely used in fields such as e-commerce, entertainment, and social media to help users cope with information overload. They provide personalized recommendations based on each user's preferences and behavior. In this tutorial, we will learn how to build a recommendation engine with Surprise, a Python library for building and analyzing recommender systems.
What is Surprise?
Surprise is an open-source Python library for building and analyzing recommender systems. It provides easy-to-use interfaces for collaborative filtering algorithms such as matrix factorization and neighborhood-based methods, along with tools to load datasets and to evaluate and tune algorithms.
Dataset
For this tutorial, we will use the MovieLens 100k dataset, which contains about 100,000 ratings from 943 users on 1682 movies. The dataset can be downloaded from the MovieLens website.
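To get a feel for how sparse this data is, the short calculation below uses only the three figures quoted above. The variable names are illustrative; nothing here depends on Surprise.

```python
# Figures quoted above for MovieLens 100k.
n_ratings = 100_000
n_users = 943
n_items = 1682

# Every user could in principle rate every movie.
possible_pairs = n_users * n_items

# Fraction of those user-movie pairs that actually carry a rating.
density = n_ratings / possible_pairs

print(f"possible user-movie pairs: {possible_pairs}")
print(f"density: {density:.2%}")
```

Only a small fraction of the possible ratings is observed, which is why the model has to estimate the missing ones from similar users or items.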
Installation
To install Surprise, you can use pip:
pip install scikit-surprise
Code
Let’s start by importing the necessary modules:
import os

from surprise import Dataset
from surprise import Reader
from surprise import KNNBasic
from surprise.model_selection import train_test_split
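We need to define the path to the dataset file and load it using the Dataset class. The path below points inside Surprise's default data directory (~/.surprise_data); change it to wherever you saved the dataset: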
file_path = os.path.expanduser('~/.surprise_data/ml-100k/ml-100k/u.data')
reader = Reader(line_format='user item rating timestamp', sep='\t')
data = Dataset.load_from_file(file_path, reader=reader)
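To make the line_format and sep arguments concrete, here is a minimal pure-Python sketch of how one tab-separated "user item rating timestamp" line breaks down into fields. The Rating tuple and parse_rating_line helper are hypothetical names for illustration, not part of Surprise, and the sample lines are made up rather than taken from u.data.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    user: str       # raw user id, kept as a string
    item: str       # raw item id, kept as a string
    rating: float   # rating value
    timestamp: str  # not needed for rating prediction


def parse_rating_line(line: str, sep: str = "\t") -> Rating:
    """Split one 'user item rating timestamp' line into typed fields."""
    user, item, rating, timestamp = line.rstrip("\n").split(sep)
    return Rating(user, item, float(rating), timestamp)


# Made-up sample lines in the same layout (not real dataset rows).
sample = "1\t10\t4\t100\n2\t10\t3\t200\n"
ratings = [parse_rating_line(line) for line in sample.splitlines()]
print(ratings[0])
```

If your own file uses a different column order or separator, the same two Reader arguments are the ones to change.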
Next, we can split the data into training and test sets:
trainset, testset = train_test_split(data, test_size=.25)
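Conceptually, this step shuffles the ratings and holds out a quarter of them for testing. The sketch below shows that idea on a toy list using only the standard library; split_ratings is a hypothetical helper, and Surprise itself returns a Trainset object rather than a plain list. The split is random, so passing a fixed random_state to train_test_split is the way to get the same split on every run.

```python
import random


def split_ratings(ratings, test_size=0.25, seed=0):
    """Shuffle a copy of the ratings and hold out a test_size fraction."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(round(len(shuffled) * test_size))
    # The first n_test shuffled ratings become the test set.
    return shuffled[n_test:], shuffled[:n_test]


# Toy (user, item, rating) triples, made up for illustration.
toy_ratings = [(f"u{i}", f"m{i % 3}", float(1 + i % 5)) for i in range(8)]
toy_train, toy_test = split_ratings(toy_ratings, test_size=0.25)
print(len(toy_train), len(toy_test))
```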
Now, let’s define a KNNBasic object and fit the model on the training data:
algo = KNNBasic()
algo.fit(trainset)
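The idea behind a basic user-based k-NN model can be sketched in a few lines: to estimate a user's rating for an item, take the k most similar users who rated that item and average their ratings, weighted by similarity. The code below is a simplified illustration on made-up data, not Surprise's implementation. It assumes cosine similarity over co-rated items for readability, whereas the Surprise documentation lists a different default similarity (MSD) for KNNBasic. The names toy_ratings, cosine_sim, and predict_rating are hypothetical.

```python
import math

# Toy data: user -> {item: rating}. Values are made up for illustration.
toy_ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 4, "m2": 2, "m3": 5, "m4": 4},
    "carol": {"m1": 1, "m2": 5, "m4": 2},
    "dave": {"m1": 5, "m2": 3, "m4": 5},
}


def cosine_sim(a, b):
    """Cosine similarity computed over the items both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def predict_rating(user, item, k=2):
    """Similarity-weighted mean rating of the k nearest users who rated item."""
    neighbors = [
        (cosine_sim(toy_ratings[user], toy_ratings[other]), toy_ratings[other][item])
        for other in toy_ratings
        if other != user and item in toy_ratings[other]
    ]
    top_k = sorted(neighbors, reverse=True)[:k]  # highest similarity first
    total_sim = sum(sim for sim, _ in top_k)
    if total_sim == 0:
        return None  # no usable neighbors in this sketch
    return sum(sim * rating for sim, rating in top_k) / total_sim


estimate = predict_rating("alice", "m4")
print(estimate)
```

Here alice has not rated m4, so the estimate comes from the two users whose ratings on shared movies look most like hers.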
We can now use the trained model to make predictions on the test set and evaluate its performance using RMSE:
from surprise import accuracy

test_pred = algo.test(testset)
accuracy.rmse(test_pred)
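RMSE (root mean squared error) is the square root of the average squared difference between true and estimated ratings. The standalone sketch below computes it for a handful of made-up (true, estimated) pairs; the rmse function here is an illustrative helper, not the Surprise one.

```python
import math


def rmse(pairs):
    """Root mean squared error over (true_rating, estimated_rating) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one prediction")
    squared_errors = [(true - est) ** 2 for true, est in pairs]
    return math.sqrt(sum(squared_errors) / len(pairs))


# Made-up (true, estimated) ratings for illustration.
toy_predictions = [(4, 3.5), (2, 3.0), (5, 4.0)]
toy_rmse = rmse(toy_predictions)
print(toy_rmse)
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.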
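This gives us an RMSE of around 1.05. The exact value depends on the random train/test split, so expect a slightly different number on each run. RMSE is expressed in the same units as the ratings, so an error of this size means predictions are typically off by about one star on the 1-5 scale.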
In this tutorial, we learned how to build a recommendation engine with Surprise in Python. We loaded the MovieLens 100k dataset, trained a KNN-based model to predict user ratings, and evaluated it with RMSE. Surprise provides several other algorithms and evaluation metrics that can be explored for building and analyzing different types of recommender systems.
Want to learn more about Python? Check out the official Python documentation for details.