Building a Movie Recommendation System Using TigerGraph

Recommendation systems appear throughout daily life, suggesting products, movies, music, news, and books. These recommendations are based on the user’s historical behavior and on the behavior of other users with similar tastes. In this video we will build a simple movie recommendation system using TigerGraph’s graph database technology. We use the MovieLens dataset, which contains 20 million ratings given by about 138,000 users to about 27,000 movies.

To follow along, download the free TigerGraph Developer Edition. After you register, you will receive an email with a download link and an installation guide. The installation takes about 15 minutes. You can download the MovieLens dataset at: https://grouplens.org/datasets/movielens/20m/.

By following the video, you will learn how to build a simple movie recommendation system with TigerGraph’s GraphStudio SDK. You will:

  • Build a graph schema
  • Map data to graph and load data
  • Explore and visualize graph data
  • Write a collaborative filtering algorithm in the GSQL language
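
To make the schema and loading steps concrete, here is a minimal in-memory sketch in Python. It is not how GraphStudio loads data. It only illustrates the shape of the graph: person vertices, movie vertices, and a rating edge between them. The column names are assumed to match the header of the dataset's ratings.csv file, and the function name `load_rating_graph` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Tiny stand-in for ratings.csv (assumed header: userId,movieId,rating,timestamp).
SAMPLE = """userId,movieId,rating,timestamp
1,10,4.0,1112486027
1,20,3.5,1112484676
2,10,5.0,1112484819
"""

def load_rating_graph(lines):
    """Build a bipartite person-movie graph from rating rows."""
    rated = defaultdict(dict)    # person -> {movie: rating} (forward edges)
    rated_by = defaultdict(set)  # movie -> {person, ...}    (reverse edges)
    for row in csv.DictReader(lines):
        person, movie = int(row["userId"]), int(row["movieId"])
        rated[person][movie] = float(row["rating"])
        rated_by[movie].add(person)
    return rated, rated_by

rated, rated_by = load_rating_graph(io.StringIO(SAMPLE))
```

Keeping both edge directions mirrors what the graph traversal needs later: going from a person to the movies they rated, and from a movie back to everyone who rated it.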

The recommendation algorithm uses collaborative filtering. To recommend movies to a person p, it proceeds as follows:

  1. Find all movies rated by p, and call this movie set M1.
  2. Find all people who rated at least one movie in M1, excluding p. Calculate the taste similarity (cosine similarity) between p and each of these people, and keep the k1 people with the highest similarity. Call this person vertex set P2.
  3. Find all movies that were rated by at least one person in P2 but not by p. Calculate each movie’s average rating among the people in P2, and recommend to p the k2 movies with the highest average rating, using the average rating as the recommendation score.
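
The three steps can be sketched in plain Python. This is an in-memory illustration under stated assumptions, not the GSQL query written in the video: ratings are held in a dictionary of the form person -> {movie: rating}, and the names `cosine`, `recommend`, and the toy ratings are hypothetical.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity over the movies that both people rated."""
    common = a.keys() & b.keys()
    dot = sum(a[m] * b[m] for m in common)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in common))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(ratings, p, k1, k2):
    """Return up to k2 (movie, score) pairs for person p."""
    # Step 1: M1 is the set of movies rated by p.
    m1 = ratings[p]
    # Step 2: score everyone else who shares a movie with p; keep the top k1 as P2.
    sims = {q: cosine(m1, r) for q, r in ratings.items()
            if q != p and m1.keys() & r.keys()}
    p2 = sorted(sims, key=sims.get, reverse=True)[:k1]
    # Step 3: average P2's ratings for movies p has not rated; keep the top k2.
    seen = defaultdict(list)
    for q in p2:
        for movie, rating in ratings[q].items():
            if movie not in m1:
                seen[movie].append(rating)
    scores = {m: sum(v) / len(v) for m, v in seen.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k2]

# Toy data, invented for illustration only.
ratings = {
    "p": {"A": 5, "B": 3},
    "q": {"A": 5, "B": 3, "C": 4, "D": 2},
    "r": {"A": 4, "B": 4, "C": 5},
    "s": {"A": 1, "B": 5, "D": 5},
    "t": {"E": 5},
}
top = recommend(ratings, "p", k1=2, k2=2)
```

With k1 = 2, the two most similar people to "p" are "q" and "r", so `top` holds movie "C" with score 4.5 followed by movie "D" with score 2.0. Person "t" shares no movie with "p" and is never considered.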

You can learn more about collaborative filtering at: https://en.wikipedia.org/wiki/Collaborative_filtering

We use cosine similarity to measure the taste similarity between two people. Given two people p1 and p2, let M be the set of movies that both have rated. Let A be the vector of p1’s ratings for the movies in M, and let B be the vector of p2’s ratings for the same movies, in the same order. Their cosine similarity is:

$$\cos(A, B) = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{|M|} A_i B_i}{\sqrt{\sum_{i=1}^{|M|} A_i^2}\,\sqrt{\sum_{i=1}^{|M|} B_i^2}}$$

You can learn more about cosine similarity at https://en.wikipedia.org/wiki/Cosine_similarity
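
As a quick worked example with made-up ratings (not taken from the dataset), suppose p1 and p2 have both rated the same two movies, with A = (5, 3) and B = (4, 4). Dividing the dot product of A and B by the product of their lengths gives:

```latex
\[
\cos(A, B)
  = \frac{5 \cdot 4 + 3 \cdot 4}{\sqrt{5^2 + 3^2}\,\sqrt{4^2 + 4^2}}
  = \frac{32}{\sqrt{34}\,\sqrt{32}}
  \approx 0.970
\]
```

A value close to 1 means the two people rate their shared movies in nearly the same proportions. Note that when M contains only one movie, the similarity is always 1, whatever the two ratings are.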

Because of space limitations, we touch on only a small portion of the functionality that TigerGraph supports. If you would like to learn more about TigerGraph or have any questions, please contact sales@tigergraph.com.

References

[1] F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4, Article 19 (December 2015), 19 pages. DOI: 10.1145/2827872

[2] 字凤芹, 牛进, 毕柱兰, 沈加敏. Design of a Movie Recommendation System Based on a Graph Database (基于图数据库的电影推荐系统设计). 软件导刊 (Software Guide), 2016(1): 144-146.