Building a Movie Recommendation System Using TigerGraph
Recommendation systems are part of daily life, suggesting products, movies, music, news, and books. These recommendations are based on the user’s historical behavior and on the behavior of other users with similar tastes. In this video we will build a simple movie recommendation system using TigerGraph’s graph database technology. We use the MovieLens dataset, which contains 20,000,000 ratings of about 27,000 movies by about 138,000 users.
The technique we use in the recommendation algorithm is collaborative filtering. To recommend movies to a given person p, the algorithm proceeds as follows:
Find all movies rated by p, and call this movie set M1.
Find all people who rated at least one movie in M1, excluding p. Calculate the taste similarity (cosine similarity) between each of these people and p, and retain the k1 people with the highest similarity. Call this person vertex set P2.
Find all movies rated by at least one person in P2 but not rated by p. Calculate each movie's average rating among the people in P2, then select the k2 movies with the highest average rating and recommend them to p, using the average rating as the recommendation score.
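The three steps above can be sketched in plain Python to make the data flow concrete. This is only an illustrative in-memory sketch, not the GSQL query that runs on TigerGraph: the function names `cosine_similarity` and `recommend`, the nested-dictionary layout of `ratings`, the alphabetical tie-breaking, and the toy data are all assumptions made for the example.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts over the movies both people rated."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in common))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, p, k1, k2):
    """Return up to k2 (movie, average rating) pairs recommended to person p."""
    # Step 1: M1 is the set of movies rated by p.
    m1 = ratings[p]

    # Step 2: score everyone (except p) who shares at least one movie with p,
    # then keep the k1 most similar people as P2.
    sims = {
        q: cosine_similarity(m1, r)
        for q, r in ratings.items()
        if q != p and m1.keys() & r.keys()
    }
    p2 = sorted(sims, key=lambda q: (-sims[q], q))[:k1]

    # Step 3: average the P2 ratings of each movie that p has not rated.
    scores = defaultdict(list)
    for q in p2:
        for movie, rating in ratings[q].items():
            if movie not in m1:
                scores[movie].append(rating)
    averages = {movie: sum(s) / len(s) for movie, s in scores.items()}

    # Highest average first; ties are broken alphabetically (an assumption).
    ranked = sorted(averages.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k2]


# Hypothetical toy data: person -> {movie: rating}.
ratings = {
    "alice": {"Heat": 5, "Alien": 3},
    "bob": {"Heat": 4, "Alien": 4, "Up": 5, "Jaws": 2},
    "carol": {"Heat": 1, "Alien": 5, "Up": 4, "Coco": 4},
    "dave": {"Coco": 5},
}

print(recommend(ratings, "alice", k1=2, k2=2))  # [('Up', 4.5), ('Coco', 4.0)]
```

In this toy run, dave is never considered because he shares no rated movie with alice. Lowering k1 to 1 keeps only bob, which changes both the candidate movies and their scores.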
We use cosine similarity to measure the taste similarity between two people. Given two people p1 and p2, let M be the set of movies they both rated. Let A be the vector of ratings p1 gave to the movies in M, and B the vector of ratings p2 gave to the same movies, in the same element order. Their movie rating cosine similarity is:
cos(A, B) = (A · B) / (‖A‖ ‖B‖) = Σ A_i B_i / ( sqrt(Σ A_i^2) × sqrt(Σ B_i^2) ), where the sums run over the movies in M.
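As a small worked example with made-up ratings, suppose p1 and p2 both rated the same two movies, with A = (5, 3) and B = (4, 4). Applying the standard cosine similarity definition (the dot product divided by the product of the vector lengths) gives:

```latex
\cos(A, B)
  = \frac{5 \cdot 4 + 3 \cdot 4}{\sqrt{5^2 + 3^2}\,\sqrt{4^2 + 4^2}}
  = \frac{32}{\sqrt{34}\,\sqrt{32}}
  \approx 0.970
```

A value close to 1 means the two people rated their common movies in similar proportions. When two people share only one movie, the similarity is always exactly 1, so in practice the measure is more informative for pairs with several movies in common.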
Because of space limitations, we touch on only a small portion of the functionality supported by TigerGraph. If you are interested in learning more about TigerGraph, here are some helpful resources to download:
GSQL White Paper: In this paper we compare GSQL to other prominent graph query languages in use today – Cypher and Gremlin.
Dr. Jay Yu is the VP of Product and Innovation at TigerGraph, responsible for driving product strategy and roadmap, as well as fostering innovation in the graph database engine and graph solutions. He is a proven hands-on full-stack innovator, strategic thinker, leader, and evangelist for new technologies and products, with 25+ years of industry experience ranging from a highly scalable distributed database engine company (Teradata) and a B2B e-commerce services startup to a consumer-facing financial applications company (Intuit). He received his PhD from the University of Wisconsin - Madison, where he specialized in large scale parallel database systems.