TigerGraph Unveils Enhanced Graph Data Science Library With More Than 50 Algorithms

  • Victor Lee
  • November 11, 2021
  • blog, Data Science

By Victor Lee, VP of Machine Learning and AI

When I completed my Ph.D. research on graph algorithms, I didn’t expect to be making a career out of it.  Even more surprising was to see the data and analytics industry embracing graph analytics. It’s been thrilling to watch the demand for graph analytics explode. Major corporations are deploying large-scale graphs to gain greater insights from their data. As interest, use cases, and deployments have grown, the field has matured so that it now makes sense to talk about Graph Data Science. 

Graph Data Science

What is Graph Data Science? The simple answer is that it is the branch of data science that employs graph data structures and graph analytical techniques. It includes data modeling and data management, as well as analytical methods ranging from simple statistics-gathering queries, to algorithms that reveal more complex patterns, to machine learning methods that form predictive models.

TigerGraph has been a leader in graph data management with its real-time scalable graph database, and graph algorithms have been part of TigerGraph’s DNA since our start. We used to call our collection of algorithms the GSQL Graph Algorithm Library, GSQL being our analytics-friendly query language. Recently, as part of a major initiative to deliver out-of-the-box Graph Data Science and Graph Machine Learning, we decided to fine-tune the algorithm library. To signal this change, we’ve renamed it the Graph Data Science Library.

What’s New?

TigerGraph’s Graph Data Science Library v1.0 contains several great improvements and additions, but it’s only the start. The most obvious change is the addition of over 20 new algorithms, bringing the total to over 50, including some new categories of algorithms. Here’s a summary of what’s new:

  • Library collection: 20+ new algorithms, including embedding algorithms for graph ML.
  • Library structure and management: Our library continues to be open-source, on GitHub. We’ve improved the organization, grouping algorithms by category, and placing each algorithm in its own folder with a README and Change Log file. The repository will use tags to identify major releases.

Let’s take a close look at the new algorithms and their use cases:

Graph Embeddings

Embedding (verb) is transforming high-dimensional data to a lower-dimensional representation, while accurately preserving the most important features. An embedding (noun) is one of those lower-dimensional representations. An example is a 2-D map of our 3-D planet surface. Graphs are high-dimensional, with each type of relationship representing another dimension. This high-dimensionality is part of the reason that they are so good at describing complex relationships.  Alas, traditional machine learning techniques are designed to work with low-dimensional tabular or matrix data. Enter graph embedding!

Graph embedding algorithms (strictly speaking, node embedding algorithms) transform a network of interconnected vertices into a set of independent embedding vectors, one for each vertex. The edge connections are dropped. However, vertices that have similar neighborhoods in the original graph will have similar embedding vectors. So, you can easily perform similarity comparisons and clustering on the vertex embeddings, and the embeddings can in turn be used to train a model to predict classifications.

TigerGraph’s Graph Data Science Library includes two popular graph embedding algorithms, node2vec and FastRP. Node2vec is known for producing accurate embeddings, but it can be slow on larger graphs. As its name suggests, FastRP runs (much) faster, at the cost of some accuracy.
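
To make the idea of "similar neighborhoods give similar vectors" concrete, here is a minimal sketch of a random-projection embedding, loosely in the spirit of FastRP. It is not the library's implementation: the function names, the dense +/-1 starting vectors, the unweighted sum over hops, and the toy graph are all assumptions made for illustration.

```python
import math
import random

def random_projection_embeddings(adj, dim=8, hops=2, seed=0):
    """adj maps each vertex to the set of its neighbors (hypothetical format)."""
    rng = random.Random(seed)
    # Assumption: a dense random +/-1 vector per vertex stands in for the
    # sparse random projection that FastRP-style methods start from.
    current = {v: [rng.choice((-1.0, 1.0)) for _ in range(dim)] for v in adj}
    emb = {v: [0.0] * dim for v in adj}
    for _ in range(hops):
        nxt = {}
        for v, nbrs in adj.items():
            if nbrs:
                # Average the neighbors' vectors from the previous hop.
                nxt[v] = [sum(current[u][i] for u in sorted(nbrs)) / len(nbrs)
                          for i in range(dim)]
            else:
                nxt[v] = [0.0] * dim
        for v in adj:
            # Accumulate the contribution of each hop into the embedding.
            emb[v] = [e + x for e, x in zip(emb[v], nxt[v])]
        current = nxt
    return emb

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy graph: "a" and "b" are not connected but have identical neighborhoods.
toy = {"a": {"c", "d"}, "b": {"c", "d"}, "c": {"a", "b"}, "d": {"a", "b"}}
vectors = random_projection_embeddings(toy)
print(vectors["a"] == vectors["b"])  # identical neighborhoods, identical vectors
```

Because the edges are dropped after this step, downstream tools only see the vectors, and a measure such as `cosine` is what stands in for "closeness" in the original graph.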

Other New Algorithms

Centrality algorithms help to identify which entity is the most influential or impactful. Different centrality algorithms frame the question in different ways, so having a good selection of algorithms to choose from means you are ready to handle more situations. We expanded our collection by adding Article Rank, Eigenvector Centrality, Degree Centrality, and Influence Maximization.
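
As a rough illustration of how two centrality measures frame the question differently, the sketch below computes degree centrality (how many neighbors a vertex has) and eigenvector centrality (how well connected its neighbors are) in plain Python. The function names, the adjacency-dict format, and the fixed iteration count are assumptions for this example, not the library's GSQL code.

```python
import math

def degree_centrality(adj):
    """Score each vertex by its number of neighbors."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def eigenvector_centrality(adj, iterations=100):
    """Power iteration on an undirected graph given as {vertex: set(neighbors)}."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Each vertex keeps its own score and adds its neighbors' scores.
        # Keeping the own score (a shift by the identity matrix) avoids the
        # oscillation that plain power iteration shows on bipartite graphs.
        nxt = {v: score[v] + sum(score[u] for u in adj[v]) for v in adj}
        norm = math.sqrt(sum(x * x for x in nxt.values())) or 1.0
        score = {v: x / norm for v, x in nxt.items()}
    return score

# Toy star graph: one hub connected to three leaves.
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
print(degree_centrality(star))
print(max(eigenvector_centrality(star), key=eigenvector_centrality(star).get))
```

On a star the two measures agree that the hub matters most; on larger graphs they can rank vertices differently, which is why a selection of algorithms is useful.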

Community algorithms are one of our most popular types of algorithms, because they are often used for anti-fraud, personalized recommendation, and detecting social groups. They perform the complex task of judging exactly which set of neighboring vertices should be considered a “community” because they have a relatively high degree of in-group connections. We added Speaker-Listener Label Propagation, which supports the important case of overlapping communities. We also improved the performance of, or added variants to, two of our existing community algorithms.
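
The sketch below shows plain label propagation, a simpler, non-overlapping relative of Speaker-Listener Label Propagation, to illustrate how "a relatively high degree of in-group connections" can be turned into community labels. It is an illustration under stated assumptions: the function name, the fixed visiting order, and the tie-break rule (largest label wins) are choices made here to keep the result deterministic, and none of them come from the library.

```python
from collections import Counter

def label_propagation(adj, max_rounds=20):
    """adj maps each vertex to the set of its neighbors (hypothetical format)."""
    labels = {v: v for v in adj}  # every vertex starts in its own community
    for _ in range(max_rounds):
        changed = False
        for v in sorted(adj):  # fixed order so runs are repeatable
            if not adj[v]:
                continue
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            # Adopt the most common neighbor label; ties go to the largest label.
            new_label = max(l for l, c in counts.items() if c == best)
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:
            break
    return labels

# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
two_triangles = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(label_propagation(two_triangles))
```

On this toy graph each triangle settles on its own label, because every vertex has more neighbors inside its triangle than outside it.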

Similarity algorithms are also incredibly useful for recommendation and classification tasks. We added four new algorithms: Approximate Nearest Neighbors, Euclidean Similarity, Overlap Similarity, and Pearson Similarity.
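
For readers who want the underlying formulas, here is a minimal sketch of three of these measures using their standard textbook definitions. The function names are hypothetical, and the example reports Euclidean distance rather than a similarity score, since how a library maps distance to similarity is not specified here.

```python
import math

def overlap_similarity(a, b):
    """|A intersect B| / min(|A|, |B|) for two collections of items."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

def euclidean_distance(x, y):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))

def pearson_similarity(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy) if sx and sy else 0.0

# Overlap works on sets (for example, neighbor sets); the other two on vectors.
print(overlap_similarity({1, 2, 3}, {2, 3, 4, 5}))
print(euclidean_distance([0, 0], [3, 4]))
print(pearson_similarity([1, 2, 3], [2, 4, 6]))
```

Overlap similarity suits set-valued data such as shared neighbors, while the Euclidean and Pearson measures suit numeric attribute vectors or embeddings.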

We introduced a new category: Topological Link Prediction. This category could also be called Structural Similarity. While these algorithms are fairly simple, they nevertheless help to round out our offerings.  We added measurement algorithms for Adamic Adar, Common Neighbors, Preferential Attachment, Resource Allocation, Same Community, and Total Neighbors.
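
These measures are simple enough to sketch in a few lines each. The example below uses their standard definitions on an undirected graph stored as a dict of neighbor sets; the function names and the toy graph are assumptions for illustration and do not mirror the library's query signatures.

```python
import math

def common_neighbors(adj, a, b):
    """Vertices adjacent to both a and b."""
    return adj[a] & adj[b]

def total_neighbors(adj, a, b):
    """Number of distinct vertices adjacent to a or b."""
    return len(adj[a] | adj[b])

def preferential_attachment(adj, a, b):
    """Product of the two degrees."""
    return len(adj[a]) * len(adj[b])

def resource_allocation(adj, a, b):
    """Sum of 1/degree over the common neighbors."""
    return sum(1.0 / len(adj[n]) for n in common_neighbors(adj, a, b))

def adamic_adar(adj, a, b):
    """Sum of 1/log(degree) over the common neighbors.

    A common neighbor of two distinct vertices has degree >= 2, so the
    logarithm is always positive.
    """
    return sum(1.0 / math.log(len(adj[n])) for n in common_neighbors(adj, a, b))

# Toy graph with edges a-c, a-d, b-c, b-d, b-e, d-e.
toy = {
    "a": {"c", "d"}, "b": {"c", "d", "e"},
    "c": {"a", "b"}, "d": {"a", "b", "e"}, "e": {"b", "d"},
}
print(len(common_neighbors(toy, "a", "b")))   # 2 (c and d)
print(total_neighbors(toy, "a", "b"))         # 3 (c, d, e)
print(preferential_attachment(toy, "a", "b")) # 6 (2 * 3)
```

A higher score from any of these suggests that the pair is more likely to be, or to become, linked.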

Last but not least, we added A* and Random Walk path algorithms. A* (pronounced “A star”) searches for a lowest-cost path to a destination or goal. At each step it estimates the total cost of a route by adding the cost of the part of the journey already completed to a heuristic estimate of the remaining cost from the present position.
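
The A* idea of "cost so far plus a heuristic for the remaining cost" can be sketched as follows. This is a generic textbook version in Python, not the library's GSQL query: the graph format, the function name, and the heuristic table in the usage example are all hypothetical.

```python
import heapq

def a_star(graph, start, goal, heuristic):
    """graph maps node -> list of (neighbor, edge_cost); heuristic(node) -> estimate."""
    # Heap entries are (estimated total cost, cost so far, node).
    open_heap = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    while open_heap:
        _, cost_so_far, node = heapq.heappop(open_heap)
        if node == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return cost_so_far, path[::-1]
        if cost_so_far > best_cost[node]:
            continue  # stale entry; a cheaper route to this node was found
        for neighbor, edge_cost in graph.get(node, []):
            new_cost = cost_so_far + edge_cost
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                parent[neighbor] = node
                estimate = new_cost + heuristic(neighbor)
                heapq.heappush(open_heap, (estimate, new_cost, neighbor))
    return None  # goal is unreachable

# Toy weighted graph with a heuristic that never overestimates the true cost.
roads = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)], "C": [("D", 1)], "D": []}
remaining = {"A": 2, "B": 2, "C": 1, "D": 0}
print(a_star(roads, "A", "D", remaining.get))  # (3, ['A', 'B', 'C', 'D'])
```

The result is only guaranteed to be a lowest-cost path when the heuristic never overestimates the true remaining cost, as in the toy table above.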

Continuing the TigerGraph Way

While adding improvements, we’re maintaining three principles that have set us apart: open-source, in-database, and scalable high-performance. Together, they form a user-oriented experience. 

Open-source: TigerGraph’s algorithms are open to the community to critique and to offer suggested improvements. It also means that you can customize the algorithms as you like. This has been one of our most appreciated features. You aren’t locked into the default functionality. You can tweak input parameters, filtering, and output formatting. Share your improvements and additions with the rest of the developer community. Email us at [email protected] or make a Git pull request.

Our algorithms are written in GSQL. GSQL has been called the PL/SQL of graph databases: procedural querying. Take the most well-known and successful database query language – SQL, adapt it for graph traversal, add support for procedures like looping, conditional statements, variables and parameters, and then wrap it in a named procedure which you can store and invoke whenever you want: That’s GSQL. Oh, and accumulators and parallel processing! The TigerGraph engine loves to speed up your work through massive parallelism. GSQL accumulators make it easy to traverse and aggregate in parallel.
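
To give a feel for why accumulators pair well with parallelism, here is a loose Python analogy, not GSQL and not how the TigerGraph engine is implemented. Each worker aggregates its own share of the edges, and the partial results are merged afterwards; because addition is order-independent, the merge gives the same answer however the work is split. All names here are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partial_in_degree(edge_chunk):
    """Count incoming edges per target vertex for one chunk of (source, target) pairs."""
    acc = Counter()
    for _, target in edge_chunk:
        acc[target] += 1
    return acc

def parallel_in_degree(edges, workers=2):
    """Split the edge list into chunks, aggregate each chunk, then merge."""
    size = max(1, -(-len(edges) // workers))  # ceiling division
    chunks = [edges[i:i + size] for i in range(0, len(edges), size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(partial_in_degree, chunks):
            total.update(partial)  # merging partial sums is order-independent
    return total

edges = [("a", "b"), ("c", "b"), ("a", "c")]
print(parallel_in_degree(edges))
```

In this analogy the `Counter` plays the role of a per-vertex sum that many traversal steps can add to without coordinating with each other.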

GSQL is also extensible via user-defined functions (UDFs). Is there a specialized function you need which is tricky to implement using GSQL or is not supported? Then you can write a custom function in C++. A few of the graph data science algorithms employ UDFs.

In-database: Yes, TigerGraph is a database! There are some products out there that are graph analytics or graph visualization tools, but they aren’t designed for managing data or handling transactions the way that TigerGraph is. We are an enterprise distributed in-memory database, so we handle a wide range of operational and analytics use cases.  One of our advantages is the ability to handle multiple types of workloads on the same platform. Being able to run graph algorithms within the updatable database means that you are analyzing the latest data, with no need to export to a separately managed copy, and that you can even update the database based on the results of your algorithms.

Scalable: TigerGraph can grow with your needs. No matter how big or small your current set of data is, it’s likely to grow. TigerGraph is built with a distributed database architecture, with massively parallel processing.

What’s Coming?

The Graph Data Science Library will continue to grow and improve, but it represents only one part of TigerGraph’s vision for delivering high-performance and easy-to-use Graph Data Science and Machine Learning to everyone. Graph embedding algorithms like node2vec and FastRP are one way to take advantage of graph insight to develop more accurate and powerful machine learning. However, since they transform the graph to linear data, we are still losing something.

Graph neural networks (GNNs) represent what is arguably the ultimate integration of connected data analytics and machine learning, using the graph structure during the training process itself. The GCN research paper by Kipf and Welling that paved the way is only five years old, but it has transformed the way that data scientists think about graphs. Many other versions of GNNs have followed and are now supported in two major open-source libraries, PyTorch Geometric and DGL.

The good news is that you can build a pipeline today, with TigerGraph as a data source, export data with the help of PyTigerGraph or our REST APIs, and train your model with a GNN running on your favorite platform. But it gets better. We are close to delivering the TigerGraph Graph+ML Integrated Workbench.
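
One small step in such a pipeline is reshaping exported edges into the form a GNN library expects. The sketch below assumes the export step has already produced a list of edge records as Python dicts with `from_id` and `to_id` keys; that record shape and the function name are assumptions for this example, so check them against what your export actually returns. The export call itself is deliberately left out.

```python
def to_edge_index(edge_records):
    """Map vertex IDs to consecutive integers and build a 2 x num_edges edge list.

    edge_records: iterable of dicts with "from_id" and "to_id" keys
    (an assumed shape for exported edges).
    """
    index_of = {}
    sources, targets = [], []
    for rec in edge_records:
        for key in ("from_id", "to_id"):
            # Assign the next free integer the first time an ID is seen.
            index_of.setdefault(rec[key], len(index_of))
        sources.append(index_of[rec["from_id"]])
        targets.append(index_of[rec["to_id"]])
    return index_of, [sources, targets]

# Hypothetical exported records.
records = [
    {"from_id": "u1", "to_id": "u2"},
    {"from_id": "u2", "to_id": "u3"},
]
index_of, edge_index = to_edge_index(records)
print(index_of)    # {'u1': 0, 'u2': 1, 'u3': 2}
print(edge_index)  # [[0, 1], [1, 2]]
```

The nested list has the two-row layout that PyTorch Geometric uses for its `edge_index` tensor, so it could then be wrapped with `torch.tensor(edge_index, dtype=torch.long)` in the training environment.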

Imagine a Jupyter-style data science platform that includes the key stages for model development – for graph data and graph machine learning!  Prepare your data in TigerGraph, transport it seamlessly to your training environment, select from your choice of graph ML models, train and tune. 

Our customers have asked for in-database neural networks, to complement our current in-database algorithms, and we’ve listened.  We are planning to deliver in-database ML training in the first half of 2022.  In-database training simplifies your pipeline, saving the time and cost of exporting data and running another system.

The Graph Data Science team at TigerGraph has more to tell you about what we’ve already done and what’s coming: Tutorials and use case examples to help you get started with graph algorithms, graph feature engineering, and graph machine learning, as well as more product announcements as we continue to roll out new features and services. We would love to hear your thoughts and feedback. Send them to [email protected]
