Turbocharge your business intelligence with TigerGraph’s ML Workbench on TigerGraph Cloud

  • Andrew Wei
  • November 14, 2022
  • blog, blogs, Machine Learning / AI

How to develop a graph-enhanced machine learning model using graph analytics and graph ML with TigerGraph’s ML Workbench and TigerGraph Cloud.

The source code and examples used in this blog post are available here.

What is a graph database and why should you care?

Making the right business decisions requires understanding how actions and assets relate to one another. Many enterprises, data analytics companies, and data scientists are finding new ways to explore connections and relationships to see what additional insights their data can provide.

Graph analysis recognizes that all data represents something in the real world, and almost everything in the real world is connected in some way. Finding new patterns in these relationships can help e-commerce sites make better product recommendations, help banks spot fraud indicators before the fraud happens, and help manufacturing companies improve the efficiency of their supply chains.

TigerGraph Cloud is the industry’s first and only distributed native graph database-as-a-service, enabling users to accelerate the adoption of graphs with easy-to-use features that process analytics and transactional workloads in real time. With the 3.8 release, you can also provision an ML Workbench Jupyter notebook on TigerGraph Cloud, giving you one place for both your graph database and your graph machine learning development environment.

A Case Study: Fraud detection with a graph-enhanced ML model

Companies around the world are investing in graphs as a competitive advantage. Research on graph algorithms and machine learning has shown that structuring data as a graph, which inherently captures context and relationships, can substantially improve predictive model quality. In the fraud domain in particular, a graph-enhanced machine learning model can learn the underlying patterns of relationships between fraudulent transactions and actors that a traditional ML approach, such as an XGBoost model, may fail to capture.

In this blog, we will explore ways you can apply graph algorithms and graph features to tackle fraud detection problems. We will first show how to construct a graph data set with TigerGraph, then walk through a Jupyter notebook example that builds an end-to-end fraud detection application with a GNN model. The example uses the Ethereum dataset, which contains accounts (with positive and negative labels) and the transactions between them. In the schema, accounts are vertices and transactions are the edges that connect them.

Constructing your Graph on TigerGraph Cloud

Before developing any model, we first need to construct the graph. For this example, we will be using the free version of TigerGraph Cloud, the industry’s first and only native parallel graph database-as-a-service.

To get started with your TigerGraph database cluster, walk through the cluster provisioning process and select a hardware configuration.

In the advanced settings section, make sure to enable Machine Learning Workbench, then select the Graph Machine Learning starter kit so that it is included in the provisioned cluster. (Note: this release supports only a single-server configuration, i.e. Partition Factor = 1.)

TigerGraph ML Workbench on TigerGraph Cloud

Once your graph database is provisioned, you will need to add a user and password so that the Machine Learning Workbench can connect to the database. Open the “Cluster” tab on the left, click “Access Management” for the cluster you just provisioned, then click “Add User” and enter your credentials.

Once you have added a user, you can open the Machine Learning Workbench directly: click Clusters in the left panel, then click Tools → Machine Learning Workbench.

A new browser window will open, and you will land on the Machine Learning Workbench’s Jupyter server.

The Machine Learning Workbench comes with many tutorials, including examples of how to use our ML capabilities with pyTigerGraph, how to run algorithms from our Graph Data Science Library, and how to build end-to-end applications.

You might have heard of the recent breakthroughs in AI/ML with Graph Neural Networks (GNNs). In this blog, we will showcase how easy it is to build a GNN model with our built-in Python capabilities such as graph data partitioning, data exporting/batching, and graph feature engineering. The notebook can be found under GML → Applications → Fraud_Detection → Fraud_Detection.ipynb.

Before running any code, make sure the username and password in config.json (in the root folder of the Jupyter server) are updated to match the new user you just created on tgcloud.io.
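As a minimal sketch of this step, the snippet below rewrites the credentials in config.json from Python instead of editing the file by hand. It assumes the file is plain JSON with top-level `username` and `password` keys; that layout, the helper name `update_credentials`, and any example values are assumptions for illustration, so check them against the file on your own Jupyter server.

```python
import json
from pathlib import Path


def update_credentials(config_path, username, password):
    """Rewrite the username and password fields of a JSON config file in place."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    # Assumed key names: verify them against your own config.json.
    config["username"] = username
    config["password"] = password
    # All other keys (for example the host) are written back unchanged.
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, calling `update_credentials("config.json", "ml_user", "your-password")` from the root folder would store the user you added under Access Management; both values here are placeholders.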

Now we are ready to connect to your TigerGraph Cloud DB instance by running the following code, which also imports the Ethereum data set into your instance.
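The connection step can be sketched roughly as follows with pyTigerGraph. This is an approximation, not the notebook's exact cell: it assumes config.json holds `host`, `username`, `password`, and an optional `getToken` flag, and the helper names `load_config` and `connect_and_ingest` are hypothetical. `TigerGraphConnection`, `Datasets`, and `ingestDataset` come from pyTigerGraph, but verify their arguments against the version installed in your workbench.

```python
import json


def load_config(path="config.json"):
    """Read the host and credentials from the workbench's config file."""
    with open(path) as config_file:
        return json.load(config_file)


def connect_and_ingest(config, dataset_name="Ethereum"):
    """Open a connection and load a built-in dataset into the instance."""
    # Imported here so load_config() also works where pyTigerGraph is absent.
    from pyTigerGraph import TigerGraphConnection
    from pyTigerGraph.datasets import Datasets

    conn = TigerGraphConnection(
        host=config["host"],
        username=config["username"],
        password=config["password"],
    )
    # Downloads the named dataset and loads it into the connected instance.
    conn.ingestDataset(Datasets(dataset_name), getToken=config.get("getToken", False))
    return conn
```

A typical call would be `conn = connect_and_ingest(load_config("config.json"))`, run from the folder that contains the config file.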

Preparing your Graph Datasets

Like any other supervised machine learning model, a GNN requires training, validation, and test sets for model development. ML Workbench makes data partitioning easy with a simple command. The partitioning preserves the relationships in your data set.
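A sketch of the partitioning command, assuming `conn` is a pyTigerGraph connection to the graph. The splitter marks each vertex with boolean attributes and does not copy vertices into separate graphs, which is why the edges between partitions stay intact. The attribute names `is_training`, `is_validation`, and `is_test`, the default ratios, and the wrapper `split_vertices` are illustrative assumptions.

```python
def split_vertices(conn, train=0.8, valid=0.1, test=0.1):
    """Randomly flag vertices as training, validation, or test."""
    # Assumed attribute names; each one becomes a boolean flag on the vertices.
    ratios = {"is_training": train, "is_validation": valid, "is_test": test}
    if any(r < 0 for r in ratios.values()) or sum(ratios.values()) > 1 + 1e-9:
        raise ValueError("split ratios must be non-negative and sum to at most 1")
    splitter = conn.gds.vertexSplitter(**ratios)
    splitter.run()  # writes the flags to the graph in the database
    return ratios
```

Calling `split_vertices(conn, 0.7, 0.1, 0.2)` would request a 70/10/20 split; because the assignment is random, the actual proportions are approximate.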

Graph Feature Engineering

The ML Workbench includes quite a few graph algorithms from TigerGraph’s Graph Data Science Library for feature engineering. The key functions highlighted in the notebook are:

  1. listAlgorithms(): Given a class of algorithms (e.g. Centrality) as input, it prints the available algorithms in that category; otherwise it prints all available algorithm categories.
  2. installAlgorithm(): Takes the name of an algorithm as input and installs it if it is not already installed.
  3. runAlgorithm(): Takes the algorithm name and the parameters to run it with. If the algorithm belongs to TigerGraph’s Graph Data Science Library and is not yet installed, runAlgorithm() installs the query automatically and creates the necessary schema attributes in the graph.

The following code shows how to use the Featurizer to get PageRank as a feature. You can also define your own custom features by writing your own GSQL query and running it through the Featurizer.
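A hedged sketch of the PageRank step with the pyTigerGraph Featurizer. The vertex and edge type names (`Account`, `Transaction`), the result attribute `pagerank`, and the parameter keys are assumptions that follow recent versions of the library's `tg_pagerank` query; they may be named differently in your installation, so treat the dictionary below as a template. The wrapper `add_pagerank_feature` is hypothetical.

```python
def add_pagerank_feature(conn, v_type="Account", e_type="Transaction",
                         result_attribute="pagerank", top_k=5):
    """Run PageRank and store each vertex's score as a vertex attribute."""
    featurizer = conn.gds.featurizer()
    # Assumed parameter names for the tg_pagerank query.
    params = {
        "v_type": v_type,                      # vertices to score
        "e_type": e_type,                      # edges to follow
        "result_attribute": result_attribute,  # attribute that receives the score
        "top_k": top_k,                        # how many top results to return
    }
    return featurizer.runAlgorithm("tg_pagerank", params=params)
```

Once the score is stored on the vertices, it can be listed alongside the other input features when the data is exported for training.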

Now that we are done with feature engineering, the next step is to export your training, validation, and test data sets using our Neighbor Loader function. The Neighbor Loader lets you define your sampling strategy, including the batch size, the number of hops, and the number of neighbors.
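The export can be sketched with pyTigerGraph's `neighborLoader` as below. The feature and label attribute names, the default sampling values, and the helper `make_loader` are assumptions for illustration; replace the attribute names with the ones in your own schema.

```python
# Assumed vertex attributes used as model inputs; adjust to your schema.
INPUT_FEATURES = ["in_degree", "out_degree", "send_amount", "recv_amount", "pagerank"]


def make_loader(conn, split_flag, shuffle=False,
                batch_size=1024, num_neighbors=10, num_hops=2):
    """Create a loader that samples neighborhoods around the flagged vertices."""
    return conn.gds.neighborLoader(
        v_in_feats=INPUT_FEATURES,    # input features for the model
        v_out_labels=["is_fraud"],    # assumed name of the label attribute
        v_extra_feats=[split_flag],   # keep the split flag available on each batch
        output_format="PyG",          # batches arrive as PyTorch Geometric data
        batch_size=batch_size,        # seed vertices per batch
        num_neighbors=num_neighbors,  # neighbors sampled per vertex per hop
        num_hops=num_hops,            # how far to expand from each seed
        filter_by=split_flag,         # only flagged vertices act as seeds
        shuffle=shuffle,
    )
```

A training loader would then be `make_loader(conn, "is_training", shuffle=True)`, with one more loader per remaining split flag.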

Training your GNN model

Now that graph feature engineering is done and all the data is exported into your Machine Learning Workbench environment, we can train a machine learning model.

We embrace the open-source community, which is why we made TigerGraph ML Workbench compatible with some of the most popular deep learning frameworks, such as PyTorch Geometric (PyG) and TensorFlow. Notice that in the code above, we export your connected data directly in the PyG format specified by the output_format parameter, so you can use PyG directly to train a GNN model such as a Graph Attention Network. See the example below:
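A minimal training sketch with PyG's built-in `GAT` model, not the notebook's exact code. It assumes each batch from the loader exposes `x`, `y`, `edge_index`, and a boolean `is_training` mask; the hyperparameters and the helper names `build_gat` and `train_one_epoch` are illustrative. The loss function and optimizer are passed in, so any PyTorch-compatible pair can be used.

```python
def build_gat(num_features, hidden_dim=128, num_layers=2, heads=8, dropout=0.6):
    """Build a Graph Attention Network with two output classes."""
    # Imported here so the training loop below has no hard dependency on PyG.
    from torch_geometric.nn import GAT

    return GAT(
        in_channels=num_features,
        hidden_channels=hidden_dim,  # should be divisible by heads
        num_layers=num_layers,
        out_channels=2,              # fraud vs. not fraud
        heads=heads,
        dropout=dropout,
    )


def train_one_epoch(model, loader, optimizer, loss_fn):
    """Run one pass over the loader and return the mean batch loss."""
    model.train()
    total_loss, num_batches = 0.0, 0
    for batch in loader:
        optimizer.zero_grad()
        logits = model(batch.x.float(), batch.edge_index)
        mask = batch.is_training  # score only vertices flagged for training
        loss = loss_fn(logits[mask], batch.y[mask].long())
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        num_batches += 1
    return total_loss / max(num_batches, 1)
```

With PyTorch installed, a typical pairing would be `torch.nn.functional.cross_entropy` as `loss_fn` and `torch.optim.Adam(model.parameters(), lr=0.01)` as the optimizer; the learning rate is only a placeholder.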

Once your model training is complete, you can run inference with your model to see how a fraudster is moving transactions through its network. To better explain the prediction behavior, we can visualize the subgraph associated with the predicted vertex.

Visualize Your Model Prediction with Subgraphs

In this example, vertex #1891 is predicted to be a fraud account. Vertices in pink are known fraudulent accounts, and vertices colored in blue are unknown accounts. It looks like vertex 1891 is the mastermind behind a fraudster network that has been taking money from innocent users!
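To make the idea of a vertex's subgraph concrete, here is a small standalone sketch that collects everything within a given number of hops of a vertex. It is not the notebook's visualization code: it assumes the transactions are available locally as (sender, receiver) pairs, ignores edge direction, and the function name `k_hop_subgraph` is hypothetical.

```python
from collections import deque


def k_hop_subgraph(edges, seed, num_hops):
    """Return the vertices and edges within num_hops of seed (direction ignored)."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)

    # Breadth-first search that stops expanding at the hop limit.
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        vertex = queue.popleft()
        if distance[vertex] == num_hops:
            continue
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[vertex] + 1
                queue.append(neighbor)

    kept = set(distance)
    # Keep only the edges whose two endpoints were both reached.
    sub_edges = [(s, d) for s, d in edges if s in kept and d in kept]
    return kept, sub_edges
```

The returned vertices and edges can be handed to any plotting tool, with each vertex colored by its known label or predicted class.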

Next Steps

If you found this article interesting and want to build your own GNN applications, please try out TigerGraph Cloud and TigerGraph ML Workbench for free. Check out the tutorials on our GitHub, where you can also find the notebook example we walked through in this blog post. We look forward to learning more about the kinds of applications you build with TigerGraph.

Get started with TigerGraph Cloud today for free. No credit card is required.

Copyright © 2023 TigerGraph