Cybersecurity Threat Detection with Graph Database

Guard Against Cybersecurity Threats in Real-time With TigerGraph

Businesses Face the Constant Threat of Cybersecurity Attacks

Cybersecurity attacks were estimated to cost an astounding $45 billion in 2018. The Internet Society’s Online Trust Alliance (OTA), which identifies and promotes security and privacy best practices that build consumer confidence in the Internet, released its Cyber Incident & Breach Trends Report. The report found that the financial impact of ransomware rose by 60%, losses from business email compromise (BEC) doubled, and cryptojacking incidents more than tripled, even though overall breaches and exposed records were down in 2018.

Network and security vendors need a more accurate and timely way to protect their customers, because security is only as effective as the threat data it relies on.

Legacy Approaches Are Ill-Suited for Deflecting Cybersecurity Threats

Cybersecurity threat detection requires the ability to integrate and traverse data from multiple sources, and to do so in fractions of a second. The internet is vast, and the information required to detect threats runs into the terabytes.

Any threat detection system built upon a relational database will struggle to detect threats in minutes or even hours, let alone in fractions of a second. Relational databases store information in separate tables, one for each type of network entity, and so require multiple joins to uncover connections.

Similarly, legacy graph approaches that cannot perform deep link analytics (i.e., the ability to traverse five or more entities) in real time will fail to detect and prevent an attack.
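
To make the idea of a multi-hop traversal concrete, here is a minimal sketch in plain Python rather than in any graph database's query language. The entity names and the adjacency-list layout are assumptions invented for illustration. It shows a breadth-first walk that records how many hops separate a starting entity from everything it can reach, which is the kind of question a relational system would answer with one join per hop.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Return {node: hop_count} for every node reachable from
    `start` in at most `max_hops` hops (breadth-first search)."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


# Hypothetical chain of network entities (illustrative data only).
graph = {
    "user:alice": ["device:laptop-7"],
    "device:laptop-7": ["ip:10.0.0.5"],
    "ip:10.0.0.5": ["server:web-1"],
    "server:web-1": ["service:auth"],
    "service:auth": ["file:secrets.db"],
    "file:secrets.db": [],
}

reach = within_hops(graph, "user:alice", 5)
```

In this toy chain the restricted file only becomes visible once the traversal is allowed to go five hops deep, so a system limited to shallower queries would not connect the user to it.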

Why TigerGraph, a Native Parallel Graph Database for Cybersecurity?

Graph Databases Are an Ideal Way to Detect Cybersecurity Threats

Any network is a network of components and processes: the internet is an interconnected system of servers, routers, bridges, laptops, smartphones, and so on, with processes defining how these work together. A company’s intranet is structured in the same way. In both cases, any attack relies on the interconnection of these entities to succeed: an attack is a chain of events between these entities.

The interconnection between these entities can be perfectly represented in a graph database. Any attack, either from outside a company, or from inside it, can be modelled using a graph database.

Moreover, graph databases are ideally suited to detecting and preventing attacks for many reasons, including: 

  • Huge data sizes – up to terabytes of log data generated per day need to be analyzed.
  • Multiple data sources – information from multiple disparate sources, such as log files, infrastructure info and user info, needs to be integrated.
  • Multi-level structures – data stored in services and microservices, domains and subdomains, and organizational hierarchies needs to be queried.
  • Deep-link analytics – queries need to traverse multiple entities, often five or more.
  • Rapid response times – answers to queries need to be provided in fractions of a second.

No other advanced analytics approach meets all these requirements as well as graph databases.

Graph Databases Can Fight Cybersecurity Threats in Multiple Ways

There are many ways that graph databases can assist in the fight against cybersecurity threats, such as:

  • Looking for patterns of behavior associated with malicious attacks – this could include a user plugging in a mobile disk, copying a file and then removing the mobile disk – or a user reading from a restricted file after bypassing a firewall check.

A graph database can be used to uncover these patterns in real time and prevent confidential information from being stolen.
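
As a rough illustration of the first pattern above (plug in a disk, copy a file, remove the disk), the sketch below treats each log record as an edge between a user and a device and looks for the three actions in time order. The event tuples, action names and function name are all hypothetical; a production system would express this as a graph pattern query instead.

```python
from collections import defaultdict

# Each record is an edge: (user, action, target, timestamp).
# All values are made up for illustration.
events = [
    ("alice", "plug_in", "disk-1", 100),
    ("alice", "copy_file", "disk-1", 130),
    ("alice", "remove", "disk-1", 150),
    ("bob", "plug_in", "disk-2", 200),
    ("bob", "remove", "disk-2", 210),
]

PATTERN = ("plug_in", "copy_file", "remove")


def find_exfiltration_suspects(events, pattern=PATTERN):
    """Return sorted (user, target) pairs whose events contain
    the actions in `pattern` in chronological order."""
    by_pair = defaultdict(list)
    for user, action, target, timestamp in events:
        by_pair[(user, target)].append((timestamp, action))

    suspects = []
    for pair, history in by_pair.items():
        step = 0
        for _, action in sorted(history):  # oldest event first
            if action == pattern[step]:
                step += 1
                if step == len(pattern):
                    suspects.append(pair)
                    break
    return sorted(suspects)


suspects = find_exfiltration_suspects(events)
```

Here only the first user completes the full sequence; the second plugs in and removes a disk without copying anything, so that pair is not flagged.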

  • Tracing an error, alert, or problem back to its source – for example, a file could be corrupted while someone was attempting to write to it, generating an alert – or a high CPU usage alert could be raised on a server while a user was connecting to it.

A graph database can be used to trace these alerts back to a user and even to a specific IP address. It is worth noting that doing so successfully requires traversing multiple hops, something that can take a fraction of a second with a graph database, but minutes or hours with a relational database.
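
The tracing step can be sketched as a walk from the alert along "caused by" edges until an IP address is reached. The node labels, the `caused_by` mapping and the `ip:` prefix convention below are assumptions made for this example, not part of any product's schema.

```python
from collections import deque

# Hypothetical edges pointing from an effect to its cause.
caused_by = {
    "alert:file-corrupted": ["file:/data/report.db"],
    "file:/data/report.db": ["write:8841"],
    "write:8841": ["session:77"],
    "session:77": ["user:mallory"],
    "user:mallory": ["ip:203.0.113.9"],
}


def trace_to_source(caused_by, alert, source_prefix="ip:"):
    """Breadth-first search from `alert`; return the shortest path
    that ends at a node whose label starts with `source_prefix`,
    or None if no such node is reachable."""
    queue = deque([[alert]])
    visited = {alert}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node.startswith(source_prefix):
            return path
        for cause in caused_by.get(node, ()):
            if cause not in visited:
                visited.add(cause)
                queue.append(path + [cause])
    return None


path = trace_to_source(caused_by, "alert:file-corrupted")
hop_count = len(path) - 1
```

In this toy data the alert sits five hops from the originating IP address, which is the depth at which join-based lookups tend to become expensive.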

  • Detecting anomalies – this can include a flooding detection event, when a service receives many more requests than usual – or a footprinting detection event, when a service receives a large number of requests from a single user, who may be probing for weaknesses in the security measures of that service.

A graph database, which models normal patterns of behavior, can detect anomalous events in real time.
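
One simple way to approximate these two checks is shown below. It is a sketch under stated assumptions: the thresholds (three standard deviations, a 50% share of traffic, a minimum of 20 requests) and the function names are illustrative choices, not values taken from this article or from any product.

```python
from collections import Counter
from statistics import mean, pstdev


def is_flooding(history, current, threshold=3.0):
    """True if `current` exceeds the historical mean request count
    by more than `threshold` population standard deviations."""
    baseline = mean(history)
    spread = pstdev(history)
    return current > baseline + threshold * spread


def footprinting_users(request_users, max_share=0.5, min_requests=20):
    """Return users responsible for more than `max_share` of the
    requests to one service, given one user name per request."""
    total = len(request_users)
    if total < min_requests:
        return []  # too little traffic to judge
    counts = Counter(request_users)
    return sorted(u for u, n in counts.items() if n / total > max_share)


# Hypothetical per-minute request counts for one service.
history = [100, 110, 95, 105, 90]

# Hypothetical request log for the same service.
request_users = ["eve"] * 30 + ["alice"] * 5 + ["bob"] * 5

flooding = is_flooding(history, 400)
probing = footprinting_users(request_users)
```

The baseline here is a plain mean and standard deviation; in a graph setting the same comparison would be made against counts of request edges attached to each service and user vertex.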

  • Extracting feature sets that can be used for machine learning – one feature is the number of shortest paths from a new user to blacklisted users and IP addresses – another is the number of blacklisted users within one hop, two hops, three hops, and so on – while another is a characterisation of the new user's environment using the k-nearest neighbors algorithm.

These types of graph features can be generated easily and used to train artificial intelligence to detect and prevent cybersecurity attacks at internet scale, in real time.
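
Two of the features mentioned above, the number of shortest paths to blacklisted entities and the number of blacklisted entities within k hops, can be computed with a single breadth-first pass. The sketch below assumes an undirected adjacency list with made-up node names; the k-nearest-neighbor feature is left out because it depends on a similarity measure the article does not specify.

```python
from collections import deque


def distances_and_path_counts(graph, start):
    """Breadth-first search returning, for each reachable node, its
    hop distance from `start` and the number of distinct shortest
    paths leading to it."""
    dist = {start: 0}
    paths = {start: 1}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                paths[neighbor] = paths[node]
                queue.append(neighbor)
            elif dist[neighbor] == dist[node] + 1:
                # another shortest route into an already-seen node
                paths[neighbor] += paths[node]
    return dist, paths


def blacklist_features(graph, new_user, blacklist, max_hops=3):
    """Build a feature dictionary for `new_user`."""
    dist, paths = distances_and_path_counts(graph, new_user)
    features = {
        "shortest_paths_to_blacklist": sum(
            paths[b] for b in blacklist if b in dist
        )
    }
    for k in range(1, max_hops + 1):
        features[f"blacklisted_within_{k}_hops"] = sum(
            1 for b in blacklist if b in dist and dist[b] <= k
        )
    return features


# Hypothetical undirected graph (each edge listed in both directions).
graph = {
    "user:new": ["ip:1", "device:1"],
    "ip:1": ["user:new", "user:bad"],
    "device:1": ["user:new", "user:bad"],
    "user:bad": ["ip:1", "device:1", "ip:bad"],
    "ip:bad": ["user:bad"],
}
blacklist = {"user:bad", "ip:bad"}

features = blacklist_features(graph, "user:new", blacklist)
```

The resulting dictionary can be appended to whatever other attributes are known about the account and passed to a classifier as one training row.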