The Road to a Standardized Graph Query Language: GQL, Part 1

Last month I had an epiphany, 10,000 miles from home.

It wasn’t until the second day in Brisbane that it struck me: I was sitting on the ISO working group committee that defined, and continues to maintain, the specifications for SQL, the language that is synonymous with relational databases. These were the specs I had studied and read about in college. I was there because TigerGraph is joining forces with other companies to define a universal standard for a property graph query language. There already is a W3C standard for semantic graphs, SPARQL, but with the rise in popularity of property graph databases like TigerGraph, the need for a standard query language and data model for property graphs is growing.

I’ll be honest: I was dreading five straight days of committee meetings. But it turned out to be quite interesting, even fun at times. The main reason was that here was a group of people all deeply invested in the topic, talking shop and trying to find the best solution for you, the end users. (Being in the beautiful city of Brisbane during the Australian summer didn’t hurt either.)

Many of the representatives, like me, come from industry, so we also have the interests of our design teams and our current customers in mind. But every proposal has to be judged on its merits for the industry as a whole. Having TigerGraph, Neo4j, Oracle, and others, including academics and industry consultants, at the table keeps it fair and focused.

The Road to GQL

GQL is the proposed name for the new standard property graph query language. Standards take time. This June the working group will make its formal request to the next-higher-level organization in ISO to authorize us to develop a standard. Then, based on the history of such efforts and the amount of work we know we need to do, we are looking at 2022 for a final version. So, if you are shopping for a graph database now, consider the value and performance each platform has to offer today, including its query language. To help protect your investment, TigerGraph is actively engaged in the new standards process. We’re already making changes in 2019 to improve GSQL’s usability and to move it toward a future standard:

  • Interpreted mode, so you can run GSQL queries immediately, without compiling first.
  • Multi-hop patterns in the FROM clause, to express pattern matching more succinctly.

I say more about the transition to GQL below.

GQL is not going to be a rebranding of any one vendor’s current language, for two important reasons:

  1. Any honest assessment says that each of the major languages in existence has made some valuable contributions and innovations which deserve to be in a forthcoming standard. TigerGraph’s user community has told us how much they like accumulators, for example. Accumulators are included in our proposals.
  2. The standard is for the future. There are features that users would like which no vendor has implemented yet, or has not yet implemented in ideal form. One example is “query compositionality”: the ability of a query to take one type of object as input and to return the same type of object, so that you can nest queries. Numeric functions take numbers and return a number. SQL queries take tables and return a table. A graph query should take graph(s) and return a graph.

Both accumulators and composable, nestable queries are included in our latest proposal to the standards bodies, Seamless Querying of Relational and Graph Languages.
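
To make the idea of compositionality concrete, here is a minimal Python sketch. It is not GQL or GSQL syntax, and it is not taken from any proposal; the `Graph` type and the `edges_labeled`, `reachable_from`, and `compose` helpers are hypothetical names invented purely for illustration. The only point it shows is that when every query takes a graph and returns a graph, queries can be nested freely.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

# Hypothetical, illustration-only data model: an edge is (source, label, target).
Edge = Tuple[str, str, str]


@dataclass(frozen=True)
class Graph:
    vertices: FrozenSet[str]
    edges: FrozenSet[Edge]


# A "graph query" here is any function that takes a graph and returns a graph.
GraphQuery = Callable[[Graph], Graph]


def edges_labeled(label: str) -> GraphQuery:
    """Build a query that keeps only edges with the given label, plus their endpoints."""
    def run(g: Graph) -> Graph:
        kept = frozenset(e for e in g.edges if e[1] == label)
        verts = frozenset(v for s, _, t in kept for v in (s, t))
        return Graph(verts, kept)
    return run


def reachable_from(start: str) -> GraphQuery:
    """Build a query that keeps the subgraph reachable from `start` along edge direction."""
    def run(g: Graph) -> Graph:
        if start not in g.vertices:
            return Graph(frozenset(), frozenset())
        seen = {start}
        frontier = [start]
        while frontier:
            v = frontier.pop()
            for s, _, t in g.edges:
                if s == v and t not in seen:
                    seen.add(t)
                    frontier.append(t)
        kept = frozenset(e for e in g.edges if e[0] in seen and e[2] in seen)
        return Graph(frozenset(seen), kept)
    return run


def compose(*queries: GraphQuery) -> GraphQuery:
    """Chain queries left to right; this works because each one maps Graph to Graph."""
    def run(g: Graph) -> Graph:
        for q in queries:
            g = q(g)
        return g
    return run


social = Graph(
    vertices=frozenset({"ann", "bob", "cat", "dan"}),
    edges=frozenset({
        ("ann", "follows", "bob"),
        ("bob", "follows", "cat"),
        ("cat", "blocks", "dan"),
        ("dan", "follows", "ann"),
    }),
)

# Nest two queries: first keep only "follows" edges, then keep what ann can reach.
result = compose(edges_labeled("follows"), reachable_from("ann"))(social)
```

Because `result` is itself a `Graph`, it could be fed straight into another query. That is the same property that lets a SQL query be used as a subquery: a table goes in and a table comes out.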

Creating Bridges

Next month I will be going to Berlin for a W3C Workshop on standardizing graph languages for the web. The RDF data model began life as a vehicle for representing and sharing semantic information, to be stored and distributed across the Web. SPARQL has risen to be the standard way to query an RDF data collection. Semantic graphs and property graphs were designed with different needs and use cases in mind, and have followed different roads for their query languages. But real-world users don’t see things in such black-and-white terms. They have one set of data: sometimes they have a use that calls for semantic reasoning; sometimes they have an analytical or algorithmic need. And sometimes it’s a blend of the two. In Berlin, with over 100 people in attendance, I will be one of several people leading discussions on how to best serve those users.

Our Commitment

TigerGraph’s VP Engineering and GSQL Architect Mingxi Wu, our Chief Scientist Alin Deutsch, and I (Director of Product Management, with a background in databases, graph algorithms, and data mining) are the core leadership team working through ISO, ANSI, and W3C to make sure graph users get the best language standard they possibly can.

Our commitment to you:

  • To always deliver the best and most consistent property graph query language that we can, for GSQL now and for GQL in the future.
  • To stay in the thick of things regarding standards efforts and industry trends.
  • To offer improvements and innovations, regardless of the final stamp of standardization.
  • To provide a smooth transition from GSQL to GQL, and to maintain dual support for as long as appropriate.
  • To provide the overall fastest, most scalable, and most reliable graph database platform, regardless of query language.

I’ll have further developments and insights to share after Berlin.