Version 2.0 to 2.1
The GSQL™ software program is the TigerGraph comprehensive environment for designing graph schemas, loading and managing data to build a graph, and querying the graph to perform data analysis. In short, TigerGraph users do most of their work via the GSQL program. This document presents the syntax and features of the GSQL language.
This document is a reference manual, not a tutorial. The user should read GSQL Demo Examples v2.1 prior to using this document. There are also User Guides or Tutorials for particular aspects of the GSQL environment. This document is best used when the reader already has some basic familiarity with running GSQL and then wants a more detailed understanding of a particular topic.
This document is Part 1 of the GSQL Language Reference, which describes system basics, defining a graph schema, and loading data. Part 2 describes querying.
A handy GSQL Reference Card lists the syntax for the most commonly used GSQL commands for graph definition and data loading. Look for the reference card on our User Document home page.
The GSQL workflow has four major steps:
- Define a graph schema or model.
- Load data into the TigerGraph system.
- Create and install queries.
- Run queries.
After initial data and queries have been installed, the user can run queries or go back to load more data and create additional queries. This document provides specifications and details for steps 1 and 2. The Appendix contains flowcharts which provide a visual understanding of the required and allowed sequence of commands to proceed through the workflow.
Identifiers are user-defined names. An identifier consists of letters, digits, and the underscore. Identifiers may not begin with a digit. Identifiers are case sensitive.
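The identifier rule above can be modeled with a short check. This is a minimal Python sketch, not part of GSQL: the function name is hypothetical, restricting letters to ASCII is an assumption, and the check does not test whether a name is a reserved word.

```python
import re

# Rule as stated above: letters, digits, and the underscore only,
# and the first character must not be a digit.
# Assumption of this sketch: "letters" means ASCII letters.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if name satisfies the identifier rule described above."""
    return _IDENTIFIER.fullmatch(name) is not None


if __name__ == "__main__":
    # Identifiers are case sensitive, so "person" and "Person" are two
    # different (and both valid) names.
    for candidate in ["person", "Person", "_tmp1", "2fast", "first-name"]:
        print(candidate, is_valid_identifier(candidate))
```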
Keywords and Reserved Words
Keywords are words with a predefined semantic meaning in the language. Keywords are not case sensitive. Reserved words are set aside for use by the language, either now or in the future. Reserved words may not be reused as user-defined identifiers. In most cases, a keyword is also a reserved word. For example, VERTEX is a keyword. It is also a reserved word, so VERTEX may not be used as an identifier.
Each line corresponds to one statement (except in multi-line mode). Usually, there is no punctuation at the end of a top-level statement. Some statements, such as CREATE LOADING JOB, are block statements which enclose a set of statements within themselves. Some punctuation may be needed to separate the statements within a block.
Within a command file, comments are text that is ignored by the language interpreter.
Single line comments begin with either # or //.
A comment may be on the same line with interpreted code. Text to the left of the comment marker is interpreted, and text to the right of the marker is ignored.
Multi-line comment blocks begin with /* and end with */.
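To make the comment rules concrete, here is a small Python sketch that removes comments from command-file text the way the rules above describe. It is an illustration only, not the GSQL interpreter: the function name is hypothetical, and the sketch deliberately ignores quoted strings, so a comment marker inside a string literal would be treated as a comment.

```python
import re

# Multi-line comment blocks: everything from /* to the nearest */.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
# Single-line comments: from # or // to the end of the line.
_LINE_COMMENT = re.compile(r"(#|//).*")


def strip_comments(text: str) -> str:
    """Remove block comments first, then single-line comments.

    Simplification (assumption): quoted strings are not parsed, so comment
    markers inside string literals are not protected.
    """
    without_blocks = _BLOCK_COMMENT.sub("", text)
    cleaned = [_LINE_COMMENT.sub("", line).rstrip()
               for line in without_blocks.splitlines()]
    return "\n".join(cleaned)


if __name__ == "__main__":
    sample = (
        "CREATE UNDIRECTED EDGE e1 (FROM v, TO v) # text after the marker is ignored\n"
        "/* this block\n   spans two lines */\n"
        "CREATE GRAPH g(v) // so is this\n"
    )
    print(strip_comments(sample))
```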
In the documentation, code examples are either template code (formally describing the syntax of part of the language) or actual code. Actual code examples show code that can be run exactly as shown, e.g., by copy-and-paste. Template code, on the other hand, cannot be run exactly as shown because it uses placeholder names and additional symbols to explain the syntax. It should be clear from context whether an example is template code or actual code.
This guide uses conventional notation for software documentation. In particular, note the following:
Most of the examples in this document take place within the GSQL shell. When clarity is needed, the GSQL shell prompt is represented by a greater-than arrow:
When a command is to be issued from the operating system, outside of the GSQL shell, the prompt is the following:
In the GSQL language, keywords are not case sensitive, but user-defined identifiers are case sensitive. In code examples, keywords are in ALL CAPS to make clear the distinction between keywords and user-defined identifiers.
In a very few cases, some option keywords are case-sensitive. For example, in the command to delete all data from the graph store,
clear graph store -HARD
the option -HARD must be in all capital letters.
Placeholder identifiers and values
In template code, any token that is not a keyword, a literal value, or punctuation is a placeholder identifier or a placeholder value. Example:
CREATE UNDIRECTED EDGE
The user-defined identifiers include vertex_type_name1, vertex_type_name2, and attribute_name. As explained in the Create Vertex section, the type placeholder that follows an attribute name stands for one of the attribute data types.
When quotation marks are shown, they are to be typed as shown (unless stated otherwise). A placeholder for a string value will not have quotation marks in the template code, but if a template is converted to actual code, quotation marks should be used around string values.
The vertical bar | separates the choices when the syntax requires the user to choose one out of a set of values. Example: Either the keyword
is to be used. Also, note the inclusion of quotation marks. Template:
Possible actual values:
TO VERTEX user VALUES ($0, $1, $2)
Square brackets are used to enclose a portion that is optional. Options can be nested. Square brackets are rarely used as part of the GSQL language itself. Example: In the RUN JOB statement, the -n flag is optional. If used, -n is to be followed by a value.
RUN JOB [-n
Sometimes, options are nested, which means that an inner option can only be used if the outer option is used:
RUN JOB [-n [
may be specified if and only if
is specified first. These options provide three possible forms for this statement:
RUN JOB -n
RUN JOB -n
Repeated zero or more times
In template code, it is sometimes desirable to show that a term is repeated an arbitrary number of times. For example, a vertex definition contains zero or more user-defined attributes. A loading job contains one or more LOAD statements. In formal template code, if an asterisk (Kleene star) immediately follows option brackets, then the bracketed term can be repeated zero or more times. For example:
means that the VALUES list contains at least one attribute expression. It may be followed by any number of additional attribute expressions. Each additional attribute expression must be preceded by a comma.
For more convenient display, long statements in this guide may sometimes be displayed on multiple lines. This is for display purposes only; the actual code must be entered as a single line (unless the multiline mode is used). When necessary, the examples may show a shell prompt before the start of a statement, to clearly mark where each statement begins.
Example: A SELECT query is grammatically a single statement, so GSQL requires that it be entered as a single line.
However, the statement is easier to read and to understand when displayed one clause per line:
System and Language Basics
New -g Option to set the working graph
To enter the GSQL shell and work in interactive mode, type gsql from an operating system shell prompt. A user name, password, and a graph name may also be provided on the command line.
If a user name is provided but not a password, the GSQL system will then ask for the user’s password:
If a user name is not given, then GSQL will assume that you are attempting to log in as the default tigergraph user:
To exit the GSQL shell, type either EXIT or QUIT at the GSQL prompt:
Multiple Shell Sessions
Multiple shell sessions of GSQL may be run at the same time. This feature can be used to have multiple clients (human or machine) using the system to perform concurrent operations. A basic locking scheme is used to maintain isolation and consistency.
Multi-line Mode – BEGIN, END, ABORT
In interactive mode, the default behavior is to treat each line as one statement; the GSQL interpreter will activate as soon as the End-Of-Line character is entered.
Multi-line mode allows the user to enter several lines of text without triggering immediate execution. This is useful when a statement is very long and the user would like to split it into multiple lines. It is also useful when defining a JOB, because jobs typically contain multiple statements.
To enter multi-line mode, use the command BEGIN. The end-of-line character is now disabled from triggering execution. The shell remains in multi-line mode until the command END is entered. The END command also triggers the execution of the multi-line block. In the example below, BEGIN and END are used to allow the SELECT statement to be split into several lines:
Alternatively, the ABORT command exits multi-line mode and discards the multi-line block.
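The behavior of BEGIN, END, and ABORT can be pictured as a small state machine. The Python sketch below is a toy model of that behavior, not the actual GSQL shell: the class and attribute names are hypothetical, and joining the buffered lines with a single space is an assumption made for illustration.

```python
class ShellBuffer:
    """Toy model of interactive-mode line handling (not the real GSQL shell)."""

    def __init__(self) -> None:
        self.multiline = False   # True between BEGIN and END/ABORT
        self.pending = []        # lines collected since BEGIN
        self.executed = []       # statements handed to the interpreter

    def feed(self, line: str) -> None:
        # Keywords are not case sensitive, so compare in upper case.
        command = line.strip().upper()
        if command == "BEGIN" and not self.multiline:
            self.multiline = True
            self.pending = []
        elif command == "END" and self.multiline:
            # END leaves multi-line mode and triggers execution of the block.
            self.executed.append(" ".join(self.pending))
            self.multiline = False
            self.pending = []
        elif command == "ABORT" and self.multiline:
            # ABORT leaves multi-line mode and discards the block.
            self.multiline = False
            self.pending = []
        elif self.multiline:
            # End-of-line no longer triggers execution; just buffer the line.
            self.pending.append(line.strip())
        else:
            # Default behavior: each line is one statement, run immediately.
            self.executed.append(line.strip())


if __name__ == "__main__":
    shell = ShellBuffer()
    for text in ["LS", "BEGIN", "SELECT a", "FROM b", "END",
                 "BEGIN", "discarded line", "ABORT"]:
        shell.feed(text)
    print(shell.executed)
```

In this model, only the single-line statement and the block closed by END reach the interpreter; the block closed by ABORT never does.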
Command Files and Inline Commands
A command file is a text file containing a series of GSQL statements. Blank lines and comments are ignored. By convention, GSQL command files end with the suffix .gsql, but this is not a requirement. Command files are automatically treated as multi-line mode, so BEGIN and END statements are not needed. Command files may be run either from within the GSQL shell by prefixing the filename with an @ symbol:
or from the operating system (i.e., a Linux shell) by giving the filename as the argument after gsql:
Similarly, a single GSQL command can be run from the operating system by enclosing the command string in quotation marks and placing it after gsql. Either single or double quotation marks may be used. It is recommended to use single quotation marks to enclose the entire command and double quotation marks to enclose any strings within the command.
In the example below, the file name_query.gsql contains the multi-line CREATE QUERY block to define the query namesSimilar.
Help and Information
The HELP command displays a summary of the available GSQL commands:
Note that the HELP command has options for showing more details about certain categories of commands.
The LS command displays the catalog: all the vertex types, edge types, graphs, queries, jobs, and session parameters which have been defined by the user.
The --reset option will clear the entire graph data store and erase all related definitions (graph schema, loading jobs, and queries) from the Dictionary. The data deletion cannot be undone; use with extreme caution. The REST++, GPE, and GSE modules will be turned off.
The table below summarizes the basic system commands introduced so far.
|Command||Description|
|HELP||Display the help menu for all or a subset of the commands.|
|LS||Display the catalog, which records all the vertex types, edge types, graphs, queries, jobs, and session parameters that have been defined for the current active graph.|
|BEGIN||Enter multi-line edit mode (only for console mode within the shell).|
|END||Finish multi-line edit mode and execute the multi-line block.|
|ABORT||Abort multi-line edit mode and discard the multi-line block.|
|@ followed by the file name||Run the GSQL statements in the command file from within the GSQL shell.|
|gsql followed by the file name||Run the GSQL statements in the command file from an operating system shell.|
|gsql followed by the quoted command string||Run a single GSQL statement from the operating system shell.|
|--reset option||Clear the graph store and erase the dictionary.|
Notes on the LS command
Starting with v1.2, the output of the LS command is sensitive to the user and the active graph:
- If the user has not set an active graph or specified “USE GLOBAL”:
  - If the user is a superuser, then LS displays global vertices, global edges, and all graph schemas.
  - If the user is not a superuser, then LS displays nothing (null).
- If the user has set an active graph, then LS displays the schema, jobs, queries, and other definitions for that particular graph.
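The rules above amount to a small decision table. The following Python sketch restates them; it is an illustration only, the function and argument names are hypothetical, and passing None for the active graph stands for both "no active graph set" and "USE GLOBAL".

```python
from typing import List, Optional


def ls_output_scope(is_superuser: bool, active_graph: Optional[str]) -> List[str]:
    """Return a description of what LS displays, following the rules above.

    active_graph=None models both "no active graph" and "USE GLOBAL"
    (an assumption of this sketch).
    """
    if active_graph is None:
        if is_superuser:
            return ["global vertices", "global edges", "all graph schemas"]
        # A non-superuser without an active graph sees nothing.
        return []
    return ["schema, jobs, queries, and other definitions for graph " + active_graph]


if __name__ == "__main__":
    print(ls_output_scope(True, None))
    print(ls_output_scope(False, None))
    print(ls_output_scope(False, "g"))
```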
Session parameters are built-in system variables whose values are valid during the current session; their values do not endure after the session ends. In interactive command mode, a session starts and ends when entering and exiting interactive mode, respectively. When running a command file, the session lasts during the execution of the command file.
Use the SET command to set the value of a session parameter:
|Session Parameter||Meaning and Usage|
|sys.data_root||The value should be a string, representing the absolute or relative path to the folder where data files are stored. After the parameter has been set, a loading statement can reference this parameter with $sys.data_root.|
|gsql_src_dir||The value should be a string, representing the absolute or relative path to the root folder for the gsql system installation. After the parameter has been set, a loading statement can reference this parameter with $gsql_src_dir.|
|exit_on_error||When this parameter is true (default), if a semantic error occurs while running a GSQL command file, the GSQL shell will terminate. Accepted parameter values: true, false (case insensitive). If the parameter is set to false, then a command file which is syntactically correct will continue running, even if certain runtime errors in individual commands occur. Specifically, this affects these commands:
Semantic errors include a reference to a nonexistent entity or an improper reuse of an entity.
This session parameter does not affect GSQL interactive mode; GSQL interactive mode does not exit on any error.
CREATE UNDIRECTED EDGE e2 (FROM u, TO v) #error 2: vertex type u doesn’t exist
CREATE UNDIRECTED EDGE e1 (FROM v, TO v)
CREATE GRAPH g(v) #error 3: no graph definition has no edge type
CREATE GRAPH g2(*)
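The effect of exit_on_error on a command file can be sketched as a loop that either stops at the first semantic error or records it and continues. The Python model below is a hedged illustration, not GSQL internals: SemanticError, run_command_file, fake_execute, and KNOWN_VERTEX_TYPES are all hypothetical names, and the fake executor only checks that the vertex types named in FROM and TO exist.

```python
import re
from typing import Callable, List, Tuple


class SemanticError(Exception):
    """Stand-in for a semantic error such as a reference to a nonexistent entity."""


def run_command_file(statements: List[str],
                     execute: Callable[[str], None],
                     exit_on_error: bool = True) -> List[Tuple[str, str]]:
    """Run statements in order and return (statement, outcome) pairs."""
    results = []
    for statement in statements:
        try:
            execute(statement)
            results.append((statement, "ok"))
        except SemanticError as err:
            results.append((statement, "error: " + str(err)))
            if exit_on_error:
                break  # default behavior: stop at the first semantic error
    return results


# Hypothetical catalog state: only vertex type v has been defined.
KNOWN_VERTEX_TYPES = {"v"}


def fake_execute(statement: str) -> None:
    """Toy executor that only validates the FROM/TO vertex types of an edge."""
    match = re.search(r"FROM (\w+), TO (\w+)", statement)
    if match:
        for name in match.groups():
            if name not in KNOWN_VERTEX_TYPES:
                raise SemanticError("vertex type " + name + " doesn't exist")


if __name__ == "__main__":
    commands = [
        "CREATE UNDIRECTED EDGE e2 (FROM u, TO v)",
        "CREATE UNDIRECTED EDGE e1 (FROM v, TO v)",
    ]
    print(run_command_file(commands, fake_execute, exit_on_error=True))
    print(run_command_file(commands, fake_execute, exit_on_error=False))
```

With exit_on_error set to true, the run stops after the first statement; with it set to false, the error is recorded and the second statement still runs.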