GSQL Language Reference Querying

Version 2.1

Introduction

The GSQL® Query Language is a language for the exploration and analysis of large scale graphs. The high-level language makes it easy to perform powerful graph traversal queries in the TigerGraph system. By combining features familiar to database users and programmers with highly expressive new capabilities, the GSQL query language offers both easy authoring and powerful execution.

A GSQL query contains one or more SELECT statements, where each SELECT statement describes a traversal over a set of vertices and edges in the graph or describes a selection of a subset of vertices.  By combining multiple SELECT statements, the user can map out query patterns to answer a virtually unlimited set of real-life data questions.

This document focuses on the formal specification for the GSQL Query Language. It includes example queries which demonstrate the language, each of which works on one of the following six graphs: workNet, socialNet, friendNet, computerNet, minimalNet, and investmentNet. Their schemas are shown below. Appendix D lists the full command and data files to create and load these graphs with small sets of data (~10 to 20 vertices). The data sets are small so that you can understand the result of each query example. The tarball file gsql_ref_examples_2.0.tar.gz contains all of the graph schemas, data files, and queries.

Schemas for Example Graphs


Graph Schema: socialNet

 



CREATE VERTEX person(PRIMARY_ID personId UINT, id STRING, gender STRING) WITH STATS="OUTDEGREE_BY_EDGETYPE"
CREATE UNDIRECTED EDGE friend(FROM person, TO person)
CREATE VERTEX post(PRIMARY_ID postId UINT, subject STRING, postTime DATETIME)
CREATE DIRECTED EDGE posted(FROM person, TO post)
CREATE DIRECTED EDGE liked(FROM person, TO post, actionTime DATETIME)

Graph Schema: workNet

 



CREATE VERTEX person(PRIMARY_ID personId STRING, id STRING, locationId STRING, skillSet SET<INT>, skillList LIST<INT>, interestSet SET<STRING COMPRESS>, interestList LIST<STRING COMPRESS>)
CREATE VERTEX company(PRIMARY_ID clientId STRING, id STRING, country STRING)
CREATE UNDIRECTED EDGE worksFor(FROM person, TO company, startYear INT, startMonth INT, fullTime BOOL)

 


Graph Schema: friendNet

 



CREATE VERTEX person(PRIMARY_ID personId UINT, id STRING)
CREATE UNDIRECTED EDGE friend(FROM person, TO person)
CREATE UNDIRECTED EDGE coworker(FROM person, TO person)

 


Graph Schema: computerNet

 



CREATE VERTEX computer(PRIMARY_ID compID STRING, id STRING)
CREATE DIRECTED EDGE connected(FROM computer, TO computer, connectionSpeed INT)

Graph Schema: minimalNet

 



CREATE VERTEX testV(PRIMARY_ID id STRING)
CREATE UNDIRECTED EDGE testE(FROM testV, TO testV)

 


Graph Schema: investmentNet
TYPEDEF TUPLE < age UINT (4), mothersName STRING(20) > SECRET_INFO
CREATE VERTEX person(PRIMARY_ID personId STRING, portfolio MAP<STRING, DOUBLE>, secretInfo SECRET_INFO)
CREATE VERTEX stockOrder(PRIMARY_ID orderId STRING, ticker STRING, orderSize UINT, price FLOAT)
CREATE UNDIRECTED EDGE makeOrder(FROM person, TO stockOrder, orderTime DATETIME)

CREATE / INSTALL / RUN / SHOW / DROP QUERY

 

A GSQL query is a compiled data retrieval-and-computation task. Users can write queries to explore a data graph however they like, to read and make computations on the graph data along the way, to update the graph, and to deliver resulting data. A query is analogous to a user-defined procedure or function: it can have one or more input parameters, and it can produce output in two ways: by returning a value or by printing. Using a query is a three-step procedure:

  1. CREATE QUERY: define the functionality of the query
  2. INSTALL QUERY: compile the query
  3. RUN QUERY: execute the query with input values

 

Query Action Privileges


Users with querywriter role or greater (architect, admin, and superuser) can create, install and drop queries.

Any user with queryreader role or greater for a given graph can run the queries for that graph.





To implement fine-grained control over which queries can be executed by which sets of users:

  1. Group your queries into your desired privilege groups.
  2. Define a graph for each privilege group. These graphs can all have the same domain if you wish.
  3. Create your queries, assigning each to its appropriate privilege group.

 


EBNF for CREATE QUERY
createQuery := CREATE [DISTRIBUTED][OR REPLACE] QUERY name "(" [parameterList] ")" FOR GRAPH name
[RETURNS "(" baseType | accumType ")"]
[API "(" stringLiteral ")"]
"{" [typedefs] [declStmts] [declExceptStmts] queryBodyStmts "}"

parameterValueList := parameterValue [, parameterValue]*
parameterValue := parameterConstant
| "[" parameterValue [, parameterValue]* "]" // BAG or SET
| "(" stringLiteral, stringLiteral ")" // a generic VERTEX value
parameterConstant := numeric | stringLiteral | TRUE | FALSE
parameterList := parameterType name ["=" constant] ["," parameterType name ["=" constant]]*

typedefs := (typedef ";")+
declStmts := (declStmt ";")+
declStmt := baseDeclStmt | accumDeclStmt | fileDeclStmt
declExceptStmts := (declExceptStmt ";")+
queryBodyStmts := (queryBodyStmt ";")+

installQuery := INSTALL QUERY [installOptions] ( "*" | ALL | name [, name]* )
runQuery := RUN QUERY [runOptions] name "(" parameterValueList ")"

showQuery := SHOW QUERY name
dropQuery := DROP QUERY ( "*" | ALL | name [, name]* )

CREATE QUERY Statement

createQuery := CREATE [DISTRIBUTED][OR REPLACE] QUERY name "(" [parameterList] ")" FOR GRAPH name
[RETURNS "(" baseType | accumType ")"]
"{" [typedefs] [declStmts] [declExceptStmts] queryBodyStmts "}"

CREATE QUERY defines the functionality of a query on a given graph schema.

A query has a name, a parameter list, the name of the graph being queried, an optional RETURNS type (see Section “RETURN Statement” for more details), an optional specifier for the output API, and a body. The body consists of an optional sequence of typedefs, followed by an optional sequence of declarations, then followed by one or more statements. The body defines the behavior of the query.

The DISTRIBUTED option applies only to installations where the graph has been distributed across a cluster. If specified, the query will run with a different execution model, which may give better performance for queries that traverse a large portion of the cluster. Not all GSQL query language features are supported in DISTRIBUTED mode. For details, see the separate document: Distributed Query Mode.

OR REPLACE is deprecated


If the optional keywords OR REPLACE are included, then this query definition, if error-free, will replace a previous definition with the same query name.  However, if there are any errors in this query definition, then the previous query definition will be maintained.  If the OR REPLACE option is not used, then GSQL will reject a CREATE QUERY command that uses an existing name.

 

Typedefs allow the programmer to define custom types for use within the body. The declarations support definition of accumulators (see Chapter “Accumulators” for more details) and global/local variables. All accumulators and global variables must be declared before any statements. There are various types of statements that can be used within the body. Typically, the core statements in the body of a query are one or more SELECT, UPDATE, INSERT, or DELETE statements. The language supports conditional statements such as an IF statement as well as looping constructs such as WHILE and FOREACH. It also supports calling functions, assigning variables, printing, and modifying the graph data.

The query body may include calls to other queries. That is, the other queries are treated as subquery functions.  See the subsection on “Queries as Functions”.


Example of a CREATE QUERY statement
CREATE QUERY createQueryEx (STRING uid) FOR GRAPH socialNet RETURNS (INT) {
    # declaration statements
    users = {person.*};
    # body statements
    posts = SELECT p
        FROM users:u-(posted)->:p
        WHERE u.id == uid;
    PRINT posts;
    RETURN posts.size();
}

Query Parameter and Return Types

This table lists the supported data types for input parameters and return values.

Parameter Types
  • any baseType (except EDGE): INT, UINT, FLOAT, DOUBLE, STRING, BOOL, VERTEX, JSONOBJECT, JSONARRAY
  • SET<baseType>, BAG<baseType>
  • Exception: EDGE type is not supported, either as a primitive parameter or as part of a complex type.

Return Types
  • any baseType (including EDGE): INT, UINT, FLOAT, DOUBLE, STRING, BOOL, VERTEX, EDGE, JSONOBJECT, JSONARRAY
  • any accumulator type, except GroupByAccum

Statement Types

A statement is a standalone instruction that expresses an action to be carried out. The most common statements are data manipulation language (DML) statements. DML statements include the SELECT, UPDATE, INSERT INTO, DELETE FROM, and DELETE statements.

A GSQL query has two levels of statements. The upper-level statement type is called query-body-level statement, or query-body statement for short. This statement type is part of either the top-level block or a query-body control flow block. For example, each of the statements at the top level directly under CREATE QUERY is a query-body statement. If one of the statements is a CASE statement with several THEN blocks, each of the statements in the THEN blocks is also a query-body statement. Each query-body statement ends with a semicolon.

The lower-level statement type is called DML-sub-level statement, or DML-sub-statement for short. This statement type is used inside certain query-body DML statements, to define particular data manipulation actions. DML-sub-statements are comma-separated. There is no comma or semicolon after the last DML-sub-statement in a block. For example, if one of the top-level statements is a SELECT statement, each of the statements in its ACCUM clause is a DML-sub-statement. If one of those DML-sub-statements is a CASE statement, each of the statements in the THEN blocks is also a DML-sub-statement.

There is some overlap in the types. For example, an assignStmt can be used either at the query-body level or the DML-sub-level.

queryBodyStmts := (queryBodyStmt ";")+

queryBodyStmt := assignStmt // Assignment
| vSetVarDeclStmt // Declaration
| gAccumAssignStmt // Assignment
| gAccumAccumStmt // Assignment
| funcCallStmt // Function Call
| selectStmt // Select
| queryBodyCaseStmt // Control Flow
| queryBodyIfStmt // Control Flow
| queryBodyWhileStmt // Control Flow
| queryBodyForEachStmt // Control Flow
| BREAK // Control Flow
| CONTINUE // Control Flow
| updateStmt // Data Modification
| insertStmt // Data Modification
| queryBodyDeleteStmt // Data Modification
| printStmt // Output
| printlnStmt // Output
| logStmt // Output
| returnStmt // Output
| raiseStmt // Exception
| tryStmt // Exception

DMLSubStmtList := DMLSubStmt ["," DMLSubStmt]*

DMLSubStmt := assignStmt // Assignment
| funcCallStmt // Function Call
| gAccumAccumStmt // Assignment
| vAccumFuncCall // Function Call
| localVarDeclStmt // Declaration
| DMLSubCaseStmt // Control Flow
| DMLSubIfStmt // Control Flow
| DMLSubWhileStmt // Control Flow
| DMLSubForEachStmt // Control Flow
| BREAK // Control Flow
| CONTINUE // Control Flow
| insertStmt // Data Modification
| DMLSubDeleteStmt // Data Modification
| printlnStmt // Output
| logStmt // Output

Guidelines for understanding statement type hierarchy:

  • Top-level statements are Query-Body type (each statement ending with a semicolon).
  • The statements within a DML statement are DML-sub statements (comma-separated list).
  • The blocks within a Control Flow statement have the same type as the entire Control Flow statement itself.

Schematic illustration of relationship between queryBodyStmt and DMLSubStmt
# Each statement's operation type is either ControlFlow, DML, or other.
# Each statement's syntax type is either queryBodyStmt or DMLSubStmt.

CREATE QUERY stmtTypes (parameterList) FOR GRAPH g {
    other queryBodyStmt1;
    ControlFlow queryBodyStmt2          # ControlFlow inside top level.
        other queryBodyStmt2.1;         # subStmts in ControlFlow are queryBody unless inside DML.
        ControlFlow queryBodyStmt2.2    # ControlFlow inside ControlFlow inside top level
            other queryBodyStmt2.2.1;
            other queryBodyStmt2.2.2;
        END;
        DML queryBodyStmt2.3            # DML inside ControlFlow inside top-level
            other DMLSubStmt2.3.1,      # switch to DMLSubStmt
            other DMLSubStmt2.3.2
        ;
    END;
    DML queryBodyStmt3                  # DML inside top level.
        other DMLSubStmt3.1,            # All subStmts in DML must be DMLSubStmt type
        ControlFlow DMLSubStmt3.2       # ControlFlow inside DML inside top level
            other DMLSubStmt3.2.1,
            other DMLSubStmt3.2.2
        ,
        DML DMLSubStmt3.3
            other DMLSubStmt3.3.1,
            other DMLSubStmt3.3.2
    ;
    other queryBodyStmt4;
}

 

Here is a descriptive list of query-body statements:

EBNF term | Common Name | Description
assignStmt | Assignment Statement | See Chapter 6: “Declaration and Assignment Statements”
vSetVarDeclStmt | Vertex Set Variable Declaration Statement |
gAccumAssignStmt | Global Accumulator Assignment Statement |
gAccumAccumStmt | Global Accumulator Accumulation Statement |
funcCallStmt | Function Call or Query Call Statement |
selectStmt | SELECT Statement | See Chapter 7: “SELECT Statement”
queryBodyCaseStmt | query-body CASE statement | See Chapter 8: “Control Flow Statements”
queryBodyIfStmt | query-body IF statement |
queryBodyWhileStmt | query-body WHILE statement |
queryBodyForEachStmt | query-body FOREACH statement |
updateStmt | UPDATE Statement | See Chapter 9: “Data Modification Statements”
insertStmt | INSERT INTO statement |
queryBodyDeleteStmt | Query-body DELETE Statement |
printStmt | PRINT Statement | See Chapter 10: “Output Statements”
logStmt | LOG Statement |
returnStmt | RETURN Statement |
raiseStmt | RAISE Statement | See Chapter 11: “Exception Statements”
tryStmt | TRY Statement |

Here is a descriptive list of DML-sub-statements:

EBNF term | Common Name | Description
assignStmt | Assignment Statement | See Chapter 6: “Declaration and Assignment Statements”
funcCallStmt | Function Call Statement |
gAccumAccumStmt | Global Accumulator Accumulation Statement |
vAccumFuncCall | Vertex-attached Accumulator Function Call Statement |
localVarDeclStmt | Local Variable Declaration Statement | See Chapter 7: “SELECT Statement”
insertStmt | INSERT INTO Statement | See Chapter 9: “Data Modification Statements”
DMLSubDeleteStmt | DML-sub DELETE Statement |
DMLSubCaseStmt | DML-sub CASE statement | See Chapter 8: “Control Flow Statements”
DMLSubIfStmt | DML-sub IF statement |
DMLSubForEachStmt | DML-sub FOREACH statement |
DMLSubWhileStmt | DML-sub WHILE statement |
logStmt | LOG Statement | See Chapter 10: “Output Statements”

INSTALL QUERY

installQuery := INSTALL QUERY [installOptions] ( "*" | ALL | name [, name]* )

A query must be installed before it can be executed. The INSTALL QUERY command will install the queries listed:

INSTALL QUERY queryName1, queryName2, …

It can also install all uninstalled queries, using either of the following commands:

INSTALL QUERY *

INSTALL QUERY ALL

The following options are available:

-force Option

Reinstall the query even if the system indicates the query is already installed. This is useful for overwriting an installation that is corrupted or otherwise outdated, without having to drop and then recreate the query. If this option is not used, the GSQL shell will refuse to re-install a query that is already installed.

-OPTIMIZE Option

During standard installation, the user-defined queries are dynamically linked to the GSQL language code. Anytime after INSTALL QUERY has been performed, another statement, INSTALL QUERY -OPTIMIZE can be executed.  The names of the individual queries are not needed. This operation optimizes all previously installed queries, reducing their run times by about 20%. Optimize a query if query run time is more important to you than query installation time.


Legal:

CREATE QUERY query1...
INSTALL QUERY query1
RUN QUERY query1(...)
...
INSTALL QUERY -OPTIMIZE    # (optional) optimizes run time performance for query1 and query2
RUN QUERY query1(...)      # runs faster than before

Illegal:

INSTALL QUERY -OPTIMIZE query_name

Running a Query

Installing a query creates a REST++ endpoint.
Once a query is installed, there are two ways of executing a query. One way is through the GSQL shell:

RUN QUERY query_name(parameterValues)


CREATE, INSTALL, RUN example
CREATE QUERY RunQueryEx(INT p1, STRING p2, DOUBLE p3) FOR GRAPH testGraph{ …. }
INSTALL QUERY RunQueryEx
RUN QUERY RunQueryEx(1, "test", 3.14)

Query output size limitation


There is a maximum size limit of 2GB for the result set of a SELECT block. A SELECT block is the main component of a query which searches for and returns data from the graph. If the result of the SELECT block is larger than 2GB, the system will return no data. No error message is produced.

The other way, which can reduce the query response time, is to submit an HTTP request directly to the REST++ server: send a GET request to "http://server_ip:9000/query/graphname/queryname". If the REST++ server is local, then server_ip is localhost. The query parameter values are either included directly in the query string of the HTTP request’s URL or supplied using a data payload.


Starting with TigerGraph v1.2, the graph name is part of the GET /query URL.

The following two curl commands are each equivalent to the RUN QUERY command above. The first gives the parameter values in the query string in a URL. This example illustrates the simple format for primitive data types such as INT, DOUBLE, and STRING. The second gives the parameter values through the curl command’s data payload -d option.


Running a query via HTTP request
curl -X GET "http://localhost:9000/query/testGraph/RunQueryEx?p1=1&p2=test&p3=3.14"
curl -d @RunQueryExPara.dat -X GET "http://localhost:9000/query/testGraph/RunQueryEx"

where RunQueryExPara.dat contains exactly the same string as the query string in the first URL.


RunQueryExPara.dat
p1=1&p2=test&p3=3.14

To see a list of the parameter names and types for the user-installed GSQL queries, run the following REST++ request:


curl -X GET "http://localhost:9000/endpoints?dynamic=true"

By using the data payload option, the user can avoid using a long and complex URL. In fact, to call the same query but with different parameters, only the data payload file contents need to be changed; the HTTP request can be the same. The file loader loads the entire file, appends multiple lines into one, and uses the resulting string as the URL query string. If both a query string and a data payload are given (which we strongly discourage), both are included, where the URL query string’s parameter values overwrite the values given in the data payload.
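As a rough illustration of the URL format described above, the following Python sketch assembles a GET /query URL from primitive parameter values. The helper name build_query_url is hypothetical (it is not part of TigerGraph); the host, port 9000, graph name, and query name are taken from the RunQueryEx example, and the sketch only builds the string without sending a request.

```python
from urllib.parse import urlencode

# Hypothetical helper (not a TigerGraph API): builds the GET /query URL
# in the form http://server_ip:9000/query/graphname/queryname?name=value&...
def build_query_url(server_ip, graph_name, query_name, params, port=9000):
    base = f"http://{server_ip}:{port}/query/{graph_name}/{query_name}"
    # urlencode joins name=value pairs with "&" and percent-encodes
    # characters that are not safe in a query string.
    query_string = urlencode(params)
    return f"{base}?{query_string}" if query_string else base

url = build_query_url("localhost", "testGraph", "RunQueryEx",
                      {"p1": 1, "p2": "test", "p3": 3.14})
print(url)
```

The same query string (the part after the question mark) could instead be written to a data payload file such as RunQueryExPara.dat, leaving the URL unchanged between calls.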

Complex Type Parameter Passing

This subsection describes how to format the complex type parameter values when executing a query by RUN QUERY or curl command. More details about all parameter types are described in Section “Query Parameter Types”.

For each parameter type, the format is given first for RUN QUERY and then for the query string of a GET /query HTTP request.

SET or BAG of primitives
  • RUN QUERY: Square brackets enclose the collection of values. Example: a set p1 of integers:
    [1,5,10]
  • Query string: Assign multiple values to the same parameter name. Example: a set p1 of integers:
    p1=1&p1=5&p1=10

VERTEX<type>
  • RUN QUERY: If the vertex type is specified in the query definition, then the vertex argument is simply vertex_id. Example: vertex type is person and desired id is person2:
    "person2"
  • Query string: parameterName=vertex_id. Example: vertex type is person and desired id is person2:
    vp=person2

VERTEX (type not pre-specified)
  • RUN QUERY: If the type is not defined in the query definition, then the argument must provide both the id and type in parentheses: (vertex_id, vertex_type). Example: a vertex va with id="person1" and type="person":
    ("person1","person")
  • Query string: parameterName=vertex_id&parameterName.type=vertex_type. Example: parameter vertex va when type="person" and id="person1":
    va=person1&va.type=person

SET or BAG of VERTEX<type>
  • RUN QUERY: Same as a SET or BAG of primitives, where the primitive type is vertex_id. Example:
    [ "person3", "person4" ]
  • Query string: Same as a SET or BAG of primitives, where the primitive type is vertex_id. Example:
    vp=person3&vp=person4

SET or BAG of VERTEX (type not pre-specified)
  • RUN QUERY: Same as a SET or BAG of vertices, with vertex type not pre-specified. Square brackets enclose a comma-separated list of vertex (id, type) pairs. Mixed types are permitted. Example:
    [ ("person1","person") , ("11","post") ]
  • Query string: The SET or BAG must be treated like an array, specifying the first, second, etc. elements with indices [0], [1], etc. The example below provides the same input arguments as the RUN QUERY example above.
    vp[0]=person1&vp[0].type=person&vp[1]=11&vp[1].type=post

When square brackets are used in a curl URL, either the -g option or escape characters must be used. If the parameters are given by data payload (either by file or data payload string), the -g option is not needed and escape characters should not be used.

Below are examples.


Running a query via HTTP request – complex parameter type
# 1. SET or BAG
CREATE QUERY RunQueryEx2(SET<INT> p1) FOR GRAPH testGraph{ …. }
# To run this query (either RUN QUERY or curl):
GSQL > RUN QUERY RunQueryEx2([1,5,10])
curl -X GET "http://localhost:9000/query/testGraph/RunQueryEx2?p1=1&p1=5&p1=10"

# 2. VERTEX.
# First parameter is any vertex; second parameter must be a person type.
CREATE QUERY printOneVertex(VERTEX va, VERTEX<person> vp) FOR GRAPH socialNet {
    PRINT va, vp;
}
# To run this query:
GSQL > RUN QUERY printOneVertex(("person1","person"),"person2") # 1st param must give type: (vertex_id, vertex_type)
curl -X GET 'http://localhost:9000/query/socialNet/printOneVertex?va=person1&va.type=person&vp=person2'

# 3. BAG or SET of VERTEX, any type
CREATE QUERY printOneBagVertices(BAG<VERTEX> va) FOR GRAPH socialNet {
    PRINT va;
}
# To run this query:
GSQL > RUN QUERY printOneBagVertices([("person1","person"), ("11","post")]) # [(vertex_1_id, vertex_1_type), (vertex_2_id, vertex_2_type), …]
curl -X GET 'http://localhost:9000/query/socialNet/printOneBagVertices?va\[0\]=person1&va\[0\].type=person&va\[1\]=11&va\[1\].type=post'
curl -g -X GET 'http://localhost:9000/query/socialNet/printOneBagVertices?va[0]=person1&va[0].type=person&va[1]=11&va[1].type=post'

# 4. BAG or SET of VERTEX, pre-specified type
CREATE QUERY printOneSetVertices(SET<VERTEX<person>> vp) FOR GRAPH socialNet {
    PRINT vp;
}
# To run this query:
GSQL > RUN QUERY printOneSetVertices(["person3", "person4"]) # [vertex_1_id, vertex_2_id, …]
curl -X GET 'http://localhost:9000/query/socialNet/printOneSetVertices?vp=person3&vp=person4'
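
The query string conventions for complex parameter types can be sketched in Python as follows. This is only an illustration under stated assumptions: the function to_query_string and its input conventions (a tuple for an untyped vertex, a list for a SET or BAG) are hypothetical and not part of TigerGraph. The output is the unescaped form, which per the text above is suitable for a data payload or for a curl URL used with the -g option; values are assumed to be simple enough that no percent-encoding is needed.

```python
# Hypothetical conventions for this sketch (not a TigerGraph API):
#   scalar                    -> primitive, or VERTEX<type> given by its id
#   (vertex_id, vertex_type)  -> VERTEX with type not pre-specified
#   list of scalars           -> SET/BAG of primitives or of VERTEX<type>
#   list of (id, type) tuples -> SET/BAG of VERTEX, type not pre-specified
def to_query_string(params):
    pairs = []
    for name, value in params.items():
        if isinstance(value, tuple):
            vertex_id, vertex_type = value
            pairs.append((name, vertex_id))
            pairs.append((f"{name}.type", vertex_type))
        elif isinstance(value, list):
            for index, item in enumerate(value):
                if isinstance(item, tuple):
                    # Untyped vertices are addressed like array elements.
                    vertex_id, vertex_type = item
                    pairs.append((f"{name}[{index}]", vertex_id))
                    pairs.append((f"{name}[{index}].type", vertex_type))
                else:
                    # Repeat the same parameter name once per element.
                    pairs.append((name, item))
        else:
            pairs.append((name, value))
    return "&".join(f"{key}={val}" for key, val in pairs)

print(to_query_string({"p1": [1, 5, 10]}))
print(to_query_string({"va": ("person1", "person"), "vp": "person2"}))
print(to_query_string({"va": [("person1", "person"), ("11", "post")]}))
```

Each printed line corresponds to the query string of one of the curl examples above (cases 1, 2, and 3).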

 


Payload Size Limit

This data payload option can accept a file up to 128MB by default.

To increase this limit to xxx MB, use the following command:

gadmin --set nginx.client_max_body_size xxx -f

The upper limit of this setting is 1024 MB. Raising the size limit for the data payload buffer reduces the memory available for other operations, so be cautious about increasing this limit.

For more detailed information about REST++ endpoints and requests, see the RESTPP API User Guide.

 

The following options are available when running a query:

All-Vertex Mode -av Option

Some queries run with all or almost all vertices in their SELECT statements, e.g., the PageRank algorithm. In this case, the graph processing engine can run much more efficiently in all-vertex mode. In all-vertex mode, all vertices are always selected, and the following actions become ineffective:

  • Filtering with selected vertices or vertex types. The source vertex set must be all vertices.
  • Filtering with the WHERE clause.
  • Filtering with the HAVING clause.
  • Assigning a designated vertex or a designated type of vertices, e.g., X = {vertex_type.*}

To run the query in all-vertex mode, use the -av option in shell mode or include __GQUERY__USING_ALL_ACTIVE_MODE=true in the query string of an HTTP request.

GSQL > RUN QUERY -av test()

## In a curl URL call. Note the use of both single and double underscores.
curl -X GET 'http://localhost:9000/query/graphname/queryname?__GQUERY__USING_ALL_ACTIVE_MODE=true'


Diagnose -d Option

The diagnose option can be turned on in order to produce a diagnostic monitoring log, which contains the processing time of each SELECT block. To turn on the monitoring log, use the -d option in shell mode or include __GQUERY__monitor=true in the query string of an HTTP request.

GSQL > RUN QUERY -d test()

## In a curl URL call. Note the use of both single and double underscores.
curl -X GET 'http://localhost:9000/query/graphname/queryname?__GQUERY__monitor=true'

The path of the generated log file will be shown as part of the output message. An example log is shown below:

Query Block Start (#6) start at 11:52:06.415284
Query Block Start (#6) end at 11:52:06.415745 (takes 0.000442 s)

Query test takes totally 0.001 s (restpp's pre/post process time not included)
---------------- Summary (sort by total_time desc) ----------------
Query Block Start on Line 6
----------------------------------------------------------
total iterations count : 1
avg iterations stats : 0.000442s
max iterations stats : 0.000442s
min iterations stats : 0.000442s
total activated vertex count : 2
max activated vertex count : 2
min activated vertex count : 2

GSQL Query Output Format

The standard output of GSQL queries is in industry-standard JSON format. A JSON object is an unordered set of key:value pairs, enclosed in curly braces. Among the acceptable data types for a JSON value are array and object. A JSON array is an ordered list of values, enclosed in square brackets. Since values can be objects or arrays, JSON supports hierarchical, nested structures. Strings are enclosed in double quotation marks. We also use the term field to refer to a key (or a key:value pair) of a given object.

At the top level of the JSON structure are three required fields: “error”, “message”, and “results”.  If a query is successful, the value of “error” will be “false”, the “message” value will be empty, and the “results” value will be the intended output of the query.
If an error or exception occurred during query execution, the “error” value will be “true”, the “message” value will be a string message describing the error condition, and the “results” field will be empty.

Beginning with version 2 (v2) of the output specification, an additional top-level field is required: “version”. The “version” value is an object with the following fields:

  • api: A string specifying the output API version. Values are specified as follows:
      • “v1”: Output API used in TigerGraph platform v0.8 through v1.0. If the output does not have a “version” field, the JSON format is presumed to be v1.
      • “v2”: Output API introduced in TigerGraph platform v1.1. This is the latest API. (Note: for backward compatibility, TigerGraph platforms which support the v2 output API can be configured to produce either v1 or v2 output.)
  • schema: An integer representing which version of the user’s graph schema is currently in use. When a CREATE GRAPH statement is executed, the version is initialized to 0. Each time a SCHEMA_CHANGE JOB is run, the schema value is incremented (e.g., 1, 2, etc.).

Other top-level objects, such as “code” may appear in certain circumstances. Note that the top-level objects are enclosed in curly braces, meaning that they form an unordered set. They may appear in any order.

Below is an example of the output of a successful query:


Top Level JSON of a Valid Query – Example
{
  "version": {"api": "v2", "schema": 1},
  "error": false,
  "message": "",
  "results": [
    {results_of_PRINT_statement_1},
    …,
    {results_of_PRINT_statement_N}
  ]
}

The value of the “results” key-value pair is a sequential list of the data objects specified by the PRINT statements of the query. The list order follows the order of PRINT execution. The detailed format of the PRINT statement results is described in the Chapter “Output Statements”.
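
A client might read the top-level fields along the following lines. This Python sketch is illustrative only: the function names are hypothetical, the sample document is hand-written in the v2 shape described above rather than captured from a real server, and it assumes that “error” arrives as a JSON boolean (the string "true" is also accepted defensively).

```python
import json

def api_version(response):
    # Per the specification above, output without a "version" field
    # is presumed to use the v1 API.
    return response.get("version", {}).get("api", "v1")

def extract_results(response_text):
    """Return the "results" list, or raise if the query reported an error."""
    response = json.loads(response_text)
    # Assumption: "error" is a JSON boolean; the string form is tolerated too.
    if response.get("error") in (True, "true"):
        raise RuntimeError(response.get("message", ""))
    return response.get("results", [])

# Hand-written sample, not real server output.
sample = """
{
  "version": {"api": "v2", "schema": 1},
  "error": false,
  "message": "",
  "results": [{"a": 1}, {"b": 2}]
}
"""
results = extract_results(sample)
print(api_version(json.loads(sample)), len(results))
```

Because the top-level fields form an unordered set, the sketch looks them up by key rather than relying on their position in the document.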

 

For backward compatibility, TigerGraph platforms whose principal output API is v2 can also produce output with API v1.

Changing the Default Output API

The following GSQL statement can be used to set the JSON output API configuration.

SET json_api = <version_string>

Currently, the legal values for <version_string> are “v1” and “v2”. This statement sets a persistent system parameter. Each version of the TigerGraph platform is pre-configured to the output API that was the latest at the time of release. For example, platform version 1.1 is configured so that each query will produce v2 output by default.

 

 

 

 

 

SHOW QUERY

To show the GSQL text of a query, run “SHOW QUERY query_name”. Additionally, the “ls” GSQL command lists all created queries and identifies which queries have been installed.

DROP QUERY

To drop a query, run “DROP QUERY query_name”. The query will be uninstalled (if it has been installed) and removed from the dictionary. The GSQL language will refuse to drop an installed query Q if another query R is installed which calls query Q. That is, all calling queries must be dropped before or at the same time that their called subqueries are dropped.

To drop all queries, either of the following commands can be used:

DROP QUERY ALL

DROP QUERY *


Scope of ALL





The scope of ALL depends on the user’s current scope. If the user has set a working graph, then DROP QUERY ALL removes all the queries for that graph. If a superuser has set their scope to be global, then DROP QUERY ALL removes all queries across all graph spaces.

 

 


End of CREATE / INSTALL / RUN / SHOW / DROP Query Section

Data Types

This section describes the data types that are native to and are supported by the GSQL Query Language. Most of the data objects used in queries come from one of three sources: (1) the query’s input parameters, (2) the vertices, edges, and their attributes which are encountered when traversing the graph, or (3) variables defined within the query that are used to assist in the computational work of the query.

This section covers the following subset of the EBNF language definitions:


EBNF for Data Types
lowercase := [a-z]
uppercase := [A-Z]
letter := lowercase | uppercase
digit := [0-9]
integer := ["-"]digit+
real := ["-"]("." digit+) | ["-"](digit+ "." digit*)
numeric := integer | real
stringLiteral := '"' [~["] | '\\' ('"' | '\\')]* '"'

name := (letter | "_") [letter | digit | "_"]* // Can be a single "_" or start with "_"

type := baseType | name | accumType | STRING COMPRESS

baseType := INT
| UINT
| FLOAT
| DOUBLE
| STRING
| BOOL
| VERTEX ["<" name ">"]
| EDGE
| JSONOBJECT
| JSONARRAY
| DATETIME

filePath := name | stringLiteral

typedef := TYPEDEF TUPLE "<" tupleType ">" name

tupleType := (baseType name) | (name baseType) ["," (baseType name) | (name baseType)]*

parameterType := baseType
| [ SET | BAG ] "<" baseType ">"
| FILE
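
To make the integer and real rules concrete, here is a small Python sketch that transcribes them into regular expressions. The pattern and function names are hypothetical, and the sketch checks only the literal syntax given in the EBNF; it does not reproduce any additional validation the GSQL parser may perform.

```python
import re

# integer := ["-"]digit+
INTEGER = re.compile(r"-?[0-9]+")
# real := ["-"]("." digit+) | ["-"](digit+ "." digit*)
REAL = re.compile(r"-?(?:\.[0-9]+|[0-9]+\.[0-9]*)")

def classify_numeric(text):
    """Return "integer", "real", or None for text that is not a numeric literal."""
    if INTEGER.fullmatch(text):
        return "integer"
    if REAL.fullmatch(text):
        return "real"
    return None

for literal in ["42", "-7", "3.14", ".5", "3.", "1e5"]:
    print(literal, classify_numeric(literal))
```

Note that, following the grammar as written, a trailing decimal point ("3.") is accepted as a real, while exponent notation ("1e5") is not a numeric literal.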

Identifiers

An identifier is the name for an instance of a language element. In the GSQL query language, identifiers are used to name elements such as a query, a variable, or a user-defined function. In the EBNF syntax, an identifier is referred to as a name. It can be a sequence of letters, digits, or underscores (“_”). Other punctuation characters are not supported. The initial character can only be a letter or an underscore.


name (identifier)
name := (letter | "_") [letter | digit | "_"]*
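
The name rule translates directly into a regular expression. The following Python sketch is a hypothetical checker (is_valid_name is not a GSQL function) that tests only the character rule above; it does not check for reserved words or any other restriction the GSQL compiler may apply.

```python
import re

# name := (letter | "_") [letter | digit | "_"]*
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_valid_name(text):
    """True if text satisfies the EBNF character rule for an identifier."""
    return NAME_PATTERN.fullmatch(text) is not None

for candidate in ["createQueryEx", "_", "_tmp1", "1abc", "my-query"]:
    print(candidate, is_valid_name(candidate))
```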

Overview of Types

Different types of data can be used in different contexts. The EBNF syntax defines several classes of data types. The most basic is called baseType. The other independent types are FILE and STRING COMPRESS. The remaining types are either compound data types built from the independent data types, or supersets of other types. The table below gives an overview of their definitions and their uses.

EBNF term | Description | Use Case
baseType | INT, UINT, FLOAT, DOUBLE, STRING, BOOL, DATETIME, VERTEX, EDGE, JSONOBJECT, or JSONARRAY | global variable; query return value
tupleType | sequence of baseType | user-defined tuple
accumType | family of specialized data objects which support accumulation operations |
FILE | FILE object | global sequential data object, linked to a text file
parameterType | baseType, a SET or BAG of baseType, or FILE object | query parameter
STRING COMPRESS | STRING COMPRESS | more compact storage of STRING, when there is a limited number of different values
elementType | baseType, STRING COMPRESS, or identifier | element for most types of container accumulators: SetAccum, BagAccum, GroupByAccum, key of a MapAccum element
type | baseType, STRING COMPRESS, identifier, or accumType | element of a ListAccum, value of a MapAccum element; local variable

 

Base Types

The query language supports the following base types, which can be declared and assigned anywhere within their scope. Any of these base types may be used when defining a global variable, a local variable, a query return value, a parameter, part of a tuple, or an element of a container accumulator. Accumulators are described in detail in a later section.


BNF
baseType := INT
| UINT
| FLOAT
| DOUBLE
| STRING
| BOOL
| VERTEX ["<" name ">"]
| EDGE
| JSONOBJECT
| JSONARRAY
| DATETIME

The default value of each base type is shown in the table below. The default value is the initial value of a base type variable (see Section “Variable Types” for more details), or the default return value for some functions (see Section “Operators, Functions, and Expressions” for more details).

The first seven types (INT, UINT, FLOAT, DOUBLE, BOOL, STRING, and DATETIME) are the same ones mentioned in the “Attribute Data Types” section of the GSQL Language Reference, Part 1.

Type | Default value
INT, UINT, FLOAT, DOUBLE (see note below) | 0
BOOL | false
STRING | ""
DATETIME | 1970-01-01 00:00:00
VERTEX | "Unknown"
EDGE | No edge: {}
JSONOBJECT | An empty object: {}
JSONARRAY | An empty array: []

FLOAT and DOUBLE input values must be in fixed-point d.dddd format, where d is a digit. Output values are printed in either fixed-point or exponential notation, whichever is more compact.

The GSQL Loader can read FLOAT and DOUBLE values with exponential notation (e.g., 1.25 E-7).

 

VERTEX and EDGE

VERTEX and EDGE are the two types of objects which form a graph. A query parameter or variable can be declared as either of these two types. In addition, the schema for the graph defines specific vertex and edge types (e.g., CREATE VERTEX person). The parameter or variable type can be restricted by giving the vertex/edge type in angle brackets < > after the keyword VERTEX/EDGE. A VERTEX or EDGE variable declared without a specifier is called a generic type. Below are examples of generic and typed vertex and edge variable declarations:


Examples of generic and typed VERTEX and EDGE declarations
VERTEX anyVertex;
VERTEX<person> owner;
EDGE anyEdge;
EDGE<friendship> friendEdge;

Vertex and Edge Attribute Types

The following table maps vertex or edge attribute types in the Data Definition Language (DDL) to GSQL query language types. Accumulators are introduced in Section “Accumulators”.

DDL | GSQL Query
INT | INT
UINT | UINT
FLOAT | FLOAT
DOUBLE | DOUBLE
BOOL | BOOL
STRING | STRING
STRING COMPRESS | STRING
SET<type> | SetAccum<type>
LIST<type> | ListAccum<type>
DATETIME | DATETIME

JSONOBJECT and JSONARRAY

These two base types allow users to pass a complex data object or to write output in a customized format. These types follow the industry standard definition of JSON at www.json.org. A JSONOBJECT instance’s external representation (as input and output) is a string, starting and ending with curly braces “{” and “}”, which enclose an unordered list of string:value pairs. A JSONARRAY is represented as a string, starting and ending with square brackets “[” and “]”, which enclose an ordered list of values. Since a value can be an object or an array, JSON supports hierarchical, nested data structures.

More details are introduced in the Section entitled “JSONOBJECT and JSONARRAY Functions”.


A JSONOBJECT or JSONARRAY value is immutable. No operator is allowed to modify its value.
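
To make the external representation concrete, the following sketch parses one object string and one array string with Python's standard json module. The sample data is invented for illustration, and the code only demonstrates the JSON format itself, not the GSQL JSONOBJECT and JSONARRAY functions.

```python
import json

# External (string) representations; the contents are made-up sample data.
OBJECT_TEXT = '{"name": "person1", "tags": ["a", "b"], "address": {"city": "X"}}'
ARRAY_TEXT = '[1, "two", {"three": 3}, [4, 5]]'


def parse_examples():
    """Parse the two sample strings into Python dict and list values."""
    return json.loads(OBJECT_TEXT), json.loads(ARRAY_TEXT)


def same_object(text_a: str, text_b: str) -> bool:
    """Objects are unordered, so key order does not affect equality."""
    return json.loads(text_a) == json.loads(text_b)


if __name__ == "__main__":
    obj, arr = parse_examples()
    print(obj["address"]["city"])  # nested object inside an object
    print(arr[3])                  # nested array inside an array
    print(same_object('{"a": 1, "b": 2}', '{"b": 2, "a": 1}'))
```

Reordering the keys of an object leaves it equal to the original, whereas reordering the values of an array produces a different array.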

TUPLE

A tuple is a user-defined data structure consisting of a fixed sequence of baseType variables. Tuple types can be created and named using a TYPEDEF statement. Tuples must be defined first, before any other statements in a query.


EBNF for tuples

typedef := TYPEDEF TUPLE "<" tupleType ">" name

tupleType := (baseType name) | (name baseType) ["," (baseType name) | (name baseType)]*

A tuple can also be defined in a graph schema and then can be used as a vertex or edge attribute type. A tuple type which has been defined in the graph schema does not need to be re-defined in a query.

The graph schema investmentNet contains two complex attributes:

  • user-defined tuple SECRET_INFO, which is used for the secretInfo attribute in the person vertex.
  • portfolio MAP<STRING, DOUBLE> attribute, also in the person vertex.

investmentNet schema
TYPEDEF TUPLE <age UINT (4), mothersName STRING(20) > SECRET_INFO
CREATE VERTEX person(PRIMARY_ID personId STRING, portfolio MAP<STRING, DOUBLE>, secretInfo SECRET_INFO)
CREATE VERTEX stockOrder(PRIMARY_ID orderId STRING, ticker STRING, orderSize UINT, price FLOAT)
CREATE UNDIRECTED EDGE makeOrder(FROM person, TO stockOrder, orderTime DATETIME)
CREATE GRAPH investmentNet (*)

The query below reads both the SECRET_INFO tuple and the portfolio MAP. The query does not need to redefine SECRET_INFO, because that tuple type is already defined in the schema. To read and save the map, we define a MapAccum with the same key:value type as the original portfolio map. (The “Accumulators” chapter has more information about accumulators.) In addition, the query creates a new tuple type, ORDER_RECORD.


tupleEx query
CREATE QUERY tupleEx(VERTEX<person> p) FOR GRAPH investmentNet{
#TYPEDEF TUPLE <UINT age, STRING mothersName> SECRET_INFO; # already defined in schema
TYPEDEF TUPLE <STRING ticker, FLOAT price, DATETIME orderTime> ORDER_RECORD; # new for query

SetAccum<SECRET_INFO> @@info;
ListAccum<ORDER_RECORD> @@orderRecords;
MapAccum<STRING, DOUBLE> @@portf; # corresponds to MAP<STRING, DOUBLE> attribute

INIT = {p};

# Get person p's secretInfo and portfolio
X = SELECT v FROM INIT:v
ACCUM @@portf += v.portfolio, @@info += v.secretInfo;

# Search person p’s orders to record ticker, price, and order time.
# Note that the tuple gathers info from both edges and vertices.
orders = SELECT t
FROM INIT:s -(makeOrder:e)->stockOrder:t
ACCUM @@orderRecords += ORDER_RECORD(t.ticker, t.price, e.orderTime);

PRINT @@portf, @@info;
PRINT @@orderRecords;
}


tupleEx.json
GSQL > RUN QUERY tupleEx(“person1”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“@@info”: [{
“mothersName”: “JAMES”,
“age”: 25
}],
“@@portf”: {
“AAPL”: 3142.24,
“MS”: 5000,
“G”: 6112.23
}
},
{“@@orderRecords”: [
{
“ticker”: “AAPL”,
“orderTime”: “2017-03-03 18:42:28”,
“price”: 34.42
},
{
“ticker”: “B”,
“orderTime”: “2017-03-03 18:42:30”,
“price”: 202.32001
},
{
“ticker”: “A”,
“orderTime”: “2017-03-03 18:42:29”,
“price”: 50.55
}
]}
]
}

 

STRING COMPRESS

STRING COMPRESS is an integer type encoded by the system to represent string values. STRING COMPRESS uses less memory than STRING. The STRING COMPRESS type is designed to act like STRING: data are loaded and printed just as string data, and most functions and operators which take STRING input can also take STRING COMPRESS input.

The difference is in how the data are stored internally. A STRING COMPRESS value can be obtained from a STRING_SET COMPRESS or STRING_LIST COMPRESS attribute or from converting a STRING value.

The STRING COMPRESS type is beneficial for sets of string values when the same values are used multiple times. In practice, STRING COMPRESS is most useful for container accumulators like ListAccum<STRING COMPRESS> or SetAccum<STRING COMPRESS>.

An accumulator (introduced in Section “Accumulators”) containing STRING COMPRESS stores the dictionary when it is assigned an attribute value or a value from another accumulator containing STRING COMPRESS. An accumulator containing STRING COMPRESS can store multiple dictionaries. A STRING value can be converted to a STRING COMPRESS value only if the value is in the dictionaries. If the STRING value is not in the dictionaries, the original string value is saved. A STRING COMPRESS value can be automatically converted to a STRING value.

When a STRING COMPRESS value is output (e.g., by a PRINT statement), it is shown as a STRING.


STRING COMPRESS is not a base type.
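
The dictionary behavior described above can be pictured with a small conceptual model. The class below is a hypothetical Python sketch, not the engine's actual storage format: it assumes a dictionary maps each known string to an integer code, stores the code when the string is known, and falls back to the original string otherwise.

```python
class CompressedStringList:
    """Toy model of a list container holding STRING COMPRESS values."""

    def __init__(self):
        self._code_of = {}    # string -> integer code
        self._string_of = {}  # integer code -> string
        self.items = []       # integer codes, or raw strings when unknown

    def adopt_dictionary(self, mapping):
        """Take over a dictionary, e.g. from an attribute or another container."""
        for text, code in mapping.items():
            self._code_of[text] = code
            self._string_of[code] = text

    def append(self, text):
        """Store the code if the string is in the dictionary, else the string."""
        code = self._code_of.get(text)
        self.items.append(text if code is None else code)

    def decode(self):
        """Convert every stored item back to an ordinary string."""
        return [self._string_of[i] if isinstance(i, int) else i for i in self.items]


if __name__ == "__main__":
    interests = CompressedStringList()
    interests.adopt_dictionary({"music": 0, "engineering": 1, "teaching": 2})
    for word in ["music", "teaching", "xyz"]:
        interests.append(word)
    print(interests.items)
    print(interests.decode())
```

In this model a value that is missing from the dictionary is kept as a plain string, which matches the described treatment of values that are not in the dictionaries.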

 


STRING COMPRESS example
CREATE QUERY stringCompressEx(VERTEX<person> m1) FOR GRAPH workNet {
ListAccum<STRING COMPRESS> @@strCompressList, @@strCompressList2;
SetAccum<STRING COMPRESS> @@strCompressSet, @@strCompressSet2;
ListAccum<STRING> @@strList, @@strList2;
SetAccum<STRING> @@strSet, @@strSet2;

S = {m1};

S = SELECT s
FROM S:s
ACCUM @@strSet += s.interestSet,
@@strList += s.interestList,
@@strCompressSet += s.interestSet, # use the dictionary from person.interestSet
@@strCompressList += s.interestList; # use the dictionary from person.interestList

@@strCompressList2 += @@strCompressList; # @@strCompressList2 gets the dictionary from @@strCompressList, which is from person.interestList
@@strCompressList2 += "xyz"; # "xyz" is not in the dictionary, so store the actual string value

@@strCompressSet2 += @@strCompressSet;
@@strCompressSet2 += @@strSet;

@@strList2 += @@strCompressList; # string compress integer values are decoded to strings
@@strSet2 += @@strCompressSet;

PRINT @@strSet, @@strList, @@strCompressSet, @@strCompressList;
PRINT @@strSet2, @@strList2, @@strCompressSet2, @@strCompressList2;
}


stringCompressEx.json Results
GSQL > RUN QUERY stringCompressEx(“person12”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“@@strCompressList”: [
“music”,
“engineering”,
“teaching”,
“teaching”,
“teaching”
],
“@@strSet”: [ “teaching”, “engineering”, “music” ],
“@@strCompressSet”: [ “music”, “engineering”, “teaching” ],
“@@strList”: [
“music”,
“engineering”,
“teaching”,
“teaching”,
“teaching”
]
},
{
“@@strSet2”: [ “music”, “engineering”, “teaching” ],
“@@strCompressList2”: [
“music”,
“engineering”,
“teaching”,
“teaching”,
“teaching”,
“xyz”
],
“@@strList2”: [
“music”,
“engineering”,
“teaching”,
“teaching”,
“teaching”
],
“@@strCompressSet2”: [ “teaching”, “engineering”, “music” ]
}
]
}


FILE Object

A FILE object is a sequential data storage object, associated with a text file on the local machine.

 



When referring to a FILE object, we always capitalize the word FILE, to distinguish it from ordinary files.


When a FILE object is declared, associated with a particular text file, any existing content in the text file will be erased.

During the execution of the query, content written to the FILE will be appended to the FILE.  When the query where the FILE was declared finishes running, the FILE contents are saved to the text file.


A FILE object can be passed as a parameter to another query. When a query receives a FILE object as a parameter, it can append data to that FILE, as can every other query which receives this FILE object as a parameter.
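
The lifecycle described above (erase on declaration, append during execution, save when the declaring query finishes, share by reference with sub-queries) can be sketched in Python. FileObject, append_line, save, main_query, and sub_query are hypothetical names chosen for this illustration; they are not GSQL functions, and buffering the lines in memory is an assumption of the sketch.

```python
import os
import tempfile


class FileObject:
    """Toy model of a FILE object bound to a text file."""

    def __init__(self, path):
        self.path = path
        self._lines = []
        with open(path, "w"):
            pass  # declaring the FILE erases any existing content

    def append_line(self, text):
        self._lines.append(text)  # writes are appended in order

    def save(self):
        with open(self.path, "a") as handle:
            for line in self._lines:
                handle.write(line + "\n")
        self._lines = []


def sub_query(out_file):
    # Receives a reference to the caller's FILE and appends to it.
    out_file.append_line("from sub-query")


def main_query(path):
    out_file = FileObject(path)
    out_file.append_line("from main query")
    sub_query(out_file)
    out_file.save()  # contents are saved when the declaring query ends


def demo():
    """Run main_query against a temporary file that holds stale content."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w") as handle:
            handle.write("stale content\n")
        main_query(path)
        with open(path) as handle:
            return handle.read()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The stale content disappears, and the lines written by both queries end up in the same file in the order they were appended.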


Query Parameter Types

Input parameters to a query can be any base type except EDGE, JSONARRAY, or JSONOBJECT. A parameter can also be a SET or BAG which uses a base type (except EDGE) as the element type. A FILE object can also be a parameter. Within the query, SET and BAG are converted to SetAccum and BagAccum, respectively (see Section “Accumulators” for more details).


A query parameter is immutable. It cannot be assigned a new value within the query.

The FILE object is a special case.  It is passed by reference, meaning that the receiving query gets a link to the original FILE object.  The receiving query can write to the FILE.

 


BNF
parameterType := baseType
| [ SET | BAG ] "<" baseType ">"
| FILE

Examples of collection type parameters
(SET<VERTEX<person>> p1, BAG<INT> ids, MAP<UINT, STRING> names)


Accumulators

 

Accumulators are special types of variables that accumulate information about the graph during its traversal and exploration. Because they are a unique and important feature of the GSQL query language, we devote a separate section for their introduction, but additional detail on their usage will be covered in other sections, the “SELECT Statement” section in particular. This section covers the following subset of the EBNF language definitions:


EBNF
accumDeclStmt := accumType "@"name ["=" constant][, "@"name ["=" constant]]*
| "@"name ["=" constant][, "@"name ["=" constant]]* accumType
| [STATIC] accumType "@@"name ["=" constant][, "@@"name ["=" constant]]*
| [STATIC] "@@"name ["=" constant][, "@@"name ["=" constant]]* accumType

accumType := "SumAccum" "<" ( INT | FLOAT | DOUBLE | STRING | STRING COMPRESS) ">"
| "MaxAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
| "MinAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
| "AvgAccum"
| "OrAccum"
| "AndAccum"
| "BitwiseOrAccum"
| "BitwiseAndAccum"
| "ListAccum" "<" type ">"
| "SetAccum" "<" elementType ">"
| "BagAccum" "<" elementType ">"
| "MapAccum" "<" elementType "," type ">"
| "HeapAccum" "<" name ">" "(" (integer | name) "," name [ASC | DESC] ["," name [ASC | DESC]]* ")"
| "GroupByAccum" "<" elementType name ["," elementType name]* , accumType name ["," accumType name]* ">"
| "ArrayAccum" "<" name ">"

elementType := baseType | name | STRING COMPRESS

gAccumAccumStmt := "@@"name "+=" expr

accumClause := ACCUM DMLSubStmtList

postAccumClause := POST-ACCUM DMLSubStmtList

 

There are a number of different types of accumulators, each providing specific accumulation functions. Accumulators are declared to have one of two types of association: global or vertex-attached.

More technically, accumulators are mutable mutex variables shared among all the graph computation threads exploring the graph within a given query. To improve performance, the graph processing engine employs multithreaded processing. Modification of accumulators is coordinated at run-time so the accumulation operator works correctly (i.e., mutually exclusively) across all threads. This is particularly relevant in the ACCUM clause. During traversal of the graph, the selected set of edges or vertices is partitioned among a group of threads. These threads have shared mutually exclusive access to the accumulators.
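
The idea of partitioning work among threads that share mutually exclusive access to one accumulator can be sketched with Python's threading module. This is only an analogy under stated assumptions: SharedSumAccum and count_edges are hypothetical names, and the engine's real scheduling and locking strategy is not described in this document.

```python
import threading


class SharedSumAccum:
    """Toy shared accumulator: += is guarded by a lock (mutual exclusion)."""

    def __init__(self, initial=0):
        self.value = initial
        self._lock = threading.Lock()

    def accumulate(self, delta):
        with self._lock:
            self.value += delta


def count_edges(edges, num_threads=4):
    """Partition the edge list among threads; each adds 1 per edge."""
    total = SharedSumAccum()
    chunks = [edges[i::num_threads] for i in range(num_threads)]

    def worker(chunk):
        for _ in chunk:
            total.accumulate(1)

    threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total.value


if __name__ == "__main__":
    print(count_edges(list(range(1000))))
```

Because every update happens while holding the lock, the final total equals the number of edges regardless of how the work is split.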

Declaration of Accumulators

All accumulator variables must be declared at the beginning of a query, immediately after any typedefs, and before any other type of statement. The scope of the accumulator variables is the entire query.

The name of a vertex-attached accumulator begins with a single “@”.  The name of a global accumulator begins with “@@”. Additionally, a global accumulator may be declared to be static.


EBNF for Accumulator Declaration
accumDeclStmt := accumType "@"name ["=" constant][, "@"name ["=" constant]]*
| "@"name ["=" constant][, "@"name ["=" constant]]* accumType
| [STATIC] accumType "@@"name ["=" constant][, "@@"name ["=" constant]]*
| [STATIC] "@@"name ["=" constant][, "@@"name ["=" constant]]* accumType

Vertex-attached Accumulators

Vertex-attached accumulators are mutable state variables that are attached to each vertex in the graph for the duration of the query’s lifetime. They act as run-time attributes of a vertex. They are shared, mutually exclusively, among all of the query’s processes. Vertex-attached accumulators can be set to a value with the = operator. Additionally, an accumulate operator += can be used to update the state of the accumulator; the function of += depends on the accumulator type. In the example below, there are two accumulators attached to each vertex. The initial value of an accumulator of a given type is predefined; however, it can be changed at declaration, as in the accumulator @weight below. All vertex-attached accumulator names have a single leading at-sign “@”.


Vertex-Attached Accumulators
SumAccum<int> @neighbors;
MaxAccum<float> @weight = 2.8;

If there is a graph with 10 vertices, then there is an instance of @neighbors and @weight for each vertex (hence 10 of each, and 20 total accumulator instances). These are accessed via the dot operator on a vertex variable or a vertex alias (e.g., v.@neighbors). The accumulator operator += only impacts the accumulator for the specific vertex being referenced. A statement such as v1.@neighbors += 1 will only impact v1’s @neighbors and not the @neighbors for other vertices.

Vertex-attached accumulators can only be accessed or updated (via = or +=) in an ACCUM or POST-ACCUM clause within a SELECT block. The only exception to this rule is that vertex-attached accumulators can be referenced in a PRINT statement, as the PRINT has access to all information attached to a vertex set.


Edge-attached accumulators are not supported.
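
One way to picture vertex-attached accumulators is as one independent accumulator instance per vertex. The Python sketch below is a hypothetical model (VertexAttachedSum and count_neighbors are illustrative names, not GSQL): it keeps a separate running sum keyed by vertex id, so updating one vertex never touches another.

```python
from collections import defaultdict


class VertexAttachedSum:
    """Toy model of SumAccum<INT> @neighbors: one value per vertex."""

    def __init__(self, initial=0):
        self._values = defaultdict(lambda: initial)

    def accumulate(self, vertex_id, delta):
        self._values[vertex_id] += delta  # affects this vertex only

    def get(self, vertex_id):
        return self._values[vertex_id]


def count_neighbors(edges):
    """For each (source, target) edge, add 1 to the source's accumulator."""
    neighbors = VertexAttachedSum()
    for source, _target in edges:
        neighbors.accumulate(source, 1)
    return neighbors


if __name__ == "__main__":
    result = count_neighbors([("v1", "v2"), ("v1", "v3"), ("v2", "v3")])
    for vertex in ["v1", "v2", "v3"]:
        print(vertex, result.get(vertex))
```

A vertex that never receives an update simply keeps its initial value.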

Global Accumulators

A global accumulator is a single mutable accumulator that can be accessed or updated within a query.  The names of global accumulators start with a double at-sign “@@”.


Global Accumulators
SumAccum<int> @@totalNeighbors;
MaxAccum<float> @@entropy = 1.0;

Global accumulators can only be assigned (using the = operator) outside a SELECT block (i.e., not within an ACCUM or POST-ACCUM clause). Global accumulators can be accessed or updated via the accumulate operator += anywhere within a query, including inside a SELECT block.

It is important to note that the accumulation operation for global accumulators in an ACCUM clause executes once for each process. That is, if the FROM clause uses an edge-induced selection (introduced in Section “SELECT Statement”), the ACCUM clause executes one process for each edge in the selected edge set. If the FROM clause uses a vertex-induced selection (introduced in Section “SELECT Statement”), the ACCUM clause executes one process for each vertex in the selected vertex set. Since global accumulators are shared in a mutually exclusive manner among processes, they behave very differently than a non-accumulator variable (see Section “Variable Types” for more details) in an ACCUM clause.

Take the following code example. The global accumulator @@globalRelationshipCount is accumulated for every worksFor edge traversed since it is shared among processes. Conversely, localRelationshipCount appears to have only been incremented once. This is because a non-accumulator variable is not shared among processes. Each process has its own separate unshared copy of localRelationshipCount and increments the original value by one. (E.g., each process increments localRelationshipCount from 0 to 1.) There is no accumulation and the final value is one.


Global Variable vs Global Accumulator
#Count the total number of employment relationships for all companies
CREATE QUERY countEmploymentRelationships() FOR GRAPH workNet {

INT localRelationshipCount;
SumAccum<INT> @@globalRelationshipCount;

start = {company.*};

companies = SELECT s FROM start:s -(worksFor)-> :t
ACCUM @@globalRelationshipCount += 1,
localRelationshipCount = localRelationshipCount + 1;

PRINT localRelationshipCount;
PRINT @@globalRelationshipCount;
}

 


countEmploymentRelationship.json Results
GSQL > RUN QUERY countEmploymentRelationships()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“localRelationshipCount”: 1},
{“@@globalRelationshipCount”: 17}
]
}
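
The difference between the two results can be mimicked with a deliberately simplified Python model. It assumes, as the explanation above states, that every process starts from the original value of the ordinary variable and works on its own copy, while the accumulator is shared. The function simulate_accum_clause is a hypothetical illustration and does not reflect how the engine is implemented.

```python
def simulate_accum_clause(num_edges):
    """Return (local variable, global accumulator) after the ACCUM clause."""
    local_count = 0   # plays the role of INT localRelationshipCount
    global_accum = 0  # plays the role of SumAccum<INT> @@globalRelationshipCount

    original_local = local_count
    private_copies = []
    for _ in range(num_edges):  # one "process" per selected edge
        global_accum += 1       # shared: every increment is kept
        private_copies.append(original_local + 1)  # private copy goes 0 -> 1

    if private_copies:
        # Whichever copy is written back, its value is the same.
        local_count = private_copies[-1]
    return local_count, global_accum


if __name__ == "__main__":
    print(simulate_accum_clause(17))
```

With 17 edges the model yields 1 for the ordinary variable and 17 for the accumulator, the same pair of values shown in the query output above.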

Static Global Accumulators

A static global accumulator retains its value after the execution of a query. To declare a static global accumulator, include the STATIC keyword at the beginning of the declaration statement. For example, if a static global accumulator is incremented by 1 each time a query is executed, then its value is equal to the number of times the query has been run, since the query was installed. Each static global accumulator belongs to the particular query in which it is declared; it cannot be shared among different queries. The value only persists in the context of running the same query multiple times. The value will reset to the default value when the GPE is restarted.


Static Global Accumulators example
CREATE QUERY staticAccumEx(INT x) FOR GRAPH minimalNet {
STATIC ListAccum<INT> @@testList;
@@testList += x;
PRINT @@testList;
}

 


staticAccumEx.json Result
GSQL > RUN QUERY staticAccumEx(3)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@testList”: [
3,
-5,
3
]}]
}
GSQL > RUN QUERY staticAccumEx(-5)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@testList”: [
3,
-5,
3,
-5
]}]
}

There is no command to deallocate a static global accumulator. If a static global accumulator is a collection accumulator and is no longer needed, it should be cleared to minimize memory usage.
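
The persistence rules can be sketched as state that outlives each call but is keyed by query name and wiped on restart. Everything in this Python model is hypothetical (the _STATIC_STORE dictionary, static_accum_ex, and restart_engine are illustrative names); it only restates the behavior described in this section.

```python
# Persisted static accumulators, keyed by the query that declares them.
_STATIC_STORE = {}


def static_accum_ex(x):
    """Model of staticAccumEx: append x to a list that survives between runs."""
    test_list = _STATIC_STORE.setdefault("staticAccumEx", [])
    test_list.append(x)
    return list(test_list)  # what PRINT would show for this run


def restart_engine():
    """Model of a GPE restart: all static accumulators return to defaults."""
    _STATIC_STORE.clear()


if __name__ == "__main__":
    print(static_accum_ex(3))
    print(static_accum_ex(-5))
    restart_engine()
    print(static_accum_ex(7))
```

Each run sees the values added by earlier runs of the same query, until a restart resets the list to empty.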

Accumulator Types

The following are the accumulator types we currently support. Each type of accumulator supports one or more data types.


EBNF for Accumulator Types
accumType := "SumAccum" "<" ( INT | FLOAT | DOUBLE | STRING ) ">"
| "MaxAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
| "MinAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
| "AvgAccum"
| "OrAccum"
| "AndAccum"
| "BitwiseOrAccum"
| "BitwiseAndAccum"
| "ListAccum" "<" type ">"
| "SetAccum" "<" elementType ">"
| "BagAccum" "<" elementType ">"
| "MapAccum" "<" elementType "," type ">"
| "HeapAccum" "<" name ">" "(" (integer | name) "," name [ASC | DESC] ["," name [ASC | DESC]]* ")"
| "GroupByAccum" "<" elementType name ["," elementType name]* , accumType name ["," accumType name]* ">"
| "ArrayAccum" "<" name ">"

elementType := baseType | name | STRING COMPRESS

gAccumAccumStmt := "@@"name "+=" expr

The accumulators fall into two major groups:

  • Scalar Accumulators store a single value:

    • SumAccum
    • MinAccum, MaxAccum
    • AvgAccum
    • AndAccum, OrAccum
    • BitwiseAndAccum, BitwiseOrAccum

  • Collection Accumulators store a set of values:

    • ListAccum
    • SetAccum
    • BagAccum
    • MapAccum
    • ArrayAccum
    • HeapAccum
    • GroupByAccum

The details of each accumulator type are summarized in the table below. The Accumulation Operation column explains how the accumulator accumName is updated when the statement accumName += newVal is executed. Following the table are example queries for each accumulator type.

Table Ac1: Accumulator Types and Their Accumulation Behavior

Accumulator Type (Case Sensitive) | Default Initial Value | Accumulation operation (result of accumName += newVal)
SumAccum<INT> | 0 | accumName plus newVal
SumAccum<FLOAT or DOUBLE> | 0.0 | accumName plus newVal
SumAccum<STRING or STRING COMPRESS> | empty string | String concatenation of accumName and newVal
MaxAccum<INT> | INT_MIN | The greater of newVal and accumName
MaxAccum<FLOAT or DOUBLE> | FLOAT_MIN or DOUBLE_MIN | The greater of newVal and accumName
MaxAccum<VERTEX> | the vertex with internal id 0 | The vertex with the greater internal id, either newVal or accumName
MinAccum<INT> | INT_MAX | The lesser of newVal and accumName
MinAccum<FLOAT or DOUBLE> | FLOAT_MAX or DOUBLE_MAX | The lesser of newVal and accumName
MinAccum<VERTEX> | unknown | The vertex with the lesser internal id, either newVal or accumName
AvgAccum | 0.0 (double precision) | Double precision average of newVal and all previous values accumulated to accumName
AndAccum | True | Boolean AND of newVal and accumName
OrAccum | False | Boolean OR of newVal and accumName
BitwiseAndAccum | -1 (INT) = 64-bit sequence of 1s | Bitwise AND of newVal and accumName
BitwiseOrAccum | 0 (INT) = 64-bit sequence of 0s | Bitwise OR of newVal and accumName
ListAccum<type> (ordered collection of elements) | empty list | List with newVal appended to end of accumName. newVal can be a single value or a list. If accumName is [ 2, 4, 6 ], then accumName += 4 produces accumName equal to [ 2, 4, 6, 4 ]
SetAccum<type> (unordered collection of elements, duplicate items not allowed) | empty set | Set union of newVal and accumName. newVal can be a single value or a set/bag. If accumName is ( 2, 4, 6 ), then accumName += 4 produces accumName equal to ( 2, 4, 6 )
BagAccum<type> (unordered collection of elements, duplicate items allowed) | empty bag | Bag union of newVal and accumName. newVal can be a single value or a set/bag. If accumName is ( 2, 4, 6 ), then accumName += 4 would result in accumName equal to ( 2, 4, 4, 6 )
MapAccum<type, type> (unordered collection of (key,value) pairs) | empty map | Add or update a key:value pair to the accumName map. If accumName is [ ("red",3), ("green",4), ("blue",2) ], then accumName += ("black" -> 5) produces accumName equal to [ ("red",3), ("green",4), ("blue",2), ("black",5) ]
ArrayAccum<accumType> | empty list | See the ArrayAccum section below for details.
HeapAccum<tuple>(heapSize, sortKey [, sortKey_i]*) (sorted collection of tuples) | empty heap | Insert newVal into the accumName heap, maintaining the heap in sorted order, according to the sortKey(s) and size limit declared for this HeapAccum
GroupByAccum<type [, type]* , accumType [, accumType]* > | empty group by map | Add or update a key:value pair in accumName. See Section “GroupByAccum” for more details.
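
The list, set, and bag rows of Table Ac1 can be reproduced with ordinary Python containers. The functions below are hypothetical stand-ins written for this illustration (they are not GSQL and ignore typing rules); each returns the new accumulator value after accumName += newVal, where newVal is either a single value or a collection.

```python
from collections import Counter


def list_accum(acc, new_val):
    """ListAccum: append a single value, or all items of a list, to the end."""
    extra = list(new_val) if isinstance(new_val, list) else [new_val]
    return acc + extra


def set_accum(acc, new_val):
    """SetAccum: set union; duplicates are not kept."""
    extra = new_val if isinstance(new_val, (set, frozenset, list)) else [new_val]
    return acc | set(extra)


def bag_accum(acc, new_val):
    """BagAccum: bag union; duplicates are kept (modeled with a Counter)."""
    extra = new_val if isinstance(new_val, (list, set, Counter)) else [new_val]
    result = Counter(acc)
    result.update(extra)
    return result


if __name__ == "__main__":
    print(list_accum([2, 4, 6], 4))
    print(sorted(set_accum({2, 4, 6}, 4)))
    print(sorted(bag_accum(Counter([2, 4, 6]), 4).elements()))
```

Adding 4 to a collection that already contains 4 shows the difference between the three: the list gains a trailing 4, the set is unchanged, and the bag now holds two 4s.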

SumAccum

The SumAccum type computes and stores the cumulative sum of numeric values or the cumulative concatenation of text values. The output of a SumAccum is a single numeric or string value. SumAccum variables operate on values of type INT, UINT, FLOAT, DOUBLE, or STRING only.

The += operator updates the accumulator’s state. For INT, FLOAT, and DOUBLE types, += arg performs a numeric addition, while for the STRING value type += arg concatenates arg to the current value of the SumAccum.


SumAccum Example
# SumAccum Example
CREATE QUERY sumAccumEx() FOR GRAPH minimalNet {

SumAccum<INT> @@intAccum;
SumAccum<FLOAT> @@floatAccum;
SumAccum<DOUBLE> @@doubleAccum;
SumAccum<STRING> @@stringAccum;

@@intAccum = 1;
@@intAccum += 1;

@@floatAccum = @@intAccum;
@@floatAccum = @@floatAccum / 3;

@@doubleAccum = @@floatAccum * 8;
@@doubleAccum += -1;

@@stringAccum = "Hello ";
@@stringAccum += "World";

PRINT @@intAccum;
PRINT @@floatAccum;
PRINT @@doubleAccum;
PRINT @@stringAccum;
}

 


sumAccumEx.json Result
GSQL > RUN QUERY sumAccumEx()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“@@intAccum”: 2},
{“@@floatAccum”: 0.66667},
{“@@doubleAccum”: 4.33333},
{“@@stringAccum”: “Hello World”}
]
}

MinAccum / MaxAccum

The MinAccum and MaxAccum types calculate and store the cumulative minimum or the cumulative maximum of a series of values. The output of a MinAccum or a MaxAccum is a single value. MinAccum and MaxAccum variables operate on values of type INT, UINT, FLOAT, DOUBLE, or VERTEX (with an optional specific vertex type) only.

For MinAccum, += arg checks if the current value held is less than arg and stores the smaller of the two. MaxAccum behaves the same, with the exception that it checks for and stores the greater instead of the lesser of the two.


MinAccum and MaxAccum Example
# MinAccum and MaxAccum Example
CREATE QUERY minMaxAccumEx() FOR GRAPH minimalNet {

MinAccum<INT> @@minAccum;
MaxAccum<FLOAT> @@maxAccum;

@@minAccum += 40;
@@minAccum += 20;
@@minAccum += -10;

@@maxAccum += -1.1;
@@maxAccum += 2.5;
@@maxAccum += 2.8;

PRINT @@minAccum;
PRINT @@maxAccum;
}

 


minMaxAccumEx.json Result
GSQL > RUN QUERY minMaxAccumEx()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“@@minAccum”: -10},
{“@@maxAccum”: 2.8}
]
}

MinAccum and MaxAccum operating on VERTEX type have a special comparison. They do not compare vertex ids, but TigerGraph internal ids, which might not be in the same order as the external ids. Comparing internal ids is much faster, so MinAccum/MaxAccum<VERTEX> provide an efficient way to compare and select vertices. This is helpful for some graph algorithms that require the vertices to be numbered and sortable. For example, the following query returns one post from each person. The returned vertex is not necessarily the vertex with alphabetically largest id.


MaxAccum<VERTEX> example
# Output one random post vertex from each person
CREATE QUERY minMaxAccumVertex() FOR GRAPH socialNet api("v2") {

MaxAccum<VERTEX> @maxVertex;
allUser = {person.*};
allUser = SELECT src
FROM allUser:src -(posted)-> post:tgt
ACCUM src.@maxVertex += tgt
ORDER BY src.id;
PRINT allUser[allUser.@maxVertex]; // api v2
}

minMaxAccumVertex.json Result
GSQL > RUN QUERY minMaxAccumVertex()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“allUser”: [
{
“v_id”: “person1”,
“attributes”: {“allUser.@maxVertex”: “0”},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {“allUser.@maxVertex”: “1”},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {“allUser.@maxVertex”: “2”},
“v_type”: “person”
},
{
“v_id”: “person4”,
“attributes”: {“allUser.@maxVertex”: “3”},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {“allUser.@maxVertex”: “11”},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {“allUser.@maxVertex”: “10”},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {“allUser.@maxVertex”: “9”},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {“allUser.@maxVertex”: “7”},
“v_type”: “person”
}
]}]
}
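
The effect of comparing internal ids rather than external ids can be shown with a small Python model. The Vertex record, its internal_id values, and MaxVertexAccum are all hypothetical: real internal ids are assigned by the engine, and this sketch starts from an empty state instead of the documented default vertex.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Vertex:
    external_id: str
    internal_id: int  # invented number standing in for the engine's id


class MaxVertexAccum:
    """Toy model of MaxAccum<VERTEX>: keeps the greatest internal id."""

    def __init__(self):
        self.value: Optional[Vertex] = None

    def accumulate(self, vertex: Vertex) -> None:
        if self.value is None or vertex.internal_id > self.value.internal_id:
            self.value = vertex


def pick_max(vertices):
    """Accumulate every vertex and return the one that is kept."""
    accum = MaxVertexAccum()
    for vertex in vertices:
        accum.accumulate(vertex)
    return accum.value


if __name__ == "__main__":
    posts = [Vertex("post_z", 3), Vertex("post_a", 7), Vertex("post_m", 5)]
    print(pick_max(posts).external_id)
    print(max(posts, key=lambda v: v.external_id).external_id)
```

With these invented ids, the accumulator keeps post_a (largest internal id) even though post_z is the alphabetically largest external id.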

AvgAccum

The AvgAccum type calculates and stores the cumulative mean of a series of numeric values. Internally, its state information includes the sum value of all inputs and a count of how many input values it has accumulated. The output is the mean value; the sum and the count values are not accessible to the user. The data type of an AvgAccum variable is not declared; all AvgAccum accumulators accept inputs of type INT, UINT, FLOAT, and DOUBLE. The output is always DOUBLE type.

The += arg operation updates the AvgAccum variable’s state to be the mean of all the previous arguments along with the current argument. The = arg operation clears all the previously accumulated state and sets the new state to be arg with a count of one.


AvgAccum Example
# AvgAccum Example
CREATE QUERY avgAccumEx() FOR GRAPH minimalNet {

AvgAccum @@averageAccum;

@@averageAccum += 10;
@@averageAccum += 5.5; # avg = (10+5.5) / 2.0
@@averageAccum += -1; # avg = (10+5.5-1) / 3.0

PRINT @@averageAccum; # 4.8333…

@@averageAccum = 99; # reset
@@averageAccum += 101; # avg = (99 + 101) / 2

PRINT @@averageAccum; # 100
}

 


avgAccumEx.json Result
GSQL > RUN QUERY avgAccumEx()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“@@averageAccum”: 4.83333},
{“@@averageAccum”: 100}
]
}
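
The sum-and-count state described for AvgAccum can be sketched as a small Python class. The class and method names are hypothetical and chosen for this illustration; the model only follows the description above, where += adds to the sum and count and = resets the state to a single value.

```python
class AvgAccumModel:
    """Toy model of AvgAccum: keeps a running sum and a count."""

    def __init__(self):
        self._sum = 0.0
        self._count = 0

    def accumulate(self, value):
        """Models += arg."""
        self._sum += value
        self._count += 1

    def assign(self, value):
        """Models = arg: discard history, keep one value with a count of one."""
        self._sum = float(value)
        self._count = 1

    @property
    def value(self):
        """The mean of the accumulated values; 0.0 before any input."""
        return self._sum / self._count if self._count else 0.0


if __name__ == "__main__":
    avg = AvgAccumModel()
    for number in (10, 5.5, -1):
        avg.accumulate(number)
    print(avg.value)   # (10 + 5.5 - 1) / 3
    avg.assign(99)
    avg.accumulate(101)
    print(avg.value)   # (99 + 101) / 2
```

Only the mean is exposed through value, mirroring the statement that the sum and count are not accessible to the user.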

AndAccum / OrAccum

The AndAccum and OrAccum types calculate and store the cumulative result of a series of boolean operations. The output of an AndAccum or an OrAccum is a single boolean value (True or False). AndAccum and OrAccum variables operate on boolean values only.  The data type does not need to be declared.

For AndAccum, += arg updates the state to be the logical AND between the current boolean state and arg. OrAccum behaves the same, with the exception that it stores the result of a logical OR operation.


AndAccum and OrAccum Example
# AndAccum and OrAccum Example
CREATE QUERY andOrAccumEx() FOR GRAPH minimalNet {
# T = True
# F = False

AndAccum @@andAccumVar; # (default value = T)
OrAccum @@orAccumVar; # (default value = F)

@@andAccumVar += True; # T and T = T
@@andAccumVar += False; # T and F = F
@@andAccumVar += True; # F and T = F

PRINT @@andAccumVar;

@@orAccumVar += False; # F or F == F
@@orAccumVar += True; # F or T == T
@@orAccumVar += False; # T or F == T

PRINT @@orAccumVar;
}

 


andOrAccumEx.json Result
GSQL > RUN QUERY andOrAccumEx()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“@@andAccumVar”: false},
{“@@orAccumVar”: true}
]
}
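
An AndAccum or OrAccum is effectively a fold over a series of boolean arguments, starting from the default value. The following Python sketch illustrates that idea only; the function names and_accum and or_accum are hypothetical and are not GSQL functions.

```python
from functools import reduce


def and_accum(args, initial=True):
    # AndAccum starts at True; each "+= arg" ANDs arg into the state.
    return reduce(lambda state, arg: state and arg, args, initial)


def or_accum(args, initial=False):
    # OrAccum starts at False; each "+= arg" ORs arg into the state.
    return reduce(lambda state, arg: state or arg, args, initial)


and_result = and_accum([True, False, True])   # one False makes the result False
or_result = or_accum([False, True, False])    # one True makes the result True
```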

BitwiseAndAccum / BitwiseOrAccum

The BitwiseAndAccum and BitwiseOrAccum types calculate and store the cumulative result of a series of bitwise boolean operations as a bit sequence.  BitwiseAndAccum and BitwiseOrAccum operate on INT only. The data type does not need to be declared.

Fundamental for understanding and using bitwise operations is the knowledge that integers are stored in base-2 representation as a 64-bit sequence of 1s and 0s. "Bitwise" means that each bit is treated as a separate boolean value, with 1 representing true and 0 representing false. Hence, an integer is equivalent to a sequence of boolean values. Computing the Bitwise AND of two numbers A and B means to compute the bit sequence C where the j-th bit of C, denoted C_j, is equal to (A_j AND B_j).


For BitwiseAndAccum, += arg updates the accumulator's state to be the Bitwise AND of the current state and arg.

BitwiseOrAccum behaves the same, with the exception that it computes a Bitwise OR.

Bitwise Operations and Negative Integers


Most computer systems represent negative integers using "2's complement" format, where the uppermost bit has special significance. An operation which changes the uppermost bit crosses the boundary between positive and negative numbers.
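
The effect of the uppermost bit can be seen by emulating 64-bit integers in Python, whose own integers are unbounded. This is an illustrative sketch under that assumption; MASK64, to_signed64, and bits64 are hypothetical helper names and are not part of GSQL.

```python
MASK64 = (1 << 64) - 1  # 64 bits, all set to 1


def to_signed64(bits):
    """Interpret a 64-bit pattern as a 2's complement signed integer."""
    bits &= MASK64
    return bits - (1 << 64) if bits >> 63 else bits


def bits64(value):
    """Return the 64-bit pattern that represents a signed integer."""
    return value & MASK64


all_ones = to_signed64(MASK64)        # 64 bits of 1 read back as -1

and_state = bits64(-1)                # BitwiseAndAccum default: all 1s
for arg in (170, 85):                 # 10101010, then 01010101
    and_state &= bits64(arg)

or_state = 0                          # BitwiseOrAccum default: all 0s
or_state |= 1                         # still a positive number
crossed = to_signed64(or_state | (1 << 63))  # setting the top bit flips the sign
```

The last line shows the boundary crossing: OR-ing in only the uppermost bit turns the positive value 1 into a large negative number.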

 


BitwiseAndAccum and BitwiseOrAccum Example
# BitwiseAndAccum and BitwiseOrAccum Example
CREATE QUERY bitwiseAccumEx() FOR GRAPH minimalNet {
BitwiseAndAccum @@bwAndAccumVar; # default value = 64-bits of 1 = -1 (INT)
BitwiseOrAccum @@bwOrAccumVar; # default value = 64-bits of 0 = 0 (INT)

# 11110000 = 240
# 00001111 = 15
# 10101010 = 170
# 01010101 = 85

# BitwiseAndAccum
@@bwAndAccumVar += 170; # 11111111 & 10101010 -> 10101010
@@bwAndAccumVar += 85; # 10101010 & 01010101 -> 00000000
PRINT @@bwAndAccumVar; # 0

@@bwAndAccumVar = 15; # reset to 00001111
@@bwAndAccumVar += 85; # 00001111 & 01010101 -> 00000101
PRINT @@bwAndAccumVar; # 5

# BitwiseOrAccum
@@bwOrAccumVar += 170; # 00000000 | 10101010 -> 10101010
@@bwOrAccumVar += 85; # 10101010 | 01010101 -> 11111111 = 255
PRINT @@bwOrAccumVar; # 255

@@bwOrAccumVar = 15; # reset to 00001111
@@bwOrAccumVar += 85; # 00001111 | 01010101 -> 01011111 = 95
PRINT @@bwOrAccumVar; # 95
}

 


bitwiseAccumEx.json Result
GSQL > RUN QUERY bitwiseAccumEx()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [
{"@@bwAndAccumVar": 0},
{"@@bwAndAccumVar": 5},
{"@@bwOrAccumVar": 255},
{"@@bwOrAccumVar": 95}
]
}

ListAccum

The ListAccum type maintains a sequential collection of elements. The output of a ListAccum is a list of values in the order the elements were added. The element type can be any base type, tuple, or STRING COMPRESS. Additionally, a ListAccum can contain a nested collection of type ListAccum. Nesting of ListAccums is limited to a depth of three.

The += arg operation appends arg to the end of the list. In this case, arg may be either a single element or another ListAccum.

ListAccum supports two additional operations:


  • @list1 + @list2 creates a new ListAccum, which contains the elements of @list1 followed by the elements of @list2. The two ListAccums must have identical data types.

    Change in "+" definition: The pre-v2.0 definition of the ListAccum "+" operator (@list + arg: Add arg to each member of @list) is no longer supported.

  • @list1 * @list2 (STRING data only) generates a new list of strings consisting of all combinations of an element of the first list followed by an element of the second list.
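
The two ListAccum operators can be modeled with ordinary Python lists. This is a sketch of the described semantics only: list_concat and list_multiply are hypothetical names, and the element order produced by list_multiply is an assumption that is not guaranteed to match the order GSQL emits.

```python
def list_concat(list1, list2):
    # Models @list1 + @list2: elements of list1 followed by those of list2.
    return list1 + list2


def list_multiply(list1, list2):
    # Models @list1 * @list2 (strings only): each element of list1
    # concatenated with each element of list2.
    return [a + b for a in list1 for b in list2]


concat = list_concat(["Hello", "World"], ["a", "b"])
product = list_multiply(["a", "b"], ["c", "d"])
```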

ListAccum also supports the following class functions.


Functions which modify the ListAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.

 

function (T is the element type) | return type | Accessor / Mutator | description
size() | INT | Accessor | Returns the number of elements in the list.
contains(T val) | BOOL | Accessor | Returns true/false if the list does/doesn't contain val.
get(INT idx) | T | Accessor | Returns the value at position idx in the list. The index begins at 0. If the index is out of bounds (including any negative value), the default value of the element type is returned.
clear() | VOID | Mutator | Clears the list so it becomes empty with size 0.
update(INT index, T value) | VOID | Mutator | Assigns value to the list element at position index.

 


ListAccum Example
# ListAccum Example
CREATE QUERY listAccumEx() FOR GRAPH minimalNet {
ListAccum<INT> @@intListAccum;
ListAccum<STRING> @@stringListAccum;
ListAccum<STRING> @@stringMultiplyListAccum;
ListAccum<STRING> @@stringAdditionAccum;
ListAccum<STRING> @@letterListAccum;
ListAccum<ListAccum<STRING>> @@nestedListAccum;

@@intListAccum = [1,3,5];
@@intListAccum += [7,9];
@@intListAccum += 11;
@@intListAccum += 13;
@@intListAccum += 15;

PRINT @@intListAccum;
PRINT @@intListAccum.get(0), @@intListAccum.get(1);
PRINT @@intListAccum.get(8); # Out of bound: default value of int: 0

#Other built-in functions
PRINT @@intListAccum.size();
PRINT @@intListAccum.contains(2);
PRINT @@intListAccum.contains(3);

@@stringListAccum += "Hello";
@@stringListAccum += "World";

PRINT @@stringListAccum; // ["Hello","World"]

@@letterListAccum += "a";
@@letterListAccum += "b";

# ListA + ListB produces a new list equivalent to ListB appended to ListA.
# Ex: [a,b,c] + [d,e,f] => [a,b,c,d,e,f]
@@stringAdditionAccum = @@stringListAccum + @@letterListAccum;

PRINT @@stringAdditionAccum;

#Multiplication produces a list of all list-to-list element combinations (STRING TYPE ONLY)
# Ex: [a,b] * [c,d] = [ac, ad, bc, bd]
@@stringMultiplyListAccum = @@stringListAccum * @@letterListAccum;

PRINT @@stringMultiplyListAccum;

#Two dimensional list (3 dimensions is possible as well)
@@nestedListAccum += [["foo", "bar"], ["Big", "Bang", "Theory"], ["String", "Theory"]];

PRINT @@nestedListAccum;
PRINT @@nestedListAccum.get(0);
PRINT @@nestedListAccum.get(0).get(1);
}

 


listAccumEx.json Result
GSQL > RUN QUERY listAccumEx()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [
{"@@intListAccum": [ 1, 3, 5, 7, 9, 11, 13, 15 ]},
{
"@@intListAccum.get(0)": 1,
"@@intListAccum.get(1)": 3
},
{"@@intListAccum.get(8)": 0},
{"@@intListAccum.size()": 8},
{"@@intListAccum.contains(2)": false},
{"@@intListAccum.contains(3)": true},
{"@@stringListAccum": [ "Hello", "World" ]},
{"@@stringAdditionAccum": [ "Hello", "World", "a", "b" ]},
{"@@stringMultiplyListAccum": [ "Helloa", "Worlda", "Hellob", "Worldb" ]},
{"@@nestedListAccum": [
[ "foo", "bar" ],
[ "Big", "Bang", "Theory" ],
[ "String", "Theory" ]
]},
{"@@nestedListAccum.get(0)": [ "foo", "bar" ]},
{"@@nestedListAccum.get(0).get(1)": "bar"}
]
}

Example for update function on a global ListAccum

CREATE QUERY listAccumUpdateEx() FOR GRAPH workNet {

# Global ListAccum
ListAccum<INT> @@intListAccum;
ListAccum<STRING> @@stringListAccum;
ListAccum<BOOL> @@passFail;

@@intListAccum += [0,2,4,6,8];
@@stringListAccum += ["apple","banana","carrot","daikon"];

# Global update at Query-Body Level
@@passFail += @@intListAccum.update(1,-99);
@@passFail += @@intListAccum.update(@@intListAccum.size()-1,40); // last element
@@passFail += @@stringListAccum.update(0,"zero"); // first element
@@passFail += @@stringListAccum.update(4,"four"); // FAIL: out-of-range

PRINT @@intListAccum, @@stringListAccum, @@passFail;
}


Results in listAccumUpdateEx.json
GSQL > RUN QUERY listAccumUpdateEx()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [{
"@@passFail": [ true, true, true, false ],
"@@intListAccum": [ 0, -99, 4, 6, 40 ],
"@@stringListAccum": [ "zero", "banana", "carrot", "daikon" ]
}]
}
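
The bounds handling of get and update shown above can be sketched in Python. This is a model under stated assumptions rather than the actual implementation: ListAccumModel is a hypothetical class, it handles flat lists only, and update returns a success flag because the example above accumulates its result into a ListAccum<BOOL>.

```python
class ListAccumModel:
    """Toy model of a flat ListAccum with bounds-checked get and update."""

    def __init__(self, default):
        self.items = []
        self.default = default  # default value of the element type

    def accumulate(self, arg):
        # Models "+= arg": append one element, or every element of a list.
        if isinstance(arg, list):
            self.items.extend(arg)
        else:
            self.items.append(arg)

    def get(self, idx):
        # Out-of-bounds (including negative) indexes yield the default value.
        if 0 <= idx < len(self.items):
            return self.items[idx]
        return self.default

    def update(self, index, value):
        # Assumption: reports whether the index was in range.
        if 0 <= index < len(self.items):
            self.items[index] = value
            return True
        return False


ints = ListAccumModel(default=0)
ints.accumulate([0, 2, 4, 6, 8])
strings = ListAccumModel(default="")
strings.accumulate(["apple", "banana", "carrot", "daikon"])

pass_fail = [
    ints.update(1, -99),
    ints.update(len(ints.items) - 1, 40),  # last element
    strings.update(0, "zero"),             # first element
    strings.update(4, "four"),             # out of range
]
```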

Example for update function on a vertex-attached ListAccum

CREATE QUERY listAccumUpdateEx2(SET<VERTEX<person>> seed) FOR GRAPH workNet api("v2") {

# Each person has a LIST<INT> of skills and a LIST<STRING COMPRESS> of interests.
# This function copies their lists into ListAccums, and then updates the last
# int with -99 and updates the last string with "fizz".
ListAccum<INT> @intList;
ListAccum<STRING COMPRESS> @stringList;
ListAccum<STRING> @@intFails, @@strFails;

S0 (person) = seed;
S1 = SELECT s
FROM S0:s
ACCUM
s.@intList = s.skillList,
s.@stringList = s.interestList
POST-ACCUM
INT len = s.@intList.size(),
IF NOT s.@intList.update(len-1,-99) THEN
@@intFails += s.id END,
INT len2 = s.@stringList.size(),
IF NOT s.@stringList.update(len2-1,"fizz") THEN
@@strFails += s.id END
;
PRINT S1[S1.skillList, S1.interestList, S1.@intList, S1.@stringList]; // api v2
PRINT @@intFails, @@strFails;
}


Results for listAccumUpdateEx2
GSQL > RUN QUERY listAccumUpdateEx2(["person1","person5"])
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [
{"S1": [
{
"v_id": "person1",
"attributes": {
"S1.@stringList": [ "management", "fizz" ],
"S1.interestList": [ "management", "financial" ],
"S1.skillList": [ 1, 2, 3 ],
"S1.@intList": [ 1, 2, -99 ]
},
"v_type": "person"
},
{
"v_id": "person5",
"attributes": {
"S1.@stringList": [ "sport", "financial", "fizz" ],
"S1.interestList": [ "sport", "financial", "engineering" ],
"S1.skillList": [ 8, 2, 5 ],
"S1.@intList": [ 8, 2, -99 ]
},
"v_type": "person"
}
]},
{
"@@strFails": [],
"@@intFails": []
}
]
}

SetAccum

The SetAccum type maintains a collection of unique elements. The output of a SetAccum is a list of elements in arbitrary order. A SetAccum instance can contain values of one type. The element type can be any base type, tuple, or STRING COMPRESS.

For SetAccum, the += arg operation adds a non-duplicate element or set of elements to the set. If an element is already represented in the set, then the SetAccum state does not change.

SetAccum also can be used with the three canonical set operators: UNION, INTERSECT, and MINUS (see Section “Set/Bag Expression and Operators” for more details).
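
The duplicate-free accumulation and the three set operators correspond closely to Python's built-in set type. The sketch below is an analogy for illustration, not GSQL syntax; the Python operators |, &, and - stand in for UNION, INTERSECT, and MINUS.

```python
int_set = set()
for arg in (4, 11, 1, 11):      # the second 11 leaves the set unchanged
    int_set.add(arg)
int_set |= {1, 2, 3, 4}         # "+=" with a set of elements
after_add = set(int_set)

int_set.discard(2)              # models remove(2)

a, b = {1, 2, 3}, {3, 4}
union = a | b                   # UNION
intersect = a & b               # INTERSECT
minus = a - b                   # MINUS
```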

SetAccum also supports the following class functions.


Functions which modify the SetAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.

 

function (T is the element type) | return type | Accessor / Mutator | description
size() | INT | Accessor | Returns the number of elements in the set.
contains(T value) | BOOL | Accessor | Returns true/false if the set does/doesn't contain value.
remove(T value) | VOID | Mutator | Removes value from the set.
clear() | VOID | Mutator | Clears the set so it becomes empty with size 0.

 


SetAccum Example
# SetAccum Example
CREATE QUERY setAccumEx() FOR GRAPH minimalNet {
SetAccum<INT> @@intSetAccum;
SetAccum<STRING> @@stringSetAccum;

@@intSetAccum += 5;
@@intSetAccum.clear();

@@intSetAccum += 4;
@@intSetAccum += 11;
@@intSetAccum += 1;
@@intSetAccum += 11; # Sets do not store duplicates

@@intSetAccum += (1,2,3,4); # Can create simple sets this way
PRINT @@intSetAccum;
@@intSetAccum.remove(2);
PRINT @@intSetAccum AS RemovedVal2; # Demonstrate remove.

PRINT @@intSetAccum.contains(3);

@@stringSetAccum += "Hello";
@@stringSetAccum += "Hello";
@@stringSetAccum += "There";
@@stringSetAccum += "World";
PRINT @@stringSetAccum;

PRINT @@stringSetAccum.contains("Hello");
PRINT @@stringSetAccum.size();
}

 


setAccumEx.json Result
GSQL > RUN QUERY setAccumEx()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [
{"@@intSetAccum": [ 3, 2, 1, 11, 4 ]},
{"@@intSetAccum.contains(3)": true},
{"@@stringSetAccum": [ "World", "There", "Hello" ]},
{"@@stringSetAccum.contains(Hello)": true},
{"@@stringSetAccum.size()": 3}
]
}

BagAccum

The BagAccum type maintains a collection of elements with duplicated elements allowed. The output of a BagAccum is a list of elements in arbitrary order. A BagAccum instance can contain values of one type. The element type can be any base type, tuple, or STRING COMPRESS.

For BagAccum, the += arg operation adds an element or bag of elements to the bag.

BagAccum also supports the + operator:

  • @bag1 + @bag2 creates a new BagAccum, which contains the elements of @bag1 and the elements of @bag2. The two BagAccums must have identical data types.

BagAccum also supports the following class functions.


Functions which modify the BagAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.

function (T is the element type) | return type | Accessor / Mutator | description
size() | INT | Accessor | Returns the number of elements in the bag.
contains(T value) | BOOL | Accessor | Returns true/false if the bag does/doesn't contain value.
clear() | VOID | Mutator | Clears the bag so it becomes empty with size 0.
remove(T value) | VOID | Mutator | Removes one instance of value from the bag.
removeAll(T value) | VOID | Mutator | Removes all instances of the given value from the bag.

 


BagAccum Example
# BagAccum Example
CREATE QUERY bagAccumEx() FOR GRAPH minimalNet {
#Unordered collection
BagAccum<INT> @@intBagAccum;
BagAccum<STRING> @@stringBagAccum;

@@intBagAccum += 5;
@@intBagAccum.clear();

@@intBagAccum += 4;
@@intBagAccum += 11;
@@intBagAccum += 1;
@@intBagAccum += 11; #Bag accums can store duplicates
@@intBagAccum += (1,2,3,4);
PRINT @@intBagAccum;

PRINT @@intBagAccum.size();
PRINT @@intBagAccum.contains(4);

@@stringBagAccum += "Hello";
@@stringBagAccum += "Hello";
@@stringBagAccum += "There";
@@stringBagAccum += "World";
PRINT @@stringBagAccum.contains("Hello");
@@stringBagAccum.remove("Hello"); #Remove one matching element
@@stringBagAccum.removeAll("There"); #Remove all matching elements
PRINT @@stringBagAccum;
}

 


bagAccumEx.json Result
GSQL > RUN QUERY bagAccumEx()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [
{"@@intBagAccum": [ 2, 3, 1, 1, 11, 11, 4, 4 ]},
{"@@intBagAccum.size()": 8},
{"@@intBagAccum.contains(4)": true},
{"@@stringBagAccum.contains(Hello)": true},
{"@@stringBagAccum": [ "World", "Hello" ]}
]
}
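
The difference between remove and removeAll is easiest to see when the bag is viewed as a table of element counts. The following Python sketch uses collections.Counter for that purpose; remove_one and remove_all are hypothetical helper names that model the two BagAccum functions and are not GSQL calls.

```python
from collections import Counter

bag = Counter()
for word in ("Hello", "Hello", "There", "World"):
    bag[word] += 1              # models "+= word"; duplicates are counted


def remove_one(bag, value):
    # Models remove(value): drop a single instance, if any exists.
    if bag[value] > 0:
        bag[value] -= 1
        if bag[value] == 0:
            del bag[value]


def remove_all(bag, value):
    # Models removeAll(value): drop every instance.
    bag.pop(value, None)


remove_one(bag, "Hello")
remove_all(bag, "There")
remaining = sorted(bag.elements())
```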

MapAccum

The MapAccum type maintains a collection of (key -> value) pairs. The output of a MapAccum is a set of key and value pairs in which the keys are unique.

The key type of a MapAccum can be any base type, tuple, or STRING COMPRESS.  If the key type is VERTEX, then only the vertex's id is stored and displayed.

The value type of a MapAccum can be any base type, tuple, STRING COMPRESS, or any type of accumulator except HeapAccum.

For MapAccum, the += (key->val) operation adds a key-value element to the collection if key is not yet used in the MapAccum. If the MapAccum already contains key, then val is accumulated to the current value, where the accumulation operation depends on the data type of val. (Strings would get concatenated, lists would be appended, numerical values would be added, etc.)
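
The insert-or-accumulate rule can be modeled with a Python dictionary. This is a simplified sketch, assuming the values behave like sums (numbers add, strings concatenate) or like nested maps; map_accumulate is a hypothetical helper, not a GSQL function.

```python
import copy


def map_accumulate(target, key, val):
    """Model of "+= (key -> val)" on a dict standing in for a MapAccum."""
    if key not in target:
        # Unused key: store the new pair.
        target[key] = copy.deepcopy(val)
    elif isinstance(target[key], dict):
        # Nested map: accumulate each inner pair recursively.
        for inner_key, inner_val in val.items():
            map_accumulate(target[key], inner_key, inner_val)
    else:
        # Existing key: numbers add, strings concatenate.
        target[key] = target[key] + val
    return target


int_map = {}
for key, val in [("foo", 3), ("bar", 2), ("baz", 2), ("baz", 1)]:
    map_accumulate(int_map, key, val)

nested = {}
for key, val in [(1, {"foo": "bar"}), (1, {"flip": "top"}),
                 (2, {"fizz": "pop"}), (1, {"foo": "s"})]:
    map_accumulate(nested, key, val)
```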

MapAccum also supports the + operator:

  • @map1 + @map2 creates a new MapAccum, which contains the (key,value) pairs of @map2 added to the (key,value) pairs of @map1. The two MapAccums must have identical data types.

MapAccum also supports the following class functions.


Functions which modify the MapAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.

 

function (KEY is the key type) | return type | Accessor / Mutator | description
size() | INT | Accessor | Returns the number of elements in the map.
containsKey(KEY key) | BOOL | Accessor | Returns true/false if the map does/doesn't contain key.
get(KEY key) | value type | Accessor | Returns the value which the map associates with key. If the map doesn't contain key, then the return value is undefined.
clear() | VOID | Mutator | Clears the map so it becomes empty with size 0.

 


MapAccum Example
#MapAccum Example
CREATE QUERY mapAccumEx() FOR GRAPH minimalNet {
#Map(Key, Value)
# Keys can be INT or STRING only
MapAccum<STRING, INT> @@intMapAccum;
MapAccum<INT, STRING> @@stringMapAccum;
MapAccum<INT, MapAccum<STRING, STRING>> @@nestedMapAccum;

@@intMapAccum += ("foo" -> 1);
@@intMapAccum.clear();

@@intMapAccum += ("foo" -> 3);
@@intMapAccum += ("bar" -> 2);
@@intMapAccum += ("baz" -> 2);
@@intMapAccum += ("baz" -> 1); #add 1 to existing value

PRINT @@intMapAccum.containsKey("baz");
PRINT @@intMapAccum.get("bar");
PRINT @@intMapAccum.get("root");

@@stringMapAccum += (1 -> "apple");
@@stringMapAccum += (2 -> "pear");
@@stringMapAccum += (3 -> "banana");
@@stringMapAccum += (4 -> "a");
@@stringMapAccum += (4 -> "b"); #append "b" to existing value
@@stringMapAccum += (4 -> "c"); #append "c" to existing value

PRINT @@intMapAccum;
PRINT @@stringMapAccum;

#Checking and getting keys
if @@stringMapAccum.containsKey(1) THEN
PRINT @@stringMapAccum.get(1);
END;

#Map nesting
@@nestedMapAccum += ( 1 -> ("foo" -> "bar") );
@@nestedMapAccum += ( 1 -> ("flip" -> "top") );
@@nestedMapAccum += ( 2 -> ("fizz" -> "pop") );
@@nestedMapAccum += ( 1 -> ("foo" -> "s") );

PRINT @@nestedMapAccum;

if @@nestedMapAccum.containsKey(1) THEN
if @@nestedMapAccum.get(1).containsKey("foo") THEN
PRINT @@nestedMapAccum.get(1).get("foo");
END;
END;
}

 


mapAccumEx.json Result
GSQL > RUN QUERY mapAccumEx()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [
{"@@intMapAccum.containsKey(baz)": true},
{"@@intMapAccum.get(bar)": 2},
{"@@intMapAccum.get(root)": 0},
{"@@intMapAccum": {
"bar": 2,
"foo": 3,
"baz": 3
}},
{"@@stringMapAccum": {
"1": "apple",
"2": "pear",
"3": "banana",
"4": "abc"
}},
{"@@stringMapAccum.get(1)": "apple"},
{"@@nestedMapAccum": {
"1": {
"foo": "bars",
"flip": "top"
},
"2": {"fizz": "pop"}
}},
{"@@nestedMapAccum.get(1).get(foo)": "bars"}
]
}

ArrayAccum

The ArrayAccum type maintains an array of accumulators. An array is a fixed-length sequence of elements, with direct access to elements by position.  The ArrayAccum has these particular characteristics:

  • The elements are accumulators, not primitive or base data types. All accumulators, except HeapAccum, MapAccum, and GroupByAccum, can be used.
  • An ArrayAccum instance can be multidimensional. There is no limit to the number of dimensions.
  • The size can be set at run-time (dynamically).
  • There are operators which update the entire array efficiently.

When an ArrayAccum is declared, the instance name should be followed by a pair of brackets for each dimension.  The brackets may either contain an integer constant to set the size of the array, or they may be empty. If the brackets are empty, the size must be set with the reallocate function before the ArrayAccum can be used.


ArrayAccum declaration example
ArrayAccum<SetAccum<STRING>> @@names[10];
ArrayAccum<SetAccum<INT>> @@ids[][]; // 2-dimensional, size to be determined

Because each element of an ArrayAccum itself is an accumulator, the operators =, +=, and + can be used in two contexts: accumulator-level and element-level.

Element-level operations

If @A is an ArrayAccum of length 6, then @A[0] and @A[5] refer to its first and last elements, respectively. Referring to an ArrayAccum element is like referring to an accumulator of that type.  For example, given the following definitions:

ArrayAccum<SumAccum<INT>> @@Sums[3];
ArrayAccum<ListAccum<STRING>> @@Lists[2];

then @@Sums[0], @@Sums[1], and @@Sums[2] each refer to an individual SumAccum<INT>, and @@Lists[0] and @@Lists[1] each refer to a ListAccum<STRING>, supporting all the operations for those accumulator and data types.

@@Sums[1] = 1;
@@Sums[1] += 2; // value is now 3
@@Lists[0] = "cat";
@@Lists[0] += "egory"; // list is now ["cat", "egory"]

Accumulator-level operations

The operators =, +=, and + have special meanings when applied to an ArrayAccum as a whole. These operations efficiently update an entire ArrayAccum. All of the ArrayAccums must have the same element type.

Operator | Description | Example
= | Sets the ArrayAccum on the left equal to the ArrayAccum on the right. The two ArrayAccums must have the same element type, but the left-side ArrayAccum will change its size and dimensions to match the one on the right side. | @A = @B;
+ | Performs element-by-element addition of two ArrayAccums of the same type and size. The result is a new ArrayAccum of the same size. | @C = @A + @B; // @A and @B must be the same size
+= | Performs element-by-element accumulation (+=) from the right-side ArrayAccum to the left-side ArrayAccum. They must be the same type and size. | @A += @B; // @A and @B must be the same size
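
Element-by-element accumulation between two whole arrays can be sketched in Python as follows. The sketch assumes one-dimensional arrays whose elements behave like SumAccum<INT>; array_add is a hypothetical helper, and raising an error on a size mismatch is an assumption about how the same-size requirement might be enforced.

```python
def array_add(a, b):
    """Model of element-by-element addition of two same-size arrays."""
    if len(a) != len(b):
        raise ValueError("ArrayAccums must be the same size")
    return [x + y for x, y in zip(a, b)]


array_b = [100, 99, 98, 97]
array_c = [100, 99, 98, 97]

array_b = array_add(array_b, array_c)   # models @B += @C
array_a = array_add(array_b, array_c)   # models @A = @B + @C
```

With other element types, the per-element step would be that accumulator's own += operation instead of integer addition.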

ArrayAccum also supports the following class functions.


Functions which modify the ArrayAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.

function | return type | Accessor / Mutator | description
size() | INT | Accessor | Returns the total number of elements in the (multi-dimensional) array. For example, the size of an ArrayAccum declared as @A[3][4] is 12.
reallocate(INT, …) | VOID | Mutator | Discards the previous ArrayAccum instance and creates a new ArrayAccum, with the size(s) given. An N-dimensional ArrayAccum requires N integer parameters. The reallocate function cannot be used to change the number of dimensions.

Example of ArrayAccum Element-level Operations

CREATE QUERY ArrayAccumElem() FOR GRAPH minimalNet {

ArrayAccum<SumAccum<DOUBLE>> @@aaSumD[2][2]; # 2D Sum Double
ArrayAccum<SumAccum<STRING>> @@aaSumS[2][2]; # 2D Sum String
ArrayAccum<MaxAccum<INT>> @@aaMax[2];
ArrayAccum<MinAccum<UINT>> @@aaMin[2];
ArrayAccum<AvgAccum> @@aaAvg[2];
ArrayAccum<AndAccum<BOOL>> @@aaAnd[2];
ArrayAccum<OrAccum<BOOL>> @@aaOr[2];
ArrayAccum<BitwiseAndAccum> @@aaBitAnd[2];
ArrayAccum<BitwiseOrAccum> @@aaBitOr[2];
ArrayAccum<ListAccum<INT>> @@aaList[2][2]; # 2D List
ArrayAccum<SetAccum<FLOAT>> @@aaSetF[2];
ArrayAccum<BagAccum<DATETIME>> @@aaBagT[2];

## for test data
ListAccum<STRING> @@words;
BOOL toggle = false;
@@words += "1st"; @@words += "2nd"; @@words += "3rd"; @@words += "4th";

# Int: a[0] += 1, 2; a[1] += 3, 4
# Bool: alternate true/false
# Float: a[0] += 1.111, 2.222; a[1] += 3.333, 4.444
# 2D Doub: a[0][0] += 1.111, 2.222; a[0][1] += 5.555, 6.666;
# a[1][0] += 3.333, 4.444; a[1][1] += 7.777, 8.888;

FOREACH i IN RANGE [0,1] DO
FOREACH n IN RANGE [1, 2] DO
toggle = NOT toggle;
@@aaMax[i] += i*2 + n;
@@aaMin[i] += i*2 + n;
@@aaAvg[i] += i*2 + n;
@@aaAnd[i] += toggle;
@@aaOr[i] += toggle;
@@aaBitAnd[i] += i*2 + n;
@@aaBitOr[i] += i*2 + n;
@@aaSetF[i] += (i*2 + n)/0.9;
@@aaBagT[i] += epoch_to_datetime(i*2 + n);

FOREACH j IN RANGE [0,1] DO
@@aaSumD[i][j] += (j*4 + i*2 + n)/0.9;
@@aaSumS[i][j] += @@words.get((j*2 + i + n)%4);
@@aaList[i][j] += j*4 +i*2 + n ;
END;
END;
END;

PRINT @@aaSumD; PRINT @@aaSumS;
PRINT @@aaMax; PRINT @@aaMin; PRINT @@aaAvg;
PRINT @@aaAnd; PRINT @@aaOr;
PRINT @@aaBitAnd; PRINT @@aaBitOr;
PRINT @@aaList; PRINT @@aaSetF; PRINT @@aaBagT;
}


ArrayAccumElem.json Results
GSQL > RUN QUERY ArrayAccumElem()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [
{"@@aaSumD": [
[ 3.33333, 12.22222 ],
[ 7.77778, 16.66667 ]
]},
{"@@aaSumS": [
[ "2nd3rd", "4th1st" ],
[ "3rd4th", "1st2nd" ]
]},
{"@@aaMax": [ 2, 4 ]},
{"@@aaMin": [ 1, 3 ]},
{"@@aaAvg": [ 1.5, 3.5 ]},
{"@@aaAnd": [ false, false ]},
{"@@aaOr": [ true, true ]},
{"@@aaBitAnd": [ 0, 0 ]},
{"@@aaBitOr": [ 3, 7 ]},
{"@@aaList": [
[
[ 1, 2 ],
[ 5, 6 ]
],
[
[ 3, 4 ],
[ 7, 8 ]
]
]},
{"@@aaSetF": [
[ 2.22222, 1.11111 ],
[ 4.44444, 3.33333 ]
]},
{"@@aaBagT": [
[ 2, 1 ],
[ 4, 3 ]
]}
]
}

 


Example of Operations between Whole ArrayAccums

CREATE QUERY ArrayAccumOp3(INT lenA) FOR GRAPH minimalNet {

ArrayAccum<SumAccum<INT>> @@arrayA[5]; // Original size
ArrayAccum<SumAccum<INT>> @@arrayB[2];
ArrayAccum<SumAccum<INT>> @@arrayC[][]; // No size
STRING msg;
@@arrayA.reallocate(lenA); # Set/Change size dynamically
@@arrayB.reallocate(lenA+1);
@@arrayC.reallocate(lenA, lenA+1);

// Initialize arrays
FOREACH i IN RANGE[0,lenA-1] DO
@@arrayA[i] += i*i;
FOREACH j IN RANGE[0,lenA] DO
@@arrayC[i][j] += j*10 + i;
END;
END;
FOREACH i IN RANGE[0,lenA] DO
@@arrayB[i] += 100-i;
END;
msg = "Initial Values";
PRINT msg, @@arrayA, @@arrayB, @@arrayC;

msg = "Test 1: A = C, C = B"; // = operator
@@arrayA = @@arrayC; // change dimensions: 1D <- 2D
@@arrayC = @@arrayB; // change dimensions: 2D <- 1D
PRINT msg, @@arrayA, @@arrayC;

msg = "Test 2: B += C"; // += operator
@@arrayB += @@arrayC; // B and C must have same size & dim
PRINT msg, @@arrayB, @@arrayC;

msg = "Test 3: A = B + C"; // + operator
@@arrayA = @@arrayB + @@arrayC; // B & C must have same size & dim
PRINT msg, @@arrayA; // A changes size & dim
}


ArrayAccumOp3.json Results
GSQL > RUN QUERY ArrayAccumOp3(3)
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [
{
"msg": "Initial Values",
"@@arrayC": [
[ 0, 10, 20, 30 ],
[ 1, 11, 21, 31 ],
[ 2, 12, 22, 32 ]
],
"@@arrayB": [ 100, 99, 98, 97 ],
"@@arrayA": [ 0, 1, 4 ]
},
{
"msg": "Test 1: A = C, C = B",
"@@arrayC": [ 100, 99, 98, 97 ],
"@@arrayA": [
[ 0, 10, 20, 30 ],
[ 1, 11, 21, 31 ],
[ 2, 12, 22, 32 ]
]
},
{
"msg": "Test 2: B += C",
"@@arrayC": [ 100, 99, 98, 97 ],
"@@arrayB": [ 200, 198, 196, 194 ]
},
{
"msg": "Test 3: A = B + C",
"@@arrayA": [ 300, 297, 294, 291 ]
}
]
}

 


Example for Vertex-Attached ArrayAccum
CREATE QUERY arrayAccumLocal() FOR GRAPH socialNet api("v2") {
# Count each person's edges by type
# friend/liked/posted edges are type 0/1/2, respectively
ArrayAccum<SumAccum<INT>> @edgesByType[3];
Persons = {person.*};

Persons = SELECT s
FROM Persons:s -(:e)-> :t
ACCUM CASE e.type
WHEN "friend" THEN s.@edgesByType[0] += 1
WHEN "liked" THEN s.@edgesByType[1] += 1
WHEN "posted" THEN s.@edgesByType[2] += 1
END
ORDER BY s.id;

#PRINT Persons.@edgesByType; // api v1
PRINT Persons[Persons.@edgesByType]; // api v2
}


Results for Query arrayAccumLocal
GSQL > RUN QUERY arrayAccumLocal()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [{"Persons": [
{
"v_id": "person1",
"attributes": {"Persons.@edgesByType": [ 2, 1, 1 ]},
"v_type": "person"
},
{
"v_id": "person2",
"attributes": {"Persons.@edgesByType": [ 2, 2, 1 ]},
"v_type": "person"
},
{
"v_id": "person3",
"attributes": {"Persons.@edgesByType": [ 2, 1, 1 ]},
"v_type": "person"
},
{
"v_id": "person4",
"attributes": {"Persons.@edgesByType": [ 3, 1, 1 ]},
"v_type": "person"
},
{
"v_id": "person5",
"attributes": {"Persons.@edgesByType": [ 2, 1, 2 ]},
"v_type": "person"
},
{
"v_id": "person6",
"attributes": {"Persons.@edgesByType": [ 2, 1, 2 ]},
"v_type": "person"
},
{
"v_id": "person7",
"attributes": {"Persons.@edgesByType": [ 2, 1, 2 ]},
"v_type": "person"
},
{
"v_id": "person8",
"attributes": {"Persons.@edgesByType": [ 3, 1, 2 ]},
"v_type": "person"
}
]}]
}

 

HeapAccum

The HeapAccum type maintains a sorted collection of tuples and enforces a maximum number of tuples in the collection. The output of a HeapAccum is a sorted collection of tuple elements. The += arg operation adds a tuple to the collection in sorted order. If the HeapAccum is already at maximum capacity when the += operator is applied, then the tuple which is last in the sorted order is dropped from the HeapAccum. Sorting of tuples is performed on one or more defined tuple fields ordered either ascending or descending. Sorting precedence is performed based on defined tuple fields from left to right.

The declaration of a HeapAccum is more complex than for most other accumulators, because the user must define a custom tuple type, set the maximum capacity of the HeapAccum, and specify how the HeapAccum should be sorted. The declaration syntax is outlined in the figure below:


HeapAccum declaration syntax
TYPEDEF TUPLE<type field_1,.., type field_n> tupleName;

HeapAccum<tupleName>(capacity, field_a [ASC|DESC],… , field_z [ASC|DESC]);

First, the HeapAccum declaration must be preceded by a TYPEDEF statement which defines the tuple type. At least one of the fields (field_1, …, field_n) must be of a data type that can be sorted.

In the declaration of the HeapAccum itself, the keyword "HeapAccum" is followed by the tuple type in angle brackets < >. This is followed by a parenthesized list of two or more parameters. The first parameter is the maximum number of tuples that the HeapAccum may store. This parameter must be a positive integer. The subsequent parameters are a subset of the tuple's fields, which are used as sort keys. The sort key hierarchy is from left to right, with the leftmost key being the primary sort key. The keywords ASC and DESC indicate Ascending (lowest value first) or Descending (highest value first) sort order. Ascending order is the default.
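
The combination of a capacity limit and a left-to-right sort key hierarchy can be sketched in Python. This is a model for illustration, not the real data structure: HeapAccumModel and Result are hypothetical names, the model keeps a sorted list rather than a true heap, and top returns None when empty instead of a tuple of default values.

```python
from typing import NamedTuple


class Result(NamedTuple):
    first_name: str
    last_name: str
    score: int


class HeapAccumModel:
    """Toy model of a HeapAccum: a sorted list with a maximum capacity."""

    def __init__(self, capacity, key):
        self.capacity = capacity
        self.key = key          # sort key; earlier fields take precedence
        self.items = []

    def accumulate(self, item):
        # Models "+= item": insert in sorted order, then drop the overflow.
        self.items.append(item)
        self.items.sort(key=self.key)
        del self.items[self.capacity:]

    def top(self):
        # Assumption: None stands in for the default-valued tuple.
        return self.items[0] if self.items else None

    def resize(self, capacity):
        self.capacity = capacity
        del self.items[capacity:]


# Capacity 4, sorted by score descending, then last name ascending.
heap = HeapAccumModel(4, key=lambda r: (-r.score, r.last_name))
for row in [("Bruce", "Wayne", 80), ("Peter", "Parker", 80),
            ("Tony", "Stark", 100), ("Bruce", "Banner", 95),
            ("Jean", "Summers", 95), ("Clark", "Kent", 80)]:
    heap.accumulate(Result(*row))

kept = [r.last_name for r in heap.items]
```

Negating the score in the key is how this sketch expresses a descending sort on a numeric field while keeping the second field ascending.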

HeapAccum also supports the following class functions.


Functions which modify the HeapAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.

function | return type | Accessor / Mutator | description
size() | INT | Accessor | Returns the number of elements in the heap.
top() | tupleType | Accessor | Returns the top tuple. If this heap is empty, returns a tuple with each element equal to the default value.
pop() | tupleType | Mutator | Returns the top tuple and removes it from the heap. If this heap is empty, returns a tuple with each element equal to the default value.
resize(INT) | VOID | Mutator | Changes the maximum capacity of the heap.
clear() | VOID | Mutator | Clears the heap so it becomes empty with size 0.

 


HeapAccum Example
#HeapAccum Example
CREATE QUERY heapAccumEx() FOR GRAPH minimalNet {
TYPEDEF tuple<STRING firstName, STRING lastName, INT score> testResults;

#Heap with max size of 4 sorted descending by score then ascending last name
HeapAccum<testResults>(4, score DESC, lastName ASC) @@topTestResults;

PRINT @@topTestResults.top();

@@topTestResults += testResults("Bruce", "Wayne", 80);
@@topTestResults += testResults("Peter", "Parker", 80);
@@topTestResults += testResults("Tony", "Stark", 100);
@@topTestResults += testResults("Bruce", "Banner", 95);
@@topTestResults += testResults("Jean", "Summers", 95);
@@topTestResults += testResults("Clark", "Kent", 80);

#Show element with the highest sorted position
PRINT @@topTestResults.top();
PRINT @@topTestResults.top().firstName, @@topTestResults.top().lastName, @@topTestResults.top().score;

PRINT @@topTestResults;

#Increase the size of the heap to add more elements
@@topTestResults.resize(5);

#Find the size of the current heap
PRINT @@topTestResults.size();

@@topTestResults += testResults("Bruce", "Wayne", 80);
@@topTestResults += testResults("Peter", "Parker", 80);

PRINT @@topTestResults;

#Resizing smaller WILL REMOVE excess elements from the HeapAccum
@@topTestResults.resize(3);
PRINT @@topTestResults;

#Increasing capacity will not restore dropped elements
@@topTestResults.resize(5);
PRINT @@topTestResults;

#Removes all elements from the HeapAccum
@@topTestResults.clear();
PRINT @@topTestResults.size();
}

 


heapAccumEx.json Results

 



GSQL > RUN QUERY heapAccumEx()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"api": "v2"
},
"results": [
{"@@topTestResults.top()": {
"firstName": "",
"lastName": "",
"score": 0
}},
{"@@topTestResults.top()": {
"firstName": "Tony",
"lastName": "Stark",
"score": 100
}},
{
"@@topTestResults.top().firstName": "Tony",
"@@topTestResults.top().lastName": "Stark",
"@@topTestResults.top().score": 100
},
{"@@topTestResults": [
{
"firstName": "Tony",
"lastName": "Stark",
"score": 100
},
{
"firstName": "Bruce",
"lastName": "Banner",
"score": 95
},
{
"firstName": "Jean",
"lastName": "Summers",
"score": 95
},
{
"firstName": "Clark",
"lastName": "Kent",
"score": 80
}
]},
{"@@topTestResults.size()": 4},
{"@@topTestResults": [
{
"firstName": "Tony",
"lastName": "Stark",
"score": 100
},
{
"firstName": "Bruce",
"lastName": "Banner",
"score": 95
},
{
"firstName": "Jean",
"lastName": "Summers",
"score": 95
},
{
"firstName": "Clark",
"lastName": "Kent",
"score": 80
},
{
"firstName": "Peter",
"lastName": "Parker",
"score": 80
}
]},
{"@@topTestResults": [
{
"firstName": "Tony",
"lastName": "Stark",
"score": 100
},
{
"firstName": "Bruce",
"lastName": "Banner",
"score": 95
},
{
"firstName": "Jean",
"lastName": "Summers",
"score": 95
}
]},
{"@@topTestResults": [
{
"firstName": "Tony",
"lastName": "Stark",
"score": 100
},
{
"firstName": "Bruce",
"lastName": "Banner",
"score": 95
},
{
"firstName": "Jean",
"lastName": "Summers",
"score": 95
}
]},
{"@@topTestResults.size()": 0}
]
}
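
The capacity and ordering behavior shown in the results above can be modelled outside GSQL. The following is a minimal Python sketch, not TigerGraph code: the class BoundedHeap and its methods are hypothetical names, and it returns None for an empty heap where GSQL returns a tuple of default values.

```python
from typing import List, NamedTuple, Optional


class TestResult(NamedTuple):
    first_name: str
    last_name: str
    score: int


class BoundedHeap:
    """Toy stand-in for HeapAccum<testResults>(capacity, score DESC, lastName ASC)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: List[TestResult] = []

    @staticmethod
    def _sort_key(t: TestResult):
        # score descending, then last name ascending
        return (-t.score, t.last_name)

    def add(self, t: TestResult) -> None:
        self.items.append(t)
        self.items.sort(key=self._sort_key)
        del self.items[self.capacity:]  # drop whatever no longer fits

    def resize(self, capacity: int) -> None:
        self.capacity = capacity
        del self.items[capacity:]  # shrinking discards the excess for good

    def top(self) -> Optional[TestResult]:
        return self.items[0] if self.items else None


heap = BoundedHeap(4)
for row in [("Bruce", "Wayne", 80), ("Peter", "Parker", 80), ("Tony", "Stark", 100),
            ("Bruce", "Banner", 95), ("Jean", "Summers", 95), ("Clark", "Kent", 80)]:
    heap.add(TestResult(*row))
first_snapshot = [t.last_name for t in heap.items]

heap.resize(5)
heap.add(TestResult("Bruce", "Wayne", 80))
heap.add(TestResult("Peter", "Parker", 80))
second_snapshot = [t.last_name for t in heap.items]

heap.resize(3)
heap.resize(5)  # growing again does not bring dropped elements back
third_snapshot = [t.last_name for t in heap.items]
```

The three snapshots correspond to the three distinct states of @@topTestResults printed by heapAccumEx.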

GroupByAccum

The GroupByAccum is a compound accumulator, an accumulator of accumulators.  At the top level, it is a MapAccum where both the key and the value can have multiple fields. Moreover, each of the value fields is an accumulator type.


GroupByAccum syntax
GroupByAccum<type [, type]* , accumType [, accumType]* >

In the EBNF above, the type terms form the key set, and the accumType terms form the map’s value. Since they are accumulators, they perform a grouping. Like a MapAccum, if we try to store a (key->value) whose key has already been used, then the new value will accumulate to the data which is already stored.  In this case, each field of the multiple-field value has its own accumulation function. One way to think about GroupByAccum is that each unique key is a group ID.

In GroupByAccum, the key types can be base type, tuple, or STRING COMPRESS. The accumulators are used for aggregating group values.  Each accumulator type can be any type except HeapAccum. Each base type and each accumulator type must be followed by an alias. Below is an example declaration.

GroupByAccum<INT a, STRING b, MaxAccum<INT> maxa, ListAccum<ListAccum<INT>> lists> @@group;

To add new data to this GroupByAccum, the data should be formatted as (key1, key2 -> value1, value2).

GroupByAccum also supports the following class functions.


Functions which modify the GroupByAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.
function (KEY1..KEYn are the key types) | return type | Accessor / Mutator | description
size() | INT | Accessor | Returns the number of elements in the GroupByAccum.
get(KEY1 key_value1, KEY2 key_value2 ...) | element type(s) of the accumulator(s) | Accessor | Returns the values from each accumulator in the group associated with the given key(s). If the key(s) do not exist, returns the default value(s) of the accumulator type(s).
containsKey(KEY1 key_value1, KEY2 key_value2 ...) | BOOL | Accessor | Returns true if the accumulator contains the key(s); otherwise returns false.
clear() | VOID | Mutator | Clears the GroupByAccum so it becomes empty with size 0.
remove(KEY1 key_value1, KEY2 key_value2 ...) | VOID | Mutator | Removes the group associated with the key(s).

GroupByAccum Example
#GroupByAccum Example
CREATE QUERY groupByAccumEx () FOR GRAPH socialNet {
## declaration: the first two primitive types are the group-by keys; the remaining accumulator types are the aggregates
GroupByAccum<INT a, STRING b, MaxAccum<INT> maxa, ListAccum<ListAccum<INT>> lists> @@group;
GroupByAccum<STRING gender, MapAccum<VERTEX<person>, DATETIME> m> @@group2;
# nested GroupByAccum
GroupByAccum<INT a, MaxAccum<INT> maxa, GroupByAccum<INT a, MaxAccum<INT> maxa> heap> @@group3;

Start = { person.* };

## usage of global GroupByAccum
@@group += (1, "a" -> 1, [1]);
@@group += (1, "a" -> 2, [2]);
@@group += (2, "b" -> 1, [4]);

@@group3 += (2 -> 1, (2 -> 0) );
@@group3 += (2 -> 1, (2 -> 5) );
@@group3 += (2 -> 5, (3 -> 3) );
PRINT @@group, @@group.get(1, "a"), @@group.get(1, "a").lists, @@group.containsKey(1, "c"), @@group3;

## two kinds of foreach
FOREACH g IN @@group DO
PRINT g.a, g.b, g.maxa, g.lists;
END;
FOREACH (g1,g2,g3,g4) IN @@group DO
PRINT g1,g2,g3,g4;
END;

S = SELECT v
FROM Start:v - (liked:e) - post:t
ACCUM @@group2 += (v.gender -> (v -> e.actionTime));

PRINT @@group2, @@group2.get("Male").m, @@group2.get("Female").m;
}


Result for Query groupByAccumEx

 



GSQL > RUN QUERY groupByAccumEx()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“@@group.get(1,a).lists”: [
[1],
[2]
],
“@@group3”: [{
“a”: 2,
“heap”: [
{
“a”: 3,
“maxa”: 3
},
{
“a”: 2,
“maxa”: 5
}
],
“maxa”: 5
}],
“@@group.containsKey(1,c)”: false,
“@@group.get(1,a)”: {
“lists”: [
[1],
[2]
],
“maxa”: 2
},
“@@group”: [
{
“a”: 2,
“b”: “b”,
“lists”: [[4]],
“maxa”: 1
},
{
“a”: 1,
“b”: “a”,
“lists”: [
[1],
[2]
],
“maxa”: 2
}
]
},
{
“g.b”: “b”,
“g.maxa”: 1,
“g.lists”: [[4]],
“g.a”: 2
},
{
“g.b”: “a”,
“g.maxa”: 2,
“g.lists”: [
[1],
[2]
],
“g.a”: 1
},
{
“g1”: 2,
“g2”: “b”,
“g3”: 1,
“g4”: [[4]]
},
{
“g1”: 1,
“g2”: “a”,
“g3”: 2,
“g4”: [
[1],
[2]
]
},
{
“@@group2.get(Male).m”: {
“person3”: 1263618953,
“person1”: 1263209520,
“person8”: 1263180365,
“person7”: 1263295325,
“person6”: 1263468185
},
“@@group2”: [
{
“gender”: “Male”,
“m”: {
“person3”: 1263618953,
“person1”: 1263209520,
“person8”: 1263180365,
“person7”: 1263295325,
“person6”: 1263468185
}
},
{
“gender”: “Female”,
“m”: {
“person4”: 1263352565,
“person2”: 2526519281,
“person5”: 1263330725
}
}
],
“@@group2.get(Female).m”: {
“person4”: 1263352565,
“person2”: 2526519281,
“person5”: 1263330725
}
}
]
}
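
The grouping behavior of @@group in the query above can be imitated with an ordinary dictionary keyed by the key tuple. This is an illustrative Python sketch only; the names Group, groups, and accumulate are hypothetical, and it covers just the MaxAccum and ListAccum fields used by @@group.

```python
from typing import Dict, List, Optional, Tuple


class Group:
    """Value side of one group: a MaxAccum<INT> and a ListAccum<ListAccum<INT>>."""

    def __init__(self) -> None:
        self.maxa: Optional[int] = None      # MaxAccum keeps the largest value seen
        self.lists: List[List[int]] = []     # ListAccum appends each new value


groups: Dict[Tuple[int, str], Group] = {}


def accumulate(key: Tuple[int, str], maxa_value: int, list_value: List[int]) -> None:
    """Model of '@@group += (a, b -> maxa_value, list_value)'."""
    group = groups.setdefault(key, Group())  # a new key starts a new group
    group.maxa = maxa_value if group.maxa is None else max(group.maxa, maxa_value)
    group.lists.append(list_value)


accumulate((1, "a"), 1, [1])
accumulate((1, "a"), 2, [2])   # same key: each field accumulates in its own way
accumulate((2, "b"), 1, [4])
```

Each field accumulates by its own rule, which is why group (1, "a") ends with maxa 2 and lists [[1], [2]] in the printed result.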

Nested Accumulators


Certain collection accumulators may be nested. That is, an accumulator may contain a collection of elements where the elements themselves are accumulators. For example:

ListAccum<ListAccum<INT>> @@matrix; # a 2-dimensional jagged array of integers. Each inner list has its own unique size.


Only ListAccum, ArrayAccum, MapAccum, and GroupByAccum can contain other accumulators. However, not all combinations of collection accumulators are allowed. The following constraints apply:



  1. ListAccum:

    ListAccum is the only accumulator type which can be nested within ListAccum, up to a depth of 3:
     

    ListAccum<ListAccum<INT>>
    ListAccum<ListAccum<ListAccum<INT>>>
    ListAccum<SetAccum<INT>> # illegal


  2. MapAccum:

    All accumulator types, except for HeapAccum, can be nested within MapAccum as the value type. For example,
     

    MapAccum<STRING, ListAccum<INT>>
    MapAccum<INT, MapAccum<INT, STRING>>
    MapAccum<VERTEX, SumAccum<INT>>
    MapAccum<STRING, SetAccum<VERTEX>>
    MapAccum<STRING, GroupByAccum<VERTEX a, MaxAccum<INT> maxs>>
    MapAccum<SetAccum<INT>, INT> # illegal





  3. GroupByAccum:

    All accumulator types, except for HeapAccum, can be nested within GroupByAccum as the accumulator type. For example:
     

    GroupByAccum<INT a, STRING b, MaxAccum<INT> maxs, ListAccum<ListAccum<INT>> lists>

  4. ArrayAccum: Unlike the other accumulators in this list, where nesting is optional, nesting is mandatory for ArrayAccum. See the ArrayAccum section above.

 

It is legal to define nested ListAccums to form a multi-dimensional array. Note the declaration statements and the nested [ bracket ] notation in the example below:

CREATE QUERY nestedAccumEx() FOR GRAPH minimalNet {
ListAccum<ListAccum<INT>> @@_2d_list;
ListAccum<ListAccum<ListAccum<INT>>> @@_3d_list;
ListAccum<INT> @@_1d_list;
SumAccum <INT> @@sum = 4;

@@_1d_list += 1;
@@_1d_list += 2;
// add 1D-list to 2D-list as element
@@_2d_list += @@_1d_list;

// add 1D-enum-list to 2D-list as element
@@_2d_list += [@@sum, 5, 6];
// combine 2D-enum-list and 2d-list
@@_2d_list += [[7, 8, 9], [10, 11], [12]];

// add an empty 1D-list
@@_1d_list.clear();
@@_2d_list += @@_1d_list;

// combine two 2D-list
@@_2d_list += @@_2d_list;

PRINT @@_2d_list;

// test 3D-list
@@_3d_list += @@_2d_list;
@@_3d_list += [[7, 8, 9], [10, 11], [12]];
PRINT @@_3d_list;
}


nestedAccumEx.json Results
GSQL > RUN QUERY nestedAccumEx()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“@@_2d_list”: [
[1,2],
[4,5,6],
[7,8,9],
[10,11],
[12],
[],
[1,2],
[4,5,6],
[7,8,9],
[10,11],
[12],
[]
]},
{“@@_3d_list”: [
[
[1,2],
[4,5,6],
[7,8,9],
[10,11],
[12],
[],
[1,2],
[4,5,6],
[7,8,9],
[10,11],
[12],
[]
],
[
[7,8,9],
[10,11],
[12]
]
]}
]
}
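
The difference between adding a 1D list as a single element and combining two 2D lists is the same as the difference between appending and extending a list. A rough Python analogy, assuming only the behavior visible in the results above (the variable names are illustrative):

```python
one_d = [1, 2]
two_d = []

two_d.append(list(one_d))                   # 2D += 1D: the whole list becomes one row
two_d.append([4, 5, 6])                     # 2D += [@@sum, 5, 6] with @@sum == 4
two_d.extend([[7, 8, 9], [10, 11], [12]])   # 2D += 2D: rows are concatenated

one_d.clear()
two_d.append(list(one_d))                   # an empty row is still a row

two_d.extend([list(row) for row in two_d])  # 2D += itself doubles the rows

three_d = []
three_d.append([list(row) for row in two_d])    # 3D += 2D: one new plane
three_d.append([[7, 8, 9], [10, 11], [12]])     # 3D += 2D literal: another plane
```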

 

 


End of Accumulators Section

Operators, Functions, and Expressions

 

An expression is a combination of fixed values, variables, operators, function calls, and groupings which specify a computation, resulting in a data value. This section of the specification describes the literals (fixed values), operators, and functions available in the GSQL query language. It covers the subset of the EBNF definitions shown below. However, more so than in other sections of the specification, syntax alone is not an adequate description. The semantics (functionality) of the particular operators and functions are an essential complement to the syntax.


EBNF for Operations, Functions, and Expressions

constant := numeric | stringLiteral | TRUE | FALSE | GSQL_UINT_MAX | GSQL_INT_MAX | GSQL_INT_MIN | TO_DATETIME "(" stringLiteral ")"

mathOperator := "*" | "/" | "%" | "+" | "-" | "<<" | ">>" | "&" | "|"

comparisonOperator := "<" | "<=" | ">" | ">=" | "==" | "!="

condition := expr
| expr comparisonOperator expr
| expr [ NOT ] IN setBagExpr
| expr IS [ NOT ] NULL
| expr BETWEEN expr AND expr
| "(" condition ")"
| NOT condition
| condition (AND | OR) condition
| (TRUE | FALSE)

expr := ["@@"]name
| name "." "type"
| name "." ["@"]name
| name "." "@"name ["\'"]
| name "." name "." name "(" [argList] ")"
| name "." name "(" [argList] ")" [ "." FILTER "(" condition ")" ]
| name ["<" type ["," type]* ">"] "(" [argList] ")"
| name "." "@"name ("." name "(" [argList] ")")+ ["." name]
| "@@"name ("." name "(" [argList] ")")+ ["." name]
| COALESCE "(" [argList] ")"
| ( COUNT | ISEMPTY | MAX | MIN | AVG | SUM ) "(" setBagExpr ")"
| expr mathOperator expr
| "-" expr
| "(" expr ")"
| "(" argList "->" argList ")" // key value pair for MapAccum
| "[" argList "]" // a list
| constant
| setBagExpr
| name "(" argList ")"

setBagExpr := ["@@"]name
| name "." ["@"]name
| name "." "@"name ("." name "(" [argList] ")")+
| name "." name "(" [argList] ")" [ "." FILTER "(" condition ")" ]
| "@@"name ("." name "(" [argList] ")")+
| setBagExpr (UNION | INTERSECT | MINUS) setBagExpr
| "(" argList ")"
| "(" setBagExpr ")"

argList := expr ["," expr]*

Constants

constant := numeric | stringLiteral | TRUE | FALSE | GSQL_UINT_MAX | GSQL_INT_MAX | GSQL_INT_MIN | TO_DATETIME "(" stringLiteral ")"

Each primitive data type supports constant values:

Data Type | Constant | Examples
Numeric types (INT, UINT, FLOAT, DOUBLE) | numeric | 123, -5, 45.67, 2.0e-0.5
UINT | GSQL_UINT_MAX |
INT | GSQL_INT_MAX, GSQL_INT_MIN |
boolean | TRUE, FALSE |
string | stringLiteral | "atoz@com", "0.25"

GSQL_UINT_MAX = 2 ^ 64 - 1 = 18446744073709551615

GSQL_INT_MAX = 2 ^ 63 - 1 =  9223372036854775807

GSQL_INT_MIN = -2 ^ 63     = -9223372036854775808
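
The three limits are the usual bounds of 64-bit unsigned and signed integers. As a quick arithmetic check, here they are recomputed in Python, whose integers have arbitrary precision (the Python variable names simply mirror the GSQL constants):

```python
GSQL_UINT_MAX = 2**64 - 1      # largest unsigned 64-bit value
GSQL_INT_MAX = 2**63 - 1       # largest signed 64-bit value
GSQL_INT_MIN = -(2**63)        # smallest signed 64-bit value
```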

Operators

An operator is a keyword token which performs a specific computational function to return a resulting value, using the adjacent expressions (its operands) as input values.  An operator is similar to a function in that both compute a result from inputs, but syntactically they are different. The most familiar operators are the mathematical operators for addition (+) and subtraction (-).


Tip: The operators listed in this section are designed to behave like the operators in MySQL.

 

Mathematical Operators and Expressions

We support the following standard mathematical operators and meanings. The last four (<<, >>, &, and |) are for bitwise operations.  See the section “Bit Operators” below.

mathOperator := "*" | "/" | "%" | "+" | "-" | "<<" | ">>" | "&" | "|"


Operator precedences are shown in the following list, from highest precedence to the lowest. Operators that are shown together on a line have the same precedence:


Operator Precedence, highest to lowest
*, /, %
-, +
<<, >>
&
|
==, >=, >, <=, <, !=

 

 

 

 


Example 1. Math Operators + - * /
CREATE QUERY mathOperators() FOR GRAPH minimalNet api("v2")
{
int x,y;
int z1,z2,z3,z4,z5;
float f1,f2,f3,f4;

x = 7;
y = 3;

z1 = x * y; # z1 = 21
z2 = x - y; # z2 = 4
z3 = x + y; # z3 = 10
z4 = x / y; # z4 = 2 (integer division)
z5 = x / 4.0; # z5 = 1 (1.75 is truncated when stored in an int)
f1 = x / y; # f1 = 2 (integer division, then stored in a float)
f2 = x / 4.0; # f2 = 1.75
f3 = x % 3; # f3 = 1
f4 = x % y; # f4 = 1

PRINT x,y;
PRINT z1 AS xTIMESy, z2 AS xMINUSy, z3 AS xPLUSy, z4 AS xDIVy, z5 AS xDIV4f;
PRINT f1 AS xDIVy, f2 AS xDIV4f, f3 AS xMOD3, f4 AS xMODy;
}


mathOperators.json Results
GSQL > RUN QUERY mathOperators()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“x”: 7,
“y”: 3
},
{
“xTIMESy”: 21,
“xPLUSy”: 10,
“xMINUSy”: 4,
“xDIVy”: 2,
“xDIV4f”: 1
},
{
“xMODy”: 1,
“xMOD3”: 1,
“xDIVy”: 2,
“xDIV4f”: 1.75
}
]
}
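
The results above show that the type of the operands, not the type of the target variable, decides whether a division is integer or floating-point. A small Python sketch of the same arithmetic follows; it mirrors only the positive-operand cases shown in this example and makes no claim about how GSQL rounds negative quotients.

```python
x, y = 7, 3

z4 = x // y           # INT / INT is integer division: 2
z5 = int(x / 4.0)     # INT / FLOAT gives 1.75; storing it in an INT keeps 1
f1 = float(x // y)    # still integer division, then widened to FLOAT: 2.0
f2 = x / 4.0          # 1.75
f4 = x % y            # remainder: 1
```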

Boolean Operators

We support the standard Boolean operators AND, OR, and NOT, with the standard order of precedence.

Bit Operators

Bit operators (<<, >>, &, and |) operate on integers and return an integer.


Bit Operators
CREATE QUERY bitOperationTest() FOR GRAPH minimalNet{
PRINT 80 >> 2; # 20
PRINT 80 << 2; # 320
PRINT 2 + 80 >> 4; # 5
PRINT 2 | 3 ; # 3
PRINT 2 & 3 ; # 2
PRINT 2 | 3 + 2; # 7
PRINT 2 & 3 - 2; # 0
}
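
Python happens to rank +, -, <<, >>, &, and | in the same relative order as the precedence list above, so the commented results in bitOperationTest can be double-checked by evaluating the same expressions there. The explicit parentheses show how each mixed expression groups.

```python
results = {
    "80 >> 2": 80 >> 2,
    "80 << 2": 80 << 2,
    "2 + 80 >> 4": 2 + 80 >> 4,   # groups as (2 + 80) >> 4
    "2 | 3": 2 | 3,
    "2 & 3": 2 & 3,
    "2 | 3 + 2": 2 | 3 + 2,       # groups as 2 | (3 + 2)
    "2 & 3 - 2": 2 & 3 - 2,       # groups as 2 & (3 - 2)
}
```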

String Operators

Operator + can be used for concatenating strings.

Tuple Fields

The fields of the tuple can be accessed using the dot operator.

Comparison Operators and Conditions

A condition is an expression which evaluates to a boolean value of either true or false. One type of condition uses the familiar comparison operators.

A comparison operator compares two numeric values.

comparisonOperator := "<" | "<=" | ">" | ">=" | "==" | "!="

condition := expr
| expr comparisonOperator expr
| expr [ NOT ] IN setBagExpr
| expr IS [ NOT ] NULL
| expr BETWEEN expr AND expr
| "(" condition ")"
| NOT condition
| condition (AND | OR) condition
| (TRUE | FALSE)
| expr NOT? LIKE expr (ESCAPE ESCAPE_CHAR)?

BETWEEN expr AND expr


The expression expr1 BETWEEN expr2 AND expr3 is true if the value expr1 is in the range from expr2 to expr3, including the endpoint values. Each expression must be numeric.


"expr1 BETWEEN expr2 AND expr3" is equivalent to "expr1 <= expr3 AND expr1 >= expr2".


BETWEEN AND example
CREATE QUERY mathOperatorBetween() FOR GRAPH minimalNet
{
int x;
bool b;
x = 1;
b = (x BETWEEN 0 AND 100); PRINT b; # True
b = (x BETWEEN 1 AND 2); PRINT b; # True
b = (x BETWEEN 0 AND 1); PRINT b; # True
}
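
Because both endpoints are included, all three tests in the example are true. The same inclusive-range check, written as a tiny Python function (the name between is illustrative):

```python
def between(value: float, low: float, high: float) -> bool:
    """Inclusive range test: value >= low and value <= high."""
    return low <= value <= high


x = 1
checks = [between(x, 0, 100), between(x, 1, 2), between(x, 0, 1)]
```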

 

IS NULL, IS NOT NULL

IS NULL and IS NOT NULL can be used for checking whether an optional parameter is given any value.


IS NULL example
CREATE QUERY parameterIsNULL (INT p) FOR GRAPH minimalNet {
IF p IS NULL THEN
PRINT "p is null";
ELSE
PRINT "p is not null";
END;
}

 


parameterIsNULL.json Results
GSQL > RUN QUERY parameterIsNULL(_)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“p is null”: “p is null”}]
}
GSQL > RUN QUERY parameterIsNULL(3)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“p is not null”: “p is not null”}]
}

Every attribute value stored in GSQL is a valid value, so IS NULL and IS NOT NULL are only effective for query parameters.

 

LIKE

The LIKE operator is used for string pattern matching. The expression string1 LIKE string_pattern evaluates to boolean true if string1 matches the pattern in string_pattern; otherwise it is false. Both operands must be strings. LIKE may be used only in WHERE clauses. Additionally, string_pattern supports the following wildcard and other symbols, in order to express a pattern:

character or syntax | meaning
% | matches zero or more characters. Example: "%abc%" matches any string which contains the sequence "abc".
_ (underscore) | matches any single character. Example: "_abc_e" matches any 6-character string where the 2nd to 4th characters are "abc" and the last character is "e".
[charlist] | matches any character in charlist. charlist is a concatenated character set, with no separators. Example: "[Tiger]" matches either T, i, g, e, or r.
[^charlist] | matches any character NOT in charlist. Example: "[^qxz]" matches any character other than q, x, or z.
[!charlist] | matches any character NOT in charlist.

special syntax within charlist | meaning
α-β | matches a character in the range from α to β. A charlist can have multiple ranges. Example: "[a-mA-M0-3]" matches a letter from a to m, upper or lower case, or a digit from 0 to 3.
\\ | matches the character \
\\] | matches the character ]. No special treatment is needed for [ inside a charlist. Example: "%[\\]!]" matches any string which ends with either ] or !
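
One way to get a feel for these rules is to translate a LIKE pattern into a regular expression. The sketch below is a hedged Python approximation, not the GSQL implementation: like_to_regex and like are hypothetical helpers, the pattern is assumed to have already had its string-literal escapes resolved (so the GSQL literal "\\]" arrives as a backslash followed by ]), and the optional ESCAPE clause is not handled.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE-style pattern into an anchored regular expression."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "%":
            out.append(".*")                      # zero or more characters
        elif ch == "_":
            out.append(".")                       # exactly one character
        elif ch == "[":
            j = i + 1
            negate = j < n and pattern[j] in "^!"
            if negate:
                j += 1
            members = []
            while j < n and pattern[j] != "]":
                if pattern[j] == "\\" and j + 1 < n:
                    j += 1                        # escaped character, taken literally
                    members.append(re.escape(pattern[j]))
                elif pattern[j] == "-" and members and j + 1 < n and pattern[j + 1] != "]":
                    members.append("-")           # range such as a-m
                else:
                    members.append(re.escape(pattern[j]))
                j += 1
            if j >= n:
                raise ValueError("unterminated charlist in pattern")
            out.append("[" + ("^" if negate else "") + "".join(members) + "]")
            i = j                                 # now positioned on the closing ]
        else:
            out.append(re.escape(ch))             # ordinary character
        i += 1
    return re.compile("".join(out), re.DOTALL)


def like(text: str, pattern: str) -> bool:
    """True if the whole of text matches the LIKE-style pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None
```

For example, like("TigerGraph", "%Graph") is true under this model, while like("TigerGraph", "Graph%") is false because the pattern must match the whole string.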

Mathematical Functions

There are a number of built-in functions which act on either an accumulator, a base type, or vertex variable. The accumulator function calls are discussed in detail in the “Accumulators” section.

Below is a list of built-in functions which act on either INT, FLOAT, or DOUBLE value(s).

function name and parameters (NUM means INT, FLOAT, or DOUBLE) | description | return type
abs(NUM num) | Returns the absolute value of num | Same as parameter type
sqrt(NUM num) | Returns the square root of num | FLOAT
pow(NUM base, NUM exp) | Returns base^exp (base raised to the power exp) | If base and exp are both INT → INT; otherwise → FLOAT
acos(NUM num) | arc cosine | FLOAT
asin(NUM num) | arc sine | FLOAT
atan(NUM num) | arc tangent | FLOAT
atan2(NUM y, NUM x) | arc tangent of y / x | FLOAT
ceil(NUM num) | rounds upward | INT
cos(NUM num) | cosine | FLOAT
cosh(NUM num) | hyperbolic cosine | FLOAT
exp(NUM num) | base-e exponential | FLOAT
floor(NUM num) | rounds downward | INT
fmod(NUM numer, NUM denom) | floating-point remainder of numer / denom | FLOAT
ldexp(NUM x, NUM exp) | x * 2^exp | FLOAT
log(NUM num) | natural logarithm | FLOAT
log10(NUM num) | common (base-10) logarithm | FLOAT
sin(NUM num) | sine | FLOAT
sinh(NUM num) | hyperbolic sine | FLOAT
tan(NUM num) | tangent | FLOAT
tanh(NUM num) | hyperbolic tangent | FLOAT
to_string(NUM num) | Converts num to a STRING value | STRING
float_to_int(FLOAT num) | Converts num to an INT value by truncating the fractional part | INT
str_to_int(STRING str) | Converts str to an INT value. If str is a floating-point number, the fractional part is truncated; if str is not a numerical value, returns 0. | INT
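
The truncation and fallback rules of float_to_int and str_to_int are easy to misread, so here is a small Python model of them. It is a sketch under assumptions: the Python functions reuse the GSQL names for readability, and Python's float() parser decides what counts as a "numerical value", which may be more permissive than GSQL. The last two lines use Python's own math.ldexp and math.fmod, which compute the same quantities the table describes.

```python
import math


def float_to_int(num: float) -> int:
    """Drop the fractional part (truncate toward zero)."""
    return math.trunc(num)


def str_to_int(text: str) -> int:
    """Parse text as a number and truncate it; fall back to 0 if it is not numeric."""
    try:
        return math.trunc(float(text))
    except (ValueError, OverflowError):
        return 0


scaled = math.ldexp(3.0, 4)       # 3.0 * 2**4
remainder = math.fmod(7.5, 2.0)   # floating-point remainder of 7.5 / 2.0
```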

String Functions

The following built-in functions are provided for text processing. Note that these functions do not modify the input parameter. They each return a new string.

function name and parameters
description
return type

lower(STRING str)
Converts str to all lowercase letters
STRING

upper(STRING str)
Converts str to all uppercase letters
STRING

trim( [ [ LEADING | TRAILING | BOTH ] [STRING removal_char] FROM ] STRING str )
Trims* characters from the leading and/or trailing ends of str
STRING

Notes about the trim() function:

  • In the syntax for trim(), the words LEADING, TRAILING, BOTH, and FROM are keywords which should appear exactly as shown.
  • STRING is just an indicator of the datatype; it is not an explicit keyword.
  • The trim() function has the following options:
    • By using one of the keywords LEADING, TRAILING, or BOTH, the user can specify that characters are to be removed from the left end, right end, or both ends of the string, respectively.  If none of these keywords is used, the function removes characters from both ends.
    • removal_char is a single character. The function will remove consecutive instances of removal_char, until it encounters a different character. If removal_char is not specified, then trim() removes whitespace (spaces, tabs, and newlines).
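
The options described in the notes above map closely onto Python's lstrip, rstrip, and strip. The following sketch models them; the function signature is an illustrative choice (GSQL uses keywords inside the call rather than separate arguments), and Python's default whitespace set is somewhat broader than spaces, tabs, and newlines.

```python
from typing import Optional


def trim(text: str, removal_char: Optional[str] = None, where: str = "BOTH") -> str:
    """Model of trim([[LEADING | TRAILING | BOTH] [removal_char] FROM] text)."""
    if removal_char is not None and len(removal_char) != 1:
        raise ValueError("removal_char must be a single character")
    if where == "LEADING":
        return text.lstrip(removal_char)   # None means: strip whitespace
    if where == "TRAILING":
        return text.rstrip(removal_char)
    if where == "BOTH":
        return text.strip(removal_char)
    raise ValueError("where must be LEADING, TRAILING, or BOTH")
```

Like the GSQL functions, this returns a new string and leaves the input untouched.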

Datetime Functions

The following functions convert between DATETIME and other types.

function name and parameters | description | return type
to_datetime(STRING str) | Converts str to a DATETIME value | DATETIME
epoch_to_datetime(INT int_value) | Converts int_value to a DATETIME value by epoch time conversion | DATETIME
datetime_to_epoch(DATETIME date) | Converts date to epoch time. | INT

The following function converts a DATETIME value into a string format specified by the user:

function name and parameters | description | return type
datetime_format(DATETIME date [, STRING str]) | Prints date as str indicates. The specifiers listed below may be used in the format string str. The “%” character is required before the format specifier characters. If str is not given, “%Y-%m-%d %H:%M:%S” is used. | STRING

specifier | description
%Y | Year, numeric, four digits
%S | Seconds (0..59)
%m | Month, numeric (1..12)
%M | Minutes, numeric (0..59)
%H | Hour, numeric (0..23)
%d | Day of the month, numeric (1..31)

datetime_format example
# Show the post time of every post
CREATE QUERY allPostTime() FOR GRAPH socialNet api("v2") {
start = {post.*};
#PRINT datetime_format(start.postTime, "a message was posted at %H:%M:%S on %Y/%m/%d");
PRINT start[datetime_format(start.postTime, "a message was posted at %H:%M:%S on %Y/%m/%d") as postTimeMsg]; // api v2
}
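
The six specifiers also exist in Python's strftime, which makes it convenient for previewing a format string. The sketch below is an analogy, not GSQL: the Python functions borrow the GSQL names, the conversion is assumed to be in UTC, and Python zero-pads %m, %d, %H, %M, and %S, which this document does not confirm for GSQL. The epoch value 1297054971 and the timestamp "2011-02-07 05:02:51" both appear in the sample outputs of this document.

```python
from datetime import datetime, timezone


def epoch_to_datetime(seconds: int) -> datetime:
    """Seconds since 1970-01-01 00:00:00 UTC to a datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def datetime_to_epoch(moment: datetime) -> int:
    """Inverse of epoch_to_datetime."""
    return int(moment.timestamp())


def datetime_format(moment: datetime, fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Render moment using %-specifiers; other text in fmt is copied through."""
    return moment.strftime(fmt)


post_time = epoch_to_datetime(1297054971)
message = datetime_format(post_time, "a message was posted at %H:%M:%S on %Y/%m/%d")
```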

The following are other functions related to DATETIME:

function name and parameters | description | return type
now() | Returns the current time in DATETIME type. | DATETIME
year(DATETIME date) | Extracts the year of date. | INT
month(DATETIME date) | Extracts the month of date. | INT
day(DATETIME date) | Extracts the day of month of date. | INT
hour(DATETIME date) | Extracts the hour of date. | INT
minute(DATETIME date) | Extracts the minute of date. | INT
second(DATETIME date) | Extracts the second of date. | INT
datetime_add(DATETIME date, INTERVAL int_value time_unit) | INTERVAL is a keyword; time_unit is one of the keywords YEAR, MONTH, DAY, HOUR, MINUTE, or SECOND. The function returns the DATETIME value which is int_value units later than date. For example, datetime_add(now(), INTERVAL 1 MONTH) returns a DATETIME value which is 1 month from now. | DATETIME
datetime_sub(DATETIME date, INTERVAL int_value time_unit) | Same as datetime_add, except that the returned value is int_value units earlier than date. | DATETIME
datetime_diff(DATETIME date1, DATETIME date2) | Returns the difference in seconds of these two DATETIME values: (date1 - date2). | INT
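
The sign convention of datetime_diff and the effect of datetime_add and datetime_sub can be illustrated with Python's timedelta. This is a hedged sketch: the functions reuse the GSQL names but take the unit as a string argument, only the fixed-length units are modelled (YEAR and MONTH depend on the calendar and are left out), and the difference is taken as date1 minus date2.

```python
from datetime import datetime, timedelta

# Fixed-length units only; YEAR and MONTH are deliberately not modelled here.
FIXED_UNITS = {
    "DAY": timedelta(days=1),
    "HOUR": timedelta(hours=1),
    "MINUTE": timedelta(minutes=1),
    "SECOND": timedelta(seconds=1),
}


def datetime_add(moment: datetime, amount: int, unit: str) -> datetime:
    """Return the moment that is `amount` units later."""
    return moment + amount * FIXED_UNITS[unit]


def datetime_sub(moment: datetime, amount: int, unit: str) -> datetime:
    """Return the moment that is `amount` units earlier."""
    return moment - amount * FIXED_UNITS[unit]


def datetime_diff(date1: datetime, date2: datetime) -> int:
    """Difference in seconds, date1 minus date2."""
    return int((date1 - date2).total_seconds())


base = datetime(2011, 2, 7, 5, 2, 51)
one_hour_later = datetime_add(base, 1, "HOUR")
```

With this convention the result is positive when date1 is the later of the two values.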

JSONOBJECT and JSONARRAY Functions

JSONOBJECT and JSONARRAY are base types, meaning they can be used as a parameter type, an element type for most accumulators, or a return type.  This enables the input and output of complex, customized data structures. For input and output, a string representation of the JSON is used. Hence, the GSQL query language offers several functions to convert a formatted string into JSON and then to search and access the components of a JSON structure.


Data Conversion Functions

The following parsing functions convert a string into a JSONOBJECT or a JSONARRAY:

function name | description | return type
parse_json_object(STRING str) | Converts str into a JSON object | JSONOBJECT
parse_json_array(STRING str) | Converts str into a JSON array | JSONARRAY

Both functions generate a run-time error if the input string cannot be converted into a JSON object or a JSON array. To be properly formatted, besides having the proper nesting and matching of curly braces { } and brackets [ ], each value field must be one of the following: a string (in double quotes), a number, a boolean (true or false), or a JSONOBJECT or JSONARRAY. Each key of a key:value pair must be a string in double quotes.

See examples below.


parse_json_object and parse_json_array example
CREATE QUERY jsonEx (STRING strA, STRING strB) FOR GRAPH minimalNet {
JSONARRAY jsonA;
JSONOBJECT jsonO;

jsonA = parse_json_array( strA );
jsonO = parse_json_object( strB );

PRINT jsonA, jsonO;
}


jsonEx.json Result
GSQL > RUN QUERY jsonEx("[123]","{\"abc\":123}")
or curl -X GET 'http://localhost:9000/query/jsonEx?strA=\[123\]&strB=\{"abc":123\}'
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{
“jsonA”: [123],
“jsonO”: {“abc”: 123}
}]
}
GSQL > RUN QUERY jsonEx("{123}","{\"123\":\"123\"}")
Runtime Error: {123} cannot be parsed as a json array.
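
The two failure modes, malformed JSON and well-formed JSON of the wrong top-level kind, can be reproduced with Python's json module. This is an analogy under assumptions: the Python functions reuse the GSQL names, and the error type and message wording are illustrative rather than what TigerGraph emits.

```python
import json


def parse_json_array(text: str) -> list:
    """Parse text and insist that the top-level value is an array."""
    value = json.loads(text)   # raises json.JSONDecodeError (a ValueError) if malformed
    if not isinstance(value, list):
        raise ValueError(f"{text} cannot be parsed as a json array.")
    return value


def parse_json_object(text: str) -> dict:
    """Parse text and insist that the top-level value is an object."""
    value = json.loads(text)
    if not isinstance(value, dict):
        raise ValueError(f"{text} cannot be parsed as a json object.")
    return value
```

Here "{123}" fails because an object key must be a double-quoted string, which is the same rule stated above for GSQL.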

 


Data Access Methods

JSONOBJECT and JSONARRAY are object classes, each class supporting a set of data access methods, using dot notation:


jsonVariable.functionName(parameter_list)

The following methods (class functions) can act on a JSONOBJECT variable:

method name | description | return type
containsKey(STRING keyStr) | Returns a boolean value indicating whether the JSON object contains the key keyStr. | BOOL
getInt(STRING keyStr) | Returns the numeric value associated with key keyStr as an INT. | INT
getDouble(STRING keyStr) | Returns the numeric value associated with key keyStr as a DOUBLE. | DOUBLE
getString(STRING keyStr) | Returns the string value associated with key keyStr. | STRING
getBool(STRING keyStr) | Returns the bool value associated with key keyStr. | BOOL
getJsonObject(STRING keyStr) | Returns the JSONOBJECT associated with key keyStr. | JSONOBJECT
getJsonArray(STRING keyStr) | Returns the JSONARRAY associated with key keyStr. | JSONARRAY

Each of the above getType(STRING keyStr) functions (getInt, getDouble, getString, and so on) generates a run-time error if

  1. The key keyStr doesn’t exist, or
  2. The function’s return type is different than the stored value type. See the note below about numeric data.

Note: Pure JSON stores “numbers” without distinguishing between INT and DOUBLE, but for TigerGraph, if the input value is all digits, it will be stored as INT. Other numeric values are stored as DOUBLE.  The getDouble function can read an INT and return its equivalent DOUBLE value, but it is an error to call getInt for a DOUBLE value.

The following methods can act on a JSONARRAY variable:

method name | description | return type
size() | Returns the size of this array. | INT
getInt(INT idx) | Returns the numeric value at position idx as an INT. | INT
getDouble(INT idx) | Returns the numeric value at position idx as a DOUBLE. | DOUBLE
getString(INT idx) | Returns the string value at position idx. | STRING
getBool(INT idx) | Returns the bool value at position idx. | BOOL
getJsonObject(INT idx) | Returns the JSONOBJECT value at position idx. | JSONOBJECT
getJsonArray(INT idx) | Returns the JSONARRAY value at position idx. | JSONARRAY

Similar to the methods of JSONOBJECT, each of the above getType(INT idx) functions generates a run-time error if

  1. idx is out of bounds, or
  2. The function’s return type is different than the stored value type. See the note below about numeric data.

Note: Pure JSON stores “numbers” without distinguishing between INT and DOUBLE, but for TigerGraph, if the input value is all digits, it will be stored as INT. Other numeric values are stored as DOUBLE.  The getDouble function can read an INT and return its equivalent DOUBLE value, but it is an error to call getInt for a DOUBLE value.

Below is an example of using these functions and methods:


JSONOBJECT and JSONARRAY function example

CREATE QUERY jsonEx2 () FOR GRAPH minimalNet {

JSONOBJECT jsonO, jsonO2;
JSONARRAY jsonA, jsonA2;
STRING str, str2;

str = "{\"int\":1, \"double\":3.0, \"string\":\"xyz\", \"bool\":true, \"obj\":{\"obj\":{\"bool\":false}}, \"arr\":[\"xyz\",123,true] }";
str2 = "[\"xyz\", 123, false, 5.0]";
jsonO = parse_json_object( str ) ;
jsonA = parse_json_array( str2 ) ;

jsonO2 = jsonO.getJsonObject("obj");
jsonA2 = jsonO.getJsonArray("arr");

PRINT jsonO;
PRINT jsonO.getBool("bool"), jsonO.getJsonObject("obj"), jsonO.getJsonArray("arr"), jsonO2.getJsonObject("obj"), jsonA2.getString(0) , jsonA.getDouble(3), jsonA.getDouble(1);
}


jsonEx2.json Result
GSQL > RUN QUERY jsonEx2()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [ {“jsonO”: { “arr”: [ “xyz”, 123, true ],
“bool”: true,
“string”: “xyz”,
“double”: 3,
“obj”: {“obj”: {“bool”: false}},
“int”: 1
}},
{
“jsonO.getBool(bool)”: true,
“jsonA.getDouble(3)”: 5,
“jsonA.getDouble(1)”: 123,
“jsonO.getJsonObject(obj)”: {“obj”: {“bool”: false}},
“jsonO2.getJsonObject(obj)”: {“bool”: false},
“jsonO.getJsonArray(arr)”: [ “xyz”, 123, true ],
“jsonA2.getString(0)”: “xyz”
}
]
}
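
The INT versus DOUBLE rule for the accessors (getDouble accepts an INT, getInt rejects a DOUBLE) is worth seeing in isolation. The sketch below models it in Python, where json.loads likewise stores all-digit numbers as int and other numbers as float. The helper names get_int and get_double and the exception types are illustrative choices, not TigerGraph's.

```python
import json


def get_int(obj: dict, key: str) -> int:
    """Return obj[key] only if it was stored as an integer."""
    value = obj[key]                      # KeyError if the key does not exist
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"value for {key!r} is not stored as an INT")
    return value


def get_double(obj: dict, key: str) -> float:
    """Return obj[key] as a float; an integer is widened, anything else is an error."""
    value = obj[key]
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"value for {key!r} is not numeric")
    return float(value)


doc = json.loads('{"int": 1, "double": 3.0, "bool": true}')
```

This matches the example result above, where getDouble(1) on the array element 123 succeeds and prints 123.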

Vertex, Edge, and Accumulator Functions and Attributes

Accessing attributes

Attributes on vertices or edges are defined in the graph schema. Additionally, each vertex and edge has a built-in STRING attribute called type which represents the user-defined type of that edge or vertex. These attributes, including type, can be accessed for a particular edge or vertex with the dot operator.

For example, the following code snippet shows two different SELECT statements which produce equivalent results. The first uses the dot operator on the vertex variable v to access the “subject” attribute, which is defined in the graph schema. The FROM clause in the first SELECT statement necessitates that any target vertices will be of type “post” (also defined in the graph schema). The second SELECT statement checks that the vertex variable v’s type is a “post” vertex by using the dot operator to access the built-in type attribute.


Accessing vertex variable attributes
CREATE QUERY coffeeRelatedPosts() FOR GRAPH socialNet
{
allVertices = {ANY};
results = SELECT v FROM allVertices:s -(:e)-> post:v WHERE v.subject == "coffee";
PRINT results;
results = SELECT v FROM allVertices:s -(:e)-> :v WHERE v.type == "post" AND v.subject == "coffee";
PRINT results;
}

Results for Query coffeeRelatedPosts
GSQL > RUN QUERY coffeeRelatedPosts()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“results”: [{
“v_id”: “4”,
“attributes”: {
“postTime”: “2011-02-07 05:02:51”,
“subject”: “coffee”
},
“v_type”: “post”
}]},
{“results”: [{
“v_id”: “4”,
“attributes”: {
“postTime”: “2011-02-07 05:02:51”,
“subject”: “coffee”
},
“v_type”: “post”
}]}
]
}

Vertex Functions

Below is a list of built-in functions that can be accessed by vertex aliases, using the dot operator:


Syntax for vertex functions
vertex_alias.function_name(parameter)[.FILTER(condition)]

Currently, these functions are only available for vertex aliases (defined in the FROM clause); vertex variables do not have these functions.

Note that in order to calculate outdegree by edge type, the graph schema must be defined such that vertices keep track of their edge types using WITH STATS="OUTDEGREE_BY_EDGETYPE" (however, "OUTDEGREE_BY_EDGETYPE" is now the default STATS option).

function name | description | return type
outdegree([STRING edgeType]) | Returns the number of outgoing or undirected edges connected to the vertex. If the optional STRING argument edgeType is given, then count only edges of the given edgeType. | INT
neighbors([STRING edgeType]) | Returns the set of ids for the vertices which are out-neighbors or undirected neighbors of the vertex. If the optional STRING argument edgeType is given, then include only those neighbors reachable by edges of the given edgeType. | BagAccum<VERTEX>
neighborAttribute(STRING edgeType, STRING targetVertexType, STRING attribute) | From the given vertex, traverses the given edgeType to the given targetVertexType, and returns the set of values for the given attribute. edgeType can only be a string literal. | BagAccum<attributeType>
edgeAttribute(STRING edgeType, STRING attribute) | From the given vertex, traverses the given edgeType, and returns the set of values for the given edge attribute. edgeType can only be a string literal. | BagAccum<attributeType>
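The four vertex functions can be pictured with a small Python model. This is only a sketch of the semantics described in the table, not GSQL and not TigerGraph's implementation: the edge list, the vertex table, and the Python function names are hypothetical, and a collections.Counter stands in for a BagAccum.

```python
from collections import Counter

# Hypothetical toy data: (source id, edge type, target id, edge attributes).
# An undirected edge is listed once, from the point of view of person5.
EDGES = [
    ("person5", "posted", "4", {}),
    ("person5", "posted", "11", {}),
    ("person5", "liked", "6", {"actionTime": 1263330725}),
    ("person5", "friend", "person7", {}),
    ("person5", "friend", "person4", {}),
]

# Hypothetical vertex table: id -> (vertex type, attributes).
VERTICES = {
    "4": ("post", {"postTime": 1297054971}),
    "11": ("post", {"postTime": 1296694941}),
    "6": ("post", {}),
    "person7": ("person", {}),
    "person4": ("person", {}),
}


def out_edges(vertex, edge_type=None):
    """Edges leaving `vertex`, optionally restricted to one edge type."""
    return [e for e in EDGES
            if e[0] == vertex and (edge_type is None or e[1] == edge_type)]


def outdegree(vertex, edge_type=None):
    return len(out_edges(vertex, edge_type))


def neighbors(vertex, edge_type=None):
    # A Counter stands in for BagAccum<VERTEX>.
    return Counter(target for _, _, target, _ in out_edges(vertex, edge_type))


def neighbor_attribute(vertex, edge_type, target_vertex_type, attribute):
    bag = Counter()
    for _, _, target, _ in out_edges(vertex, edge_type):
        vertex_type, attributes = VERTICES[target]
        if vertex_type == target_vertex_type:
            bag[attributes[attribute]] += 1
    return bag


def edge_attribute(vertex, edge_type, attribute):
    return Counter(attrs[attribute]
                   for _, _, _, attrs in out_edges(vertex, edge_type))


print(outdegree("person5"), outdegree("person5", "posted"))
print(sorted(neighbors("person5", "posted")))
```

In this model the optional edge type simply narrows the edges that are visited, which is why outdegree() without an argument can never be smaller than outdegree() with one.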

Vertex function examples

CREATE QUERY vertexFunctionExample(vertex<person> m1) FOR GRAPH socialNet {

  SetAccum<Vertex> @neighborSet;
  SetAccum<Vertex> @neighborSet2;
  SetAccum<DATETIME> @attr1;
  BagAccum<DATETIME> @attr2;

  int deg1, deg2, deg3, deg4;

  S = {m1};
  S2 = SELECT S
       FROM S - (posted:e) -> post:t
       ACCUM deg1 = S.outdegree(),
             deg2 = S.outdegree("posted"),
             deg3 = S.outdegree(e.type), # same as deg2
             STRING str = "posted",
             deg4 = S.outdegree(str); # same as deg2
  PRINT deg1, deg2, deg3, deg4;

  S3 = SELECT S
       FROM S:s
       POST-ACCUM s.@neighborSet += s.neighbors(),
                  s.@neighborSet2 += s.neighbors("posted"),
                  s.@attr1 += s.neighborAttribute("posted", "post", "postTime"),
                  s.@attr2 += s.edgeAttribute("liked", "actionTime");
  PRINT S3;
}


vertexFunctionExample Result

 



GSQL > RUN QUERY vertexFunctionExample(“person5”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“deg4”: 2,
“deg2”: 2,
“deg3”: 2,
“deg1”: 5
},
{“S3”: [{
“v_id”: “person5”,
“attributes”: {
“@attr2”: [1263330725],
“@attr1”: [
1297054971,
1296694941
],
“gender”: “Female”,
“@neighborSet”: [
“6”,
“11”,
“4”,
“person7”,
“person4”
],
“id”: “person5”,
“@neighborSet2”: [
“4”,
“11”
]
},
“v_type”: “person”
}]}
]
}

.FILTER

The optional .FILTER(condition) clause offers an additional filter for selecting which elements are added to the output set of the neighbors, neighborAttribute, and edgeAttribute functions. The condition is evaluated for each element. If the condition is true, the element is added to the output set; if false, it is not. An example is shown below:
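As a rough mental model, .FILTER behaves like a predicate applied to each candidate element before it enters the result. The Python sketch below illustrates that idea only; the data values (including the startYear figures) and the function signature are hypothetical, and the condition receives the edge attributes and the neighbor attributes as two dictionaries.

```python
# Hypothetical worksFor edges: (person id, company id, edge attributes).
WORKS_FOR = [
    ("person1", "company1", {"startYear": 2016}),
    ("person1", "company2", {"startYear": 2014}),
]
# Hypothetical vertex attributes.
COMPANY = {"company1": {"country": "us"}, "company2": {"country": "chn"}}
PERSON = {"person1": {"locationId": "us"}}


def neighbors(person, condition=lambda edge, company: True):
    """Return the neighbor ids whose (edge, neighbor) pair passes `condition`."""
    return [
        company_id
        for source, company_id, edge in WORKS_FOR
        if source == person and condition(edge, COMPANY[company_id])
    ]


# Filter on an edge attribute.
recent = neighbors("person1", lambda edge, company: edge["startYear"] >= 2016)
# Filter by comparing a source vertex attribute with a neighbor attribute.
abroad = neighbors(
    "person1",
    lambda edge, company: PERSON["person1"]["locationId"] != company["country"],
)
print(recent, abroad)
```

A condition that is always true, like the default here, leaves the result unchanged, which mirrors .filter(true).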


Example: vertex functions with optional filter

CREATE QUERY filterEx (SET<STRING> pIds, INT yr) FOR GRAPH workNet api("v2") {

  SetAccum<vertex<company>> @recentEmplr, @allEmplr;
  BagAccum<string> @diffCountry, @allCountry;

  Start = {person.*};

  L0 = SELECT v
       FROM Start:v
       WHERE v.id IN pIds
       ACCUM
         # filter using edge attribute
         v.@recentEmplr += v.neighbors("worksFor").filter(worksFor.startYear >= yr),
         v.@allEmplr += v.neighbors("worksFor").filter(true),

         # vertex alias attribute and neighbor type attribute
         v.@diffCountry += v.neighborAttribute("worksFor", "company", "id")
                            .filter(v.locationId != company.country),
         v.@allCountry += v.neighborAttribute("worksFor", "company", "id")
       ;

  PRINT yr, L0[L0.@recentEmplr, L0.@allEmplr, L0.@diffCountry, L0.@allCountry]; // api v2
}


Results in filterEx.json
GSQL > RUN QUERY filterEx(["person1","person2"],2016)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{
“L0”: [
{
“v_id”: “person1”,
“attributes”: {
“L0.@diffCountry”: [“company2”],
“L0.@recentEmplr”: [“company1”],
“L0.@allCountry”: [ “company1”, “company2” ],
“L0.@allEmplr”: [ “company2”, “company1” ]
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“L0.@diffCountry”: [“company1”],
“L0.@recentEmplr”: [],
“L0.@allCountry”: [ “company1”, “company2” ],
“L0.@allEmplr”: [ “company2”, “company1” ]
},
“v_type”: “person”
}
],
“yr”: 2016
}]
}

Edge Functions

Below are the built-in functions that can be accessed by edge aliases, using the dot operator. Edge functions follow the same general rules as vertex functions (see above).

function name | description | return type
isDirected() | Returns a boolean value indicating whether this edge is directed or undirected. | BOOL

Accumulator Functions

Accumulator functions for each accumulator type are described in the “Accumulator Type” section.

Set/Bag Expression and Operators

SELECT blocks take an input vertex set and perform various selection and filtering operations to produce an output set. Therefore, set/bag expressions and their operators are a useful and powerful part of the GSQL query language. A set/bag expression can use either SetAccum or BagAccum.


BNF
setBagExpr := ["@@"] name
            | name "." ["@"] name
            | name "." "@" name ("." name "(" [argList] ")")+
            | name "." name "(" [argList] ")" [ "." FILTER "(" condition ")" ]
            | "@@" name ("." name "(" [argList] ")")+
            | setBagExpr (UNION | INTERSECT | MINUS) setBagExpr
            | "(" argList ")"
            | "(" setBagExpr ")"

Set/Bag Expression Operators – UNION, INTERSECT, MINUS

The operators are straightforward. When both operands are sets, the result expression is a set. When at least one operand is a bag, the result expression is a bag. If one operand is a bag and the other is a set, the operator treats the set operand as a bag containing one of each value.
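These rules map closely onto Python's built-in set and collections.Counter (a multiset). The sketch below is a model of the rules just stated, not GSQL itself; the helper names union, intersect, and minus are chosen for illustration.

```python
import operator
from collections import Counter


def _combine(a, b, set_op, bag_op):
    """Apply set_op when both operands are sets, otherwise bag_op.

    Counter(a_set) turns a set into a bag holding one of each value.
    """
    if isinstance(a, (set, frozenset)) and isinstance(b, (set, frozenset)):
        return set_op(a, b)
    return bag_op(Counter(a), Counter(b))


def union(a, b):
    # Bag union adds multiplicities; set union does not duplicate.
    return _combine(a, b, operator.or_, operator.add)


def intersect(a, b):
    # Bag intersection keeps the smaller multiplicity of each value.
    return _combine(a, b, operator.and_, operator.and_)


def minus(a, b):
    # Bag difference subtracts multiplicities and drops values that reach zero.
    return _combine(a, b, operator.sub, operator.sub)


set_a = {1, 2, 3, 4}
bag_d = Counter([1, 2, 2, 3])
print(sorted(union(bag_d, set_a).elements()))
print(sorted(minus(bag_d, set_a).elements()))
```

Because each helper returns another set or Counter, calls can be nested in the same way that set/bag expressions can be chained.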


Set/Bag Operator Examples
# Demonstrate Set & Bag operators
CREATE QUERY setOperatorsEx() FOR GRAPH minimalNet {
  SetAccum<INT> @@setA, @@setB, @@AunionB, @@AintsctB, @@AminusB;
  BagAccum<INT> @@bagD, @@bagE, @@DunionE, @@DintsctE, @@DminusE;
  BagAccum<INT> @@DminusA, @@DunionA, @@AunionBbag;

  BOOL x;

  @@setA = (1,2,3,4); PRINT @@setA;
  @@setB = (2,4,6,8); PRINT @@setB;

  @@AunionB = @@setA UNION @@setB ; PRINT @@AunionB; // (1, 2, 3, 4, 6, 8)
  @@AintsctB = @@setA INTERSECT @@setB; PRINT @@AintsctB; // (2, 4)
  @@AminusB = @@setA MINUS @@setB ; PRINT @@AminusB; // (1, 3)

  @@bagD = (1,2,2,3); PRINT @@bagD;
  @@bagE = (2,3,5,7); PRINT @@bagE;

  @@DunionE = @@bagD UNION @@bagE; PRINT @@DunionE; // (1, 2, 2, 2, 3, 3, 5, 7)
  @@DintsctE = @@bagD INTERSECT @@bagE; PRINT @@DintsctE; // (2, 3)
  @@DminusE = @@bagD MINUS @@bagE; PRINT @@DminusE; // (1, 2)
  @@DminusA = @@bagD MINUS @@setA; PRINT @@DminusA; // (2)
  @@DunionA = @@bagD UNION @@setA; PRINT @@DunionA; // (1, 1, 2, 2, 2, 3, 3, 4)
  // because bag UNION set is a bag
  @@AunionBbag = @@setA UNION @@setB; PRINT @@AunionBbag; // (1, 2, 3, 4, 6, 8)
  // because set UNION set is a set
}

 


setOperatorsEx Query Results
GSQL > RUN QUERY setOperatorsEx()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [ {“@@setA”: [ 4, 3, 2, 1 ]},
{“@@setB”: [ 8, 6, 4, 2 ]},
{“@@AunionB”: [ 4, 3, 2, 1, 8, 6 ]},
{“@@AintsctB”: [ 4, 2 ]},
{“@@AminusB”: [ 3, 1 ]},
{“@@bagD”: [ 1, 2, 2, 3 ]},
{“@@bagE”: [ 2, 7, 3, 5 ]},
{“@@DunionE”: [ 1, 2, 2, 2, 3, 3, 7, 5 ]},
{“@@DintsctE”: [ 2, 3 ]},
{“@@DminusE”: [ 1, 2 ]},
{“@@DminusA”: [2]},
{“@@DunionA”: [ 1, 1, 2, 2, 2, 3, 3, 4 ]},
{“@@AunionBbag”: [ 6, 8, 1, 2, 3, 4 ]}
]
}

 

The result of these operators is another set or bag, so these operations can be nested and chained to form more complex expressions, such as

(setBagExpr_A INTERSECT (setBagExpr_B UNION setBagExpr_C) ) MINUS setBagExpr_D

Set/Bag Expression Membership Operators

The IN and NOT IN operators test whether a value is a member of a set/bag expression. They support all base types on the left-hand side, and any set/bag expression on the right-hand side. The base type must be the same as the element type of the set/bag expression. IN and NOT IN return a BOOL value.

For example, suppose setBagExpr_A is ("a", "b", "c"). Then:

"a" IN setBagExpr_A => true
"d" IN setBagExpr_A => false
"a" NOT IN setBagExpr_A => false
"d" NOT IN setBagExpr_A => true

The following example uses NOT IN to exclude neighbors that are on a blacklist.


Set Membership example
CREATE QUERY friendsNotInblacklist (VERTEX<person> seed, SET<VERTEX<person>> blackList) FOR GRAPH socialNet {
  Start = {seed};
  Result = SELECT v
           FROM Start:s-(friend:e)-person:v
           WHERE v NOT IN blackList;
  PRINT Result;
}

Results for Query friendsNotInblacklist
GSQL > RUN QUERY friendsNotInblacklist(“person1”, [“person2”])
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“Result”: [{
“v_id”: “person8”,
“attributes”: {
“gender”: “Male”,
“id”: “person8”
},
“v_type”: “person”
}]}]
}

Aggregation Functions – COUNT, SUM, MIN, MAX, AVG

The aggregation functions take a set/bag expression as their input parameter and return one value or element.

  • count(): Returns the size (INT) of the set or bag.
  • sum(): Returns the sum of all elements. This is only applicable to a set/bag expression with numeric type.
  • min(): Returns the member with minimum value. This is only applicable to a set/bag expression with numeric type.
  • max(): Returns the member with maximum value. This is only applicable to a set/bag expression with numeric type.
  • avg(): Returns the average of all elements. This is only applicable to a set/bag expression with numeric type. The average is INT if the element type of the set/bag expression is INT.
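The integer behavior of avg() is the one point that is easy to get wrong, so here is a small Python model of the five functions. The gsql_ prefixed names are hypothetical, and the rounding rule is an assumption: the sketch truncates toward zero, which is consistent with the aggregateFuncEx result that appears in this section (the average of -5, 2, and -1 is reported as -1), but the reference text does not state the rule explicitly.

```python
def gsql_count(values):
    return len(values)


def gsql_sum(values):
    return sum(values)


def gsql_min(values):
    return min(values)


def gsql_max(values):
    return max(values)


def gsql_avg(values):
    """Average; stays an integer when every element is an integer.

    Assumption: integer averages truncate toward zero.
    """
    total, n = sum(values), len(values)
    if all(isinstance(v, int) for v in values):
        quotient = abs(total) // n
        return quotient if total >= 0 else -quotient
    return total / n


t = [-5, 2, -1]
print(gsql_max(t), gsql_min(t), gsql_avg(t), gsql_count(t), gsql_sum(t))
```

Note that Python's own // operator floors rather than truncates, which is why the sketch divides the absolute value and restores the sign afterwards.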

Aggregation function example
CREATE QUERY aggregateFuncEx(BAG<INT> x) FOR GRAPH minimalNet {
BagAccum<INT> @@t;
@@t += -5; @@t += 2; @@t+= -1;
PRINT max(@@t), min(@@t), avg(@@t), count(@@t), sum(@@t);
PRINT max(x), min(x), avg(x), count(x), sum(x);
}

aggregateFuncEx.json Result
GSQL > RUN QUERY aggregateFuncEx([1,2,5])
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“sum(@@t)”: -4,
“count(@@t)”: 3,
“max(@@t)”: 2,
“avg(@@t)”: -1,
“min(@@t)”: -5
},
{
“avg(x)”: 2,
“count(x)”: 3,
“max(x)”: 5,
“min(x)”: 1,
“sum(x)”: 8
}
]
}

Miscellaneous Functions

SelectVertex()

SelectVertex() reads a data file which lists particular vertices of the graph and returns the corresponding vertex set. This function can only be used in a vertex set variable declaration statement as a seed set. The data file must be organized as a table with one or more columns. One column must hold the vertex id. Optionally, another column holds the vertex type. SelectVertex() has five parameters, explained in the table below: filePath, vertexIdColumn, vertexTypeColumn, separator, and header. The rules for column separators and column headings are the same as for the GSQL Loader.

 

parameter name | type | description
filePath | string | The absolute file path of the input file to be read. A relative path is not supported.
vertexIdColumn | $num, or $"column_name" if header is true | The vertex id column position.
vertexTypeColumn | $num, $"column_name" if header is true, or a vertex type | The vertex type column position or a specific vertex type.
separator | single-character string | The column separator character.
header | bool | Whether this file has a header.

One vertex set variable declaration statement can have multiple SelectVertex() function calls. However, if a declaration statement has multiple SelectVertex() calls referring to the same file, they must use the same separator and header parameters. If any row of the file contains an invalid vertex type, a run-time error occurs; if any row of the file contains a nonexistent vertex id, a warning message is shown with the count of nonexistent ids.
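The column-selection rules can be sketched in Python as follows. This is a simplified model rather than the real function: it takes the file contents as a string instead of an absolute file path, it does not check ids or types against a graph, and the argument encoding is hypothetical (an int stands for $num, a string beginning with "$" stands for $"column_name", and any other string is a fixed vertex type). Positions are treated as zero-based, which is an inference from the selectVertexEx example in this section, where $1 selects the second column.

```python
import csv
import io


def _pick(row, names, column):
    """Fetch one field by position (int) or by header name ("$name")."""
    if isinstance(column, int):
        return row[column]
    return row[names.index(column[1:])]


def select_vertex(text, id_column, type_column, separator=",", header=True):
    """Return a set of (vertex_type, vertex_id) pairs read from `text`."""
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    names = rows.pop(0) if header else []
    result = set()
    for row in rows:
        vertex_id = _pick(row, names, id_column)
        if isinstance(type_column, str) and not type_column.startswith("$"):
            vertex_type = type_column  # a fixed vertex type, e.g. "post"
        else:
            vertex_type = _pick(row, names, type_column)
        result.add((vertex_type, vertex_id))
    return result


DATA = "c1,c2,c3\nperson1,person,3\nperson5,person,4\nperson6,person,5\n"

# Two calls over the same data, combined into one seed set.
seed = select_vertex(DATA, "$c1", 1) | select_vertex(DATA, 2, "post")
print(sorted(seed))
```

Both calls share the same separator and header settings, matching the requirement for multiple calls on one file.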

Below is a query example using SelectVertex calls, reading from the data file selectVertexInput.csv.


selectVertexInput.csv
c1,c2,c3
person1,person,3
person5,person,4
person6,person,5

 


selectVertex example
CREATE QUERY selectVertexEx(STRING filename) FOR GRAPH socialNet {
  S = {SelectVertex(filename, $"c1", $1, ",", true),
       SelectVertex(filename, $2, post, ",", true)
      };
  PRINT S;
}

Result
GSQL > RUN QUERY selectVertexEx(“/file_directory/selectVertexInput.csv”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“S”: [
{
“v_id”: “4”,
“attributes”: {
“postTime”: “2011-02-07 05:02:51”,
“subject”: “coffee”
},
“v_type”: “post”
},
{
“v_id”: “person1”,
“attributes”: {
“gender”: “Male”,
“id”: “person1”
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“gender”: “Female”,
“id”: “person5”
},
“v_type”: “person”
},
{
“v_id”: “3”,
“attributes”: {
“postTime”: “2011-02-05 01:02:44”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “5”,
“attributes”: {
“postTime”: “2011-02-06 01:02:02”,
“subject”: “tigergraph”
},
“v_type”: “post”
},
{
“v_id”: “person6”,
“attributes”: {
“gender”: “Male”,
“id”: “person6”
},
“v_type”: “person”
}
]}]
}


to_vertex() and to_vertex_set()

to_vertex() and to_vertex_set() convert a string or a string set into a vertex or a vertex set, respectively, of a given vertex type. These two functions are useful when the vertex ids are known only at run time.


Running these functions requires real-time conversion of an external id to a GSQL internal id, which is a relatively slow process. Therefore,

  1. If the user can always know the id before running the query, define the query with VERTEX or SET<VERTEX> parameters instead of STRING or SET<STRING> parameters, and avoid calling to_vertex() or to_vertex_set().
  2. Calling to_vertex_set() one time is much faster than calling to_vertex() multiple times. Use to_vertex_set() instead of to_vertex() as much as possible.

The first parameter of to_vertex() is the vertex id string. The first parameter of to_vertex_set() is a string set representing vertex ids. The second parameter of both functions is the vertex type string.


Function signatures for to_vertex() and to_vertex_set()
VERTEX to_vertex(STRING id, STRING vertex_type)
SET<VERTEX> to_vertex_set(SET<STRING>, STRING vertex_type)
SET<VERTEX> to_vertex_set(BAG<STRING>, STRING vertex_type)

to_vertex_set can accept a bag of id strings as input, but the function will reduce the bag to a set by eliminating duplicate items.

If the vertex id or the vertex type doesn’t exist, to_vertex() will have a run-time error, as shown below. However, to_vertex_set() will have a run-time error only if the vertex type doesn’t exist. If one or more vertex ids are nonexistent, to_vertex_set() will display a warning message but will still run, converting all valid ids and skipping nonexistent vertex ids. If the user wants an error instead of a warning when a nonexistent id is given while converting a string set to a vertex set, the user can use to_vertex() inside a FOREACH loop, instead of to_vertex_set(). See the example below.
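The difference in error handling between the two functions can be modeled in a few lines of Python. This is a sketch of the behavior described above, not the actual implementation: the GRAPH catalog, the RuntimeQueryError class, and the wording of the missing-id error are hypothetical, and vertices are represented as (type, id) tuples.

```python
import warnings

# Hypothetical catalog: vertex type -> known external ids.
GRAPH = {"person": {"person1", "person2", "person3"}}


class RuntimeQueryError(Exception):
    """Stands in for a GSQL run-time error."""


def _check_type(vertex_type):
    if vertex_type not in GRAPH:
        raise RuntimeQueryError(f"{vertex_type} is not valid vertex type.")


def to_vertex(vertex_id, vertex_type):
    """Strict: an unknown type or an unknown id is an error."""
    _check_type(vertex_type)
    if vertex_id not in GRAPH[vertex_type]:
        raise RuntimeQueryError(f"{vertex_id} is not a valid {vertex_type} id.")
    return (vertex_type, vertex_id)


def to_vertex_set(ids, vertex_type):
    """Lenient on ids: unknown ids are skipped with a warning."""
    _check_type(vertex_type)
    known = GRAPH[vertex_type]
    valid = {(vertex_type, i) for i in ids if i in known}
    missing = len(set(ids)) - len(valid)
    if missing:
        warnings.warn(f"{missing} ids are invalid {vertex_type} vertex ids.")
    return valid


def to_vertex_set_strict(ids, vertex_type):
    """The FOREACH alternative: one to_vertex() call per id, so any bad id fails."""
    return {to_vertex(i, vertex_type) for i in ids}
```

The strict variant trades speed for safety: it makes one conversion call per id, which the performance notes above advise against unless an error on bad ids is really needed.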


to_vertex() and to_vertex_set() example
CREATE QUERY to_vertex_setTest (SET<STRING> uids, STRING uid, STRING vtype) FOR GRAPH workNet {
  SetAccum<VERTEX> @@v2, @@v3;
  SetAccum<STRING> @@strSet;
  VERTEX v;

  v = to_vertex (uid, vtype); # to_vertex assigned to a vertex variable
  PRINT v; # vertex variable -> only vertex id is printed

  @@v2 += to_vertex (uid, vtype); # to_vertex accumulated to a SetAccum<VERTEX>
  PRINT @@v2; # SetAccum of vertex -> only vertex ids are printed

  S2 = to_vertex_set (uids, vtype); # to_vertex_set assigned to a vertex set variable
  PRINT S2; # vertex set variable -> full details printed

  @@strSet = uids; # Show SET<STRING> & SetAccum<STRING> are the same
  S3 = to_vertex_set(@@strSet, vtype); # Input to to_vertex_set is SetAccum<STRING>
  SDIFF = S2 MINUS S3; # Now S2 = S3, so SDIFF is empty
  PRINT SDIFF.size();

  #FOREACH vid in uids DO # In this case non-existing ids in uids causes run-time error
  # @@v3 += to_vertex( vid, vtype );
  #END;
  #L3 = @@v3;
  #PRINT L3;
}


to_vertex_set.json Results
GSQL > RUN QUERY to_vertex_setTest(["person1","personx","person2"], "person3", "person")
{
“error”: false,
“message”: “Runtime Warning: 1 ids are invalid person vertex ids.”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“v”: “person3”},
{“@@v2”: [“person3”]},
{“S2”: [
{
“v_id”: “person1”,
“attributes”: {
“interestList”: [ “management”, “financial” ],
“skillSet”: [ 3, 2, 1 ],
“skillList”: [ 1, 2, 3 ],
“locationId”: “us”,
“interestSet”: [“financial”, “management” ],
“id”: “person1”
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“interestList”: [“engineering”],
“skillSet”: [ 6, 5, 3, 2 ],
“skillList”: [ 2, 3, 5, 6],
“locationId”: “chn”,
“interestSet”: [“engineering”],
“id”: “person2”
},
“v_type”: “person”
}
]},
{“SDIFF.size()”: 0}
]
}

GSQL > RUN QUERY to_vertex_setTest(["person1","personx"], "person1", "abc")
Runtime Error: abc is not valid vertex type.

COALESCE()

The COALESCE function evaluates each argument value in order, and returns the first value which is not NULL. This evaluation is the same as that used for IS NULL and IS NOT NULL. The COALESCE function requires all its arguments to have the same data type (BOOL, INT, FLOAT, DOUBLE, STRING, or VERTEX). The only exception is that different numeric types can be used together. In this case, all values are converted into the first argument's type.
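A short Python model makes the conversion rule concrete. It is a sketch under stated assumptions: Python's None stands in for NULL, and because None carries no type, the declared type of the first argument is passed explicitly as first_type (a hypothetical parameter). Conversion uses Python's int(), which truncates, and the all-NULL case returns the type's default value, as this section states further on for coalesceFuncEx2.

```python
def coalesce(first_type, *args):
    """Return the first non-None argument, converted to first_type.

    first_type models the declared type of the first COALESCE argument.
    """
    for value in args:
        if value is not None:
            return first_type(value)
    # Every argument was NULL: fall back to the default value of the type.
    return first_type()


# First argument declared as INT, so later numeric values are converted to int.
print(coalesce(int, None, 2.5, 999.5))
print(coalesce(str, None, None, "N/A"))
```

Supplying a literal default as the last argument, as in the second call, avoids relying on the type's default value.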


coalesce function example
CREATE QUERY coalesceFuncEx (INT p1, DOUBLE p2) FOR GRAPH minimalNet {
PRINT COALESCE(p1, p2, 999.5); # p2 and the last value will be converted into first argument type, which is INT.
}

 

 


coalesceFuncEx.json Results
GSQL > RUN QUERY coalesceFuncEx(_,_)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“coalesce(p1,p2,999.5)”: 999}]
}
GSQL > RUN QUERY coalesceFuncEx(1,2)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“coalesce(p1,p2,999.5)”: 1}]
}
GSQL > RUN QUERY coalesceFuncEx(_,2.5)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“coalesce(p1,p2,999.5)”: 2}]
}


The COALESCE function is useful when multiple optional parameters are allowed, and one of them must be chosen if available. For example,


coalesce function example
CREATE QUERY coalesceFuncEx2 (STRING homePhone, STRING cellPhone, STRING companyPhone) FOR GRAPH minimalNet {
  PRINT "contact number: " + COALESCE(homePhone, cellPhone, companyPhone); # test all NULL
  PRINT "contact number: " + COALESCE(homePhone, cellPhone, companyPhone, "N/A");
}

The COALESCE function’s parameter list should have a default value as the last argument. Otherwise, if all values are NULL, the default value of the data type is returned.


coalesceFuncEx2.json Results
GSQL > RUN QUERY coalesceFuncEx2(_,_,_)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“contact number: +coalesce(homePhone,cellPhone,companyPhone)”: “contact number: “},
{“contact number:+coalesce(homePhone,cellPhone,companyPhone,N/A)”: “contact number:N/A”}
]
}

 

Dynamic Expressions with EVALUATE()

The function evaluate() takes a string argument and interprets it as an expression which is evaluated during run-time. This enables users to create a general purpose query instead of separate queries for each specific computation.


evaluate(expressionStr, typeStr)

The evaluate() function has two parameters: expressionStr is the expression string, and typeStr is a string literal indicating the type of expression. This function returns a value whose type is typeStr and whose value is the evaluation of expressionStr.

The following rules apply:

  1. evaluate() can only be used inside a SELECT block, and only inside a WHERE clause, ACCUM clause, POST-ACCUM clause, HAVING clause, or ORDER BY clause. It cannot be used in a LIMIT clause or outside a SELECT block.
  2. The result type must be specified at query installation time: typeStr must be a string literal for a primitive data type, e.g., one of “int”, “float”, “double”, “bool”, “string” (case insensitive). The default value is “bool”.
  3. In expressionStr, identifiers can refer only to vertex or edge aliases, vertex-attached accumulators, global accumulators, parameters, or scalar function calls involving the above variables. The expression may not refer to local variables, global variables, or to FROM clause vertices or edges by type.
  4. Any accumulators in the expression must be scalar accumulators (e.g., MaxAccum) for primitive-type data. Container accumulators (e.g., SetAccum) or scalar accumulators with non-primitive type (e.g., VERTEX, EDGE, DATETIME) are not supported. Container type attributes are not supported.
  5. evaluate() cannot be nested.


The following situations generate a run-time error:


  1. The expression string expressionStr cannot be compiled (unless the error is due to a non-existent  vertex or edge attribute).

  2. The result type of the expression does not match the parameter typeStr.

Silent failure conditions



If any of the following conditions occur, the query may continue running, but the entire clause or statement in which the evaluate() function resides will fail, without producing a run-time error message. For conditional clauses (WHERE, HAVING), a failing evaluate() clause is treated as if the condition is false. An assignment statement with a failing evaluate() will not execute, and an ORDER BY clause with a failing evaluate() will not sort.

  1. The expression references a non-existent attribute of a vertex or edge alias.
  2. The expression applies an operator to incompatible operands. For example, 123 == “xyz”.
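The split between run-time errors and silent failures can be illustrated with a Python model. This is a loose sketch, not how TigerGraph evaluates expressions: Python's eval stands in for the GSQL expression compiler, the names EvaluateError, SKIPPED, and where are hypothetical, and only the first silent-failure condition (a non-existent attribute) is modeled, because Python itself accepts a comparison such as 123 == "xyz".

```python
from types import SimpleNamespace

PRIMITIVES = {"int": int, "float": float, "double": float,
              "bool": bool, "string": str}
SKIPPED = object()  # marks a silent failure


class EvaluateError(Exception):
    """Stands in for a GSQL run-time error."""


def evaluate(expression_str, type_str="bool", **aliases):
    expected = PRIMITIVES[type_str.lower()]
    try:
        # Python's eval is only a stand-in for the GSQL expression compiler.
        value = eval(expression_str, {"__builtins__": {}}, aliases)
    except AttributeError:
        return SKIPPED  # non-existent attribute: fail silently
    except (NameError, SyntaxError) as exc:
        raise EvaluateError(f"cannot compile {expression_str!r}") from exc
    if type(value) is not expected:
        raise EvaluateError(
            f"Expression {expression_str!r} value type is not {type_str}.")
    return value


def where(expression_str, **aliases):
    """A WHERE clause treats a silently failing evaluate() as false."""
    result = evaluate(expression_str, "bool", **aliases)
    return False if result is SKIPPED else result


# Hypothetical vertex alias with two attributes.
s = SimpleNamespace(gender="Male", id="person1")
print(where('s.gender == "Male"', s=s), where("s.xyz", s=s))
```

In this model an unknown identifier or a result of the wrong type raises an error, whereas a missing attribute only causes the surrounding condition to be treated as false.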


The following example employs dynamic expressions in both the WHERE condition and the accumulator value in the POST-ACCUM clause.


Evaluate example
CREATE QUERY evaluateEx (STRING whereCond = "TRUE", STRING postAccumIntExpr = "1") FOR GRAPH socialNet {
  SetAccum<INT> @@timeSet;
  MaxAccum<INT> @latestLikeTime, @latestLikePostTime;

  S = {person.*};
  S2 = SELECT s
       FROM S:s - (liked:e) -> post:t
       WHERE evaluate(whereCond)
       ACCUM s.@latestLikeTime += datetime_to_epoch( e.actionTime ),
             s.@latestLikePostTime += datetime_to_epoch( t.postTime )
       POST-ACCUM @@timeSet += evaluate(postAccumIntExpr, "int")
       ;
  PRINT @@timeSet;
}

Results for Query evaluateEx
GSQL > RUN QUERY evaluateEx(_,_)
{
“error”: false,
“message”: “”,
“results”: [{“@@timeSet”: [1]}]
}

GSQL > RUN QUERY evaluateEx("s.gender==\"Male\"", "s.@latestLikePostTime")
{
“error”: false,
“message”: “”,
“results”: [
{
“@@timeSet”: [1263295325,1296752752,1297054971,1296788551]
}
]
}

GSQL > RUN QUERY evaluateEx("s.gender==\"Female\"", "s.@latestLikeTime + 1")
{
“error”: false,
“message”: “”,
“results”: [
{
“@@timeSet”: [1263293536,1263352566,1263330726]
}
]
}

GSQL > RUN QUERY evaluateEx(“xx”, _)
Runtime Error: xx is undefined parameter.

GSQL > RUN QUERY evaluateEx("e.xyz", _) # The attribute doesn't exist, so the entire condition in WHERE clause is false.
{
“error”: false,
“message”: “”,
“results”: [{“@@timeSet”: []}]
}

GSQL > RUN QUERY evaluateEx(“e.actionTime”, _)
Runtime Error: actionTime is not a primitive type attribute.

GSQL > RUN QUERY evaluateEx(“s.id”, _)
Runtime Error: Expression ‘s.id’ value type is not bool.

GSQL > RUN QUERY evaluateEx(“s.gender==\”Female\””, “s.xx”) # The attribute doesn’t exist, so the entire assignment is skipped.
{
“error”: false,
“message”: “”,
“results”: [{“@@timeSet”: []}]
}

Queries as Functions

A query that has been defined (with a CREATE QUERY … RETURNS statement) can be treated as a callable function. A query is intended to be able to call itself recursively, but a known bug currently prevents this.

Known bug


A query cannot call itself. A fix is in progress.

The following limitations apply to queries calling queries:

  1. Each parameter of the called query may be one of the following types:
    1. Primitives: INT, UINT, FLOAT, DOUBLE, STRING, BOOL
    2. VERTEX
    3. A Set or Bag of primitive or VERTEX elements
  2. The return value may be one of the following types. See also the “Return Statement” section.
    1. Primitives: INT, UINT, FLOAT, DOUBLE, STRING, BOOL
    2. VERTEX
    3. a vertex set (e.g., the result of a SELECT statement)
    4. An accumulator of primitive types.  GroupByAccum and accumulators containing tuples are not supported.
  3. A query which returns a SetAccum or BagAccum may be called with a Set or Bag argument, respectively.
  4. The order of definition matters.  A query cannot call a query which has not yet been defined.

 


Subquery Example 1
CREATE QUERY subquery1 (VERTEX<person> m1) FOR GRAPH socialNet RETURNS(BagAccum<VERTEX<post>>)
{
  Start = {m1};
  L = SELECT t
      FROM Start:s - (liked:e) - post:t;
  RETURN L;
}
CREATE QUERY mainquery1 () FOR GRAPH socialNet
{
  BagAccum<VERTEX<post>> @@testBag;
  Start = {person.*};
  Start = SELECT s FROM Start:s
          ACCUM @@testBag += subquery1(s);
  PRINT @@testBag;
}

User-Defined Functions

Users can define their own expression functions in C++ in <tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp. Only bool, int, float, double, and string (NOT std::string) are allowed as the return value type and the function argument type. However, any C++ type is allowed inside a function body. Once defined, the new functions will be added into GSQL automatically next time GSQL is executed.


If a user-defined struct or a helper function needs to be defined, define it in <tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprUtil.hpp.

Here is an example:


new code in ExprFunctions.hpp
#include <algorithm> // for std::reverse
inline bool greater_than_three (double x) {
return x > 3;
}
inline string reverse(string str){
std::reverse(str.begin(), str.end());
return str;
}

user defined expression function
CREATE QUERY udfExample() FOR GRAPH minimalNet {
  DOUBLE x;
  BOOL y;

  x = 3.5;
  PRINT greater_than_three(x);
  y = greater_than_three(2.5);
  PRINT y;

  PRINT reverse("abc");
}


udfExample.json Results
GSQL > RUN QUERY udfExample()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“greater_than_three(x)”: true},
{“y”: false},
{“reverse(abc)”: “cba”}
]
}

 


If any code in ExprFunctions.hpp or ExprUtil.hpp causes a compilation error, GSQL cannot install any GSQL query, even if the query doesn’t call any user-defined function. Therefore, please test each new user-defined expression function after adding it. One way of testing the function is to create a new cpp file test.cpp, then compile and run it:

> g++ test.cpp

> ./a.out

You might need to remove the include header #include <gle/engine/cpplib/headers.hpp> in ExprFunctions.hpp and ExprUtil.hpp in order to compile.


test.cpp
#include "ExprFunctions.hpp"
#include <iostream>
int main () {
  std::cout << to_string (123) << std::endl; // to_string and str_to_int are two built-in functions in ExprFunctions.hpp
  std::cout << str_to_int ("123") << std::endl;
  return 0;
}

 

Examples of Expressions



Below is a list of examples of expressions. Note that ( argList ) is a set/bag expression, while [ argList ] is a list expression.

 

 


Expression Examples
#Show various types of expressions
CREATE QUERY expressionEx() FOR GRAPH workNet {
  TYPEDEF tuple<STRING countryName, STRING companyName> companyInfo;

  ListAccum<STRING> @companyNames;
  SumAccum<INT> @companyCount;
  SumAccum<INT> @numberOfRelationships;
  ListAccum<companyInfo> @info;
  MapAccum< STRING,ListAccum<STRING> > @@companyEmployeeRelationships;
  SumAccum<INT> @@totalRelationshipCount;

  ListAccum<INT> @@valueList;
  SetAccum<INT> @@valueSet;

  SumAccum<INT> @@a;
  SumAccum<INT> @@b;

  #expr := constant
  @@a = 10;

  #expr := ["@@"] name
  @@b = @@a;

  #expr := expr mathOperator expr
  @@b = @@a + 5;

  #expr := "(" expr ")"
  @@b = (@@a + 5);

  #expr := "-" expr
  @@b = -(@@a + 5);

  PRINT @@a, @@b;

  #expr := "[" argList "]" // a list
  @@valueList = [1,2,3,4,5];
  @@valueList += [24,80];

  #expr := "(" argList ")" // setBagExpr
  @@valueSet += (1,2,3,4,5);

  #expr := ( COUNT | ISEMPTY | MAX | MIN | AVG | SUM ) "(" setBagExpr ")"
  PRINT MAX(@@valueList);
  PRINT AVG(@@valueList);

  seed = {ANY};

  company1 = SELECT t FROM seed:s -(worksFor)-> :t WHERE (s.id == "company1");
  company2 = SELECT t FROM seed:s -(worksFor)-> :t WHERE (s.id == "company2");

  #expr := setBagExpr
  worksForBoth = company1 INTERSECT company2;
  PRINT worksForBoth;

  #expr := name "." "type"
  employees = SELECT s FROM seed:s WHERE (s.type == "person");

  employees = SELECT s FROM employees:s -(worksFor)-> :t

    ACCUM
      #expr := name "." ["@"] name
      s.@companyNames += t.id,

      #expr := name "." name "(" [argList] ")" [ "." FILTER "(" condition ")" ]
      s.@numberOfRelationships += s.outdegree(),

      #expr := name ["<" type ["," type]* ">"] "(" [argList] ")"
      s.@info += companyInfo(t.country, t.id)

    POST-ACCUM
      #expr := name "." "@" name ("." name "(" [argList] ")")+ ["." name]
      s.@companyCount += s.@companyNames.size(),

      #expr := name "." "@" name ["'"]
      @@totalRelationshipCount += s.@companyCount,

      FOREACH comp IN s.@companyNames DO
        #expr := "(" argList "->" argList ")"
        @@companyEmployeeRelationships += (s.id -> comp)
      END;

  PRINT employees;
  PRINT @@totalRelationshipCount;
  PRINT @@companyEmployeeRelationships;

  #expr := "@@" name ("." name "(" [argList] ")")+ ["." name]
  PRINT @@companyEmployeeRelationships.size();
}


expressionEx.json Results
GSQL > RUN QUERY expressionEx()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“@@a”: 10,
“@@b”: -15
},
{“max(@@valueList)”: 80},
{“avg(@@valueList)”: 17},
{“worksForBoth”: [
{
“v_id”: “person2”,
“attributes”: {
“interestList”: [“engineering”],
“@companyCount”: 0,
“@numberOfRelationships”: 0,
“skillSet”: [ 6, 5, 3, 2 ],
“skillList”: [ 2, 3, 5, 6 ],
“locationId”: “chn”,
“interestSet”: [“engineering”],
“@info”: [],
“id”: “person2”,
“@companyNames”: []
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“interestList”: [ “management”, “financial” ],
“@companyCount”: 0,
“@numberOfRelationships”: 0,
“skillSet”: [ 3, 2, 1 ],
“skillList”: [ 1, 2, 3 ],
“locationId”: “us”,
“interestSet”: [ “financial”, “management” ],
“@info”: [],
“id”: “person1”,
“@companyNames”: []
},
“v_type”: “person”
}
]},
{“employees”: [
{
“v_id”: “person4”,
“attributes”: {
“interestList”: [“football”],
“@companyCount”: 1,
“@numberOfRelationships”: 1,
“skillSet”: [ 10, 1, 4 ],
“skillList”: [ 4, 1, 10 ],
“locationId”: “us”,
“interestSet”: [“football”],
“@info”: [{ “companyName”: “company2”, “countryName”: “chn” }],
“id”: “person4”,
“@companyNames”: [“company2”]
},
“v_type”: “person”
},
{
“v_id”: “person12”,
“attributes”: {
“interestList”: [
“music”,
“engineering”,
“teaching”,
“teaching”,
“teaching”
],
“@companyCount”: 1,
“@numberOfRelationships”: 1,
“skillSet”: [ 2, 5, 1 ],
“skillList”: [ 1, 5, 2, 2, 2 ],
“locationId”: “jp”,
“interestSet”: [ “teaching”, “engineering”, “music” ],
“@info”: [{ “companyName”: “company4”, “countryName”: “us” }],
“id”: “person12”,
“@companyNames”: [“company4”]
},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {
“interestList”: [“teaching”],
“@companyCount”: 1,
“@numberOfRelationships”: 1,
“skillSet”: [ 6, 1, 4 ],
“skillList”: [ 4, 1, 6 ],
“locationId”: “jp”,
“interestSet”: [“teaching”],
“@info”: [{ “companyName”: “company1”, “countryName”: “us” }],
“id”: “person3”,
“@companyNames”: [“company1”]
},
“v_type”: “person”
},
{
“v_id”: “person9”,
“attributes”: {
“interestList”: [ “financial”, “teaching” ],
“@companyCount”: 2,
“@numberOfRelationships”: 4,
“skillSet”: [ 2, 7, 4 ],
“skillList”: [ 4, 7, 2 ],
“locationId”: “us”,
“interestSet”: [ “teaching”, “financial” ],
“@info”: [
{
“companyName”: “company3”,
“countryName”: “jp”
},
{
“companyName”: “company2”,
“countryName”: “chn”
}
],
“id”: “person9”,
“@companyNames”: [ “company3”, “company2” ]
},
“v_type”: “person”
},
{
“v_id”: “person11”,
“attributes”: {
“interestList”: [ “sport”, “football” ],
“@companyCount”: 1,
“@numberOfRelationships”: 1,
“skillSet”: [10],
“skillList”: [10],
“locationId”: “can”,
“interestSet”: [ “football”, “sport” ],
“@info”: [{ “companyName”: “company5”, “countryName”: “can” }],
“id”: “person11”,
“@companyNames”: [“company5”]
},
“v_type”: “person”
},
{
“v_id”: “person10”,
“attributes”: {
“interestList”: [ “football”, “sport” ],
“@companyCount”: 2,
“@numberOfRelationships”: 4,
“skillSet”: [3],
“skillList”: [3],
“locationId”: “us”,
“interestSet”: [ “sport”, “football” ],
“@info”: [
{
“companyName”: “company3”,
“countryName”: “jp”
},
{
“companyName”: “company1”,
“countryName”: “us”
}
],
“id”: “person10”,
“@companyNames”: [ “company3”, “company1” ]
},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {
“interestList”: [ “art”, “sport” ],
“@companyCount”: 2,
“@numberOfRelationships”: 4,
“skillSet”: [ 6, 8 ],
“skillList”: [ 8, 6 ],
“locationId”: “us”,
“interestSet”: [ “sport”, “art” ],
“@info”: [
{
“companyName”: “company3”,
“countryName”: “jp”
},
{
“companyName”: “company2”,
“countryName”: “chn”
}
],
“id”: “person7”,
“@companyNames”: [ “company3”, “company2” ]
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“interestList”: [ “management”, “financial” ],
“@companyCount”: 2,
“@numberOfRelationships”: 4,
“skillSet”: [ 3, 2, 1 ],
“skillList”: [ 1, 2, 3 ],
“locationId”: “us”,
“interestSet”: [ “financial”, “management” ],
“@info”: [
{
“companyName”: “company2”,
“countryName”: “chn”
},
{
“companyName”: “company1”,
“countryName”: “us”
}
],
“id”: “person1”,
“@companyNames”: [ “company2”, “company1” ]
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“interestList”: [ “sport”, “financial”, “engineering” ],
“@companyCount”: 1,
“@numberOfRelationships”: 1,
“skillSet”: [ 5, 2, 8 ],
“skillList”: [ 8, 2, 5 ],
“locationId”: “can”,
“interestSet”: [ “engineering”, “financial”, “sport” ],
“@info”: [{ “companyName”: “company2”, “countryName”: “chn” }],
“id”: “person5”,
“@companyNames”: [“company2”]
},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {
“interestList”: [ “music”, “art” ],
“@companyCount”: 1,
“@numberOfRelationships”: 1,
“skillSet”: [ 10, 7 ],
“skillList”: [ 7, 10 ],
“locationId”: “jp”,
“interestSet”: [ “art”, “music” ],
“@info”: [{ “companyName”: “company1”, “countryName”: “us” }],
“id”: “person6”,
“@companyNames”: [“company1”]
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“interestList”: [“engineering”],
“@companyCount”: 2,
“@numberOfRelationships”: 4,
“skillSet”: [ 6, 5, 3, 2 ],
“skillList”: [ 2, 3, 5, 6 ],
“locationId”: “chn”,
“interestSet”: [“engineering”],
“@info”: [
{
“companyName”: “company2”,
“countryName”: “chn”
},
{
“companyName”: “company1”,
“countryName”: “us”
}
],
“id”: “person2”,
“@companyNames”: [ “company2”, “company1” ]
},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {
“interestList”: [“management”],
“@companyCount”: 1,
“@numberOfRelationships”: 1,
“skillSet”: [ 2, 5, 1 ],
“skillList”: [ 1, 5, 2 ],
“locationId”: “chn”,
“interestSet”: [“management”],
“@info”: [{ “companyName”: “company1”, “countryName”: “us” }],
“id”: “person8”,
“@companyNames”: [“company1”]
},
“v_type”: “person”
}
]},
{“@@totalRelationshipCount”: 17},
{“@@companyEmployeeRelationships”: {
“person4”: [“company2”],
“person3”: [“company1”],
“person2”: [ “company2”, “company1” ],
“person1”: [ “company2”, “company1” ],
“person9”: [ “company3”, “company2” ],
“person12”: [“company4”],
“person8”: [“company1”],
“person7”: [ “company3”, “company2” ],
“person6”: [“company1”],
“person10”: [ “company3”, “company1” ],
“person5”: [“company2”],
“person11”: [“company5”]
}},
{“@@companyEmployeeRelationships.size()”: 12}
]
}

 

Examples of Expression Statements


Expression Statement Examples
#Show various types of expression statements
CREATE QUERY expressionStmntEx() FOR GRAPH workNet {
TYPEDEF tuple<STRING countryName, STRING companyName> companyInfo;
ListAccum<companyInfo> @employerInfo;
SumAccum<INT> @@a;
ListAccum<STRING> @employers;
SumAccum<INT> @employerCount;
SetAccum<STRING> @@countrySet;
int x;

#exprStmnt := name “=” expr
x = 10;

#gAccumAssignStmt := “@@” name (“+=” | “=”) expr
@@a = 10;

PRINT x, @@a;

start = {person.*};

employees = SELECT s FROM start:s -(worksFor)-> :t
ACCUM #exprStmnt := name “.” “@” name (“+=”| “=”) expr
s.@employers += t.id,
#exprStmnt := name [“<” type [“,” type]* “>”] “(” [argList] “)”
s.@employerInfo += companyInfo(t.country, t.id),
#gAccumAccumStmt := “@@” name “+=” expr
@@countrySet += t.country
#exprStmnt := name “.” “@” name [“.” name “(” [argList] “)”]
POST-ACCUM s.@employerCount += s.@employers.size();

#exprStmnt := “@@” name [“.” name “(” [argList] “)”]+
PRINT @@countrySet.size();
PRINT employees;
}

GSQL > RUN QUERY expressionStmntEx()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“@@a”: 10,
“x”: 10
},
{“@@countrySet.size()”: 4},
{“employees”: [
{
“v_id”: “person4”,
“attributes”: {
“interestList”: [“football”],
“skillSet”: [ 10, 1, 4 ],
“skillList”: [ 4, 1, 10 ],
“locationId”: “us”,
“@employerInfo”: [{
“companyName”: “company2”,
“countryName”: “chn”
}],
“interestSet”: [“football”],
“@employerCount”: 1,
“id”: “person4”,
“@employers”: [“company2”]
},
“v_type”: “person”
},
{
“v_id”: “person11”,
“attributes”: {
“interestList”: [ “sport”, “football” ],
“skillSet”: [10],
“skillList”: [10],
“locationId”: “can”,
“@employerInfo”: [{
“companyName”: “company5”,
“countryName”: “can”
}],
“interestSet”: [ “football”, “sport” ],
“@employerCount”: 1,
“id”: “person11”,
“@employers”: [“company5”]
},
“v_type”: “person”
},
{
“v_id”: “person10”,
“attributes”: {
“interestList”: [ “football”, “sport” ],
“skillSet”: [3],
“skillList”: [3],
“locationId”: “us”,
“@employerInfo”: [
{
“companyName”: “company3”,
“countryName”: “jp”
},
{
“companyName”: “company1”,
“countryName”: “us”
}
],
“interestSet”: [ “sport”, “football” ],
“@employerCount”: 2,
“id”: “person10”,
“@employers”: [ “company3”, “company1” ]
},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {
“interestList”: [ “art”, “sport” ],
“skillSet”: [ 6, 8 ],
“skillList”: [ 8, 6 ],
“locationId”: “us”,
“@employerInfo”: [
{
“companyName”: “company3”,
“countryName”: “jp”
},
{
“companyName”: “company2”,
“countryName”: “chn”
}
],
“interestSet”: [ “sport”, “art” ],
“@employerCount”: 2,
“id”: “person7”,
“@employers”: [ “company3”, “company2” ]
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“interestList”: [ “management”, “financial” ],
“skillSet”: [ 3, 2, 1 ],
“skillList”: [ 1, 2, 3 ],
“locationId”: “us”,
“@employerInfo”: [
{
“companyName”: “company2”,
“countryName”: “chn”
},
{
“companyName”: “company1”,
“countryName”: “us”
}
],
“interestSet”: [ “financial”, “management” ],
“@employerCount”: 2,
“id”: “person1”,
“@employers”: [ “company2”, “company1” ]
},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {
“interestList”: [ “music”, “art” ],
“skillSet”: [ 10, 7 ],
“skillList”: [ 7, 10 ],
“locationId”: “jp”,
“@employerInfo”: [{ “companyName”: “company1”, “countryName”: “us” }],
“interestSet”: [ “art”, “music” ],
“@employerCount”: 1,
“id”: “person6”,
“@employers”: [“company1”]
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“interestList”: [“engineering”],
“skillSet”: [ 6, 5, 3, 2 ],
“skillList”: [ 2, 3, 5, 6 ],
“locationId”: “chn”,
“@employerInfo”: [
{
“companyName”: “company2”,
“countryName”: “chn”
},
{
“companyName”: “company1”,
“countryName”: “us”
}
],
“interestSet”: [“engineering”],
“@employerCount”: 2,
“id”: “person2”,
“@employers”: [ “company2”, “company1” ]
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“interestList”: [ “sport”, “financial”, “engineering” ],
“skillSet”: [ 5, 2, 8 ],
“skillList”: [ 8, 2, 5 ],
“locationId”: “can”,
“@employerInfo”: [{
“companyName”: “company2”,
“countryName”: “chn”
}],
“interestSet”: [ “engineering”, “financial”, “sport” ],
“@employerCount”: 1,
“id”: “person5”,
“@employers”: [“company2”]
},
“v_type”: “person”
},
{
“v_id”: “person12”,
“attributes”: {
“interestList”: [
“music”,
“engineering”,
“teaching”,
“teaching”,
“teaching”
],
“skillSet”: [ 2, 5, 1 ],
“skillList”: [ 1, 5, 2, 2, 2 ],
“locationId”: “jp”,
“@employerInfo”: [{ “companyName”: “company4”, “countryName”: “us” }],
“interestSet”: [ “teaching”, “engineering”, “music” ],
“@employerCount”: 1,
“id”: “person12”,
“@employers”: [“company4”]
},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {
“interestList”: [“teaching”],
“skillSet”: [ 6, 1, 4 ],
“skillList”: [ 4, 1, 6 ],
“locationId”: “jp”,
“@employerInfo”: [{ “companyName”: “company1”, “countryName”: “us” }],
“interestSet”: [“teaching”],
“@employerCount”: 1,
“id”: “person3”,
“@employers”: [“company1”]
},
“v_type”: “person”
},
{
“v_id”: “person9”,
“attributes”: {
“interestList”: [ “financial”, “teaching” ],
“skillSet”: [ 2, 7, 4 ],
“skillList”: [ 4, 7, 2 ],
“locationId”: “us”,
“@employerInfo”: [
{
“companyName”: “company3”,
“countryName”: “jp”
},
{
“companyName”: “company2”,
“countryName”: “chn”
}
],
“interestSet”: [ “teaching”, “financial” ],
“@employerCount”: 2,
“id”: “person9”,
“@employers”: [ “company3”, “company2” ]
},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {
“interestList”: [“management”],
“skillSet”: [ 2, 5, 1 ],
“skillList”: [ 1, 5, 2 ],
“locationId”: “chn”,
“@employerInfo”: [{ “companyName”: “company1”, “countryName”: “us” }],
“interestSet”: [“management”],
“@employerCount”: 1,
“id”: “person8”,
“@employers”: [“company1”]
},
“v_type”: “person”
}
]}
]
}

 


End of Operators, Functions, and Expressions Section

Declaration and Assignment Statements

 

Previous sections focused on the lowest level building blocks of queries: data types (Section 3), operators, functions, and expressions (Section 5), and a special section devoted to accumulators (Section 4). We now begin to look at the types of statements available in GSQL queries. This section focuses on declaration and assignment statements. Later sections will provide a closer look at the all-important SELECT statement, control flow statements and data modification statements. Furthermore, some types of statements can be nested within SELECT, UPDATE, or control flow statements.

This section covers the following subset of the EBNF syntax:


EBNF
## Declarations ##
accumDeclStmt := accumType “@”name [“=” constant][, “@”name [“=” constant]]*
| “@”name [“=” constant][, “@”name [“=” constant]]* accumType
| [STATIC] accumType “@@”name [“=” constant][, “@@”name [“=” constant]]*
| [STATIC] “@@”name [“=” constant][, “@@”name [“=” constant]]* accumType

baseDeclStmt := baseType name [“=” constant][, name [“=” constant]]*

fileDeclStmt := FILE fileVar “(” filePath “)”
fileVar := name

localVarDeclStmt := baseType name “=” expr

vSetVarDeclStmt := name [“(” vertexEdgeType “)”] “=” (seedSet | simpleSet | selectBlock)

simpleSet := name | “(” simpleSet “)” | simpleSet (UNION | INTERSECT | MINUS) simpleSet

seedSet := “{” [seed [“,” seed ]*] “}”
seed := ‘_’
| ANY
| [“@@”]name
| name “.*”
| “SelectVertex” selectVertParams

selectVertParams := “(” filePath “,” columnId “,” (columnId | name) “,”
stringLiteral “,” (TRUE | FALSE) “)” [“.”.FILTER “(” condition “)”]

columnId := “$” (integer | stringLiteral)

## Assignment Statements ##
assignStmt := name “=” expr
| name “.” name “=” expr
| name “.” “@”name (“+=”| “=”) expr

gAccumAssignStmt := “@@”name (“+=” | “=”) expr

loadAccumStmt := “@@”name “=” “{” “LOADACCUM” loadAccumParams [“,” “LOADACCUM” loadAccumParams]* “}”

loadAccumParams := “(” filePath “,” columnId “,” [columnId “,”]*
stringLiteral “,” (TRUE | FALSE) “)” [“.”.FILTER “(” condition “)”]

## Function Call Statement ##
funcCallStmt := name [“<” type [“,” type]* “>”] “(” [argList] “)”
| “@@”name (“.” name “(” [argList] “)”)+

argList := expr [“,” expr]*

Declaration Statements

There are six types of variable declarations in a GSQL query:

  • Accumulator
  • Global baseType variable
  • Local baseType variable
  • Vertex set
  • File object
  • Vertex or Edge aliases

The first five types each have their own declaration statement syntax and are covered in this section. Aliases are declared implicitly in a SELECT statement.

Accumulators

vertexEdgeType := “_” | ANY | name | ( “(” name [“|” name]* “)” )

Accumulator declaration is discussed in Section 4: “Accumulators”.

Global Variables

After accumulator declarations, base type variables can be declared as global variables. The scope of a global variable is from the point of declaration until the end of the query.


EBNF for global variable declaration
baseDeclStmt := baseType name [“=” constant][, name [“=” constant]]*


A global variable can be accessed (read) anywhere in the query; however, there are restrictions on where it can be updated. See the subsection below on “Assignment Statements”.


Global Variable Example
# Assign global variable at various places
CREATE QUERY globalVariable(VERTEX<person> m1) FOR GRAPH socialNet {
SetAccum<VERTEX<person>> @@personSet;
SetAccum<EDGE> @@edgeSet;

# Declare global variables
STRING gender;
DATETIME dt;
VERTEX v;
VERTEX<person> vx;
EDGE ee;

allUser = {person.*};
allUser = SELECT src
FROM allUser:src -(liked:e)-> post
ACCUM dt = e.actionTime,
ee = e, # assignment does NOT take effect yet
@@edgeSet += ee # so ee is null
POST-ACCUM @@personSet += src;
PRINT @@edgeSet; # EMPTY because ee was frozen in the SELECT statement.
PRINT dt; # actionTime of the last edge e processed.

v = m1; # assign a vertex value to a global variable.
gender = m1.gender; # assign a vertex’s attribute value to a global variable.
PRINT v, gender;

FOREACH m IN @@personSet DO
vx = m; # global variable assignment inside FOREACH takes place.
gender = m.gender; # global variable assignment inside FOREACH takes place.
PRINT vx, gender; # display the values for each iteration of the loop.
END;
}


globalVariable Query Result

 



GSQL > RUN QUERY globalVariable(“person1”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“@@edgeSet”: [{}]},
{“dt”: “2010-01-12 21:12:05”},
{
“gender”: “Male”,
“v”: “person1”
},
{
“vx”: “person3”,
“gender”: “Male”
},
{
“vx”: “person7”,
“gender”: “Male”
},
{
“vx”: “person1”,
“gender”: “Male”
},
{
“vx”: “person5”,
“gender”: “Female”
},
{
“vx”: “person6”,
“gender”: “Male”
},
{
“vx”: “person2”,
“gender”: “Female”
},
{
“vx”: “person8”,
“gender”: “Male”
},
{
“vx”: “person4”,
“gender”: “Female”
}
]
}

 


Multiple global variables of the same type can be declared and initialized on the same line, as in the example below:


Multiple variable declaration example
CREATE QUERY variableDeclaration() FOR GRAPH minimalNet {
INT a=5,b=1;
INT c,d=10;
MaxAccum<INT> @@max1 = 3, @@max2 = 5, @@max3;
MaxAccum<INT> @@max4, @@max5 = 2;
PRINT a,b,c,d;
PRINT @@max1, @@max2, @@max3, @@max4, @@max5;
}


variableDeclaration.json Result

 



GSQL > RUN QUERY variableDeclaration()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“a”: 5,
“b”: 1,
“c”: 0,
“d”: 10
},
{
“@@max3”: -9223372036854775808,
“@@max2”: 5,
“@@max1”: 3,
“@@max5”: 2,
“@@max4”: -9223372036854775808
}
]
}
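The large negative value printed for @@max3 and @@max4 in the result above is -2^63, the smallest 64-bit signed integer; it is what the result shows for a MaxAccum<INT> declared without an initial value. The Python sketch below is only a model of that behavior outside GSQL. The class name MaxAccumInt and its methods are hypothetical and are not part of the GSQL language.

```python
INT64_MIN = -2**63  # -9223372036854775808, the value printed for @@max3 and @@max4


class MaxAccumInt:
    """Toy stand-in for MaxAccum<INT>: keeps the largest value seen so far."""

    def __init__(self, initial=INT64_MIN):
        self.value = initial

    def __iadd__(self, other):
        # Models the "+=" accumulate operator of a MaxAccum.
        self.value = max(self.value, other)
        return self


max1, max3 = MaxAccumInt(3), MaxAccumInt()
print(max1.value, max3.value)  # 3 -9223372036854775808

max3 += 7
print(max3.value)  # 7
```

Starting from the smallest representable value means that the first accumulated value always replaces the default.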

Local Variables

A local variable can be declared only in an ACCUM, POST-ACCUM, or UPDATE SET clause, and its scope is limited to that clause. Local variables can only be of base types (e.g. INT, FLOAT, DOUBLE, BOOL, STRING, VERTEX). A local variable must be declared and initialized in the same statement.


EBNF for local variable declaration and initialization
localVarDeclStmt := baseType name “=” expr

Within a local variable’s scope, another local variable with the same name cannot be declared at the same level. However, a new local variable with the same name can be declared at a lower level (i.e., within a nested SELECT or UPDATE statement). The lower declaration takes precedence at the lower level.

In a POST-ACCUM clause, each local variable may only be used in source vertex statements or target vertex statements, not both.


Local Variable Example
# An example showing a local variable succeeding where a global variable fails
CREATE QUERY localVariable(vertex<person> m1) FOR GRAPH socialNet {
MaxAccum<INT> @@maxDate, @@maxDateGlob;
DATETIME dtGlob;

allUser = {person.*};
allUser = SELECT src
FROM allUser:src -(liked:e)-> post
ACCUM
DATETIME dt = e.actionTime, # Declare and assign local dt
dtGlob = e.actionTime, # dtGlob doesn’t update yet
@@maxDate += datetime_to_epoch(dt),
@@maxDateGlob += datetime_to_epoch(dtGlob);
PRINT @@maxDate, @@maxDateGlob, dtGlob; # @@maxDateGlob will be 0
}

localVariable Query Results
GSQL > RUN QUERY localVariable(“person1”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{
“dtGlob”: “2010-01-11 03:26:05”,
“@@maxDateGlob”: 0,
“@@maxDate”: 1263618953
}]
}
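The difference between @@maxDate and @@maxDateGlob in the result above can be reproduced with a small model outside GSQL. The Python sketch below is an illustration under stated assumptions, not a description of how the engine is implemented: run_accum is a hypothetical helper, edges are processed one at a time in list order, and timestamps are plain epoch integers. Inside the simulated ACCUM clause, every iteration reads a frozen copy of the global variable, assignments to it are buffered, and the last one is applied only after the clause ends. The local variable is usable immediately.

```python
def run_accum(edge_times, global_value=0):
    """Model one ACCUM clause over edge timestamps (epoch seconds).

    Returns (max_via_local, max_via_global, global_after_clause).
    """
    frozen_global = global_value   # value every iteration reads
    pending_global = global_value  # buffered assignment, applied after the clause
    seen_via_local = []
    seen_via_global = []
    for action_time in edge_times:
        local_dt = action_time         # local variable: assigned and usable at once
        pending_global = action_time   # global assignment: deferred
        seen_via_local.append(local_dt)
        seen_via_global.append(frozen_global)
    return (
        max(seen_via_local, default=None),
        max(seen_via_global, default=None),
        pending_global,
    )


# Made-up timestamps, not the socialNet data.
print(run_accum([100, 300, 200]))  # (300, 0, 200)
```

The middle value stays at the initial 0, matching @@maxDateGlob in the result, while the global variable itself ends up holding the last value assigned to it.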

 

Vertex Set Variable Declaration and Assignment

Vertex set variables play a special role within GSQL queries. They are used for both the input and output of SELECT statements. Therefore, before the first SELECT statement in a query, a vertex set variable must be declared and initialized. This initial vertex set is called the seed set.


EBNF for Vertex Set Variable Declaration
vSetVarDeclStmt := name [“(” vertexEdgeType “)”] “=” (seedSet | simpleSet | selectBlock)
## Seed Sets ##
seedSet := “{” [seed [“,” seed ]*] “}”
seed := ‘_’
| ANY
| [“@@”]name
| name “.*”
| “SelectVertex” selectVertParams

selectVertParams := “(” filePath “,” columnId “,” (columnId | name) “,”
stringLiteral “,” (TRUE | FALSE) “)” [“.”.FILTER “(” condition “)”]

columnId := “$” (integer | stringLiteral)

simpleSet := name | “(” simpleSet “)” | simpleSet (UNION | INTERSECT | MINUS) simpleSet

The query below lists all ways of assigning a vertex set variable an initial set of vertices (that is, forming a seed set):

  • a vertex parameter, untyped (S1) or typed (S2)
  • a vertex set parameter, untyped (S3) or typed (S4)
  • a global SetAccum<VERTEX> accumulator, untyped (S5) or typed (S6)
  • all vertices of any type (S7, S9) or of one type (S8)
  • a list of vertex ids in an external file (S10)
  • copy of another vertex set (S11)
  • a combination of individual vertices, vertex set parameters, or global variables (S12)
  • union of vertex set variables (S13)

Seed Set Example
CREATE QUERY seedSetExample(VERTEX v1, VERTEX<person> v2, SET<VERTEX> v3, SET<VERTEX<person>> v4) FOR GRAPH socialNet {
SetAccum<VERTEX> @@testSet;
SetAccum<VERTEX<person>> @@testSet2;
S1 = { v1 };
S2 = { v2 };
S3 = v3;
S4 = v4;
S5 = @@testSet;
S6 = @@testSet2;
S7 = ANY; # All vertices
S8 = person.*; # All person vertices
S9 = _; # Equivalent to ANY
S10 = SelectVertex("absolute_path_to_input_file", $0, post, ",", false); # See Section "SelectVertex()" function
S11 = S1;
S12 = {@@testSet, v2, v3}; # S1 is not allowed to be in {}
S13 = S11 UNION S12; # but we can use UNION to combine S1
}
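The set operators in the simpleSet rule (UNION, INTERSECT, MINUS) behave like ordinary set algebra on vertices. As a rough illustration outside GSQL, the Python sketch below models vertex sets as sets of (type, id) pairs. The vertex ids and the contents of each variable are made up for the example; s11, s12, and s13 mirror the S11, S12, and S13 statements in the query above.

```python
# Each vertex is modeled as a (vertex type, id) pair; the values are made up.
s1 = {("person", "person1")}
test_set = {("person", "person2"), ("post", "0")}   # stands in for @@testSet
v2 = ("person", "person3")                          # a single vertex parameter
v3 = {("person", "person1"), ("post", "4")}         # a vertex set parameter

s11 = set(s1)                   # S11 = S1 (a copy of another vertex set)
s12 = test_set | {v2} | v3      # S12 = {@@testSet, v2, v3}
s13 = s11 | s12                 # S13 = S11 UNION S12
in_both = s11 & s12             # S11 INTERSECT S12
only_s12 = s12 - s11            # S12 MINUS S11

print(sorted(s13))
print(sorted(in_both))
print(sorted(only_s12))
```

Because these are sets, a vertex that appears in several inputs (person1 here) is kept only once in the result.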

When declaring a vertex set variable, you can optionally specify the set of vertex types it may contain. If the type is not specified explicitly, the system infers it from the value assigned to the vertex set. The type can be ANY, _ (equivalent to ANY), or one or more explicit vertex types. See the EBNF grammar rule vertexEdgeType.

 

Declaration syntax difference: vertex set variable vs. base type variable



In a vertex set variable declaration, the type specifier follows the variable name and should be surrounded by parentheses:

vSetName (type)

This differs from a base type variable declaration, where the type specifier comes before the base variable name:

type varName

After a vertex set variable is declared, the vertex type of the vertex set variable is immutable. Every assignment (e.g. SELECT statement) to this vertex set variable must match the type. The following is an example in which we must declare the vertex set variable type.


Vertex set variable type
CREATE QUERY vertexSetVariableTypeExample(vertex<person> m1) FOR GRAPH socialNet {
INT ite = 0;
S (ANY) = {m1}; # ANY is necessary
WHILE ite < 5 DO
S = SELECT t
FROM S:s -(ANY:e)-> ANY:t;
ite = ite + 1;
END;
PRINT S;
}

In the above example, the query returns the set of vertices after a 5-step traversal from the input “person” vertex. If we declare the vertex set variable S without explicitly giving a type, then because the type of vertex parameter m1 is “person”, the GSQL engine will implicitly assign S to be “person”-type. However, if S is “person”-type, the SELECT statement inside the WHILE loop causes a type checking error, because the SELECT block will generate all connected vertices, including non-“person” vertices. Therefore, S must be declared as an ANY-type vertex set variable.
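The immutable-type rule can be pictured with a small model outside GSQL. In the Python sketch below, VertexSetVar is a hypothetical class, vertices are (type, id) pairs with made-up ids, and the check happens when a value is assigned; in GSQL the mismatch is reported by the type checker instead. The sketch only shows why an inferred “person” type rejects the traversal result while an explicit ANY accepts it.

```python
class VertexSetVar:
    """Toy vertex set variable whose vertex type is fixed at declaration."""

    def __init__(self, vertices, declared_types=None):
        vertices = set(vertices)
        if declared_types is None:
            # No explicit type: infer it from the initial value.
            declared_types = {vtype for vtype, _ in vertices}
        self.types = declared_types  # the string "ANY" or a set of type names
        self.vertices = vertices

    def assign(self, vertices):
        vertices = set(vertices)
        if self.types != "ANY":
            wrong = {vtype for vtype, _ in vertices} - self.types
            if wrong:
                raise TypeError(f"types {sorted(wrong)} not allowed in this set")
        self.vertices = vertices


# One traversal step from a person can reach a post (hypothetical data).
step_result = {("post", "0"), ("person", "person2")}

any_set = VertexSetVar({("person", "person1")}, declared_types="ANY")
any_set.assign(step_result)          # accepted, like S (ANY) = {m1}

person_set = VertexSetVar({("person", "person1")})   # type inferred as person
try:
    person_set.assign(step_result)
except TypeError as err:
    print("rejected:", err)
```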

FILE Object Declaration

A FILE object is a sequential text storage object, associated with a text file on the local machine.

 



When referring to a FILE object, we always capitalize the word FILE, to distinguish it from ordinary files.

 


EBNF for FILE object declaration
fileDeclStmt := FILE fileVar “(” filePath “)”
fileVar := name


When a FILE object is declared, associated with a particular text file, any existing content in the text file will be erased.

During the execution of the query, content written to or printed to the FILE will be appended to the FILE.  When the query where the FILE was declared finishes running, the FILE contents are saved to the text file.



Note that the declaration statement is invoking the FILE object constructor. The syntax for the constructor includes parentheses surrounding the filepath parameter.

Currently, the filePath must be an absolute path.


File object query example

CREATE QUERY fileEx (STRING fileLocation) FOR GRAPH workNet {

FILE f1 (fileLocation);
P = {person.*};

PRINT "header" TO_CSV f1;

USWorkers = SELECT v FROM P:v
WHERE v.locationId == "us"
ACCUM f1.println(v.id, v.interestList);
PRINT "footer" TO_CSV f1;
}
INSTALL QUERY fileEx
RUN QUERY fileEx("/home/tigergraph/fileEx.txt")
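The lifecycle described above (erase on declaration, append during the query, save when the query finishes) resembles opening an ordinary file in truncating write mode. The Python sketch below is a model only: the class FileObject and the function demo are hypothetical, the temporary file stands in for the path passed to the query, and writing fields comma-separated with one line per println call is an assumption of the sketch.

```python
import os
import tempfile


class FileObject:
    """Toy model of a GSQL FILE object backed by a text file."""

    def __init__(self, path):
        if not os.path.isabs(path):
            raise ValueError("filePath must be an absolute path")
        # Mode "w" truncates the file, like declaring the FILE object.
        self._handle = open(path, "w")

    def println(self, *fields):
        # Assumption: fields are written comma-separated, one line per call.
        self._handle.write(",".join(str(f) for f in fields) + "\n")

    def close(self):
        # Stands in for the end of the query, when contents are saved.
        self._handle.close()


def demo():
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "fileEx.txt")
        with open(path, "w") as old:
            old.write("content from an earlier run\n")
        f1 = FileObject(path)
        f1.println("header")
        f1.println("person1", "management")
        f1.println("footer")
        f1.close()
        with open(path) as result:
            return result.read()


print(demo())
```

The earlier content does not appear in the output, while the three lines written during the simulated query are kept in order.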

 

Assignment and Accumulate Statements

Assignment statements are used to set or update the value of a variable, after it has been declared. This applies to baseType variables, vertex set variables, and accumulators. Accumulators also have the special += accumulate statement, which was discussed in the Accumulator section.  Assignment statements can use expressions (expr) to define the new value of the variable.


EBNF for Assignment Statements
## Assignment Statement ##
assignStmt := name “=” expr # baseType variable, vertex set variable
| name “.” name “=” expr # attribute of a vertex or edge
| name “.” “@”name (“+=”| “=”) expr # vertex-attached accumulator

gAccumAssignStmt := “@@”name “=” expr # global accumulator
| loadAccumStmt
loadAccumStmt := “@@”name “=” “{” “LOADACCUM” loadAccumParams [“,” “LOADACCUM” loadAccumParams]* “}”


Restrictions on Assignment Statements

In general, assignment statements can take place anywhere after the variable has been declared. However, there are some restrictions. These restrictions apply to “inner level” statements which are within the body of a higher-level statement:

  • The ACCUM or POST-ACCUM clause of a SELECT statement
  • The SET clause of an UPDATE statement
  • The body of a FOREACH statement

  • Global accumulator assignment “=” is not permitted within the body of SELECT or UPDATE statements
  • Global variable assignment is permitted in ACCUM or POST-ACCUM clauses, but the change in value will not take place until exiting the clause. Therefore, if there are multiple assignment statements for the same variable, only the final one will take effect.
  • Vertex attribute assignment “=” is not permitted in an ACCUM clause. However, edge attribute assignment is permitted. This is because the ACCUM clause iterates over an edge set.
  • There are additional restrictions within FOREACH loops for the loop variable. See the Data Modification section.

LOADACCUM Statement

loadAccumStmt := “@@” name “=” “{” “LOADACCUM” loadAccumParams (“,” “LOADACCUM” loadAccumParams)* “}”

loadAccumParams := “(” filePath “,” columnId “,” [columnId “,”]*
stringLiteral “,” (TRUE | FALSE) “)” [“.”.FILTER “(” condition “)”]
columnId := “$” (integer | stringLiteral)

LOADACCUM() can initialize a global accumulator by loading data from a file. LOADACCUM() has 3+n parameters explained in the table below: (filePath, fieldColumn_1, …., fieldColumn_n, separator, header), where n is the number of fields in the accumulator. One assignment statement can have multiple LOADACCUM() function calls. However, every LOADACCUM() referring to the same file in the same assignment statement must use the same separator and header parameter values.


Any accumulator using generic VERTEX as an element type cannot be initialized by LOADACCUM().

 

parameter name | type | description
filePath | string | The absolute file path of the input file to be read. A relative path is not supported.
accumField1, …, accumFieldN | $num, or $“column_name” if header is true | The column position(s) or column name(s) of the data file which supply data values to each field of the accumulator.
separator | single-character string | The separator of columns.
header | bool | Whether this file has a header.

Below is an example that loads two accumulators from an external file:


loadAccumInput.csv
person1,1,"test1",3
person5,2,"test2",4
person6,3,"test3",5

 


LoadAccum example
CREATE QUERY loadAccumEx(STRING filename) FOR GRAPH socialNet {
TYPEDEF TUPLE<STRING aaa, VERTEX<post> ddd> yourTuple;
MapAccum<VERTEX<person>, MapAccum<INT, yourTuple>> @@testMap;
GroupByAccum<STRING a, STRING b, MapAccum<STRING, STRING> strList> @@testGroupBy;

@@testMap = { LOADACCUM (filename, $0, $1, $2, $3, ",", false)};
@@testGroupBy = { LOADACCUM ( filename, $1, $2, $3, $3, ",", true) };

PRINT @@testMap, @@testGroupBy;
}

 


Results of Query loadAccumEx
GSQL > RUN QUERY loadAccumEx(“/file_directory/loadAccumInput.csv”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{
“@@testGroupBy”: [
{
“a”: “3”,
“b”: “\”test3\””,
“strList”: {“5”: “5”}
},
{
“a”: “2”,
“b”: “\”test2\””,
“strList”: {“4”: “4”}
}
],
“@@testMap”: {
“person1”: {“1”: {
“aaa”: “\”test1\””,
“ddd”: “3”
}},
“person6”: {“3”: {
“aaa”: “\”test3\””,
“ddd”: “5”
}},
“person5”: {“2”: {
“aaa”: “\”test2\””,
“ddd”: “4”
}}
}
}]
}
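The way the columns map onto the two accumulators in the results above can be traced with a small model outside GSQL. The Python sketch below is an illustration, not the loader itself: read_rows, load_test_map, and load_test_group_by are hypothetical helpers, all values are kept as strings, and the input text is the three-line file shown earlier. Two details are taken from the results: quote characters are kept as part of the value, and with header set to true the first line is skipped, which is why the person1 row is absent from @@testGroupBy.

```python
CSV_TEXT = 'person1,1,"test1",3\nperson5,2,"test2",4\nperson6,3,"test3",5\n'


def read_rows(text, separator, header):
    """Split each line on the separator; skip the first line if header is true."""
    rows = [line.split(separator) for line in text.splitlines() if line]
    return rows[1:] if header else rows


def load_test_map(text):
    # Models LOADACCUM(filename, $0, $1, $2, $3, ",", false) into a nested map.
    result = {}
    for row in read_rows(text, ",", header=False):
        result.setdefault(row[0], {})[row[1]] = {"aaa": row[2], "ddd": row[3]}
    return result


def load_test_group_by(text):
    # Models LOADACCUM(filename, $1, $2, $3, $3, ",", true):
    # group keys (a, b) come from $1 and $2, and the inner map is $3 -> $3.
    groups = {}
    for row in read_rows(text, ",", header=True):
        groups.setdefault((row[1], row[2]), {})[row[3]] = row[3]
    return groups


print(load_test_map(CSV_TEXT))
print(load_test_group_by(CSV_TEXT))
```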

Function Call Statements

funcCallStmt := name [“<” type [“,” type]* “>”] “(” [argList] “)”
| “@@”name (“.” name “(” [argList] “)”)+

argList := expr [“,” expr]*

Typically, a function call returns a value and so is part of an expression (see Section 5 – Operators, Functions and Expressions). In some cases, however, the function does not return a value (i.e., returns VOID) or the return value can be ignored, so the function call can be used as an entire statement.  This is a Function Call Statement.


Examples of Function Call statements
ListAccum<STRING> @@listAcc;
BagAccum<INT> @@bagAcc;

# examples of function call statements
@@listAcc.clear();
@@bagAcc.removeAll(0);


End of Declaration, Assignment, and Function Call Statements Section

SELECT Statement

 

This section discusses the SELECT statement in depth and covers the following EBNF syntax:


EBNF for Select Statement

selectStmt := name “=” selectBlock

selectBlock := SELECT name FROM ( edgeSet | vertexSet )
[sampleClause]
[whereClause]
[accumClause]
[postAccumClause]
[havingClause]
[orderClause]
[limitClause]

vertexSet := name [“:” name]

edgeSet := name [“:” name]
“-” “(” [vertexEdgeType] [“:” name] “)” “->”
[vertexEdgeType] [“:” name]

vertexEdgeType := “_” | ANY | name | ( “(” name [“|” name]* “)” )

sampleClause := SAMPLE ( expr | expr “%” ) EDGE WHEN condition
| SAMPLE expr TARGET WHEN condition
| SAMPLE expr “%” TARGET PINNED WHEN condition

whereClause := WHERE condition

accumClause := ACCUM DMLSubStmtList

postAccumClause := POST-ACCUM DMLSubStmtList

DMLSubStmtList := DMLSubStmt [“,” DMLSubStmt]*

DMLSubStmt := assignStmt // Assignment
| funcCallStmt // Function Call
| gAccumAccumStmt // Assignment
| vAccumFuncCall // Function Call
| localVarDeclStmt // Declaration
| DMLSubCaseStmt // Control Flow
| DMLSubIfStmt // Control Flow
| DMLSubWhileStmt // Control Flow
| DMLSubForEachStmt // Control Flow
| BREAK // Control Flow
| CONTINUE // Control Flow
| insertStmt // Data Modification
| DMLSubDeleteStmt // Data Modification
| printlnStmt // Output
| logStmt // Output

vAccumFuncCall := name “.” “@”name (“.” name “(” [argList] “)”)+

havingClause := HAVING condition

orderClause := ORDER BY expr [ASC | DESC] [“,” expr [ASC | DESC]]*

limitClause := LIMIT ( expr | expr “,” expr | expr OFFSET expr )

The SELECT block selects a set of vertices FROM a vertex set or edge set. There are a number of optional clauses that define and/or refine the selection by constraining the vertex or edge set or the result set. There are two types of SELECT, vertex-induced and edge-induced. Both result in a vertex set, known as the result set.

Size limitation


There is a maximum size limit of 2GB for the result set of a SELECT block. If the result of the SELECT block is larger than 2GB, the system will return no data. No error message is produced.

SELECT Statement Data Flow

The SELECT statement is an assignment statement with a SELECT block on the right hand side. The SELECT block has many possible clauses, which fit together in a logical flow. Overall, the SELECT block starts from a source set of vertices and returns a result set that is either a subset of the source vertices or a subset of their neighboring vertices. Along the way, computations can be performed on the selected vertices and edges. The figure below graphically depicts the overall SELECT data flow. While the ACCUM and POST-ACCUM clauses do not directly affect which vertices are included in the result set, they affect the data (accumulators) which are attached to those vertices.

 

FROM Clause: Vertex and Edge Sets

There are two options for the FROM clause: vertexSet or edgeSet. If vertexSet is used, then the query is a vertex-induced selection. If edgeSet is used, then the query is an edge-induced selection.


FROM clause
### selectBlock := SELECT name FROM ( edgeSet | vertexSet ) …

Vertex-Induced Selection


EBNF for vertexSet, signaling a vertex-induced selection
vertexSet := name [“:” name]

A vertex-induced selection takes an input set of vertices and produces a result set, which is a subset of the input set. The FROM argument has the form Source:s, where Source is a vertex set. Source is optionally followed by :s, where s is a vertex alias which represents any vertex in the set Source.

resultSet = SELECT s FROM Source:s;

This statement can be interpreted as “Select all vertices s, from the vertex set Source.” The result is a vertex set.

Below is a simple example of a vertex-induced selection.


Vertex-Induced SELECT example
# displays all ‘post’-type vertices
CREATE QUERY printAllPosts() FOR GRAPH socialNet
{
start = {post.*}; # start is initialized with all vertices of type ‘post’
results = SELECT s FROM start:s; # select these vertices
PRINT results;
}

 


Results of Query printAllPosts

 



GSQL > RUN QUERY printAllPosts()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“results”: [
{
“v_id”: “0”,
“attributes”: {
“postTime”: “2010-01-12 11:22:05”,
“subject”: “Graphs”
},
“v_type”: “post”
},
{
“v_id”: “10”,
“attributes”: {
“postTime”: “2011-02-04 03:02:31”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “2”,
“attributes”: {
“postTime”: “2011-02-03 01:02:42”,
“subject”: “query languages”
},
“v_type”: “post”
},
{
“v_id”: “4”,
“attributes”: {
“postTime”: “2011-02-07 05:02:51”,
“subject”: “coffee”
},
“v_type”: “post”
},
{
“v_id”: “9”,
“attributes”: {
“postTime”: “2011-02-05 23:12:42”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “3”,
“attributes”: {
“postTime”: “2011-02-05 01:02:44”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “5”,
“attributes”: {
“postTime”: “2011-02-06 01:02:02”,
“subject”: “tigergraph”
},
“v_type”: “post”
},
{
“v_id”: “7”,
“attributes”: {
“postTime”: “2011-02-04 17:02:41”,
“subject”: “Graphs”
},
“v_type”: “post”
},
{
“v_id”: “1”,
“attributes”: {
“postTime”: “2011-03-03 23:02:00”,
“subject”: “tigergraph”
},
“v_type”: “post”
},
{
“v_id”: “11”,
“attributes”: {
“postTime”: “2011-02-03 01:02:21”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “8”,
“attributes”: {
“postTime”: “2011-02-03 17:05:52”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “6”,
“attributes”: {
“postTime”: “2011-02-05 02:02:05”,
“subject”: “tigergraph”
},
“v_type”: “post”
}
]}]
}

Edge-Induced Selection


EBNF for edgeSet, signaling edge-induced selection
edgeSet := name [“:” name]
“-” “(” [vertexEdgeType] [“:” name] “)” “->”
[vertexEdgeType] [“:” name]

vertexEdgeType := “_” | ANY | name | (“(” name [“|” name]* “)”)

Multiple types can also be specified by using delimiter “|”. Additionally, the keywords “_” or “ANY” can be used for denoting a set which can include any vertex or edge type.

An edge-induced selection starts from a set of vertices, defines a set of edges incident to that set, and produces a result set of vertices that are also incident to those edges. Typically, this is used to traverse from a set of source vertices over a specific edge type to a set of target vertices. The FROM clause argument (defined formally by the EBNF edgeSet rule) is structured as an edge template: Source:s-(eType:e)->tType:t. The edge template has three parts: the source vertex set (Source), the edge type or types (eType), and the target vertex type or types (tType). Both s and t are vertex aliases and e is the edge alias. The template defines a pattern s → e → t, from source vertex s, across eType edges, to tType target vertices. The edge alias e represents any edge that fits the complete pattern. Likewise, s and t are aliases that represent any source vertices and target vertices, respectively, that fit the complete pattern.

Either the source vertex set (s) or target vertex set (t) can be used as the SELECT argument, which determines the result of the SELECT statement. Note the small difference in the two SELECT statements below.


Selecting source or target vertices from edge-induced selection
resultSet1 = SELECT s FROM Source:s-(eType:e)->tType:t; //Select from the source set
resultSet2 = SELECT t FROM Source:s-(eType:e)->tType:t; //Select from the target set

resultSet1 is based on the source end of the selected edges. resultSet2 is based on the target end of the selected edges. However, resultSet1 is NOT identical to the Source vertex set. It contains only those members of Source which connect to an eType edge and then to a tType vertex. Other clauses (presented later in this “SELECT Statement” section) can do additional filtering of the Source set.
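The difference between the two result sets can be modelled outside GSQL. The Python sketch below is only an illustration of the semantics described above, not how TigerGraph evaluates a query; the toy edge list, the vertex names, and the helper `edge_induced_select` are all hypothetical.

```python
# Toy model of an edge-induced selection (illustrative only; not TigerGraph code).
# Each edge is (source_id, edge_type, target_id); vertex types are kept in a dict.
vertex_type = {"p1": "person", "p2": "person", "p3": "person",
               "post1": "post", "post2": "post"}
edges = [
    ("p1", "posted", "post1"),
    ("p1", "liked", "post2"),
    ("p2", "liked", "post1"),
]

def edge_induced_select(source, e_type, t_type, pick):
    """Return the source-side or target-side vertices of all matching edges."""
    matched = [(s, t) for (s, e, t) in edges
               if s in source and e == e_type and vertex_type[t] == t_type]
    index = 0 if pick == "s" else 1
    return {pair[index] for pair in matched}

source = {"p1", "p2", "p3"}  # p3 has no outgoing 'liked' edge
result_set1 = edge_induced_select(source, "liked", "post", pick="s")
result_set2 = edge_induced_select(source, "liked", "post", pick="t")
print(sorted(result_set1))  # ['p1', 'p2']
print(sorted(result_set2))  # ['post1', 'post2']
```

In this model p3 belongs to the source set but not to result_set1, because no edge matching the full pattern starts at p3.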


We strongly suggest that an alias should be declared with every vertex and edge in the FROM clause, as there are several functions and features which are only available to vertex and edge aliases.

Edge Set and Target Vertex Set Options

The FROM clause chooses edges and target vertices by type. The EBNF symbol vertexEdgeType describes the options:

vertexEdgeType      accepted vertex/edge types
_                   any type
ANY                 any type
name                the given vertex/edge type
name | name …       any of the vertex/edge types listed

Note that eType and tType are optional. If eType or tType is omitted (or if ANY or _ is used), then the SELECT will seek out any edge or target vertex that is valid (i.e., there exists a valid path between two vertices over an edge). For the example below, if V1 and V2 are the only possible reachable vertex types via eType, we can omit the target vertex type, making all of the following SELECT statements equivalent. The system will infer the target vertex type at run time.

It is legal to declare an alias without explicitly stating an edge/target type. See the examples below.


Target vertex type inference
resultSet3 = SELECT v FROM Source:v-(eType:e)->(V1|V2):t;
resultSet4 = SELECT v FROM Source:v-(eType:e)->:t;
resultSet5 = SELECT v FROM Source:v-(eType:e)->ANY:t;
resultSet6 = SELECT v FROM Source:v-(eType:e)->_:t;

Type inference is used whenever possible for the edge set and target vertex set to prune ineligible edges and thereby optimize performance. The vertex type in Source is checked against the graph schema to find all incident edge types. The knowledge of the graph schema is combined with the selection’s explicit type conditions given by eType and tType, as well as explicit and implicit type conditions in the WHERE clause to determine a final set of eligible edge sets which match the pattern Source → eType → tType.  With type inference, the user has the freedom to express only as much as necessary to select edges.

Similarly, the GSQL engine will infer the edge type at run time. For example, if E1, E2, and E3 are the only possible edge types that can be traversed to reach vertices of type tType, we can omit specifying the edge type, making the following SELECT statements equivalent.


Edge type inference
resultSet7 = SELECT v FROM Source:v-((E1|E2|E3):e)->tType:t;
resultSet8 = SELECT v FROM Source:v-(:e)->tType:t;
resultSet9 = SELECT v FROM Source:v-(_:e)->tType:t;
resultSet10 = SELECT v FROM Source:v-(ANY:e)->tType:t;
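The inference described above can be pictured as a lookup against the graph schema. The following Python sketch is a simplified model under stated assumptions, not the engine's actual algorithm: the `schema` dictionary and the helper `eligible_edge_types` are hypothetical, and only the two directed edge types of socialNet that start at person are modelled.

```python
# Illustrative model of schema-based type inference (hypothetical helper).
# The schema maps each edge type to (source vertex type, target vertex type).
schema = {
    "posted": ("person", "post"),
    "liked": ("person", "post"),
}

def eligible_edge_types(source_type, e_types=None, t_types=None):
    """Return the edge types that can match Source -(eType)-> tType.

    Passing None for e_types or t_types stands for an omitted type, "_" or ANY.
    """
    result = set()
    for name, (src, tgt) in schema.items():
        if src != source_type:
            continue  # edge type cannot start at this source type
        if e_types is not None and name not in e_types:
            continue  # excluded by the explicit edge type condition
        if t_types is not None and tgt not in t_types:
            continue  # excluded by the explicit target type condition
        result.add(name)
    return result

print(sorted(eligible_edge_types("person", t_types={"post"})))   # ['liked', 'posted']
print(sorted(eligible_edge_types("person", e_types={"liked"})))  # ['liked']
```

Omitting a type leaves the corresponding condition unconstrained, so the schema alone determines which edge types remain eligible.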

The following are a set of queries that demonstrate edge-induced SELECT blocks. The allPostsLiked and allPostsMade queries show how the target vertex type can be omitted. The allPostsLikedOrMade query uses the “|” operator  to select multiple types of edges.


Edge induced SELECT example
# uses various SELECT statements (some of which are equivalent) to print out
# either the posts made by the given user, the posts liked by the given
# user, or the posts made or liked by the given user.
CREATE QUERY printAllPosts2(vertex<person> seed) FOR GRAPH socialNet
{
start = {seed}; # initialize starting set of vertices

# -- statements produce equivalent results
# select all 'post' vertices which can be reached from 'start' in one hop
# using an edge of type 'liked'
allPostsLiked = SELECT targetVertex FROM start -(liked:e)-> post:targetVertex;

# select all vertices of any type which can be reached from 'start' in one hop
# using an edge of type 'liked'
allPostsLiked = SELECT targetVertex FROM start -(liked:e)-> :targetVertex;
# ---

# -- statements produce equivalent results
# start with the vertex set from above, and traverse all edges of type 'posted'
# (locally those edges are just given a name 'e' in case they need to be accessed)
# and return all vertices of type 'post' which can be reached within one hop of 'start' vertices
allPostsMade = SELECT targetVertex FROM start -(posted:e)-> post:targetVertex;

# start with the vertex set from above, and traverse all edges of type 'posted'
# (locally those edges are just given a name 'e' in case they need to be accessed)
# and return all vertices of any type which can be reached within one hop of 'start' vertices
allPostsMade = SELECT targetVertex FROM start -(posted:e)-> :targetVertex;
# ---

# — statements produce equivalent results
# select all vertices of type ‘post’ which can be reached from ‘start’ in one hop
# using an edge of any type
# not equivalent to any statement. because it doesn’t restrict the edge type,
# this will include any vertex connected by ‘liked’ or ‘posted’ edge types
allPostsLikedOrMade = SELECT t FROM start -(:e)-> post:t;

# select all vertices of type ‘post’ which can be reached from ‘start’ in one hop
# using an edge of type either ‘posted’ or ‘liked’
allPostsLikedOrMade = SELECT t FROM start -((posted|liked):e)-> post:t;

# select all vertices of any type which can be reached from ‘start’ in one hop
# using an edge of type either 'posted' or 'liked'
allPostsLikedOrMade = SELECT t FROM start -((posted|liked):e)-> :t;
# —-

PRINT allPostsLiked;
PRINT allPostsMade;
PRINT allPostsLikedOrMade;
}

 


Results of Query printAllPosts2

 



GSQL > RUN QUERY printAllPosts2(“person2”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“allPostsLiked”: [
{
“v_id”: “0”,
“attributes”: {
“postTime”: “2010-01-12 11:22:05”,
“subject”: “Graphs”
},
“v_type”: “post”
},
{
“v_id”: “3”,
“attributes”: {
“postTime”: “2011-02-05 01:02:44”,
“subject”: “cats”
},
“v_type”: “post”
}
]},
{“allPostsMade”: [{
“v_id”: “1”,
“attributes”: {
“postTime”: “2011-03-03 23:02:00”,
“subject”: “tigergraph”
},
“v_type”: “post”
}]},
{“allPostsLikedOrMade”: [
{
“v_id”: “0”,
“attributes”: {
“postTime”: “2010-01-12 11:22:05”,
“subject”: “Graphs”
},
“v_type”: “post”
},
{
“v_id”: “3”,
“attributes”: {
“postTime”: “2011-02-05 01:02:44”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “1”,
“attributes”: {
“postTime”: “2011-03-03 23:02:00”,
“subject”: “tigergraph”
},
“v_type”: “post”
}
]}
]
}
GSQL > RUN QUERY printAllPosts2(“person6”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“allPostsLiked”: [{
“v_id”: “8”,
“attributes”: {
“postTime”: “2011-02-03 17:05:52”,
“subject”: “cats”
},
“v_type”: “post”
}]},
{“allPostsMade”: [
{
“v_id”: “10”,
“attributes”: {
“postTime”: “2011-02-04 03:02:31”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “5”,
“attributes”: {
“postTime”: “2011-02-06 01:02:02”,
“subject”: “tigergraph”
},
“v_type”: “post”
}
]},
{“allPostsLikedOrMade”: [
{
“v_id”: “10”,
“attributes”: {
“postTime”: “2011-02-04 03:02:31”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “5”,
“attributes”: {
“postTime”: “2011-02-06 01:02:02”,
“subject”: “tigergraph”
},
“v_type”: “post”
},
{
“v_id”: “8”,
“attributes”: {
“postTime”: “2011-02-03 17:05:52”,
“subject”: “cats”
},
“v_type”: “post”
}
]}
]
}

This example is another edge selection that uses the “|” operator to select edges that have target vertices of multiple types.


Edge induced SELECT example
# uses a SELECT statement to print out everything related to a given user
# this includes posts that the user liked, posts that the user made, and friends
# of the user
CREATE QUERY printAllRelatedItems(vertex<person> seed) FOR GRAPH socialNet
{
sourceVertex = {seed};

# -- statements produce equivalent output
# returns all vertices of type either 'person' or 'post' that can be reached
# from the sourceVertex set using one edge of any type
everythingRelated = SELECT v FROM sourceVertex -(:e)-> (person|post):v;

# returns all vertices of any type that can be reached from the sourceVertex
# using one edge of any type
# this statement is equivalent to the above one because the graph schema only
# has vertex types of either 'person' or 'post'. if there were more vertex
# types present, these would not be equivalent.
everythingRelated = SELECT v FROM sourceVertex -(:e)-> :v;
# --

PRINT everythingRelated;
}

 


Results

 



GSQL > RUN QUERY printAllRelatedItems(“person2”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“everythingRelated”: [
{
“v_id”: “0”,
“attributes”: {
“postTime”: “2010-01-12 11:22:05”,
“subject”: “Graphs”
},
“v_type”: “post”
},
{
“v_id”: “person3”,
“attributes”: {
“gender”: “Male”,
“id”: “person3”
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“gender”: “Male”,
“id”: “person1”
},
“v_type”: “person”
},
{
“v_id”: “3”,
“attributes”: {
“postTime”: “2011-02-05 01:02:44”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “1”,
“attributes”: {
“postTime”: “2011-03-03 23:02:00”,
“subject”: “tigergraph”
},
“v_type”: “post”
}
]}]
}
GSQL > RUN QUERY printAllRelatedItems(“person6”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“everythingRelated”: [
{
“v_id”: “person4”,
“attributes”: {
“gender”: “Female”,
“id”: “person4”
},
“v_type”: “person”
},
{
“v_id”: “10”,
“attributes”: {
“postTime”: “2011-02-04 03:02:31”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “5”,
“attributes”: {
“postTime”: “2011-02-06 01:02:02”,
“subject”: “tigergraph”
},
“v_type”: “post”
},
{
“v_id”: “person8”,
“attributes”: {
“gender”: “Male”,
“id”: “person8”
},
“v_type”: “person”
},
{
“v_id”: “8”,
“attributes”: {
“postTime”: “2011-02-03 17:05:52”,
“subject”: “cats”
},
“v_type”: “post”
}
]}]
}

Vertex and Edge Aliases

Vertex and edge aliases are declared within the FROM clause of a SELECT block, by using the character “:”, followed by the alias name. Aliases can be accessed anywhere within the same SELECT block. They are used to reference a single selected vertex or edge of a set. It is through the vertex or edge aliases that attributes of these vertices or edges can be accessed.

For example, the following code snippet shows two different SELECT statements. The first SELECT statement starts from a vertex set called allVertices, and the vertex alias name v can access each individual vertex from allVertices. The second SELECT statement selects a set of edges. It can use the vertex alias s to reference the source vertices, or the alias t to reference the target vertices.


Vertex variables
results = SELECT v FROM allVertices:v;
results = SELECT t FROM allVertices:s -()-> :t;

The following example shows an edge-based SELECT statement, declaring aliases for all three parts of the edge. In the ACCUM clause, the e and t aliases are assigned to local vertex and edge variables.


Edge variables
results = SELECT t
FROM allVertices:s -(:e)-> :t
ACCUM VERTEX v = t, EDGE eg = e;

We strongly suggest that an alias should be declared with every vertex and edge in the FROM clause, as there are several functions and features which are only available to vertex and edge aliases.

SAMPLE Clause

The SAMPLE clause is an optional clause that selects a uniform random sample from the population of edges or vertices specified in the FROM argument. To be clear, the edge population consists of those edges which satisfy all three parts (source set, edge type, and target type) of the FROM clause. The SAMPLE clause is intended to provide a representative sample of the distribution of edges (or vertices) connected to hub vertices, instead of dealing with all edges. A hub vertex is a vertex with a relatively high degree. (The degree of a vertex is the number of edges which connect to it. If edges are directional, one can distinguish between indegree and outdegree.)

Note


Currently, the WHEN condition that can be used with a SAMPLE clause is limited strictly to checking whether the result of a function call on a vertex is greater than, or greater than or equal to, some number.

The expression following SAMPLE specifies the sample size, either an absolute number or a percentage of the population. The expression in sampleClause must evaluate to a positive integer. There are two sampling methods. One is sampling based on edge id. The other is based on target vertex id: if a target vertex id is sampled, all edges from this source vertex to the sampled target vertex are sampled.


EBNF for Sample Clause
sampleClause := SAMPLE ( expr | expr “%” ) EDGE WHEN condition # Sample an absolute number (or a percentage) of edges for each source vertex.
| SAMPLE expr TARGET WHEN condition # Sample an absolute number of edges incident to each target vertex.
| SAMPLE expr “%” TARGET PINNED WHEN condition # Sample a percentage of edges incident to each target vertex.

Given that the sampling is random, some of the details of each of the example queries may change each time they are run.

The following query displays two modes of sampling: an absolute number of edges from a source vertex and a percentage of edges from a source vertex. We use the computerNet graph (see Appendix D). In computerNet, there are 31 vertices and 43 edges, but only 7 vertices are source vertices. Moreover, c1, c12, and c23 are hub nodes, with at least 10 outgoing edges each. For the absolute count case, we set the size to 1 edge per source vertex, which is equivalent to a random walk. We expect exactly 7 edges to be selected. For the percentage sampling case, we sample 33% of the edges for vertices which have 3 or more outgoing edges. We expect about 15 edges, but the number may vary.


sampleEx3: SAMPLE based on edges per source vertex
CREATE QUERY sampleEx3() FOR GRAPH computerNet
{
MapAccum<STRING,ListAccum<STRING>> @@absEdges; // record each selected edge as (src->tgt)
SumAccum<INT> @@totalAbs;
MapAccum<STRING,ListAccum<STRING>> @@pctEdges; // record each selected edge as (src->tgt)
SumAccum<INT> @@totalPct;

start = {computer.*};

# Sample one outgoing edge per source vertex = Random Walk
absSample = SELECT v FROM start:s -(:e)-> :v
SAMPLE 1 EDGE WHEN s.outdegree() >= 1 # sample 1 target vertex from each source vertex
ACCUM @@absEdges += (s.id -> v.id),
@@totalAbs += 1;
PRINT @@totalAbs, @@absEdges;

pctSample = SELECT v FROM start:s -(:e)-> :v
SAMPLE 33% EDGE WHEN s.outdegree() >= 3 # select ~1/3 of edges when outdegree >= 3
ACCUM @@pctEdges += (s.id -> v.id),
@@totalPct += 1;
PRINT @@totalPct, @@pctEdges;
}

 


sampleEx3.json

 



GSQL > RUN QUERY sampleEx3()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“@@totalAbs”: 7,
“@@absEdges”: {
“c4”: [“c23”],
“c11”: [“c12”],
“c10”: [“c11”],
“c12”: [“c14”],
“c23”: [“c26”],
“c14”: [“c24”],
“c1”: [“c10”]
}
},
{
“@@totalPct”: 13,
“@@pctEdges”: {
“c4”: [“c23”],
“c11”: [“c12”],
“c10”: [“c11”],
“c12”: [
“c14”,
“c15”,
“c19”
],
“c23”: [
“c29”,
“c25”
],
“c14”: [
“c24”,
“c23”
],
“c1”: [
“c3”,
“c8”,
“c2”
]
}
}
]
}
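The per-source sampling used in sampleEx3 can be approximated by the following Python sketch. It is a rough model under several assumptions, not the engine's implementation: source vertices that fail the WHEN condition are assumed to keep all of their edges (as the output above suggests for c4, c10, and c11), a percentage is assumed to act as an independent keep probability per edge (consistent with the remark that the count may vary), and the helper `sample_edges` and the toy edge list are hypothetical rather than the real computerNet data.

```python
import random

def sample_edges(edges, when, count=None, percent=None, rng=random):
    """Model of SAMPLE ... EDGE WHEN ...: sampling is done per source vertex.

    edges   : list of (source, target) pairs
    when    : predicate on the outdegree of a source vertex
    count   : absolute number of edges to keep per source, or
    percent : assumed per-edge keep probability, in percent
    """
    by_source = {}
    for s, t in edges:
        by_source.setdefault(s, []).append((s, t))

    kept = []
    for s, out in by_source.items():
        if not when(len(out)):
            kept.extend(out)  # assumption: condition false, keep every edge
        elif count is not None:
            kept.extend(rng.sample(out, min(count, len(out))))
        else:
            kept.extend(e for e in out if rng.random() * 100 < percent)
    return kept

# Hypothetical graph: c1 is a hub with 10 outgoing edges, c4 and c10 have one each.
edges = [("c1", f"c{i}") for i in range(2, 12)] + [("c4", "c23"), ("c10", "c11")]
abs_sample = sample_edges(edges, when=lambda d: d >= 1, count=1)
pct_sample = sample_edges(edges, when=lambda d: d >= 3, percent=33)
print(len(abs_sample))  # 3: exactly one edge per source vertex
```

The size of pct_sample changes from run to run, while abs_sample always contains one edge for each of the three source vertices.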

Below is an example of using SELECT to only traverse one edge for each source vertex. The vertex-attached accumulators @timesTraversedNoSample and @timesTraversedWithSample are used to keep track of the number of times an edge is traversed to reach the target vertex. Without using sampling, this occurs once for each edge; thus @timesTraversedNoSample has the same number as the in-degree of the vertex. With sampling edges, the number of edges is restricted. This is reflected in the @timesTraversedWithSample accumulator. Notice the difference in the result set. Because only one edge per source vertex is traversed when the SAMPLE clause is used, not all target vertices are reached. The vertex company3 has 3 incident edges, but in one instance of the query execution, it is never reached. Additionally, company2 has 6 incident edges, but only 4 source vertices sampled an edge incident to company2.


example of SAMPLE using an absolute number of edges
CREATE QUERY sampleEx1() FOR GRAPH workNet
{
SumAccum<INT> @timesTraversedNoSample;
SumAccum<INT> @timesTraversedWithSample;
workers = {person.*};

# The 'beforeSample' result set encapsulates the normal functionality of
# a SELECT statement, where ‘timesTraversedNoSample’ vertex accumulator is increased for
# each edge incident to the vertex.
beforeSample = SELECT v FROM workers:t -(:e)-> :v
ACCUM v.@timesTraversedNoSample += 1;

# The ‘afterSample’ result set is formed by those vertices which can be
# reached when for each source vertex, only one edge is used for traversal.
# This is demonstrated by the values of ‘timesTraversedWithSample’ vertex accumulator, which
# is increased for each edge incident to the vertex which is used in the
# sample.
afterSample = SELECT v FROM workers:t -(:e)-> :v
SAMPLE 1 EDGE WHEN t.outdegree() >= 1 # only use 1 edge from the source vertex
ACCUM v.@timesTraversedWithSample += 1;

PRINT beforeSample;
PRINT afterSample;
}

 


sampleEx1.json

 



GSQL > RUN QUERY sampleEx1()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“beforeSample”: [
{
“v_id”: “company4”,
“attributes”: {
“country”: “us”,
“@timesTraversedNoSample”: 1,
“@timesTraversedWithSample”: 1,
“id”: “company4”
},
“v_type”: “company”
},
{
“v_id”: “company5”,
“attributes”: {
“country”: “can”,
“@timesTraversedNoSample”: 1,
“@timesTraversedWithSample”: 1,
“id”: “company5”
},
“v_type”: “company”
},
{
“v_id”: “company3”,
“attributes”: {
“country”: “jp”,
“@timesTraversedNoSample”: 3,
“@timesTraversedWithSample”: 3,
“id”: “company3”
},
“v_type”: “company”
},
{
“v_id”: “company2”,
“attributes”: {
“country”: “chn”,
“@timesTraversedNoSample”: 6,
“@timesTraversedWithSample”: 4,
“id”: “company2”
},
“v_type”: “company”
},
{
“v_id”: “company1”,
“attributes”: {
“country”: “us”,
“@timesTraversedNoSample”: 6,
“@timesTraversedWithSample”: 3,
“id”: “company1”
},
“v_type”: “company”
}
]},
{“afterSample”: [
{
“v_id”: “company4”,
“attributes”: {
“country”: “us”,
“@timesTraversedNoSample”: 1,
“@timesTraversedWithSample”: 1,
“id”: “company4”
},
“v_type”: “company”
},
{
“v_id”: “company5”,
“attributes”: {
“country”: “can”,
“@timesTraversedNoSample”: 1,
“@timesTraversedWithSample”: 1,
“id”: “company5”
},
“v_type”: “company”
},
{
“v_id”: “company3”,
“attributes”: {
“country”: “jp”,
“@timesTraversedNoSample”: 3,
“@timesTraversedWithSample”: 3,
“id”: “company3”
},
“v_type”: “company”
},
{
“v_id”: “company2”,
“attributes”: {
“country”: “chn”,
“@timesTraversedNoSample”: 6,
“@timesTraversedWithSample”: 4,
“id”: “company2”
},
“v_type”: “company”
},
{
“v_id”: “company1”,
“attributes”: {
“country”: “us”,
“@timesTraversedNoSample”: 6,
“@timesTraversedWithSample”: 3,
“id”: “company1”
},
“v_type”: “company”
}
]}
]
}

Since the PRINT statements are placed at the end of the query, the two vertex sets beforeSample and afterSample are almost identical, showing the final values of both accumulators @timesTraversedNoSample and @timesTraversedWithSample. There is one difference: company3 is not included in afterSample because none of the sample-selected edges reached company3.

WHERE Clause

The WHERE clause is an optional clause that constrains edges and vertices specified in the FROM and SAMPLE clauses.


EBNF for Where Clause
whereClause := WHERE condition

The WHERE clause uses a boolean condition to test each vertex or edge in the FROM set (or the sampled vertex and edge sets, if the SAMPLE clause was used).

If the expression evaluates to false for vertex/edge X, then X is excluded from further consideration in the result set. The expression may use constants or any variables or parameters within the scope of the SELECT, arithmetic operators (+, -, *, /, %), comparison operators (==, !=, <, <=, >, >=), boolean operators (AND, OR, NOT), set operators (IN, NOT IN) and parentheses to enforce precedence. The WHERE conditional expression may use any of the variables within its scope: global accumulators, vertex set variables, query input parameters, the FROM clause’s vertex and edge sets (or their vertex and edge aliases), or any of the attributes or accumulators of the vertex/edge sets. For a more formal explanation of condition, see the EBNF definitions of condition and expr.

Using built-in vertex and edge attributes and functions, such as .type and .neighbors(), the WHERE clause can be used to implement sophisticated selection rules for the edge traversal.  In the following example, the selection conditions are completely specified in the WHERE clause, with no edge types or vertex types mentioned in the FROM clause.


WHERE used as a filter
resultSet1 = SELECT v FROM S:v-((E1|E2|E3):e)->(V1|V2):t;
resultSet2 = SELECT v FROM S:v-(:e)->:t
WHERE t.type IN ("V1", "V2") AND
t IN v.neighbors("E1|E2|E3");

The following examples demonstrate using the WHERE clause to limit the resulting vertex set based on a vertex attribute.


Basic SELECT WHERE
CREATE QUERY printCatPosts() FOR GRAPH socialNet {
posts = {post.*};
catPosts = SELECT v FROM posts:v # select only those post vertices
WHERE v.subject == "cats"; # which have a subject of 'cats'
PRINT catPosts;
}

 


Results for Query printCatPosts

 



GSQL > RUN QUERY printCatPosts()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“catPosts”: [
{
“v_id”: “10”,
“attributes”: {
“postTime”: “2011-02-04 03:02:31”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “9”,
“attributes”: {
“postTime”: “2011-02-05 23:12:42”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “3”,
“attributes”: {
“postTime”: “2011-02-05 01:02:44”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “11”,
“attributes”: {
“postTime”: “2011-02-03 01:02:21”,
“subject”: “cats”
},
“v_type”: “post”
},
{
“v_id”: “8”,
“attributes”: {
“postTime”: “2011-02-03 17:05:52”,
“subject”: “cats”
},
“v_type”: “post”
}
]}]
}

 


SELECT WHERE using IN operator
CREATE QUERY findGraphFocusedPosts() FOR GRAPH socialNet
{
posts = {post.*};
results = SELECT v FROM posts:v # select only post vertices
WHERE v.subject IN (“Graph”, “tigergraph”); # which have a subject of either ‘Graph’ or ‘tigergraph’
PRINT results;
}

 


Results for Query findGraphFocusedPosts

 



GSQL > RUN QUERY findGraphFocusedPosts()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“results”: [
{
“v_id”: “5”,
“attributes”: {
“postTime”: “2011-02-06 01:02:02”,
“subject”: “tigergraph”
},
“v_type”: “post”
},
{
“v_id”: “1”,
“attributes”: {
“postTime”: “2011-03-03 23:02:00”,
“subject”: “tigergraph”
},
“v_type”: “post”
},
{
“v_id”: “6”,
“attributes”: {
“postTime”: “2011-02-05 02:02:05”,
“subject”: “tigergraph”
},
“v_type”: “post”
}
]}]
}

 

WHERE NOT limitations


The NOT operator may not be used in combination with the .type attribute selector. To check if an edge or vertex type is not equal to a given type, use the != operator. See the example below.

The following example shows the equivalence of using WHERE as a type filter as well as its limitations.


SELECT WHERE using AND/OR
# finds female person in the social network. all of the following statements
# are equivalent (i.e., produce the same results)
CREATE QUERY findFemaleMembers() FOR GRAPH socialNet
{
allVertices = {ANY}; # includes all posts and person
females = SELECT v FROM allVertices:v
WHERE v.type == "person" AND
v.gender != "Male";

females = SELECT v FROM allVertices:v
WHERE v.type == "person" AND
v.gender == "Female";

females = SELECT v FROM allVertices:v
WHERE v.type == "person" AND
NOT v.gender == "Male";

females = SELECT v FROM allVertices:v
WHERE v.type != “post” AND
NOT v.gender == “Male”;

# does not compile. cannot use NOT operator in combination with type attribute
#females = SELECT v FROM allVertices:v
# WHERE NOT v.type != “person” AND
# NOT v.gender == “Male”;

# does not compile. cannot use NOT operator in combination with type attribute
#females = SELECT v FROM allVertices:v
# WHERE NOT v.type == “post” AND
# NOT v.gender == “Male”;

personVertices = {person.*};
females = SELECT v FROM personVertices:v
WHERE NOT v.gender == “Male”;

females = SELECT v FROM personVertices:v
WHERE v.gender != “Male”;

females = SELECT v FROM personVertices:v
WHERE v.gender != “Male” AND true;

females = SELECT v FROM personVertices:v
WHERE v.gender != “Male” OR false;

PRINT females;
}

 


Results for Query findFemaleMembers

 



GSQL > RUN QUERY findFemaleMembers()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“females”: [
{
“v_id”: “person4”,
“attributes”: {
“gender”: “Female”,
“id”: “person4”
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“gender”: “Female”,
“id”: “person5”
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“gender”: “Female”,
“id”: “person2”
},
“v_type”: “person”
}
]}]
}

The following example uses edge attributes to determine which workers are registered as full time for some company.


WHERE using edge attributes
# find all workers who are full time at some company
CREATE QUERY fullTimeWorkers() FOR GRAPH workNet
{
start = {person.*};
fullTimeWorkers = SELECT v FROM start:v -(worksFor:e)-> company:t
WHERE e.fullTime; # fullTime is a boolean attribute on the edge

PRINT fullTimeWorkers;
}

 


fullTimeWorkers Results

 



GSQL > RUN QUERY fullTimeWorkers()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“fullTimeWorkers”: [
{
“v_id”: “person4”,
“attributes”: {
“interestList”: [“football”],
“skillSet”: [ 10, 1, 4 ],
“skillList”: [ 4, 1, 10 ],
“locationId”: “us”,
“interestSet”: [“football”],
“id”: “person4”
},
“v_type”: “person”
},
{
“v_id”: “person11”,
“attributes”: {
“interestList”: [ “sport”, “football” ],
“skillSet”: [10],
“skillList”: [10],
“locationId”: “can”,
“interestSet”: [ “football”, “sport” ],
“id”: “person11”
},
“v_type”: “person”
},
{
“v_id”: “person10”,
“attributes”: {
“interestList”: [ “football”, “sport” ],
“skillSet”: [3],
“skillList”: [3],
“locationId”: “us”,
“interestSet”: [ “sport”, “football” ],
“id”: “person10”
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“interestList”: [ “management”, “financial” ],
“skillSet”: [ 3, 2, 1 ],
“skillList”: [ 1, 2, 3 ],
“locationId”: “us”,
“interestSet”: [ “financial”, “management” ],
“id”: “person1”
},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {
“interestList”: [ “music”, “art” ],
“skillSet”: [ 10, 7 ],
“skillList”: [ 7, 10 ],
“locationId”: “jp”,
“interestSet”: [ “art”, “music” ],
“id”: “person6”
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“interestList”: [“engineering”],
“skillSet”: [ 6, 5, 3, 2 ],
“skillList”: [ 2, 3, 5, 6 ],
“locationId”: “chn”,
“interestSet”: [“engineering”],
“id”: “person2”
},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {
“interestList”: [“management”],
“skillSet”: [ 2, 5, 1 ],
“skillList”: [ 1, 5, 2 ],
“locationId”: “chn”,
“interestSet”: [“management”],
“id”: “person8”
},
“v_type”: “person”
},
{
“v_id”: “person12”,
“attributes”: {
“interestList”: [
“music”,
“engineering”,
“teaching”,
“teaching”,
“teaching”
],
“skillSet”: [ 2, 5, 1 ],
“skillList”: [ 1, 5, 2, 2, 2 ],
“locationId”: “jp”,
“interestSet”: [ “teaching”, “engineering”, “music” ],
“id”: “person12”
},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {
“interestList”: [“teaching”],
“skillSet”: [ 6, 1, 4 ],
“skillList”: [ 4, 1, 6 ],
“locationId”: “jp”,
“interestSet”: [“teaching”],
“id”: “person3”
},
“v_type”: “person”
},
{
“v_id”: “person9”,
“attributes”: {
“interestList”: [ “financial”, “teaching” ],
“skillSet”: [ 2, 7, 4 ],
“skillList”: [ 4, 7, 2 ],
“locationId”: “us”,
“interestSet”: [ “teaching”, “financial” ],
“id”: “person9”
},
“v_type”: “person”
}
]}]
}

 


If multiple edge types are specified in edge-induced selection, the WHERE clause should use OR to separate each edge type or each target vertex type. For example,


Multiple Edge Type WHERE clause
CREATE QUERY multipleEdgeTypeWhereEx(vertex<person> m1) FOR GRAPH socialNet {
allUser = {m1};
FilteredUser = SELECT s
FROM allUser:s -((posted|liked|friend):e)-> (post|person):t
# WHERE e.actionTime > epoch_to_datetime(1) AND t.gender == “Male”;
WHERE ( e.type == “liked” AND e.actionTime > epoch_to_datetime(1) ) OR
( e.type == “friend” AND t.gender == “Male” )
;
PRINT FilteredUser;
}

The above query compiles. However, if the commented-out WHERE clause (line 5 of the query) is used instead, the query does not compile. The edge-type conflict check detects an error, because that clause uses attributes from both “liked” edges and “friend” edges without separating them out by OR.
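The reason for guarding each attribute with a type test can be shown by analogy in Python. This is only an analogy: GSQL rejects the unguarded form at compile time, whereas the sketch below fails at run time. The edge records and the functions `guarded` and `unguarded` are hypothetical and only loosely follow the socialNet schema.

```python
# Illustrative model: edges and vertices of different types carry different attributes.
edges = [
    {"type": "liked", "target": {"type": "post"}, "actionTime": 1263294125},
    {"type": "friend", "target": {"type": "person", "gender": "Male"}},
    {"type": "posted", "target": {"type": "post"}},
]

def guarded(e):
    # Each attribute is read only after the type test that guarantees it exists.
    t = e["target"]
    return ((e["type"] == "liked" and e["actionTime"] > 1) or
            (e["type"] == "friend" and t["gender"] == "Male"))

def unguarded(e):
    # Mixes an attribute of 'liked' edges with an attribute of 'person' targets.
    return e["actionTime"] > 1 and e["target"]["gender"] == "Male"

print([e["type"] for e in edges if guarded(e)])  # ['liked', 'friend']

try:
    [e for e in edges if unguarded(e)]
except KeyError as missing:
    print("unguarded condition fails on attribute", missing)
```

The guarded form mirrors the WHERE clause of multipleEdgeTypeWhereEx: each OR branch first fixes the edge type and only then touches attributes that exist for that type.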

ACCUM and POST-ACCUM Clauses

The optional ACCUM and POST-ACCUM clauses enable sophisticated aggregation and other computations across the set of vertices or edges selected by the preceding FROM, SAMPLE, and WHERE clauses.

A query can contain one or both of these clauses. The statements in an ACCUM clause are applied for every edge in an edge-induced selection or every vertex in a vertex-induced selection.

If there is more than one statement in the ACCUM clause, the statements are separated by commas and executed sequentially for each selected element. However, the TigerGraph system uses parallelism to improve performance. Within an ACCUM clause, each edge is handled by a separate process. As such, there is no fixed order in which the edges are processed within the ACCUM clause, and the edges should not be treated as executing sequentially. The accumulators are mutex variables shared among each of these processes. The results of any accumulation within the ACCUM clause are not complete until all edges are traversed. Any inspection of an intermediate result within the ACCUM clause is incomplete and may not be meaningful.

 


The statements within the ACCUM clause are executed sequentially for a given vertex or edge.  However, there is no fixed order in which a vertex set or edge set is processed.

The optional POST-ACCUM clause enables aggregation and other computations across the set of vertices (but not edges) selected by the preceding clauses. POST-ACCUM can be used without ACCUM. If it is preceded by an ACCUM clause, then it can be used for 2-stage accumulative computation: a first stage in ACCUM followed by a second stage in POST-ACCUM.


As of v1.1, the keyword POST-ACCUM may also be spelled with an underscore: POST_ACCUM.

 


Each statement within the POST-ACCUM clause can refer to either source vertices or target vertices but not both.

In edge-induced selection, since the ACCUM clause iterates over edges, and often two edges will connect to the same source vertex or to the same target vertex, the ACCUM clause can be repeated multiple times for one vertex.


Operations that are to be performed exactly once per vertex should be performed in the POST-ACCUM clause.
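The difference between per-edge and per-vertex execution can be sketched in Python. This is a sequential toy model with hypothetical data; it ignores the parallel execution described above and only counts how often each clause body would touch a target vertex.

```python
from collections import defaultdict

# Hypothetical selected edges (source, target); two of them share the target 'c1'.
selected_edges = [("p1", "c1"), ("p2", "c1"), ("p3", "c2")]

accum_runs = defaultdict(int)       # how often the ACCUM body touched a target
post_accum_runs = defaultdict(int)  # how often the POST-ACCUM body touched it

# ACCUM: the body runs once per selected edge (order is not guaranteed in GSQL).
for s, t in selected_edges:
    accum_runs[t] += 1

# POST-ACCUM: the body runs once per distinct vertex on the chosen side.
for t in {t for _, t in selected_edges}:
    post_accum_runs[t] += 1

print(dict(accum_runs))                 # {'c1': 2, 'c2': 1}
print(sorted(post_accum_runs.items()))  # [('c1', 1), ('c2', 1)]
```

Vertex c1 is reached by two edges, so an ACCUM statement would execute twice for it, while a POST-ACCUM statement executes once.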

 

The primary purpose of the ACCUM or POST-ACCUM clause is to collect information about the graph by updating accumulators (via += or =). See the “Accumulator” section for details on the += operation. However, other kinds of statements (e.g., branching, iteration, local assignments) are permitted to support more complex computations or to log activity. The EBNF syntax below defines the allowable kinds of statements that can occur within an ACCUM or POST-ACCUM. The DMLSubStmt list is similar to the queryBodyStmt list which applies to statements outside of a SELECT block; it is important to note the differences. Each of these statement types is discussed in one of the main sections of this reference document.

 


EBNF for ACCUM and POST-ACCUM Clauses

accumClause := ACCUM DMLSubStmtList

postAccumClause := POST-ACCUM DMLSubStmtList

DMLSubStmtList := DMLSubStmt [“,” DMLSubStmt]*

DMLSubStmt := assignStmt // Assignment (including vertex-attached accumulate)
| funcCallStmt // Function Call
| gAccumAccumStmt // Assignment (global accumulate)
| vAccumFuncCall // Function Call
| localVarDeclStmt // Declaration
| DMLSubCaseStmt // Control Flow
| DMLSubIfStmt // Control Flow
| DMLSubWhileStmt // Control Flow
| DMLSubForEachStmt // Control Flow
| BREAK // Control Flow
| CONTINUE // Control Flow
| insertStmt // Data Modification
| DMLSubDeleteStmt // Data Modification
| printlnStmt // Output
| logStmt // Output

 


Note that the DML-sub-statements include the global accumulator accumulation statement (gAccumAccumStmt) but not the global accumulator assignment statement (gAccumAssignStmt). Within these clauses, global accumulators may be updated by accumulation (+=) but not by assignment (=).

 


There are additional restrictions on DML-sub level statements:

  • Global variable assignment is permitted in ACCUM or POST-ACCUM clauses, but the change in value will not take place until the query completes. Therefore, if there are multiple assignment statements for the same variable, only the final one will take effect.
  • Vertex attribute assignment “=” is not permitted in an ACCUM clause. However, edge attribute assignment is permitted. This is because the ACCUM clause iterates over an edge set. Vertex attribute assignment is permitted in the POST-ACCUM clause. Like all updates, the change in value does not take place until the query completes.

Aliases and ACCUM/POST-ACCUM Iteration Model

To reference each element of the selected set, use the aliases defined in the FROM clause.  For example, assume that we have the following aliases:


Example of vertex and edge aliases
FROM Source:s -(edgeTypes:e)-> targetTypes:t # edge-induced selection
FROM Source:v # vertex-induced selection

Let (V1, V2, …, Vn) be the vertices in the vertex-induced selection. The following pseudocode emulates ACCUM clause behavior.


Model for ACCUM behavior in vertex-induced selection
FOREACH v in (V1,V2,…Vn) DO # iterations may occur in parallel, in unknown order
DMLSubStmts referencing v
DONE

Let E = (E1, E2, …, En) be the edges in the edge-induced selected set. Further, let S = (S1, S2, …, Sn) and T = (T1, T2, …, Tn) be the multisets (bags) of source vertices and target vertices which correspond to the edge set. S and T are bags because they can contain repeated elements.


Model for ACCUM behavior in edge-induced selection
FOREACH i in (1..n) DO # iterations may occur in parallel, in unknown order
DMLSubStmts referencing e, s, t, which really means e_i, s_i, t_i
DONE

Note that any reference to the source alias s or target alias t is for the endpoint vertices of the current edge.

Similarly, the POST-ACCUM clause acts like a FOREACH loop on the vertex result set specified in the SELECT clause (e.g., either S or T).
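The iteration model above can be mimicked outside GSQL. The following Python sketch is only an illustration under stated assumptions: the three edges and the helper run_select are hypothetical, and the sequential loops stand in for iterations that GSQL may run in parallel and in any order. It shows that ACCUM runs once per edge, so a vertex with two edges is visited twice, while POST-ACCUM runs once per distinct vertex of the selected set.

```python
from collections import Counter

# Hypothetical edge-induced selection: (source, edge, target) triples.
# s1 has two outgoing edges, and t2 has two incoming edges.
EDGES = [("s1", "e1", "t1"), ("s1", "e2", "t2"), ("s2", "e3", "t2")]


def run_select(edges, select="s"):
    """Model ACCUM (once per edge) and POST-ACCUM (once per selected vertex)."""
    accum_runs = Counter()
    post_accum_runs = Counter()
    position = 0 if select == "s" else 2  # endpoint named in the SELECT clause

    # ACCUM: one iteration per edge; GSQL gives no ordering guarantee.
    for edge in edges:
        accum_runs[edge[position]] += 1

    # POST-ACCUM: one iteration per distinct vertex in the result set.
    for vertex in {edge[position] for edge in edges}:
        post_accum_runs[vertex] += 1

    return dict(accum_runs), dict(post_accum_runs)


accum_by_source, post_by_source = run_select(EDGES, select="s")
accum_by_target, post_by_target = run_select(EDGES, select="t")
print(accum_by_source, post_by_source)
print(accum_by_target, post_by_target)
```

In this toy data, s1 and t2 are each reached by two ACCUM iterations but by only one POST-ACCUM iteration, which is why work that must happen exactly once per vertex belongs in POST-ACCUM.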

Edge/Vertex Type Inference and Conflict

If multiple edge types are specified in an edge-induced selection, each statement in the ACCUM clause is checked for edge-type conflicts. If a statement is valid for only a subset of the edge types, it is not executed on the other edge types. For example:


Multiple Edge Type ACCUM statement check
CREATE QUERY multipleEdgeTypeCheckEx(vertex<person> m1) FOR GRAPH socialNet {
ListAccum<STRING> @@testList1, @@testList2, @@testList3;
allUser = {m1};
allUser = SELECT s
FROM allUser:s - ((posted|liked|friend):e) -> (post|person):t
ACCUM @@testList1 += to_string(datetime_to_epoch(e.actionTime))
,@@testList2 += t.gender
#,@@testList3 += to_string(datetime_to_epoch(e.actionTime)) + t.gender # illegal
;
PRINT @@testList1, @@testList2, @@testList3;
}

In the above example, line 6 is executed only on “liked” edges, because “actionTime” is an attribute of the “liked” edge type only. Similarly, line 7 is executed only on “friend” edges, because “gender” is an attribute of “person” only, and only the “friend” edge type has “person” as its target vertex type. However, line 8 (commented out here) would cause a compilation error, because no edge type supports every part of the statement: “liked” edges do not have t.gender, and “friend” edges do not have e.actionTime.
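The check described above can be pictured as a set computation. The following Python sketch is an assumed model, not the compiler's actual algorithm: the attribute tables are copied from the socialNet schema, while the function effective_edge_types and the idea of treating an empty result as the compile error are illustrative assumptions.

```python
# Attribute tables taken from the socialNet schema shown earlier.
EDGE_ATTRS = {"posted": set(), "liked": {"actionTime"}, "friend": set()}
EDGE_TARGET = {"posted": "post", "liked": "post", "friend": "person"}
VERTEX_ATTRS = {
    "person": {"id", "gender"},
    "post": {"subject", "postTime"},
}


def effective_edge_types(edge_attrs_used, target_attrs_used):
    """Return the edge types on which a statement could run (assumed model).

    A statement is effective on an edge type only if every edge attribute it
    reads exists on that edge type and every target attribute it reads exists
    on that edge type's target vertex type.
    """
    effective = set()
    for edge_type, attrs in EDGE_ATTRS.items():
        target_attrs = VERTEX_ATTRS[EDGE_TARGET[edge_type]]
        if edge_attrs_used <= attrs and target_attrs_used <= target_attrs:
            effective.add(edge_type)
    return effective


uses_action_time = effective_edge_types({"actionTime"}, set())   # line 6
uses_gender = effective_edge_types(set(), {"gender"})            # line 7
uses_both = effective_edge_types({"actionTime"}, {"gender"})     # line 8
print(uses_action_time, uses_gender, uses_both)
```

In this model the statement on line 8 has no effective edge type at all, which corresponds to the compilation error described above.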


We strongly suggest that if multiple edge types are specified in an edge-induced selection, the ACCUM clause should use a CASE statement (see Section “Control Flow Statements” for more details) to separate the operations for each edge type or each target vertex type (or each combination of target vertex type and edge type). The edge-type conflict check then examines the statements inside each THEN/ELSE block based on its condition. For example,


Multiple Edge Type ACCUM statement check 2
CREATE QUERY multipleEdgeTypeCheckEx2(vertex<person> m1) FOR GRAPH socialNet {
ListAccum<STRING> @@testList1;
allUser = {m1};
allUser = SELECT s
FROM allUser:s - ((posted|liked|friend):e) -> (post|person):t
ACCUM CASE
WHEN e.type == "liked" THEN # for liked edges
@@testList1 += to_string(datetime_to_epoch(e.actionTime))
WHEN e.type == "friend" THEN # for friend edges
@@testList1 += t.gender
ELSE # for the remaining edge type, which is posted
@@testList1 += to_string(datetime_to_epoch(t.postTime))
END
;
PRINT @@testList1;
}

The above query compiles. However, if we switch line 8 and line 10, the edge-type conflict check generates errors, because “liked” edges do not support t.gender and “friend” edges do not support e.actionTime.

Similarly, if multiple source/target vertex types are specified in an edge-induced selection and the POST-ACCUM clause accesses the source/target vertex, each statement in the POST-ACCUM clause is checked for vertex-type conflicts. If a statement is valid for only a subset of the source/target vertex types, it is not executed on the other source/target vertex types.

As with the ACCUM clause, we strongly suggest that in this situation the POST-ACCUM clause should use a CASE statement (see Section “Control Flow Statements” for more details) to separate the operations for each source/target vertex type. The vertex-type conflict check then examines the statements inside each THEN/ELSE block based on its condition.

Rules for Updating Vertex-Attached Accumulators

Prior to v1.0, a vertex-attached accumulator could only be updated in an ACCUM or POST-ACCUM clause and only if its vertex was selected by the preceding FROM-SAMPLE-WHERE clauses.

Beginning in v1.0, there are additional circumstances where a vertex-attached accumulator may be updated. Vertices which are referenced via a vertex-attached accumulator of a selected vertex may have their vertex-attached accumulators updated in the ACCUM clause (but not in the POST-ACCUM clause). That is, a vertex referenced by a selected vertex can be updated, with some limitations explained below. Some examples will help to illustrate this more complex condition.

  • Suppose a query declares a vertex-attached accumulator which holds vertex information. We call this a vertex-holding accumulator. This could take several forms:

    • A scalar accumulator, e.g., MaxAccum<VERTEX> @maxV;
    • A collection accumulator, e.g., ListAccum<VERTEX> @listV;
    • An accumulator containing tuple(s), where the tuple type contains a VERTEX field.

  • If a vertex V is selected, then not only can V’s accumulators be updated, but the vertices stored in its vertex-holding accumulators can also be updated, in the ACCUM clause.
  • Before these indirectly referenced vertices can be used, they need to be activated. There are two ways to activate an indirect vertex:

    • A vertex from a vertex-holding accumulator is first assigned to a local vertex variable.  The vertex can now be updated through the local vertex variable.

      ACCUM
      VERTEX<person> mx = tgt.@maxV,   # assign to local variable
      mx.@curId += src.id      # access via local variable
    • A FOREACH loop can iterate on a vertex-holding collection accumulator. The vertices can now be updated through the loop variable.

      ACCUM
      FOREACH vtx IN src.@setIds DO   # iterate on collection accumulator
      vtx.@curId += tgt.id        # access via loop variable
      END

The following uses are NOT supported by the new rules:

  • Indirectly activated vertices may not be updated in the POST-ACCUM clause or outside of a SELECT statement.
  • Passing a vertex into the query as an input parameter is not a route to activation.
  • Using a global vertex-holding accumulator is not a route to activation.
  • If a vertex is being indirectly activated by assigning it to a local variable (e.g., a variable declared in ACCUM or POST-ACCUM), note the following rule, which always applies to all local variables:
    • A local variable can be declared and initialized in an ACCUM block once.  It cannot be redeclared or reassigned later in the ACCUM block.

The following query demonstrates updates to indirectly activated vertices.


Updating an Indirectly-Referenced Vertex

CREATE QUERY vUpdateIndirectAccum() FOR GRAPH socialNet {

SetAccum<VERTEX<person>> @posters;
SetAccum<VERTEX<person>> @fellows;

Persons = {person.*};
# To each post, attach a list of persons who liked the post
likedPosts = SELECT p
FROM Persons:src -(liked:e)-> post:p
ACCUM
p.@posters += src;

# To each person who liked a post, attach a list of everyone
# who also liked one of this person’s liked posts.
likedPosts = SELECT src
FROM likedPosts:src
ACCUM
FOREACH v IN src.@posters DO
v.@fellows += src.@posters
END
ORDER BY src.subject;

PRINT Persons[Persons.@fellows];
}


Results from Query vUpdateIndirectAccum
GSQL > RUN QUERY vUpdateIndirectAccum()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“Persons”: [
{
“v_id”: “person4”,
“attributes”: {“Persons.@fellows”: [
“person8”,
“person4”
]},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {“Persons.@fellows”: [ “person2”, “person1”, “person3” ]},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {“Persons.@fellows”: [“person7”]},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {“Persons.@fellows”: [ “person2”, “person1”, “person3” ]},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {“Persons.@fellows”: [“person5”]},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {“Persons.@fellows”: [“person6”]},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {“Persons.@fellows”: [ “person2”, “person1”, “person3” ]},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {“Persons.@fellows”: [ “person8”, “person4” ]},
“v_type”: “person”
}
]}]
}
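To make the two stages of this query easier to follow, here is a small Python model of the same idea. It is a sketch under stated assumptions: the LIKED pairs are hypothetical (they are not the socialNet data), and dictionaries of sets stand in for the vertex-attached SetAccum accumulators @posters and @fellows.

```python
from collections import defaultdict

# Hypothetical "liked" edges as (person, post) pairs; not the socialNet data.
LIKED = [("p1", "postA"), ("p2", "postA"), ("p3", "postB")]

# First SELECT: attach to each post the set of persons who liked it (@posters).
posters = defaultdict(set)
for person, post in LIKED:
    posters[post].add(person)

# Second SELECT: the posts are the selected vertices. Iterating over a post's
# vertex-holding accumulator activates each stored person, whose own
# accumulator (@fellows) can then be updated.
fellows = defaultdict(set)
for post in posters:
    for v in posters[post]:            # FOREACH v IN src.@posters DO
        fellows[v] |= posters[post]    #   v.@fellows += src.@posters

for person in sorted(fellows):
    print(person, sorted(fellows[person]))
```

As in the query result above, every person appears in their own @fellows set, because each person is among the likers of the posts they liked.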

 

ACCUM and POST-ACCUM Examples

We now show several examples. This example demonstrates how ACCUM or POST-ACCUM can be used to count the number of vertices in the given set.


Accum and PostAccum Semantics
#Show Accum PostAccum Behavior
CREATE QUERY accumPostAccumSemantics() FOR GRAPH workNet {
SumAccum<INT> @@vertexOnlyAccum;
SumAccum<INT> @@vertexOnlyPostAccum;

SumAccum<INT> @@vertexOnlyWhereAccum;
SumAccum<INT> @@vertexOnlyWherePostAccum;

SumAccum<INT> @@sourceWithEdgeAccum;
SumAccum<INT> @@sourceWithEdgePostAccum;

SumAccum<INT> @@targetWithEdgeAccum;
SumAccum<INT> @@targetWithEdgePostAccum;

#Seed start set with all company vertices
start = {company.*};

#Select all vertices in source set start
selectVertexSet = SELECT v from start:v
#Happens once for each vertex discovered
ACCUM @@vertexOnlyAccum += 1

#Happens once for each vertex in the result set “v”
POST-ACCUM @@vertexOnlyPostAccum += 1;

#Select all vertices in source set start with a where constraint
selectVertexSetWhere = SELECT v from start:v WHERE (v.country == "us")
#Happens once for each vertex discovered that also
# meets the constraint condition
ACCUM @@vertexOnlyWhereAccum += 1

#Happens once for each vertex in the result set “v”
POST-ACCUM @@vertexOnlyWherePostAccum += 1;

#Select all source “s” vertices in set start and explore all “worksFor” edge paths
selectSourceWithEdge = SELECT s from start:s -(worksFor)-> :t
#Happens once for each “worksFor” edge discovered
ACCUM @@sourceWithEdgeAccum += 1

#Happens once for each vertex in result set “s” (source)
POST-ACCUM @@sourceWithEdgePostAccum += 1;

#Select all target “t” vertices found from exploring all “worksFor” edge paths from set start
selectTargetWithEdge = SELECT t from start:s -(worksFor)-> :t
#Happens once for each “worksFor” edge discovered
ACCUM @@targetWithEdgeAccum += 1

#Happens once for each vertex in result set “t” (target)
POST-ACCUM @@targetWithEdgePostAccum += 1;

PRINT @@vertexOnlyAccum;
PRINT @@vertexOnlyPostAccum;

PRINT @@vertexOnlyWhereAccum;
PRINT @@vertexOnlyWherePostAccum;

PRINT @@sourceWithEdgeAccum;
PRINT @@sourceWithEdgePostAccum;

PRINT @@targetWithEdgeAccum;
PRINT @@targetWithEdgePostAccum;
}

 


accumPostAccumSemantics Result
GSQL > RUN QUERY accumPostAccumSemantics()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“@@vertexOnlyAccum”: 5},
{“@@vertexOnlyPostAccum”: 5},
{“@@vertexOnlyWhereAccum”: 2},
{“@@vertexOnlyWherePostAccum”: 2},
{“@@sourceWithEdgeAccum”: 17},
{“@@sourceWithEdgePostAccum”: 5},
{“@@targetWithEdgeAccum”: 17},
{“@@targetWithEdgePostAccum”: 12}
]
}

This example uses ACCUM to find all the subjects a user posted about.


Vertex ACCUM Example
# For each person, make a list of all their post subjects
CREATE QUERY userPosts() FOR GRAPH socialNet {
ListAccum<STRING> @personPosts;
start = {person.*};

# Find all user post topics and append them to the vertex list accum
userPostings = SELECT s FROM start:s -(posted)-> :g
ACCUM s.@personPosts += g.subject;

PRINT userPostings;
}

 


Results for Query userPosts
GSQL > RUN QUERY userPosts()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“userPostings”: [
{
“v_id”: “person4”,
“attributes”: {
“gender”: “Female”,
“@personPosts”: [“cats”],
“id”: “person4”
},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {
“gender”: “Male”,
“@personPosts”: [“query languages”],
“id”: “person3”
},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {
“gender”: “Male”,
“@personPosts”: [ “cats”, “tigergraph” ],
“id”: “person7”
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“gender”: “Male”,
“@personPosts”: [“Graphs”],
“id”: “person1”
},
“v_type”: “person”
},
/*** other vertices omitted ***/
]}]
}

 

This example shows each person’s posted vertices and each person’s like behaviors (liked edges).


ACCUM<VERTEX> and ACCUM<EDGE> Example
# Show each user’s post and liked post time
CREATE QUERY userPosts2() FOR GRAPH socialNet {
ListAccum<VERTEX> @personPosts;
ListAccum<EDGE> @personLikedInfo;
start = {person.*};

# Find all user post topics and append them to the vertex list accum
userPostings = SELECT s FROM start:s -(posted)-> :g
ACCUM s.@personPosts += g;

userPostings = SELECT s from start:s -(liked:e)-> :g
ACCUM s.@personLikedInfo += e;

PRINT start;
}

 


Results from Query userPosts2
GSQL > RUN QUERY userPosts2()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“start”: [
{
“v_id”: “person4”,
“attributes”: {
“gender”: “Female”,
“@personPosts”: [“3”],
“id”: “person4”,
“@personLikedInfo”: [{
“from_type”: “person”,
“to_type”: “post”,
“directed”: true,
“from_id”: “person4”,
“to_id”: “4”,
“attributes”: {“actionTime”: “2010-01-13 03:16:05”},
“e_type”: “liked”
}]
},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {
“gender”: “Male”,
“@personPosts”: [ “9”, “6” ],
“id”: “person7”,
“@personLikedInfo”: [{
“from_type”: “person”,
“to_type”: “post”,
“directed”: true,
“from_id”: “person7”,
“to_id”: “10”,
“attributes”: {“actionTime”: “2010-01-12 11:22:05”},
“e_type”: “liked”
}]
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“gender”: “Male”,
“@personPosts”: [“0”],
“id”: “person1”,
“@personLikedInfo”: [{
“from_type”: “person”,
“to_type”: “post”,
“directed”: true,
“from_id”: “person1”,
“to_id”: “0”,
“attributes”: {“actionTime”: “2010-01-11 11:32:00”},
“e_type”: “liked”
}]
},
“v_type”: “person”
},
/*** other vertices omitted ***/
]}]
}

 

This example counts the total number of times each topic is used.


Global ACCUM Example
# Show number of total posts by topic
CREATE QUERY userPostsByTopic() FOR GRAPH socialNet {
MapAccum<STRING, INT> @@postTopicCounts;
start = {person.*};

# Append subject and update the appearance count in the global map accum
posts = SELECT g FROM start -(posted)-> :g
ACCUM @@postTopicCounts += (g.subject -> 1);

PRINT @@postTopicCounts;
}

 


Results for Query userPostsByTopic
GSQL > RUN QUERY userPostsByTopic()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@postTopicCounts”: {
“cats”: 5,
“coffee”: 1,
“query languages”: 1,
“Graphs”: 2,
“tigergraph”: 3
}}]
}
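The result above suggests that adding (key -> 1) to the map repeatedly sums the values stored under a repeated key. The following Python sketch models that counting behavior with collections.Counter. The list of subjects is hypothetical, and the mapping from MapAccum to Counter is an analogy rather than a statement about how the accumulator is implemented.

```python
from collections import Counter

# Hypothetical subjects, one per "posted" edge reached by the SELECT.
posted_subjects = ["cats", "Graphs", "cats", "coffee", "cats"]

# Models: ACCUM @@postTopicCounts += (g.subject -> 1)
post_topic_counts = Counter()
for subject in posted_subjects:
    post_topic_counts[subject] += 1

print(dict(post_topic_counts))
```

Because each update is a commutative addition, the final counts do not depend on the order in which the edges are processed.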

This is an example of using ACCUM and POST-ACCUM in conjunction. ACCUM traverses the worksFor edges and, for each person, records the companies located in the country where that person lives. POST-ACCUM then examines each person vertex to flag whether that person works where they live.


Vertex POST-ACCUM Example
#Show all persons who both work and live in the same country
CREATE QUERY residentEmployees() FOR GRAPH workNet {
ListAccum<STRING> @company;
OrAccum @worksAndLives;

start = {person.*};

employees = SELECT s FROM start:s -(worksFor)-> :c
#If a person works for a company in the same country where they live
# add the company to the list
ACCUM CASE WHEN (s.locationId == c.country) THEN
s.@company += c.id
END

#Check each vertex and see if a person works where they live
POST-ACCUM CASE WHEN (s.@company.size() > 0) THEN
s.@worksAndLives += True
ELSE
s.@worksAndLives += False
END;

PRINT employees WHERE (employees.@worksAndLives == True);
}

 


residentEmployees Result
GSQL > RUN QUERY residentEmployees()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“employees”: [
{
“v_id”: “person11”,
“attributes”: {
“interestList”: [
“sport”,
“football”
],
“skillSet”: [10],
“skillList”: [10],
“@worksAndLives”: true,
“locationId”: “can”,
“interestSet”: [ “football”, “sport” ],
“id”: “person11”,
“@company”: [“company5”]
},
“v_type”: “person”
},
{
“v_id”: “person10”,
“attributes”: {
“interestList”: [ “football”, “sport” ],
“skillSet”: [3],
“skillList”: [3],
“@worksAndLives”: true,
“locationId”: “us”,
“interestSet”: [ “sport”, “football” ],
“id”: “person10”,
“@company”: [“company1”]
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“interestList”: [ “management”, “financial” ],
“skillSet”: [ 3, 2, 1 ],
“skillList”: [ 1, 2, 3 ],
“@worksAndLives”: true,
“locationId”: “us”,
“interestSet”: [ “financial”, “management” ],
“id”: “person1”,
“@company”: [“company1”]
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“interestList”: [“engineering”],
“skillSet”: [ 6, 5, 3, 2 ],
“skillList”: [ 2, 3, 5, 6 ],
“@worksAndLives”: true,
“locationId”: “chn”,
“interestSet”: [“engineering”],
“id”: “person2”,
“@company”: [“company2”]
},
“v_type”: “person”
}
]}]
}

This is an example of a query that uses only POST-ACCUM, counting the number of people with a particular gender.


Global POST-ACCUM Example
#Count the number of persons of a given gender
CREATE QUERY personGender(STRING gender) FOR GRAPH socialNet {
SumAccum<INT> @@genderCount;

start = {ANY};

# Select all person vertices and check the gender attribute
friends = SELECT v FROM start:v
WHERE v.type == "person"

POST-ACCUM CASE WHEN (v.gender == gender) THEN
@@genderCount += 1
END;

PRINT @@genderCount;
}

 


Results for Query personGender
GSQL > RUN QUERY personGender(“Female”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@genderCount”: 3}]
}

HAVING Clause

The optional HAVING clause provides constraints on the result set of the SELECT. The constraints are applied after the ACCUM and POST-ACCUM actions. This differs from the WHERE clause, which is applied before the ACCUM and POST-ACCUM actions.


EBNF for HAVING Clause
havingClause := HAVING condition

A HAVING clause can only be used if there is an ACCUM or POST-ACCUM clause. The condition is applied to each vertex in the SELECT set (either source or target vertices) which also fulfilled the FROM and WHERE conditions. The HAVING clause is intended to test one or more of the accumulator variables that were updated in the ACCUM or POST-ACCUM clause, though the condition may be any expression that evaluates to a boolean value. If the condition is false for a particular vertex, then that vertex is excluded from the result set.
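The difference in timing between WHERE and HAVING can be sketched in Python. This is a simplified model under stated assumptions: the edge list and the helper select_sources are hypothetical, and a plain dictionary stands in for a vertex-attached SumAccum. WHERE decides which edges reach ACCUM at all, whereas HAVING only trims the result set after the accumulators have been updated.

```python
def select_sources(edges, where=None, having=None):
    """Model SELECT s ... WHERE ... ACCUM s.@activity += 1 HAVING ... (sketch)."""
    activity = {}   # stands in for a vertex-attached SumAccum<INT>
    selected = []   # source vertices in first-seen order

    for source, target in edges:
        if where is not None and not where(source):
            continue                       # WHERE: rejected before ACCUM runs
        activity[source] = activity.get(source, 0) + 1   # ACCUM
        if source not in selected:
            selected.append(source)

    if having is not None:                 # HAVING: applied after ACCUM
        selected = [v for v in selected if having(v, activity)]
    return selected, activity


# Hypothetical (person, post) edges.
EDGES = [("a", "p1"), ("a", "p2"), ("a", "p3"), ("b", "p1")]

with_having, activity_having = select_sources(
    EDGES, having=lambda v, acc: acc[v] >= 3)
with_where, activity_where = select_sources(
    EDGES, where=lambda v: v == "a")
print(with_having, activity_having)
print(with_where, activity_where)
```

Both calls return the same result set here, but only the WHERE version leaves vertex b untouched by ACCUM. With HAVING, b is still counted and is merely excluded from the result.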

The following example demonstrates using the HAVING clause to constrain a result set based on the vertex accumulator variable which was updated during the ACCUM clause.


Example 1. HAVING
# find all persons meeting a given activityThreshold, based on how many posts or likes a person has made
CREATE QUERY activeMembers(int activityThreshold) FOR GRAPH socialNet
{
SumAccum<int> @activityAmount;
start = {person.*};
result = SELECT v FROM start:v -(:e)-> post:tgt
ACCUM v.@activityAmount +=1
HAVING v.@activityAmount >= activityThreshold;
PRINT result;
}

If the activityThreshold parameter is set to 3, the query returns 5 vertices:


Example 1 Results

 



GSQL > RUN QUERY activeMembers(3)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“result”: [
{
“v_id”: “person7”,
“attributes”: {
“gender”: “Male”,
“@activityAmount”: 3,
“id”: “person7”
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“gender”: “Female”,
“@activityAmount”: 3,
“id”: “person5”
},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {
“gender”: “Male”,
“@activityAmount”: 3,
“id”: “person6”
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“gender”: “Female”,
“@activityAmount”: 3,
“id”: “person2”
},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {
“gender”: “Male”,
“@activityAmount”: 3,
“id”: “person8”
},
“v_type”: “person”
}
]}]
}

If the activityThreshold parameter is set to 2, the query would return 8 vertices. With activityThreshold = 4, the query would return no vertices.

The following example demonstrates that a SELECT statement whose HAVING condition is always true is equivalent to the same statement without a HAVING clause.


Example 2. HAVING with literal condition
# Count each person's activity (posts or likes); HAVING true filters nothing.
# Both statements below run, so each edge is counted twice in @activityAmount.
CREATE QUERY printMemberActivity() FOR GRAPH socialNet
{
SumAccum<int> @activityAmount;
start = {person.*};

### --- equivalent statements ---
result = SELECT v FROM start:v -(:e)-> post:tgt
ACCUM v.@activityAmount +=1
HAVING true;

result = SELECT v FROM start:v -(:e)-> post:tgt
ACCUM v.@activityAmount +=1;
### ---

PRINT result;
}

 


Results from Query printMemberActivity

 



GSQL > RUN QUERY printMemberActivity()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“result”: [
{
“v_id”: “person4”,
“attributes”: {
“gender”: “Female”,
“@activityAmount”: 4,
“id”: “person4”
},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {
“gender”: “Male”,
“@activityAmount”: 4,
“id”: “person3”
},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {
“gender”: “Male”,
“@activityAmount”: 6,
“id”: “person7”
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“gender”: “Male”,
“@activityAmount”: 4,
“id”: “person1”
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“gender”: “Female”,
“@activityAmount”: 6,
“id”: “person5”
},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {
“gender”: “Male”,
“@activityAmount”: 6,
“id”: “person6”
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“gender”: “Female”,
“@activityAmount”: 6,
“id”: “person2”
},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {
“gender”: “Male”,
“@activityAmount”: 6,
“id”: “person8”
},
“v_type”: “person”
}
]}]
}

The following shows an example of equivalent result sets from using WHERE vs. HAVING. Recall that the WHERE clause is evaluated before the ACCUM and that the HAVING clause is evaluated after the ACCUM. Both constrain the result set based on a condition that vertices must meet.


Example 3. HAVING vs. WHERE
# Compute the total post activity for each male person.
# Because the gender of the vertex does not change, evaluating whether the person vertex
# is male before (WHERE) the ACCUM clause or after (HAVING) the ACCUM clause does not
# change the result. However, if the condition in the HAVING clause could change within
# the ACCUM clause, these statements would produce different results.
CREATE QUERY activeMaleMembers() FOR GRAPH socialNet
{
SumAccum<INT> @activityAmount;
start = {person.*};

### --- statements produce equivalent results
result1 = SELECT v FROM start:v -(:e)-> post:tgt
WHERE v.gender == "Male"
ACCUM v.@activityAmount +=1;

result2 = SELECT v FROM start:v -(:e)-> post:tgt
ACCUM v.@activityAmount +=1
HAVING v.gender == "Male";

PRINT result2[result2.@activityAmount];
PRINT result2[result2.@activityAmount];
}

 


Results from Query ActiveMaleMembers

 



GSQL > RUN QUERY activeMaleMembers()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“result2”: [
{
“v_id”: “person3”,
“attributes”: {“result2.@activityAmount”: 4},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {“result2.@activityAmount”: 6},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {“result2.@activityAmount”: 4},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {“result2.@activityAmount”: 6},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {“result2.@activityAmount”: 6},
“v_type”: “person”
}
]},
{“result2”: [
{
“v_id”: “person3”,
“attributes”: {“result2.@activityAmount”: 4},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {“result2.@activityAmount”: 6},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {“result2.@activityAmount”: 4},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {“result2.@activityAmount”: 6},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {“result2.@activityAmount”: 6},
“v_type”: “person”
}
]}
]
}

 

The following example has a compilation error because the result set is taken from the source vertices, but the HAVING condition is checking the target vertices.


Example 4. HAVING the wrong vertex set
# find all person having a post subject about cats
# This query is illegal because the having condition is testing the wrong vertex set
CREATE QUERY printMemberAboutCats() FOR GRAPH socialNet
{
start = {person.*};

result = SELECT v FROM start:v -(:e)-> post:tgt
HAVING tgt.subject == "cats";
PRINT result;
}

Compilation Error for printMemberAboutCats
> gsql printMemberAboutCats.gsql
Semantic Check Error in query printMemberAboutCats (SEM-50): line 8, col 33
The SELECT block selects src, but the HAVING clause uses tgt

ORDER BY Clause

The optional ORDER BY clause sorts the result set.


EBNF for ORDER BY Clause
orderClause := ORDER BY expr [ASC | DESC] ["," expr [ASC | DESC]]*

ASC specifies ascending order (least value first), and DESC specifies descending order (greatest value first). If neither is specified, then ascending order is used. Each expr must refer to the attributes or accumulators of a member of the result set, and the expr must evaluate to a sortable value (e.g., a number or a string). ORDER BY offers hierarchical sorting by allowing a comma-separated list of expressions, sorting first by the leftmost expr.  It uses the next expression only to sort items where the current sort expr results in identical values. Any items in the result set which cannot be sorted (because the sort expressions do not pertain to them) will appear at the end of the set, after the sorted items.
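Hierarchical sorting of this kind can be illustrated with Python's built-in sorted and a tuple key. The rows below are hypothetical, and negating the numeric fields is simply one way to express DESC. The sketch corresponds to ordering by a first expression in descending order and using a second expression only to break ties.

```python
# Hypothetical result rows: (id, number of friends, number of coworkers).
rows = [
    {"id": "a", "numFriends": 2, "numCoworkers": 5},
    {"id": "b", "numFriends": 3, "numCoworkers": 1},
    {"id": "c", "numFriends": 3, "numCoworkers": 4},
    {"id": "d", "numFriends": 1, "numCoworkers": 9},
]

# Models: ORDER BY v.@numFriends DESC, v.@numCoworkers DESC
# The second key component matters only when the first one ties (b vs. c).
ordered = sorted(rows, key=lambda r: (-r["numFriends"], -r["numCoworkers"]))
ordered_ids = [r["id"] for r in ordered]
print(ordered_ids)
```

Row d has the most coworkers but sorts last, because the coworker count is consulted only among rows with the same number of friends.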

The following example demonstrates the use of ORDER BY with multiple expressions. The returned vertex set is first ordered by the number of friends of the vertex, and then ordered by the number of coworkers of that vertex.


topPopular.gsql: ORDER BY Descending
# find the most popular people, sorting first based on the number of friends
# and then in case of a tie by the number of coworkers
CREATE QUERY topPopular() FOR GRAPH friendNet
{
SumAccum<INT> @numFriends;
SumAccum<INT> @numCoworkers;
start = {person.*};

result = SELECT v FROM start -((friend|coworker):e)-> person:v
ACCUM CASE WHEN e.type == "friend" THEN v.@numFriends += 1
WHEN e.type == "coworker" THEN v.@numCoworkers += 1
END
ORDER BY v.@numFriends DESC, v.@numCoworkers DESC;

PRINT result;
}

 

 


topPopular.json

 



GSQL > RUN QUERY topPopular()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“result”: [
{
“v_id”: “person9”,
“attributes”: {
“@numCoworkers”: 3,
“@numFriends”: 5,
“id”: “person9”
},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {
“@numCoworkers”: 1,
“@numFriends”: 4,
“id”: “person8”
},
“v_type”: “person”
},
{
“v_id”: “person12”,
“attributes”: {
“@numCoworkers”: 1,
“@numFriends”: 4,
“id”: “person12”
},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {
“@numCoworkers”: 4,
“@numFriends”: 3,
“id”: “person6”
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“@numCoworkers”: 3,
“@numFriends”: 3,
“id”: “person1”
},
“v_type”: “person”
},
{
“v_id”: “person4”,
“attributes”: {
“@numCoworkers”: 5,
“@numFriends”: 2,
“id”: “person4”
},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {
“@numCoworkers”: 3,
“@numFriends”: 2,
“id”: “person3”
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“@numCoworkers”: 3,
“@numFriends”: 2,
“id”: “person2”
},
“v_type”: “person”
},
{
“v_id”: “person10”,
“attributes”: {
“@numCoworkers”: 1,
“@numFriends”: 2,
“id”: “person10”
},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {
“@numCoworkers”: 6,
“@numFriends”: 1,
“id”: “person7”
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“@numCoworkers”: 5,
“@numFriends”: 1,
“id”: “person5”
},
“v_type”: “person”
},
{
“v_id”: “person11”,
“attributes”: {
“@numCoworkers”: 1,
“@numFriends”: 1,
“id”: “person11”
},
“v_type”: “person”
}
]}]
}

LIMIT Clause

The optional LIMIT clause sets constraints on the number and ranking of items included in the final result set.


EBNF for LIMIT Clause
limitClause := LIMIT ( expr | expr "," expr | expr OFFSET expr )

Each expr must evaluate to a nonnegative integer. To understand LIMIT, note that the tentative result set is held in the computer as a list of vertices. If the query has an ORDER BY clause, the order is specified; otherwise the list order is unknown. Assume we number the vertices as v1, v2, …, vn. The LIMIT clause specifies a range of vertices, starting from a lower position in the list to an upper position.

There are three forms:


LIMIT scenarios
result = SELECT v FROM S -(:e)-> :v LIMIT k; # case 1: k = Count
result = SELECT v FROM S -(:e)-> :v LIMIT j, k; # case 2: j = Offset from the start of the list, k = Count
result = SELECT v FROM S -(:e)-> :v LIMIT k OFFSET j; # case 3: k = Count, j = Offset from the start of the list

Case 1: LIMIT k

  • When a single expr is provided, LIMIT returns the first k elements from the tentative result set. If there are fewer than k elements available, then all elements will be returned in the result set. If k=5 and the tentative result set has at least 5 items, then the final result list will be [ v1, v2, v3, v4, v5 ].

Case 2: LIMIT j, k

  • When a comma separates two expressions, LIMIT treats the first expression j as an offset. That is, it skips the first j items in the list. The second expr k gives the maximum number of items to include. If the list has at least 7 items, then LIMIT 2, 5 would return [ v3, v4, v5, v6, v7 ].

Case 3: LIMIT k OFFSET j

  • The behavior of Case 3 is the same as that of Case 2, except that the syntax is different. The keyword OFFSET separates the two expressions, and the count comes before the offset, rather than vice versa. If the list has at least 7 items, then LIMIT 5 OFFSET 2 would return [ v3, v4, v5, v6, v7 ].

If any of the expressions evaluates to a negative integer, the results are undefined.

OFFSET is intended for result sets which are in a known order. It is a compile-time error to use OFFSET without the ORDER BY clause.
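The three forms can be modeled as list slicing. The Python sketch below only illustrates the semantics described above; it is not TigerGraph code, and the function name limit and its parameters are hypothetical.

```python
def limit(items, count, offset=0):
    """Model of LIMIT: skip `offset` items, then keep at most `count` items."""
    if count < 0 or offset < 0:
        # The reference leaves negative values undefined; this model rejects them.
        raise ValueError("count and offset must be nonnegative")
    return items[offset:offset + count]


# An ordered tentative result set with at least 7 items.
vertices = ["v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8"]

case1 = limit(vertices, 5)            # LIMIT 5
case2 = limit(vertices, 5, offset=2)  # LIMIT 2, 5
case3 = limit(vertices, 5, offset=2)  # LIMIT 5 OFFSET 2
short = limit(vertices[:3], 5)        # fewer than 5 items available

print(case1)  # ['v1', 'v2', 'v3', 'v4', 'v5']
print(case2)  # ['v3', 'v4', 'v5', 'v6', 'v7']
print(case3)  # ['v3', 'v4', 'v5', 'v6', 'v7']
print(short)  # ['v1', 'v2', 'v3']
```

Cases 2 and 3 differ only in argument order, so both map to the same slice.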

The following examples demonstrate the various forms of the LIMIT clause.

The first example uses the LIMIT clause as an upper bound on the result size. When run with k = 4, it returns a result set with at most 4 elements.


limitEx1.gsql: LIMIT by some number
CREATE QUERY limitEx1(INT k) FOR GRAPH friendNet
{
  start = {person.*};
  result1 = SELECT v FROM start:v
            ORDER BY v.id
            LIMIT k;
  PRINT result1[result1.id]; // api v2
}

 


limit1Ex.json Results

 



GSQL > RUN QUERY limitEx1(4)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“result1”: [
{
“v_id”: “person1”,
“attributes”: {“result1.id”: “person1”},
“v_type”: “person”
},
{
“v_id”: “person10”,
“attributes”: {“result1.id”: “person10”},
“v_type”: “person”
},
{
“v_id”: “person11”,
“attributes”: {“result1.id”: “person11”},
“v_type”: “person”
},
{
“v_id”: “person12”,
“attributes”: {“result1.id”: “person12”},
“v_type”: “person”
}
]}]
}

The following example shows how to use the LIMIT clause with an offset.


limit2Ex.gsql: LIMIT with lower-bound and size
CREATE QUERY limitEx2(INT j, INT k) FOR GRAPH friendNet
{
  start = {person.*};
  result2 = SELECT v FROM start:v
            ORDER BY v.id
            LIMIT j, k;
  PRINT result2[result2.id]; // api v2
}

 


limit2Ex.json Results

 



GSQL > RUN QUERY limitEx2(2,3)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“result2”: [
{
“v_id”: “person11”,
“attributes”: {“result2.id”: “person11”},
“v_type”: “person”
},
{
“v_id”: “person12”,
“attributes”: {“result2.id”: “person12”},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {“result2.id”: “person2”},
“v_type”: “person”
}
]}]
}

The following example shows the alternative syntax for a result size limit with an offset. This time we try larger values for offset and size. In a large data set, limitEx3(5,20) might return 20 vertices, but since the original data has fewer than 25 vertices, the output has fewer than 20 vertices.


limit3Ex.gsql: LIMIT with OFFSET
CREATE QUERY limitEx3(INT j, INT k) FOR GRAPH friendNet
{
  start = {person.*};
  result3 = SELECT v FROM start:v
            ORDER BY v.id
            LIMIT k OFFSET j;
  PRINT result3[result3.id]; // api v2
}

 


limit3Ex.json Results

 



GSQL > RUN QUERY limitEx3(5,20)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“result3”: [
{
“v_id”: “person3”,
“attributes”: {“result3.id”: “person3”},
“v_type”: “person”
},
{
“v_id”: “person4”,
“attributes”: {“result3.id”: “person4”},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {“result3.id”: “person5”},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {“result3.id”: “person6”},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {“result3.id”: “person7”},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {“result3.id”: “person8”},
“v_type”: “person”
},
{
“v_id”: “person9”,
“attributes”: {“result3.id”: “person9”},
“v_type”: “person”
}
]}]
}
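The three outputs above are consistent with string (lexicographic) ordering of the id values, which is why person10 sorts before person2. The Python sketch below reproduces the three result lists under the assumption that friendNet contains exactly the person vertices person1 through person12; that assumption matches the outputs shown but is not taken from the data files. The helper name limit is hypothetical.

```python
def limit(items, count, offset=0):
    """Model of LIMIT: skip `offset` items, then keep at most `count` items."""
    return items[offset:offset + count]


# Assumed vertex ids, ordered the way ORDER BY on a string id would order them.
ids = sorted(f"person{n}" for n in range(1, 13))

ex1 = limit(ids, 4)             # limitEx1(4):    LIMIT 4
ex2 = limit(ids, 3, offset=2)   # limitEx2(2,3):  LIMIT 2, 3
ex3 = limit(ids, 20, offset=5)  # limitEx3(5,20): LIMIT 20 OFFSET 5

print(ex1)  # ['person1', 'person10', 'person11', 'person12']
print(ex2)  # ['person11', 'person12', 'person2']
print(ex3)  # ['person3', 'person4', 'person5', 'person6', 'person7', 'person8', 'person9']
```

With only 12 vertices, skipping 5 leaves 7, so the requested count of 20 is never reached.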

 

 


End of Select Statement Section

Control Flow Statements

The GSQL Query Language includes a comprehensive set of control flow statements to empower sophisticated graph traversal and data computation: IF/ELSE, CASE, WHILE, and FOREACH.

Note that any of these statements can be used as a query-body statement or as a DML-sub level statement.



If the control flow statement is at the query-body level, then its block(s) of statements are query-body statements (queryBodyStmts). In a queryBodyStmts block, each individual statement ends with a semicolon, so there is always a semicolon at the end.

If the control flow statement is at the DML-sub level, then its block(s) of statements are DML-sub statements (DMLSubStmtList). In a DMLSubStmtList block, a comma separates statements, but there is no punctuation at the end.

The “Statement Types” subsection in the chapter on “CREATE / INSTALL / RUN / SHOW / DROP QUERY” has a more detailed general example of the difference between queryBodyStmts and DMLSubStmts.
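The punctuation difference between the two block types can be illustrated with a small Python sketch that renders the same list of statements both ways. The function names and the sample statements are hypothetical and exist only for this illustration.

```python
def render_query_body(statements):
    """queryBodyStmts style: every statement ends with a semicolon."""
    return " ".join(stmt + ";" for stmt in statements)


def render_dml_sub(statements):
    """DMLSubStmtList style: commas between statements, nothing after the last."""
    return ", ".join(statements)


stmts = ["@@a += 1", "@@b += 2"]

print(render_query_body(stmts))  # @@a += 1; @@b += 2;
print(render_dml_sub(stmts))     # @@a += 1, @@b += 2
```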

IF Statement

The IF statement provides conditional branching: execute a block of statements (queryBodyStmts or DMLSubStmtList) only if a given condition is true. The IF statement allows for zero or more ELSE-IF clauses, followed by an optional ELSE clause. The IF statement can be used either at the query-body level or at the DML-sub-statement level. (See the note about differences in block syntax.)


IF syntax
queryBodyIfStmt := IF condition THEN queryBodyStmts [ELSE IF condition THEN queryBodyStmts ]* [ELSE queryBodyStmts ] END
DMLSubIfStmt := IF condition THEN DMLSubStmtList [ELSE IF condition THEN DMLSubStmtList ]* [ELSE DMLSubStmtList ] END

If a particular IF condition is not true, then the flow proceeds to the next ELSE IF condition.  When a true condition is encountered, its corresponding block of statements is executed, and then the IF statement terminates (skipping any remaining ELSE-IF or ELSE clauses). If an ELSE-clause is present, its block of statements is executed if none of the preceding conditions is true. Overall, the functionality can be summarized as “execute the first block of statements whose conditional test is true.”


IF semantics
# if then
IF x == 5 THEN y = 10; END; # y is assigned 10 only if x is 5.

# if then else
IF x == 5 THEN y = 10; # y is 10 only if x is 5.
ELSE y = 20; END; # y is 20 only if x is NOT 5.

# if with ELSE IF
IF x == 5 THEN y = 10; # y is 10 only if x is 5.
ELSE IF x == 7 THEN y = 5; # y is 5 only if x is 7.
ELSE y = 20; END; # y is 20 only if x is NOT 5 and NOT 7.


Example 1. countFriendsOf2.gsql : Simple IF-ELSE at query-body level
# count the number of friends a person has, and optionally include coworkers in that count
CREATE QUERY countFriendsOf2(vertex<person> seed, BOOL includeCoworkers) FOR GRAPH friendNet
{
  SumAccum<INT> @@numFriends = 0;
  start = {seed};

  IF includeCoworkers THEN
    friends = SELECT v FROM start -((friend | coworker):e)-> :v
              ACCUM @@numFriends += 1;
  ELSE
    friends = SELECT v FROM start -(friend:e)-> :v
              ACCUM @@numFriends += 1;
  END;
  PRINT @@numFriends, includeCoworkers;
}

 


Example 1 Results

 



GSQL > RUN QUERY countFriendsOf2(“person2”, true)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{
“@@numFriends”: 5,
“includeCoworkers”: true
}]
}
GSQL > RUN QUERY countFriendsOf2(“person2”, false)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{
“@@numFriends”: 2,
“includeCoworkers”: false
}]
}

 


Example 2. IF-ELSE IF-ELSE at query-body level
# determine if a user is active in terms of social networking (i.e., posts frequently)
CREATE QUERY calculateActivity(vertex<person> seed) FOR GRAPH socialNet
{
  SumAccum<INT> @@numberPosts = 0;
  start = {seed};
  result = SELECT postVertex FROM start -(posted:e)-> :postVertex
           ACCUM @@numberPosts += 1;

  IF @@numberPosts < 2 THEN
    PRINT "Not very active";
  ELSE IF @@numberPosts < 3 THEN
    PRINT "Semi-active";
  ELSE
    PRINT "Very active";
  END;
}

 


Example 2 Results for Query calculateActivity

 



GSQL > RUN QUERY calculateActivity(“person1”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“Not very active”: “Not very active”}]
}
GSQL > RUN QUERY calculateActivity(“person5”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“Semi-active”: “Semi-active”}]
}

 


Example 3. Nested IF at query-body level
# use a more advanced activity calculation, taking into account number of posts
# and number of likes that a user made
CREATE QUERY calculateInDepthActivity(vertex<person> seed) FOR GRAPH socialNet
{
  SumAccum<INT> @@numberPosts = 0;
  SumAccum<INT> @@numberLikes = 0;
  start = {seed};
  result = SELECT postVertex FROM start -(posted:e)-> :postVertex
           ACCUM @@numberPosts += 1;
  result = SELECT likedPost FROM start -(liked:e)-> :likedPost
           ACCUM @@numberLikes += 1;

  IF @@numberPosts < 2 THEN
    IF @@numberLikes < 1 THEN
      PRINT "Not very active";
    ELSE
      PRINT "Semi-active";
    END;
  ELSE IF @@numberPosts < 3 THEN
    IF @@numberLikes < 2 THEN
      PRINT "Semi-active";
    ELSE
      PRINT "Active";
    END;
  ELSE
    PRINT "Very active";
  END;
}

 


Example 3 Results for Query calculateInDepthActivity

 



GSQL > RUN QUERY calculateInDepthActivity(“person1”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“Semi-active”: “Semi-active”}]
}
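The nested branching in calculateInDepthActivity can be read as a small decision table. The Python sketch below mirrors that logic as a plain function so the outcome for any pair of counts can be checked. The function name and the sample inputs are hypothetical; they are not values from the socialNet data.

```python
def classify_activity(number_posts, number_likes):
    """Mirror of the nested IF in calculateInDepthActivity."""
    if number_posts < 2:
        if number_likes < 1:
            return "Not very active"
        else:
            return "Semi-active"
    elif number_posts < 3:          # ELSE IF @@numberPosts < 3
        if number_likes < 2:
            return "Semi-active"
        else:
            return "Active"
    else:
        return "Very active"


# Hypothetical (posts, likes) pairs covering every branch.
for posts, likes in [(0, 0), (1, 1), (2, 1), (2, 2), (3, 0)]:
    print(posts, likes, classify_activity(posts, likes))
```

Only the first branch whose condition holds is taken, so a user with 3 or more posts is "Very active" regardless of likes.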

Example 4. Nested IF at DML-sub level
# give each user post an accumulated rating based on the subject and how many likes it has
# The rating logic is equivalent to that of the query ratePosts (see the CASE Statement section)
CREATE QUERY ratePosts2() FOR GRAPH socialNet {
  SumAccum<INT> @rating = 0;
  allPeople = {person.*};

  results = SELECT v FROM allPeople -(:e)-> post:v
            ACCUM IF e.type == "posted" THEN
                    IF v.subject == "cats" THEN
                      v.@rating += -1 # -1 if post is about cats
                    ELSE IF v.subject == "Graphs" THEN
                      v.@rating += 2 # +2 if post is about graphs
                    ELSE IF v.subject == "tigergraph" THEN
                      v.@rating += 10 # +10 if post is about tigergraph
                    END
                  ELSE IF e.type == "liked" THEN
                    v.@rating += 3 # +3 each time post was liked
                  END
            ORDER BY v.@rating DESC
            LIMIT 5;
  PRINT results;
}

Example 4 Results for Query ratePosts2
GSQL > RUN QUERY ratePosts2()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“results”: [
{
“v_id”: “6”,
“attributes”: {
“postTime”: “2011-02-05 02:02:05”,
“subject”: “tigergraph”,
“@rating”: 13
},
“v_type”: “post”
},
{
“v_id”: “0”,
“attributes”: {
“postTime”: “2010-01-12 11:22:05”,
“subject”: “Graphs”,
“@rating”: 11
},
“v_type”: “post”
},
{
“v_id”: “1”,
“attributes”: {
“postTime”: “2011-03-03 23:02:00”,
“subject”: “tigergraph”,
“@rating”: 10
},
“v_type”: “post”
},
{
“v_id”: “5”,
“attributes”: {
“postTime”: “2011-02-06 01:02:02”,
“subject”: “tigergraph”,
“@rating”: 10
},
“v_type”: “post”
},
{
“v_id”: “4”,
“attributes”: {
“postTime”: “2011-02-07 05:02:51”,
“subject”: “coffee”,
“@rating”: 6
},
“v_type”: “post”
}
]}]
}

 

CASE Statement

The CASE statement provides conditional branching: execute a block of statements only if a given condition is true. CASE statements can be used as query-body statements or DML-sub-statements. (See the note about differences in block syntax.)


CASE syntax
queryBodyCaseStmt := CASE (WHEN condition THEN queryBodyStmts)+ [ELSE queryBodyStmts] END
| CASE expr (WHEN constant THEN queryBodyStmts)+ [ELSE queryBodyStmts] END
DMLSubCaseStmt := CASE (WHEN condition THEN DMLSubStmtList)+ [ELSE DMLSubStmtList] END
| CASE expr (WHEN constant THEN DMLSubStmtList)+ [ELSE DMLSubStmtList] END

One CASE statement contains one or more WHEN-THEN clauses, each WHEN presenting one expression. The CASE statement may also have one ELSE clause whose statements are executed if none of the preceding conditions are true.

There are two syntaxes of the CASE statement: one is equivalent to an if-else statement, and the other is structured like a switch statement. The if-else version evaluates the boolean condition within each WHEN-clause and executes the first block of statements whose condition is true. The optional concluding ELSE-clause is executed only if all WHEN-clause conditions are false.

The switch version evaluates the expression following the keyword WHEN and compares its value to the expression immediately following the keyword CASE. These expressions do not need to be boolean; the CASE statement compares pairs of expressions to see if their values are equal. The first WHEN-THEN clause to have an expression value equal to the CASE expression value is executed; the remaining clauses are skipped. The optional ELSE-clause is executed only if no WHEN-clause expression has a value matching the CASE value.


CASE Semantics

STRING drink = "Juice";

# CASE statement: if-else version
CASE
  WHEN drink == "Juice" THEN @@calories += 50
  WHEN drink == "Soda" THEN @@calories += 120
  ELSE @@calories = 0 # Optional else-clause
END
# Since drink = "Juice", 50 will be added to calories

# CASE statement: switch version
CASE drink
  WHEN "Juice" THEN @@calories += 50
  WHEN "Soda" THEN @@calories += 120
  ELSE @@calories = 0 # Optional else-clause
END
# Since drink = "Juice", 50 will be added to calories
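The two CASE forms differ only in how a branch is matched: the if-else form tests a boolean condition per WHEN, while the switch form compares each WHEN value with the CASE expression. The Python sketch below models both with the drink example. All names in it (case_if_else, case_switch, add, reset, calories) are hypothetical helpers for this illustration, not GSQL features.

```python
def case_if_else(branches, otherwise=None):
    """if-else form: run the action of the first branch whose condition is true."""
    for condition, action in branches:
        if condition():
            action()
            return
    if otherwise is not None:
        otherwise()


def case_switch(value, branches, otherwise=None):
    """switch form: run the action of the first branch whose constant equals value."""
    for constant, action in branches:
        if value == constant:
            action()
            return
    if otherwise is not None:
        otherwise()


calories = {"total": 0}  # stands in for the @@calories accumulator


def add(amount):
    def action():
        calories["total"] += amount
    return action


def reset():
    calories["total"] = 0


drink = "Juice"

case_if_else(
    [(lambda: drink == "Juice", add(50)), (lambda: drink == "Soda", add(120))],
    otherwise=reset,
)
after_if_else = calories["total"]

case_switch(drink, [("Juice", add(50)), ("Soda", add(120))], otherwise=reset)
after_switch = calories["total"]

print(after_if_else, after_switch)  # 50 100
```

In both forms the remaining branches are skipped once one matches, and the ELSE action runs only when nothing matched.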

 


Example 1. CASE as IF-ELSE
# Display the total number of times connected users posted about a certain subject
CREATE QUERY userNetworkPosts (vertex<person> seedUser, STRING subjectName) FOR GRAPH socialNet {
  SumAccum<INT> @@topicSum = 0;
  OrAccum @visited;
  reachableVertices = {}; # empty vertex set
  visitedVertices (ANY) = {seedUser}; # set that can contain ANY type of vertex

  WHILE visitedVertices.size() != 0 DO # loop terminates when all neighbors are visited
    visitedVertices = SELECT s # s is all neighbors of visitedVertices which have not been visited
                      FROM visitedVertices-(:e)->:s
                      WHERE s.@visited == false
                      ACCUM s.@visited = true,
                            CASE
                              WHEN s.type == "post" and s.subject == subjectName THEN @@topicSum += 1
                            END;
  END;
  PRINT @@topicSum;
}

 


Example 1 Results for Query userNetworkPosts

 



GSQL > RUN QUERY userNetworkPosts(“person1”, “Graphs”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@topicSum”: 3}]
}

Example 2. CASE as switch
# tally male and female friends of the starting vertex
CREATE QUERY countGenderOfFriends(vertex<person> seed) FOR GRAPH socialNet {
  SumAccum<INT> @@males = 0;
  SumAccum<INT> @@females = 0;
  SumAccum<INT> @@unknown = 0;
  startingVertex = {seed};

  people = SELECT v FROM startingVertex -(friend:e)->:v
           ACCUM CASE v.gender
                   WHEN "Male" THEN @@males += 1
                   WHEN "Female" THEN @@females += 1
                   ELSE @@unknown += 1
                 END;
  PRINT @@males, @@females, @@unknown;
}

 


Example 2 Results for Query countGenderOfFriends

 



GSQL > RUN QUERY countGenderOfFriends(“person4”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{
“@@males”: 2,
“@@unknown”: 0,
“@@females”: 1
}]
}

Example 3. Multiple CASE statements
# give each social network user a social impact score which accumulates
# based on how many friends and posts they have
CREATE QUERY scoreSocialImpact() FOR GRAPH socialNet api("v2") {
  SumAccum<INT> @socialImpact = 0;
  allPeople = {person.*};
  people = SELECT v FROM allPeople:v
           ACCUM CASE WHEN v.outdegree("friend") > 1 THEN v.@socialImpact += 1 END, # +1 point for having > 1 friend
                 CASE WHEN v.outdegree("friend") > 2 THEN v.@socialImpact += 1 END, # +1 point for having > 2 friends
                 CASE WHEN v.outdegree("posted") > 1 THEN v.@socialImpact += 1 END, # +1 point for having > 1 post
                 CASE WHEN v.outdegree("posted") > 3 THEN v.@socialImpact += 2 END; # +2 points for having > 3 posts
  #PRINT people.@socialImpact; // api v1
  PRINT people[people.@socialImpact]; // api v2
}

 


Example 3 Results for Query scoreSocialImpact

 



GSQL > RUN QUERY scoreSocialImpact()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“people”: [
{
“v_id”: “person4”,
“attributes”: {“people.@socialImpact”: 2},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {“people.@socialImpact”: 1},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {“people.@socialImpact”: 2},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {“people.@socialImpact”: 1},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {“people.@socialImpact”: 2},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {“people.@socialImpact”: 2},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {“people.@socialImpact”: 1},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {“people.@socialImpact”: 3},
“v_type”: “person”
}
]}]
}

 


Example 4. Nested CASE statements
# give each user post a rating based on the subject and how many likes it has
CREATE QUERY ratePosts() FOR GRAPH socialNet api("v2") {
  SumAccum<INT> @rating = 0;
  allPeople = {person.*};

  results = SELECT v FROM allPeople -(:e)-> post:v
            ACCUM CASE e.type
                    WHEN "posted" THEN
                      CASE
                        WHEN v.subject == "cats" THEN v.@rating += -1 # -1 if post about cats
                        WHEN v.subject == "Graphs" THEN v.@rating += 2 # +2 if post about graphs
                        WHEN v.subject == "tigergraph" THEN v.@rating += 10 # +10 if post about tigergraph
                      END
                    WHEN "liked" THEN v.@rating += 3 # +3 each time post was liked
                  END;
  #PRINT results.@rating; // api v1
  PRINT results[results.@rating]; // api v2
}

 


Example 4 Results for Query ratePosts

 



GSQL > RUN QUERY ratePosts()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“results”: [
{
“v_id”: “0”,
“attributes”: {“results.@rating”: 11},
“v_type”: “post”
},
{
“v_id”: “10”,
“attributes”: {“results.@rating”: 2},
“v_type”: “post”
},
{
“v_id”: “2”,
“attributes”: {“results.@rating”: 0},
“v_type”: “post”
},
{
“v_id”: “4”,
“attributes”: {“results.@rating”: 6},
“v_type”: “post”
},
{
“v_id”: “9”,
“attributes”: {“results.@rating”: -1},
“v_type”: “post”
},
{
“v_id”: “3”,
“attributes”: {“results.@rating”: 2},
“v_type”: “post”
},
{
“v_id”: “5”,
“attributes”: {“results.@rating”: 10},
“v_type”: “post”
},
{
“v_id”: “7”,
“attributes”: {“results.@rating”: 2},
“v_type”: “post”
},
{
“v_id”: “1”,
“attributes”: {“results.@rating”: 10},
“v_type”: “post”
},
{
“v_id”: “11”,
“attributes”: {“results.@rating”: -1},
“v_type”: “post”
},
{
“v_id”: “8”,
“attributes”: {“results.@rating”: 2},
“v_type”: “post”
},
{
“v_id”: “6”,
“attributes”: {“results.@rating”: 13},
“v_type”: “post”
}
]}]
}

WHILE Statement

The WHILE statement provides unbounded iteration over a block of statements. WHILE statements can be used as query-body statements or DML-sub-statements. (See the note about differences in block syntax.)


WHILE syntax
queryBodyWhileStmt := WHILE condition [LIMIT (name | integer)] DO queryBodyStmts END
DMLSubWhileStmt := WHILE condition [LIMIT (name | integer)] DO DMLSubStmtList END

The WHILE statement iterates over its body (queryBodyStmts or DMLSubStmtList) until the condition evaluates to false or until the iteration limit is met.  A condition is any expression that evaluates to a boolean.  The condition is evaluated before each iteration. CONTINUE statements can be used to change the control flow within the while block. BREAK statements can be used to exit the while loop.

A WHILE statement may have an optional LIMIT clause.  The LIMIT clause takes a constant positive integer value or an integer variable that caps the number of loop iterations.  The example below demonstrates how LIMIT behaves.


If a limit value is not specified, it is possible for a WHILE loop to iterate infinitely. It is the responsibility of the query author to design the condition logic so that it is guaranteed to eventually become false (or to set a limit).

 


WHILE LIMIT semantics
# These three WHILE statements behave the same. Each terminates when
# (v.size() == 0) or after 5 iterations of the loop.
WHILE v.size() != 0 LIMIT 5 DO
  # Some statements
END;

INT iter = 0;
WHILE (v.size() != 0) AND (iter < 5) DO
  # Some statements
  iter = iter + 1;
END;

INT iter = 0;
WHILE v.size() != 0 DO
  IF iter == 5 THEN BREAK; END;
  # Some statements
  iter = iter + 1;
END;
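The equivalence of the three loops can be checked with a small Python model. In this sketch the loop body simply removes one element from a work list, which stands in for "some statements" that eventually make v.size() reach 0. The function names and that body are assumptions made for illustration.

```python
def loop_with_limit(work, limit):
    """WHILE size != 0 LIMIT limit DO ... END"""
    iterations = 0
    for _ in range(limit):      # LIMIT caps the number of iterations
        if len(work) == 0:      # condition is evaluated before each iteration
            break
        work.pop()              # stands in for "some statements"
        iterations += 1
    return iterations


def loop_with_counter_in_condition(work, limit):
    """WHILE (size != 0) AND (iter < limit) DO ... END"""
    it = 0
    while len(work) != 0 and it < limit:
        work.pop()
        it += 1
    return it


def loop_with_break(work, limit):
    """WHILE size != 0 DO IF iter == limit THEN BREAK; END; ... END"""
    it = 0
    while len(work) != 0:
        if it == limit:
            break
        work.pop()
        it += 1
    return it


for n in (0, 3, 5, 8):
    counts = [
        loop_with_limit(list(range(n)), 5),
        loop_with_counter_in_condition(list(range(n)), 5),
        loop_with_break(list(range(n)), 5),
    ]
    print(n, counts)  # each row holds three equal counts: min(n, 5)
```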

Below are a number of examples that demonstrate the use of WHILE statements.


Example 1. Simple WHILE loop
# find all vertices which are reachable from a starting seed vertex (i.e., breadth-first search)
CREATE QUERY reachable(vertex<person> seed) FOR GRAPH workNet
{
  OrAccum @visited;
  reachableVertices = {}; # empty vertex set
  visitedVertices (ANY) = {seed}; # set that can contain ANY type of vertex

  WHILE visitedVertices.size() != 0 DO # loop terminates when all neighbors are visited
    visitedVertices = SELECT s # s is all neighbors of visitedVertices which have not been visited
                      FROM visitedVertices-(:e)->:s
                      WHERE s.@visited == false
                      POST-ACCUM s.@visited = true;
    reachableVertices = reachableVertices UNION visitedVertices;
  END;
  PRINT reachableVertices;
}

reachable Results
GSQL > RUN QUERY reachable(“person1”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“reachableVertices”: [
{
“v_id”: “person3”,
“attributes”: {
“interestList”: [“teaching”],
“skillSet”: [ 6, 1, 4 ],
“skillList”: [ 4, 1, 6 ],
“locationId”: “jp”,
“interestSet”: [“teaching”],
“@visited”: true,
“id”: “person3”
},
“v_type”: “person”
},
{
“v_id”: “person9”,
“attributes”: {
“interestList”: [ “financial”, “teaching” ],
“skillSet”: [ 2, 7, 4 ],
“skillList”: [ 4, 7, 2 ],
“locationId”: “us”,
“interestSet”: [ “teaching”, “financial” ],
“@visited”: true,
“id”: “person9”
},
“v_type”: “person”
},
{
“v_id”: “person4”,
“attributes”: {
“interestList”: [“football”],
“skillSet”: [ 10, 1, 4 ],
“skillList”: [ 4, 1, 10 ],
“locationId”: “us”,
“interestSet”: [“football”],
“@visited”: true,
“id”: “person4”
},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {
“interestList”: [ “art”, “sport” ],
“skillSet”: [ 6, 8 ],
“skillList”: [ 8, 6 ],
“locationId”: “us”,
“interestSet”: [ “sport”, “art” ],
“@visited”: true,
“id”: “person7”
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“interestList”: [ “management”, “financial” ],
“skillSet”: [ 3, 2, 1 ],
“skillList”: [ 1, 2, 3 ],
“locationId”: “us”,
“interestSet”: [ “financial”, “management” ],
“@visited”: true,
“id”: “person1”
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“interestList”: [ “sport”, “financial”, “engineering” ],
“skillSet”: [ 5, 2, 8 ],
“skillList”: [ 8, 2, 5 ],
“locationId”: “can”,
“interestSet”: [ “engineering”, “financial”, “sport” ],
“@visited”: true,
“id”: “person5”
},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {
“interestList”: [ “music”, “art” ],
“skillSet”: [ 10, 7 ],
“skillList”: [ 7, 10 ],
“locationId”: “jp”,
“interestSet”: [ “art”, “music” ],
“@visited”: true,
“id”: “person6”
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“interestList”: [“engineering”],
“skillSet”: [ 6, 5, 3, 2 ],
“skillList”: [ 2, 3, 5, 6 ],
“locationId”: “chn”,
“interestSet”: [“engineering”],
“@visited”: true,
“id”: “person2”
},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {
“interestList”: [“management”],
“skillSet”: [ 2, 5, 1 ],
“skillList”: [ 1, 5, 2 ],
“locationId”: “chn”,
“interestSet”: [“management”],
“@visited”: true,
“id”: “person8”
},
“v_type”: “person”
},
{
“v_id”: “company3”,
“attributes”: {
“country”: “jp”,
“@visited”: true,
“id”: “company3”
},
“v_type”: “company”
},
{
“v_id”: “company2”,
“attributes”: {
“country”: “chn”,
“@visited”: true,
“id”: “company2”
},
“v_type”: “company”
},
{
“v_id”: “company1”,
“attributes”: {
“country”: “us”,
“@visited”: true,
“id”: “company1”
},
“v_type”: “company”
},
{
“v_id”: “person10”,
“attributes”: {
“interestList”: [ “football”, “sport” ],
“skillSet”: [3],
“skillList”: [3],
“locationId”: “us”,
“interestSet”: [ “sport”, “football” ],
“@visited”: true,
“id”: “person10”
},
“v_type”: “person”
}
]}]
}

 

 


Example 2. WHILE loop using a LIMIT
# find all vertices which are reachable within two hops from a starting seed vertex (i.e., breadth-first search)
CREATE QUERY reachableWithinTwo(vertex<person> seed) FOR GRAPH workNet
{
  OrAccum @visited;
  reachableVertices = {}; # empty vertex set
  visitedVertices (ANY) = {seed}; # set that can contain ANY type of vertex

  WHILE visitedVertices.size() != 0 LIMIT 2 DO # loop terminates when all neighbors within 2-hops of the seed vertex are visited
    visitedVertices = SELECT s # s is all neighbors of visitedVertices which have not been visited
                      FROM visitedVertices-(:e)->:s
                      WHERE s.@visited == false
                      POST-ACCUM s.@visited = true;
    reachableVertices = reachableVertices UNION visitedVertices;
  END;
  PRINT reachableVertices;
}

reachableWithinTwo Results
GSQL > RUN QUERY reachableWithinTwo(“person1”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“reachableVertices”: [
{
“v_id”: “person4”,
“attributes”: {
“interestList”: [“football”],
“skillSet”: [ 10, 1, 4 ],
“skillList”: [ 4, 1, 10 ],
“locationId”: “us”,
“interestSet”: [“football”],
“@visited”: true,
“id”: “person4”
},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {
“interestList”: [“teaching”],
“skillSet”: [ 6, 1, 4 ],
“skillList”: [ 4, 1, 6 ],
“locationId”: “jp”,
“interestSet”: [“teaching”],
“@visited”: true,
“id”: “person3”
},
“v_type”: “person”
},
{
“v_id”: “person9”,
“attributes”: {
“interestList”: [ “financial”, “teaching” ],
“skillSet”: [ 2, 7, 4 ],
“skillList”: [ 4, 7, 2 ],
“locationId”: “us”,
“interestSet”: [ “teaching”, “financial” ],
“@visited”: true,
“id”: “person9”
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“interestList”: [ “sport”, “financial”, “engineering” ],
“skillSet”: [ 5, 2, 8 ],
“skillList”: [ 8, 2, 5 ],
“locationId”: “can”,
“interestSet”: [ “engineering”, “financial”, “sport” ],
“@visited”: true,
“id”: “person5”
},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {
“interestList”: [ “music”, “art” ],
“skillSet”: [ 10, 7 ],
“skillList”: [ 7, 10 ],
“locationId”: “jp”,
“interestSet”: [ “art”, “music” ],
“@visited”: true,
“id”: “person6”
},
“v_type”: “person”
},
{
“v_id”: “person10”,
“attributes”: {
“interestList”: [ “football”, “sport” ],
“skillSet”: [3],
“skillList”: [3],
“locationId”: “us”,
“interestSet”: [ “sport”, “football” ],
“@visited”: true,
“id”: “person10”
},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {
“interestList”: [“management”],
“skillSet”: [ 2, 5, 1 ],
“skillList”: [ 1, 5, 2 ],
“locationId”: “chn”,
“interestSet”: [“management”],
“@visited”: true,
“id”: “person8”
},
“v_type”: “person”
},
{
“v_id”: “company1”,
“attributes”: {
“country”: “us”,
“@visited”: true,
“id”: “company1”
},
“v_type”: “company”
},
{
“v_id”: “person2”,
“attributes”: {
“interestList”: [“engineering”],
“skillSet”: [ 6, 5, 3, 2 ],
“skillList”: [ 2, 3, 5, 6 ],
“locationId”: “chn”,
“interestSet”: [“engineering”],
“@visited”: true,
“id”: “person2”
},
“v_type”: “person”
},
{
“v_id”: “company2”,
“attributes”: {
“country”: “chn”,
“@visited”: true,
“id”: “company2”
},
“v_type”: “company”
},
{
“v_id”: “person7”,
“attributes”: {
“interestList”: [ “art”, “sport” ],
“skillSet”: [ 6, 8 ],
“skillList”: [ 8, 6 ],
“locationId”: “us”,
“interestSet”: [ “sport”, “art” ],
“@visited”: true,
“id”: “person7”
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“interestList”: [ “management”, “financial” ],
“skillSet”: [ 3, 2, 1 ],
“skillList”: [ 1, 2, 3 ],
“locationId”: “us”,
“interestSet”: [ “financial”, “management” ],
“@visited”: true,
“id”: “person1”
},
“v_type”: “person”
}
]}]
}
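Both queries follow the same frontier-expansion pattern; the LIMIT only caps the number of hops. The Python sketch below models that pattern on a small made-up graph (the adjacency map and vertex names are hypothetical, not the workNet data). As in the queries, the seed is not marked visited at the start, so it appears in the result once an edge leads back to it.

```python
def reachable(graph, seed, max_hops=None):
    """Frontier expansion modeled on the reachable / reachableWithinTwo queries."""
    visited = set()    # models the @visited accumulator (false everywhere at start)
    reached = set()    # models reachableVertices
    frontier = {seed}  # models visitedVertices
    hops = 0
    # WHILE frontier is not empty [LIMIT max_hops] DO
    while frontier and (max_hops is None or hops < max_hops):
        # SELECT s FROM frontier -(:e)-> :s WHERE s.@visited == false
        frontier = {t for s in frontier for t in graph.get(s, ()) if t not in visited}
        visited |= frontier   # POST-ACCUM s.@visited = true
        reached |= frontier   # reachableVertices UNION visitedVertices
        hops += 1
    return reached


# Hypothetical graph: people p1..p3 linked through companies c1, c2.
graph = {
    "p1": ["c1"],
    "c1": ["p1", "p2"],
    "p2": ["c1", "c2"],
    "c2": ["p2", "p3"],
    "p3": ["c2"],
}

print(sorted(reachable(graph, "p1")))              # ['c1', 'c2', 'p1', 'p2', 'p3']
print(sorted(reachable(graph, "p1", max_hops=2)))  # ['c1', 'p1', 'p2']
```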

 

FOREACH Statement

The FOREACH statement provides bounded iteration over a block of statements. FOREACH statements can be used as query-body statements or DML-sub-statements. (See the note about differences in block syntax.)


FOREACH syntax
queryBodyForEachStmt := FOREACH forEachControl DO queryBodyStmts END
DMLSubForEachStmt := FOREACH forEachControl DO DMLSubStmtList END
forEachControl := (name | "(" name ["," name]+ ")") IN setBagExpr
| name IN RANGE "[" expr "," expr "]" [".STEP(" expr ")"]

The formal syntax for forEachControl appears complex.  It can be broken down into the following cases:

  • name IN setBagExpr
  • tuple IN setBagExpr
  • name IN RANGE [ expr, expr ]
  • name IN RANGE [ expr, expr ].STEP ( expr )

Note that setBagExpr includes container accumulators and explicit sets.


The FOREACH statement has the following restrictions:

  • In a DML-sub level FOREACH, it is never permissible to update the loop variable (the variable declared before IN, e.g., var in “FOREACH var IN setBagExpr”).
  • In a query-body level FOREACH, in most cases it is not permissible to update the loop variable.  The following exceptions apply:
    • If the iteration is over a ListAccum, its values can be updated.
    • If the iteration is over a MapAccum, its values can be updated, but its keys cannot.
  • If the iteration is over a set of vertices, it is not permissible to access (read or write) their vertex-attached accumulators.

  • A query-body-level FOREACH cannot iterate over a set or bag of constants. For example, FOREACH i in (1,2,3) is not supported. However, DML-sub FOREACH does support this.

FOREACH … IN RANGE

The FOREACH statement has an optional RANGE clause RANGE[expr, expr], which can be used to define the iteration collection. Optionally, the range may specify a step size:

RANGE[expr, expr].STEP(expr)

Each expr must evaluate to an integer. Any of the integers may be negative, but the step expr may not be 0.

The clause RANGE[a,b].STEP(c)  produces the sequence of integers from a to b, inclusive, with step size c.  That is,

(a, a+c, a+2*c, a+3*c, … a+k*c), where k = the largest integer such that |k*c| ≤ |b-a|.

If the .STEP method is not given, then the step size c = 1.


Nested FOREACH IN RANGE with ListAccum
CREATE QUERY foreachRangeEx() FOR GRAPH socialNet {
  ListAccum<INT> @@t;
  Start = {person.*};
  FOREACH i IN RANGE[0, 2] DO
    @@t += i;
    L = SELECT Start
        FROM Start
        WHERE Start.id == "person1"
        ACCUM
          FOREACH j IN RANGE[0, i] DO
            @@t += j
          END
        ;
  END;
  PRINT @@t;
}

Results for Query foreachRangeEx

 



GSQL > RUN QUERY foreachRangeEx()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@t”: [ 0, 0, 1, 0, 1, 2, 0, 1, 2 ]}]
}

FOREACH IN RANGE with step
CREATE QUERY foreachRangeStep(INT a, INT b, INT c) FOR GRAPH minimalNet {
  ListAccum<INT> @@t;
  FOREACH i IN RANGE[a,b].STEP(c) DO
    @@t += i;
  END;
  PRINT @@t;
}

The step value can be positive for an ascending range or negative for a descending range.  If the step has the wrong sign for the direction from a to b, then the loop has zero iterations; that is, the exit condition is already satisfied.


foreachRangeStep.json Results

 



GSQL > RUN QUERY foreachRangeStep(100,0,-9)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@t”: [
100,
91,
82,
73,
64,
55,
46,
37,
28,
19,
10,
1
]}]
}
GSQL > RUN QUERY foreachRangeStep(-100,100,-9)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@t”: []}]
}
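The RANGE rule described above (values a, a+c, …, a+k*c with the largest k such that |k*c| ≤ |b-a|, and no iterations when the step has the wrong sign) can be sketched in Python. The function name gsql_range is hypothetical. The second part replays the nested loops of foreachRangeEx, assuming the ACCUM clause runs once because exactly one vertex matches the WHERE condition.

```python
def gsql_range(a, b, c=1):
    """Model of RANGE[a, b].STEP(c): inclusive bounds, default step 1."""
    if c == 0:
        raise ValueError("the step may not be 0")
    if (b - a) * c < 0:
        return []                 # wrong sign: zero iterations
    k = abs(b - a) // abs(c)      # largest k with |k*c| <= |b-a|
    return [a + i * c for i in range(k + 1)]


descending = gsql_range(100, 0, -9)
wrong_sign = gsql_range(-100, 100, -9)
print(descending)  # [100, 91, 82, 73, 64, 55, 46, 37, 28, 19, 10, 1]
print(wrong_sign)  # []

# Replay of foreachRangeEx: outer loop at query-body level, inner loop in ACCUM.
t = []
for i in gsql_range(0, 2):
    t.append(i)
    for j in gsql_range(0, i):    # assumed to run once, for the single matching vertex
        t.append(j)
print(t)  # [0, 0, 1, 0, 1, 2, 0, 1, 2]
```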

 

Query-body-level FOREACH Examples


Example 1 – FOREACH with ListAccum
# Count the number of companies whose country matches the provided string
CREATE QUERY companyCount(STRING countryName) FOR GRAPH workNet {
  ListAccum<STRING> @@companyList;
  INT countryCount;
  start = {ANY}; # start will have a set of all vertex types

  s = SELECT v FROM start:v # get all vertices
      WHERE v.type == "company" # that have a type of "company"
      ACCUM @@companyList += v.country; # append the country attribute from all company vertices to the ListAccum

  # Iterate the ListAccum and compare each element to the countryName parameter
  FOREACH item in @@companyList DO
    IF item == countryName THEN
      countryCount = countryCount + 1;
    END;
  END;
  PRINT countryCount;
}

 


companyCount Results

 



GSQL > RUN QUERY companyCount(“us”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“countryCount”: 2}]
}
GSQL > RUN QUERY companyCount(“can”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“countryCount”: 1}]
}

 


Example 2 – FOREACH with a seed set
# Find all company employees who live in a given country
CREATE QUERY employeesByCompany(STRING country) FOR GRAPH workNet {
  ListAccum<VERTEX<company>> @@companyList;
  start = {ANY};

  # Build a list of all company vertices
  # (these are vertex IDs only)
  s = SELECT v FROM start:v
      WHERE v.type == "company"
      ACCUM @@companyList += v;

  # Use the vertex IDs as Seeds for vertex sets
  FOREACH item IN @@companyList DO
    companyItem = {item};
    employees = SELECT t FROM companyItem -(worksFor)-> :t
                WHERE (t.locationId == country);
    PRINT employees;
  END;
}


employeesByCompany Results
GSQL > RUN QUERY employeesByCompany(“us”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [ {“employees”: []},
{“employees”: []},
{“employees”: [
{
“v_id”: “person9”,
“attributes”: {
“interestList”: [
“financial”,
“teaching”
],
“skillSet”: [ 2, 7, 4 ],
“skillList”: [ 4, 7, 2 ],
“locationId”: “us”,
“interestSet”: [ “teaching”, “financial” ],
“id”: “person9”
},
“v_type”: “person”
},
{
“v_id”: “person10”,
“attributes”: {
“interestList”: [ “football”, “sport” ],
“skillSet”: [3],
“skillList”: [3],
“locationId”: “us”,
“interestSet”: [ “sport”, “football” ],
“id”: “person10”
},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {
“interestList”: [ “art”, “sport” ],
“skillSet”: [ 6, 8 ],
“skillList”: [ 8, 6 ],
“locationId”: “us”,
“interestSet”: [ “sport”, “art” ],
“id”: “person7”
},
“v_type”: “person”
}
]},
{“employees”: [
{
“v_id”: “person4”,
“attributes”: {
“interestList”: [“football”],
“skillSet”: [ 10, 1, 4 ],
“skillList”: [ 4, 1, 10 ],
“locationId”: “us”,
“interestSet”: [“football”],
“id”: “person4”
},
“v_type”: “person”
},
{
“v_id”: “person9”,
“attributes”: {
“interestList”: [ “financial”, “teaching” ],
“skillSet”: [ 2, 7, 4 ],
“skillList”: [ 4, 7, 2 ],
“locationId”: “us”,
“interestSet”: [ “teaching”, “financial” ],
“id”: “person9”
},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {
“interestList”: [ “art”, “sport” ],
“skillSet”: [ 6, 8 ],
“skillList”: [ 8, 6 ],
“locationId”: “us”,
“interestSet”: [ “sport”, “art” ],
“id”: “person7”
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“interestList”: [ “management”, “financial” ],
“skillSet”: [ 3, 2, 1 ],
“skillList”: [ 1, 2, 3 ],
“locationId”: “us”,
“interestSet”: [ “financial”, “management” ],
“id”: “person1”
},
“v_type”: “person”
}
]},
{“employees”: [
{
“v_id”: “person10”,
“attributes”: {
“interestList”: [
“football”,
“sport”
],
“skillSet”: [3],
“skillList”: [3],
“locationId”: “us”,
“interestSet”: [ “sport”, “football” ],
“id”: “person10”
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“interestList”: [ “management”, “financial” ],
“skillSet”: [ 3, 2, 1 ],
“skillList”: [ 1, 2, 3 ],
“locationId”: “us”,
“interestSet”: [ “financial”, “management” ],
“id”: “person1”
},
“v_type”: “person”
}
]}
]
}

 

 


Example 3 – Nested FOREACH with MapAccum
# Count the number of employees from a given country and list their ids
CREATE QUERY employeeByCountry(STRING countryName) FOR GRAPH workNet {
MapAccum <STRING, ListAccum<STRING>> @@employees;

# start will have a set of all person type vertices
start = {person.*};

# Build a map using person locationId as a key and a list of strings to hold multiple person ids
s = SELECT v FROM start:v
ACCUM @@employees += (v.locationId -> v.id);

# Iterate the map using (key,value) pairs
FOREACH (key,val) in @@employees DO
IF key == countryName THEN
PRINT val.size();

# Nested foreach to iterate over the list of person ids
FOREACH employee in val DO
PRINT employee;
END;

# MapAccum keys are unique so we can BREAK out of the loop
BREAK;
END;
END;
}

 


employeeByCountry Results

 



GSQL > RUN QUERY employeeByCountry(“us”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“val.size()”: 5},
{“employee”: “person4”},
{“employee”: “person10”},
{“employee”: “person7”},
{“employee”: “person1”},
{“employee”: “person9”}
]
}

DML-sub FOREACH Examples


ACCUM FOREACH
# Show post topics liked by users and show total likes per topic
CREATE QUERY topicLikes() FOR GRAPH socialNet {
SetAccum<STRING> @@personPosts;
SumAccum<INT> @postLikes;
MapAccum<STRING,INT> @@likesByTopic;

start = {person.*};

# Find all user posts and generate a set of post topics
# (set has no duplicates)
posts = SELECT g FROM start -(posted)-> :g
ACCUM @@personPosts += g.subject;

# Use the set of topics to count how many times a specific
# post is liked by other users
likedPosts = SELECT f FROM start -(liked)-> :f
ACCUM FOREACH x in @@personPosts DO
CASE WHEN (f.subject == x) THEN
f.@postLikes += 1
END
END
# Aggregate all liked totals by topic
POST-ACCUM @@likesByTopic += (f.subject -> f.@postLikes);

# Display the number of likes per topic
PRINT @@likesByTopic;
}

 


Results for Query topicLikes

 



GSQL > RUN QUERY topicLikes()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@likesByTopic”: {
“cats”: 3,
“coffee”: 2,
“Graphs”: 3,
“tigergraph”: 1
}}]
}

 


Example 1 – POST-ACCUM FOREACH
#Show a summary of the number of friends all persons have by gender
CREATE QUERY friendGender() FOR GRAPH socialNet {
ListAccum<STRING> @friendGender;
SumAccum<INT> @@maleGenderCount;
SumAccum<INT> @@femaleGenderCount;

start = {person.*};

# Record a list showing each friend's gender
socialMembers = SELECT s from start:s -(friend)-> :g
ACCUM s.@friendGender += (g.gender)

# Loop over each list of genders and total them
POST-ACCUM FOREACH x in s.@friendGender DO
CASE WHEN (x == “Male”) THEN
@@maleGenderCount += 1
ELSE
@@femaleGenderCount += 1
END
END;

PRINT @@maleGenderCount;
PRINT @@femaleGenderCount;
}

 


Results for Query friendGender

 



GSQL > RUN QUERY friendGender()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“@@maleGenderCount”: 11},
{“@@femaleGenderCount”: 7}
]
}

 

CONTINUE and BREAK Statements

The CONTINUE and BREAK statements can only be used within a block of a WHILE or FOREACH statement.  The CONTINUE statement branches control flow to the end of the loop, skipping any remaining statements in the current iteration, and proceeding to the next iteration. That is, everything in the loop block after the CONTINUE statement will be skipped, and then the loop will continue as normal.

The BREAK statement branches control flow out of the loop, i.e., it will exit the loop and stop iteration.

Below are a number of examples that demonstrate the use of BREAK and CONTINUE.


Continue and Break Semantics
# While with a continue
INT i = 0;
INT nCount = 0;
WHILE i < 10 DO
i = i + 1;
IF (i % 2 == 0) THEN CONTINUE; END;
nCount = nCount + 1;
END;
# i is 10, nCount is 5 (skips the increment for every even i).

# While with a break
i = 0;
WHILE i < 10 DO
IF (i == 5) THEN BREAK; END; # When i is 5 the loop is exited
i = i + 1;
END;
# i is now 5
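The two loops above can be traced outside GSQL. The following is a minimal Python sketch, not GSQL code: it assumes that Python's `continue` and `break` behave like the CONTINUE and BREAK statements described here, and the function names are made up for illustration.

```python
def while_with_continue():
    """Mirror of the first loop: skip the counter update for every even i."""
    i = 0
    n_count = 0
    while i < 10:
        i += 1
        if i % 2 == 0:
            continue  # jump to the next iteration; n_count is not incremented
        n_count += 1
    return i, n_count


def while_with_break():
    """Mirror of the second loop: leave the loop as soon as i reaches 5."""
    i = 0
    while i < 10:
        if i == 5:
            break  # exit the loop; no further iterations run
        i += 1
    return i


print(while_with_continue())  # (10, 5)
print(while_with_break())     # 5
```

The first loop still runs all ten iterations; only the statements after the skip are bypassed. The second loop stops iterating altogether.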

 


Example 1. Break
# find posts of a given person, and post of friends of that person, friends of friends, etc
# until a post about cats is found. The number of friend-hops to reach is the ‘degree’ of cats
CREATE QUERY findDegreeOfCats(vertex<person> seed) FOR GRAPH socialNet
{
SumAccum<INT> @@degree = 0;
OrAccum @@foundCatPost = false;
OrAccum @visited = false;

friends (ANY) = {seed};
WHILE @@foundCatPost != true AND friends.size() > 0 DO
posts = SELECT v FROM friends-(posted:e)->:v
ACCUM CASE WHEN v.subject == "cats" THEN @@foundCatPost += true END;

IF @@foundCatPost THEN
BREAK;
END;

friends = SELECT v FROM friends-(friend:e)->:v
WHERE v.@visited == false
ACCUM v.@visited = true;
@@degree += 1;
END;
PRINT @@degree;
}

 


Results of Query findDegreeOfCats

 



GSQL > RUN QUERY findDegreeOfCats(“person2”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@degree”: 2}]
}
GSQL > RUN QUERY findDegreeOfCats(“person4”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@degree”: 0}]
}

 


Example 2. findEnoughFriends.gsql: While loop using continue statement
# find all 3-hop friends of a starting vertex. count coworkers as friends
# if there are not enough friends
CREATE QUERY findEnoughFriends(vertex<person> seed) FOR GRAPH friendNet
{
SumAccum<INT> @@distance = 0; # keep track of the distance from the seed
OrAccum @visited = false;
visitedVertices = {seed};
WHILE true LIMIT 3 DO
@@distance += 1;
# traverse from visitedVertices to its friends
friends = SELECT v
FROM visitedVertices -(friend:e)-> :v
WHERE v.@visited == false
POST-ACCUM v.@visited = true;
PRINT @@distance, friends;

# if the set we traversed from (visitedVertices) has at least 2 vertices, finish this iteration
IF visitedVertices.size() >= 2 THEN
visitedVertices = friends;
CONTINUE;
END;
# otherwise (fewer than 2 vertices), add in coworkers
coworkers = SELECT v
FROM visitedVertices -(coworker:e)-> :v
WHERE v.@visited == false
POST-ACCUM v.@visited = true;
visitedVertices = friends UNION coworkers;
PRINT @@distance, coworkers;
END;
}

 


findEnoughFriends.json Example 2 Results

 



GSQL > RUN QUERY findEnoughFriends(“person1”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“@@distance”: 1,
“friends”: [
{
“v_id”: “person4”,
“attributes”: {
“@visited”: true,
“id”: “person4”
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“@visited”: true,
“id”: “person2”
},
“v_type”: “person”
},
{
“v_id”: “person3”,
“attributes”: {
“@visited”: true,
“id”: “person3”
},
“v_type”: “person”
}
]
},
{
“coworkers”: [
{
“v_id”: “person5”,
“attributes”: {
“@visited”: true,
“id”: “person5”
},
“v_type”: “person”
},
{
“v_id”: “person6”,
“attributes”: {
“@visited”: true,
“id”: “person6”
},
“v_type”: “person”
}
],
“@@distance”: 1
},
{
“@@distance”: 2,
“friends”: [
{
“v_id”: “person9”,
“attributes”: {
“@visited”: true,
“id”: “person9”
},
“v_type”: “person”
},
{
“v_id”: “person1”,
“attributes”: {
“@visited”: true,
“id”: “person1”
},
“v_type”: “person”
},
{
“v_id”: “person8”,
“attributes”: {
“@visited”: true,
“id”: “person8”
},
“v_type”: “person”
}
]
},
{
“@@distance”: 3,
“friends”: [
{
“v_id”: “person12”,
“attributes”: {
“@visited”: true,
“id”: “person12”
},
“v_type”: “person”
},
{
“v_id”: “person10”,
“attributes”: {
“@visited”: true,
“id”: “person10”
},
“v_type”: “person”
},
{
“v_id”: “person7”,
“attributes”: {
“@visited”: true,
“id”: “person7”
},
“v_type”: “person”
}
]
}
]
}

 


Example 3. While loop using break statement
# find at least the top-k companies closest to a given seed vertex, if they exist
CREATE QUERY topkCompanies(vertex<person> seed, INT k) FOR GRAPH workNet
{
SetAccum<vertex<company>> @@companyList;
OrAccum @visited = false;
visitedVertices (ANY) = {seed};
WHILE true DO
visitedVertices = SELECT v # traverse from x to its unvisited neighbors
FROM visitedVertices -(:e)-> :v
WHERE v.@visited == false
ACCUM CASE
WHEN (v.type == “company”) THEN # count the number of company vertices encountered
@@companyList += v
END
POST-ACCUM v.@visited += true; # mark vertices as visited

# exit loop when at least k companies have been counted
IF @@companyList.size() >= k OR visitedVertices.size() == 0 THEN
BREAK;
END;
END;
PRINT @@companyList;
}

 


Example 3. topkCompanies Results

 



GSQL > RUN QUERY topkCompanies(“person1”, 2)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@companyList”: [
“company2”,
“company1”
]}]
}
GSQL > RUN QUERY topkCompanies(“person2”, 3)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@companyList”: [
“company3”,
“company2”,
“company1”
]}]
}

 


Example 4 – Usage of CONTINUE in FOREACH
#List out all companies from a given country
CREATE QUERY companyByCountry(STRING countryName) FOR GRAPH workNet {
MapAccum <STRING, ListAccum<STRING>> @@companies;
start = {company.*}; # start will have a set of all company type vertices

# Build a map using company country as a key and a list of strings to hold multiple company ids
s = SELECT v FROM start:v
ACCUM @@companies += (v.country -> v.id);

# Iterate the map using (key,value) pairs
FOREACH (key,val) IN @@companies DO
IF key != countryName THEN
CONTINUE;
END;

PRINT val.size();

#Nested foreach to iterate over the list of company ids
FOREACH comp IN val DO
PRINT comp;
END;
END;
}

 


companyByCountry Results

 



GSQL > RUN QUERY companyByCountry(“us”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“val.size()”: 2},
{“comp”: “company1”},
{“comp”: “company4”}
]
}

 


Example 5 – Usage of BREAK in FOREACH
#List all the persons located in the specified country
CREATE QUERY employmentByCountry(STRING countryName) FOR GRAPH workNet {
MapAccum < STRING, ListAccum<STRING> > @@employees;
start = {person.*}; # start will have a set of all person type vertices

# Build a map using person locationId as a key and a list of strings to hold multiple person ids
s = SELECT v FROM start:v
ACCUM @@employees += (v.locationId -> v.id);

# Iterate the map using (key,value) pairs
FOREACH (key,val) IN @@employees DO
IF key == countryName THEN
PRINT val.size();

#Nested foreach to iterate over the list of person ids
FOREACH employee IN val DO
PRINT employee;
END;

BREAK;
END;
END;
}

 


employmentByCountry Result

 



GSQL > RUN QUERY employmentByCountry(“us”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“val.size()”: 5},
{“employee”: “person1”},
{“employee”: “person4”},
{“employee”: “person7”},
{“employee”: “person9”},
{“employee”: “person10”}
]
}

Data Modification Statements

 

The GSQL language provides full support for vertex and edge insertion, deletion, and attribute update. Therefore, the language is more than just a “query” language.


Each query is considered one transaction. Therefore, modifications to the graph data do not take effect until the entire query is completed (committed). Accordingly, a modification statement has no effect on any other statement inside the same query.
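The commit rule can be pictured with a toy model. The Python sketch below is only an illustration of the behavior described above, not of how TigerGraph is implemented; the class `QueryTransaction`, its methods, and the two sample vertices are hypothetical.

```python
class QueryTransaction:
    """Toy model: reads see committed data; writes are buffered until commit."""

    def __init__(self, graph):
        self.graph = graph          # committed data: vertex id -> attribute dict
        self.pending_updates = {}   # vertex id -> attributes to change at commit
        self.pending_deletes = set()

    def read(self, vertex_id):
        # Reads ignore pending changes made earlier in the same query.
        return self.graph.get(vertex_id)

    def update(self, vertex_id, **attrs):
        self.pending_updates.setdefault(vertex_id, {}).update(attrs)

    def delete(self, vertex_id):
        self.pending_deletes.add(vertex_id)

    def commit(self):
        # All buffered changes become visible together when the query ends.
        for vertex_id, attrs in self.pending_updates.items():
            if vertex_id in self.graph:
                self.graph[vertex_id].update(attrs)
        for vertex_id in self.pending_deletes:
            self.graph.pop(vertex_id, None)


# Hypothetical sample data, not taken from the example graphs.
graph = {"v1": {"locationId": "us"}, "v2": {"locationId": "us"}}
txn = QueryTransaction(graph)
txn.update("v1", locationId="USA")
txn.delete("v2")

during = (txn.read("v1")["locationId"], txn.read("v2") is not None)
txn.commit()
after = (graph["v1"]["locationId"], "v2" in graph)

print(during)  # ('us', True)
print(after)   # ('USA', False)
```

This is why the examples in this section use a second query to observe the effect of a modification.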

Query-body DELETE Statement

The query-body DELETE statement deletes a given set of edges or vertices. This statement can only be used as a query-body statement. (Deletion at the DML-sub level is served by the DML-sub DELETE statement, described next.)


EBNF
QueryBodyDeleteStmt := DELETE name FROM ( edgeSet | vertexSet ) [whereClause]

The vertexSet and edgeSet terms in the FROM clause follow the same rules as those in the FROM clause in a SELECT statement. The WHERE clause can filter the items in the vertexSet or edgeSet.

Below are two examples, one for deleting vertices and one for deleting edges.


DELETE statement example
# Delete all “person” vertices with location equal to “us”
CREATE QUERY deleteEx() FOR GRAPH workNet {
S = {person.*};
DELETE s FROM S:s
WHERE s.locationId == “us”;
}

DELETE statement example 2
# Delete all “worksFor” edges where the person’s location is “us”
CREATE QUERY deleteEx2() FOR GRAPH workNet {
S = {person.*};
DELETE e FROM S:s -(worksFor:e)-> company:t
WHERE s.locationId == “us”;
}

The following query, countAtLocation, can be used to observe the effect of the delete statements. It counts the person vertices located in a given location and the worksFor edges of those persons.  When the initial workNet test data is loaded, there are 5 persons and 9 worksFor edges for locationId = “us”.  If query deleteEx2 is run, countAtLocation(“us”) will then find the 5 persons but 0 worksFor edges.  Next, if the deleteEx query is run, countAtLocation(“us”) will then find 0 persons and 0 worksFor edges.


Query to check the results of deleteEx and deleteEx2
CREATE QUERY countAtLocation(STRING loc) FOR GRAPH workNet {
SetAccum<EDGE> @@selEdge;
Start = {person.*};

SV = SELECT s FROM Start:s
WHERE s.locationId == loc;
PRINT SV.size() AS numVertices;

SE = SELECT s FROM Start:s -(worksFor:e)-> company:t
WHERE s.locationId == loc
ACCUM @@selEdge += e;
PRINT @@selEdge.size() AS numEdges;
}

 

For example, the following sequence of countAtLocation, deleteEx2, and deleteEx queries


deleteEx.run
RUN QUERY countAtLocation(“us”)
RUN QUERY deleteEx2()
RUN QUERY countAtLocation(“us”)
RUN QUERY deleteEx()
RUN QUERY countAtLocation(“us”)

will produce the following result:


Results from DeleteEx Example

 



# Before any deletions
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“numVertices”: 5},
{“numEdges”: 9}
]
}
# Delete edges
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: []
}
# After deleting edges
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“numVertices”: 5},
{“numEdges”: 0}
]
}
# Delete vertices
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: []
}
# After deleting vertices
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“numVertices”: 0},
{“numEdges”: 0}
]
}

 

DML-sub DELETE Statement

DML-sub DELETE is a DML-substatement which deletes one vertex or edge each time it is called.  (Deletion at the query-body level is served by the Query-body DELETE statement described above.) In practice, this statement resides within the body of a SELECT…ACCUM/POST-ACCUM clause, so it is called once for each member of a selected vertex set or edge set.


The ACCUM clause iterates over an edge set, which can encounter the same vertex multiple times. If you wish to delete a vertex, it is best practice to place the DML-sub DELETE statement in the POST-ACCUM clause rather than in the ACCUM clause.
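The difference in how often the statement runs can be sketched by counting calls. This is a minimal Python illustration, assuming a hypothetical list of matched (source, target) edges; it models only the once-per-edge versus once-per-vertex behavior described above.

```python
from collections import Counter

# Hypothetical edges matched by a FROM clause: (source vertex, target vertex).
matched_edges = [("a", "x"), ("b", "x"), ("b", "y")]

# ACCUM runs once per matched edge, so a target vertex with two
# incoming matched edges would be passed to DELETE twice.
accum_delete_calls = Counter(target for _, target in matched_edges)

# POST-ACCUM runs once per distinct selected vertex.
post_accum_delete_calls = Counter({target for _, target in matched_edges})

print(dict(accum_delete_calls))       # {'x': 2, 'y': 1}
print(dict(post_accum_delete_calls))  # x and y are each counted once
```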

 


EBNF
DMLSubDeleteStmt := DELETE “(” name “)”

The following example uses and modifies the graph data for socialNet.


DELETE within ACCUM vs. POST-ACCUM
# Remove any post vertices posted by the given user
CREATE QUERY deletePosts(vertex<person> seed) FOR GRAPH socialNet {
start = {seed};

# Best practice is to delete a vertex in a POST-ACCUM, which only
# occurs once for each vertex v, guaranteeing that a vertex is not
# deleted more than once
postAccumDeletedPosts = SELECT v FROM start -(posted:e)-> post:v
POST-ACCUM DELETE (v);

# Possible, but not recommended as the DML-sub DELETE statement occurs
# once for each edge of the vertex v
accumDeletedPosts = SELECT v FROM start -(posted:e)-> post:v
ACCUM DELETE (v);
}

# Need a separate query to display the results, because deletions don’t take effect during the query.
CREATE QUERY selectUserPosts(vertex<person> seed) FOR GRAPH socialNet {
start = {seed};

selectedPosts = SELECT v FROM start -(posted:e)-> post:v;
PRINT selectedPosts;
}

For example, the following sequence of selectUserPosts and deletePosts queries


deletePosts.run
RUN QUERY selectUserPosts(“person3”)
RUN QUERY deletePosts(“person3”)
RUN QUERY selectUserPosts(“person3”)

will produce the following result:


Results from DeletePosts Example

 



# Before the deletion
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“selectedPosts”: [{
“v_id”: “2”,
“attributes”: {
“postTime”: “2011-02-03 01:02:42”,
“subject”: “query languages”
},
“v_type”: “post”
}]}]
}
# Deletion; no output results requested at this point
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: []
}
# After the deletion
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“selectedPosts”: []}]
}

 

INSERT INTO Statement

The INSERT INTO statement adds edges or vertices to the graph. However, if the ID value(s) for the inserted vertex/edge match those of an existing vertex/edge, then the new values will overwrite the old values. To insert an edge, its endpoint vertices must already exist, either prior to running the query or inserted earlier in that query.

The INSERT INTO statement can be used as a query-body-level statement or a DML-substatement.


EBNF
insertStmt := INSERT INTO name [“(” ( PRIMARY_ID | FROM “,” TO ) (“,” name)* “)”]
VALUES “(” ( “_” | expr ) [name] [“,” ( “_” | expr ) [name] (“,” (“_” | expr))*] “)”

The formal syntax is complex because it encompasses several options, and even so, it requires additional explanation. The first name symbol is the vertex type or edge type. The user then has two options:

1) Provide a value for the ID(s) and then each attribute, in the canonical order for the vertex or edge type.  This format is similar to that of a LOAD statement.  In this case, it is not necessary to explicitly name the attributes, since it is assumed that every one is being referenced, in order.


INSERT with implicit attribute names
INSERT INTO name VALUES (full_list_of_parameter_values)

2) Name the specific attributes to be set, and then provide a corresponding list of values. The attributes can be in any order, with the exception that the IDs must come first.  That is, to insert a vertex, the first attribute name must be PRIMARY_ID.  To insert an edge, the first two attribute names must be FROM and TO.


INSERT with explicit attribute names
INSERT INTO name (IDs, specified_attributes) VALUES (values_for_specified_attributes)

For each attribute value, provide either an expression expr or “_”, which means the default value for that attribute type.  The optional name which follows the first two (id) values specifies the source vertex type and target vertex type, if the edge type has been defined with wildcard vertex types.
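To make the two forms concrete, here is a minimal Python sketch of how values could be matched to attributes. It is a model of the rules above, not GSQL or TigerGraph code: the helper `build_record`, the `DEFAULT` marker (standing in for “_”), and the `WORKS_FOR` attribute list are assumptions. The defaults for startYear, startMonth, and fullTime follow the comments in the insertEx example in the next section, and the attribute order is assumed from those comments.

```python
DEFAULT = object()  # stands in for the "_" placeholder

# Assumed canonical attribute order for worksFor, IDs first: (name, default value).
WORKS_FOR = [("FROM", None), ("TO", None),
             ("startYear", 0), ("startMonth", 0), ("fullTime", False)]


def build_record(schema, values, names=None):
    """Return the attribute values an INSERT would set.

    names=None models the implicit form: one value per attribute, in order.
    Otherwise names lists the attributes being set, with the IDs first;
    every attribute that is not named keeps its default.
    """
    defaults = dict(schema)
    if names is None:
        names = [name for name, _ in schema]
    if len(names) != len(values):
        raise ValueError("number of values must match number of attributes")
    record = dict(defaults)
    for name, value in zip(names, values):
        if name not in defaults:
            raise KeyError(name)
        record[name] = defaults[name] if value is DEFAULT else value
    return record


implicit = build_record(WORKS_FOR, ["tic", "gsql", DEFAULT, DEFAULT, DEFAULT])
explicit = build_record(WORKS_FOR, ["tac", "gsql", 2017, True],
                        names=["FROM", "TO", "startYear", "fullTime"])

print(implicit)
print(explicit)
```

In the implicit form every attribute needs a value or a placeholder; in the explicit form, startMonth is simply left out and falls back to its default.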

Query-Body INSERT

The query insertEx illustrates query-body level INSERT statements: it inserts new company vertices and worksFor edges into the workNet graph.


INSERT statement
CREATE QUERY insertEx(STRING name, STRING name2, STRING name3, STRING comp) FOR GRAPH workNet {
# Vertex insertion
# Adds 2 ‘company’ vertices. One is located in the USA, and a sister company in Japan.
INSERT INTO company VALUES ( comp, comp, “us” );
INSERT INTO company (PRIMARY_ID, country) VALUES ( comp + "_jp", "jp" );

# Edge insertion
# Adds a 'worksFor' edge from person 'name' to the company 'comp', filling in default
# values for startYear (0), startMonth (0), and fullTime (false).
INSERT INTO worksFor VALUES (name person, comp company, _, _, _);

# Adds a 'worksFor' edge from person 'name2' to the company 'comp', filling in default
# values for startMonth (0), but specifying values for startYear and fullTime.
INSERT INTO worksFor (FROM, TO, startYear, fullTime) VALUES (name2 person, comp company, 2017, true);

# Adds a ‘worksFor’ edge from person ‘name3’ to the company ‘comp’, filling in default
# values for startMonth (0), and fullTime (false) but specifying a value for startYear (2017).
INSERT INTO worksFor (FROM, TO, startYear) VALUES (name3 person, comp company, 2000 + 17);
}

The query whoWorksForCompany can be used to check the effect of query insertEx.  Prior to running insertEx, running whoWorksForCompany(“gsql”) will find 0 companies called “gsql” and 0 worksFor edges for company “gsql”.  If we then run the query insertEx(“tic”, “tac”, “toe”, “gsql”), then whoWorksForCompany(“gsql”) will find a company called “gsql” and another one called “gsql_jp”.  Moreover, it will find 3 worksFor edges, from persons tic, tac, and toe, with different values for the startMonth, startYear, and fullTime attributes.


Query to check the results of insertEx
CREATE QUERY whoWorksForCompany(STRING comp) FOR GRAPH workNet {
SetAccum<EDGE> @@setEdge;

Comps = {company.*};
PRINT Comps[Comps.id]; # output api v2

Pers = {person.*};
S = SELECT s
FROM Pers:s -(worksFor:e)-> :t
WHERE t.id == comp
ACCUM @@setEdge += e;
PRINT @@setEdge;
}

DML-sub INSERT

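The following example shows a DML-sub level INSERT.
The INSERT statement runs once for each vertex selected from allCompanies. Here the WHERE clause keeps only the company whose id matches the given name, so one vertex is inserted per matching company.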


DML-sub INSERT statement
# Add a child company of a given company name. The new child company is in japan
CREATE QUERY addNewChildCompany(STRING name) FOR GRAPH workNet {
allCompanies = {company.*};
X = SELECT s
FROM allCompanies:s
WHERE s.id == name
ACCUM INSERT INTO company VALUES ( name + “_jp”, name + “_jp”, “jp” );
}

# Add separate query to list the companies, before and after the insertion
CREATE QUERY listCompanyNames(STRING countryFilter) FOR GRAPH workNet {
allCompanies = {company.*};
C = SELECT s
FROM allCompanies:s
WHERE s.country == countryFilter;

PRINT C.size() AS numCompanies;
PRINT C;
}

 

Example: Add a child company in Japan to US-based company company4.  List all the Japan-based companies before and after the insertion.


addNewChildCompany.run
RUN QUERY listCompanyNames(“jp”)
RUN QUERY addNewChildCompany(“company4”)
RUN QUERY listCompanyNames(“jp”)

Results from addNewChildCompany Example

 



# Before insertion
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“numCompanies”: 1},
{“C”: [{
“v_id”: “company3”,
“attributes”: {
“country”: “jp”,
“id”: “company3”
},
“v_type”: “company”
}]}
]
}
# insert company “company4_jp”
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: []
}
# after insertion
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{“numCompanies”: 2},
{“C”: [
{
“v_id”: “company3”,
“attributes”: {
“country”: “jp”,
“id”: “company3”
},
“v_type”: “company”
},
{
“v_id”: “company4_jp”,
“attributes”: {
“country”: “jp”,
“id”: “company4_jp”
},
“v_type”: “company”
}
]}
]
}

UPDATE Statement

The UPDATE statement updates the attribute of each vertex or edge in a vertex set or edge set, respectively, with new attribute values.


EBNF
updateStmt := UPDATE name FROM ( edgeSet | vertexSet ) SET DMLSubStmtList [whereClause]

The set of vertices or edges to update is described in the FROM clause, following the same rules as the FROM clause in a SELECT block. In the SET clause, the DMLSubStmtList may contain assignment statements to update the attributes of a vertex or edge. Both simple base type attributes and collection type attributes can be updated. These assignment statements use the vertex or edge aliases declared in the FROM clause. The optional WHERE clause supports boolean conditions to filter the items in the vertexSet or edgeSet.


UPDATE statement example
# Change all “person” vertices with location equal to “us” to “USA”
CREATE QUERY updateEx() FOR GRAPH workNet {
S = {person.*};

UPDATE s FROM S:s
SET s.locationId = "USA", # simple base type attribute
s.skillList = [1,2,3] # collection-type attribute
WHERE s.locationId == "us";

# The update does not take effect within this query, so PRINT S still shows "us".
PRINT S;
}

 

The UPDATE statement can only be used as a query-body-level statement. However, DML-sub level updates are still possible by using other statement types. A vertex attribute’s value can be updated within the POST-ACCUM clause of a SELECT block by using the assignment operator (=). An edge attribute’s value can be updated within the ACCUM clause of a SELECT block by using the assignment operator. In fact, the UPDATE statement is equivalent to a SELECT statement with ACCUM and/or POST-ACCUM to update the vertex or edge attribute values.


Updating a vertex’s attribute value in an ACCUM clause is not allowed, because the update can occur multiple times in parallel, and possibly result in a non-deterministic value. If the vertex attribute value update depends on an edge attribute value, use a vertex-attached accumulator to save the value and update the vertex attribute’s value in the POST-ACCUM clause.

The query below uses the SELECT statement instead of the UPDATE statement, but is functionally similar to the query above.  Query updateEx2 reverses the locationId change made by updateEx (changing the location back to “us” from “USA”).


UPDATE statement example 2
# The second example is equivalent to the above updateEx
CREATE QUERY updateEx2() FOR GRAPH workNet {
S = {person.*};

X = SELECT s
FROM S:s
WHERE s.locationId == "USA"
POST-ACCUM s.locationId = "us",
s.skillList = [3,2,1];
PRINT S;
}

 

Below is an example of an edge update with two attribute changes, including an incremental change (e.startYear = e.startYear + 1):


UPDATE statement example 3
CREATE QUERY updateEx3() FOR GRAPH workNet{
S = {person.*};

# update two attributes of each worksFor edge whose source person is located in "us"
UPDATE e FROM S:s -(worksFor:e)-> :t
SET e.startYear = e.startYear + 1,
e.fullTime = false
WHERE s.locationId == "us";

PRINT S;
}

 

 

Other Update Methods

In addition to the above UPDATE statement and SELECT statement, a simple assignment statement at the query-body level can be used to update the attribute value of a single vertex/edge, if the vertex/edge has been assigned to a variable or parameter.


update by assignment
# change the given person's location to newLocation
CREATE QUERY updateByAssignment(VERTEX<person> v, STRING newLocation) FOR GRAPH workNet{
v.locationId = newLocation;
}


Output Statements and FILE Objects

PRINT Statement (API v2)

The PRINT statement specifies output data. Each execution of a PRINT statement adds a JSON object to the results array which will be part of the query output. A PRINT statement can appear anywhere that query-body statements are permitted.


A PRINT statement does not trigger immediate output.  The full set of data from all PRINT statements is delivered at one time, when the query concludes.


EBNF
printStmt := PRINT printExpr {,printExpr} [WHERE condition] [TO_CSV (filePath | fileVar)]
printExpr := (expr | vExprSet) [ AS name]
vExprSet := expr “[” vSetProj {, vSetProj} “]”
vSetProj := expr [ AS name]

Each PRINT statement contains a list of expressions for output data. The optional WHERE clause filters the output. If the condition is false for any items, then those items are excluded from the output.

Each printExpr contributes one key-value pair to the PRINT statement’s JSON object result.  The optional AS clause sets the key for the expression, overriding the default key (explained below).


Simple Example Showing JSON Output Format
STRING str = “first statement”;
INT number = 5;
PRINT str, number;

str = "second statement";
number = number + 1;
PRINT str, number;

# The statements above produce the following output
{
"version": {"api": "v2","schema": 0},
“error”: false,
“message”: “”,
“results”: [
{
“str”: “first statement”,
“number”: 5
},
{
“str”: “second statement”,
“number”: 6
}
]
}
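The way PRINT statements build up the response can be modeled in a few lines. The Python sketch below reproduces the output above under the stated rules (one JSON object per PRINT, all delivered together at the end). The class `QueryOutput` and its method names are hypothetical and are not part of any TigerGraph API.

```python
import json


class QueryOutput:
    """Toy model of a query's output: each PRINT appends one object to results."""

    def __init__(self):
        self.results = []

    def print(self, **items):
        # One PRINT statement -> one JSON object; one key per print expression.
        self.results.append(dict(items))

    def response(self):
        # Nothing is emitted until the query ends; then everything is returned at once.
        return {
            "version": {"api": "v2", "schema": 0},
            "error": False,
            "message": "",
            "results": self.results,
        }


out = QueryOutput()
text, number = "first statement", 5
out.print(str=text, number=number)

text, number = "second statement", number + 1
out.print(str=text, number=number)

print(json.dumps(out.response(), indent=2))
```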

PRINT Expressions

Each printExpr may be one of the following:

  1. A literal value
  2. A global or local variable (including VERTEX and EDGE variables)
  3. An attribute of a vertex variable, e.g., Person.name
  4. A global accumulator
  5. An expression whose terms are among the types above.  The following operators may be used:
    Numeric Arithmetic: + - * / . %
    Bit: << >> & |
    String concatenation: +
    Set: UNION INTERSECT MINUS
    Parentheses can be used for controlling order of precedence.
  6. A vertex set variable
  7. A vertex expression set vExprSet (only available if the output API is set to “v2”). Vertex expression sets are explained in a separate section below.

In output API v2, the print expression list can be a mixed list of any of the expression types.

In output API v1, vertex set variables cannot be on the same PRINT statement with other types of expressions.

JSON Format: Keys

If a printExpr includes the optional AS name clause, then the name sets the key for that expression in the JSON output. Otherwise, the following rules determine the key: If the expression is simply a single variable (local variable, global variable, global accumulator, or vertex set variable), then the key is the variable name.  Also, for a vertex expression set, the key is the vertex set variable name. Otherwise, the key is the entire expression, represented as a string.
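These key rules can be written down as a small function. The following Python sketch is an illustration only: `print_key` is a hypothetical helper, it recognizes a vertex expression set purely by its `name[...]` shape, and it assumes the expression text is used as written when it becomes the key. The sample expressions are taken from queries elsewhere in this document.

```python
import re

# A vertex expression set looks like  SetName[ expr, expr, ... ]
_VERTEX_EXPR_SET = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*\[.*\]$", re.S)


def print_key(expr, alias=None):
    """Return the JSON key for one print expression."""
    expr = expr.strip()
    if alias is not None:
        return alias              # AS name overrides every default
    match = _VERTEX_EXPR_SET.match(expr)
    if match:
        return match.group(1)     # key is the vertex set variable name
    return expr                   # a single variable, or the whole expression


print(print_key("@@likesByTopic"))                   # @@likesByTopic
print(print_key("val.size()"))                       # val.size()
print(print_key("C.size()", alias="numCompanies"))   # numCompanies
print(print_key("Comps[Comps.id]"))                  # Comps
```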


JSON Format: Values

Each data type has a distinct output format.

  • Simple numeric, string, and boolean data types follow JSON standards.
  • Lists, sets, bags, and arrays are printed as JSON arrays (i.e., a list enclosed in square brackets).
  • Maps and tuples are printed as JSON objects (i.e., a list of key:value pairs enclosed in curly braces).
  • Vertices and edges have a custom JSON object, shown below.
  • A vertex set variable is treated as a list of vertices.
  • Accumulator output format is determined by the accumulator’s return type. For example, an AvgAccum outputs a DOUBLE value, and a BitwiseAndAccum outputs an INT value. For container accumulators, simply consider whether the output is a list, set, bag, or map.
    • ListAccum, SetAccum, BagAccum, ArrayAccum: list
    • MapAccum: map
    • HeapAccum, GroupByAccum: list of tuples

Full details of vertices are printed only when part of a vertex set variable or vertex expression set. When a single vertex is printed (from a variable or accumulator whose data type happens to be VERTEX), only the vertex id is printed.


Cases where only the vertex id will be printed
ListAccum<VERTEX> @@vList; // not a vertex set variable
VERTEX v; // not a vertex set variable

PRINT @@vList, v; // output will contain only vertex ids

 


Vertex (when not part of a vertex set variable)

The output is just the vertex id as a string:


Output Format for a Value which is a Vertex, not part of a Vertex Set Variable
"<vertex_id>"


Vertex (as part of a vertex set variable)


Output Format for a Vertex as part of a Vertex Set Variable
{
  "v_id": "<vertex_id>",
  "v_type": "<vertex_type>",
  "attributes": {
    <list of key:value pairs,
     one for each attribute
     or vertex-attached accumulator>
  }
}


Edge


Output Format for a Value which is an Edge
{
  "e_type": "<edge_type>",
  "directed": <boolean_value>,
  "from_id": "<source_vertex_id>",
  "from_type": "<source_vertex_type>",
  "to_id": "<target_vertex_id>",
  "to_type": "<target_vertex_type>",
  "attributes": {
    <list of key:value pairs,
     one for each attribute>
  }
}


List, Set or Bag


Output format for a Value which is a List, Set, or Bag
[
<value1>,
<value2>,
…,
<valueN>
]


Map


Output Format for a Value which is a Map
{
<key1>: <value1>,
<key2>: <value2>,
…,
<keyN>: <valueN>
}


Tuple


Output Format for a Value which is a Tuple
{
<fieldName1>: <value1>,
<fieldName2>: <value2>,
…,
<fieldNameN>: <valueN>
}


Vertex Set Variable


Output Format for a Value which is a Vertex Set Variable
[
<vertex1>,
<vertex2>,
…,
<vertexN>
]

 



Vertex Expression Set

A vertex expression set is a list of expressions which is applied to each vertex in a vertex set variable. The expression list is used to compute an alternative set of values to display in the “attributes” field of each vertex.

The easiest way to understand this is to consider examples containing only one term and then consider combinations. Consider the following example query. C is a vertex set variable containing the set of all company vertices. Furthermore, each vertex has a vertex-attached accumulator @count.


Example Query for Vertex Expression Set

# CREATE VERTEX company(PRIMARY_ID clientId STRING, id STRING, country STRING)

CREATE QUERY vExprSet () FOR GRAPH workNet {
SumAccum<INT> @count;
C = {company.*};

# include some print statements here
}

If we print the full vertex set, the “attributes” field of each vertex will contain 3 fields: “id”, “country”, and “@count”.  Now consider some simple vertex expression sets:

PRINT C[C.country]

prints the vertex set variable C, except that the “attributes” field will contain only “country”, instead of 3 fields.

PRINT C[C.@count]

prints the vertex set variable C, except that the “attributes” field will contain only “@count”, instead of 3 fields.

PRINT C[C.id, C.@count]

prints the vertex set variable C, except that the “attributes” field will contain only “id” and “@count”.

PRINT C[C.id+"_ex", C.@count+1]

prints the vertex set variable C, except that the “attributes” field contains the following:

  • One field consists of each vertex’s id value, with the string “_ex” appended to it.
  • Another field consists of the @count value incremented by 1.

    Note: the value of @count itself has not changed, only the displayed value is incremented.

The last example illustrates the general format for a vertex expression set:


Syntax for Vertex Expression Set
vExprSet := expr "[" vSetProj {, vSetProj} "]"
vSetProj := expr [ AS name]

The vertex expression set begins with the name of a vertex set variable.  It is followed by a list of attribute expressions, enclosed in square brackets. Each attribute expression follows the same rules described earlier in the Print Expressions section.  That is, each attribute expression may refer to one or more attributes or vertex-attached accumulators of the current vertices, as well as literals, local or global variables, and global accumulators. The allowed operators (for numeric, string, or set operations) are the same ones mentioned above.

The key for the vertex expression set is the vertex set variable name.

The value for the vertex expression set is a modified vertex set variable, where the regular “attributes” value for each vertex is replaced with a set of key:value pairs corresponding to the set of attribute expressions given in the print expression.

 

An example which shows all of the cases described above, in combination, is shown below.


Print Basic Example

CREATE QUERY printExampleV2(VERTEX<person> v) FOR GRAPH socialNet {

    SetAccum<VERTEX> @@setOfVertices;
    SetAccum<EDGE> @postedSet;
    MapAccum<VERTEX,ListAccum<VERTEX>> @@testMap;
    FLOAT paperWidth = 8.5;
    INT paperHeight = 11;
    STRING Alpha = "ABC";

    Seed = person.*;
    A = SELECT s
        FROM Seed:s
        WHERE s.gender == "Female"
        ACCUM @@setOfVertices += s;

    B = SELECT t
        FROM Seed:s -(posted:e)-> post:t
        ACCUM s.@postedSet += e,
              @@testMap += (s -> t);

    # Numeric, String, and Boolean expressions, with renamed keys:
    PRINT paperHeight*paperWidth AS PaperSize, Alpha+"XYZ" AS Letters,
          A.size() > 10 AS AsizeMoreThan10;
    # Note how an expression is named if "AS" is not used:
    PRINT A.size() > 10;

    # Vertex variables. Only the vertex id is included (no attributes):
    PRINT v, @@setOfVertices;

    # Map of Person -> Posts posted by that person:
    PRINT @@testMap;

    # Vertex Set Variable. Each vertex has a vertex-attached accumulator, which
    # happens to be a set of edges (SetAccum<EDGE>), so edge format is shown also:
    PRINT A AS VSetVarWomen;

    # Vertex Expression Set. The same set of vertices as above, but with only
    # one attribute plus one computed attribute:
    PRINT A[A.gender, A.@postedSet.size()] AS VSetExpr;
}


Note how the results of the six PRINT statements are grouped in the JSON “results” field below:

  1. Each of the six PRINT statements is represented as one JSON object within the “results” array.
  2. When a PRINT statement has more than one expression (like the first one), the expressions may appear in the output in a different order than in the PRINT statement.
  3. The 2nd PRINT statement shows a key that is generated from the expression itself.
  4. The 3rd and 4th PRINT statements show a set of vertices (which is different from a vertex set variable) and a map, respectively.
  5. The 5th PRINT statement shows the vertex set variable A, including its vertex-attached accumulators (PRINT A).
  6. The 6th PRINT statement shows a vertex expression set for A, customized to include only one static attribute plus a newly computed attribute.

Results from Query printExampleV2 (WITH COMMENTS ADDED)

 



GSQL > RUN QUERY printExampleV2(“person1”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [
{
“AsizeMoreThan10”: false,
“Letters”: “ABCXYZ”,
“PaperSize”: 93.5
},
{“A.size()>10”: false},
{
“v”: “person1”,
“@@setOfVertices”: [ “person4”, “person5”, “person2” ]
},
{“@@testMap”: {
“person4”: [“3”],
“person3”: [“2”],
“person2”: [“1”],
“person1”: [“0”],
“person8”: [ “7”, “8” ],
“person7”: [ “9”, “6” ],
“person6”: [ “10”, “5” ],
“person5”: [ “4”, “11” ]
}},
{“VSetVarWomen”: [
{
“v_id”: “person4”,
“attributes”: {
“gender”: “Female”,
“id”: “person4”,
“@postedSet”: [{
“from_type”: “person”,
“to_type”: “post”,
“directed”: true,
“from_id”: “person4”,
“to_id”: “3”,
“attributes”: {},
“e_type”: “posted”
}]
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“gender”: “Female”,
“id”: “person5”,
“@postedSet”: [
{
“from_type”: “person”,
“to_type”: “post”,
“directed”: true,
“from_id”: “person5”,
“to_id”: “11”,
“attributes”: {},
“e_type”: “posted”
},
{
“from_type”: “person”,
“to_type”: “post”,
“directed”: true,
“from_id”: “person5”,
“to_id”: “4”,
“attributes”: {},
“e_type”: “posted”
}
]
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“gender”: “Female”,
“id”: “person2”,
“@postedSet”: [{
“from_type”: “person”,
“to_type”: “post”,
“directed”: true,
“from_id”: “person2”,
“to_id”: “1”,
“attributes”: {},
“e_type”: “posted”
}]
},
“v_type”: “person”
}
]},
{“VSetExpr”: [
{
“v_id”: “person4”,
“attributes”: {
“A.@postedSet.size()”: 1,
“A.gender”: “Female”
},
“v_type”: “person”
},
{
“v_id”: “person5”,
“attributes”: {
“A.@postedSet.size()”: 2,
“A.gender”: “Female”
},
“v_type”: “person”
},
{
“v_id”: “person2”,
“attributes”: {
“A.@postedSet.size()”: 1,
“A.gender”: “Female”
},
“v_type”: “person”
}
]}
]
}
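A client that receives this JSON usually wants to look values up by key rather than by PRINT statement position. The Python sketch below shows one way to do that, under stated assumptions: the helper names merge_results and vertex_ids are hypothetical, and the sample is a shortened, hand-written copy of the response above, not live server output.

```python
import json


def merge_results(response):
    """Combine the one-object-per-PRINT entries of "results" into one dict."""
    merged = {}
    for print_output in response["results"]:
        merged.update(print_output)
    return merged


def vertex_ids(vertex_set):
    """Return the v_id of every vertex object in a printed vertex set."""
    return [vertex["v_id"] for vertex in vertex_set]


raw = """
{
  "error": false,
  "message": "",
  "results": [
    {"A.size()>10": false},
    {"v": "person1", "@@setOfVertices": ["person4", "person5", "person2"]},
    {"VSetExpr": [
      {"v_id": "person4", "v_type": "person",
       "attributes": {"A.@postedSet.size()": 1, "A.gender": "Female"}},
      {"v_id": "person5", "v_type": "person",
       "attributes": {"A.@postedSet.size()": 2, "A.gender": "Female"}}
    ]}
  ]
}
"""

response = json.loads(raw)
merged = merge_results(response)
print(merged["v"])                      # person1
print(vertex_ids(merged["VSetExpr"]))   # ['person4', 'person5']
```

If two PRINT statements use the same key, the later one overwrites the earlier one in this sketch, so distinct AS names are advisable when results are merged this way.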

 

Printing CSV to a FILE Object

Instead of printing output in JSON format, output can be written to a FILE object in comma-separated values (CSV) format. To select this option, at the end of the PRINT statement, include the keyword TO_CSV followed by the FILE object name:


PRINT to CSV FILE syntax example
PRINT @@setOfVertices TO_CSV file1;

The > operator is no longer supported for directing output to a file or FILE object. You must use the keyword TO_CSV.

Each execution of the PRINT statement appends one line to the FILE. If the PRINT statement includes multiple expressions, then each printed value is separated from its neighbor by a comma. If an expression evaluates to a set or list, then the collection’s values are delimited by single spaces. Due to the simpler format of CSV vs. JSON, the TO_CSV feature only supports data with a simple one- or two-dimensional structure.
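The line format described above (commas between expressions, single spaces inside a collection) can be sketched in Python as follows. This is a model of the format only, not the TigerGraph implementation; the names csv_field and csv_line are hypothetical, and writing no padding after the comma is an assumption taken from the sample file contents shown later for the fileEx query.

```python
def csv_field(value):
    """Format one printed value: collections become space-delimited items."""
    if isinstance(value, (list, tuple)):
        return " ".join(str(item) for item in value)
    return str(value)


def csv_line(*values):
    """Format the values of one PRINT ... TO_CSV statement as a single line."""
    return ",".join(csv_field(value) for value in values)


print(csv_line([2, 4, 5], [1, 3, 6, 7, 8]))   # two collections on one line
print(csv_line("person7", ["art", "sport"]))  # a scalar followed by a list
print(csv_line(3))                            # a single scalar
```

The sketch makes the two-dimension limit visible: a collection nested inside another collection has no delimiter left to distinguish its levels.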

Limitations of PRINT TO_CSV


  • Printing a full Vertex set variable is not supported.
  • If a vertex is printed, only its ID value is printed.
  • If printing a vertex set’s vertex-attached accumulator or a vertex set’s variable, the result is a list of values, one for each vertex, separated by newlines.
  • The syntax for printing a vertex set expression is currently different when printing to a file than when printing to standard output. Compare:
    • PRINT A[A.gender]; # with brackets
    • PRINT A.gender TO_CSV file1; # without brackets

Writing to FILE objects is optimized for parallel processing. Consequently, the order in which data is written to the FILE is not guaranteed.  Therefore, it is strongly recommended that the user design their queries such that one of these conditions is satisfied:

  1. The query prints only one set of data, and the order of the set is not important.
  2. Each line of data to print to a file includes a label which can be used to identify the data.
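The second condition can be illustrated from the reading side. The Python sketch below assumes that the label is the first comma-separated field of each line, as in the FILE println examples later in this section; the function name group_by_label is hypothetical.

```python
from collections import defaultdict


def group_by_label(lines):
    """Group CSV lines by their first field so that write order matters less."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue                      # skip blank lines
        label, _, rest = line.partition(",")
        groups[label].append(rest)
    return dict(groups)


sample = ["main,header", "sub,11", "main,1", "sub,12", "main,footer"]
print(group_by_label(sample))
```

Grouping recovers which statement produced each line, but the order of lines within one label still follows the file, so it is only as reliable as the write order for that label.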

 


PRINT WHERE and PRINT TO_CSV FILE Object Example
CREATE QUERY printExampleFile() FOR GRAPH socialNet {
    SetAccum<VERTEX> @@testSet, @@testSet2;
    ListAccum<STRING> @@strList;
    int x = 3;
    FILE file1 ("/home/tigergraph/printExampleFile.txt");

    Seed = person.*;
    A = SELECT s
        FROM Seed:s
        WHERE s.gender == "Female"
        ACCUM @@testSet += s, @@strList += s.gender;
    A = SELECT s
        FROM Seed:s
        WHERE s.gender == "Male"
        ACCUM @@testSet2 += s;

    PRINT @@testSet, @@testSet2 TO_CSV file1; # 1st line: 2 4 5, 1 3 6 7 8 (order not guaranteed)
    PRINT x WHERE x < 0 TO_CSV file1; # 2nd line: <skipped because no content>
    PRINT x WHERE x > 0 TO_CSV file1; # 3rd line: 3
    PRINT @@strList TO_CSV file1; # 4th line: Female Female Female
    PRINT A.gender TO_CSV file1; # 5th line: Male\n Male\n Male\n Male\n Male
}

Printing to a CSV File as a Filepath (DEPRECATED)

Instead of printing CSV output to a FILE object, data can be written to a regular file.


PRINT to CSV FILE syntax example
PRINT @@setOfVertices TO_CSV “/home/tigergraph/vset.csv”;

This feature is deprecated because printing to a FILE object covers the same functionality.

The table below shows the differences between printing TO_CSV <FILE object> vs. TO_CSV <filepath>.

Property | FILE Object | filepath
When filepath is specified | Either run-time or compile-time, depending on how the user chooses to write the query | Compile-time
Vertex IDs | Displayed correctly | Displayed as TigerGraph internal ID codes
Append or overwrite | Appends, but the FILE object declaration will reset the FILE | Always appends
filepath can be absolute or relative | Currently only absolute | Absolute or relative

 

FILE println statement

One of the two ways to write data to a FILE object is with the FILE println statement.  (The other way is with the PRINT statement’s TO_CSV option.)


EBNF for FILE println statement
printlnStmt := fileVar".println" "(" expr {, expr} ")"


println is a method (function) of a FILE object variable.


The println statement can be used either at the query-body level or as a DML-sub-statement, e.g., within the ACCUM clause of a SELECT block.  Each time println is called, it adds one new line of values to the FILE object, and then to the corresponding file.


The println function can print most of the expressions handled by PRINT. Note, however, that this does not include vertex expression sets (vExprSet). If the println statement has a list of expressions to print, then this will produce a comma-separated list of values. If an expression refers to a list or set, then the output will be a list of values separated by spaces, the same format produced by TO_CSV.


The data from query-body level FILE print statements (either TO_CSV or println) will appear in their original order. However, due to the parallel processing of statements in an ACCUM block, the order in which println statements at the DML-sub-statement level are processed cannot be guaranteed. Moreover, the output from println statements in an ACCUM block can be interspersed with the output of the query-body statements.


File object query example

CREATE QUERY fileEx (STRING fileLocation) FOR GRAPH workNet {

    FILE f1 (fileLocation);
    P = {person.*};

    PRINT "header" TO_CSV f1;

    USWorkers = SELECT v FROM P:v
        WHERE v.locationId == "us"
        ACCUM f1.println(v.id, v.interestList);
    PRINT "footer" TO_CSV f1;
}
INSTALL QUERY fileEx
RUN QUERY fileEx("/home/tigergraph/fileEx.txt")

All of the PRINT statements in this example use the TO_CSV option, so there is no JSON output to the console.


Results from Query fileEx
GSQL > RUN QUERY fileEx(“/home/tigergraph/fileEx.txt”)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: []
}

All the output in this case goes to the FILE object. In the query definition, the footer is the last FILE statement, but the println statements from the SELECT block happen to be delayed and are printed AFTER the footer line.


File contents produced by fileEx example
[tigergraph@localhost]$ more /home/tigergraph/fileEx.txt
header
person7,art sport
person10,football sport
person4,football
person9,financial teaching
person1,management financial
footer

Passing a FILE Object as a Parameter

A FILE Object can be passed from one query to a subquery.  The subquery can then also write to the FILE object.


Example: query passing a FILE object to another query
CREATE QUERY fileParamSub(FILE f, STRING label, INT num) FOR GRAPH socialNet {
    f.println(label, "header");
    FOREACH i IN RANGE [1,2] DO
        f.println(label, num+i);
    END;
    f.println(label, "footer");
}

CREATE QUERY fileParamMain(STRING mainlabel) FOR GRAPH socialNet {
    FILE f ("/home/tigergraph/fileParam.txt");
    f.println(mainlabel, "header");
    FOREACH i IN RANGE [1,2] DO
        f.println(mainlabel, i);
        fileParamSub(f, " sub", 10*i);
    END;
    f.println(mainlabel, "footer");
}

GSQL > RUN QUERY fileParamMain("main")
GSQL > EXIT
$ cat /home/tigergraph/fileParam.txt
main,header
main,1
sub,header
sub,11
sub,12
sub,footer
main,2
sub,header
sub,21
sub,22
sub,footer
main,footer

 

LOG Statement

The LOG statement is another means to output data.  It works as a function that outputs information to a log file.


EBNF for LOG statement
logStmt := LOG "(" condition "," argList ")"

The first argument of the LOG statement is a boolean condition that enables or disables logging.  This allows logging to be easily turned on/off, for uses such as debugging.  After the condition, LOG takes one or more expressions (separated by commas).  These expressions are evaluated and output to the log file.

Unlike the PRINT statement, which can only be used as a query-body statement, the LOG statement can be used as both a query-body statement and a DML-sub-statement.

The values will be recorded in the GPE log. To find the log file after the query has completed, open a Linux shell and use the command  “gadmin log gpe”.  It may show you more than one log file name; use the one ending in “INFO”.  Search this file for “UDF_”.


Examples
BOOLEAN debug = TRUE;
INT x = 10;
LOG(debug, 20);
LOG(debug, 10, x);

RETURN Statement


EBNF for RETURN statement

createQuery := CREATE QUERY name "(" [parameterList] ")" FOR GRAPH name [RETURNS "(" baseType | accumType ")"] "{" [typedefs] [declStmts] queryBodyStmts "}"

returnStmt := RETURN expr

The RETURN statement specifies data that a sub-query passes back to an outer query that called the sub-query. In order for a query to be used as a sub-query, its initial CREATE QUERY statement must include the optional RETURNS clause, and its body must end with a RETURN statement. Exactly one type is allowed in the RETURNS clause, and thus the RETURN statement can only return one expression.

The returned expression must have the same type as the RETURNS clause indicates. A sub-query must be created before its corresponding super-query.  A sub-query must be installed either before its super-query or in the same INSTALL QUERY command.



The return type can be any base type or any accumulator type, except GroupByAccum and any accumulator containing any tuple type. For the purposes of return type, SetAccum is equivalent to SET, and BagAccum is equivalent to BAG.  A vertex set variable can be returned if SET<VERTEX<type>> or SetAccum<VERTEX<type>> (<type> is optional) is used in the RETURNS clause.

See also Section 5.11 – Queries as Functions.


Subquery Example 1
CREATE QUERY subquery1 (VERTEX<person> m1) FOR GRAPH socialNet RETURNS(BagAccum<VERTEX<post>>)
{
    Start = {m1};
    L = SELECT t
        FROM Start:s - (liked:e) - post:t;
    RETURN L;
}
CREATE QUERY mainquery1 () FOR GRAPH socialNet
{
    BagAccum<VERTEX<post>> @@testBag;
    Start = {person.*};
    Start = SELECT s FROM Start:s
        ACCUM @@testBag += subquery1(s);
    PRINT @@testBag;
}


 


Result

 



GSQL > RUN QUERY mainquery1()
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@testBag”: [
“6”,
“3”,
“8”,
“4”,
“4”,
“0”,
“0”,
“0”,
“10”
]}]
}


 


Subquery Example 2
CREATE QUERY subquery2 (VERTEX<person> m1) FOR GRAPH socialNet RETURNS(INT)
{
    int x;
    Start = {m1};
    Start = SELECT t FROM Start:t
        ACCUM CASE WHEN t.gender == "Male" THEN x = 5
                   WHEN t.gender == "Female" THEN x = 10
                   ELSE x = -1
              END;
    RETURN x;
}
CREATE QUERY mainquery2 (SET<VERTEX<person>> m1) FOR GRAPH socialNet
{
    SumAccum<INT> @@sum1;
    Start = {m1};
    Start = SELECT t FROM Start:t
        ACCUM @@sum1 += subquery2(t);
    PRINT @@sum1;
}


 


Result

 



GSQL > RUN QUERY mainquery2(["person1","person2"])
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“@@sum1”: 15}]
}


End of Output Statement Section

Exception Statements

This section describes how the GSQL language responds to exceptions and supports user-defined exception handling. An exception is a run-time error. The GSQL language supports both built-in system exceptions and user-defined exceptions. Built-in exceptions include GSQL language exceptions (such as out-of-range value, wrong data type, and illegal operation), and errors arising in other TigerGraph components or from the operating system.

The GSQL query language also supports user-defined exception responses, also known as exception handling.  This section covers the following syntax for user-defined exception behavior:

#########################################################
## Exception Statements ##

declExceptStmt := EXCEPTION exceptVarName "(" errorCode ")"
exceptVarName := name
errorCode := integer

raiseStmt := RAISE exceptVarName [errorMsg]
errorMsg := "(" expr ")"

tryStmt := TRY queryBodyStmts EXCEPTION caseExceptBlock+
           [elseExceptBlock] END ";"
caseExceptBlock := WHEN exceptVarName THEN queryBodyStmts
elseExceptBlock := ELSE queryBodyStmts


Default Exception Response

When an exception occurs during the execution of a query, the default response is the following:

  • The query will not execute any more statements; it will exit.
  • If the query was run using the RUN QUERY command, then an error message will be displayed.
  • If the query was run by invoking the GET /query REST++ endpoint, then the output will be a simple JSON object. Some errors have an error “code” field; others do not:


    Output of Unhandled Exception (query run as REST Endpoint)
    {
    "error": true,
    "message": "<errorMsg>",
    "code": "<errType><errorCode>"
    }

The example below shows two common errors: wrong data type and divide-by-zero. First we define a simple query that divides 100.0 by the query’s input parameter.


Example: query excpBuiltin
CREATE QUERY excpBuiltin(INT n1) FOR GRAPH minimalNet {
PRINT 100.0/n1;
}

We then test three cases:

  1. A valid input (such as n1 = 7)
  2. Wrong data type (n1 = “a”)
  3. Divide by zero (n1 = 0)

First we test using the GSQL interface. When the query runs without error, the output is in JSON format. When there is a built-in exception, however, only an error message is displayed.


Exception response for RUN QUERY
GSQL > RUN QUERY excpBuiltin(7)
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{“100.0/n1”: 14.28571}]
}
GSQL > RUN QUERY excpBuiltin("a")
Values of parameter n1 must be INT64 type, invalid value [a] provided.
GSQL > RUN QUERY excpBuiltin(0)
Runtime Error: divider is zero.

The situation is a little different when running the query as a REST++ endpoint. The output is always in JSON format.


As of TigerGraph v1.2, the format for the GET /query endpoint has changed.  The graph name must now be specified after /query:


/query/{graph_name}/{query_name}

 


Exception response for GET /query request
$ curl -X GET "http://localhost:9000/query/minimalNet/excpBuiltin?n1=7"
{
“error”: false,
“message”: “”,
“results”: [
{
“100.0/n1”: 14.28571
}
],
“version”: {
“api”: “v2”,
“schema”: 0
}
}
$ curl -X GET "http://localhost:9000/query/minimalNet/excpBuiltin?n1=a"
{
“code”: “REST-30000”,
“error”: true,
“message”: “Values of parameter n1 must be INT64 type, invalid value [a] provided.”,
“version”: {
“api”: “v2”,
“schema”: 0
}
}
$ curl -X GET "http://localhost:9000/query/minimalNet/excpBuiltin?n1=0"
{
“error”: true,
“message”: “Runtime Error: divider is zero.”,
“version”: {
“api”: “v2”,
“schema”: 0
}
}
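Because the REST++ endpoint always answers with JSON, a caller has to inspect the "error" field itself. The Python sketch below shows one way to turn the responses above into either a result list or an exception. The names QueryError and results_or_raise are hypothetical, and the dictionaries are hand-copied from the responses above rather than fetched from a server.

```python
class QueryError(Exception):
    """Raised when a query response reports "error": true."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.code = code      # None when the response has no "code" field


def results_or_raise(response):
    """Return the "results" list, or raise QueryError for an error response."""
    if response.get("error"):
        raise QueryError(response.get("message", ""), response.get("code"))
    return response.get("results", [])


ok = {"error": False, "message": "", "results": [{"100.0/n1": 14.28571}]}
bad_type = {
    "code": "REST-30000",
    "error": True,
    "message": "Values of parameter n1 must be INT64 type, invalid value [a] provided.",
}
div_zero = {"error": True, "message": "Runtime Error: divider is zero."}

print(results_or_raise(ok))
for response in (bad_type, div_zero):
    try:
        results_or_raise(response)
    except QueryError as err:
        print(err.code, "-", err)
```

Reading the code with get() rather than indexing matters here, since the divide-by-zero response carries no "code" field at all.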


User-Defined Exception Behavior

A query author can specify the response to a particular type of exception occurring within a specified block of statements.

 

The following statement types are available to specify a user-defined exception condition or a user-defined exception response.

  • The EXCEPTION Declaration Statement names a user-defined exception.
  • The RAISE Statement indicates that one of the user-defined exceptions has occurred.
  • The TRY…EXCEPTION Statement is used to define and apply user-defined exception handling to a block of query-body statements. This can be used with or without preceding user-defined EXCEPTION and RAISE statements.

Built-in exceptions always take precedence over user-defined exceptions. Therefore, user-defined exceptions can only be used to catch conditions that would not be caught by a built-in exception. This means that user-defined exceptions are best used to capture situations which are legal according to the general syntax and semantics of the GSQL query language, but which are illegal or undesirable for a particular user application.


EXCEPTION Declaration Statement

declExceptStmt := EXCEPTION exceptVarName “(” errorCode “)”
exceptVarName := name
errorCode := integer

A user-defined exception must be declared before it is used. An exception declaration statement declares a user-defined exception type, assigning a name and identification number. The id number errorCode must be greater than 40,000.  Numbers 40,000 and lower are reserved for system exceptions. Exception declaration statements must be placed before any query-body statements and after accumulator declaration statements. A query can declare multiple exception types.


RAISE Statement

raiseStmt := RAISE exceptVarName [errorMsg]
errorMsg := “(” expr “)”

The RAISE statement announces that a user-defined exception has just occurred.  The exceptVarName must match one of the exceptions that was previously declared.  An optional error message can be specified. Once the RAISE statement is executed, the flow of execution changes. If the RAISE statement is not within a TRY clause, then the query ends with the default exception response, using the error code and error message defined by the exception type and RAISE statements. If the RAISE is within a TRY statement, then execution jumps to the EXCEPTION handling clause of the TRY statement.

A RAISE statement itself does not include the conditions that define the exception. Typically, the user will use an IF…THEN statement and place the RAISE statement within the THEN clause.


In the current version, a RAISE statement can only be used as a query-body-statement. It cannot be used as a DML-sub-statement. In particular, you cannot RAISE an exception inside a SELECT statement.

The example below defines and checks for two types of exceptions: an empty input set (40001) and no matching edges (40002). Remember that the minimum allowed code number is 40001.


Example: Unhandled User-Defined Exceptions
CREATE QUERY excpCountActivity(SET<VERTEX<person>> vSet, STRING eType) FOR GRAPH socialNet {
    # Count how many edges there are from each member of the input person set to posts,
    # along the specified edge type.

    MapAccum<STRING,INT> @@allCount;
    EXCEPTION emptyList (40001);
    EXCEPTION noEdges (40002);

    IF ISEMPTY(vSet) THEN ## Raise 40001
        RAISE emptyList ("Error: Input parameter 'vSet' (type SET<VERTEX<person>>) is empty");
    END;

    Start = vSet;
    Results = SELECT s
        FROM Start:s -(:e)-> post:t
        WHERE e.type == eType
        ACCUM @@allCount += (t.subject -> 1);

    IF Results.size() == 0 THEN ## Raise 40002
        RAISE noEdges ("Error: No '" + eType + "' edges from the vertex set");
    END;
    PRINT @@allCount;
}


Results
// Valid input: no exceptions
$ curl -X GET "http://localhost:9000/query/socialNet/excpCountActivity?vSet=person2&vSet=person6&eType=posted"
{
“error”: false,
“message”: “”,
“version”: {
“schema”: 0,
“api”: “v2”
},
“results”: [{
“@@allCount”: {
“cats”: 1,
“tigergraph”: 2
}
}]
}
// empty input set (due to spelling error in parameter name)
$ curl -X GET "http://localhost:9000/query/socialNet/excpCountActivity?vset=person2&vset=person6&eType=posted"
{
“code”: “40001”,
“error”: true,
“version”: {
“schema”: 0,
“api”: “v2”
},
“message”: “Error: Input parameter ‘vSet’ (type SET<VERTEX<person>>) is empty”
}
// no edges (due to unknown edge type)
$ curl -X GET "http://localhost:9000/query/socialNet/excpCountActivity?vSet=person2&vSet=person6&eType=commented"
{
“code”: “40002”,
“error”: true,
“version”: {
“schema”: 0,
“api”: “v2”
},
“message”: “Error: No ‘commented’ edges from the vertex set”
}
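On the client side, the "code" field can be used to tell a user-defined exception from a system error, since user-defined codes must be greater than 40,000. The Python sketch below applies that rule to the codes seen in this section; the function name is_user_defined_code is hypothetical, and treating any non-numeric code (such as "REST-30000") as a system code is an assumption.

```python
def is_user_defined_code(code):
    """Return True if code is a numeric error code above the reserved range."""
    try:
        return int(code) > 40000
    except (TypeError, ValueError):
        # A missing code (None) or a prefixed code such as "REST-30000".
        return False


for code in ("40001", "40002", "REST-30000", None):
    print(code, is_user_defined_code(code))
```

A caller could use such a check to decide whether an error message was written by the query author or produced by the system.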

 


TRY…EXCEPTION Statement for Custom Error Handling

tryStmt := TRY queryBodyStmts EXCEPTION caseExceptBlock+ [elseExceptBlock] END ";"
caseExceptBlock := WHEN exceptVarName THEN queryBodyStmts
elseExceptBlock := ELSE queryBodyStmts

The TRY…EXCEPTION Statement is used to define and apply user-defined exception handling to a block of query-body statements. A TRY…EXCEPTION statement can be nested within a TRY block or EXCEPTION block.


The current version of GSQL does not support custom handling of built-in exceptions. Therefore, if a built-in exception occurs, it ignores the TRY…EXCEPTION blocks and simply applies the default handling, and the query aborts. In future updates, we plan to support custom handling of both custom exceptions (RAISE) and built-in exceptions with the TRY…EXCEPTION block.

The TRY…EXCEPTION Statement is a compound statement containing two blocks. The first block (TRY) consists of the query-body statements for which custom error handling should be applied. The second block (EXCEPTION) contains a series of WHEN…THEN exception handling clauses.  Each exception handling clause names an exception type and specifies what actions to take in the event of the exception. An optional ELSE clause contains handling statements for all other exceptions. The following text and visual flowchart detail how the TRY…EXCEPTION block handles an exception.

When an exception occurs within a TRY block, the flow of execution skips the remainder of the TRY block and jumps to the EXCEPTION block. The GSQL flow now seeks to match the exception type with a handler. After executing the handling statements in the THEN or ELSE clause, the flow skips the remainder of the EXCEPTION block and continues with the statement following the END statement. However, if there is no matching WHEN or ELSE handler, then the exception is propagated. That is, the RAISE state is maintained after exiting the EXCEPTION block. If the TRY…EXCEPTION block is nested inside another TRY block, then the handling process is repeated at this upper level. This repeats until either the exception is handled or there are no more TRY…EXCEPTION blocks.

Finally, if the unhandled exception is not within a TRY block, then the query is aborted, and the default exception response is the output.




Case 1: If cond1 is true in the outer TRY block,

  • RAISE A and jump to the outer EXCEPTION block. Handled by ELSE handStmtsZ.


Case 2: If cond2 is true in the inner TRY block,

  • RAISE A and jump to the inner EXCEPTION block. Handled by handStmtsX.


Case 3: If cond3 is true in the inner TRY block,

  • RAISE B and jump to the inner EXCEPTION block. There is no matching handler here, so propagate the exception. Jump to the outer EXCEPTION block. Handled by handStmtsY.
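The three cases above follow the same propagation rule as nested try/except blocks in general-purpose languages, so they can be mimicked in Python. This is an analogy only, not GSQL. The nesting is an assumption inferred from the cases: an inner block with a handler for A only, inside an outer block with a handler for B plus an ELSE clause. The names A, B, and run are hypothetical, and the handler names are returned as strings instead of executing handler statements.

```python
class A(Exception):
    """Stands in for user-defined exception A."""


class B(Exception):
    """Stands in for user-defined exception B."""


def run(cond1=False, cond2=False, cond3=False):
    """Return the name of the handler that deals with the raised exception."""
    try:                                  # outer TRY block
        if cond1:
            raise A()
        try:                              # inner TRY block
            if cond2:
                raise A()
            if cond3:
                raise B()
        except A:                         # inner: WHEN A THEN handStmtsX
            return "handStmtsX"
    except B:                             # outer: WHEN B THEN handStmtsY
        return "handStmtsY"
    except Exception:                     # outer: ELSE handStmtsZ
        return "handStmtsZ"
    return "no exception"


print(run(cond1=True))   # Case 1: handStmtsZ
print(run(cond2=True))   # Case 2: handStmtsX
print(run(cond3=True))   # Case 3: handStmtsY
```

In Case 3 the inner block has no handler for B, so the exception leaves the inner block still raised and is matched by the outer block, which is the propagation behavior described above.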



Custom Handling Example:


The following example is a modified shortest path query.  It looks for all paths from a source to a target in a computer network. It uses breadth-first search and stops at depth N when it has found at least one path at depth N, or it has searched the entire graph. There are three conditions which will cause it to RAISE an exception and abort the search:


  1. Seeing an edge with a negative connection speed (because the graph has bad data).

  2. Seeing an edge with a very slow connection speed (again because the graph has bad data).

  3. If no path was found in the graph (the search is already over, but we skip printing results).


Note that cases 1 and 2 do NOT mean that a negative or slow speed edge is actually on a shortest path, only that the query noticed a bad edge during its search. Also, because we cannot RAISE within the SELECT block, we use a workaround: set an integer variable with an error code.  Immediately after the SELECT block, test the integer variable and RAISE exceptions as needed.


Example: Path Search with Exceptions
CREATE QUERY compPathValid (vertex<computer> src, vertex<computer> tgt, BOOL enExcp)
FOR GRAPH computerNet {
    # Find valid paths in a computer network from a source to a target.
    # Stop search once you have found some paths.
    # 3 Exceptions: (1) Negative connection speed, (2) Slow connection speed, (3) No path.
    # Set enExcp=true to raise exceptions. enExcp=false will find paths, good or bad.

    OrAccum @@reached, @visited;
    ListAccum<STRING> @paths;
    DOUBLE minSpeed = 0.4;
    INT err;

    EXCEPTION negSpeed (40001);
    EXCEPTION slowSpeed (40002);
    EXCEPTION notReached (40003);

    TRY
        Start = {src};
        # Initialize: path to src is itself.
        Start = SELECT s
            FROM Start:s
            ACCUM s.@paths = s.id;

        WHILE Start.size() != 0 AND NOT @@reached DO
            Start = SELECT t
                FROM Start:s -(:e)-> :t
                WHERE t.@visited == false
                ACCUM CASE
                        WHEN e.connectionSpeed < 0 THEN err = 1
                        WHEN e.connectionSpeed < minSpeed THEN err = 2
                        WHEN t == tgt THEN @@reached += true
                      END,
                      # List1 * List2 -> List(each elem of List1 concat w/each elem of List2)
                      t.@paths += (s.@paths * ["~"]) * [t.id]
                POST-ACCUM t.@visited = true;
            # RAISE is not allowed inside SELECT, so test the error flag here.
            IF err == 1 AND enExcp THEN
                RAISE negSpeed ("Negative Speed");
            ELSE IF err == 2 AND enExcp THEN
                RAISE slowSpeed ("Slow Speed");
            END;
        END; # WHILE

        IF NOT @@reached AND enExcp THEN
            RAISE notReached ("No path to target");
        ELSE
            Result = {tgt};
            PRINT Result[Result.@paths]; // api v2
        END;
    EXCEPTION
        WHEN negSpeed THEN PRINT "bad path: negative speed";
        WHEN slowSpeed THEN PRINT "bad path: slow speed";
        WHEN notReached THEN PRINT "no path from source to target";
    END;
}

As the data in Appendix D show:

  • Any search passing through c1 will see negative edges.
  • Any search passing through c12 will see negative and slow edges.
  • Any search passing through c14 will see negative edges.

The results for 5 cases are shown: 1 valid search plus each of the 3 exception conditions.  The 5th case is the same as the 4th, but exception handling is not enabled.


compPathValid.json
GSQL > RUN QUERY compPathValid("c10","c12",true)
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "api": "v2"
  },
  "results": [{"Result": [{
    "v_id": "c12",
    "attributes": {"Result.@paths": ["c10~c11~c12"]},
    "v_type": "computer"
  }]}]
}
GSQL > RUN QUERY compPathValid("c1","c12",true)
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "api": "v2"
  },
  "results": [{"bad path: negative speed": "bad path: negative speed"}]
}
GSQL > RUN QUERY compPathValid("c10","c13",true)
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "api": "v2"
  },
  "results": [{"bad path: slow speed": "bad path: slow speed"}]
}
GSQL > RUN QUERY compPathValid("c24","c25",true)
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "api": "v2"
  },
  "results": [{"no path from source to target": "no path from source to target"}]
}
GSQL > RUN QUERY compPathValid("c24","c25",false)
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "api": "v2"
  },
  "results": [{"Result": [{
    "v_id": "c25",
    "attributes": {"Result.@paths": []},
    "v_type": "computer"
  }]}]
}

 

Exception Handling Flowchart


The flowchart below summarizes all the cases for triggering and handling exceptions, both user-defined and built-in.




 

Comments

A comment is a section of text that is ignored by the language parser; its purpose is to provide information to human readers.  The comment markers follow the conventions used in C++ and SQL:

  • Single-line or partial-line comments begin with either # or // and end at the end of the line (with the newline character).
  • Multi-line comment blocks begin with /* and end with */
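
The three comment markers can be illustrated with a small sketch. It is written in Python rather than GSQL so that it can be run on its own, and it is not how the GSQL parser is implemented. The helper name strip_comments is hypothetical, and the sketch makes one simplifying assumption: it does not recognize string literals, so a # or // inside a quoted string would also be removed.

```python
import re

# Hypothetical helper, not part of GSQL. It removes text that the rules above
# describe as comments:
#   /* ... */   multi-line block (may span several lines)
#   # ...       from the marker to the end of the line
#   // ...      from the marker to the end of the line
# Simplifying assumption: string literals are not recognized.
_COMMENT = re.compile(r"/\*.*?\*/|(?:#|//)[^\n]*", re.DOTALL)

def strip_comments(text: str) -> str:
    """Return text with all comments removed; newlines are kept."""
    return _COMMENT.sub("", text)

query = """x = 1; # trailing comment
// whole-line comment
/* multi-line
   comment */ y = 2;"""

print(strip_comments(query))
```

Only the two statements remain after stripping; the newline that ends each single-line comment is preserved, as the first bullet describes.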

Appendices

Appendix A. Common Errors and Problems

Floating Point Precision Limits

No computer can store all floating point numbers (i.e., non-integers) with perfect precision. The float data type offers about 7 decimal digits of precision; the double data type offers about 15 decimal digits of precision. Comparing two float or double values by using operators involving exact equality (==, <=, >=, BETWEEN … AND …) might lead to unexpected behavior. If the GSQL language parser detects that the user is attempting an exact equivalence test with float or double data types, it will display a warning message and suggestion. For example, if there are two float variables v and v2, the expression v == v2 causes the following warning message:

The comparison 'v==v2' may lead to unexpected behavior because it involves
equality test between float/double numeric values. We suggest to do such
comparison with an error margin, e.g. 'abs((v) - (v2)) < epsilon', where epsilon
is a very small positive value of your choice, such as 0.0001.
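
The effect behind this warning is not specific to GSQL. The following sketch uses Python (whose float type is a double-precision value) to show an exact equality test failing and the suggested error-margin test succeeding. The function name approx_equal is an illustrative assumption; the default epsilon of 0.0001 is taken from the warning message.

```python
def approx_equal(v: float, v2: float, epsilon: float = 0.0001) -> bool:
    """Compare with an error margin, as the warning suggests:
    abs((v) - (v2)) < epsilon."""
    return abs(v - v2) < epsilon

a = 0.1 + 0.2   # not exactly 0.3 in binary floating point
b = 0.3

print(a == b)              # False
print(approx_equal(a, b))  # True
```

The right epsilon depends on the magnitude of the values being compared; 0.0001 is only the example value from the message.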




Response to Non-existent vertex ID


If a query has a vertex parameter (VERTEX or VERTEX<vType>), and if the ID for a nonexistent vertex is given when running the query, an error message is shown, and the query won’t run. This is also the response when calling a function to convert a single vertex ID string to a vertex:

  • to_vertex(): See Section “Miscellaneous Functions”.






However, if the parameter is a vertex set (SET<VERTEX> or SET<VERTEX<vType>>), and one or more nonexistent IDs are given when running the query, a warning message is shown, but the query still runs, ignoring those nonexistent IDs. Therefore, if all given IDs are nonexistent, the parameter becomes an empty set. This is also the response when calling a function to convert a set of vertex IDs to a set of vertices:

  • to_vertex_set(): See Section “Miscellaneous Functions”.
  • SelectVertex(): See Section “Miscellaneous Functions”.
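
The difference between the two responses can be modeled with a short Python sketch. This is only a model of the behavior described above, not TigerGraph code: EXISTING_IDS, resolve_vertex, and resolve_vertex_set are hypothetical names, and the real error and warning texts are not reproduced here.

```python
import warnings

# Hypothetical stand-in for the vertex IDs stored in a graph.
EXISTING_IDS = {"c1", "c10", "c12"}

def resolve_vertex(vertex_id: str) -> str:
    """Single-vertex case: a nonexistent ID is an error and nothing runs."""
    if vertex_id not in EXISTING_IDS:
        raise ValueError(f"vertex ID {vertex_id!r} does not exist")
    return vertex_id

def resolve_vertex_set(vertex_ids) -> set:
    """Vertex-set case: warn about nonexistent IDs, then ignore them."""
    requested = set(vertex_ids)
    missing = requested - EXISTING_IDS
    if missing:
        warnings.warn(f"ignoring nonexistent vertex IDs: {sorted(missing)}")
    return requested & EXISTING_IDS
```

In this model, resolve_vertex_set(["c1", "zz"]) warns and returns a set containing only "c1", while a call where every ID is unknown returns an empty set, matching the "becomes an empty set" case.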

 

 

Appendix B. Complete Formal Syntax for Query Language

Version 2.1

This is the definition for the GSQL Query Language syntax.  It is defined as a set of rules expressed in EBNF notation.

Notation Used to Define Syntax

This section defines the EBNF notation used to describe the syntax.  Rules contain terminal and non-terminal symbols.  A terminal symbol is a base-level symbol which expresses literal output. All symbols in single or double quotes (e.g., ‘+’, “=”, “)”, “10”) are terminal symbols. A non-terminal symbol is defined as some combination of terminal and non-terminal symbols. The left-hand side of a rule is always a non-terminal; this rule defines the non-terminal.  The example rule below defines assignmentStmt (that is, an Assignment Statement) to be a name followed by an equal sign followed by an expression, operator, and expression with a terminating semicolon.  assignmentStmt, name, and expr are all non-terminals.  Additionally, all KEYWORDS are in all-capitals and are terminal symbols.  The “:=” is part of EBNF and states that the left-hand side can be expanded to the right-hand side.


EBNF Syntax example: A rule
assignmentStmt := name "=" expr op expr ";"

A vertical bar | in EBNF indicates choice.  Choose either the symbol on the left or on the right.  A sequence of vertical bars means choose any one of the symbols in the sequence.


EBNF Syntax: vertical bar
op := "+" | "-" | "*" | "/"


Square brackets [ ] indicate an optional part or group of symbols.  Parentheses ( ) group symbols together.  The rule below defines a constant to be one, two, or three digits preceded by an optional plus or minus sign.


EBNF Syntax: Square brackets and parentheses
constant := ["+" | "-"] (digit | (digit digit) | (digit digit digit))
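
One way to check a reading of this rule is to translate it into a regular expression. The sketch below does so in Python; the names CONSTANT and is_constant are illustrative assumptions, and digit is taken to be [0-9] as in the GSQL grammar later in this appendix.

```python
import re

# constant := ["+" | "-"] (digit | (digit digit) | (digit digit digit))
#   [ ... ]  optional part      ->  (?: ... )?  written here as [+-]?
#   ( ... )  grouping           ->  (?: ... )
#   |        choice             ->  |
CONSTANT = re.compile(r"[+-]?(?:[0-9]|[0-9][0-9]|[0-9][0-9][0-9])")

def is_constant(text: str) -> bool:
    """True when the whole string matches the constant rule."""
    return CONSTANT.fullmatch(text) is not None

print(is_constant("+42"))   # True
print(is_constant("-7"))    # True
print(is_constant("1234"))  # False: four digits
print(is_constant("+-5"))   # False: only one optional sign
```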


Star * and plus + are symbols in EBNF for closure.  Star means zero or more occurrences, and plus means one or more occurrences.  The following defines intConstant to be an optional plus or minus followed by one or more digits.  It also defines floatConstant to be an optional plus or minus followed by zero or more digits followed by a decimal point followed by one or more digits.  The star and plus also can be applied to groups of symbols as in the definition of list.  The non-terminal list is defined as a parenthesized list of comma-separated expressions (expr).  The list has at least one expression which can be followed by zero or more comma-expression pairs.


EBNF Syntax: star and plus
intConstant := ["+" | "-"] digit+
floatConstant := ["+" | "-"] digit* "." digit+
list := "(" expr ["," expr]* ")"
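
The closure symbols map directly onto the * and + quantifiers of regular expressions, which gives a quick way to test strings against these three rules. The Python sketch below is an illustration under two assumptions: expr is narrowed to intConstant (the real expr rule is far larger), and whitespace between symbols is not allowed. The pattern names are hypothetical.

```python
import re

INT = r"[+-]?[0-9]+"                  # ["+" | "-"] digit+
INT_CONSTANT = re.compile(INT)
FLOAT_CONSTANT = re.compile(r"[+-]?[0-9]*\.[0-9]+")   # digit* "." digit+
# list := "(" expr ["," expr]* ")"  with expr assumed to be intConstant
LIST = re.compile(r"\(" + INT + r"(?:," + INT + r")*\)")

def matches(pattern: re.Pattern, text: str) -> bool:
    """True when the whole string is derivable from the rule."""
    return pattern.fullmatch(text) is not None

print(matches(INT_CONSTANT, "-120"))     # True
print(matches(FLOAT_CONSTANT, ".5"))     # True: digit* allows zero digits
print(matches(FLOAT_CONSTANT, "5."))     # False: digit+ needs one digit
print(matches(LIST, "(1,2,3)"))          # True
print(matches(LIST, "()"))               # False: at least one expr
```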

Curly braces { } enclose an optional group of symbols which are repeated zero or more times. Therefore, curly braces are equivalent to square brackets or parentheses followed by a star * to indicate zero or more repetitions.  All of the following expressions are equivalent:

list1 := expr ["," expr]*
list2 := expr ("," expr)*
list3 := expr {"," expr}

For brevity, the literal comma is sometimes shown without quotation marks:

list4 := expr {, expr}
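
In a hand-written parser, a curly-brace group (or a group followed by a star) typically becomes a loop. The Python sketch below recognizes list := expr {"," expr} over a pre-split token list. It is a minimal illustration: parse_list is a hypothetical name, and expr is assumed to be any single token other than a comma.

```python
def parse_list(tokens):
    """Recognize  list := expr {"," expr}  and return the expr tokens.

    Assumption for illustration: an expr is any single non-comma token.
    Raises SyntaxError when the tokens do not match the rule.
    """
    items = []
    pos = 0

    def expr():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] == ",":
            raise SyntaxError(f"expected expr at token {pos}")
        items.append(tokens[pos])
        pos += 1

    expr()                                   # the mandatory first expr
    while pos < len(tokens) and tokens[pos] == ",":
        pos += 1                             # consume ","
        expr()                               # each "," must be followed by an expr
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at {pos}")
    return items

print(parse_list(["a", ",", "b", ",", "c"]))  # ['a', 'b', 'c']
print(parse_list(["a"]))                      # ['a']
```

The while loop runs zero or more times, which is exactly the repetition that {"," expr}, ["," expr]*, and ("," expr)* all express.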

 

GSQL Query Language EBNF

#########################################################
## EBNF for GSQL Query Language

createQuery := CREATE [DISTRIBUTED] [OR REPLACE] QUERY name "(" [parameterList] ")" FOR GRAPH name
    [RETURNS "(" baseType | accumType ")"]
    [API "(" stringLiteral ")"]
    "{" [typedefs] [declStmts] [declExceptStmts] queryBodyStmts "}"

parameterValueList := parameterValue [, parameterValue]*
parameterValue := parameterConstant
    | "[" parameterValue [, parameterValue]* "]" // BAG or SET
    | "(" stringLiteral, stringLiteral ")" // a generic VERTEX value
parameterConstant := numeric | stringLiteral | TRUE | FALSE
parameterList := parameterType name ["=" constant] ["," parameterType name ["=" constant]]*

typedefs := (typedef ";")+
declStmts := (declStmt ";")+
declStmt := baseDeclStmt | accumDeclStmt | fileDeclStmt
declExceptStmts := (declExceptStmt ";")+
queryBodyStmts := (queryBodyStmt ";")+
queryBodyStmt := assignStmt // Assignment
    | vSetVarDeclStmt // Declaration
    | gAccumAssignStmt // Assignment
    | gAccumAccumStmt // Assignment
    | funcCallStmt // Function Call
    | selectStmt // Select
    | queryBodyCaseStmt // Control Flow
    | queryBodyIfStmt // Control Flow
    | queryBodyWhileStmt // Control Flow
    | queryBodyForEachStmt // Control Flow
    | BREAK // Control Flow
    | CONTINUE // Control Flow
    | updateStmt // Data Modification
    | insertStmt // Data Modification
    | queryBodyDeleteStmt // Data Modification
    | printStmt // Output
    | printlnStmt // Output
    | logStmt // Output
    | returnStmt // Output
    | raiseStmt // Exception
    | tryStmt // Exception

installQuery := INSTALL QUERY [installOptions] ( "*" | ALL | name [, name]* )
runQuery := RUN QUERY [runOptions] name "(" parameterValueList ")"

showQuery := SHOW QUERY name
dropQuery := DROP QUERY ( "*" | ALL | name [, name]* )

#########################################################
## Types and names

lowercase := [a-z]
uppercase := [A-Z]
letter := lowercase | uppercase
digit := [0-9]
integer := ["-"]digit+
real := ["-"]("." digit+) | ["-"](digit+ "." digit*)

numeric := integer | real
stringLiteral := '"' [~["] | '\\' ('"' | '\\')]* '"'

name := (letter | "_") [letter | digit | "_"]* // Can be a single "_" or start with "_"

type := baseType | name | accumType | STRING COMPRESS

baseType := INT
    | UINT
    | FLOAT
    | DOUBLE
    | STRING
    | BOOL
    | VERTEX ["<" name ">"]
    | EDGE
    | JSONOBJECT
    | JSONARRAY
    | DATETIME

filePath := name | stringLiteral

typedef := TYPEDEF TUPLE "<" tupleType ">" name

tupleType := (baseType name) | (name baseType) ["," (baseType name) | (name baseType)]*

parameterType := baseType
    | [ SET | BAG ] "<" baseType ">"
    | FILE

#########################################################
## Accumulators

accumDeclStmt := accumType "@"name ["=" constant][, "@"name ["=" constant]]*
    | "@"name ["=" constant][, "@"name ["=" constant]]* accumType
    | [STATIC] accumType "@@"name ["=" constant][, "@@"name ["=" constant]]*
    | [STATIC] "@@"name ["=" constant][, "@@"name ["=" constant]]* accumType

accumType := "SumAccum" "<" ( INT | FLOAT | DOUBLE | STRING | STRING COMPRESS) ">"
    | "MaxAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
    | "MinAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
    | "AvgAccum"
    | "OrAccum"
    | "AndAccum"
    | "BitwiseOrAccum"
    | "BitwiseAndAccum"
    | "ListAccum" "<" type ">"
    | "SetAccum" "<" elementType ">"
    | "BagAccum" "<" elementType ">"
    | "MapAccum" "<" elementType "," type ">"
    | "HeapAccum" "<" name ">" "(" (integer | name) "," name [ASC | DESC] ["," name [ASC | DESC]]* ")"
    | "GroupByAccum" "<" elementType name ["," elementType name]* , accumType name ["," accumType name]* ">"
    | "ArrayAccum" "<" name ">"

elementType := baseType | name | STRING COMPRESS

gAccumAccumStmt := "@@"name "+=" expr

###############################################################################
## Operators, Functions, and Expressions

constant := numeric | stringLiteral | TRUE | FALSE | GSQL_UINT_MAX
    | GSQL_INT_MAX | GSQL_INT_MIN | TO_DATETIME "(" stringLiteral ")"

mathOperator := "*" | "/" | "%" | "+" | "-" | "<<" | ">>" | "&" | "|"

condition := expr
    | expr comparisonOperator expr
    | expr [ NOT ] IN setBagExpr
    | expr IS [ NOT ] NULL
    | expr BETWEEN expr AND expr
    | "(" condition ")"
    | NOT condition
    | condition (AND | OR) condition
    | (TRUE | FALSE)

comparisonOperator := "<" | "<=" | ">" | ">=" | "==" | "!="

expr := ["@@"]name
    | name "." "type"
    | name "." ["@"]name
    | name "." "@"name ["\'"]
    | name "." name "." name "(" [argList] ")"
    | name "." name "(" [argList] ")" [ ".".FILTER "(" condition ")" ]
    | name ["<" type ["," type]* ">"] "(" [argList] ")"
    | name "." "@"name ("." name "(" [argList] ")")+ ["." name]
    | "@@"name ("." name "(" [argList] ")")+ ["." name]
    | COALESCE "(" [argList] ")"
    | ( COUNT | ISEMPTY | MAX | MIN | AVG | SUM ) "(" setBagExpr ")"
    | expr mathOperator expr
    | "-" expr
    | "(" expr ")"
    | "(" argList "->" argList ")" // key value pair for MapAccum
    | "[" argList "]" // a list
    | constant
    | setBagExpr
    | name "(" argList ")" // function call or a tuple object

setBagExpr := ["@@"]name
    | name "." ["@"]name
    | name "." "@"name ("." name "(" [argList] ")")+
    | name "." name "(" [argList] ")" [ ".".FILTER "(" condition ")" ]
    | "@@"name ("." name "(" [argList] ")")+
    | setBagExpr (UNION | INTERSECT | MINUS) setBagExpr
    | "(" argList ")"
    | "(" setBagExpr ")"

#########################################################
## Declarations and Assignments ##

## Declarations ##
baseDeclStmt := baseType name ["=" constant][, name ["=" constant]]*
fileDeclStmt := FILE fileVar "(" filePath ")"
fileVar := name

localVarDeclStmt := baseType name "=" expr

vSetVarDeclStmt := name ["(" vertexEdgeType ")"] "=" (seedSet | simpleSet | selectBlock)

simpleSet := name | "(" simpleSet ")" | simpleSet (UNION | INTERSECT | MINUS) simpleSet

seedSet := "{" [seed ["," seed ]*] "}"
seed := '_'
    | ANY
    | ["@@"]name
    | name ".*"
    | "SelectVertex" selectVertParams

selectVertParams := "(" filePath "," columnId "," (columnId | name) ","
    stringLiteral "," (TRUE | FALSE) ")" [".".FILTER "(" condition ")"]

columnId := "$" (integer | stringLiteral)

## Assignment Statements ##
assignStmt := name "=" expr
    | name "." name "=" expr
    | name "." "@"name ("+="| "=") expr

gAccumAssignStmt := "@@"name ("+=" | "=") expr

loadAccumStmt := "@@"name "=" "{" "LOADACCUM" loadAccumParams ["," "LOADACCUM" loadAccumParams]* "}"

loadAccumParams := "(" filePath "," columnId "," [columnId ","]*
    stringLiteral "," (TRUE | FALSE) ")" [".".FILTER "(" condition ")"]

## Function Call Statement ##
funcCallStmt := name ["<" type ["," type]* ">"] "(" [argList] ")"
    | "@@"name ("." name "(" [argList] ")")+

argList := expr ["," expr]*

#########################################################
## Select Statement

selectStmt := name "=" selectBlock

selectBlock := SELECT name FROM ( edgeSet | vertexSet )
    [sampleClause]
    [whereClause]
    [accumClause]
    [postAccumClause]
    [havingClause]
    [orderClause]
    [limitClause]

vertexSet := name [":" name]

edgeSet := name [":" name]
    "-" "(" [vertexEdgeType] [":" name] ")" "->"
    [vertexEdgeType] [":" name]

vertexEdgeType := "_" | ANY | name | ( "(" name ["|" name]* ")" )

sampleClause := SAMPLE ( expr | expr "%" ) EDGE WHEN condition
    | SAMPLE expr TARGET WHEN condition
    | SAMPLE expr "%" TARGET PINNED WHEN condition

whereClause := WHERE condition

accumClause := ACCUM DMLSubStmtList

postAccumClause := POST-ACCUM DMLSubStmtList

DMLSubStmtList := DMLSubStmt ["," DMLSubStmt]*

DMLSubStmt := assignStmt // Assignment
    | funcCallStmt // Function Call
    | gAccumAccumStmt // Assignment
    | vAccumFuncCall // Function Call
    | localVarDeclStmt // Declaration
    | DMLSubCaseStmt // Control Flow
    | DMLSubIfStmt // Control Flow
    | DMLSubWhileStmt // Control Flow
    | DMLSubForEachStmt // Control Flow
    | BREAK // Control Flow
    | CONTINUE // Control Flow
    | insertStmt // Data Modification
    | DMLSubDeleteStmt // Data Modification
    | printlnStmt // Output
    | logStmt // Output

vAccumFuncCall := name "." "@"name ("." name "(" [argList] ")")+

havingClause := HAVING condition

orderClause := ORDER BY expr [ASC | DESC] ["," expr [ASC | DESC]]*

limitClause := LIMIT ( expr | expr "," expr | expr OFFSET expr )

#########################################################
## Control Flow Statements ##

queryBodyIfStmt := IF condition THEN queryBodyStmts [ELSE IF condition THEN queryBodyStmts ]* [ELSE queryBodyStmts ] END
DMLSubIfStmt := IF condition THEN DMLSubStmtList [ELSE IF condition THEN DMLSubStmtList ]* [ELSE DMLSubStmtList ] END

queryBodyCaseStmt := CASE (WHEN condition THEN queryBodyStmts)+ [ELSE queryBodyStmts] END
    | CASE expr (WHEN constant THEN queryBodyStmts)+ [ELSE queryBodyStmts] END
DMLSubCaseStmt := CASE (WHEN condition THEN DMLSubStmtList)+ [ELSE DMLSubStmtList] END
    | CASE expr (WHEN constant THEN DMLSubStmtList)+ [ELSE DMLSubStmtList] END

queryBodyWhileStmt := WHILE condition [LIMIT (name | integer)] DO queryBodyStmts END
DMLSubWhileStmt := WHILE condition [LIMIT (name | integer)] DO DMLSubStmtList END

queryBodyForEachStmt := FOREACH forEachControl DO queryBodyStmts END
DMLSubForEachStmt := FOREACH forEachControl DO DMLSubStmtList END

forEachControl := ( name | "(" name (, name)+ ")") (IN | ":") setBagExpr
    | name IN RANGE "[" expr , expr "]" [".STEP(" expr ")"]

#########################################################
## Other Data Modifications Statements ##

queryBodyDeleteStmt := DELETE name FROM ( edgeSet | vertexSet ) [whereClause]
DMLSubDeleteStmt := DELETE "(" name ")"

updateStmt := UPDATE name FROM ( edgeSet | vertexSet ) SET DMLSubStmtList [whereClause]

insertStmt := INSERT INTO name ["(" ( PRIMARY_ID | FROM "," TO ) ["," name]* ")"]
    VALUES "(" ( "_" | expr ) [name] ["," ( "_" | expr ) [name] ["," ("_" | expr)]*] ")"

#########################################################
## Output Statements ##

printStmt := PRINT printExpr {, printExpr} [WHERE condition] [TO_CSV (filePath | fileVar)]
printExpr := (expr | vExprSet) [ AS name]
vExprSet := expr "[" vSetProj {, vSetProj} "]"
vSetProj := expr [ AS name]

printlnStmt := fileVar ".println" "(" expr {, expr} ")"

logStmt := LOG "(" condition "," argList ")"

returnStmt := RETURN expr

#########################################################
## Exception Statements ##

declExceptStmt := EXCEPTION exceptVarName "(" errorInt ")"
exceptVarName := name
errorInt := integer

raiseStmt := RAISE exceptVarName [errorMsg]
errorMsg := "(" expr ")"

tryStmt := TRY queryBodyStmts EXCEPTION caseExceptBlock+ [elseExceptBlock] END ";"
caseExceptBlock := WHEN exceptVarName THEN queryBodyStmts
elseExceptBlock := ELSE queryBodyStmts

 

 

 

 

Appendix C. Query Language Reserved Words

The following words are reserved for use by the GSQL query language. This includes words which are currently keywords (such as GRAPH), as well as words which might be used in the future (such as EXTERN).


Query Language Reserved Words
ACCUM ALIGNAS ALIGNOF AND
AND_EQ ANY ASC ASM
AUTO AVG BAG BETWEEN
BITAND BITOR BOOL BREAK
BY CASE CATCH CHAR
CHAR16_T CHAR32_T CLASS COALESCE
COMPL COMPRESS CONCEPT CONST
CONSTEXPR CONST_CAST CONTINUE COUNT
CREATE DATETIME DATETIME_ADD DATETIME_SUB
DECLTYPE DEFAULT DELETE DESC
DISTRIBUTED
DO DONE DOUBLE DYNAMIC_CAST
EDGE ELSE END ENUM
ESCAPE EXCEPTION EXPLICIT EXPORT
EXTERN FALSE FILTER FLOAT
FOR FOREACH FRIEND FROM
GOTO GRAPH GSQL_INT_MAX GSQL_INT_MIN
GSQL_UINT_MAX HAVING IF IN
INLINE INSERT INT INTERSECT
INTERVAL INTO IS ISEMPTY
JSONARRAY JSONOBJECT LIKE LIMIT
LIST LOADACCUM LOG LONG
MAP MAX MIN MINUS
MUTABLE NAMESPACE NEW NOEXCEPT
NOT NOT_EQ