Data Types
This section describes the data types that are native to and are supported by the GSQL Query Language. Most of the data objects used in queries come from one of three sources:
- The query’s input parameters
- The vertices, edges, and their attributes which are encountered when traversing the graph
- The variables defined within the query to assist in the computational work of the query
This section covers the following subset of the EBNF language definitions:
lowercase := [a-z]
uppercase := [A-Z]
letter := lowercase | uppercase
digit := [0-9]
integer := ["-"]digit+
real := ["-"]("."digit+) | ["-"](digit+"."digit*)
numeric := integer | real
stringLiteral := '"' [~["] | '\\' ('"' | '\\')]* '"'
name := (letter | "_") [letter | digit | "_"]* // Can be a single "_" or start with "_"
graphName := name
queryName := name
paramName := name
vertexType := name
edgeType := name
accumName := name
vertexSetName := name
attrName := name
varName := name
tupleType := name
fieldName := name
funcName := name
type := baseType | tupleType | accumType | STRING COMPRESS
baseType := INT
| UINT
| FLOAT
| DOUBLE
| STRING
| BOOL
| VERTEX ["<" vertexType ">"]
| EDGE
| JSONOBJECT
| JSONARRAY
| DATETIME
filePath := paramName | stringLiteral
typedef := TYPEDEF TUPLE "<" tupleFields ">" tupleType
tupleFields := (baseType fieldName) | (fieldName baseType)
["," (baseType fieldName) | (fieldName baseType)]*
parameterType := baseType
| [ SET | BAG ] "<" baseType ">"
| FILE
Identifiers
An identifier is the name for an instance of a language element. In the GSQL query language, identifiers are used to name elements such as a query, a variable, or a user-defined function.
In the EBNF syntax, an identifier is referred to as name.
It can be a sequence of letters, digits, or underscores ("_").
Other punctuation characters are not supported. The initial character can only be a letter or an underscore.
name := (letter | "_") [letter | digit | "_"]*
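As an illustration of the name rule, the declarations below sketch which identifiers the grammar accepts and which it rejects. The variable names are hypothetical, and the check is only against the EBNF rule above; any further restrictions, such as reserved keywords, are outside what this section states.

```gsql
INT orderTotal;       # accepted: letters only
INT order_total2;     # accepted: letters, an underscore, and a digit
INT _orderTotal;      # accepted: the initial character may be an underscore
# INT 2orderTotal;    # rejected: the initial character is a digit
# INT order-total;    # rejected: "-" is not a letter, digit, or underscore
```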
Overview of Types
Different types of data can be used in different contexts. The EBNF syntax defines several classes of data types. The most basic is called base type (baseType).
The other independent types are FILE and STRING COMPRESS. The remaining types are either compound data types built from the independent data types, or supersets of other types. The table below gives an overview of their definitions and their uses.
EBNF term | Description | Use Case |
---|---|---|
tupleType | Sequence of base types | |
accumType | Family of specialized data objects which support accumulation operations | |
STRING COMPRESS (⚠ suitable only in limited circumstances) | STRING COMPRESS | |
Base Types
The query language supports the following base types, which can be declared and assigned anywhere within their scope. Any of these base types may be used when defining a global variable, a local variable, a query return value, a parameter, part of a tuple, or an element of a container accumulator. Accumulators are described in detail in a later section.
baseType := INT
| UINT
| FLOAT
| DOUBLE
| STRING
| BOOL
| VERTEX ["<" vertexType ">"]
| EDGE
| JSONOBJECT
| JSONARRAY
| DATETIME
The default value of each base type is shown in the table below. The default value is the initial value of a base type variable (see Section "Variable Types" for more details), or the default return value for some functions.
The first seven types (INT, UINT, FLOAT, DOUBLE, BOOL, STRING, and DATETIME) are the same ones mentioned in the "Attribute Data Types" section of GSQL Language Reference, Part 1.
Type | Default value | Literal |
---|---|---|
INT | 0 | A signed integer: -3 |
UINT | 0 | An unsigned integer: 5 |
FLOAT | 0 | A decimal: -3.14159 |
DOUBLE | 0 | A decimal with greater precision than FLOAT |
BOOL | false | TRUE or FALSE |
STRING | "" | Characters enclosed by double quotes: "Hello" |
DATETIME | 1970-01-01 00:00:00 | No literal. Can be converted from a correctly formatted string with to_datetime(). |
VERTEX | "Unknown" | No literal. |
EDGE | No edge: {} | No literal. |
JSONOBJECT | An empty object: {} | No literal. Can be converted from a correctly formatted string with parse_json_object(). |
JSONARRAY | An empty array: [] | No literal. Can be converted from a correctly formatted string with parse_json_array(). |
The GSQL Loader can read FLOAT and DOUBLE values with exponential notation (e.g., 1.25 E-7).
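A minimal sketch of declaring base type variables inside a query, assuming the investmentNet graph used elsewhere in this section. The query name and variable names are hypothetical. The first variable is declared without an initializer, so it starts at its default value; DATETIME has no literal form, so the sketch converts a formatted string with the built-in to_datetime() function.

```gsql
CREATE QUERY baseTypeEx() FOR GRAPH investmentNet {
  INT      i;                 # no initializer: starts at the default value
  UINT     u = 5;
  FLOAT    f = -3.14159;
  DOUBLE   d = -3.14159265;
  BOOL     b = TRUE;
  STRING   s = "Hello";
  # DATETIME has no literal, so convert from a correctly formatted string
  DATETIME t = to_datetime("2017-03-03 18:42:28");
  PRINT i, u, f, d, b, s, t;
}
```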
Vertex
VERTEX is considered a base type in the GSQL query language.
Both query parameters and variables in a query body can be of type VERTEX.
Vertex types
A graph schema defines specific vertex types.
Each vertex type has its own set of attributes.
The parameter or variable type can be restricted by giving the vertex type in angle brackets <> after the keyword VERTEX.
A vertex variable declared without a specifier is called a generic vertex variable.
VERTEX anyVertex;
VERTEX<person> owner;
All vertices have a built-in attribute type. The built-in attribute is of type string. You can access it with the dot (.) operator.
For example, if you declare a vertex variable VERTEX<person> personVertex, then personVertex.type returns "person".
Edge
EDGE is considered a base type in the GSQL query language.
Both query parameters and variables in a query body can be of type EDGE.
Edge types
A graph schema defines specific edge types.
Each edge type has its own set of attributes.
The parameter or variable type can be restricted by giving the edge type in angle brackets <> after the keyword EDGE.
An edge variable declared without a specifier is called a generic edge variable.
EDGE anyEdge;
EDGE<friendship> friendEdge;
All edges have a built-in attribute type. The built-in attribute is of type string. You can access it with the dot (.) operator.
For example, if you define an edge variable EDGE<friendship> friendEdge, then friendEdge.type returns "friendship".
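The built-in type attribute is most often useful when a traversal leaves the vertex or edge type open. The sketch below assumes the investmentNet schema shown later in this section (person, stockOrder, makeOrder); the query name and the vertex set names are hypothetical.

```gsql
CREATE QUERY typeFilterEx(VERTEX<person> p) FOR GRAPH investmentNet {
  Start = {p};
  # Traverse any edge type to any vertex type,
  # then filter on the built-in type attribute of the edge and the target vertex
  Orders = SELECT t
           FROM Start:s -(ANY:e)-> ANY:t
           WHERE e.type == "makeOrder" AND t.type == "stockOrder";
  PRINT Orders;
}
```

Because type is a string, the comparison uses ordinary string equality against the type name as it is spelled in the schema.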
Vertex and Edge Attribute Types
The following table maps vertex or edge attribute types in the Data Definition Language (DDL) to GSQL query language types. If an attribute of a vertex or edge is referenced in a GSQL query, it will be automatically converted to its corresponding data type in the GSQL query language.
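DDL | GSQL Query |
---|---|
INT | INT |
UINT | UINT |
FLOAT | FLOAT |
DOUBLE | DOUBLE |
BOOL | BOOL |
STRING | STRING |
STRING COMPRESS | STRING |
SET<type> | SetAccum<type> |
LIST<type> | ListAccum<type> |
MAP<key_type, value_type> | MapAccum<key_type, value_type> |
DATETIME | DATETIME |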
SET and LIST literals
In the GSQL query language, one cannot declare a variable of SET (vertex sets are an exception), LIST, or MAP types. However, one can still use SET and LIST literals to update the value of a vertex attribute of type SET or LIST, insert a vertex or edge with attributes of type SET or LIST, and initialize an accumulator.
// Elements within a set or a list need to be of the same type
set_literal := "(" expr ("," expr)* ")"
list_literal := "[" expr ("," expr)* "]"
expr := INT | UINT | FLOAT | DOUBLE | BOOL | STRING | UDT | DATETIME
Currently, GSQL query language syntax does not support MAP literals.
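A minimal sketch of the accumulator-initialization case, following the literal grammar above (parentheses for a set, square brackets for a list). The query and accumulator names are hypothetical, and the graph name is borrowed from the investmentNet example in this section.

```gsql
CREATE QUERY literalEx() FOR GRAPH investmentNet {
  SetAccum<INT>     @@orderSizes = (10, 20, 30);         # set literal
  ListAccum<STRING> @@tickers    = ["AAPL", "MS", "G"];  # list literal
  PRINT @@orderSizes, @@tickers;
}
```

All elements within one literal have the same type, as the grammar comment requires.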
JSONOBJECT and JSONARRAY
These two base types allow users to pass a complex data object or to write output in a customized format.
These types follow the industry-standard definition of JSON.
A JSONOBJECT instance’s external representation (as input and output) is a string, starting and ending with curly braces ({}) which enclose an unordered list of key-value pairs. A JSONARRAY is represented as a string, starting and ending with square brackets ([]) which enclose an ordered list of values.
Since a value can be an object or an array, JSON supports hierarchical, nested data structures.
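To make the nesting concrete, here is a small sketch that builds a JSONOBJECT containing a JSONARRAY and reads values back out. It assumes the built-in functions parse_json_object(), getJsonArray(), getString(), getInt(), and size(); the query name, variable names, and sample data are hypothetical.

```gsql
CREATE QUERY jsonEx() FOR GRAPH investmentNet {
  # An object whose "sizes" value is itself an array
  JSONOBJECT rec = parse_json_object("{\"ticker\": \"AAPL\", \"sizes\": [10, 20]}");
  JSONARRAY sizes = rec.getJsonArray("sizes");
  PRINT rec, sizes;
  PRINT rec.getString("ticker") AS ticker,
        sizes.getInt(0) AS firstSize,
        sizes.size() AS numSizes;
}
```

The inner double quotes are escaped with a backslash, as allowed by the stringLiteral rule at the top of this section.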
Tuple
A tuple is a user-defined data structure consisting of a fixed sequence of base type variables.
Tuple types can be created and named using a TYPEDEF statement.
Tuples must be defined first, before any other statements in a query.
typedef := TYPEDEF TUPLE "<" tupleFields ">" tupleType
tupleFields := (baseType fieldName) | (fieldName baseType)
["," (baseType fieldName) | (fieldName baseType)]*
A tuple can also be defined in a graph schema and then can be used as a vertex or edge attribute type. A tuple type that has been defined in the graph schema does not need to be re-defined in a query.
The vertex type person contains two complex attributes:
- secretInfo of type SECRET_INFO, which is a user-defined tuple
- portfolio of type MAP<STRING, DOUBLE>
investmentNet Schema
TYPEDEF TUPLE <age UINT (4), mothersName STRING(20) > SECRET_INFO
CREATE VERTEX person(PRIMARY_ID personId STRING, portfolio MAP<STRING, DOUBLE>, secretInfo SECRET_INFO)
CREATE VERTEX stockOrder(PRIMARY_ID orderId STRING, ticker STRING, orderSize UINT, price FLOAT)
CREATE UNDIRECTED EDGE makeOrder(FROM person, TO stockOrder, orderTime DATETIME)
CREATE GRAPH investmentNet (*)
The query below reads both the SECRET_INFO tuple and the portfolio MAP.
The tuple type does not need to be redefined in the query.
To read and save the map, we define a MapAccum with the same types for key and value as the portfolio attribute.
In addition, the query creates a new tuple type, ORDER_RECORD.
CREATE QUERY tupleEx(VERTEX<person> p) FOR GRAPH investmentNet {
  # (1) Define a new tuple type ORDER_RECORD at the top of the query.
  TYPEDEF TUPLE <STRING ticker, FLOAT price, DATETIME orderTime> ORDER_RECORD;
  # (2) SECRET_INFO has already been defined in the investmentNet schema.
  SetAccum<SECRET_INFO> @@info;
  ListAccum<ORDER_RECORD> @@orderRecords;
  MapAccum<STRING, DOUBLE> @@portf;
  INIT = {p};
  # Get person p's secret_info and portfolio
  X = SELECT v FROM INIT:v
      ACCUM @@portf += v.portfolio, @@info += v.secretInfo;
  # Search person p's orders to record ticker, price, and order time.
  # Note that the tuple gathers info from both edges and vertices.
  orders = SELECT t
           FROM INIT:s -(makeOrder:e)->stockOrder:t
           ACCUM @@orderRecords += ORDER_RECORD(t.ticker, t.price, e.orderTime);
  PRINT @@portf, @@info;
  PRINT @@orderRecords;
}
GSQL > RUN QUERY tupleEx("person1")
{
"error": false,
"message": "",
"version": {
"edition": "developer",
"schema": 0,
"api": "v2"
},
"results": [
{
"@@info": [{
"mothersName": "JAMES",
"age": 25
}],
"@@portf": {
"AAPL": 3142.24,
"MS": 5000,
"G": 6112.23
}
},
{"@@orderRecords": [
{
"ticker": "AAPL",
"orderTime": "2017-03-03 18:42:28",
"price": 34.42
},
{
"ticker": "B",
"orderTime": "2017-03-03 18:42:30",
"price": 202.32001
},
{
"ticker": "A",
"orderTime": "2017-03-03 18:42:29",
"price": 50.55
}
]}
]
}
STRING COMPRESS
STRING COMPRESS is an integer type encoded by the system to represent string values. STRING COMPRESS uses less memory than STRING.
The STRING COMPRESS type is designed to act like STRING: data are loaded and printed just as string data, and most functions and operators which take STRING input can also take STRING COMPRESS input. The difference is in how the data are stored internally.
A STRING COMPRESS value can be obtained from a STRING_SET COMPRESS or STRING_LIST COMPRESS attribute or from converting a STRING value.
We recommend performing comparison tests for both performance and memory usage before settling on STRING COMPRESS.
The STRING COMPRESS type is beneficial for sets of string values when the same values are used multiple times.
In practice, STRING COMPRESS values are most useful in container accumulators like ListAccum<STRING COMPRESS> or SetAccum<STRING COMPRESS>.
An accumulator containing STRING COMPRESS stores the dictionary when it is assigned an attribute value or a value from another accumulator containing STRING COMPRESS.
An accumulator containing STRING COMPRESS can store multiple dictionaries.
A STRING value can be converted to a STRING COMPRESS value only if the value is in the dictionaries.
If the STRING value is not in the dictionaries, the original string value is saved.
A STRING COMPRESS value can be automatically converted to a STRING value.
When a STRING COMPRESS value is output (e.g. by a PRINT statement), it is shown as a STRING.
Below is an example query that uses the STRING COMPRESS type.
CREATE QUERY stringCompressEx(VERTEX<person> m1) FOR GRAPH workNet {
  ListAccum<STRING COMPRESS> @@strCompressList, @@strCompressList2;
  SetAccum<STRING COMPRESS> @@strCompressSet, @@strCompressSet2;
  ListAccum<STRING> @@strList, @@strList2;
  SetAccum<STRING> @@strSet, @@strSet2;
  S = {m1};
  S = SELECT s
      FROM S:s
      ACCUM @@strSet += s.interestSet,
            @@strList += s.interestList,
            @@strCompressSet += s.interestSet,    # use the dictionary from person.interestSet
            @@strCompressList += s.interestList;  # use the dictionary from person.interestList
  @@strCompressList2 += @@strCompressList;  # @@strCompressList2 gets the dictionary from @@strCompressList, which is from person.interestList
  @@strCompressList2 += "xyz";              # "xyz" is not in the dictionary, so store the actual string value
  @@strCompressSet2 += @@strCompressSet;
  @@strCompressSet2 += @@strSet;
  @@strList2 += @@strCompressList;          # string compress integer values are decoded to strings
  @@strSet2 += @@strCompressSet;
  PRINT @@strSet, @@strList, @@strCompressSet, @@strCompressList;
  PRINT @@strSet2, @@strList2, @@strCompressSet2, @@strCompressList2;
}
GSQL > RUN QUERY stringCompressEx("person12")
{
"error": false,
"message": "",
"version": {
"edition": "developer",
"schema": 0,
"api": "v2"
},
"results": [
{
"@@strCompressList": [
"music",
"engineering",
"teaching",
"teaching",
"teaching"
],
"@@strSet": [ "teaching", "engineering", "music" ],
"@@strCompressSet": [ "music", "engineering", "teaching" ],
"@@strList": [
"music",
"engineering",
"teaching",
"teaching",
"teaching"
]
},
{
"@@strSet2": [ "music", "engineering", "teaching" ],
"@@strCompressList2": [
"music",
"engineering",
"teaching",
"teaching",
"teaching",
"xyz"
],
"@@strList2": [
"music",
"engineering",
"teaching",
"teaching",
"teaching"
],
"@@strCompressSet2": [ "teaching", "engineering", "music" ]
}
]
}
FILE Object
A FILE object is a sequential data storage object, associated with a text file on the local machine.
When a FILE object is declared, associated with a particular text file, any existing content in the text file will be erased.
During the execution of the query, content written to the FILE will be appended to the FILE.
When the query where the FILE was declared finishes running, the FILE contents are saved to the text file.
A FILE object can be passed as a parameter to another query. When a query receives a FILE object as a parameter, it can append data to that FILE, as can every other query which receives this FILE object as a parameter.
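The lifecycle described above (erase on declaration, append during the query, save when the query finishes) can be sketched as follows. This assumes the FILE declaration syntax with a path in parentheses and the println() method; the query name, the fileLocation parameter, and the use of the investmentNet stockOrder vertices are illustrative choices, not part of the original text.

```gsql
CREATE QUERY fileEx(STRING fileLocation) FOR GRAPH investmentNet {
  FILE f1 (fileLocation);          # declaring the FILE erases any existing content
  Start = {stockOrder.*};
  f1.println("ticker", "price");   # first line written: a header
  Orders = SELECT v
           FROM Start:v
           ACCUM f1.println(v.ticker, v.price);  # one line appended per vertex
}
```

Running the query twice with the same path would leave only the second run's output in the file, because the declaration erases the earlier content.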
Query Parameter Types
Input parameters to a query can be any base type (except EDGE, JSONARRAY, or JSONOBJECT).
A parameter can also be a SET or BAG which uses a base type (except EDGE, JSONARRAY, or JSONOBJECT) as the element type. A FILE object can also be a parameter.
Within the query, SET and BAG are converted to SetAccum and BagAccum, respectively.
A query parameter is immutable. It cannot be assigned a new value within the query.
parameterType := INT
| UINT
| FLOAT
| DOUBLE
| STRING
| BOOL
| VERTEX ["<" vertexType ">"]
| DATETIME
| [ SET | BAG ] "<" baseType ">"
| FILE
(SET<VERTEX<person>> p1, BAG<INT> ids, FILE f1)
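As a sketch of how container parameters might be used inside a query body, the example below takes a SET of vertices and a BAG of integers. The query name and accumulator name are hypothetical, and the FILE parameter is left out to keep the sketch short.

```gsql
CREATE QUERY paramEx(SET<VERTEX<person>> p1, BAG<INT> ids) FOR GRAPH investmentNet {
  SumAccum<INT> @@idTotal;
  Start = {p1};            # the vertex-set parameter seeds a vertex set
  FOREACH n IN ids DO      # the BAG parameter is read like a BagAccum
    @@idTotal += n;
  END;
  PRINT Start, @@idTotal;
}
```

The parameters are only read here, never assigned, which is consistent with query parameters being immutable.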