Query Functions
The functions on this page run installed or interpret queries in TigerGraph.
All functions in this module are called as methods on a TigerGraphConnection
object.
getInstalledQueries()
getInstalledQueries(fmt: str = "py") → Union[dict, str, pd.DataFrame]
Returns a list of installed queries.
Parameter:
-
fmt
: Format of the results:-
"py": Python objects (default)
-
"json": JSON document
-
"df": pandas DataFrame
-
Returns:
The names of the installed queries.
runInstalledQuery()
runInstalledQuery(queryName: str, params: Union[str, dict] = None, timeout: int = None, sizeLimit: int = None, usePost: bool = False, runAsync: bool = False, replica: int = None, threadLimit: int = None) → list
Runs an installed query.
The query must be already created and installed in the graph.
Use getEndpoints(dynamic=True)
or GraphStudio to find out the generated endpoint URL of
the query. Only the query name needs to be specified here.
Parameters:
-
queryName
: The name of the query to be executed. -
params
: Query parameters. A string of param1=value1¶m2=value2 format or a dictionary. See below for special rules for dictionaries. -
timeout
: Maximum duration for successful query execution (in milliseconds). See GSQL query timeout -
sizeLimit
: Maximum size of response (in bytes). See Response size -
usePost
: Defaults to False. The RESTPP accepts a maximum URL length of 8192 characters. Use POST if additional parameters cause you to exceed this limit, or if you choose to pass an empty set into a query for database versions >= 3.8 -
runAsync
: Run the query in asynchronous mode. See Async operation -
replica
: If your TigerGraph instance is an HA cluster, specify which replica to run the query on. Must be a value between [1, (cluster replication factor)]. See Specify replica -
threadLimit
: Specify a limit of the number of threads the query is allowed to use on each node of the TigerGraph cluster. See Thread limit
Returns:
The output of the query, a list of output elements (vertex sets, edge sets, variables, accumulators, etc.
Notes:
When specifying parameter values in a dictionary:
-
For primitive parameter types use
"key": value
-
For
SET
andBAG
parameter types with primitive values, use
"key": [value1, value2, …]
-
For
VERTEX<type>
use
"key": primary_id
-
For
VERTEX
(no vertex type specified) use
"key": (primary_id, "vertex_type")
-
For
SET<VERTEX<type>>
use
"key": [primary_id1, primary_id2, …]
-
For
SET<VERTEX>
(no vertex type specified) use
"key": [(primary_id1, "vertex_type1"), (primary_id2, "vertex_type2"), …]
Endpoints:
-
GET /query/{graph_name}/{query_name}
See Run an installed query (GET) -
POST /query/{graph_name}/{query_name}
See Run an installed query (POST)
checkQueryStatus()
checkQueryStatus(requestId: str = "")
Checks the status of the queries running on the graph specified in the connection.
Parameter:
-
requestId (str, optional)
: String ID of the request. If empty, returns all running requests. See Check query status (detached mode)
Endpoint:
-
GET /query_status/{graph_name}
See Check query status (detached mode)
getQueryResult()
getQueryResult(requestId: str = "")
Gets the result of a detached query.
Parameter:
-
requestId (str)
: String ID of the request. See Check query results (detached mode)
runInterpretedQuery()
runInterpretedQuery(queryText: str, params: Union[str, dict] = None) → list
Runs an interpreted query.
Use $graphname
or @graphname@
in the FOR GRAPH
clause to avoid hardcoding the
name of the graph in your app. It will be replaced by the actual graph name.
Parameters:
-
queryText
: The text of the GSQL query that must be provided in this format:
INTERPRET QUERY (<params>) FOR GRAPH <graph_name> {
<statements>
}
-
params
: A string ofparam1=value1¶m2=value2…
format or a dictionary. See below for special rules for dictionaries.
Returns:
The output of the query, a list of output elements such as vertex sets, edge sets, variables and accumulators.
Notes:
When specifying parameter values in a dictionary:
-
For primitive parameter types use
"key": value
-
For
SET
andBAG
parameter types with primitive values, use
"key": [value1, value2, …]
-
For
VERTEX<type>
use
"key": primary_id
-
For
VERTEX
(no vertex type specified) use
"key": (primary_id, "vertex_type")
-
For
SET<VERTEX<type>>
use
"key": [primary_id1, primary_id2, …]
-
For
SET<VERTEX>
(no vertex type specified) use
"key": [(primary_id1, "vertex_type1"), (primary_id2, "vertex_type2"), …]
Endpoint:
-
POST /gsqlserver/interpreted_query
See Run an interpreted query
parseQueryOutput()
parseQueryOutput(output: list, graphOnly: bool = True) → dict
Parses query output and separates vertex and edge data (and optionally other output) for easier use.
Parameters:
-
output
: The data structure returned byrunInstalledQuery()
orrunInterpretedQuery()
. -
graphOnly
: IfTrue
(the default setting), restricts captured output to vertices and edges. IfFalse
, captures values of variables and accumulators and any other plain text printed.
Returns:
A dictionary with two (or three) keys: "vertices"
, "edges"
and optionally "output"
.
The first two refer to another dictionary containing keys for each vertex and edge types
found and the instances of those vertex and edge types. "output"
is a list of
dictionaries containing the key/value pairs of any other output.
The JSON output from a query can contain a mixture of results: vertex sets (the output of a SELECT statement), edge sets (e.g. collected in a global accumulator), printout of global and local variables and accumulators, including complex types (LIST, MAP, etc.). The type of the various output entries is not explicit and requires manual inspection to determine the type.
This function "cleans" this output, separating and collecting vertices and edges in an easy
to access way. It can also collect other output or ignore it.
The output of this function can be used e.g. with the vertexSetToDataFrame()
and
edgeSetToDataFrame()
functions or (after some transformation) to pass a subgraph to a
visualization component.
getStatistics()
getStatistics(seconds: int = 10, segments: int = 10) → dict
Retrieves real-time query performance statistics over the given time period.
Parameters:
-
seconds
: The duration of statistic collection period (the last n seconds before the function call). -
segments
: The number of segments of the latency distribution (shown in results asLatencyPercentile
). By default, segments is10
, meaning the percentile range 0-100% will be divided into ten equal segments: 0%-10%, 11%-20%, etc. This argument must be an integer between 1 and 100.
Endpoint:
-
GET /statistics/{graph_name}
See Show query performance