Query Functions

The functions on this page run installed or interpret queries in TigerGraph. All functions in this module are called as methods on a TigerGraphConnection object.

getInstalledQueries()

getInstalledQueries(fmt: str = "py") → Union[dict, str, pd.DataFrame]

Returns a list of installed queries.

Parameter:

fmt: Format of the results:
- "py": Python objects (default)
- "json": JSON document
- "df": pandas DataFrame

Returns:

The names of the installed queries.

runInstalledQuery()

runInstalledQuery(queryName: str, params: Union[str, dict] = None, timeout: int = None, sizeLimit: int = None, usePost: bool = False, runAsync: bool = False, replica: int = None, threadLimit: int = None) → list

Runs an installed query.

The query must be already created and installed in the graph. Use getEndpoints(dynamic=True) or GraphStudio to find out the generated endpoint URL of the query. Only the query name needs to be specified here.

Parameters:

queryName: The name of the query to be executed.
params: Query parameters. A string of param1=value1&param2=value2 format or a dictionary. See below for special rules for dictionaries.
timeout: Maximum duration for successful query execution (in milliseconds). See GSQL query timeout
sizeLimit: Maximum size of response (in bytes). See Response size
usePost: Defaults to False. The RESTPP accepts a maximum URL length of 8192 characters. Use POST if additional parameters cause you to exceed this limit, or if you choose to pass an empty set into a query for database versions >= 3.8
runAsync: Run the query in asynchronous mode. See Async operation
replica: If your TigerGraph instance is an HA cluster, specify which replica to run the query on. Must be a value between [1, (cluster replication factor)]. See Specify replica
threadLimit: Specify a limit of the number of threads the query is allowed to use on each node of the TigerGraph cluster. See Thread limit

Returns:

The output of the query, a list of output elements (vertex sets, edge sets, variables, accumulators, etc.

Notes:

When specifying parameter values in a dictionary:

For primitive parameter types use
"key": value
For SET and BAG parameter types with primitive values, use
"key": [value1, value2, …]
For VERTEX<type> use
"key": primary_id
For VERTEX (no vertex type specified) use
"key": (primary_id, "vertex_type")
For SET<VERTEX<type>> use
"key": [primary_id1, primary_id2, …]
For SET<VERTEX> (no vertex type specified) use
"key": [(primary_id1, "vertex_type1"), (primary_id2, "vertex_type2"), …]

Endpoints:
GET /query/{graph_name}/{query_name} See Run an installed query (GET)
POST /query/{graph_name}/{query_name} See Run an installed query (POST)

checkQueryStatus()

checkQueryStatus(requestId: str = "")

Checks the status of the queries running on the graph specified in the connection.

Parameter:

requestId (str, optional): String ID of the request. If empty, returns all running requests. See Check query status (detached mode)

Endpoint:

GET /query_status/{graph_name} See Check query status (detached mode)

getQueryResult()

getQueryResult(requestId: str = "")

Gets the result of a detached query.

Parameter:

requestId (str): String ID of the request. See Check query results (detached mode)

runInterpretedQuery()

runInterpretedQuery(queryText: str, params: Union[str, dict] = None) → list

Runs an interpreted query.

Use $graphname or @graphname@ in the FOR GRAPH clause to avoid hardcoding the name of the graph in your app. It will be replaced by the actual graph name.

Parameters:

queryText: The text of the GSQL query that must be provided in this format:

INTERPRET QUERY (<params>) FOR GRAPH <graph_name> {
<statements>
}

params: A string of param1=value1&param2=value2… format or a dictionary. See below for special rules for dictionaries.

Returns:

The output of the query, a list of output elements such as vertex sets, edge sets, variables and accumulators.

Notes:

When specifying parameter values in a dictionary:

For primitive parameter types use
"key": value
For SET and BAG parameter types with primitive values, use
"key": [value1, value2, …]
For VERTEX<type> use
"key": primary_id
For VERTEX (no vertex type specified) use
"key": (primary_id, "vertex_type")
For SET<VERTEX<type>> use
"key": [primary_id1, primary_id2, …]
For SET<VERTEX> (no vertex type specified) use
"key": [(primary_id1, "vertex_type1"), (primary_id2, "vertex_type2"), …]

Endpoint:

POST /gsqlserver/interpreted_query See Run an interpreted query

parseQueryOutput()

parseQueryOutput(output: list, graphOnly: bool = True) → dict

Parses query output and separates vertex and edge data (and optionally other output) for easier use.

Parameters:

output: The data structure returned by runInstalledQuery() or runInterpretedQuery().
graphOnly: If True (the default setting), restricts captured output to vertices and edges. If False, captures values of variables and accumulators and any other plain text printed.

Returns:

A dictionary with two (or three) keys: "vertices", "edges" and optionally "output". The first two refer to another dictionary containing keys for each vertex and edge types found and the instances of those vertex and edge types. "output" is a list of dictionaries containing the key/value pairs of any other output.

The JSON output from a query can contain a mixture of results: vertex sets (the output of a SELECT statement), edge sets (e.g. collected in a global accumulator), printout of global and local variables and accumulators, including complex types (LIST, MAP, etc.). The type of the various output entries is not explicit and requires manual inspection to determine the type.

This function "cleans" this output, separating and collecting vertices and edges in an easy to access way. It can also collect other output or ignore it.
The output of this function can be used e.g. with the vertexSetToDataFrame() and edgeSetToDataFrame() functions or (after some transformation) to pass a subgraph to a visualization component.

getStatistics()

getStatistics(seconds: int = 10, segments: int = 10) → dict

Retrieves real-time query performance statistics over the given time period.

Parameters:

seconds: The duration of statistic collection period (the last n seconds before the function call).
segments: The number of segments of the latency distribution (shown in results as LatencyPercentile). By default, segments is 10, meaning the percentile range 0-100% will be divided into ten equal segments: 0%-10%, 11%-20%, etc. This argument must be an integer between 1 and 100.

Endpoint:

GET /statistics/{graph_name} See Show query performance

Query Functions

getInstalledQueries()

Parameter:

Returns:

runInstalledQuery()

Parameters:

Returns:

Notes:

Endpoints:

checkQueryStatus()

Parameter:

Endpoint:

getQueryResult()

Parameter:

runInterpretedQuery()

Parameters:

Returns:

Notes:

Endpoint:

parseQueryOutput()

Parameters:

Returns:

getStatistics()

Parameters:

Endpoint: