Query User-Defined Functions
In GSQL, users can supplement the language by defining their own query user-defined functions (query UDFs) in C++. Query UDFs can be called in queries and subqueries to perform a set of defined actions and return a value like the built-in functions.
This page introduces the process to define a query UDF. Once defined, the new functions are added into GSQL automatically the next time GSQL is executed.
UDFs are written in C++ in two files, ExprFunctions.hpp and ExprUtil.hpp:
- ExprFunctions is used for functions that are called directly in GSQL queries.
- ExprUtil contains structs or helper functions that are called by the functions in ExprFunctions. The functions defined in ExprUtil.hpp cannot be used in a GSQL query.
These files are stored as .hpp files in AppRoot/dev/gdk/gsql/src/QueryUdf/ in a TigerGraph Server installation.
There are two ways to modify these files to add user-defined functions to GSQL:
- Store the files in a GitHub repository, and configure GSQL to read from the repository.
- Use GET and PUT commands to download, modify, and store the files locally.
  - The GET command requires the READ_FILE privilege.
  - The PUT command requires the WRITE_FILE privilege.
It is strongly recommended that you enable GSQL user authentication by changing the password of the default user tigergraph to protect your UDF files.
If you don’t enable user authentication, anyone with GSQL shell access can log in as the default user with superuser privileges and modify your files.
This section first explains how to define a query UDF, then how to integrate query UDFs into GSQL.
Define a query UDF in C++
User-defined functions are C++ functions with a certain set of allowed data types.
The function definition must include the keyword inline.
This is a sample function that returns true if the value passed to the function is greater than 3.
inline bool greater_than_three(double x) {
    return x > 3;
}
Data Type | Argument | Return | Function Body |
---|---|---|---|
| Yes | Yes | Yes |
| Yes | Yes | Yes |
| Yes | Yes | Yes |
| Yes | Yes | Yes |
| Yes | Yes | Yes |
| Yes | Yes | Yes |
| No | No | Yes |
| Yes | Yes | Yes |
All other C++ data types | No | No | Yes |
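The "Function Body" column means that a type which is not allowed in the signature can still be used inside the function. The sketch below illustrates this under stated assumptions: average_word_length is a hypothetical function, not part of the shipped file. Its signature uses only string and double, both of which the examples on this page use as argument or return types, while std::vector and std::istringstream appear only in the body. The using declaration is also an assumption, added so that the snippet compiles on its own.

```cpp
#include <sstream>  // std::istringstream
#include <string>
#include <vector>

using std::string;  // assumption: the real UDF file already makes `string` visible

// Hypothetical UDF: mean length of the whitespace-separated words in `text`.
// Only string and double appear in the signature; std::vector and
// std::istringstream are used inside the body only.
inline double average_word_length(string text) {
    std::istringstream stream(text);
    std::vector<string> words;
    string word;
    while (stream >> word) {
        words.push_back(word);
    }
    if (words.empty()) {
        return 0.0;  // no words: avoid dividing by zero
    }
    double total = 0.0;
    for (const string& w : words) {
        total += static_cast<double>(w.size());
    }
    return total / static_cast<double>(words.size());
}
```

The vector is not strictly needed to compute a mean. It is kept here only to show a non-signature type living inside the body.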
You can write your functions in the ExprFunctions file provided as a sample in TigerGraph Server installations, or create your own .hpp files from scratch.
If your function requires a user-defined struct or helper function, that struct or helper function must be defined in a separate ExprUtil file.
Below is an example of a short ExprFunctions file containing a single UDF that reverses a string. Note the include statement on the first line.
#include <algorithm> // for std::reverse
inline string reverse(string str) {
    std::reverse(str.begin(), str.end());
    return str;
}
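Because a UDF is ordinary C++, one possible way to catch mistakes before uploading is to compile the function on its own and check a few inputs. The sketch below copies the reverse function from the example above and adds a hypothetical helper, reverse_udf_checks_pass, that is meant for local use only and would not be uploaded to GSQL. The using declaration is an assumption that stands in for whatever the real UDF file does to make string visible.

```cpp
#include <algorithm>  // std::reverse
#include <string>

using std::string;  // assumption: the real UDF file already makes `string` visible

// UDF under test, copied from the example above.
inline string reverse(string str) {
    std::reverse(str.begin(), str.end());
    return str;
}

// Hypothetical local check, not part of the file stored in GSQL.
// Returns true only if every sample input reverses as expected.
inline bool reverse_udf_checks_pass() {
    return reverse("abc") == "cba"
        && reverse("a") == "a"
        && reverse("").empty()
        && reverse(reverse("GSQL")) == "GSQL";  // reversing twice is a no-op
}
```

Calling reverse_udf_checks_pass() from a small test program of your own gives a quick signal before the file is stored.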
Use GitHub to store UDFs
You can configure GSQL to read from a GitHub repository for ExprFunctions and ExprUtil.
If GitHub access is configured, GSQL will retrieve user source code files from GitHub before files added via PUT, so long as the files exist.
TigerGraph only allows one UDF file at a time. Files on GitHub take priority. If GitHub is connected but files are missing, TigerGraph will look for a UDF file added via PUT.
New additions to the files in the GitHub repository are instantly available in GSQL.
You can retrieve ExprFunctions.hpp and ExprUtil.hpp from AppRoot/dev/gdk/gsql/src/QueryUdf/ and copy them to a Git repository of your choice.
When the files are hosted on GitHub, the file names must be ExprFunctions.hpp and ExprUtil.hpp.
This is in contrast to the PUT method, where the files can have any file name.
The gadmin configuration parameters for setting up the connection to GitHub are as follows:
Parameter | Description | Example |
---|---|---|
GSQL.GithubUserAcessToken | The credential used to access the repository | anonymous |
GSQL.GithubRepository | The user and repository where the files are held | tigergraph/ecosys |
GSQL.GithubBranch | The branch to access | demo_github |
GSQL.GithubPath | Path to the directory in the repository that has ExprFunctions.hpp and ExprUtil.hpp | sample_code/src |
| Optional parameter used for GitHub Enterprise | |
Use the gadmin config set command to configure the aforementioned parameters to connect GSQL to the GitHub repository hosting your files.
Below is an example configuration. Remember to run gadmin config apply after changing the parameters.
If GSQL is already running, you will need to run gadmin restart all to restart GSQL before the UDFs become available.
gadmin config set GSQL.GithubUserAcessToken anonymous
gadmin config set GSQL.GithubRepository tigergraph/ecosys
gadmin config set GSQL.GithubBranch demo_github
gadmin config set GSQL.GithubPath sample_code/src
gadmin config apply
After the parameters are successfully configured, you can access your UDFs in new queries right away.
Store a UDF file locally
Step 1: Modify current query UDF file
Use the GET ExprFunctions command in GSQL to copy the current set of functions into a local file.
The path can be absolute or relative to your current directory, but the file extension must be .hpp:
GSQL > GET ExprFunctions TO "/example/path/to/ExprFunctions.hpp"
GSQL > GET ExprFunctions TO "./ExprFunctions.hpp"
If your query UDF requires a user-defined struct or helper function, also use the GET ExprUtil command to download the current ExprUtil file:
GSQL > GET ExprUtil TO "/example/path/ExprUtil.hpp"
Step 2: Define your function
Write your function in ExprFunctions and any helper functions in ExprUtil.
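As a rough illustration of that split, the sketch below shows a helper that would live in ExprUtil and a UDF that would live in ExprFunctions. Both names, is_ascii_vowel and starts_with_vowel, are hypothetical. The two parts are shown in one listing so that it compiles on its own. The sketch assumes that ExprFunctions.hpp can see the declarations in ExprUtil.hpp and that string is already visible in both files.

```cpp
#include <string>

using std::string;  // assumption: the real UDF files already make `string` visible

// ---- would go in ExprUtil.hpp (helper, not callable from a GSQL query) ----
// Hypothetical helper: true if `c` is an ASCII vowel, upper or lower case.
inline bool is_ascii_vowel(char c) {
    switch (c) {
        case 'a': case 'e': case 'i': case 'o': case 'u':
        case 'A': case 'E': case 'I': case 'O': case 'U':
            return true;
        default:
            return false;
    }
}

// ---- would go in ExprFunctions.hpp (callable from a GSQL query) ----
// Hypothetical UDF: true if `text` is non-empty and begins with an ASCII vowel.
inline bool starts_with_vowel(string text) {
    return !text.empty() && is_ascii_vowel(text[0]);
}
```

Only starts_with_vowel would be callable from a query. The helper stays an implementation detail, which matches the rule that functions defined in ExprUtil cannot be used in GSQL directly.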
Step 3: Store the updated query UDF file
After you have defined the function, use the PUT command to store the files you modified.
GSQL > PUT ExprFunctions FROM "/path/to/udf_file.hpp"
PUT ExprFunctions successfully.
GSQL > PUT ExprUtil FROM "/path/to/utils_file.hpp"
PUT ExprUtil successfully.
The PUT command will automatically store the files in all nodes in a cluster, overwriting any existing files that contain UDFs.
Once the files are stored, you will be able to call the query UDF the next time GSQL is executed. This includes the next time you start the GSQL shell or execute GSQL scripts from a bash shell. If you are using GraphStudio, however, you will be able to use the queries without needing to refresh the page.
For example, the following query calls the greater_than_three function defined earlier:
CREATE QUERY udfExample() FOR GRAPH minimalNet {
    DOUBLE x;
    BOOL y;
    x = 3.5;
    PRINT greater_than_three(x);
    y = greater_than_three(2.5);
    PRINT y;
}
Example
Suppose you are working in a distributed environment and want to add a function rng() that returns a random double between 0 and 1. In this example, suppose you want to modify the ExprFunctions file locally rather than using GitHub.
Start by downloading the current UDF file with the GET command. In this example, we will place our download in the working directory and use the name udf.hpp, in contrast to above, where it was named ExprFunctions.hpp, to illustrate the flexibility of the naming scheme.
GSQL > GET ExprFunctions TO "./udf.hpp"
In the downloaded file, add the function definition for the rng() function.
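#include <random> // for std::random_device, std::mt19937, std::uniform_real_distribution
inline double rng() {
    std::random_device rd;   // nondeterministic seed source
    std::mt19937 gen(rd());  // Mersenne Twister engine seeded from rd
    std::uniform_real_distribution<double> distribution(0.0, 1.0);
    return distribution(gen);
}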
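The function above constructs and seeds a new engine on every call. If the function is called many times, one possible variant keeps a single engine per thread and reuses it. This is a sketch, not the documented approach: the name rng_reused_engine is hypothetical, and whether thread_local storage behaves as intended inside the query engine is an assumption that should be verified on your installation.

```cpp
#include <random>

// Hypothetical variant of rng(): same range of results, but the engine is
// created and seeded once per thread and then reused on later calls.
inline double rng_reused_engine() {
    thread_local std::mt19937 gen{std::random_device{}()};
    std::uniform_real_distribution<double> distribution(0.0, 1.0);
    return distribution(gen);
}
```

The distribution object is cheap to construct, so only the engine is kept between calls.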
After adding your function, use the PUT command to store the file in all nodes in a cluster:
GSQL > PUT ExprFunctions FROM "./udf.hpp"
PUT ExprFunctions successfully.
The file has been stored and the UDF has now been added to GSQL. You can add it to a query, then run the commands INSTALL QUERY and RUN QUERY to test the rng() function.
The following commands demonstrate the process with a one-line query called rngExample that simply prints the output of the new function rng().
GSQL > CREATE QUERY rngExample() FOR GRAPH example_graph {PRINT rng();}
Successfully created queries: [rngExample].
GSQL > INSTALL QUERY rngExample
Start installing queries, about 1 minute ...
rngExample query: curl -X GET 'http://127.0.0.1:9000/query/example_graph/rngExample'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.
Select 'm1' as compile server, now connecting ...
Node 'm1' is prepared as compile server.
[=========================================================================================] 100% (1/1)
Query installation finished.
GSQL > RUN QUERY rngExample()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"edition": "enterprise",
"api": "v2"
},
"results": [{"rng()": 0.51352}]
}