Query User-Defined Functions
In GSQL, users can supplement the language by defining their own query user-defined functions (query UDF) in C++. Query UDFs can be called in queries and subqueries to perform a set of defined actions and return a value like the built-in functions.
This page introduces the process to define a query UDF. Once defined, the new functions are added into GSQL automatically the next time GSQL is executed.
Query UDFs are user-editable in the form of an .hpp
file locally on your machine.
Creating, editing, or removing Query UDFs is done by downloading them from the server, editing the file, and re-uploading through GET
and PUT
commands.
Uploading and downloading the UDF file (ExprFunctions.hpp
) requires the following privileges:
-
The
GET
command requires theREAD_FILE
privilege. -
The
PUT
command requires theWRITE_FILE
privilege.
It is strongly recommended that you enable GSQL user authentication by changing the password of the default user tigergraph to protect your UDF files.
If you don’t enable user authentication, anyone with GSQL shell access can log in as the default user with superuser privileges and modify your files.
|
Define a query UDF
Below are the steps to add a Query UDF to GSQL:
Step 1: Download current query UDF file
Use the GET ExprFunctions
command in GSQL to download the current list of functions into a local file. The path can be absolute or relative to your current directory, but the file extension must be .hpp
:
GSQL > GET ExprFunctions TO "/example/path/to/ExprFunctions.hpp"
GSQL > GET ExprFunctions TO "./ExprFunctions.hpp"
If your query UDF requires a user-defined struct or helper function, also use the GET ExprUtil
command to download the current ExprUtil
file:
GSQL > GET ExprUtil TO "/example/path/ExprUtil.hpp"
Step 2: Define C++ function
Define the C++ function inside the UDF file you just downloaded in Step 1. The definition of the function should include the keyword inline
.
Below is a list of data types and where each is allowed in a function.
Data Type |
Argument |
Return |
Function Body |
|
Yes |
Yes |
Yes |
|
Yes |
Yes |
Yes |
|
Yes |
Yes |
Yes |
|
Yes |
Yes |
Yes |
|
Yes |
Yes |
Yes |
|
Yes |
Yes |
Yes |
|
No |
No |
Yes |
Yes |
Yes |
Yes |
|
All other C++ data types |
No |
No |
Yes |
If the function requires a user-defined struct or helper function, define it in the ExprUtil
file you downloaded in Step 1.
Below is an example of a query UDF definition that includes two separate functions. The first function returns true
if the passed value is greater than 3, and the second reverses a string.
#include <algorithm> // for std::reverse
inline bool greater_than_three (double x) {
return x > 3;
}
inline string reverse(string str){
std::reverse(str.begin(), str.end());
return str;
}
If any code in |
Step 3: Upload files
After you have defined the function, use the PUT
command to upload the files you modified.
GSQL > PUT ExprFunctions FROM "/path/to/udf_file.hpp"
PUT ExprFunctions successfully.
GSQL > PUT ExprUtil FROM "/path/to/utils_file.hpp"
PUT ExprUtil successfully.
The PUT
command will automatically upload the files to all nodes in a cluster, overwriting any existing files that contain UDFs.
Once the files are uploaded, you will be able to call the query UDF the next time GSQL is executed. This includes the next time you start the GSQL shell or execute GSQL scripts from a bash shell. If you are using GraphStudio, however, you will be able to use the queries without needing to refresh the page.
CREATE QUERY udfExample() FOR GRAPH minimalNet {
DOUBLE x;
BOOL y;
x = 3.5;
PRINT greater_than_three(x);
y = greater_than_three(2.5);
PRINT y;
PRINT reverse("abc");
}
Example
Suppose you are working in a distributed environment and want to add a function rng()
that that returns a random double between 0 and 1.
Start by downloading the current UDF file with the GET
command. In this example, we will place our download in the working directory and use the name udf.hpp
in contrast to above, where it was named ExprFunctions.hpp
.
GSQL > GET ExprFunctions TO "./udf.hpp"
In the downloaded file, add the function definition for the rng()
function.
inline double rng() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution < double > distribution(0.0, 1.0);
return distribution(gen);
}
After adding your query, use the PUT
command to upload the file. This will upload the file to all nodes in a cluster:
GSQL > PUT ExprFunctions FROM "./udf.hpp"
PUT ExprFunctions successfully.
The UDF has now been added to GSQL. You can add it to a query, then run the commands INSTALL QUERY
and RUN QUERY
to test the rng()
function. The following commands demonstrate the process with a one-line query called rngExample
that simply prints the output of the new function rng()
.
GSQL > CREATE QUERY rngExample() FOR GRAPH example_graph {PRINT rng();}
Successfully created queries: [rngExample].
GSQL > INSTALL QUERY rngExample
Start installing queries, about 1 minute ...
rngExample query: curl -X GET 'http://127.0.0.1:9000/query/example_graph/rngExample'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.
Select 'm1' as compile server, now connecting ...
Node 'm1' is prepared as compile server.
[=========================================================================================] 100% (1/1)
Query installation finished.
GSQL > RUN QUERY rngExample()
{
"error": false,
"message": "",
"version": {
"schema": 0,
"edition": "enterprise",
"api": "v2"
},
"results": [{"rng()": 0.51352}]
}