Load Data
After mapping data files to the graph schema, you can start loading data. Click "Load Data" on the left side menu bar to go to the Load Data page.
The "Load Data" interface is separated into three parts:
- Data Mapping Overview
  - Provides a general view of the graph and the data mapping.
  - Shows the loading progress of each data file.
- Toolbar (above Data Mapping)
  - Start/pause/resume/stop data loading and clear graph data buttons.
- Statistics
  - Graph statistics: displays the numbers of vertices and edges in total and per type, with real-time loading progress.
  - Loading statistics: displays the total number of vertices and edges loaded vs. time.
To display real-time graph statistics, this page checks the number of vertices and edges every 10 seconds, which adds overhead. To maximize loading performance, move to a different page after starting loading, and only come back here occasionally to check the progress.
Start Loading
GraphStudio provides two types of loading:
- Partial Loading: load a subset of the data files that you select.
- Full Loading: load all of the data files.
Pause Loading
Similar to Start Loading, you can pause loading for some or all of the data files that are currently loading.
Select one or more data files (hold down the "shift" key to select multiple data files), then click the "pause loading" button on the toolbar. In the Paused state, the progress bar changes to a solid orange color.
Resume Loading
You can resume loading for some or all of the data files that have been paused.
Select one or more data files (hold down the "shift" key to select multiple data files), then click the "start/resume loading" button on the toolbar. After resuming, loading of each data file continues from where it was paused.
Stop Loading
After loading has been started or paused, you can stop loading the selected data files by clicking the "stop loading" button on the toolbar. Similar to Start Loading, you can stop loading for some or all of the data files that are currently loading. After stopping, the loading status of those data files becomes "Stopped".
Statistics Panel
The Statistics panel contains two tabs: Graph Statistics (1st tab) and Data Loading Statistics (2nd tab).
Graph Statistics
By default, if no data file is selected, the Statistics panel shows Graph Statistics.
The table at the top shows the total number of vertices and edges in the current graph, as well as the count for each vertex type and edge type. While loading is in progress, the line chart at the bottom shows the number of vertices and edges over time.
Data Loading Statistics
If you click on a data file, the Statistics panel changes to show Data Loading Statistics.
The table at the top shows the detailed loading information of the selected data file, including:
- Status (RUNNING, PAUSED, STOPPED, etc.)
- Loaded percentage (for files on server) or loaded size (for S3 files)
- Loading speed
- Average loading speed
- Number of loaded lines
- Number of missing token lines
- Number of oversize lines
- Loading start time
- Loading duration
The area chart in the middle shows the real-time loading speed (lines per second) for this data file.
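To make the difference between the real-time loading speed and the average loading speed concrete, the following Python sketch derives both from periodic samples of the loaded line count. The sample format and the function name `loading_speeds` are hypothetical and only illustrate the arithmetic; this is not a description of how GraphStudio computes its values.

```python
from typing import List, Tuple

def loading_speeds(samples: List[Tuple[float, int]]) -> Tuple[List[float], float]:
    """Return (per-interval speeds, average speed) in lines per second.

    `samples` is a hypothetical list of (seconds since loading started,
    cumulative number of loaded lines), ordered by time.
    """
    interval_speeds = []
    for (t0, n0), (t1, n1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        # Guard against duplicate timestamps to avoid dividing by zero.
        interval_speeds.append((n1 - n0) / elapsed if elapsed > 0 else 0.0)

    total_time = samples[-1][0] - samples[0][0] if samples else 0.0
    total_lines = samples[-1][1] - samples[0][1] if samples else 0
    average = total_lines / total_time if total_time > 0 else 0.0
    return interval_speeds, average

samples = [(0.0, 0), (10.0, 50_000), (20.0, 80_000), (30.0, 90_000)]
speeds, average = loading_speeds(samples)
print(speeds)   # [5000.0, 3000.0, 1000.0]
print(average)  # 3000.0
```

In this made-up run the real-time speed drops from 5000 to 1000 lines per second while the average settles at 3000, which is why the two figures in the table can differ noticeably.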
The pie chart at the bottom shows the distribution of data lines, among three categories:
- Loaded lines
- Missing token lines (the lines contain fewer tokens than required by the data mapping)
- Oversize lines (some tokens are too large)
The number of loaded lines doesn’t mean that all of these lines were loaded successfully. Issues in the data mapping (such as mapping a non-numeric column to an integer attribute) or dirty data may prevent some of these lines from being loaded.
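The three categories, and the caveat about loaded lines, can be sketched in Python as follows. The separator, the required token count, the size limit `MAX_TOKEN_BYTES`, and both function names are hypothetical, chosen only for illustration; GraphStudio's actual limits and parsing rules are not stated on this page.

```python
from collections import Counter

SEPARATOR = ","          # assumed column separator
REQUIRED_TOKENS = 3      # assumed number of tokens the data mapping needs
MAX_TOKEN_BYTES = 16     # hypothetical per-token size limit

def classify_line(line: str) -> str:
    """Place one data line into one of the three pie-chart categories."""
    tokens = line.rstrip("\n").split(SEPARATOR)
    if len(tokens) < REQUIRED_TOKENS:
        return "missing_token"
    if any(len(t.encode("utf-8")) > MAX_TOKEN_BYTES for t in tokens):
        return "oversize"
    return "loaded"

def age_is_valid(line: str) -> bool:
    """Check whether the third token can be stored in an integer attribute."""
    try:
        int(line.rstrip("\n").split(SEPARATOR)[2])
        return True
    except (ValueError, IndexError):
        return False

lines = [
    "1,Alice,34",
    "2,Bob",                                  # too few tokens
    "3,Carol,unknown",                        # counted as loaded, but the age is not an integer
    "4,a-name-far-longer-than-the-limit,51",  # one token exceeds the size limit
]
counts = Counter(classify_line(line) for line in lines)
print(dict(counts))  # {'loaded': 2, 'missing_token': 1, 'oversize': 1}
print([age_is_valid(l) for l in lines if classify_line(l) == "loaded"])  # [True, False]
```

The third sample line passes the token checks and is counted as loaded, yet its non-numeric value would not fit an integer attribute, which is the situation the note above describes.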
If loading a data file fails with an error, the error message is shown at the bottom.
Clear Graph Data
Click on the "clear graph data" button on the toolbar to clear the graph data. This operation takes about a minute or longer, depending on the size of your graph and the hardware.
Caution: Clear Graph Data deletes all data from your database. The schema and queries will remain. This deletion is irreversible. Please confirm the impact before you proceed with the clear graph data operation.
Tip: Only users with the superuser role can clear graph data. Consider assigning other roles to your team members to avoid accidental data deletion.
Tip: If you clear graph data by accident, you can reload the data into the database by clicking on the "start/resume loading" button on the toolbar. The data files remain in the filesystem unless you deliberately delete them.
After the clear operation, the graph vertex and edge number statistics will both drop to 0.
After data has been loaded, you can go to the Explore Graph or Write Queries pages.