This page will describe the service excel-path and how the excel file used for this service should be formatted in order for the service to function properly.
Setting up the service
In order to upload data using an excel file using the excel-path format, add a new service to the space using the 'Upload Excel' template and use the profile 'excel-path'
To see how to set up a service to a space, look here.
Preparing an excel file
Tabs:
An excel file that will be used for the excel-path service needs to follow certain rules in order for the import to be successful. If these rules are not followed the service will return an error message.
First we will look at how the sheets need to be set up.
- it must have one sheet called Headers
- it can have a sheet called Settings (See Settings Sheet section for more details)
- it must have at least one other sheet with data, these can have any name (but not Headers and Settings), for example Person
Data Sheets
For data sheets there is also a structure they need to follow in order for the import to be successful.
An example of how a data sheet, for example Fish, should look like.
- Each sheet has to have a first column called NODE
- Any following column needs to have a header with the format key.key.key.etc with at least one key
- If a path (more than one key, separated by dots) is specified in the header, a column on the left of this column needs to specify the path of one level up
For example, somewhere on the left side of location.name there is a column location, which has a value - In a path, attributes can only be set as the last key in a path
- Each key in the header should be mentioned in the Headers Sheet (see next section)
- For any row containing values in any column, the first cell must have a value in the NODE column (the cell can not be left empty)
- Any subsequent columns can have a value, if it does not have a value, that cell is ignored
- Setting a value in any column requires the column to have a header
Deleting nodes
A variation of the NODE header is the DELETE header. It means the nodes in the rows of that sheet will get deleted from the database.
Headers Sheet
The headers sheet needs to be set up properly as well for the import to be successful. For each unique header, with exception of the NODE header, across all data sheets there needs to be an entry in the headers sheet.
These entries must adhere to the following rules:
- They must have a key value in the first column with the exact value they have in the first row on any data sheet
- When update is chosen they must get a header for update
- They must mention either relation or attribute in the third column
- They must mention either create, update or delete in the fourth column
- They must have a key value in the fifth column
- If attribute was chosen in the third column, a datatype value can be set in the fourth column, by default it will be set to string
- If attribute was chosen in the third column, a value TRUE or FALSE (for users who use the Dutch excel version it is WAAR or ONWAAR) can be set in the fifth column (explanation (1) below), if it is not set the value will be interpreted as FALSE
- If relation was chosen in the third column, a value TRUE or FALSE (for users who use the Dutch excel version it is WAAR or ONWAAR) can be set in the sixth column (explanation (2) below), if it is not set the value will be interpreted as FALSE
1. Updating attributes
If a value is set for an attribute on the data tab, but the node in the database already has a different value, the excel-path profile will throw an error and not perform the import. It assumes there is no value set, or the value is already identical.
When the user wants the profile to update attribute values for that specific header (for example name) it can set the fifth column (E) to TRUE
2. Creating relation to node
If a value is set for a relation on the data tab, but the to node does not exist yet, the excel-path profile will throw an error and not perform the import. By default it assumes the node mentioned in the cell already exist.
When the user wants the profile to create the to node on the fly for that specific header (for example location) it can set the sixth column (F) to TRUE
Settings Sheet
There is also an option to include an additional sheet in the excel file with the name 'Settings'. In this sheet you can include some extra settings that the space should use upon importing the file. You can also include a list of prefixes.
How the Data is imported
From the file, the import will create every node in the NODE column. After this it will take any value in a row for a specific node and set attributes and relations following the rules explained in the header setting.
If a column is set to be a relation in the headers sheet it will result in a relation between two nodes:
If the header of the column is not part of a path: The relation will be going from the node with the ID that can be found in the first column (A) in a data sheet, to a node with the ID that has the value of the cell.
In the example shown in the images above this means that there will be a relation from the node with ID '6a7106b7-4aa6-91d7-0ef113c35343' to the node with ID 'w:Person' and that this relation will have the key: 'rdf:type' (the key is set in the headers sheet)
If the header of the column is part of a path: The relation will be going to a node with the ID that has the value of the cell. The relation will be coming from the node that is defined one level up the path.
For example if make a new column (and define the new header in the header sheet as a relation) 'livesin.street' and give the cell on the second row the value 'baker.street' it will create a relation from the node with the ID 'e3d3f046-7069-4401-9a00-078d96c73f95' to a new node with the id 'baker.street'
If a column is set to be an attribute in the headers sheet, this will result in an attribute on a node.
This attribute will always be on the last node defined in a path, if the header of a column does not have the path format, the attribute will be set on the node with the ID that can be found in the first column.
In the example shown in the images above the name attribute in the third column will be give to the node with ID in the first column.
The name attribute in the fifth column (E) has the path format and point to the livesin column, so this attribute will be set on the node with ID 'e3d3f046-7069-4401-9a00-078d96c73f95'