HBP Knowledge Graph - Design your API (Draft)

Design your API!

The HBP Knowledge Graph stores interesting information that you should be able to access easily. We are aware that the applications you (and we) build on top of this data need to narrow the scope (you will almost certainly never be interested in all the data in the graph) and to define the resulting structure of the data you are interested in (denormalization).
This is why we provide our "KG Query API" registration mechanism, which lets you conveniently define which data you would like to fetch and how it should be structured.

The specification

The API specification is defined in a JSON-LD file and sets the scope of the data (what should be queried) and allows a basic restructuring of the data (e.g. flattening nested structures).

Hello world
A simple hello world:
{
    "@context": {
        "@vocab": "http://schema.hbp.eu/graph_query/",
        "target": "http://schema.hbp.eu/dataset/search/",
        "schema": "http://schema.org/",
        "searchui": "http://schema.hbp.eu/search_ui/",
        "fieldname": {
            "@id": "fieldname",
            "@type": "@id"
        },
        "relative_path": {
            "@id": "relative_path",
            "@type": "@id"
        }
    },
    "root_schema": {"@id": "https://nexus-dev.humanbrainproject.org/v0/schemas/foo/core/person/v0.0.4"},
    "fields": [
        {
            "fieldname": "target:title",
            "relative_path": "schema:name",
            "required": true,
            "sort": true
        }
    ]
}
       

The @context part just simplifies our life by allowing us to use aliases instead of full-blown JSON-LD structures. You can find out more about @context in the JSON-LD documentation.

Please note that all key names without a prefix (prefix:property) automatically fall into the namespace defined in @vocab, in this case http://schema.hbp.eu/graph_query/. root_schema therefore becomes http://schema.hbp.eu/graph_query/root_schema when the JSON-LD is fully qualified. The root_schema defines the schema whose instances are queried as root elements. This means that you will get one object per instance of this schema as a result (unless further filters apply).
The fields section then defines which fields you are interested in: each entry reads the value found at relative_path and returns it under the key given in fieldname.
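To make the prefix handling concrete, here is a minimal Python sketch of how a key could be expanded with the aliases from the @context above. It is a simplification for illustration only and not a full JSON-LD processor; the function name expand_term is made up for this example.

```python
def expand_term(term: str, context: dict) -> str:
    """Expand a key using @vocab and prefix aliases (simplified, illustrative only)."""
    if term.startswith("@") or "://" in term:
        return term  # JSON-LD keywords and absolute IRIs stay untouched
    if ":" in term:
        prefix, local = term.split(":", 1)
        base = context.get(prefix)
        # Only expand when the prefix is a plain string alias in the context
        return base + local if isinstance(base, str) else term
    # Keys without a prefix fall into the @vocab namespace
    return context["@vocab"] + term


context = {
    "@vocab": "http://schema.hbp.eu/graph_query/",
    "target": "http://schema.hbp.eu/dataset/search/",
    "schema": "http://schema.org/",
}

print(expand_term("root_schema", context))   # http://schema.hbp.eu/graph_query/root_schema
print(expand_term("target:title", context))  # http://schema.hbp.eu/dataset/search/title
print(expand_term("schema:name", context))   # http://schema.org/name
```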

When executing this specification against the API, the result will look something like:
{
    "results": [
        {
            "http://schema.hbp.eu/dataset/search/title": "Foo, Bar"
        },
        {
            "http://schema.hbp.eu/dataset/search/title": "Bar, Foo"
        }
    ],
    "total": 2,
    "size": 2,
    "start": 0
}
As you can see, results holds an array with one entry per instance found in the database, in the new structure, with the values translated to the new property key.
The query API also provides some convenience functionality such as pagination. This is where the values total (the overall count in the database), size (the number of elements returned in this response) and start (the offset of the current result) become interesting.
Additionally, you can declare the "@vocab" as a query parameter. If you had passed, for example, ?vocab=http://schema.hbp.eu/dataset/search/, the result would have looked like this:
{
    "results": [
        {
            "title": "Foo, Bar"
        },
        {
            "title": "Bar, Foo"
        }
    ],
    "total": 2,
    "size": 2,
    "start": 0
}
    
So what you're getting is basically JSON (although in fact it's a very simple JSON-LD).
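If you receive fully qualified keys, the same shortening can be approximated on the client side. The following Python sketch strips a given vocabulary prefix from all keys of a response; the helper name strip_vocab is made up, and the sketch only mimics the effect of the vocab parameter rather than describing how the API implements it.

```python
def strip_vocab(value, vocab: str):
    """Recursively remove the vocab prefix from all dictionary keys."""
    if isinstance(value, dict):
        return {
            (key[len(vocab):] if key.startswith(vocab) else key): strip_vocab(item, vocab)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [strip_vocab(item, vocab) for item in value]
    return value  # plain values are returned unchanged


response = {
    "results": [
        {"http://schema.hbp.eu/dataset/search/title": "Foo, Bar"},
        {"http://schema.hbp.eu/dataset/search/title": "Bar, Foo"},
    ],
    "total": 2,
    "size": 2,
    "start": 0,
}

short = strip_vocab(response, "http://schema.hbp.eu/dataset/search/")
print(short["results"])  # [{'title': 'Foo, Bar'}, {'title': 'Bar, Foo'}]
```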
Nesting
The really interesting part begins when you traverse the graph to combine information coming from different entities:
{
    "@context": ...,
    "root_schema": ...,
    "fields": [
        {
          ...
        },
        {
          "fieldname": "target:custodian_of",
          "relative_path": {
            "@id": "foo:owners",
            "reverse": true
          },
          "fields": [
            {
              "fieldname": "target:name",
              "relative_path": "schema:name"
            }
          ]
        }
    ]
}
Here we can see several things. As before, we define the target field (in this case target:custodian_of), which again contains a relative_path. But this time we declare that we walk the graph in the reverse direction: we go to any other instance that has a field called foo:owners pointing to the current person.

Please note that "relative_path": "schema:name" was a shortcut for "relative_path": {"@id": "schema:name"} (due to the context definition) and that the default value of reverse is false. We could therefore also have written "relative_path": {"@id": "schema:name", "reverse": false} before.
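The shortcut rule can be illustrated with a small Python sketch that brings every relative_path into one explicit form. The function names are made up for this example, and the sketch says nothing about how the API handles this internally. It also accepts a list of steps, a form this document uses later for flattening.

```python
def normalize_step(step):
    """Turn a single path step into the explicit {"@id": ..., "reverse": ...} form."""
    if isinstance(step, str):
        # Shortcut form: a bare string is the @id, and reverse defaults to false
        return {"@id": step, "reverse": False}
    return {"@id": step["@id"], "reverse": bool(step.get("reverse", False))}


def normalize_path(relative_path):
    """Accept a string, an object or a list of both and return a list of explicit steps."""
    steps = relative_path if isinstance(relative_path, list) else [relative_path]
    return [normalize_step(step) for step in steps]


print(normalize_path("schema:name"))
# [{'@id': 'schema:name', 'reverse': False}]
print(normalize_path({"@id": "foo:owners", "reverse": True}))
# [{'@id': 'foo:owners', 'reverse': True}]
```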

Because we additionally define another fields key, we don't expect target:custodian_of to contain a single value (or an array of values) but rather an object (or an array of objects), each of which contains a property called target:name. If we execute the combination of the two examples, we expect the result to look like:
{
    "results": [
        {
            "http://schema.hbp.eu/dataset/search/title": "Foo, Bar",
            "http://schema.hbp.eu/dataset/search/custodian_of": {
                "http://schema.hbp.eu/dataset/search/name": "Some dataset A"
            }
        },
        {
            "http://schema.hbp.eu/dataset/search/title": "Bar, Foo",
            "http://schema.hbp.eu/dataset/search/custodian_of": [
                {
                    "http://schema.hbp.eu/dataset/search/name": "Some dataset B"
                },
                {
                    "http://schema.hbp.eu/dataset/search/name": "Some dataset C"
                }
            ]
        }
    ],
    "total": 2,
    "size": 2,
    "start": 0
}
    

As you can see, the second result contains an array of objects. Because in a graph structure, (theoretically) everything can be linked to everything in a many-to-many manner, all traversals have potentially multiple connections. A client should therefore be prepared to handle either a single value or multiple values at any time.
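One simple way for a client to cope with this is to normalize every traversed field to a list before using it. The Python sketch below shows the idea on the two results from above; the helper name as_list is made up for this example.

```python
def as_list(value):
    """Return value as a list: missing becomes [], a single value becomes [value]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


TARGET = "http://schema.hbp.eu/dataset/search/"

results = [
    {
        TARGET + "title": "Foo, Bar",
        TARGET + "custodian_of": {TARGET + "name": "Some dataset A"},
    },
    {
        TARGET + "title": "Bar, Foo",
        TARGET + "custodian_of": [
            {TARGET + "name": "Some dataset B"},
            {TARGET + "name": "Some dataset C"},
        ],
    },
]

for person in results:
    # Works the same way for a single object and for an array of objects
    datasets = [d[TARGET + "name"] for d in as_list(person.get(TARGET + "custodian_of"))]
    print(person[TARGET + "title"], "->", datasets)
```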

Flatten
Sometimes you just want to skip an instance without creating a nested data structure. If, for example, you wanted to get rid of the target:name part in the above example, you could have written this instead:
{
    "@context": ... ,
    "root_schema": ... ,
    "fields": [
        {
          ...
        },
        {
          "fieldname": "target:custodian_of",
          "relative_path": [
            {
                "@id": "foo:owners",
                "reverse": true
            },
            "schema:name"
          ]
        }
    ]
}
This would result in the following structure:
{
    "results": [
        {
            "http://schema.hbp.eu/dataset/search/title": "Foo, Bar",
            "http://schema.hbp.eu/dataset/search/custodian_of": "Some dataset A"
        },
        {
            "http://schema.hbp.eu/dataset/search/title": "Bar, Foo",
            "http://schema.hbp.eu/dataset/search/custodian_of": ["Some dataset B", "Some dataset C"]
        }
    ],
    "total": 2,
    "size": 2,
    "start": 0
}
    
There is no limit on the number of steps you can walk down inside the relative_path, but please be aware that graph traversal depth has a significant impact on performance, so it's recommended to take the shortest path possible. ;)
Merge
If the original data structure has values in multiple paths which you want to unify in a single resulting field, you can do so with the special command merge:
{
    "@context": ... ,
    "root_schema": ... ,
    "fields": [
        {
          ...
        },
        {
            "fieldname": "this:unified",
            "merge": [
                {
                    "relative_path": [
                        "foo:onePath",
                        "schema:name"
                    ]
                },
                {
                    "relative_path": [
                        "foo:otherPath",
                        "foo:someMoreDepth",
                        "schema:name"
                    ]
                }
            ]
        }
    ]
}
This means that the values found along both paths are collected into a single list, which is returned in this:unified.
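To illustrate the idea, the following Python sketch simulates merge on a small in-memory instance. It only mirrors the behaviour described above. The sample data and the function names are invented for this example, and whether the API keeps duplicates or guarantees a particular order is not stated here, so the sketch simply concatenates the values in path order.

```python
def follow(node, path):
    """Collect all values reachable from node by following the keys in path."""
    current = [node]
    for key in path:
        reached = []
        for item in current:
            if not isinstance(item, dict) or key not in item:
                continue  # this branch has no such property
            value = item[key]
            reached.extend(value if isinstance(value, list) else [value])
        current = reached
    return current


def merge(node, paths):
    """Concatenate the values found along every path (assumption: duplicates are kept)."""
    merged = []
    for path in paths:
        merged.extend(follow(node, path))
    return merged


# Made-up instance with values under two different paths
instance = {
    "foo:onePath": {"schema:name": "A"},
    "foo:otherPath": {"foo:someMoreDepth": [{"schema:name": "B"}, {"schema:name": "C"}]},
}
paths = [
    ["foo:onePath", "schema:name"],
    ["foo:otherPath", "foo:someMoreDepth", "schema:name"],
]

print(merge(instance, paths))  # ['A', 'B', 'C']
```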
Meta information
A nice feature of the API specifications is that they can be applied in different ways. Besides the data query API, they can also be used with the meta-query API. This is another API method that accepts exactly the same specification but returns meta information instead of the actual data, in the same resulting structure. Let's extend our initial specification with some meta information:
{
    "@context": {
        "@vocab": "http://schema.hbp.eu/graph_query/",
        "target": "http://schema.hbp.eu/dataset/search/",
        "schema": "http://schema.org/",
        "searchui": "http://schema.hbp.eu/search_ui/",
        "fieldname": {
            "@id": "fieldname",
            "@type": "@id"
        },
        "relative_path": {
            "@id": "relative_path",
            "@type": "@id"
        }
    },
    "root_schema": {"@id": "https://nexus-dev.humanbrainproject.org/v0/schemas/foo/core/person/v0.0.4"},
    "fields": [
        {
            "fieldname": "target:title",
            "relative_path": "schema:name",
            "required": true,
            "sort": true,
            "label": "Name",
            "searchui:boost": 20
        }
    ]
}
    
We have added a label as well as a client-specific value called searchui:boost. If we send this specification to the query API, we get the same result as described in the first example. But if we send it to the meta-query API, we get the following information:
{
  "results": [
    {
      "http://schema.hbp.eu/dataset/search/title": {
        "http://schema.hbp.eu/graph_query/label": "Name",
        "http://schema.hbp.eu/search_ui/boost": 20
      }
    }
  ]
}

As you can see, this response follows the same structure as the initial response but carries the previously added meta information. This could, for example, be used by a user interface that reflects on the data received and therefore knows that the value in http://schema.hbp.eu/dataset/search/title can be labeled with the value defined in http://schema.hbp.eu/graph_query/label.
While the label is something that could be used by multiple clients, the value of searchui:boost is only relevant for some very specific clients (in fact those that use Elasticsearch, to define the search boost for this field). We therefore expect these clients to know about the existence of this field and to be able to query it specifically. All other clients will silently ignore the field.
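As a sketch of how a generic client might combine the two responses, the Python below reads the labels from the meta response shown above and applies them to a data row. The function names are made up for this example, and keys such as searchui:boost that the client does not know about are simply ignored.

```python
GRAPH_QUERY = "http://schema.hbp.eu/graph_query/"


def labels_from_meta(meta_response: dict) -> dict:
    """Map each result field to its label, ignoring all other meta information."""
    labels = {}
    for entry in meta_response.get("results", []):
        for field, meta in entry.items():
            if isinstance(meta, dict) and GRAPH_QUERY + "label" in meta:
                labels[field] = meta[GRAPH_QUERY + "label"]
    return labels


def label_row(row: dict, labels: dict) -> dict:
    """Replace field keys by their labels where one is known."""
    return {labels.get(key, key): value for key, value in row.items()}


meta_response = {
    "results": [
        {
            "http://schema.hbp.eu/dataset/search/title": {
                "http://schema.hbp.eu/graph_query/label": "Name",
                "http://schema.hbp.eu/search_ui/boost": 20,
            }
        }
    ]
}
row = {"http://schema.hbp.eu/dataset/search/title": "Foo, Bar"}

labels = labels_from_meta(meta_response)
print(label_row(row, labels))  # {'Name': 'Foo, Bar'}
```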

Contact

Any questions? Contact us: kg-team@humanbrainproject.eu