Tutorial: from databases to JSON-LD

Databases are great. However, no one is going to let you connect directly to their database to share data. We aren’t going to spend any time on all of the ways in which data have been shared between databases (EDI, XML, direct API queries); the answer today is JavaScript Object Notation.

JavaScript Object Notation (JSON from here on out) has a small, well-defined, and logical set of rules that let you encode, store, and retrieve structured data in a format that is easily readable by both humans and machines.

What JSON is good for

JSON has become the dominant data exchange format on the Web, and if you aren’t already working with it, you will be when communicating your compliance framework.

The structure of JSON

JSON has only two container structures: objects and arrays. Everything else in a JSON document is a simple value (a string, a number, true, false, or null) that sits inside one of them. More importantly, objects can have embedded objects as well as embedded arrays, and arrays can have embedded objects and arrays. Way cool.

A JSON Object containing an Array
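
As a minimal sketch of that nesting, the Python snippet below parses a small JSON document with the standard json module. The key names (name, tags, owner) and their values are made up for illustration and do not come from the tables in this tutorial.

```python
import json

# Hypothetical document: an object that embeds an array and another object.
text = '{"name": "Example", "tags": ["a", "b"], "owner": {"id": 1}}'

doc = json.loads(text)

# JSON objects become Python dicts; JSON arrays become Python lists.
print(type(doc).__name__)          # dict
print(type(doc["tags"]).__name__)  # list
print(doc["owner"]["id"])          # 1
```

Reaching the nested value takes one lookup per level, which is all "embedded" really means here.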

To explain this, we’ll go back to a couple of the tables from above: a simple name table (top) and a complex name/address table (bottom).

Simple table

Complex name array

Object syntax

The properties of every JSON object are derived from three elements:

  • Key – think field name or table name here.
  • Value – this is the content of the field, and is left blank in the tree view for an object or array.
  • Type – this describes one of three things: an object, an array, or the field type for the content.

JSON combines a key and its value into a property: the two are paired together, separated by a colon, inside a JSON object. This is also called a key:value pair, where the property name comes first and the property value second, as in "property name": "property value".

Property names and values
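
A short sketch of a single property, using Python’s standard json module to write it out. The key and value strings are the placeholder text from the paragraph above, not real field names.

```python
import json

# One property: the key (property name) on the left of the colon,
# the value (property value) on the right.
pair = {"property name": "property value"}

encoded = json.dumps(pair)
print(encoded)  # {"property name": "property value"}
```

Note that JSON keys are always strings wrapped in double quotes, which is why the encoder quotes the property name.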

This is more easily understood once you start adding content to a JSON file. Let’s start with a blank one, below:

Root Object in JSON
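
To make the blank starting point concrete, here is a small sketch that parses an empty root object and then adds one property to it. The Name key and its value are hypothetical placeholders.

```python
import json

root = json.loads("{ }")  # the blank root object

print(root)       # {}
print(len(root))  # 0: no keys and no values yet

# Adding a first property (hypothetical key and value) to the root object.
root["Name"] = "Example Person"
print(json.dumps(root))  # {"Name": "Example Person"}
```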

A blank root object in JSON is written as a simple pair of curly brackets “{ }”. At this point, there are no keys, no values, and no defined types.

To this, we are going to add the various types of content.

Writing Objects

Objects (think of a single record in a table, even if it only has one field) are surrounded by curly brackets “{ }” to denote that everything inside the brackets is a single object. An object consists of comma-separated key:value pairs (pairings of keys and their values, separated by colons). Here’s the first row of the name table in JSON.

In tree format, the object looks like this:

JSON Tree

In JSON format, it looks like this, where each of the fields in the tree is written on its own line and separated from the next by a comma (the last field takes no trailing comma). The key (field name) always precedes the value (field contents):

A single JSON Object with two key/value pairs
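
The sketch below writes one record as a JSON object with two key/value pairs. The Acct and Name keys follow the name table; the values are placeholders, since the original row contents are not reproduced here.

```python
import json

# Hypothetical first row of the name table; the values are placeholders.
record = {"Acct": 1001, "Name": "Example Person"}

# indent=2 puts each key:value pair on its own line.
encoded = json.dumps(record, indent=2)
print(encoded)

# A comma after the last pair is not valid JSON, so parsing it fails.
try:
    json.loads('{"Acct": 1001, "Name": "Example Person",}')
    trailing_comma_ok = True
except json.JSONDecodeError:
    trailing_comma_ok = False

print(trailing_comma_ok)  # False
```

Commas separate pairs rather than terminate them, which is the detail that most often trips up hand-written JSON.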

The type isn’t presented in basic JSON code (we’ll get to that as a part of JSON-LD in a bit).

Arrays

Simple arrays, as a JSON type, are surrounded by square brackets “[ ]” and consist of comma-separated values. If you wanted to present the column of three numbers as we did in the spreadsheet, it would be expressed as a simple array. In tree format, the simple array looks like this:

JSON array in tree format
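
A simple array can be sketched in a couple of lines. The three numbers are hypothetical stand-ins for the spreadsheet column.

```python
import json

# Hypothetical column of three numbers, written as a simple JSON array.
numbers = json.loads("[1001, 1002, 1003]")

print(numbers)       # [1001, 1002, 1003]
print(numbers[0])    # 1001 (values are reached by position, not by name)
print(len(numbers))  # 3
```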

In JSON format, it looks like this, with the account number split from the account name.

Oversimplified Array
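
The following sketch shows why a flat array is awkward for tabular data. It assumes the array alternates account numbers and names; the values themselves are invented.

```python
import json

# Hypothetical flat array: account numbers and names alternate, with no keys.
flat = json.loads('[1001, "Example Person", 1002, "Sample Person"]')

# Nothing in the data says which value is an account and which is a name;
# the reader has to know the "every two items is one record" convention.
pairs = list(zip(flat[0::2], flat[1::2]))
print(pairs)  # [(1001, 'Example Person'), (1002, 'Sample Person')]
```

The regrouping works only because the reader already knows the layout, which is exactly the knowledge the data should carry itself.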

That won’t work well to create a table. This is where objects as a type come into play. If we rearrange the tree to add a set of object brackets “{ }”, we can then add a key to each property and display each of the name records with its individual fields in the array:

The same array in JSON

Now the JSON structure will separate each record and display the key/value pair for each record in the array:

JSON structure with table and field names
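
Here is a sketch of the named structure: a Names key holding an array of objects, each with Acct and Name keys. The record values are placeholders.

```python
import json

text = """
{
  "Names": [
    {"Acct": 1001, "Name": "Example Person"},
    {"Acct": 1002, "Name": "Sample Person"}
  ]
}
"""

table = json.loads(text)

# Each element of the array is one record with its own key:value pairs.
for row in table["Names"]:
    print(row["Acct"], row["Name"])

# The field names can be recovered from the data itself.
columns = sorted({key for row in table["Names"] for key in row})
print(columns)  # ['Acct', 'Name']
```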

This JSON clearly tells us that this table is called Names and has two fields (keys) named Acct and Name. It is worth comparing the two styles of JSON, with the less-structured version on the left and the more-structured version on the right, as shown below:

Un-named (left) and named (right) JSON arrays

Complex Object Arrays

Now that we understand objects and arrays, let’s go back to that name and address table and combine them. When finished, what we want is:

  • an array of names; and
  • an array of addresses for each name.

So we build out a tree that looks like the one that follows:

A complex object array

With this, the JSON structure holds an array of names and then embeds, within each name record, another array for that name’s addresses.

A complex JSON Object with embedded Arrays
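
A sketch of that shape follows, with an array of names and an array of addresses inside each name. The Addresses, Street, and City keys and all values are assumptions made for illustration; the original table’s column names may differ.

```python
import json

text = """
{
  "Names": [
    {
      "Acct": 1001,
      "Name": "Example Person",
      "Addresses": [
        {"Street": "1 First St", "City": "Springfield"},
        {"Street": "2 Second Ave", "City": "Shelbyville"}
      ]
    },
    {
      "Acct": 1002,
      "Name": "Sample Person",
      "Addresses": [
        {"Street": "3 Third Rd", "City": "Springfield"}
      ]
    }
  ]
}
"""

doc = json.loads(text)

# Flatten the nesting back into table-like rows, one per address.
rows = [
    (person["Acct"], person["Name"], addr["Street"], addr["City"])
    for person in doc["Names"]
    for addr in person["Addresses"]
]
print(len(rows))  # 3
```

Flattening repeats the name on every address row, which is the same trade-off a database join makes.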

While this is great stuff, there isn’t yet enough information to tell a developer how to automatically translate this into a database structure. Remember that basic JSON doesn’t even pass along the field type for each key/value pair.
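
The gap is easy to see in a short sketch. The Opened key and its date value are hypothetical additions to the name record, included only to show a field whose intended type is lost.

```python
import json

record = json.loads(
    '{"Acct": 1001, "Name": "Example Person", "Opened": "2020-01-31"}'
)

# All a parser can report is the basic JSON kind of each value.
types = {key: type(value).__name__ for key, value in record.items()}
print(types)  # {'Acct': 'int', 'Name': 'str', 'Opened': 'str'}
```

Nothing in the document says that Opened is meant to be a date or that Acct is an account identifier rather than a quantity; a developer would have to guess.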

We must turn to JSON-LD for more information.

Adding structured and linked data to JSON

There is no doubt that JSON is the shareable language that most systems currently use to pass data back and forth. However, there was no standardized way to share structured JSON data through the ubiquitous web browsers that everyone uses to communicate.

JSON allows ubiquitous communication

In 2011, Google, Bing, and Yahoo!, soon joined by Yandex, launched a joint effort to unify a structured data vocabulary for the web: the vocabulary repository at Schema.org. JavaScript Object Notation for Linked Data (JSON-LD), standardized by the W3C, became one of the formats used to express that vocabulary.

Schema.org

The initial goal for JSON-LD was to annotate elements on a web page, structuring the data so that search engines can disambiguate elements and establish facts about entities, which in turn makes for a more organized, better web overall.

JSON-LD

The Context

The first element that retains a permanent place in JSON-LD markup is the @context, whose value is the URL of the schema you are going to use. Currently, there are two known schemas that support compliance frameworks: http://schema.org and, here, https://grcschema.org. In the tree view, the context is laid out as an array of information.

The context
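
To give a feel for what the context does, the sketch below uses a tiny inline context that maps one short key to a full identifier. The expand_keys helper is a hypothetical toy written for this example; it is not the real JSON-LD expansion algorithm and handles only flat documents.

```python
import json

# Hypothetical document with an inline context that maps one short key
# to a full identifier in the schema.org vocabulary.
doc = json.loads("""
{
  "@context": {"name": "http://schema.org/name"},
  "name": "Example Person"
}
""")

def expand_keys(document):
    """Replace each key with the identifier its context assigns to it.

    A toy stand-in for JSON-LD expansion: it handles only flat documents
    whose context maps terms directly to strings.
    """
    context = document["@context"]
    return {
        context.get(key, key): value
        for key, value in document.items()
        if key != "@context"
    }

expanded = expand_keys(doc)
print(expanded)  # {'http://schema.org/name': 'Example Person'}
```

With the context in place, two systems that both say name can confirm they mean the same thing.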

One thing to notice here in JSON-LD is the wealth of information about the data structure that is also passed to the reader! The rdfs:label gives you the object name, while the rdfs:comment gives you information about the object you are dealing with.

rdfs:comment

The Type

The second element in the JSON-LD Schema “always there” squad is the @type specification (after the colon, it becomes all data annotation). @type specifies the item type being marked up. All types have Thing as their top level, as shown below:

@type
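
A minimal sketch of @type in use follows. The PARENT table is a hand-written excerpt covering two types, included only to illustrate the idea that every type leads back to Thing; consult the schema itself for the full hierarchy.

```python
import json

doc = json.loads("""
{
  "@context": "http://schema.org",
  "@type": "Person",
  "name": "Example Person"
}
""")

# Hand-written excerpt of the type hierarchy, for illustration only.
PARENT = {"Person": "Thing", "Organization": "Thing", "Thing": None}

def ancestry(type_name):
    """Return the chain from a type up to its top-level ancestor."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = PARENT[type_name]
    return chain

print(ancestry(doc["@type"]))  # ['Person', 'Thing']
```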

Schema Properties

Within JSON-LD, each object’s properties are described in depth. Below we present the property for first_name and are able to tell the reader that this is a text element and, in the comments, that it represents a person’s first name.

schema:Property
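
The sketch below shows what such a property definition might look like and how a reader could summarize it. The shape is an assumption: the schema:rangeIncludes key is borrowed from the way schema.org publishes its own definitions, and the exact keys a given schema uses may differ.

```python
import json

# Hypothetical property definition; the exact keys a schema uses may differ.
definition = json.loads("""
{
  "@id": "first_name",
  "@type": "schema:Property",
  "rdfs:label": "first_name",
  "rdfs:comment": "A person's first name.",
  "schema:rangeIncludes": {"@id": "schema:Text"}
}
""")

def describe(prop):
    """Summarize a property definition in one line."""
    label = prop["rdfs:label"]
    value_type = prop["schema:rangeIncludes"]["@id"]
    return f"{label} ({value_type}): {prop['rdfs:comment']}"

summary = describe(definition)
print(summary)  # first_name (schema:Text): A person's first name.
```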

Schema Arrays

In addition to standard schema properties, the JSON-LD context can also tell the reader that what is being presented is an array. In the example below, the person object allows for an array of additional e-mail addresses. This is described, in the JSON-LD context, as a set (“@set”):

@set
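
As a sketch, a context can mark a term as a set with "@container": "@set". The additional_emails term name and the normalize helper are hypothetical; the helper gives a simplified reading of the rule that a set term is always treated as an array, even when only one value is supplied.

```python
import json

doc = json.loads("""
{
  "@context": {
    "additional_emails": {
      "@id": "http://schema.org/email",
      "@container": "@set"
    }
  },
  "additional_emails": "one@example.com"
}
""")

def set_terms(context):
    """Names of the terms the context declares as sets."""
    return [
        term for term, spec in context.items()
        if isinstance(spec, dict) and spec.get("@container") == "@set"
    ]

def normalize(document):
    """Wrap single values of @set terms in a list (a simplified reading)."""
    result = dict(document)
    for term in set_terms(document["@context"]):
        if term in result and not isinstance(result[term], list):
            result[term] = [result[term]]
    return result

emails = normalize(doc)["additional_emails"]
print(emails)  # ['one@example.com']
```

Declaring the set up front means consuming code can always loop over the value without first checking whether it is a single string.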

By labeling each type of thing in a JSON object, you can give developers the code they need to create structured pages that use either Microdata or RDFa to add attributes to the HTML tags that correspond to the user-visible content you want to describe.

Labeled JSON-LD
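
A common way to ship labeled JSON-LD with a web page is inside a script element of type application/ld+json. The sketch below builds one; the to_script_tag helper and the document values are hypothetical, and this is one possible approach rather than the only one.

```python
import json

doc = {
    "@context": "http://schema.org",
    "@type": "Person",
    "name": "Example Person",
}

def to_script_tag(document):
    """Serialize a JSON-LD document into an HTML script element."""
    body = json.dumps(document, indent=2)
    # Escape "</" so a value cannot close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'

html = to_script_tag(doc)
print(html)
```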