JSON User Guide

Introduction

This guide shows how to export your graph to a JSON text and how to populate a graph from a JSON text.

On import, your JSON text may embed node and edge sections at any point. To be recognized, these sections must adhere to the Graph for Scala JSON Grammar. This extended JSON grammar was designed with flexibility in mind:

  • An arbitrary number of node and edge sections within the same JSON text will be processed to support different node and edge types within the same graph.
  • JSON texts to be imported may also include any non-graph related data that will be skipped.
  • All identifiers within the JSON text marking node and edge sections or node and edge types are configurable.
  • You have full control over JSON formats to represent nodes and edges.
  • You also have fine-grained control over each phase of the import/export process.

With the exception of serializers, Graph for Scala JSON import/export is transparently implemented on top of Lift-Json.

Loosely coupled with the core module, Graph for Scala JSON is supplied as an extra artifact. The core and JSON modules need not have the very same version. Please refer to the latest release of graph-json to ensure compatibility.
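
As a sketch of what this could look like in an sbt build: the coordinates below assume that both modules are published under the organization org.scala-graph with the artifact names graph-core and graph-json, which you should verify against the release notes. The version strings are placeholders to fill in, not recommendations.

```scala
// build.sbt (sketch): coordinates are an assumption, versions are placeholders
val graphCoreVersion = "x.y.z" // placeholder: take from the core release notes
val graphJsonVersion = "x.y.z" // placeholder: may differ from the core version

libraryDependencies ++= Seq(
  "org.scala-graph" %% "graph-core" % graphCoreVersion,
  "org.scala-graph" %% "graph-json" % graphJsonVersion
)
```

Keeping the two versions in separate values reflects that the modules are released independently.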

Most examples below are based on an imaginary, simplistic academic library represented by a graph. In this library graph, books and authors are represented by nodes, their relations by directed edges or directed hyperedges in case a book has several authors:


// ADT of nodes
sealed trait Library
case class Book(title: String, isbn: String)          extends Library
case class Author(surName: String, firstName: String) extends Library

// nodes: two books, four authors
val (programming, inDepth) = (
  Book("Programming in Scala", "978-0-9815316-2-5"),
  Book("Scala in Depth",       "978-1-9351827-0-2")
)
val (martin, lex, bill, josh) = (
  Author("Odersky", "Martin"),
  Author("Spoon", "Lex"),
  Author("Venners", "Bill"),
  Author("Suereth", "Joshua D.")
)

// mixed graph with two author-book relations
val library = Graph[Library, AnyDiHyperEdge](
  OneOrMore(programming) ~~> OneOrMore(martin, lex, bill),
  inDepth ~> josh
)

The full example code is included in JsonDemoSpec.scala.

Export your graph

To export your graph to JSON text, call toJson:

import scalax.collection.io.json._
val exported = library.toJson(descriptor)

Alternatively, you can control export phases one by one:

import scalax.collection.io.json.exp.Export

val export = new Export[N,E](library, descriptor)
import export._

val (nodesToExport, edgesToExport) = (jsonASTNodes, jsonASTEdges)
val astToExport = jsonAST(nodesToExport ++ edgesToExport)
val exported = jsonText(astToExport)

Clearly, exported of type String will contain the JSON text, but what about the descriptor argument?

Work with descriptors

Fine-grained control over JSON import/export is achieved by means of Graph JSON descriptors, a kind of export/import configuration made up of

  • node descriptors per node class
  • edge descriptors per edge class and
  • node and edge JSON section keys

Each descriptor is identified by its `typeId: String`, which is also used in the JSON text.

Prior to calling toJson, think about

  • which node/edge classes are present in your graph. You will need to pass a descriptor for every class.
  • how you want to serialize your classes. You can go with predefined serializers or implement your own serializer.

In case of our academic library example you could start with

val bookDescriptor = new NodeDescriptor[Book](typeId = "Books") {
  def id[B >: Book](node: B): String = node match {
    case Book(_, isbn) => isbn
  }
}
val authorDescriptor = new NodeDescriptor[Author](typeId = "Authors") {
  def id[B >: Author](node: B): String = node match {
    case Author(surName, firstName) => "" + surName.head + firstName.head
  }
}

import scalax.collection.io.json.descriptor.predefined.DiHyper
val quickJson =
  Descriptor[Library](authorDescriptor, bookDescriptor)(DiHyper.descriptor[Library]())()

First, we defined node descriptors for the node classes Book and Author respectively where

  • typeId denotes the node type in the JSON node sections like Books in
    {"nodes":{
       "Books":[{"title":"Programming in Scala","isbn":"978-0-9815316-2-5"}, ...
    ]}}
  • def id returns a unique shortcut for each node that will be inserted in serialized edges to reference the ends of the edge like "SJ" in
    {"edges":{
       "DiEdge":[["978-1-9351827-0-2","SJ"], ...]
    }}
    Without these node Ids, JSON edges would have to contain all node data meaning that, in proportion to the complexity of nodes and the order of the graph, JSON texts would explode in length.
    Please exercise care when implementing def id to ensure uniqueness.
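
Because serialized edges carry only these Ids, two nodes sharing an Id could not be told apart on import. The following is a minimal sketch in plain Scala of how a candidate id function might be checked for collisions before it is used in a descriptor. The names IdCheck, authorId and collisions are hypothetical helpers and not part of Graph for Scala; Author is redefined here only to keep the sketch self-contained.

```scala
// Plain Scala sketch; IdCheck, authorId and collisions are hypothetical
// helpers, not part of the Graph for Scala API.
final case class Author(surName: String, firstName: String)

object IdCheck {
  // Same scheme as authorDescriptor: initials of surname and first name.
  // Note that .head throws on an empty string.
  def authorId(a: Author): String = "" + a.surName.head + a.firstName.head

  // Groups nodes by id and keeps only the ids shared by more than one node.
  def collisions[N](nodes: Iterable[N])(id: N => String): Map[String, List[N]] =
    nodes.toList.groupBy(id).filter { case (_, sameId) => sameId.size > 1 }
}
```

For the four authors of the library example, IdCheck.collisions(authors)(IdCheck.authorId) returns an empty map. A further author whose initials are also S and J would be reported under the key "SJ", which is a hint that initials alone are a fragile Id scheme for larger graphs.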

Second, we assembled a Descriptor with the type argument Library, the node descriptors authorDescriptor and bookDescriptor, and the predefined edge descriptor DiHyper. Predefined edge descriptors have a typeId equal to their name and are type-safe with respect to the corresponding predefined edge class.

Next, if you try to export the graph with quickJson, you get a run-time exception telling you that "No 'NodeDescriptor' capable of processing type "demo.Book" found". So the above quickJson is not complete, as you might have suspected. Since the descriptor must deal with all node and edge classes contained in the graph, we also need to pass an edge descriptor for DiEdge. With this lesson learned, here is a complete descriptor covering our academic library graph:

import scalax.collection.io.json.descriptor.predefined.{Di, DiHyper}
val descriptor =
  Descriptor[Library](
    authorDescriptor,
    bookDescriptor
  )(DiHyper.descriptor[Library](),
    Di.descriptor[Library]()
  )()

With the above descriptor, calling toJson yields

{
  "nodes":{
    "Books":[{
      "title":"Scala in Depth",
      "isbn":"978-1-9351827-0-2"
    },{
      "title":"Programming in Scala",
      "isbn":"978-0-9815316-2-5"
    }],
    "Authors":[{
      "surName":"Odersky",
      "firstName":"Martin"
    },{
      "surName":"Spoon",
      "firstName":"Lex"
    },{
      "surName":"Venners",
      "firstName":"Bill"
    },{
      "surName":"Suereth",
      "firstName":"Joshua D."
    }]
  },
  "edges":{
    "DiEdge":[{
      "n1":"978-1-9351827-0-2",
      "n2":"SJ"
    }],
    "DiHyperEdge":[{
      "nodeIds":["978-0-9815316-2-5","OM","SL","VB"]
    }]
  }
}

Let's analyze this JSON text in more detail:

You can easily identify the node and edge sections denoted by the keys "nodes" and "edges". These are the default keys that you can alter by passing different ones to sectionKeys.

The JSON text above may draw criticism for being cluttered with the repeated field names "surName", "firstName" etc. If you find such output too verbose, go for what we call positional JSON, meaning that JSON values are mapped to the fields of nodes and edges by their position. Generating positional JSON on export requires a few more lines of code, however: you need to supply appropriate Lift-Json custom serializers such as

private object PositionedNodeDescriptor {
  import net.liftweb.json._

  private class AuthorSerializer
      extends CustomSerializer[Author](_ =>
        (
          { case JArray(JString(surName) :: JString(firstName) :: Nil) =>
            Author(surName, firstName)
          },
          { case Author(surName, firstName) =>
            JArray(JString(surName) :: JString(firstName) :: Nil)
          }
        )
      )

  val author = new NodeDescriptor[Author](typeId = "Authors", customSerializers = Seq(new AuthorSerializer)) {
    def id[B >: Author](node: B): String = node match {
      case Author(surName, firstName) => "" + surName(0) + firstName(0)
    }
  }
}

For each node class we need to provide a net.liftweb.json.Serializer, here by extending CustomSerializer, which is fairly straightforward. Then we pass the custom serializer AuthorSerializer to the node descriptor author. Above, we have separated out the implementation into the object PositionedNodeDescriptor that exposes the specific node descriptor author. It should also contain a specific serializer and descriptor for Book, which we left out intentionally.
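
For orientation, the omitted Book serializer could be written by analogy with AuthorSerializer. The sketch below is an assumption about what it might look like, not the code from the demo project. It relies only on Lift-Json's CustomSerializer and redefines Book so that it compiles on its own.

```scala
import net.liftweb.json._

// Redefined here only to keep the sketch self-contained.
final case class Book(title: String, isbn: String)

// Hypothetical counterpart of AuthorSerializer:
// a Book is represented as the JSON array [title, isbn].
class BookSerializer
    extends CustomSerializer[Book](_ =>
      (
        // JSON array -> Book
        { case JArray(JString(title) :: JString(isbn) :: Nil) =>
          Book(title, isbn)
        },
        // Book -> JSON array
        { case Book(title, isbn) =>
          JArray(JString(title) :: JString(isbn) :: Nil)
        }
      )
    )
```

Inside PositionedNodeDescriptor, a node descriptor for books would then presumably be built like author, with typeId "Books", the ISBN as Id and new BookSerializer passed as the custom serializer.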

Now we are ready to assemble a JSON descriptor utilizing our positioned node serializers. At the same time we also pass positioned edge serializers that are predefined for all supplied edge classes:

import scalax.collection.io.json.serializer.EdgeSerializer

val descriptor = Descriptor[Library](
  PositionedNodeDescriptor.author,
  PositionedNodeDescriptor.book
)(
  DiHyper.descriptor[Library](),
  Di.descriptor[Library](Some(new EdgeSerializer))
)()

Armed with the above descriptor we finally call

val exported = library.toJson(descriptor)

to verify that the resulting JSON text is now condensed:

{
  "nodes":{
    "Books":[
      ["Scala in Depth","978-1-9351827-0-2"],
      ["Programming in Scala","978-0-9815316-2-5"]
    ],
    "Authors":[
      ["Odersky","Martin"],
      ["Spoon","Lex"],
      ["Venners","Bill"],
      ["Suereth","Joshua D."]
    ]
  },
  "edges":{
    "DiHyperEdge":[[{ "sources":["978-0-9815316-2-5"] }, { "targets":["OM", "SL", "VB"] }]],
    "DiEdge":[["978-1-9351827-0-2","SJ"]]
  }
}

Import JSON texts

Once you are familiar with the design of Graph for Scala JSON descriptors, there is little more to learn about populating Graph instances from JSON texts. To process a JSON text, you simply call fromJson:

import scalax.collection.io.json._
val library = Graph.fromJson[Library,HyperEdge[Library]](jsonTextLibrary, descriptor)

library of the type Graph[Library,HyperEdge[Library]] will then contain all nodes and edges derived from the node and edge sections of the JSON text jsonTextLibrary. Most of the time you can reuse the very same descriptor that you assembled for the export.

Alternatively, you can control import phases one by one:

import scalax.collection.io.json.imp.Parser._
val parsed = parse(jsonText, descriptor)
val result = Graph.fromJson[...](parsed)

Note on inversion

val expLibrary = library.toJson(descriptor)
Graph.fromJson[Library,HyperEdge](expLibrary, descriptor) shouldBe library

If you think of the JSON export as the inverse function of JSON import, the following rules apply:

  • Import(Export(graph)) == graph, but
  • Export(Import(JSON-text)) does not necessarily equal JSON-text.

This is because a JSON-text is a sequence of characters while graphs contain unordered sets of nodes and edges.
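
The asymmetry can be illustrated without any graph library. The sketch below uses a plain Set[String] as a stand-in for a graph's node set and a hypothetical render/parse pair as a stand-in for export/import. None of these names belong to Graph for Scala, and the text format is deliberately simplistic.

```scala
// Plain Scala sketch; render and parse are hypothetical stand-ins
// for export and import, not Graph for Scala functions.
object InversionSketch {
  // Stand-in for export: writes the elements in the order given.
  def render(ids: Seq[String]): String = ids.mkString("[", ",", "]")

  // Stand-in for import: the order is lost because the result is a Set.
  def parse(text: String): Set[String] =
    text.stripPrefix("[").stripSuffix("]").split(",").filter(_.nonEmpty).toSet
}
```

Here render(Seq("OM", "SL", "VB")) and render(Seq("VB", "OM", "SL")) are different texts, yet parse maps both to the same set. Importing and re-exporting may therefore reorder elements, whereas exporting and re-importing always restores an equal set.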

Grammar

nodeSection^0..* ::= JsonField( nodeSectionKey : nodeValues )
nodeValues       ::= nodeList
                 |   JsonObject( nodeTypeId : nodeList )^0-1
nodeList         ::= JsonArray( JsonObject( nodeFieldId : nodeField )^1..* )^0-1
                 |   JsonArray( JsonArray ( nodeField )^1..* )^0-1
nodeField        ::= JsonValue
edgeSection^0..* ::= JsonField( edgeSectionKey : edgeValues )
edgeValues       ::= edgeList
                 |   JsonObject( edgeTypeId : edgeList )^0-1
edgeList         ::= JsonArray( JsonObject( edgeIdFields )^2..* )^0-1
                 |   JsonArray( JsonArray ( edgeFields )^2..* )^0-1
edgeIdFields     ::= (edgeFieldId : edgeField)^1..*
edgeFields       ::= (edgeField)^1..*
edgeField        ::= JsonValue

Notes on the grammar notation

  1. Entries with the prefix Json refer to JSON values as defined in RFC 4627. The parentheses following such a Json entry are not part of the syntax. For instance,
           JsonArray( JsonObject( edgeIdFields ))
    reads "a JSON array containing JSON objects containing edgeIdFields".
  2. If the multiplicity of a repetitive JSON element is restricted, the allowed multiplicity is given in superscript notation, written here as a suffix introduced by ^. For instance,
    JsonObject( edgeTypeId : edgeList )^0-1 translates to
    ‘{‘ edgeTypeId ‘:‘ edgeList ‘}‘
    with zero or one field in the JSON object. Thus it reads "a JSON object containing zero or one field".

Notes on specific grammar elements

  1. nodeSection/edgeSection JSON fields:

    The JSON text passed to the Graph conversion method fromJson will be parsed for an arbitrary number of nodeSections and edgeSections both described in the above grammar.

  2. *Id and *Key JSON strings:

    Grammar elements suffixed by Id or Key such as nodeSectionKey, nodeTypeId, edgeSectionKey or edgeTypeId are always JSON strings. In general, they allow using custom JSON constants.
    For instance, JSON objects containing edges will be found in the JSON text based on edgeSectionKey, which defaults to "edges" but may be altered to any other name such as "links".

  3. nodeTypeId/edgeTypeId JSON Strings:

    These Ids allow for choosing the appropriate node and edge descriptor at run time.

  4. nodeList/edgeList JSON arrays:

    Nodes and edges listed in nodeList and edgeList may be represented either by JSON objects (named fields) or by JSON arrays (positioned field values).