Presumably, you go for graphs because you have things that are connected, and you want to easily navigate along those connections. So the things become the nodes and their connections the edges. For instance, think of a graph describing relationships between persons. Persons become the nodes and their relations become the edges:
case class Person(name: String) val g: Graph[Person, UnDiEdge[Person]] = ...
Person
and undirected edge type UnDiEdge
.
The predefined UnDiEdge
suffices as long as you need a single type of relation between persons.
Simple enough, why should we waste time worrying about nodes and edges that are self-evident? Well, because real life often leaves you with different options.
Let's consider some options for modelling a database schema including
Intuitively, Tables and columns become nodes, edges represent which columns a table has. Next, what about indexes? Indexes could be modelled as another node type since they are kind of things, not just connections:
sealed trait SchemaObject case class Table(name: String) extends SchemaObject case class Column(name: String) extends SchemaObject case class Index(name: String) extends SchemaObject val g: Graph[SchemaObject, DiEdge[SchemaObject]] = ...
DiEdge
may point
The above attempt has some deficiencies, though. First of all, there is no guarantee that nodes will be connected as expected. For instance, it is possible to add a directed edge from a column to an index. Such an arrow would result in a wrong set of adjacent nodes which should always be tables. Here is what we should prefer as a first improvement:
// ... (node ADT unchanged) sealed trait Relation case class TableColumn(table: Table, column: Column) extends AbstractDiEdge(from = table, to = column) with Relation case class IndexColumn(index: Index, column: Column) extends AbstractDiEdge(from = index, to = column) with Relation val g: Graph[SchemaObject, Relation] = ...
DiEdge
, Relation
has no type parameter.
Hence, we call such a graph typed graph.
Another drawback of the above design is that we missed relating indexes with tables.
To overcome this issue in a property graph, you would use an edge labelled like IndexTable
to be able to distinguish between arrows pointing to columns and arrows pointing to the table.
Alternatively you could check that target node for its type.
Keeping this in mind, let's move to the next drawback because once we get that resolved we also get rid of this
IndexTable
issue.
You have probably noticed, that an index is existentially bound to a table what is proven by the fact that if you drop a table, the index needs be dropped as well. In terms of a graph it feels a little strange if a node needs be dropped on dropping another node. Would it be more viable to model an index as an edge?
In fact, you can model indexes as hyperedges.
In the database schema demo
we play around with this approach, including Subset
,
but to start with take a look at the following simplified version:
sealed trait SchemaObject case class Table(name: String) extends SchemaObject case class Column(name: String) extends SchemaObject sealed trait Relation case class TableColumn(table: Table, column: Column) extends AbstractDiEdge(from = table, to = column) with Relation case class Index(table: Table, columns: OneOrMore[Column]) extends AbstractDiHyperEdge(one(table), columns) with Relation val g: Graph[SchemaObject, Relation] = ...
Index
any more.Index
relates a table with a non-empty list of columns.
OneOrMore
is a data type supplied with Graph for Scala.
It represents a list of at least one
while Several
represents a list of at leas two elements.
OneOrMore
and Several
facilitate type-safe hyperedges because
a directed hyperedge has one or more sources and also one or more targets
while a hyperedge has at least two ends.
With this approach we have won the following:
With hyperedges at your fingertips, don't hesitate to take advantage of them whenever they improve your design.
If the relation you plan to represent is symmetrical go for undirected edges or undirected hyperedges. To test for symmetry, just read the relation from both sides: If you end up with the same phrase from both sides your relation is symmetrical. For instance, in a road net locations are symmetrical tested by "I can drive from location A to location B and, likewise, I can drive from location B to location A." Let`s deal with one-way roads below.
As an example for asymmetrical relations, think of a person net that includes family as well as social relations. Then, parenthood is asymmetrical proven by "A is a parent of B and B is not a parent of A". B is rather a child of A. Also, assume you want to represent both parenthood and children in some way.
For asymmetrical relations, different options are viable. Concerning our example you may
case class Person(name: String) sealed trait Relation case class Parent(child: Person, parent: Person) extends AbstractDiEdge(from = child, to = parent) with Relation // ...more relations val g: Graph[Person, Relation] = ...
sealed trait Relation case class Parent(child: Person, parent: Person) extends AbstractDiEdge(from = child, to = parent) with Relation case class Child(parent: Person, child: Person) extends AbstractDiEdge(from = parent, to = child) with Relation // ...more relations
sealed trait Relation case class Parent(child: Person, parent: Person) extends AbstractUnDiEdge(child, parent) with Relation // ...more relations
sealed trait Relation case class Parents(child: Person, mather: Person, father: Person) extends AbstractDiHyperedge(sources = one(child), targets = OneOrMore(mather, father)) with Relation case class Children(parent: Person, children: OneOrMore[Person]) extends AbstractDiHyperedge(sources = one(parent), targets = children) with Relation // ...more relations
Most often you also need some edge label. Feel free to add any number of fields to your typed edge for this purpose.
In terms of our above database schema example, we could add a "unique" property to the Index
edge like:
case class Index(table: Table, columns: OneOrMore[Column], unique: Boolean) extends AbstractDiHyperEdge(one(table), columns) with Relation
As a second example, we could extend the above Parent
edge like
case class Parent(child: Person, parent: Person, `type`: ParenthoodType) extends AbstractDiEdge(from = child, to = parent) with Relation
with ParenthoodType
being an enumeration like {Biological, Adoptive, Surrogate...}.
In case you are still wondering, here is why we refrain from providing generic labelled edges:
label
.
You are free to mix any combination of directed, undirected and hyperedges in the same graph. Just look up the lowest common ancestor of your edge types in
and use the corresponding abstraction with the prefix Any
like
sealed trait Relation extends AnyEdge[Person] case class Parent(child: Person, parent: Person) extends AbstractDiEdge(from = child, to = parent) with Relation case class Friends(one: Person, another: Person) extends AbstractUnDiEdge(one, another) with Relation val g: Graph[Person, Relation] = ...
AnyEdge
is the lowest common ancestor of DiEdge
and UnDiEdge
.
Mixed graphs work for generic and typed edges alike.
The previous example graph supporting edges of the type Parent
or Friends
works fine.
You can connect two given Person
s with a single edge or both Parent
and Friends
while you cannot connect them by more than one edge of the same type like Friends
.
This is the case because AbstractDiEdge
and AbstractUnDiEdge
apply a different built-in
equality with respect to the same nodes. In specific,
But what happens if we add yet another edge type?
case class Siblings(one: Person, another: Person) extends AbstractUnDiEdge(one, another) with Relation
Now Friends
and Siblings
have the same equality,
so you will not be able to connect the same nodes by both edge types.
To resolve this conflict mix in MultiEdge
which serves to extend equality:
case class Friends(one: Person, another: Person) extends AbstractUnDiEdge(one, another) with MultiEdge with Relation { def extendKeyBy = OneOrMore.one(Friends) } case class Siblings(one: Person, another: Person) extends AbstractUnDiEdge(one, another) with MultiEdge with Relation { def extendKeyBy = OneOrMore.one(Siblings) }
MultiEdge
requires extendKeyBy
to be implemented.
Defining a single constant key suffices here.
Friends
but we need a different constant.
Using constants ensures that you cannot connect two given Person
s by the same edge type
more than once.
In contrast, whenever you need to allow for several edges of the same type between two given nodes
you would implement extendKeyBy
by referring to some discriminator value specific to edge instances.
For instance, were you also keeping track of Meeting
s between Person
s,
you could go with something like
case class Meeting(a: Person, b: Person, time: Datetime) extends AbstractUnDiEdge(a, b) with MultiEdge with Relation { def extendKeyBy = one(time) }