Core User Guide: Traversing Graphs

Graph traversals are explicitly or implicitly invoked by means of a Traverser that extends scala.collection.Traversable in turn. You determine the type parameter of Traversable by selecting a Traverser factory method such as innerNodeTraverser.

Needless to say, all methods relying on recursion are tail recursive.

Traversing for a Specific Node, a Path or Cycle

Often you are solely interested in the result of a graph traversal such as a specific node, a path, cycle or subgraph rather than tracking the visited elements. Chances are good that you'll find a method that directly your requirement and that a Traverser will be provided transparently for you.

Finding Paths
A mixed graph example.
import scalax.collection.edge.WDiEdge
import scalax.collection.edge.Implicits._

val g = Graph(1 ~ 2 % 4, 2 ~ 3 % 2, 1 ~> 3 % 5, 1 ~ 5  % 3,
              3 ~ 5 % 2, 3 ~ 4 % 1, 4 ~> 4 % 1, 4 ~> 5 % 0)
def n(outer: Int): g.NodeT = g get outer  // look up 'outer' that is known to be contained

n(1) findSuccessor (_.outDegree >  3) // Option[g.NodeT] = None 
n(1) findSuccessor (_.outDegree >= 3) // Option[g.NodeT] = Some(3)
n(4) findSuccessor (_.edges forall (_.undirected)) // Some(2)
n(4) isPredecessorOf n(1)             // true 
n(1) pathTo n(4)                      // Some(Path(1, 1 ~> 3 % 5, 3, 3 ~ 4 % 1, 4))
n(1) pathUntil (_.outDegree >= 3)     // Some(Path(1, 1 ~> 3 % 5, 3))

val spO = n(3) shortestPathTo n(1) // Path(3, 3 ~ 4 % 1, 4, 4 ~> 5 % 0, 5, 1 ~ 5 % 3, 1)
val sp = spO.get                   // here we know spO is defined
sp.nodes                           // List[g.NodeT] = Nodes(3, 4, 5, 1)
sp.weight                          // Double = 4.0

def negWeight(e: g.EdgeT): Float = 5.5f - e.weight
val spNO =
    n(3) shortestPathTo (n(1), negWeight) // Path(3, 2 ~ 3 % 2, 2, 1 ~ 2 % 4, 1)
val spN = spNO.get                        // here we know spNO is defined
spN.weight                                // Double = 6.0

val pO1 = n(4).withSubgraph(nodes = _ < 4) pathTo n(2) // Some(Path(4, 3 ~ 4 % 1, 3, 2 ~ 3 % 2, 2))
pO1.map(_.nodes)                                       // Some(Nodes(4, 3, 2))

val pO2 = n(4).withSubgraph(edges = _.weight != 2) pathTo n(2)
                                   // Some(Path(4, 4 ~> 5 % 0, 5, 1 ~ 5 % 3, 1, 1 ~ 2 % 4, 2))
pO2.map(_.nodes)                   // Some(Nodes(4, 5, 1, 2))

As to

  1. Searches for any (direct or indirect) successor of node 1 having outDegree > 3 and finds None.
  2. Searches for any successor of node 1 having outDegree >= 3 and finds node 3. This also means that there exists a path from node 1 to node 3. To get the full path call findPath.
  3. Searches for any successor of node 4 having only undirected edges and finds node 2.
  4. Successfully tests for node 4 being a predecessor of node 1.
  5. Finds an arbitrary path from node 1 to node 4.
  6. Finds a path from node 1 to an arbitrary node having outDegree >= 3.
  7. Determines the shortest path from node 3 to node 1 and gets the result from Option. Calling get is okay in this example because we know that there must exist a path.
  8. Reduces path p to the List of nodes on the path. The returned path is an instance of the inner class Path facilitating further functionality...
  9. ...so, among others, it provides weight to calculate the total of the edge weights on this path.
  10. Defines a custom weight method to override static edge weights. We are using float to demonstrate that the weight method may be of any numeric type.
  11. Calls shortestPathTo with the custom weight method negWeight. The returned path is the longest one in terms of static edge weights as 5.5f exceeds the maximum static weight.
  12. Note that the weight of the returned path does not reflect the custom weights used to calculate the shortest path but just the static weights of the edges on the path.
  13. Finds a path from node 4 to node 2 restricting the traversal to a subgraph containing only nodes with value less than 4.
  14. Finds a path from node 4 to node 2 restricting the traversal to a subgraph containing only edges with weight of 2.
val g = Graph(1~>2, 1~>3, 2~>3, 3~>4, 4~>2)
val fc1 = g.findCycle         // Some(Cycle(3, 3 ~> 4, 4, 4 ~> 2, 2, 2 ~> 3, 3))
val fc2 = (g get 4).findCycle // Some(Cycle(4, 4 ~> 2, 2, 2 ~> 3, 3, 3 ~> 4, 4))

for (c1 <- fc1; c2 <- fc2) yield c1 == c2     // false
for (c1 <- fc1; c2 <- fc2) yield c1 sameAs c2 // true

g.findCycleContaining(g get 1)                // None

As to

  1. Finds an arbitrary cycle in g, if any.
  2. Finds an arbitrary cycle reachable from node 4, if any.
  3. c1 and c2 are not equal because they start at different nodes.
  4. Irrespective of starting at different nodes, c1 and c2 depict the same cycle in a semantic sense.
  5. There exists no cycle containing node 1.

Cycle detecting works for directed, undirected and mixed graphs, alike. If you have to ensure that your Graph instance is acyclic on any operation consider utilizing the Acyclic constraint in the Constrained module.

Refining Traversers by Supplying Properties

Except for withSubgraph, we did not make use of properties in the above examples. Properties, all prefixed by with, encourage a fluent way of method calls, hence we refer to them as fluent properties. You may refine any Traverser by means of the following properties:

Table: Traversal properties.
Property Factory method Sample Value Described at
kind: Kind withKind DepthFirst members of Parameters
direction: Direction withDirection Successors members of Parameters
maxDepth: Int withMaxDepth 30 members of Parameters
maxWeight: MaxWeight withMaxWeight 125.3 members of FluentProperties and
subclasses of Properties
nodes: (NodeT) => Boolean,
edges: (EdgeT) => Boolean
withSubgraph _.inDegree < 10 members of FluentProperties and
subclasses of Properties
ordering: GraphTraversal.ElemOrdering withOrdering myOrdering members of FluentProperties and
subclasses of ElemOrdering

For instance, to exclude a specific node from the traversal in a lazy manner you'd call withSubgraph like

(g get 1).withSubgraph(nodes = _ != 2) pathTo (g get 4)

For more examples on how to make use of properties search for the above factory methods in TTraversal.scala

Ordering

As a special case, assume you want to control the order of edges to be traversed. You can do so by setting the ordering property. Since ordering has been tricky in general let's look at an example ensuring that edges are traversed in reverse order of their weights:

import edge.WDiEdge

val root = 1
val g = Graph(root ~> 4 % 2, root ~> 2 % 5, root ~> 3 % 4,
                 3 ~> 6 % 4,    3 ~> 5 % 5,    3 ~> 7 % 2)

def edgeOrdering = g.EdgeOrdering(g.Edge.WeightOrdering.reverse.compare)
val traverser = (g get root).outerNodeTraverser.withOrdering(edgeOrdering)
 
traverser.toList  // List(1, 2, 3, 4, 5, 6, 7)

Note that node and edge ordering are limited compared with a priority queue.

Traversing for Graph Elements

Traversers also enable to inspect graph elements (nodes and/or edges) during the traversal. By doing so they ensure that the passed visitor method is called once for each visited element. As a side note, DownUpTraversers are different in that they call the visitor twice.

Depending on what graph elements you're interested in, select one of the following Traverser factory methods:

Table: Traverser factory methods.
Factory Method Type of Visitor
innerNodeTraverser path dependent NodeT
outerNodeTraverser N type parameter of the graph
innerEdgeTraverser path dependent EdgeT
outerEdgeTraverser E[N] based on the type parameters of the graph
innerElemTraverser InnerElem, the common base of NodeT and EdgeT
outerElemTraverser OuterElem[N,E], the common base of N and E[N]
innerNodeDownUpTraverser (Boolean, NodeT)
outerNodeDownUpTraverser (Boolean, N)

While inner elements provide graph functionality at your fingertips, outer elements feel more natural since they're exactly what you passed to the graph.

The above factory methods are available both at node and graph level. At node level root, the node to start the traversal at, is set to the specific node you are invoking the method on; at graph level root must be passed explicitly. In addition you may pass any traverser property to override the default values:

import scalax.collection.edge.WDiEdge
import scalax.collection.edge.Implicits._

val g = Graph(1 ~> 2 % 1, 1 ~> 3 % 2, 2 ~> 3 % 3, 3 ~> 4 % 1)
val n1 = g get 1

n1.outerNodeTraverser.sum                 // 10
g.outerNodeTraverser(n1).sum              // 10
n1.outerNodeTraverser.withMaxDepth(1).sum //  6

n1.innerEdgeTraverser.map(_.weight).sum   //  7.0

n1.innerElemTraverser.filter(_ match {
  case g.InnerNode(n) => n.degree > 1
  case g.InnerEdge(e) => e.weight > 1
})                                        // Traversable(1, 2, 3, 1 ~> 3 % 2, 2 ~> 3 % 3)

As to

  1. Starts a default traversal at n1 to sum up the node values. sum is provided by scala.colllection.Traversable.
  2. The same traversal as before invoked at graph level.
  3. The same traversal as before but restricted up to a depth of 1.
  4. Calculates the sum of weights over all edges. Here we could employ outerEdgeTraverser as well.
  5. Traverses g for all its elements reachable from n1 to return a set of InnerElem containing nodes with a degree greater than 2 and edges with a weight greater than 1.

Down-up Visitors

Assume you'd like to traverse a graph, especially a tree, in a stack-like manner i.e. you need to carry out one action when touching a node the first time and another, related action when touching the same node the second time in course of traversing up in the imaginary tree. For this purpose you may utilize innerNodeDownUpTraverser:

import scala.collection.mutable.ArrayBuffer

val root = "A"
val g = Graph(root~>"B1", root~>"B2")
val innerRoot = g get root
val result = (ArrayBuffer.empty[String] /: innerRoot.innerNodeDownUpTraverser) {
    (buf, param) => param match {
      case (down, node) => 
        if (down) buf += (if (node eq innerRoot) "(" else "[") += node.toString
        else      buf += (if (node eq innerRoot) ")" else "]")
    }
}
("" /: result)(_+_) // "(A[B1][B2])"

Extended Visitors

In case you need to process the internal state of a traversal including the count of visited nodes, the current search depth, the internal stack etc. you may switch to extended visitors. Beware that these are implementation specific:

val g = Graph(1~>2, 1~>3, 2~>3, 3~>4, 4~>2)

import g.ExtendedNodeVisitor
import scalax.collection.GraphTraversal._
import scalax.collection.GraphTraversalImpl._

type ValDepth = (Int,Int)
var info = List.empty[ValDepth]
(g get 1).innerNodeTraverser.withKind(DepthFirst).foreach {
  ExtendedNodeVisitor((node, count, depth, informer) => {
    info :+= (node.value, depth)
  })
}
info.sortWith((a: ValDepth, b: ValDepth) =>
  a._1  < b._1 || a._1 == b._1 && a._2 < b._2)
// info = List((1,0), (2,1), (3,1), (4,2))

Combined Traversals

Further, when traversing for a result, it is also possible to append a visitor for additional side effects like:

val g = Graph(1 ~> 2, 1 ~> 3, 2 ~> 3, 3 ~> 4, 4 ~> 2)
var center: Option[g.NodeT] = None

(g get 4).findCycle( n =>
  center = center match {
    case s @ Some(c) => if (n.degree > c.degree) Some(n) else s
    case None        => Some(n)
  }
)

The above expression evaluates to an Option[Path]. In course of the traversal the appended visitor also calculates the node with the maximum degree among those visited. The resulting center need not be part of the cycle.

Finding Graph Components

You may search for weakly or strongly connected components. In both cases you first need to decide on where the computation should start:

  • Starting at graph level ensures that all components of the graph will be returned.
  • If you choose to start at a specific node you will get all components reachable from that node, that is a subset of graph-level components.

Finding weakly connected components

val componentEdges = {
  def edges(i: Int) = List(i ~> (i + 1), i ~> (i + 2), (i + 1) ~> (i + 2))
  (edges(1), edges(5))
}
val disconnected = Graph.from(edges = componentEdges._1 ++ componentEdges._2)
val sums =
    for (component <- disconnected.componentTraverser())
    yield (0 /: component.nodes)((cum, n) => cum + n.toOuter) // List(6, 18)

As to

  1. Creates a disconnected graph having two components.
  2. The elements of sums correspond to the sums of the integer node values in both components.

Here is how to search for the weak comopnent containing a specific node:

val anyNode = disconnected.nodes.draw(new util.Random)
anyNode.weakComponent.nodes // componentEdges._1.size = 3

Finding strongly connected components

Strongly connected components are computed based on a tail-recursive version of Tarjan's algorithm.

Let us refer to the sample graph on Wikipedia:

val sccExpected = (
    Graph('a' ~> 'b', 'b' ~> 'c', 'c' ~> 'd', 'd' ~> 'a', 'd' ~> 'e', 'c' ~> 'e', 'e' ~> 'c'),
    Graph('f' ~> 'g', 'g' ~> 'f', 'g' ~> 'h', 'h' ~> 'j', 'j' ~> 'i', 'i' ~> 'g', 'i' ~> 'f', 'f' ~> 'i')
)
val connected = (sccExpected._1 union sccExpected._2) + 'e' ~> 'f'
val scc       = connected.strongComponentTraverser().map(_.toGraph)
scc.toSet == Set(sccExpected._1, sccExpected._2) // true

As to

  1. The graph is comopsed from two strong components connected by a simple directed edge.
  2. The graph-level traverser yields the strong components that we lift to graphs.
  3. Assure that the strong components found equal to the expected ones.

To limit the scope of the search to nodes reachable from a give node, start the search at that node like

val startAt = sccExpected._2.nodes.head
startAt.strongComponents.size // 1
startAt.innerNodeTraverser.strongComponents(println)

As to

  1. When starting at a node of the second component, we get just that component.
  2. In case you are interested in the visited elements during the search invoke the search on a traverser. Of course, the resulting component contains the very same nodes visited before.

On the type Component

What about the type returned by a component search? Component is a strict set of the contained nodes and a lazy set of the contained edges. Further you may invoke toGraph on a component.

Topological Sorting

Graph for Scala comes with three flavors of topological sorting along with a remarkably informative result type. For the following examples, assume graph is a value of type Graph and root is a value of type graph.NodeT.

Graph-level sorting

As a matter of course you can call topologicalSort directly on your graph instance:

graph.topologicalSort.fold(
  cycleNode => ???,
  order     => ???
) 

The result is of type CycleNodeOrTopologicalOrder, a type alias for Either[NodeT, TopologicalOrder[NodeT]]. Left will wrap an inner node being on a cycle in case there exists no topological order. On success, Right contains the calculated TopologicalOrder. When called at graph level, this will include nodes irrespective of whether they are connected or they are members of different components.

Combining with fluent properties

Calling componentTraverser first, you may combine topologicalSort with any of the familiar fluent properties like

graph.componentTraverser().withSubgraph(nodes = myNodeFilter).topologicalSort

Node-level sorting

Whenever starting at a concrete inner node, topologicalSort() will only include nodes within the comopnent spanned by the given node. In reverse, if you know your graph is connected you need not call TopologicalOrder at node level. It suffices to call it at graph level.

Note that node-level topologicalSort() also has a boolean argument allowing you to exclude predecessors from the result. Therefore you need to place the paranthesis:

root.topologicalSort()

Component-level sorting

Further, if you are interested in the topological order of the components of your graph, simply call

graph.componentTraverser().topologicalSortByComponent

that yiels a Traversable[CycleNodeOrTopologicalOrder] one for each component.

Exploiting TopologicalOrder

TopologicalOrder is more than just a list of nodes. Most interestingly, you can call toLayered to lift this result type into a layered representation of the topological order:

graph.topologicalSort.fold(
  cycleNode => yourCycleErrorHandler,
  _.toLayered foreach println // prints topo order layer by layer
)

The layers of a topological sort can roughly be defined as follows:

  1. layer 0 contains all nodes having no predecessors
  2. layer n contains those nodes that have only predecessors in ancestor layers with at least one of them contained in layer n - 1.

Among others, layers are usefull to

  1. compute several valid topological orders by altering the order of nodes within layers or
  2. to maintain a stable ordering over JVM instances.

Canceling Traversals

To cancel a traversal prefer calling one of scala.collection.Traversable's methods that finish the traversal as soon as the specified condition holds true such as exists or collectFirst. Otherwise you are free to invoke break() as provided by scala.util.control.Breaks at any point.

Building Walks and Paths

WalkBuilder and PathBuilder allow to build a Walk respectively Path in terms of the underlying graph. This proves usefull whenever you have some results computed outside the library and would like to represent them by Graph for Scala path-dependent types. Since WalkBuilder and PathBuilder enforce proper walk/path semantics they are also well suited for testing user-computed paths.

Let's look at how to create a path in the sample graph g by means of a PathBuilder:

val builder = g.newPathBuilder(n(1))
builder += n(3) += n(4)
builder.result          // Path(1, 1 ~> 3 % 5, 3, 3 ~ 4 % 1, 4)
   
builder.clear
builder += n(4) += n(3)
builder.result          // Path(1, 1~>3 %5, 3)
    
builder.clear
builder add n(4)        // false

As to

  1. Instantiates a PathBuilder with the start node 1.
  2. Adds the nodes 3 and 4 consecutively.
  3. Tries to add the nodes 4 and 3 consecutively...
  4. ...but adding 4 has been refused since it is not a direct successor of 1.
  5. It is recommended to check for whether additions have taken place as expected by calling add instead of +=.

Note that it is also possible to create walk and paths by adding edges and, for multi-graphs, to define an edge selector.