Thorough testing is inevitable to assure high quality when implementing your own
algorithms on top of Graph for Scala.
The package scalax.collection.generator
, part of the module graph-core
,
helps you to achieve this goal by providing facilities for user-friendly random graph generation.
Besides accepting a node generator, Graph for Scala Test allows you to control the metrics of random graphs including their order, density and density distribution. Further, you control which edge types should connect nodes including custom weight and label generators.
The companion objects also provide some predefined generators such as smallConnectedIntDi.
Think of RandomGraph
as the native implementation to generate
any kind of Graph
with random nodes and edges satisfying user-defined
graph metrics.
The simplest way to obtain a random graph is to use a predefined generator like
import scalax.collection.generator._ val predefined = RandomGraph.tinyConnectedIntDi(Graph) val tinyGraph = predefined.draw // Graph[Int,DiEdge]
You are certainly not stick with the metrics applied to predefined generators but you can adjust them individually:
object sparse_1000_Int extends RandomGraph.IntFactory { val order = 1000 val nodeDegrees = NodeDegreeRange(1,10) override def connected = false } val randomSparse = RandomGraph[Int,UnDiEdge,Graph]( Graph, sparse_1000_Int, Set(UnDiEdge)) val sparseGraph = randomSparse.draw // Graph[Int,UnDiEdge]
As to
Int
nodes.
1000
.
1
and 10
.
RandomGraph
generator is instantiated to generate
graphs with the above metrics and the edge type UnDiEdge
.
As you have noticed the previous generator was restricted to the
node type of Int
. But to cover your use case you usually
need to obtain random graphs with a specific node type. Let's define a
class Person(name: String, yearOfBirth: Int)
for this purpose.
Also, let's create PersonData
containing meaningful data
to Person
s:
object PersonData { val firstNames = Set( "Alen", "Alice", "Bob", "Jack", "Jim", "Joe", "Kate", "Leo", "Tim", "Tom").to[Vector] val firstNamesSize = firstNames.size val surnames = Set( "Bell", "Brown", "Clark", "Cox", "King", "Lee", "Moore", "Ross", "Smith", "Wong").to[Vector] val surnamesSize = surnames.size def order = firstNamesSize * surnamesSize / 10 def degrees = new NodeDegreeRange(2, order - 2) val maxYearOfBirth = 2010 }
import scalax.collection.edge.LDiEdge case class Person(name: String, yearOfBirth: Int) object Person { import PersonData._ private val r = new scala.util.Random def drawFirstName: String = firstNames(r.nextInt(firstNamesSize)) def drawSurame: String = surnames(r.nextInt(surnamesSize)) def drawName: String = s"$drawFirstName, $drawSurame" def drawYearOfBirth = maxYearOfBirth - r.nextInt(100) } val randomMixedGraph = RandomGraph[Person,UnDiEdge,Graph]( Graph, new RandomGraph.Metrics[Person] { val order = PersonData.order val nodeDegrees = PersonData.degrees def nodeGen: Person = Person(Person.drawName, Person.drawYearOfBirth) }, Set(UnDiEdge, LDiEdge) ) val mixedGraph = randomMixedGraph.draw
mixedGraph
will now have undirected and labeled directed
edges looking like
Graph( Person(Alice, Smith,1967), Person(Kate, Ross,1921), Person(Leo, Bell,2008), Person(Leo, Smith,1983), ..., Person(Alice, Smith,1967)~>Person(Kate, Ross,1921) 'C, Person(Leo, Bell,2008)~Person(Leo, Smith,1983), ... )
You can even supply your own label and weight generators.
You may wonder why we divided the number of all possible combinations of
first names and surnames by 10. The reason is that RandomGraph
must be able to draw distinct nodes. Ok, they become most probably distinct
due to the presence of yearOfBirth
. But had we just
Person(name: String)
and
def order = firstNamesSize * surnamesSize
, we'd get
more and more duplicate names when drawing them.
Therefore RandomGraph
is implemented to cope with duplicates
up to a certain limit. Above that limit the generator assumes that
the supplied node generator behaves ill and rejects to generate the graph
by throwing an exception. Otherwise it would take the risk of an infinite loop.
In conclusion, always plan your node generator with respect to the graph order carefully. By rule of thumb, allow your node generator to have about ten-fold capacity over the graph order.
Node degrees are another important design aspect. Generating graphs with a high density may fail.
GraphGen
, a thin wrapper around RandomGraph
, is
provided for out-of-the-box ScalaCheck
integration.
Below you'll find the ScalaCheck
equivalent of the
previous examples. We assume some knowledge of ScalaCheck
.
Arbitrary
instances for graphs with predefined metricstype IntDiGraph = Graph[Int,DiEdge] implicit val arbitraryTinyGraph = GraphGen.tinyConnectedIntDi[Graph](Graph) val properTiny = forAll(arbitrary[IntDiGraph]) { g: IntDiGraph => g.order == GraphGen.TinyInt.order } properTiny.check
Arbitrary
instancesobject Sparse_1000_Int extends GraphGen.Metrics[Int] { val order = 1000 val nodeDegrees = NodeDegreeRange(1,10) def nodeGen: Gen[Int] = Gen.choose(0, 10 * order) override def connected = false } type IntUnDiGraph = Graph[Int,UnDiEdge] implicit val arbitrarySparseGraph = Arbitrary { GraphGen[Int,UnDiEdge,Graph](Graph, Sparse_1000_Int, Set(UnDiEdge)).apply } val properSparse = forAll(arbitrary[IntUnDiGraph]) { g: IntUnDiGraph => g.order == Sparse_1000_Int.order } properSparse.check
Arbitrary
instances to generate individual graph typesimport scalax.collection.edge.LDiEdge case class Person(name: String, yearOfBirth: Int) object Person { import PersonData._ def firstNameGen : Gen[String] = Gen.oneOf(firstNames) def surameGen: Gen[String] = Gen.oneOf(surnames) def nameGen: Gen[String] = Gen.resultOf( (firstName: String, surname: String) => s"$firstName, $surname")( Arbitrary(firstNameGen), Arbitrary(surameGen)) def yearOfBirthGen: Gen[Int] = Gen.choose(maxYearOfBirth - 100, maxYearOfBirth) } object MixedMetrics extends GraphGen.Metrics[Person] { val order = PersonData.order val nodeDegrees = PersonData.degrees def nodeGen: Gen[Person] = Gen.resultOf( (name: String, year: Int) => Person(name, year))( Arbitrary(Person.nameGen), Arbitrary(Person.yearOfBirthGen)) } type Mixed = Graph[Person,UnDiEdge] implicit val arbitraryMixedGraph = Arbitrary { GraphGen[Person,UnDiEdge,Graph]( Graph, MixedMetrics, Set(UnDiEdge, LDiEdge)).apply } val properMixedGraph = forAll(arbitrary[Mixed]) { g: Mixed => g.order == MixedMetrics.order } properMixedGraph.check
As you know, by default ScalaCheck
executes 100 tests
before it takes your property proved. You may desire to limit the number
of tests because, for instance, you are dealing with bigger graphs.
Here is an example for how you can achieve this in the context of
ScalaTest
:
import org.scalatest.{Matchers, Spec} import org.scalatest.prop.PropertyChecks class TGraphGenTest extends Spec with Matchers with PropertyChecks { implicit val config = PropertyCheckConfig(minSuccessful = 5, maxDiscarded = 5) object `generated Tiny graph` { implicit val arbitraryTinyGraph = GraphGen.tinyConnectedIntDi[Graph](Graph) def `should conform to tiny metrics` { forAll(arbitrary[IntDiGraph]) { g: IntDiGraph => g.order should equal (GraphGen.TinyInt.order) } } } }