TL;DR: This post shows you how to use a finite domain constraints solver in Scala.
I have been working on a music informatics task which required to model certain harmonic and melodic rules and then to generate and chord progressions obeying these rules. A useful approach for this task is to implement the rules as a constraints satisfaction problem (CSP). A CSP library allows you to write rules as logical declarations of relations between variables, and then you just invoke a solver that takes care of finding values for those variables.
choosing a csp library
Luckily there are multiple open source solver libraries for Java and Scala. I can at least think of Choco, JaCoP, Copris, and OscaR. There are probably more, and there are various extensions to provide nice Scala support for some of these libraries. A had tried Choco some years ago, and it seems a mature project. In the end I settled on JaCoP (at least for now), because I could easily fork the project on GitHub (OscaR is in the Bitbucket ghetto), and one of the authors has already provided a simple Scala layer.
Some of the syntactic choices for the JaCoP Scala layer were probably motivated by the MiniZinc syntax, but I found it less Scala’ish, and so I created my own variant, Poirot. In the following, I will show you how to formalise and solve a Soduko using JaCoP/Poirot. Combined with the power of Scala’s batteries-included collections, this gives some 15 lines of concise and easily readable code.
from zero to scala
If you don’t know Scala, it is a pretty amazing object-functional language that runs on the Java VM. We’ll boot our system using the sbt build tool which makes it very easy to declare some library dependencies and open a Read-Eval-Print-Loop (REPL). All you need to get started is an open shell (terminal) window. The following works on OS X and most likely Linux, perhaps even Windows.
Assuming you are in your home directory or some other suitable directory, first create and go into a new project directory, by executing the following two lines in the shell:
mkdir sudoku
cd sudoku
We are going to install sbt locally in this directory, using a nice shell script from Paul Phillips.
curl -s https://raw.github.com/paulp/sbt-extras/master/sbt > ./sbt && chmod 0755 ./sbt
The minimum setup we need is to declare a dependency on the Poirot library:
echo 'libraryDependencies += "de.sciss" %% "poirot" % "0.1.0"' >build.sbt
And we are ready to start the REPL:
./sbt console
When you run ./sbt
the first time, it will download and install the actual tool in that directory. It will also read built.sbt
and figure out that it should download the Scala library, the Scala compiler, and the Poirot library. So if this is the first time you run sbt, this may actually take a while until everything has been retrieved and prepared, so be patient.
Eventually you should see the Scala REPL welcome message:
Welcome to Scala version 2.10.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_33).
Type in expressions to have them evaluated.
Type :help for more information.
scala>
We can now execute Scala code by typing it on this new prompt and hitting return. Try 1 + 1 <return>
and you should see the result 2
.
fifteen lines
Now let us implement the Sudoku solver. I will use the example Sudoku from the Wikipedia Sudoku article. The easiest way to enter it is to use a multiline string which is openend an closed by triple-quotes """
. Like so:
The REPL automatically places the pipe |
characters on each new lines as we press enter, so you do not actually type these characters. The final .stripMargin
call makes sure that the indentation of each line including the pipe character is removed.
The most simple type of variables in a constraint system is that which has a finite integer domain, so the sought value of the variable is an integer number within a given range. In Poirot this type is IntVar
, and a new variable can be created using IntVar("name", min, max)
or just IntVar(min, max)
where min
and max
are integer values.
The first step in using Poirot is to import its types and functions:
The contents of Implicits
adds a few extra methods that we will use. Before we can do anything, we need to declare a model which is the place where the constraints we specify will be remembered:
Then the variables should be defined. The standard Sudoku is a board of 9 rows by 9 columns, and each cell can take a value between 1 and 9. So let’s define a 9 x 9 variable matrix. Scala’s best choice (probably) of an immutable random access data structure is collection.immutable.IndexedSeq
, and luckily Poirot contains the Vec
alias for this type. The matrix declaration becomes:
If you wanted to give meaningful names to the variables, you could have alternatively used this (do not execute now!):
To access a matrix cell, you can say rows(row)(col)
which gives you an individual IntVar
. To access one particular row, you just say row(col)
which gives a Vec[IntVar]
. To produce a column view, we can transpose the matrix:
In Suduko, the values of each row and of each column must be different. There is a handy method allDifferent
which posts that constraint to our model:
Further more, as you can see in the Wikipedia article, each 3 x 3 sub-square is also highlighted. There is an additional constraint that the 9 values in each sub-square must also be different. To impose that constraint, we first need two helper methods. First let’s define a function which gives the row and column indices of a sub-square:
For example, the first sub-square in the top left corner, squareIndices(0,0)
, has indices (0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2). The last sub-square in the bottom right corner, squareIndices(2,2)
, has indices (6,6), (6,7), (6,8), (7,6), (7,7), (7,8), (8,6), (8,7), (8,8). To map these indices to the actual variables, the following function is defined:
Now the sub-square constraints can be imposed:
solving the actual puzzle
In this moment, all the basic Sudoku rules have been incorporated into the model. If we were to run the solver now, it would produce all the billions of possible Sudokus, probably running out of memory before. What we want to do, is solve one particular game, which had been previously defined as board
.
First, we need a better representation of board
, ideally another matrix of numbers. The following clever Scala code creates a vector with the known numbers of the game, where unknown cells are indicated by negative numbers:
Reading this line from left to right, the board string is first split into an array of line strings, with tail
we drop the initial empty line. We make use of a Scala collection feature which allows us to view a string as a sequence of characters. The flatMap
bit goes through each line, pads it with spaces to make sure it is 9 characters long, and then flattens the lines into one string. The final map
call is a dirty bit, about which Paul Phillips will probably be ranting for an hour and a half. We can “subtract” characters ending up with integers, so the character '1'
becomes integer 1
, character '2'
becomes integer 2
, etc. Space characters become negative numbers.
The known
calculated above is a flat one dimensional vector of 81 elements (had we used map
instead of flatMap
, it would be two dimensional 9 x 9). We can create a corresponding flat view of our variable matrix:
To state that a particular variable x
must be equal to a particular integer value v
, we can say x #= v
. Poirot prefixes constraints producing operators with a hash character #
. For example, x #> y
means that x should be greater than v, x #!= v
means that x must be different from v, etc.
To impose the known cells of the game:
We “zip” the variables and board values together, iterate over them, and when a board value is known (greater than zero), we make the constraint statement.
Now we are ready to run the solver:
The inner function search
defines the search parameters. Its first argument is the list of variables to be resolved, the other two arguments guide the search procedure. Since we are only interested in one solution, we just use these standard types. If you were to look for multiple solutions in a large search space, you might for example use indomainRandom()
to explore a random subset of solutions.
The outer call, satisfy
, invokes the actual search process, until one solution is found. There is a variant, satisfyAll
which can be used to look for several solutions. satisfy
should return with success and print some information on the result. We can now retrieve the value which has been found for a variable using v.value()
. For example, to make a pretty print of the solved board:
You should see the same solution as in the Wikipedia article:
5 3 4 6 7 8 9 1 2
6 7 2 1 9 5 3 4 8
1 9 8 3 4 2 5 6 7
8 5 9 7 6 1 4 2 3
4 2 6 8 5 3 7 9 1
7 1 3 9 2 4 8 5 6
9 6 1 5 3 7 2 8 4
2 8 7 4 1 9 6 3 5
3 4 5 2 8 6 1 7 9
Voilà.