ERNI Technology Post No. 37 - The Enumeration Algorithm: Finding Solutions by Force

1. Introduction

In order to complete a project successfully, several different technologies, skills, processes and guidelines are required along with flexible, open-minded people able to give the right weight to each of these aspects. An algorithmic problem may arise at an advanced stage of the project's implementation schedule rather than at its beginning: it is a problem related neither to the chosen technology nor to the programming language used or established processes. The inability to address it effectively with a well-defined algorithm will, in the best case, lead to the project not being able to fulfil all major requirements.

Should the team and the customer together be able to define a context-specific algorithm to solve the problem within acceptable boundaries, the solution remains to be implemented, tested, evaluated as well as deployed, maintained and improved, according to the chosen process.

When, regardless of the reasons, such conditions are not met, the problem remains. Furthermore, the following two questions remain unanswered:

- What happens if the defined problem presents more than one solution?

- Is there a (systematic) way to retrieve all these solutions?

It is the scope of this article to define a strategy providing a reasonable approach to the solution of the above-mentioned points, discussing their applicability, limits, advantages and disadvantages.

2. The idea

It is, in fact, not possible to define a clear-cut algorithm to solve this specific problem. Hence the idea of finding every solution is difficult to imagine.

A more abstract approach might be to list all possible configurations of the configuration space and discard all possibilities which are not valid. All candidates that manage to pass the check are, by process of elimination, solutions. This is the definition of the enumeration strategy and the following conditions are required to apply it.

Applicability conditions for the enumeration strategy:

- A finite criterion can be defined to enumerate every possible configuration.

- It is possible to define a criterion as to whether a configuration is a solution or not.

These logical assumptions can be transformed into a software construct in a very general way, in the case of an object-oriented programming language, by using an abstract class as follows:

The two methods CanEnumerateNextConfiguration() and IsValid() are implementations of the applicability conditions for the specific problem defined above. The method SaveSolution() is trivial, technology and/or programming language-related and not worth discussing.

A concrete class implementing the abstract methods above represents a solution algorithm for the specific problem. It is to be pointed out that with this approach there is no logical difference between finding a solution to one problem and finding a solution to all of them.

To evaluate this approach, the following section provides an overview of a possible implementation of the class defined above for a well-known, complex problem.

3. Guided example

Problem definition

N queens (N > 3): Given an N x N-sized chessboard and N queens, find all the possible configurations having the following properties:

- All the queens are placed.

- The queens do not threaten each other.

(This is a generalisation of the problem of the 8 queens, in which only one solution is requested.)

It can be easily observed that the problem fulfils the necessary conditions for an approach with an enumeration algorithm:

1) The number of possible configurations is limited ((N^2)! / N!) and enumerable.

2) The definition of the solution is given and unmistakable.

Once it is shown that the enumeration algorithm is applicable to the problem, it is possible to proceed with the design of its implementation.

Solution design
Data structure

A simple data structure can be an N-sized integer array with values and index ranging from 1 to N, where array[y] = x indicates that a queen is located at position (x;y) on the board, as shown in the picture in the case of N = 4. (By convention the upper left corner of the chessboard is set as origin of the axis from now on, and N = 4 as a convenient size to be shown.)

Note that all configurations with more than one queen on one row and all configurations with fewer than N queens are automatically excluded in this way.

CanEnumerateNextConfiguration(): simple design
There are several ways of enumerating all possible combinations of N queens placed on an N x N chessboard so that only one single queen is placed on each row. So as to describe a possible enumeration, the two (invalid) chessboard patterns shown in the following pictures can be arbitrarily defined as initial and final configuration:

The enumeration starts by moving the queen on the last row one square forward at a time until the other end of the board is reached, as is shown in the following pictures:

Once the other end of the board is reached, the last row is reset and the queen on the preceding row is moved one step forward:

Four more iterations are computed as described above, until the queens on the last two rows will have reached the other end of the board:

The enumeration process is continued until the final configuration is reached. Afterwards, the method will return false to inform the outermost loop that the whole configuration space has been spanned.

IsValid(): pseudo-implementation
Using the data structure defined above, a pseudo-implementation of this method can be formulated as follows:

CheckColumn(), CheckPrimaryDiagonal() and CheckSecondaryDiagonal() return the information as to whether or not another queen is on the same column, primary diagonal or secondary diagonal.

In case all the above conditions are met for all queens, the actual configuration is a solution.

Instructions for implementation
Because the applicability conditions are general and minimal and the design is so simple, the implementation should also be simple and straightforward. To ensure this, the following hints can be formulated:

1) Since the two above-mentioned methods will be performed a very large number of times and as fast and efficiently as possible, the whole algorithm's infrastructure (variables, containers, etc.) must be created once in the implementing class. Memory allocation, de-allocation and garbage collection should be avoided at all costs to guarantee the highest possible performance with the exception of solutions which are eventually found by the algorithm. They can be stored inside: variables/object instantiated on the fly.

2) Access to data sources such as files, databases, data services is a performance killer and should not be relied upon. All data should be directly readable (non-encrypted) and in memory (this condition represents a restrictive constraint of a technical nature).

3) It should not be necessary to handle exceptions. Since the infrastructure of the algorithm is a composition of arrays, the enumeration engine can be expected to be error-free.

4) The implementation should be kept as simple as possible. Navigating a composition of arrays is not hard and does not require any programming language or framework-specific feature.

5) The use of a performance profiler is strongly suggested to find weaknesses in the implementation, to be able to react accordingly and to double-check that the previous points are observed.

4. Results

The algorithm can be implemented following the given guidelines and hints, and its performance can be measured in terms of time required to solve a specific instance of the problem.

The following image shows the results of the algorithm implemented using .NET/C# for simplicity reasons (for N = 8 and in case all the solutions are requested).

The following table summarises the results for different values of N.

This last table shows unmistakably the limits of this approach for the specific problem: the performance of the algorithm for N

5. Summary

At this point, it is possible to answer the questions asked at the very beginning of this article regarding the advantages and disadvantages of employing an enumeration algorithm.

Advantages:
- Wide applicability: Only two very general conditions must be fulfilled for the algorithm to be applicable.

- Ease of implementation: Only two methods need to be implemented.

- Technology independence: No constraints are set in terms of technology, framework or programming language.

Disadvantage:
- Rapidly growing complexity: Because of the computational complexity characterised by the complex generation of the configuration space, low performances are to be expected already from a relatively small problem size. The problem-specific implementation must be evaluated to assess whether the size is acceptable or not.

6. Conclusion

In this article, an all-purpose strategy to solve algorithmic problems has been proposed and illustrated with a non-trivial example: the problem of the N queens. This problem has been chosen not to demonstrate how good the algorithm performs (there are much better algorithms for the specific problem), but rather to show how remarkably easy it is and to illustrate its advantages and disadvantages.

An enumerative algorithmic approach can be pursued not only in situations where a better approach cannot be pursued, but also as a first, quick and low-effort solution since it is so easy to implement. Comparing the expected with the obtained results and performances is essential in any case.