Monday, June 25, 2012

Movimentum - The solver algorithm

Searching for solutions has been an active branch of artificial intelligence for more than five decades. I am certainly not trying to re-invent the algorithms from all these years, and I will not use very advanced existing algorithms for equation solving.
Still, it's never wrong to have some background in "search." When I looked on the internet for information about search algorithms, I stumbled over the page and book "Artificial Intelligence - Foundations of Computational Agents." Especially chapters three and four are worth reading for background on constraint solving topics.
Most search algorithms fit the following pattern:
  • It is possible to describe a "partial solution" of the problem at hand; such a partial solution is often called a "node."
  • It is possible to refine a partial solution to a more refined solution; such a step is called "traversing an arc" or "expanding a node."
  • There is a set of nodes that are candidates for expansion; this set is usually called the "open set."
  • The solution can be found by
    • first placing the "non-solution" in the open set;
    • repeatedly replace a node from the open set with its expansion, i.e., with all its successor nodes.
The many proposed and used algorithms differ mainly in the way how they select the node to be expanded in the last step. Moreover, different data structures can be used to store the nodes, the open set, and various auxiliary data structures.

I do not yet want to decide on the actual search algorithm. So I will just create the few structures needed for the algorithms explained above; and use a simple, possibly inefficient search algorithm as a first step.

A node captures at least some partial knowledge in the form of original or modified ("rewritten") constraints. In addition, each node captures "knowledge" about variables—however, it is not yet clear, exactly which knowledge we are going to remember. We will certainly remember solved variables here, i.e., the knowledge that a variable has a single value. However, the inequality constraints AtLeastZeroConstraint and MoreThanZeroConstraint may make it useful to remember range knowledge about variables. Let us call that class VariableRangeRestriction and put off its precise design for a little bit of time.

Here is some code that captures the decisions up to now:

    public class SolverNode {
        private readonly ICollection<AbstractConstraint> _constraints;
        private readonly Dictionary<Variable, VariableRangeRestriction>
                                           _variableInRangeKnowledges;
    }

What does the general algoritm look like? Here is a simple framework: The main solver creates an open set and runs solver steps in a loop, until a solution node is found. As an addition, we pass in known solution values from the last frame—in our specific domain, the solution for a frame should be "near" the solution for the previous frame, so those known values might help us to find a solution more quickly:

    public static IDictionary<Variable, VariableRangeRestriction>
        Solve(SolverNode initialNode,
              int loopLimit,
              IDictionary<Variable, VariableRangeRestriction> previousValues,
              int frameNo) {
        // Create open set
        IEnumerable<SolverNode> open = new[] { initialNode };

        // Solver loop
        SolverNode solutionOrNull;
        do {
            if (!open.Any()) {
                throw new Exception("No solution found for frame " + frameNo);
            }
            open = SolverStep(open, previousValues, out solutionOrNull);
            if (--loopLimit < 0) {
                throw new Exception("Cannot find solution");
            }
        } while (solutionOrNull == null);
        return solutionOrNull.VariableInRangeKnowledges;
    }

A solver step finds a node from all the open nodes with a minimal rank, expands it, and computes the new open set. Moreover, it looks whether one of the new nodes is a solution nodes. In the final version, this should be done by checking that all anchor variables have a unique value. However, for this, we need a variable counting visitor, and ... well ... you didn't want to write visitors last time, did you? So we currently use a rough check that all constraints have vanished from the constraint set, and replace this with a better check later:

    public static IEnumerable<SolverNode> SolverStep(
            IEnumerable<SolverNode> open,
            IDictionary<Variable, VariableRangeRestriction> previousValues,
            out SolverNode solutionOrNull) {
        double minRank = open.Min(cs => cs.Rank);
        SolverNode selected = open.First(cs => cs.Rank <= minRank);
        IEnumerable<SolverNode> expandedSets = selected.Expand(previousValues);

        //IEnumerable<SolverNode> newOpen = open.Except(selected).Concat(expandedSets);
        IEnumerable<SolverNode> newOpen = expandedSets.Concat(open.Except(selected));

        // Maybe not really correct: We should check whether all 
        // anchor variables have a single value.
        // For the moment, in our tests, we live with this check.
        solutionOrNull = expandedSets.FirstOrDefault(cs => cs.IsSolved());

        return newOpen;
    }

Into this generic search algorithm, we must now plug in two functions:
  • Assignment of the rank.
  • Methods for expansion.
Regarding the rank, we run, at the moment, an "uninformed search," i.e., a quite primitve search algorithm. Therefore, we just use a constant as the rank, which leads to a
  • breadth first algorithm if we append the new nodes to the end of the open list;
  • depth first searc if we prepend the new nodes.
Above, I start with depth first search, in the hope that we find a solution quickly by "drilling deep."

How to do node expansion? We are in the domain of equations, and here, "rewriting" is the idea: Modify the constraints in such a way that their solution does not change, yet at the end we have very simple constraints that directly provide the solution. Here is a very simple example: Solve the two equations
0 = y + 2
0 = x + y
From the first equation, we can read off that y is –2 in the solution and drop the first constraint. Moreover, we can replace y with –2 in the second equation, so that we get the new constraint
0 = x + (–2)
From which we can read off that x must be 2.

More abstract, to expand a node, our solver has to look at all the constraints of the node and identify a useful rewriting. This identification is done by matching up all the constraints with patterns. If a pattern matches, a corresponding strategy yields a rewriting. For example, if the pattern is 0 = V + C, where V is a variable and C is a constant, a possible corresponding strategy is
  • record that –C is a solution for V;
  • create the rewriting V → –C.
Expansion itself is then simply the execution of the rewriting on all constraints:

    public IEnumerable<SolverNode> Expand(... previousValues) {
        var rewrite = FindBestRewrite(previousValues);
        IEnumerable<SolverNode> result = ExecuteRewrites(rewrite);
        return result;
    }

Here is an attempt to write hardwired code for the two steps to solve the example from above:

FindBestRewrite checks whether there is a constraint of the form 0 = V + C. If there is, it returns a specific rewrite V → –C:

    private VariableToConstantRewrite FindBestRewrite(...) {
        foreach (var c in Constraints) {
            var ce = c as EqualsZeroConstraint;
            if (ce != null) {
                var bin = ce.Expr as BinaryExpression;
                if (bin != null
                    && bin.Lhs is Variable
                    && bin.Op is Plus
                    && bin.Rhs is Constant) {
                    return new VariableToConstantRewrite((Variable)bin.Lhs,
                                                -((Constant)bin.Rhs).Value);
                }
            }
        }
        throw new NotSupportedException("I found no 0=V+E-Constraint");
    }

ExecuteRewrites runs over all constraints, searches V inside the constraint and ...

    private IEnumerable<SolverNode> ExecuteRewrites(VariableToConstantRewrite rewrite) {
        foreach (var c in Constraints) {
            Variable v = rewrite.Var;
            double value = rewrite.Value;
            // create a new constraint from c, where each v is replaced 
            // with new Constant(value); then simplify the new constraint
            // (e.g., replace 1 + 2 with 3).
        }
        throw new NotImplementedException("Not yet completed");
    }

... well, this hard-wired way does not work. It is much more work than we can afford for a simple test case. Using this code, it is not at all clear that a failing or even a succeeding test is caused by the Solve code or the SolverStep—it could as well be that hard-wired supporting code. Maybe there is another way to write this one-shot code, but I think it is more useful to invest one's time into "production code."

No comments:

Post a Comment