Thursday 23 February 2023

11 - Queries That Don't Finish



A milestone on the journey of learning prolog is writing queries that don't complete. In some cases, this is the correct behaviour, but in others, it is undesirable. 

We'll take a first look at this phenomenon using the reversed/2 property we developed earlier.


% Example 11 - Bidirectional Termination

% recursive definition
step([], L2, L2). 
step([H1|T1], X, L2) :- step(T1, X, [H1|L2]).

% convenience property around step/3
reversed(X, Y) :- step(X, Y, []). 

% updated to terminate when first list is a variable
reversed2(X, Y) :- same_length(X,Y), step(X, Y, []).

The reversed(X,Y) property is true if the list Y is the list X but in reverse order. We previously asked “which Y is the reverse of the list [1,2,3]?”


?-  reversed([1, 2, 3], Y).

Y = [3, 2, 1]

Prolog tells us Y=[3,2,1] is a valid answer. In fact, prolog tells us it is the only answer because it doesn't ask to continue searching for more possible answers. The query terminates after the solution is found.


Termination

On several occasions we've pointedly commented that a prolog definition was written well enough for a question to be asked in both directions. For example, our earlier definition of ancestor/2 allows us to ask ancestor(X, sally), as well as ancestor(sally, Y). That is, “who are the ancestors of Sally?”, as well as “who is Sally an ancestor of?” 


?- ancestor(X, sally).

X = martha
X = deirdre
false

?- ancestor(sally, X).

X = jane
false

Prolog reports false when there are no more answers to be found. In both cases, the query terminates after all the answers have been found. 


Non-Termination

Back to our reversed/2 property. Let's query in the other direction by asking “which X, when reversed, is the list [1,2,3]?”


?-  reversed(X,[1, 2, 3]).

X = [3, 2, 1]

...  non-termination ...
** Execution aborted **

Prolog correctly finds X=[3,2,1] as a valid answer. It then asks to find more answers. Given the go-ahead, prolog ends up searching along a never-ending path, which will crash the entire program if left long enough. We have to manually intervene and cancel the query.

This is our first example of non-termination. Non-termination can be problematic because we can't determine whether or not there are more solutions yet to be found. In this particularly simple example, we know there can't be any other lists whose reverse is [1,2,3], but in more sophisticated scenarios we can't be so certain.


Non-Termination Isn't Always Wrong

In some cases, non-termination is the right behaviour. For example, we might define a property of lists such that every item in a list is cat. That definition is satisfied by an infinite set of lists, [cat], [cat, cat], [cat, cat, cat], and so on. It would be logically incorrect for a query asking “which lists satisfy this definition” to terminate.

In other cases, non-termination only happens when queries are made in one direction, as in our example reversed(X,[1,2,3]).

This kind of non-termination doesn't always have to be fixed. It's not unusual to find, even official, prolog properties intended to be used only in one direction. In such cases, the documentation makes clear which parameters are expected to be grounded and which can be left as variables. Because reversed(X,Y) is symmetric in X and Y, it is quite reasonable, and no real inconvenience, to require the first parameter be grounded. 


Fixing Non-Termination

If we do decide to fix non-termination, it is better to apply an additional logical constraint, than try to fix it with a procedural hack. This way, the property retains its logical meaning. This isn't always easy, especially when the non-termination is a direct result of prolog's procedural nature. 

Let's work out why the query reversed(X,[1,2,3]) fails to terminate. We need to remind ourselves of how reversed/2 works. 

Initially, nothing is known about X, and the accumulator starts as an empty list in step(X,Y,[]). With each application of the continuation rule, the accumulator is extended from its head. This extends the length of possible solution for X. When the length of the accumulator reaches the same length as the provided list [1,2,3], the termination rule lets prolog report X=[3,2,1] as an answer. But prolog doesn't stop there. It backtracks and keeps applying the continuation rule to extend the accumulator, in effect, trying to find solutions that are longer in length than [1,2,3]

We can prevent this undesirable backtracking by adding a logical constraint which asserts that a list and its reverse have the same length. 


% updated to terminate when first list is a variable
reversed2(X, Y) :- same_length(X,Y), step(X, Y, []).

This new reversed2/2 property uses same_length/2 provided by swi-prolog to assert that X and Y have the same length. Note that it needs to be placed before the recursive step/3 to ensure that X is created with a fixed length, albeit with currently unknown values.


?- reversed2(X,[1, 2, 3]).

X = [3, 2, 1]

This time the query returns the single correct answer and terminates immediately.

It may seem obvious that X and Y are of the same length, but it wasn't obvious that it needed asserting. Lots of thinking to make a small change is the norm when working in declarative and logic languages like prolog.


samelength/2

If same_length/2 isn't provided by your prolog, you can easily create your own, as follows.


% samelength/2 if same_length/2 isn't provided
samelength([], []). 
samelength([_H1|T1], [_H2|T2]) :- samelength(T1, T2).

It's a good idea to check you can read and understand this particularly simple recursive definition.


Testing For Termination

Imagine we want to test whether a query, let's call it q, terminates. 

We could run the query and manually prompt prolog to keep searching for more solutions until it tells us there are no more to be found. Ten prompts would be inconvenient. A hundred prompts would be impractical. Some queries might require millions of prompts.

Is there a way to exhaust the search for all possible solutions without manual prompting? Have a look at the following.


?- q, fail.

The fail is a goal that always fails. We could have used something like 1=2, which also always fails, but fail is easier to read and understand. 

Prolog works left to right, so it will first try to find a solution that satisfies the goal q. Having done this, it will encounter fail, which causes the query to fail, as if the solution found for the first part q did not satisfy the second part. This causes prolog to backtrack and try to find another solution for q. That second solution, if there is one, will be seen as unsatisfactory too because prolog will encounter fail again. This will repeat for all possible solutions to q. The test q,fail will terminate once all the possible solutions for q have been found.

So, if the test q,fail terminates, we know the query q also terminates. It is possible that q takes a long time to exhaust, and so the test appears not to terminate. This is why we can't say q doesn't terminate if the test is not observed to terminate.

Let's test if reversed2(X,[1,2,3]) terminates.


?- reversed2(X,[1,2,3]), fail.

false

The test terminates, confirming reversed2(X,[1,2,3]) does terminate. Prolog's reply false is not wrong, it is what we expect from a test that fails every solution to the goal being tested.


Key Points

  • Non-termination happens when prolog follows an unending search path trying to resolve a query.
  • Not all non-termination is wrong. Some queries should, logically, have an infinite number of answers,
  • Some non-termination depends on which query parameters are ungrounded variables. It is not uncommon for prolog properties to require specific parameters to be grounded.
  • Wherever possible, fixing undesirable non-termination should be done by adding logical constraints to a definition.
  • We can test whether a query q does terminate. If q,fail terminates, then we can say q terminates. 
  • If the test q,fail doesn't appear to terminate, we can't conclude definitively that q doesn't terminate.