|
Date: 13 Feb 2009 12:54:58
From:
Subject: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
Hello,

In single-agent branch-and-bound search, only a single, global cutoff value is used for pruning. This got me thinking about alpha-beta in Chess (and other games). What if alpha and beta were made global (instead of stack-based)? The idea being that you (and your opponent) wouldn't explore a new path that is worse than the best path found so far. The benefit is that this might be a way to perform additional ("unsafe") pruning in a game-independent way -- in short, a simple way to do a more selective search.

I realize that this change gives up the guarantee that alpha-beta is equivalent (in terms of result, not computational expense) to straight minimax, but once again the idea here is to perform additional pruning in an intelligent way that allows deeper searches. I haven't tried it, but does this idea make sense?

Thanks,
-- Greg
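For concreteness, a minimal fail-hard negamax alpha-beta sketch follows (illustrative only, not code from this thread; Position, Move, generateMoves, make/unmake and evaluate are assumed placeholders, not taken from any real engine). The comments mark the spot the proposal would change.

    // Standard alpha-beta: the bounds are parameters, i.e. stack-based.
    // Every node gets its own copy, restored automatically on return.
    // The proposal above would replace them with globals that are
    // tightened as better lines are found, but never restored while
    // backtracking.
    int alphaBeta(Position &pos, int depth, int alpha, int beta) {
        if (depth == 0)
            return evaluate(pos);          // static evaluation at a leaf
        for (const Move &m : generateMoves(pos)) {
            pos.make(m);
            int score = -alphaBeta(pos, depth - 1, -beta, -alpha);
            pos.unmake(m);
            if (score >= beta)
                return beta;               // safe cutoff
            if (score > alpha)
                alpha = score;             // improves the bound, but only
        }                                  // for the rest of *this* node
        return alpha;
    }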
|
|
|
Date: 15 Feb 2009 14:28:47
From:
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
On Feb 15, 1:58 pm, Andy Walker <[email protected]> wrote:
> Unsafe pruning is likely to be a disaster.

Would you say that's also true for "selective deepening" as well? What I'm unclear on is whether you call it "unsafe pruning" or "selective deepening" (a more accepted term/technique?); in both cases you're intentionally "pruning" paths. The only difference I see is that in the latter, you might decide to search all positions to a fixed depth before pruning. Still, in both cases pruning based on some heuristic occurs.

I would also like to add that I'm considering a larger set of games than standard Chess. I'm also considering Chess variants and other games with bushy trees, where some form of pruning (or selective deepening, call it what you want) becomes necessary in order to search to "useful" depths. -- Greg
|
| |
Date: 15 Feb 2009 23:16:34
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
[email protected] wrote:
>> Unsafe pruning is likely to be a disaster.
> Would you say that's also true for "selective deepening" as well?

There's a difference [if only psychological] between "those lines look as though they're not going to be interesting so I'm going to discard them" and "those lines do look interesting, so I'm going to analyse them much deeper". ...

> What I'm unclear on is whether you call it "unsafe pruning" or
> "selective deepening" (a more accepted term/technique?); in both
> cases you're intentionally "pruning" paths. The only difference I
> see is that in the latter, you might decide to search all positions
> to a fixed depth before pruning. Still, in both cases pruning based
> on some heuristic occurs. ...

It's at least slightly unreasonable to call the "forward" version "pruning" at all! But in any case, it's the "unsafe" that's the problem. Any algorithm that discards part of the tree that may contain the best line is dangerous. For practical games we are constrained as to the number of nodes we have time to analyse, and we are subject to the vagaries of static evaluations -- after all, if these were correct, we wouldn't need to analyse past depth 1. But at least you should [surely?] strive to get a correct evaluation of whatever part of the tree you have time to analyse -- which means, in essence, that you must not prune anything that would not be pruned by alpha-beta. "Shaping" the tree you search is another matter.

> I would also like to add that I'm considering a larger set of games
> than standard Chess. I'm also considering Chess variants and other
> games with bushy trees, where some form of pruning (or selective
> deepening, call it what you want) becomes necessary in order to
> search to "useful" depths.

Understood. Games with significantly greater "width" than chess are a real pain for tree searching .... Some such games break down into disjunctive [or nearly disjunctive] sums [cf the opening phase in Go]; some [eg Boxes] have huge transposition possibilities; but these bring their own problems.

--
Andy Walker
Nottingham
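The "shaping" contrasted here with unsafe pruning can be illustrated. The sketch below is one possible reading, not code from this thread: givesCheck and isQuiet are assumed helpers, and the depth arithmetic is illustrative. Interesting lines get an extra ply, dull late moves get one ply less, but every move is still searched, so no part of the tree is discarded outright.

    // Shape the tree instead of pruning it: vary depth per move.
    int shapedSearch(Position &pos, int depth, int alpha, int beta) {
        if (depth <= 0)
            return evaluate(pos);
        int moveNo = 0;
        for (const Move &m : generateMoves(pos)) {
            int newDepth = depth - 1;
            if (givesCheck(pos, m))
                newDepth += 1;        // selective deepening (real engines
                                      // cap total extensions)
            else if (moveNo >= 4 && isQuiet(pos, m) && newDepth > 1)
                newDepth -= 1;        // searched shallower, not dropped
            pos.make(m);
            int score = -shapedSearch(pos, newDepth, -beta, -alpha);
            if (newDepth < depth - 1 && score > alpha)   // reduced move looks
                score = -shapedSearch(pos, depth - 1, -beta, -alpha); // good: verify
            pos.unmake(m);
            ++moveNo;
            if (score >= beta)
                return beta;
            if (score > alpha)
                alpha = score;
        }
        return alpha;
    }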
|
|
Date: 15 Feb 2009 07:21:49
From:
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
a) Good point about "minimal windowing". Perhaps that's a better way to do what I have in mind. My idea was inspired by branch-and-bound search, but now that I think about it further, B&B is not a heuristic search, so it's completely safe to prune based on a global cutoff value. Not so for A-B.

b) Right, you lose the ability to reconsider if you've already adjusted global bounds. Heuristics can go wrong.

d) Well, the idea is to do unsafe pruning anyway as a way of dealing with bushy search trees, so I'm willing to depart from the "standard model".

Thanks for your substantial reply. -- Greg
|
| |
Date: 15 Feb 2009 18:58:30
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
[email protected] wrote:
> b) Right, you lose the ability to reconsider if you've already
> adjusted global bounds. Heuristics can go wrong.

Yep. But the question is what to do when you find that out! In chess programming, it usually means starting on a new search with better bounds, so a heuristic that often goes wrong had better be a massive improvement when it goes right.

> d) Well, the idea is to do unsafe pruning anyway as a way of dealing
> with bushy search trees, so I'm willing to depart from the "standard
> model".

Unsafe pruning is likely to be a disaster. You don't usually get any measure of *how* unsafe the pruning has been. If you thereby drop your queen once per game, you will lose every game against a decent opponent no matter how much deeper you have been able to look. In general, it seems to be a lot better to stay safe, and instead to shape the search so that some things are searched more deeply and some things less.

But, as always, the proof of the pudding is in the eating. If you have an idea, try it! The alpha-beta part of a typical open-source engine is only a few lines of code, so it's easy to play with it and see whether your program plays better or worse.

--
Andy Walker
Nottingham
|
| |
Date: 15 Feb 2009 17:52:54
From: Peter Osterlund
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta pruning
|
[email protected] writes:
> a) Good point about "minimal windowing". Perhaps that's a better way
> to do what I have in mind. My idea was inspired by branch-and-bound
> search, but now that I think about it further, B&B is not a heuristic
> search, so it's completely safe to prune based on a global cutoff
> value. Not so for A-B.

You could say that alpha-beta is what you get if you take branch-and-bound and generalize it so as to be able to use it for searching a mini-max tree. There are similarities, such as:

* Guaranteed to find the same value as if searching the full tree.
* Cutting away a huge amount of work, but still typically leaving an exponential amount of work to be done.

Anyway, the branch-and-bound algorithm can be modified to compute approximate solutions, i.e. you can change the cut-off condition so that optimal solutions are allowed to be skipped if they are just a little better than the best solution found so far. This can significantly reduce the number of tree nodes that have to be searched, at the cost of losing the guarantee of finding an optimal solution.

It should be possible to generalize the "approximate B&B" algorithm to the alpha-beta case too, by modifying the alpha/beta windows appropriately. I can't tell if this would produce a useful algorithm for chess though. Some things to consider:

* Can the alpha-beta window be modified at all levels in the search tree without losing too much precision, or can you only do it at the root level?
* How big is the speed improvement?
* Will it make the chess program stronger? Increased speed will allow it to search deeper, but at the same time it will only be able to compute approximate mini-max values.

Suppose you set the approximation threshold to 0.1 pawns. What will happen? Will the program slowly lose 0.1 pawns per move until the position is lost, or will the increased speed make it find a decisive tactical shot that wins the game?

--
Peter Osterlund - [email protected]
http://web.telia.com/~u89404340
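On the B&B side of the analogy, the "approximate" relaxation described above is a one-line change. A sketch for a minimisation problem, with Node, lowerBound, cost, isLeaf and children as assumed helpers (not from this thread):

    // Exact branch-and-bound prunes when lowerBound(node) >= best.
    // The approximate version also prunes nodes that could improve on
    // the incumbent by less than epsilon; the final answer is then
    // guaranteed to be within epsilon of the true optimum.
    void branchAndBound(const Node &node, double epsilon, double &best) {
        if (lowerBound(node) >= best - epsilon)
            return;                            // the relaxed cut-off
        if (isLeaf(node)) {
            if (cost(node) < best)
                best = cost(node);
            return;
        }
        for (const Node &child : children(node))
            branchAndBound(child, epsilon, best);
    }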
|
| | |
Date: 15 Feb 2009 23:34:04
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
Peter Osterlund wrote:
> Suppose you set the approximation threshold to 0.1 pawns. What will
> happen? Will the program slowly lose 0.1 pawns per move until the
> position is lost, or will the increased speed make it find a decisive
> tactical shot that wins the game?

If you use the "wrong" alpha-beta levels, for whatever reason, then one of two things will happen: (a) by luck, judgement or otherwise, the "true" value of the position [as determined by the static evaluations] is within the chosen bounds, or (b) it is outside those bounds. In case (a), the search will go faster if the "wrong" bounds are "too narrow" [which is, after all, the basis for "aspiration search" and for "minimal windowing"] but slower if they are "too wide". In case (b), the *only* information you get, in general, is that the position is at least of value so-and-so [if better than beta] or at most of value so-and-so [if worse than alpha]. There's no "lose 0.1 pawns per move". The evaluation may return as 0.1 pawns below alpha, but the true value of the move that gave that evaluation may be a queen lower, and there may be no move that is better than 0.3 pawns lower. Even in time trouble, a computer can't afford to use evaluations below alpha.

More generally, it's the play-off between (a) and (b) that makes chess programming interesting. With care, you can get (a) most of the time, and consequently search much faster; but you have to be prepared for case (b) to occur, in which case there is usually little choice but to search again with wider bounds. If 90% of your searches go twice as fast, but 10% go twice as slow, you win ....

--
Andy Walker
Nottingham
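The (a)/(b) trade-off described here is what an aspiration-search driver manages. A rough sketch, reusing the alphaBeta routine sketched earlier in the thread; the starting window of 50 millipawns is purely illustrative:

    // Guess a narrow window around the previous iteration's score
    // (case (a): fast). A fail-hard search that lands on a bound has
    // hit case (b): we learn only "at least beta" or "at most alpha",
    // so we widen the window and pay for a re-search.
    int aspirate(Position &pos, int depth, int prevScore) {
        int delta = 50;                        // 0.05 pawns, in millipawns
        for (;;) {
            int alpha = prevScore - delta;
            int beta  = prevScore + delta;
            int score = alphaBeta(pos, depth, alpha, beta);
            if (score > alpha && score < beta)
                return score;                  // exact value inside window
            delta *= 2;                        // fail high/low: widen, retry
        }
    }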
|
| | | |
Date: 16 Feb 2009 00:04:23
From: Peter Osterlund
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta pruning
|
Andy Walker <[email protected]> writes:
> Peter Osterlund wrote:
>> Suppose you set the approximation threshold to 0.1 pawns. What will
>> happen? Will the program slowly lose 0.1 pawns per move until the
>> position is lost, or will the increased speed make it find a decisive
>> tactical shot that wins the game?
>
> If you use the "wrong" alpha-beta levels, for whatever reason,
> then one of two things will happen: (a) by luck, judgement or
> otherwise, the "true" value of the position [as determined by the
> static evaluations] is within the chosen bounds, or (b) it is outside
> those bounds. In case (a), the search will go faster if the "wrong"
> bounds are "too narrow" [which is, after all, the basis for
> "aspiration search" and for "minimal windowing"] but slower if they
> are "too wide". In case (b), the *only* information you get, in
> general, is that the position is at least of value so-and-so [if
> better than beta] or at most of value so-and-so [if worse than alpha].
> There's no "lose 0.1 pawns per move". The evaluation may return as
> 0.1 pawns below alpha, but the true value of the move that gave that
> evaluation may be a queen lower, and there may be no move that is
> better than 0.3 pawns lower. Even in time trouble, a computer can't
> afford to use evaluations below alpha.

I don't think you understood my idea, probably because I didn't explain it well enough.

Suppose you search the PV using standard alpha-beta, and suppose it returns a score of 1.00 up to the root. Using normal alpha-beta, the remaining moves for the root position would be searched with a window such that the search terminates as soon as the score is known to be <= 1.00. My idea is to terminate as soon as the score is known to be <= 1.10. You can obviously miss the best move in that case, but not if the best move is a lot better than the best move you found so far. After finishing such a search, you can be sure that the computed score is at most 0.10 lower than the mini-max score.

(It starts to get complicated though, if you do the same thing at other levels than the root level in the search tree.)

--
Peter Osterlund - [email protected]
http://web.telia.com/~u89404340
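In code, the root-level version of this idea might look like the sketch below (not from the thread; scores in millipawns, so a 0.10-pawn threshold is margin = 100, and Position, Move, rootMoves, make/unmake and alphaBeta are carried over from the earlier sketches). The PV move gets a full window; every other move only has to prove it beats best + margin.

    #include <vector>

    // A root move is dismissed as soon as it is known to be
    // <= best + margin; a move better than the PV by less than margin
    // can be missed, but the move returned is guaranteed to be within
    // margin of the true mini-max best.
    const int INF = 1000000;

    Move rootSearch(Position &pos, int depth, int margin /* e.g. 100 */) {
        std::vector<Move> moves = rootMoves(pos);
        Move bestMove = moves[0];
        pos.make(moves[0]);
        int best = -alphaBeta(pos, depth - 1, -INF, INF);   // PV: full window
        pos.unmake(moves[0]);
        for (std::size_t i = 1; i < moves.size(); ++i) {
            pos.make(moves[i]);
            // Window (best + margin, INF): failing low here means
            // "not more than margin better than the PV" -- dismissed.
            int score = -alphaBeta(pos, depth - 1, -INF, -(best + margin));
            pos.unmake(moves[i]);
            if (score > best + margin) {                    // clearly better
                best = score;
                bestMove = moves[i];
            }
        }
        return bestMove;
    }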
|
| | | | |
Date: 16 Feb 2009 19:20:17
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
Peter Osterlund wrote:
>> If you use the "wrong" alpha-beta levels, for whatever reason,
>> then one of two things will happen: [...]
> I don't think you understood my idea, probably because I didn't
> explain it well enough.

Ah, OK. I was on the wrong lines, then; however, luckily, most of what I wrote still seems to apply!

> Suppose you search the PV using standard alpha-beta, and suppose it
> returns a score of 1.00 up to the root. Using normal alpha-beta, the
> remaining moves for the root position would be searched with a window
> such that the search terminates as soon as the score is known to be <=
> 1.00. My idea is to terminate as soon as the score is known to be <=
> 1.10. You can obviously miss the best move in that case, but not if
> the best move is a lot better than the best move you found so far.

OK. In a "straight" alpha-beta search, this does indeed [I guess, though as always the proof lies in experimentation rather than intuition] save some nodes at the expense of accuracy. It's less clear that it saves nodes in some versions of minimal windowing; I'd need to think a little harder to convince myself!

> After finishing such a search, you can be sure that the computed score
> is at most 0.10 lower than the mini-max score.

OK. OTOH, if [eg in a tactical situation] your PV turns out to be a lot better than anything else, then you haven't gained anything, whereas if there are [in a positional situation] lots of moves with nearly the same value, then you have thrown away whatever positional acumen your program in fact possesses. On the third hand, I've known positions where the computer seems to be "agonising" between two lines that differ by 0.01, and you want to shake it and say "look, it really, really doesn't matter whether you play A or B, just plump for one", and in those cases it could be a big win.

> (It starts to get complicated though, if you do the same thing at
> other levels than the root level in the search tree.)

In general, it's *harder* to do things only at the root level than everywhere -- it means you need special code for that level. If I now understand you correctly, it just means replacing the code that assigns the score to alpha by code that assigns (score + 100) to alpha [if we're measuring in millipawns]. Surely someone has already tried this? If not, go ahead ...!

--
Andy Walker
Nottingham
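Against the alphaBeta sketch given earlier in the thread, the change described here would be essentially the following (an illustration, with delta = 100 millipawns = 0.1 pawns; bestScore is an extra variable, assumed here so that the routine still returns a real score rather than the inflated bound):

    // Inside the move loop of the earlier alphaBeta sketch:
    if (score > alpha) {
        bestScore = score;       // remember the genuine value to return
        alpha = score + 100;     // inflated bound: a sibling must now beat
    }                            // the best move so far by > 0.1 pawns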
|
| | | | | |
Date: 16 Feb 2009 22:21:35
From: Peter Osterlund
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta pruning
|
Andy Walker <[email protected]> writes:
> Peter Osterlund wrote:
>
>> Suppose you search the PV using standard alpha-beta, and suppose it
>> returns a score of 1.00 up to the root. Using normal alpha-beta, the
>> remaining moves for the root position would be searched with a window
>> such that the search terminates as soon as the score is known to be <=
>> 1.00. My idea is to terminate as soon as the score is known to be <=
>> 1.10. You can obviously miss the best move in that case, but not if
>> the best move is a lot better than the best move you found so far.
>
> OK. In a "straight" alpha-beta search, this does indeed [I
> guess, though as always the proof lies in experimentation rather than
> intuition] save some nodes at the expense of accuracy. It's less
> clear that it saves nodes in some versions of minimal windowing; I'd
> need to think a little harder to convince myself!

I agree. Tests are needed to find out if there will be any noticeable speedup. I know that the corresponding idea can give a very significant speedup when solving combinatorial optimization problems using branch & bound, but that doesn't automatically mean that the idea will be good for alpha-beta too.

>> After finishing such a search, you can be sure that the computed score
>> is at most 0.10 lower than the mini-max score.
>
> OK. OTOH, if [eg in a tactical situation] your PV turns out to
> be a lot better than anything else, then you haven't gained anything,
> whereas if there are [in a positional situation] lots of moves with
> nearly the same value, then you have thrown away whatever positional
> acumen your program in fact possesses. On the third hand, I've known
> positions where the computer seems to be "agonising" between two lines
> that differ by 0.01, and you want to shake it and say "look, it really,
> really doesn't matter whether you play A or B, just plump for one", and
> in those cases it could be a big win.

A case I had in mind was this: Suppose moves A, B, C, D have approximately equal scores at a given search depth, but at the next higher search depth, move D reveals a new tactical possibility and therefore has a significantly higher score. In that case, if your search order is A, B, C, D, you want to quickly dismiss B and C as "not much better than A", so that you have a bigger chance of finding the tactical opportunity in move D before the time runs out.

>> (It starts to get complicated though, if you do the same thing at
>> other levels than the root level in the search tree.)
>
> In general, it's *harder* to do things only at the root level
> than everywhere -- it means you need special code for that level.

Sorry, I was unclear again. I meant that analyzing the approximation error is complicated, not that it is complicated to implement the idea.

One can quite easily conclude that if you do this approximation (by amount "delta") at K different levels, the approximation error at the root is no more than K * delta. However, that quickly becomes too large to be practical if you intend to do it at all levels in the search tree. I was thinking that maybe under some circumstances it would be possible to prove a better bound for the error than "K * delta". However, proving such a bound appeared to be complicated.

> If I now understand you correctly, it just means replacing the code
> that assigns the score to alpha by code that assigns (score + 100)
> to alpha [if we're measuring in millipawns].

Yes, that's correct.

> Surely someone has already tried this? If not, go ahead ...!

Maybe I will. However, I was initially suggesting this as a way to make the original poster's idea work. His idea was, in a broad sense, "take something good from branch and bound and apply it to alpha-beta".

--
Peter Osterlund - [email protected]
http://web.telia.com/~u89404340
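The K * delta bound can be written out explicitly. A gloss on the argument (not from the thread), in LaTeX: each relaxed level can overlook a move that is better by at most delta, and in the worst case the errors of the subtrees add on top, giving the recurrence

    % e_k = worst-case error after k relaxed levels
    e_0 = 0, \qquad e_k \le \delta + e_{k-1}
    \quad\Longrightarrow\quad e_K \le K\,\delta .

So with delta = 0.1 pawns applied at, say, K = 8 plies, the root score could already be 0.8 pawns below the true mini-max value -- which is why applying it at all levels quickly becomes impractical.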
|
| | | | | | |
Date: 18 Feb 2009 00:10:38
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
Peter Osterlund wrote:
> One can quite easily conclude that if you do this approximation (by
> amount "delta") at K different levels, the approximation error at the
> root is no more than K * delta. However, that quickly becomes too
> large to be practical if you intend to do it at all levels in the
> search tree.

Actually, half the approximations will be for the other player, so there is a limit of ceiling(K/2)*delta, and even that happens only if there is a sequence of second or later moves being best by no more than delta. So I think the practical performance would be a lot better than the theoretical limit. However, there seem to be problems in the case where alpha+delta >= beta [eg, minimal windowing] -- it needs a bit of thought to determine what the best "mend" is, but at least once the alpha and beta levels have become that close, there will be no further approximation errors.

>> Surely someone has already tried this? If not, go ahead ...!
> Maybe I will. However, I was initially suggesting this as a way to
> make the original poster's idea work. His idea was, in a broad sense,
> "take something good from branch and bound and apply it to alpha-beta".

OK. Just possibly worth noting that ordinary B&B relates to alpha-beta in much the same way that helpmates relate to ordinary mates, so there might also be some spin-offs for "fairy" chess.

--
Andy Walker
Nottingham
|
|
Date: 14 Feb 2009 23:57:08
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
[email protected] wrote:
[...]
> This got me thinking about alpha-beta in Chess (and other games).
> What if alpha and beta were made global (instead of stack-based)? The
> idea being that you (and your opponent) wouldn't explore a new path
> that is worse than the best path found so far. The benefit being that
> this might be a way to perform additional ("unsafe") pruning in a
> game-independent way. [...]

You sort-of can, sort-of can't, and it sort-of probably doesn't help [at least if I've understood your notion correctly].

(a) It's easy to modify alpha-beta so as to determine efficiently whether the current position is better or worse than some fixed value. In such a case, alpha and beta never change [as any proposed change would correspond to a beta cutoff], so they may as well be global. But you don't get the current value, nor a best move. This is pretty much "minimal windowing", and it needs to be combined with something else to become part of a proper search -- after which it's a Good Idea.

(b) Otherwise, you need to be able to "back out" any changes ["Oh, this looks like a good move -- yes, better and better -- oops, I've now seen the refutation, back to my previous idea"], so you either can't change alpha/beta or else you really need to stack them. If you never change them, then you don't get the benefit from improving alpha, so your search is less efficient than standard alpha-beta for no improvement elsewhere in the algorithm.

(c) You have a recursive procedure anyway, and the hard work is not done when going down or back up the tree [which is where alpha/beta get stacked and changed], but in the zillions of static evaluations at the leaves. Almost every intuitive idea about chess programming turns out to be wrong; despite this, my intuition says that even if you could get your idea to work efficiently, the improvement would not be measurable.

(d) If you get a result different from straight minimax, then it isn't merely different, it's wrong -- at least within the "standard model" of chess programming [by which you're trying to find the best move according to the static evaluations at whichever leaves your search goes down to]. This is probably Bad News.

(e) Nevertheless, if you have an Idea, chess programming is not like counting angels on pin-heads, it's an absolutely practical problem. Take one of the open-source programs, change the dynamic search to include the Idea, run a few hundred games "new vs old", and see which wins. If your Idea loses, quietly forget it. If it wins, quietly win a tournament or two using it, then shout it from the rooftops and become famous. If it draws, then write a paper explaining how your Idea needs development but shows promise, and how a Large Grant would help with this.

(f) If I've misunderstood your proposal, then (a,b,c) may not apply; sorry. But (d,e) always apply!

--
Andy Walker
Nottingham
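Point (a) in code, for concreteness: a sketch assuming the alphaBeta routine from earlier in the thread and integer scores.

    // A "minimal window" probe: answers "is this position worth more
    // than v?" without computing an exact value or a best move. With
    // a window of width one, any improvement of alpha would at once
    // be a beta cutoff, so alpha and beta never change inside the
    // search -- which is exactly why they "may as well be global".
    bool betterThan(Position &pos, int depth, int v) {
        return alphaBeta(pos, depth, v, v + 1) > v;
    }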
|
|