|
Date: 13 Feb 2009 12:54:58
From:
Subject: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
Hello,

In single-agent branch-and-bound search, only a single, global cutoff value is used for pruning. This got me thinking about alpha-beta in Chess (and other games). What if alpha and beta were made global (instead of stack-based)? The idea being that you (and your opponent) wouldn't explore a new path that is worse than the best path found so far. The benefit is that this might be a way to perform additional ("unsafe") pruning in a game-independent way -- in short, a simple way to do a more selective search.

I realize that this change gives up the guarantee that alpha-beta is equivalent (in terms of result, not computational expense) to straight minimax, but once again the idea here is to perform additional pruning in an intelligent way that allows deeper searches. I haven't tried it, but does this idea make sense?

Thanks,
-- Greg
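For concreteness, a minimal fail-hard negamax alpha-beta sketch follows (illustrative only, not code from this thread; Position, Move, generateMoves, make/unmake and evaluate are assumed placeholders, not taken from any real engine). The comments mark the spot the proposal would change.

    // Standard alpha-beta: the bounds are parameters, i.e. stack-based.
    // Every node gets its own copy, restored automatically on return.
    // The proposal above would replace them with globals that are
    // tightened as better lines are found, but never restored while
    // backtracking.
    int alphaBeta(Position &pos, int depth, int alpha, int beta) {
        if (depth == 0)
            return evaluate(pos);          // static evaluation at a leaf
        for (const Move &m : generateMoves(pos)) {
            pos.make(m);
            int score = -alphaBeta(pos, depth - 1, -beta, -alpha);
            pos.unmake(m);
            if (score >= beta)
                return beta;               // safe cutoff
            if (score > alpha)
                alpha = score;             // improves the bound, but only
        }                                  // for the rest of *this* node
        return alpha;
    }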
|
|
|
Date: 15 Feb 2009 14:28:47
From:
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
On Feb 15, 1:58 pm, Andy Walker <[email protected]> wrote:
> Unsafe pruning is likely to be a disaster.

Would you say that's also true for "selective deepening" as well? What I'm unclear on is whether you call it "unsafe pruning" or "selective deepening" (a more accepted term/technique?); in both cases you're intentionally "pruning" paths. The only difference I see is that in the latter, you might decide to search all positions to a fixed depth before pruning. Still, in both cases pruning based on some heuristic occurs.

I would also like to add that I'm considering a larger set of games than standard Chess. I'm also considering Chess variants and other games with bushy trees, where some form of pruning (or selective deepening, call it what you want) becomes necessary in order to search to "useful" depths. -- Greg
|
| |
Date: 15 Feb 2009 23:16:34
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
[email protected] wrote:
>> Unsafe pruning is likely to be a disaster.
> Would you say that's also true for "selective deepening" as well?

There's a difference [if only psychological] between "those lines look as though they're not going to be interesting so I'm going to discard them" and "those lines do look interesting, so I'm going to analyse them much deeper". ...

> What I'm unclear on is whether you call it "unsafe pruning" or
> "selective deepening" (a more accepted term/technique?); in both
> cases you're intentionally "pruning" paths. The only difference I
> see is that in the latter, you might decide to search all positions
> to a fixed depth before pruning. Still, in both cases pruning based
> on some heuristic occurs. ...

It's at least slightly unreasonable to call the "forward" version "pruning" at all! But in any case, it's the "unsafe" that's the problem. Any algorithm that discards part of the tree that may contain the best line is dangerous. For practical games we are constrained as to the number of nodes we have time to analyse, and we are subject to the vagaries of static evaluations -- after all, if these were correct, we wouldn't need to analyse past depth 1. But at least you should [surely?] strive to get a correct evaluation of whatever part of the tree you have time to analyse -- which means, in essence, that you must not prune anything that would not be pruned by alpha-beta. "Shaping" the tree you search is another matter.

> I would also like to add that I'm considering a larger set of games
> than standard Chess. I'm also considering Chess variants and other
> games with bushy trees, where some form of pruning (or selective
> deepening, call it what you want) becomes necessary in order to
> search to "useful" depths.

Understood. Games with significantly greater "width" than chess are a real pain for tree searching .... Some such games break down into disjunctive [or nearly disjunctive] sums [cf the opening phase in Go]; some [eg Boxes] have huge transposition possibilities; but these bring their own problems.

--
Andy Walker
Nottingham
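The "shaping" contrasted here with unsafe pruning can be illustrated. The sketch below is one possible reading, not code from this thread: givesCheck and isQuiet are assumed helpers, and the depth arithmetic is illustrative. Interesting lines get an extra ply, dull late moves get one ply less, but every move is still searched, so no part of the tree is discarded outright.

    // Shape the tree instead of pruning it: vary depth per move.
    int shapedSearch(Position &pos, int depth, int alpha, int beta) {
        if (depth <= 0)
            return evaluate(pos);
        int moveNo = 0;
        for (const Move &m : generateMoves(pos)) {
            int newDepth = depth - 1;
            if (givesCheck(pos, m))
                newDepth += 1;        // selective deepening (real engines
                                      // cap total extensions)
            else if (moveNo >= 4 && isQuiet(pos, m) && newDepth > 1)
                newDepth -= 1;        // searched shallower, not dropped
            pos.make(m);
            int score = -shapedSearch(pos, newDepth, -beta, -alpha);
            if (newDepth < depth - 1 && score > alpha)   // reduced move looks
                score = -shapedSearch(pos, depth - 1, -beta, -alpha); // good: verify
            pos.unmake(m);
            ++moveNo;
            if (score >= beta)
                return beta;
            if (score > alpha)
                alpha = score;
        }
        return alpha;
    }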
|
|
Date: 15 Feb 2009 07:21:49
From:
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
a) Good point about "minimal windowing". Perhaps that's a better way to do what I have in mind. My idea was inspired by branch-and-bound search, but now that I think about it further, B&B is not a heuristic search, so it's completely safe to prune based on a global cutoff value. Not so for A-B.

b) Right, you lose the ability to reconsider if you've already adjusted global bounds. Heuristics can go wrong.

d) Well, the idea is to do unsafe pruning anyway as a way of dealing with bushy search trees, so I'm willing to depart from the "standard model".

Thanks for your substantial reply. -- Greg
|
| |
Date: 15 Feb 2009 18:58:30
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
[email protected] wrote:
> b) Right, you lose the ability to reconsider if you've already
> adjusted global bounds. Heuristics can go wrong.

Yep. But the question is what to do when you find that out! In chess programming, it usually means starting on a new search with better bounds, so a heuristic that often goes wrong had better be a massive improvement when it goes right.

> d) Well, the idea is to do unsafe pruning anyway as a way of dealing
> with bushy search trees, so I'm willing to depart from the "standard
> model".

Unsafe pruning is likely to be a disaster. You don't usually get any measure of *how* unsafe the pruning has been. If you thereby drop your queen once per game, you will lose every game against a decent opponent no matter how much deeper you have been able to look. In general, it seems to be a lot better to stay safe, and instead to shape the search so that some things are searched more deeply and some things less.

But, as always, the proof of the pudding is in the eating. If you have an idea, try it! The alpha-beta part of a typical open-source engine is only a few lines of code, so it's easy to play with it and see whether your program plays better or worse.

--
Andy Walker
Nottingham
|
| |
Date: 15 Feb 2009 17:52:54
From: Peter Osterlund
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta pruning
|
[email protected] writes:
> a) Good point about "minimal windowing". Perhaps that's a better way
> to do what I have in mind. My idea was inspired by branch-and-bound
> search, but now that I think about it further, B&B is not a heuristic
> search, so it's completely safe to prune based on a global cutoff
> value. Not so for A-B.

You could say that alpha-beta is what you get if you take branch-and-bound and generalize it so as to be able to use it for searching a mini-max tree. There are similarities, such as:

* Guaranteed to find the same value as if searching the full tree.
* Cutting away a huge amount of work, but still typically leaving an exponential amount of work to be done.

Anyway, the branch-and-bound algorithm can be modified to compute approximate solutions, i.e. you can change the cut-off condition so that optimal solutions are allowed to be skipped if they are just a little better than the best solution found so far. This can significantly reduce the number of tree nodes that have to be searched, at the cost of losing the guarantee of finding an optimal solution.

It should be possible to generalize the "approximate B&B" algorithm to the alpha-beta case too, by modifying the alpha/beta windows appropriately. I can't tell if this would produce a useful algorithm for chess though. Some things to consider:

* Can the alpha-beta window be modified at all levels in the search tree without losing too much precision, or can you only do it at the root level?
* How big is the speed improvement?
* Will it make the chess program stronger? Increased speed will allow it to search deeper, but at the same time it will only be able to compute approximate mini-max values.

Suppose you set the approximation threshold to 0.1 pawns. What will happen? Will the program slowly lose 0.1 pawns per move until the position is lost, or will the increased speed make it find a decisive tactical shot that wins the game?

--
Peter Osterlund - [email protected]
http://web.telia.com/~u89404340
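On the B&B side of the analogy, the "approximate" relaxation described above is a one-line change. A sketch for a minimisation problem, with Node, lowerBound, cost, isLeaf and children as assumed helpers (not from this thread):

    // Exact branch-and-bound prunes when lowerBound(node) >= best.
    // The approximate version also prunes nodes that could improve on
    // the incumbent by less than epsilon; the final answer is then
    // guaranteed to be within epsilon of the true optimum.
    void branchAndBound(const Node &node, double epsilon, double &best) {
        if (lowerBound(node) >= best - epsilon)
            return;                            // the relaxed cut-off
        if (isLeaf(node)) {
            if (cost(node) < best)
                best = cost(node);
            return;
        }
        for (const Node &child : children(node))
            branchAndBound(child, epsilon, best);
    }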
|
| | |
Date: 15 Feb 2009 23:34:04
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
Peter Osterlund wrote:
> Suppose you set the approximation threshold to 0.1 pawns. What will
> happen? Will the program slowly lose 0.1 pawns per move until the
> position is lost, or will the increased speed make it find a decisive
> tactical shot that wins the game?

If you use the "wrong" alpha-beta levels, for whatever reason, then one of two things will happen: (a) by luck, judgement or otherwise, the "true" value of the position [as determined by the static evaluations] is within the chosen bounds, or (b) it is outside those bounds. In case (a), the search will go faster if the "wrong" bounds are "too narrow" [which is, after all, the basis for "aspiration search" and for "minimal windowing"] but slower if they are "too wide". In case (b), the *only* information you get, in general, is that the position is at least of value so-and-so [if better than beta] or at most of value so-and-so [if worse than alpha]. There's no "lose 0.1 pawns per move". The evaluation may return as 0.1 pawns below alpha, but the true value of the move that gave that evaluation may be a queen lower, and there may be no move that is better than 0.3 pawns lower. Even in time trouble, a computer can't afford to use evaluations below alpha.

More generally, it's the play-off between (a) and (b) that makes chess programming interesting. With care, you can get (a) most of the time, and consequently search much faster; but you have to be prepared for case (b) to occur, in which case there is usually little choice but to search again with wider bounds. If 90% of your searches go twice as fast, but 10% go twice as slow, you win ....

--
Andy Walker
Nottingham
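The (a)/(b) trade-off described here is what an aspiration-search driver manages. A rough sketch, reusing the alphaBeta routine sketched earlier in the thread; the starting window of 50 millipawns is purely illustrative:

    // Guess a narrow window around the previous iteration's score
    // (case (a): fast). A fail-hard search that lands on a bound has
    // hit case (b): we learn only "at least beta" or "at most alpha",
    // so we widen the window and pay for a re-search.
    int aspirate(Position &pos, int depth, int prevScore) {
        int delta = 50;                        // 0.05 pawns, in millipawns
        for (;;) {
            int alpha = prevScore - delta;
            int beta  = prevScore + delta;
            int score = alphaBeta(pos, depth, alpha, beta);
            if (score > alpha && score < beta)
                return score;                  // exact value inside window
            delta *= 2;                        // fail high/low: widen, retry
        }
    }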
|
| | | |
Date: 16 Feb 2009 00:04:23
From: Peter Osterlund
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta pruning
|
Andy Walker <[email protected]> writes:
> Peter Osterlund wrote:
>> Suppose you set the approximation threshold to 0.1 pawns. What will
>> happen? Will the program slowly lose 0.1 pawns per move until the
>> position is lost, or will the increased speed make it find a decisive
>> tactical shot that wins the game?
>
> If you use the "wrong" alpha-beta levels, for whatever reason,
> then one of two things will happen: (a) by luck, judgement or
> otherwise, the "true" value of the position [as determined by the
> static evaluations] is within the chosen bounds, or (b) it is outside
> those bounds. In case (a), the search will go faster if the "wrong"
> bounds are "too narrow" [which is, after all, the basis for
> "aspiration search" and for "minimal windowing"] but slower if they
> are "too wide". In case (b), the *only* information you get, in
> general, is that the position is at least of value so-and-so [if
> better than beta] or at most of value so-and-so [if worse than alpha].
> There's no "lose 0.1 pawns per move". The evaluation may return as
> 0.1 pawns below alpha, but the true value of the move that gave that
> evaluation may be a queen lower, and there may be no move that is
> better than 0.3 pawns lower. Even in time trouble, a computer can't
> afford to use evaluations below alpha.

I don't think you understood my idea, probably because I didn't explain it well enough.

Suppose you search the PV using standard alpha-beta, and suppose it returns a score of 1.00 up to the root. Using normal alpha-beta, the remaining moves for the root position would be searched with a window such that the search terminates as soon as the score is known to be <= 1.00. My idea is to terminate as soon as the score is known to be <= 1.10. You can obviously miss the best move in that case, but not if the best move is a lot better than the best move you found so far. After finishing such a search, you can be sure that the computed score is at most 0.10 lower than the mini-max score.

(It starts to get complicated though, if you do the same thing at other levels than the root level in the search tree.)

--
Peter Osterlund - [email protected]
http://web.telia.com/~u89404340
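In code, the root-level version of this idea might look like the sketch below (not from the thread; scores in millipawns, so a 0.10-pawn threshold is margin = 100, and Position, Move, rootMoves, make/unmake and alphaBeta are carried over from the earlier sketches). The PV move gets a full window; every other move only has to prove it beats best + margin.

    #include <vector>

    // A root move is dismissed as soon as it is known to be
    // <= best + margin; a move better than the PV by less than margin
    // can be missed, but the move returned is guaranteed to be within
    // margin of the true mini-max best.
    const int INF = 1000000;

    Move rootSearch(Position &pos, int depth, int margin /* e.g. 100 */) {
        std::vector<Move> moves = rootMoves(pos);
        Move bestMove = moves[0];
        pos.make(moves[0]);
        int best = -alphaBeta(pos, depth - 1, -INF, INF);   // PV: full window
        pos.unmake(moves[0]);
        for (std::size_t i = 1; i < moves.size(); ++i) {
            pos.make(moves[i]);
            // Window (best + margin, INF): failing low here means
            // "not more than margin better than the PV" -- dismissed.
            int score = -alphaBeta(pos, depth - 1, -INF, -(best + margin));
            pos.unmake(moves[i]);
            if (score > best + margin) {                    // clearly better
                best = score;
                bestMove = moves[i];
            }
        }
        return bestMove;
    }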
|
| | | | |
Date: 16 Feb 2009 19:20:17
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
Peter Osterlund wrote:
>> If you use the "wrong" alpha-beta levels, for whatever reason,
>> then one of two things will happen: [...]
> I don't think you understood my idea, probably because I didn't
> explain it well enough.

Ah, OK. I was on the wrong lines, then; however, luckily, most of what I wrote still seems to apply!

> Suppose you search the PV using standard alpha-beta, and suppose it
> returns a score of 1.00 up to the root. Using normal alpha-beta, the
> remaining moves for the root position would be searched with a window
> such that the search terminates as soon as the score is known to be <=
> 1.00. My idea is to terminate as soon as the score is known to be <=
> 1.10. You can obviously miss the best move in that case, but not if
> the best move is a lot better than the best move you found so far.

OK. In a "straight" alpha-beta search, this does indeed [I guess, though as always the proof lies in experimentation rather than intuition] save some nodes at the expense of accuracy. It's less clear that it saves nodes in some versions of minimal windowing; I'd need to think a little harder to convince myself!

> After finishing such a search, you can be sure that the computed score
> is at most 0.10 lower than the mini-max score.

OK. OTOH, if [eg in a tactical situation] your PV turns out to be a lot better than anything else, then you haven't gained anything, whereas if there are [in a positional situation] lots of moves with nearly the same value, then you have thrown away whatever positional acumen your program in fact possesses. On the third hand, I've known positions where the computer seems to be "agonising" between two lines that differ by 0.01, and you want to shake it and say "look, it really, really doesn't matter whether you play A or B, just plump for one", and in those cases it could be a big win.

> (It starts to get complicated though, if you do the same thing at
> other levels than the root level in the search tree.)

In general, it's *harder* to do things only at the root level than everywhere -- it means you need special code for that level. If I now understand you correctly, it just means replacing the code that assigns the score to alpha by code that assigns (score + 100) to alpha [if we're measuring in millipawns]. Surely someone has already tried this? If not, go ahead ...!

--
Andy Walker
Nottingham
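Against the alphaBeta sketch given earlier in the thread, the change described here would be essentially the following (an illustration, with delta = 100 millipawns = 0.1 pawns; bestScore is an extra variable, assumed here so that the routine still returns a real score rather than the inflated bound):

    // Inside the move loop of the earlier alphaBeta sketch:
    if (score > alpha) {
        bestScore = score;       // remember the genuine value to return
        alpha = score + 100;     // inflated bound: a sibling must now beat
    }                            // the best move so far by > 0.1 pawns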
|
| | | | | |
Date: 16 Feb 2009 22:21:35
From: Peter Osterlund
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta pruning
|
Andy Walker <[email protected]> writes:
> Peter Osterlund wrote:
>
>> Suppose you search the PV using standard alpha-beta, and suppose it
>> returns a score of 1.00 up to the root. Using normal alpha-beta, the
>> remaining moves for the root position would be searched with a window
>> such that the search terminates as soon as the score is known to be <=
>> 1.00. My idea is to terminate as soon as the score is known to be <=
>> 1.10. You can obviously miss the best move in that case, but not if
>> the best move is a lot better than the best move you found so far.
>
> OK. In a "straight" alpha-beta search, this does indeed [I
> guess, though as always the proof lies in experimentation rather than
> intuition] save some nodes at the expense of accuracy. It's less
> clear that it saves nodes in some versions of minimal windowing; I'd
> need to think a little harder to convince myself!

I agree. Tests are needed to find out if there will be any noticeable speedup. I know that the corresponding idea can give a very significant speedup when solving combinatorial optimization problems using branch & bound, but that doesn't automatically mean that the idea will be good for alpha-beta too.

>> After finishing such a search, you can be sure that the computed score
>> is at most 0.10 lower than the mini-max score.
>
> OK. OTOH, if [eg in a tactical situation] your PV turns out to
> be a lot better than anything else, then you haven't gained anything,
> whereas if there are [in a positional situation] lots of moves with
> nearly the same value, then you have thrown away whatever positional
> acumen your program in fact possesses. On the third hand, I've known
> positions where the computer seems to be "agonising" between two lines
> that differ by 0.01, and you want to shake it and say "look, it really,
> really doesn't matter whether you play A or B, just plump for one", and
> in those cases it could be a big win.

A case I had in mind was this: Suppose moves A, B, C, D have approximately equal scores at a given search depth, but at the next higher search depth, move D reveals a new tactical possibility and therefore has a significantly higher score. In that case, if your search order is A, B, C, D, you want to quickly dismiss B and C as "not much better than A", so that you have a bigger chance of finding the tactical opportunity in move D before the time runs out.

>> (It starts to get complicated though, if you do the same thing at
>> other levels than the root level in the search tree.)
>
> In general, it's *harder* to do things only at the root level
> than everywhere -- it means you need special code for that level.

Sorry, I was unclear again. I meant that analyzing the approximation error is complicated, not that it is complicated to implement the idea.

One can quite easily conclude that if you do this approximation (by amount "delta") at K different levels, the approximation error at the root is no more than K * delta. However, that quickly becomes too large to be practical if you intend to do it at all levels in the search tree. I was thinking that maybe under some circumstances it would be possible to prove a better bound for the error than "K * delta". However, proving such a bound appeared to be complicated.

> If I now understand you correctly, it just means replacing the code
> that assigns the score to alpha by code that assigns (score + 100)
> to alpha [if we're measuring in millipawns].

Yes, that's correct.

> Surely someone has already tried this? If not, go ahead ...!

Maybe I will. However, I was initially suggesting this as a way to make the original poster's idea work. His idea was, in a broad sense, "take something good from branch and bound and apply it to alpha-beta".

--
Peter Osterlund - [email protected]
http://web.telia.com/~u89404340
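The K * delta bound can be written out explicitly. A gloss on the argument (not from the thread), in LaTeX: each relaxed level can overlook a move that is better by at most delta, and in the worst case the errors of the subtrees add on top, giving the recurrence

    % e_k = worst-case error after k relaxed levels
    e_0 = 0, \qquad e_k \le \delta + e_{k-1}
    \quad\Longrightarrow\quad e_K \le K\,\delta .

So with delta = 0.1 pawns applied at, say, K = 8 plies, the root score could already be 0.8 pawns below the true mini-max value -- which is why applying it at all levels quickly becomes impractical.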
|
| | | | | | |
Date: 18 Feb 2009 00:10:38
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
Peter Osterlund wrote:
> One can quite easily conclude that if you do this approximation (by
> amount "delta") at K different levels, the approximation error at the
> root is no more than K * delta. However, that quickly becomes too
> large to be practical if you intend to do it at all levels in the
> search tree.

Actually, half the approximations will be for the other player, so there is a limit of ceiling(K/2)*delta, and even that happens only if there is a sequence of second or later moves being best by no more than delta. So I think the practical performance would be a lot better than the theoretical limit. However, there seem to be problems in the case where alpha+delta >= beta [eg, minimal windowing] -- it needs a bit of thought to determine what the best "mend" is, but at least once the alpha and beta levels have become that close, there will be no further approximation errors.

>> Surely someone has already tried this? If not, go ahead ...!
> Maybe I will. However, I was initially suggesting this as a way to
> make the original poster's idea work. His idea was, in a broad sense,
> "take something good from branch and bound and apply it to alpha-beta".

OK. Just possibly worth noting that ordinary B&B relates to alpha-beta in much the same way that helpmates relate to ordinary mates, so there might also be some spin-offs for "fairy" chess.

--
Andy Walker
Nottingham
|
|
Date: 14 Feb 2009 23:57:08
From: Andy Walker
Subject: Re: Game Tree Programming Idea: Selectivity via modified alpha-beta
|
[email protected] wrote:
[...]
> This got me thinking about alpha-beta in Chess (and other games).
> What if alpha and beta were made global (instead of stack-based)? The
> idea being that you (and your opponent) wouldn't explore a new path
> that is worse than the best path found so far. The benefit being that
> this might be a way to perform additional ("unsafe") pruning in a
> game-independent way. [...]

You sort-of can, sort-of can't, and it sort-of probably doesn't help [at least if I've understood your notion correctly].

(a) It's easy to modify alpha-beta so as to determine efficiently whether the current position is better or worse than some fixed value. In such a case, alpha and beta never change [as any proposed change would correspond to a beta cutoff], so they may as well be global. But you don't get the current value, nor a best move. This is pretty much "minimal windowing", and it needs to be combined with something else to become part of a proper search -- after which it's a Good Idea.

(b) Otherwise, you need to be able to "back out" any changes ["Oh, this looks like a good move -- yes, better and better -- oops, I've now seen the refutation, back to my previous idea"], so you either can't change alpha/beta or else you really need to stack them. If you never change them, then you don't get the benefit from improving alpha, so your search is less efficient than standard alpha-beta for no improvement elsewhere in the algorithm.

(c) You have a recursive procedure anyway, and the hard work is not done when going down or back up the tree [which is where alpha/beta get stacked and changed], but in the zillions of static evaluations at the leaves. Almost every intuitive idea about chess programming turns out to be wrong; despite this, my intuition says that even if you could get your idea to work efficiently, the improvement would not be measurable.

(d) If you get a result different from straight minimax, then it isn't merely different, it's wrong -- at least within the "standard model" of chess programming [by which you're trying to find the best move according to the static evaluations at whichever leaves your search goes down to]. This is probably Bad News.

(e) Nevertheless, if you have an Idea, chess programming is not like counting angels on pin-heads, it's an absolutely practical problem. Take one of the open-source programs, change the dynamic search to include the Idea, run a few hundred games "new vs old", and see which wins. If your Idea loses, quietly forget it. If it wins, quietly win a tournament or two using it, then shout it from the rooftops and become famous. If it draws, then write a paper explaining how your Idea needs development but shows promise, and how a Large Grant would help with this.

(f) If I've misunderstood your proposal, then (a,b,c) may not apply; sorry. But (d,e) always apply!

--
Andy Walker
Nottingham
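Point (a) in code, for concreteness: a sketch assuming the alphaBeta routine from earlier in the thread and integer scores.

    // A "minimal window" probe: answers "is this position worth more
    // than v?" without computing an exact value or a best move. With
    // a window of width one, any improvement of alpha would at once
    // be a beta cutoff, so alpha and beta never change inside the
    // search -- which is exactly why they "may as well be global".
    bool betterThan(Position &pos, int depth, int v) {
        return alphaBeta(pos, depth, v, v + 1) > v;
    }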
|
|