THE TYRANNY OF NUMBERS

(PART THREE)

by Steve Lopez

First off this week, a quick recap. I've been using CBTree to help me learn the White side of the Exchange Variation of the Ruy Lopez. At the end of Part Two, after the moves 1.e4 e5 2.Nf3 Nc6 3.Bb5 a6 4.Bxc6 dxc6, we saw that CBTree provided me with the following menu of choices:

MOVE      FREQUENCY     %   RESULT   ELO
5.0-0          1288   76%     0.54  2290
5.Nc3           217   13%     0.56  2297
5.d4            150    9%     0.47  2274
5.d3             21    1%     0.50  2234
5.h3             11    1%     0.36  2200
5.Nxe5            6   <1%     0.33  2296
5.c3              2   <1%     0.50  1995
5.b3              1   <1%     0.50  2200

Since I'm studying this opening from White's perspective, I have a choice to make here. Which line do I want to concentrate on? I'll need to use the process of elimination to make my decision.

Most of the lines look pretty similar from the standpoint of results, hovering within a few hundredths of 0.50, so they look to give White and Black about even chances. However, the last three moves on the list appeared in fewer than ten games each, so we can label them "untested". I might try them against Fritz later, but right now I'll bypass them. 5.d3 and 5.h3 both look pretty passive, so I'll cross them off the list too.

So which of the three remaining moves will I concentrate on? I already know from my outside reading that the Ruy Exchange tends to be pretty drawish, because it usually leads to a mass exchange of material. 5.d4 looks like it would achieve that end much more quickly than the other two lines, so I'll study it (despite the fact that, of the three candidate moves, it seems to be the one that most favors Black). Another plus to choosing 5.d4 is that there are 150 games (out of the original 1729 games) from the database that follow this line, so by choosing this path I've cut the amount of data that I need to wade through by more than 90% of the original total.
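The "more than 90%" figure is easy to check. Here's a minimal sketch in Python that uses only the two game counts quoted above; the variable names are my own and have nothing to do with CBTree itself:

```python
# Game counts quoted in the text above.
total_games = 1729   # games in the original Ruy Exchange database
d4_games = 150       # games that continue with 5.d4

# Fraction of the database that can be set aside by committing to 5.d4.
reduction = 1 - d4_games / total_games

print(f"{reduction:.1%} of the games can be skipped")
```

The result comes out a little over 91%, which squares with the claim.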

Let's digress and consider this "outside reading" business for a moment. Steve's Rule #3 for using a statistical game tree program reads as follows: don't be afraid to consult books and other outside knowledge to help you study the game tree. In many cases, this approach will give you information and insight that goes beyond mere statistics. (For example, I read somewhere that 5.d4 was the standard move prior to the 1960s. Then Fischer popularized 5.0-0 and castling became the norm.)

So what's next? After 5.d4 we see:

MOVE      FREQUENCY     %   RESULT   ELO
5...exd4        142   95%     0.45  2288
5...Bg4           6    4%     0.92  2216
5...Nf6           2    1%     0.50  2200

Odds are that my class-level opponents will be playing 5...exd4, since the other two moves have been played very infrequently. Even if they play one of the other two moves, I doubt that they're going to have any more information on them than I do. So I'll skip the latter two moves for now and come back to them later.

After 5...exd4, we see:

MOVE      FREQUENCY     %   RESULT   ELO
6.Qxd4          140   99%     0.45  2279
6.Nxd4            2    1%     0.25  2203

Forget the numbers. There is no way that I'm going to play 6.Nxd4 because it puts the Knight on a doofy square (where it can be easily attacked by a Black c-pawn) and sets up a possible Queen swap on d1 that will prevent me from castling. So 6.Qxd4 is the only choice.

MOVE      FREQUENCY     %   RESULT   ELO
6...Qxd4        127   91%     0.45  2295
6...Bg4          11    8%     0.59  2225
6...Be6           2    1%     0.25  2200

Here comes a typical question that might be posed by someone unfamiliar with the science of statistics: "Since 6...Be6 is way better for Black, why isn't it played more often?" In Aristotelian fashion, we might answer the question with a question: "Is 6...Be6 really better, or does it just look better?"

Examining the stats we notice that 6...Be6 was played in just two games; Black won one and the other was a draw. Two games does not constitute a good statistical sampling.

We're all familiar with ads proclaiming that "four out of five dentists recommend Brand Yeeech toothpaste". But what does this really mean? It means absolutely nothing, because we have no idea what type of statistical base is being used to come to that conclusion. If thousands of dentists were interviewed and 80% of them recommended Brand Yeeech, I'd be impressed. But it's more likely that only five dentists were consulted, and four of them practice in small towns where the tiny grocery carries only Brand Yeeech and no other toothpaste brands. So of course these four recommend it; they have no choice.

Fortunately, CBTree gives us an opportunity that Madison Avenue fails to provide: the chance to look behind the statistics and see how they were derived.

If I highlight 6...Be6 and click on "Games" at the top of the screen, I see that Black's sole victory is Game #282 of the database (Mieses-Janowski, Cambridge Springs, 1904). Upon firing up ChessBase or Fritz and playing through the game, I see that Mieses (White) was actually a piece up at one point but later got his Knight trapped by a Black Bishop and a swarm of Black pawns in the endgame. Obviously, Black's victory was not due directly to his having played 6...Be6. So we can conclude that 6...Be6 isn't in and of itself better for Black -- it just worked out that way in one of the two games in the database in which it was played. The other game was a draw, so after averaging 0.00 (the Black win) and 0.50 (the draw), we get 0.25. The raw statistical data suggests that Black is winning after 6...Be6, but our research proves otherwise.
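The averaging step is simple enough to sketch in a few lines of Python. This only illustrates the arithmetic; it is not a claim about how CBTree computes its RESULT column internally. The result strings ("1-0", "1/2-1/2", "0-1") and the function name result_score are my own assumptions, and scores are taken from White's point of view to match the 0.00 and 0.50 values above.

```python
# Per-game scores from White's point of view (assumed convention,
# consistent with the 0.00 and 0.50 values quoted above).
SCORES = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}

def result_score(results):
    """Average the per-game scores for a list of result strings."""
    if not results:
        raise ValueError("no games to average")
    return sum(SCORES[r] for r in results) / len(results)

# The two 6...Be6 games: one Black win and one draw.
be6_games = ["0-1", "1/2-1/2"]
print(result_score(be6_games))  # 0.25
```

With only two games in the list, a single result swings the average by a quarter of a point, which is why the figure deserves a second look.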

This brings up Steve's Rule #4 for using a statistical tree program: before drawing conclusions from the numerical data, make sure the numbers are based on a good statistical sampling. What constitutes a "good statistical sampling"? This will vary depending on the overall starting size of the original database and the number of games in the branch you're currently examining. Since the particular branch we're looking at contains a total of 140 games, I think that a mere two games is way too small a sampling from which to draw conclusions. By the way, this might seem to be in direct violation of my first rule for using statistical trees (don't run a tree program on a large batch of unrelated games) but that's just an illusion; the key word in Rule #1 is "unrelated". What we're searching for here is a happy medium: too much data and your head will explode, too little data and you risk being misled by the numbers.

Personally, I'm skeptical of numbers derived from fewer than 10 games. In such cases I do exactly what I described a few paragraphs ago: I play through the games to see what happens later on. Did the move in question directly contribute to the result or did some other factor later in the game decide the issue? Steve's Rule #5 for using a statistical tree program comes into play here: always play through complete games to see what's "behind" the numbers.

Now back to our tree: I remember a time when 6...Bg4 was a hot topic of debate on Compuserve, so I make a mental note to check later to see what all the shouting was about. Right now, though, I'll stick with 6...Qxd4:

MOVE      FREQUENCY     %   RESULT   ELO
7.Nxd4          125   98%     0.44  2284
END               2    2%     0.50  2280

Four more players call it a day, leaving us with 125 games. Clicking on 7.Nxd4, we get this:

MOVE      FREQUENCY     %   RESULT   ELO
7...Bd7          62   50%     0.37  2325
7...c5           21   17%     0.48  2285
7...Bd6          20   16%     0.58  2284
7...Nf6          10    8%     0.60  2200
7...Bc5           4    3%     0.38  2218
END               3    2%     0.50  2367
7...g6            2    2%     0.50  2200
7...Be6           1    1%     0.00  2200
7...f6            1    1%     0.00  2200
7...Ne7           1    1%     1.00  2200

Yipes! Time for the "process of elimination" again! Basically, I can cut out all but the top four moves, since the bottom five ("END" being excluded, of course) have a statistical base of fewer than ten games. The upper four moves have a large enough statistical base to allow me to draw some preliminary conclusions. Looking at the raw numbers, I see that 7...Bd7 is played about half the time here and results in an average effectiveness of 0.37. Since an evaluation of "Black has a slight advantage" is scored as 0.40 by CBTree, we'll round off 7...Bd7's evaluation to a slight plus for Black. The remaining three variations round off as either an even position or as slightly better for White. So there's really nothing alarming here.
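This elimination step can be written down as a small filter. The sketch below is my own illustration in Python, not a CBTree feature: the game counts are copied from the table above, while the names seventh_moves, MIN_GAMES, and candidates are made up for the example.

```python
# (move, games) pairs copied from the table at Black's seventh move.
seventh_moves = [
    ("7...Bd7", 62), ("7...c5", 21), ("7...Bd6", 20), ("7...Nf6", 10),
    ("7...Bc5", 4), ("END", 3), ("7...g6", 2),
    ("7...Be6", 1), ("7...f6", 1), ("7...Ne7", 1),
]

MIN_GAMES = 10  # rule-of-thumb threshold used in this article

def candidates(rows, min_games=MIN_GAMES):
    """Keep moves with at least min_games games; "END" is not a move."""
    return [move for move, games in rows
            if move != "END" and games >= min_games]

print(candidates(seventh_moves))
```

Note that 7...Nf6 just squeaks in with exactly ten games. The threshold is a rule of thumb, so a move sitting right on the line deserves the same skepticism as the ones just below it.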

Between these four variations, we're left with 113 games to study (62 + 21 + 20 + 10), which is still an impressive chunk of data (though nothing like the original total of 1729 games!).

How do we tackle this data? A good way to do it is to examine the set of games for each one of these four moves, looking for (and playing through) the games that contain commentary. There are several ways to go about doing this.

If our Ruy Lopez Exchange database has an opening classification key, we can just look up and click on each of the four variations and get the game list for each one. If the database has no key, but we have a copy of ChessBase, we can direct ChessBase to automatically generate an opening key. If we don't have ChessBase, we'll have to do it the old-fashioned way: use the "Games" command in CBTree to call up a list of the games for each variation, write down the game numbers, and then check them manually in whichever program we're using to view the games. (A neat shortcut in CBWin or CB6 is to set up the board position after the candidate move and then hit SHIFT-F7 to have a list of the games containing that board position sent to the Clipboard).

Using an opening key, we click on the 7...Bd7 variation. Games with commentary are easily identifiable by the letters V and C on the right-hand side of the games list. In this particular case, there are nine of them. I would concentrate on these nine games first since they contain variations and commentary by stronger players. After studying these games and learning the ideas contained within them, we can look at the unannotated games and look for the same patterns, themes, and similarities. We then repeat this process for the other three candidate moves.

After we do this, we can then return to CBTree and extend our variation search an extra eight or ten positions down the road from each of our four main candidates at Black's seventh move. We do this to search for hidden traps and pitfalls.

Here's an example: let's say that I'm interested in playing a variation that CBTree shows was played in ten games and is evaluated as 0.90. Looking at the games list, we see that nine of the ten games ended as wins for White and just one was a Black victory.

The initial reaction would be to say, "Hot diggety! Sure looks good for White!" So what could be wrong with this picture? Upon closer examination, we see that all of White's wins occurred before 1985. Black's sole win took place in 1986, and our candidate variation has never appeared since. Smelling a rat, we play through the next few moves of the game in which Black was victorious. Sure enough, our "dream move" for White gets busted by Black and leads directly to White's downfall. Evidently in 1986 some brainy mug cooked up a theoretical novelty that finally turned the tables on White.

While this scenario may sound overly simplistic, it serves to illustrate the point: numbers are not always to be trusted. A similar scenario may have a different ending; Black's sole victory might have come because White blew his early advantage and lost a hotly-contested endgame. The only way to know for sure is to use CBTree to check ahead in the game tree, use one of our other programs to play through the complete games in the database, and use our brains to analyze the data we uncover.
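If your game list can be exported with dates, the "smell test" in this example amounts to splitting the results by year. Here's a rough Python sketch of that idea. The individual years in the list are invented purely for illustration (only "nine White wins before 1985, one Black win in 1986" comes from the scenario above), and the function names are hypothetical.

```python
# Hypothetical (year, result) pairs for the ten-game scenario above.
# The specific years are invented; only the 9-1 split and the
# 1985/1986 boundary come from the text.
games = [
    (1971, "1-0"), (1973, "1-0"), (1975, "1-0"), (1977, "1-0"),
    (1978, "1-0"), (1980, "1-0"), (1982, "1-0"), (1983, "1-0"),
    (1984, "1-0"), (1986, "0-1"),
]

SCORES = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}

def score(rows):
    """Average score from White's point of view, or None if no games."""
    if not rows:
        return None
    return sum(SCORES[result] for _, result in rows) / len(rows)

def split_by_year(rows, year):
    """Return (score before year, score from year onward)."""
    before = [g for g in rows if g[0] < year]
    after = [g for g in rows if g[0] >= year]
    return score(before), score(after)

print(score(games))                # overall: 0.9
print(split_by_year(games, 1985))  # (1.0, 0.0)
```

A drop like this is only a hint that something changed. It can't tell a refutation from a blown endgame, so the games still have to be played through.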

By now you have the idea of how to use CBTree (or any other statistical game tree program). The main use for these programs is as a sort of road map to guide us through the labyrinth of variations. The numerical statistics, while certainly useful, are really of secondary importance. The main purpose of a tree program is to help locate and isolate specific games and variations for study and then use the games themselves to discover the positional and tactical themes that arise from these opening positions.

So don't be enslaved by the "tyranny of numbers"! Use your statistical tree program as a trusted guide and refer to the numbers as handy guideposts. Above all, remember to use the various tools at your disposal to discover the truth behind the numbers. Think for yourself!



STEVE'S RULES FOR USING A STATISTICAL TREE PROGRAM:

  1. To avoid massive confusion and aggravation, don't run a tree program on a large batch of unrelated games. Organize your data before generating a tree. Corollary: Confusion is bad; cut corners wherever possible.

  2. Don't rely on numerical data that may be based on an incomplete statistical sampling (for example: don't trust the "average Elo rating" for a move unless 90% to 95% of the games in the database give Elo ratings for BOTH players).

  3. Don't be afraid to use books or other outside sources to help you study and understand the game tree.

  4. Before drawing conclusions from the numerical data, make sure that the numbers are based on a good statistical sampling (the exact figure here varies with the situation; a general rule of thumb is to be leery of results based on fewer than ten games).

  5. Always play through complete games to determine what lies behind the statistical data.