THE TYRANNY OF NUMBERS

(PART TWO)

by Steve Lopez


"It ain't so much the things we don't know that get us in trouble. It's the things we know that ain't so." -- Artemus Ward

In Part One, our intrepid hero was exploring the dangerous wilderness of the Ruy Lopez Exchange, hacking his way through the dense underbrush of the game tree with the help of his trusted guide, CBTree. Will he make it to the middlegame alive or will he succumb to a hidden trap along the way?

Actually, the only trap we can fall into is the trap of believing that numbers don't lie. Wait, let's reword that: numbers don't lie, but one can misinterpret numbers to get all kinds of false results. Finding these pitfalls is what this series of articles is all about.

Last time, you were left hanging after White's first move. Here's a refresher:

MOVE      FREQUENCY   %      RESULT   ELO
e4        1729        100%   0.54     2287

As you'll doubtless recall, this means 1.e4 was played in 1729 games, comprising 100% of the tree at that point. White did slightly better than Black in the games in which this move appeared (since 0.50 is an even score and a number closer to 1.00 means the move was better for White). The average Elo rating of the players who played 1.e4 was 2287.

Some players like to refer to the Elo rating to see what moves the strong players prefer. This is fine, except for one hitch: what if the players in some of the database's games don't have Elo ratings assigned to them? In such cases, CBTree assigns these players an arbitrary rating of 2200 for statistical purposes (on the assumption that most users are only interested in master and grandmaster games and will only have such games in their databases).

Checking my Ruy Lopez Exchange database, I see that less than 50% of the games contain Elo ratings for one or both players. This completely blows the statistical accuracy of our 2287 average Elo for 1.e4. After all, I might have snuck a couple of my own class-level games into the database. Or, more likely (and worse), there are certainly a bunch of GM games that I manually added from books or magazines for which the players' ratings weren't available. Plus we need to take into account that there are a lot of games from before 1945 in the database and nobody had ratings in the days of Alekhine and Capablanca; Arpad Elo hadn't yet come along with his rating system. So the "default" rating of 2200 doesn't quite work here (Richard Reti -- a mere 2200 player? I don't think so!).
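To see how far the 2200 placeholder can drag an average, here is a minimal sketch in Python. The ratings are invented for illustration (they are not taken from the Ruy Lopez Exchange database) and the function names are hypothetical; only the 2200 default comes from the behavior described above.

```python
DEFAULT_ELO = 2200  # placeholder value described above for unrated players

def average_with_default(ratings, default=DEFAULT_ELO):
    """Mean rating when every missing rating (None) is counted as `default`."""
    filled = [default if r is None else r for r in ratings]
    return sum(filled) / len(filled)

def average_rated_only(ratings):
    """Mean over listed ratings only; returns None if no rating is listed."""
    rated = [r for r in ratings if r is not None]
    return sum(rated) / len(rated) if rated else None

# Invented sample: two rated players and two with no rating listed (None)
sample = [2500, 2400, None, None]
print(average_with_default(sample))  # 2325.0
print(average_rated_only(sample))    # 2450.0
```

With half the sample unrated, the placeholder pulls the average down by 125 points, even though nothing is actually known about the unrated players' strength.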

Short of limiting one's database to games in which both players have listed ratings, what does one do?

Fortunately, CBTree has covered this problem with a menu selection called "Stats". By clicking on this feature, we can get a statistics box to appear on the screen. In this box we see three sets of pie charts, bar graphs, and numerical information. The upper set gives the results gleaned from all of the games in the database. The second set shows statistics derived from only the games that had an Elo rating listed for the moving player (White, in this case). The lower set delivers stats for the games in which both players had a listed rating.

By using this last set of charts, we can be assured of getting accurate figures based on Elo ratings. We can get performance rating averages for the move, too (i.e. the average of what the players' ratings would have been after the games were played, adjusted for their success or failure). So, after some exploring, we find that the true Elo average for White's move 1.e4 in only the games with ratings for both players is 2416, with a performance rating of 2452. Much better than the initial figure of 2287!

This brings us to Steve's Rule #2 for using a chess tree program: don't rely too heavily on numerical data that may be based on an incomplete or inaccurate statistical sampling. Which leads directly to the exception to Rule #2: ...unless you want to do a little extra work. In this case, the extra work consists of clicking on the "Stats" function and waiting a few seconds for the new numbers to be generated (which really isn't much work at all, come to think of it).

The bottom line here is that unless 90% to 95% of the games in your database have Elo ratings listed for both players, don't get too carried away with looking at the Elo average given in the main CBTree screen. Use the "Stats" function instead.
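The three sample sets in the "Stats" box and the 90% rule of thumb can be sketched roughly as follows. The game records are invented and every name here is hypothetical; this only illustrates the idea and says nothing about how CBTree stores or filters games internally.

```python
def stats_sets(games):
    """Split games into three samples: all, mover rated, both players rated.

    Each game is a (mover_elo, opponent_elo) pair; None means no rating listed.
    """
    all_games = list(games)
    mover_rated = [g for g in all_games if g[0] is not None]
    both_rated = [g for g in all_games if g[0] is not None and g[1] is not None]
    return all_games, mover_rated, both_rated

def both_rated_share(games):
    """Fraction of games that list a rating for both players."""
    all_games, _, both_rated = stats_sets(games)
    return len(both_rated) / len(all_games) if all_games else 0.0

def main_screen_elo_is_trustworthy(games, threshold=0.90):
    """Rule of thumb: trust the main-screen average only above the threshold."""
    return both_rated_share(games) >= threshold

# Invented sample of four games
games = [(2450, 2400), (2500, None), (None, None), (2380, 2420)]
print(both_rated_share(games))                # 0.5
print(main_screen_elo_is_trustworthy(games))  # False
```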

The first move of a game isn't really the optimum place for referring to Elo averages; you're usually going to check these averages out when you have several candidate moves from which to choose. But I wanted to take this early opportunity to point out an immediate and recurring pitfall: relying too heavily on numerical data without putting any thought into what's behind the numbers. This is the biggest trap of using a statistical tree program as a "crutch". If you just parrot the moves of strong players (basing your choices on statistical data) without understanding why the moves were played, you're just playing with numbers instead of playing chess. And, as we're now seeing, this is extremely dangerous because the raw numbers can be misleading.

Returning to our tree, let's go ahead one move and see what happens:

MOVE      FREQUENCY   %      RESULT   ELO
1...e5    1722        100%   0.54     2287
1...Nc6   7           <1%    0.79     2406

Oops, an immediate fork in the road! Taking this at face value, we see that 1...Nc6 seems to be a real turkey for Black. White seems to score over three-fourths of the possible points in the games in which this move is played (perhaps, we reason, this is why it's only seen in seven games, less than 1% of the total game tree).

However, let's think about this for a moment. We're looking at a game tree constructed from a database of Ruy Lopez Exchange Variation games. The defining position of the Ruy Exchange comes when White's Bishop takes Black's Knight on c6. If there's no minor-piece swap on c6, there's no Ruy Exchange (said swap is how the variation gets its name). My hunch is that these seven games will transpose back into the main line and that White's apparent success in these seven games has nothing whatsoever to do with Black's choice of 1...Nc6. Let's follow the path of the 1722 games in which Black played 1...e5 and see:

MOVE      FREQUENCY   %      RESULT   ELO
2.Nf3     1722        100%   0.54     2286

MOVE      FREQUENCY   %      RESULT   ELO
2...Nc6   1722        100%   0.54     2295

MOVE      FREQUENCY   %      RESULT   ELO
3.Bb5     1729        100%   0.54     2287

Ah-ha! Notice that when we went from 2...Nc6 to 3.Bb5, the total number of games jumped back up to our original total of 1729. So the 1...Nc6 line did transpose back to the main line as predicted (the move order being 1.e4 Nc6 2.Nf3 e5 3.Bb5: same board position, different move order. Ain't chess fun?).
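The jump back to 1729 suggests the tree tallies games by board position rather than by move order. Here is a minimal sketch of that idea in Python. It is a toy, not CBTree's actual method: it tracks only the five pieces that move in these lines, uses coordinate moves such as "e2e4", and ignores side to move, castling rights, and everything else a real position key would need. All names are hypothetical.

```python
from collections import Counter

def start_position():
    """Only the pieces this example moves; enough to compare the two lines."""
    return {
        "e2": "P", "g1": "N", "f1": "B",  # White pawn, knight, bishop
        "e7": "p", "b8": "n",             # Black pawn, knight
    }

def play(moves):
    """Apply coordinate moves like 'e2e4' and return a hashable position key."""
    board = start_position()
    for move in moves:
        src, dst = move[:2], move[2:]
        board[dst] = board.pop(src)
    return frozenset(board.items())

main_line = ["e2e4", "e7e5", "g1f3", "b8c6", "f1b5"]   # 1.e4 e5 2.Nf3 Nc6 3.Bb5
transposed = ["e2e4", "b8c6", "g1f3", "e7e5", "f1b5"]  # 1.e4 Nc6 2.Nf3 e5 3.Bb5

print(play(main_line) == play(transposed))          # True: same position after 3.Bb5
print(play(main_line[:2]) == play(transposed[:2]))  # False: different after move one

# Tallying by position merges the two move orders into one node
tally = Counter()
tally[play(main_line)] += 1722
tally[play(transposed)] += 7
print(tally[play(main_line)])  # 1729
```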

So, going back to our stats for Black's first moves, we see that 1...Nc6 isn't inherently bad for Black after all, despite the 0.79 average that it receives. The four embittering losses, three draws, and no wins for Black in this line were just the breaks and had nothing to do with the opening move order.
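The 0.79 figure can be checked by hand from those seven results. The sketch below assumes the usual scoring convention (a win counts 1, a draw 0.5, a loss 0, seen from White's side), which matches the number on screen; the function name is hypothetical.

```python
def result_score(white_wins, draws, black_wins):
    """Average result from White's side: win = 1, draw = 0.5, loss = 0."""
    games = white_wins + draws + black_wins
    return (white_wins + 0.5 * draws) / games

# The seven 1...Nc6 games: four White wins, three draws, no Black wins
score = result_score(white_wins=4, draws=3, black_wins=0)
print(round(score, 2))  # 0.79
```

Note that 0.79 is a share of the points, not of the games: White actually won four of the seven, and the three draws supply the rest.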

Let's forge ahead in our tree and see what happens next:

MOVE      FREQUENCY   %      RESULT   ELO
3...a6    1729        100%   0.54     2296

MOVE      FREQUENCY   %      RESULT   ELO
4.Bxc6    1729        100%   0.54     2287

After we click on 4.Bxc6, we see the previously referred-to defining position of the Ruy Exchange on our electronic chessboard. But now what will Black do?

MOVE      FREQUENCY   %      RESULT   ELO
4...dxc6  1696        98%    0.54     2297
4...bxc6  32          2%     0.52     2241
END       1           <1%    0.50     2200

Some choices at last! Black has to recapture with a pawn or else lose the Knight for nothing. But what's with this "END" business? What this designates is that there is one game in the database that ended with 4.Bxc6 and never got to Black's fourth move. After clicking on the "Games" command, we see that the guilty parties are J. Brittner and A. Stock, who decided to call it a day after 4.Bxc6 at Mehlingen in 1992. Rather than pausing to ridicule them, let's move onward instead.

Looking at the stats, I see that 4...bxc6 isn't played very often (and I knew this from my outside reading anyway), so I've decided to check out the 4...dxc6 lines. I'll most likely return later and check out the variations after 4...bxc6.

After 4...dxc6, we see a bunch of choices for White:

MOVE      FREQUENCY   %      RESULT   ELO
5.0-0     1288        76%    0.54     2290
5.Nc3     217         13%    0.56     2297
5.d4      150         9%     0.47     2274
5.d3      21          1%     0.50     2234
5.h3      11          1%     0.36     2200
5.Nxe5    6           <1%    0.33     2296
5.c3      2           <1%    0.50     1995
5.b3      1           <1%    0.50     2200

Whoa! Talk about having some options! How are we going to sort through this stuff?

It's actually not too tough. We'll use the process of elimination to decide what to study, combining CBTree's statistics with our own personal preferences as players. That way, we're using CBTree as a tool and not being a slave to its numbers or relying on it to do all our work for us.

And the process of elimination is what we'll look at next week in T-Notes.