*"It ain't so much the things we don't know that get us in trouble. It's the
things we know that ain't so." -- Artemus Ward*

In Part One, our intrepid hero was exploring the dangerous wilderness of the
Ruy Lopez Exchange, hacking his way through the dense underbrush of the game
tree with the help of his trusted guide, *CBTree*. Will he make it to the
middlegame alive or will he succumb to a hidden trap along the way?

Actually, the only trap we can fall into is the trap of believing that numbers don't lie. Wait, let's reword that: numbers don't lie, but one can misinterpret numbers to get all kinds of false results. Finding these pitfalls is what this series of articles is all about.

Last time, you were left hanging after White's first move. Here's a refresher:

MOVE | FREQUENCY | % | RESULT | ELO |
---|---|---|---|---|
1.e4 | 1729 | 100% | 0.54 | 2287 |

As you'll doubtless recall, this means 1.e4 was played in 1729 games, comprising 100% of the tree at that point. White did slightly better than Black in the games in which this move appeared (0.50 is an even score, and a number closer to 1.00 means the move did better for White). The average Elo rating of the players who played 1.e4 was 2287.

Some players like to refer to the Elo rating to see what moves the strong
players prefer. This is fine, except for one hitch: what if the players in
some of the database's games don't have Elo ratings assigned to them? In
such cases, *CBTree* assigns these players an arbitrary rating of 2200 for
statistical purposes (on the assumption that most users are only interested
in master and grandmaster games and will only have such games in their
databases).

Checking my Ruy Lopez Exchange database, I see that fewer than 50% of the games contain Elo ratings for one or both players. This completely blows the statistical accuracy of our 2287 average Elo for 1.e4. After all, I might have snuck a couple of my own class-level games into the database. Or, more likely (and worse), there are a bunch of GM games that I manually added from books or magazines for which the players' ratings weren't available. Plus, we need to take into account that a lot of the games in the database date from before 1945, and nobody had ratings in the days of Alekhine and Capablanca; Arpad Elo hadn't yet come along with his rating system. So the "default" rating of 2200 doesn't quite work here (Richard Reti -- a mere 2200 player? I don't think so!).

Short of limiting one's database to games in which both players have listed ratings, what does one do?

Fortunately, *CBTree* has covered this problem with a menu selection called
"Stats". By clicking on this feature, we can get a statistics box to appear
on the screen. In this box we see three sets of pie charts, bar graphs, and
numerical information. The upper set gives the results gleaned from all of
the games in the database. The second set shows statistics derived from only
the games that had an Elo rating listed for the moving player (White, in
this case). The lower set delivers stats for the games in which both players
had a listed rating.

By using this last set of charts, we can be assured of getting accurate
figures based on Elo ratings. We can get performance rating averages for the
move, too (i.e. the average of what the players' ratings would have been
after the games were played, adjusted for their success or failure). So,
after some exploring, we find that the true Elo average for White's move
1.e4 in only the games with ratings for both players is 2416, with a
performance rating of 2452. *Much* better than the initial figure of 2287!
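
To see why the default rating drags the average around, here's a rough Python sketch. The five games are invented purely for illustration, and the way the three averages are computed is my assumption about what the main screen and the "Stats" sets are doing, not *CBTree's* actual code. The only figure taken from the discussion above is the 2200 default.

```python
DEFAULT_ELO = 2200  # stand-in rating the article says CBTree gives unrated players

# Hypothetical (white_elo, black_elo) pairs; None means no rating was recorded.
games = [
    (2450, 2430),
    (2400, None),
    (None, 2380),
    (None, None),
    (2390, 2410),
]


def mean(values):
    """Average of a non-empty list of numbers."""
    return sum(values) / len(values)


# Main-screen style: every game counts, and an unrated White player becomes 2200.
all_games = mean([w if w is not None else DEFAULT_ELO for w, _ in games])

# Second "Stats" set: only games where the moving player (White) has a rating.
mover_rated = mean([w for w, _ in games if w is not None])

# Third "Stats" set: only games where both players have a rating.
both_rated = mean([w for w, b in games if w is not None and b is not None])

print(round(all_games), round(mover_rated), round(both_rated))  # 2328 2413 2420
```

With two of the five White players missing a rating, the "everybody counts" average lands almost a hundred points below the both-rated figure, which is the same kind of gap we saw between 2287 and 2416.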

This brings us to Steve's Rule #2 for using a chess tree program: don't rely too heavily on numerical data that may be based on an incomplete or inaccurate statistical sampling. Which leads directly to the exception to Rule #2: ...unless you want to do a little extra work. In this case, the extra work consists of clicking on the "Stats" function and waiting a few seconds for the new numbers to be generated (which really isn't much work at all, come to think of it).

The bottom line here is that unless 90% to 95% of the games in your database
have Elo ratings listed for both players, don't get too carried away with
looking at the Elo average given in the main *CBTree* screen. Use the "Stats"
function instead.

The first move of a game isn't really the optimum place for referring to Elo
averages; you're usually going to check these averages out when you have
several candidate moves from which to choose. But I wanted to take this
early opportunity to point out an immediate and recurring pitfall: relying
too heavily on numerical data without putting any thought into what's behind
the numbers. This is the biggest trap of using a statistical tree program as
a "crutch". If you just parrot the moves of strong players (basing your
choices on statistical data) without understanding *why* the moves were
played, you're just playing with numbers instead of playing chess. And, as
we're now seeing, this is extremely dangerous because the raw numbers can be
misleading.

Returning to our tree, let's go ahead one move and see what happens:

MOVE | FREQUENCY | % | RESULT | ELO |
---|---|---|---|---|
1...e5 | 1722 | 100% | 0.54 | 2287 |
1...Nc6 | 7 | <1% | 0.79 | 2406 |

Oops, an immediate fork in the road! Taking this at face value, we see that 1...Nc6 seems to be a real turkey for Black. White seems to score over three-fourths of the available points in the games in which this move is played (perhaps, we reason, this is why it's only seen in seven games, less than 1% of the total game tree).

However, let's think about this for a moment. We're looking at a game tree constructed from a database of Ruy Lopez Exchange Variation games. The defining position of the Ruy Exchange comes when White's Bishop takes Black's Knight on c6. If there's no minor-piece swap on c6, there's no Ruy Exchange (said swap is how the variation gets its name). My hunch is that these seven games will transpose back into the main line and that White's apparent success in these seven games has nothing whatsoever to do with Black's choice of 1...Nc6. Let's follow the path of the 1722 games in which Black played 1...e5 and see:

MOVE | FREQUENCY | % | RESULT | ELO |
---|---|---|---|---|
2.Nf3 | 1722 | 100% | 0.54 | 2286 |

MOVE | FREQUENCY | % | RESULT | ELO |
---|---|---|---|---|
2...Nc6 | 1722 | 100% | 0.54 | 2295 |

MOVE | FREQUENCY | % | RESULT | ELO |
---|---|---|---|---|
3.Bb5 | 1729 | 100% | 0.54 | 2287 |

Ah-ha! Notice that when we went from 2...Nc6 to 3.Bb5 the total number of
games jumped back up to our original total of 1729. So the 1...Nc6 line *did*
transpose back to the main line as predicted, by way of the move order
1.e4 Nc6 2.Nf3 e5 3.Bb5. Same board position, different move order.
Ain't chess fun?
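
If you'd like to convince yourself that the two move orders really do land on the same position, here's a small Python sketch. The board is a bare-bones dictionary of squares, the helper names (`start_position`, `play`) are made up for this example, and it only handles plain from-square-to-square moves (no castling, en passant, or promotion), which is all these five half-moves need.

```python
def start_position():
    """Return the initial chess position as a {square: piece} dictionary."""
    board = {}
    back_rank = "RNBQKBNR"
    for i, file in enumerate("abcdefgh"):
        board[file + "1"] = back_rank[i]          # White pieces (upper case)
        board[file + "2"] = "P"
        board[file + "7"] = "p"
        board[file + "8"] = back_rank[i].lower()  # Black pieces (lower case)
    return board


def play(moves):
    """Apply simple (from_square, to_square) moves to the starting position."""
    board = start_position()
    for src, dst in moves:
        board[dst] = board.pop(src)
    return board


# 1.e4 e5 2.Nf3 Nc6 3.Bb5
main_line = [("e2", "e4"), ("e7", "e5"), ("g1", "f3"), ("b8", "c6"), ("f1", "b5")]
# 1.e4 Nc6 2.Nf3 e5 3.Bb5
nc6_first = [("e2", "e4"), ("b8", "c6"), ("g1", "f3"), ("e7", "e5"), ("f1", "b5")]

same_position = play(main_line) == play(nc6_first)
print(same_position)  # True
```

A tree that counts games by position rather than by move order would fold the seven 1...Nc6 games back in with the other 1722, which is presumably why the total climbs back to 1729 at 3.Bb5.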

So, going back to our stats for Black's first moves, we see that 1...Nc6 isn't inherently bad for Black after all, despite the 0.79 average that it receives. The four embittering losses, three draws, and no wins for Black in this line were just the breaks and had nothing to do with the opening move order.
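
For the curious, here's the arithmetic behind that 0.79 as a quick Python sketch. It assumes the RESULT column is White's score per game, with a White win counting 1, a draw 0.5, and a White loss 0. That formula and the `white_score` name are my assumptions for illustration, though they do reproduce the figure in the table.

```python
def white_score(white_wins, draws, black_wins):
    """White's average score per game: win = 1, draw = 0.5, loss = 0."""
    games = white_wins + draws + black_wins
    return (white_wins + 0.5 * draws) / games


# The seven 1...Nc6 games: four White wins, three draws, no Black wins.
score = white_score(4, 3, 0)
win_rate = 4 / 7

print(round(score, 2))     # 0.79
print(round(win_rate, 2))  # 0.57
```

Note that 0.79 is a scoring rate, not a win rate: the draws count for half a point each, so White actually won only four of the seven games.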

Let's forge ahead in our tree and see what happens next:

MOVE | FREQUENCY | % | RESULT | ELO |
---|---|---|---|---|
3...a6 | 1729 | 100% | 0.54 | 2296 |

MOVE | FREQUENCY | % | RESULT | ELO |
---|---|---|---|---|
4.Bxc6 | 1729 | 100% | 0.54 | 2287 |

After we click on 4.Bxc6, we see the previously referred-to defining position of the Ruy Exchange on our electronic chessboard. But now what will Black do?

MOVE | FREQUENCY | % | RESULT | ELO |
---|---|---|---|---|
4...dxc6 | 1696 | 98% | 0.54 | 2297 |
4...bxc6 | 32 | 2% | 0.52 | 2241 |
END | 1 | <1% | 0.50 | 2200 |

Some choices at last! Black has to recapture with a pawn or else lose the Knight for nothing. But what's with this "END" business? It designates that there is one game in the database that ended with 4.Bxc6 and never got to Black's fourth move. After clicking on the "Games" command, we see that the guilty parties are J. Brittner and A. Stock, who decided to call it a day after 4.Bxc6 at Mehlingen in 1992. Rather than pausing to ridicule them, let's move onward instead.

Looking at the stats, I see that 4...bxc6 isn't played very often (and I knew this from my outside reading anyway), so I've decided to check out the 4...dxc6 lines. I'll most likely return later and check out the variations after 4...bxc6.

After 4...dxc6, we see a bunch of choices for White:

MOVE | FREQUENCY | % | RESULT | ELO |
---|---|---|---|---|
5.0-0 | 1288 | 76% | 0.54 | 2290 |
5.Nc3 | 217 | 13% | 0.56 | 2297 |
5.d4 | 150 | 9% | 0.47 | 2274 |
5.d3 | 21 | 1% | 0.50 | 2234 |
5.h3 | 11 | 1% | 0.36 | 2200 |
5.Nxe5 | 6 | <1% | 0.33 | 2296 |
5.c3 | 2 | <1% | 0.50 | 1995 |
5.b3 | 1 | <1% | 0.50 | 2200 |

Whoa! Talk about having some options! How are we going to sort through this stuff?

It's actually not too tough. We'll use the process of elimination to decide
what to study, combining *CBTree's* statistics with our own personal
preferences as players. That way, we're using *CBTree* as a tool and not being
a slave to its numbers or relying on it to do all our work for us.

And the process of elimination is what we'll look at next week in *T-Notes*.