ELECTRONIC T-NOTES


CHESSBASE USA'S WEEKLY ON-LINE NEWSLETTER


FOR THE WEEK OF FEBRUARY 24, 2002


SETTING HASH TABLES

by Steve Lopez

Nearly all present day chess engines use hash tables. Another term for them is transposition tables. Chess programs that use hash tables create a temporary storage file as they play or analyze and store the positions they've evaluated in this file. If a position comes up later in the search (by a transposition of moves) that's the same as a position that's already been evaluated (and stored in the hash tables), they don't need to analyze it a second time -- they just pull the existing evaluation information from the hash tables and then move on to the next position.

Here's a simple example. Let's say it's White's turn to move in a particular position (the exact position doesn't matter for this example). Among White's candidate moves are Bg5 and Nc3. Black could reply to either move with a5 or Qd7. Let's say that a chess engine analyzes this position, sees the move order 1.Bg5 a5 2.Nc3 Qd7, and analyzes that position. Later in its search, it sees the move order 1.Bg5 Qd7 2.Nc3 a5 -- this results in the exact position as in the variation previously mentioned (through a transposition of moves). By referencing the existing analysis in the hash tables, the engine won't need to analyze the position again -- it'll just see that the position's already been evaluated, use that same evaluation, and advance to the next position to be analyzed. Later in its search, it hits the position after 1.Nc3 Qd7 2.Bg5 a5. Once again, this is the same position, so it uses the evaluation stored in the hash tables. And so on.

ChessBase engines allow you to configure the size of the hash tables (which are stored in your computer's RAM) in the engine selection dialogue. Hit the F3 key on your keyboard in ChessBase 8 or any of our playing programs (from Fritz6 [late 1999] onward -- this shortcut isn't present in older versions of our chessplaying programs), and you'll see the dialogue appear on your screen:


To set the hash table size, just type a value (given in megabytes of RAM ) in the box provided. The program will provide a "maximum" hash table size based on your computer's total RAM, the system resources currently being used, and any hash table size limits programmed into the engine.

It's crucial to set the hash tables to the proper value (which is frequently not the suggested "maximum"). Bigger is not always better; in fact, it can be downright harmful to an engine's performance. So how do you know what size to make the hash tables?

There's a mathematical formula you can use to determine hash table size. (I can hear you already: "Awwww, man...math! I hate math!" I know -- I do, too. But bear with me; this is actually pretty simple. You can do this with a third-grade math education and a pocket calculator. Trust me on this -- I'm The Original Doofus when it comes to numbers, but even I can do this). First we'll look at the formula, and then we'll explain it in everyday English:

HT[KB] = 2.0 * PFreq[MHz] * t[s]

If you were to recite this formula in English verbage, you'd read it like this:

Hash table size (in kilobytes) is equal to 2 times your processor speed (in megahurtz) times the average number of seconds the engine will analyze a move

It's really pretty sweet when you cut out all the algebraic nonsense.

Lets' see how this works in practical application. I have an 800 MHz processor. Two times 800 is 1600, so that's the number I'll plug into that part of the equation:

HT[KB] = 1600 * t[s]

Obviously, your processor may or may not run at the same clock speed, so just double your clock speed and slap that number into the formula. For example, if you have a 1.4 GHz processor (1400 MHz), you'd zap 2800 (2 times 1400) into the formula in the proper place.

The next step is to determine the average number of seconds per move. This is really easy if you're playing a casual (non-timed) game. You'd set your program for, say, 30 seconds a move and then pop that number into the formula. But what if you're playing a timed game? It's still easy. Here's how I determine it. Let's say I'm playing with a sixty minute time limit per player. That's 3600 seconds. I figure that the average chess game lasts about 50 moves, and the first ten moves or so will be coming right out of the program's opening book (no engine analysis required -- it's all pregenerated). So that leaves about 40 moves to be analyzed by the engine in the course of a "normal" game. Thus the engine will have to analyze 40 moves in 3600 seconds. 3600 divided by 40 equals 90, so I can expect the program to spend about 90 seconds (on average) analyzing each move. So I slap that number into the formula:

HT[KB] = 1600 * 90

Fischer time controls don't louse up the formula -- you just have to be more clever about your calculations. You can simply use the base time as I described above, multiply the added increment by 50 (remember, you're still getting time back, even when making moves out of the opening book, so you don't subract time for book moves when using a Fischer increment) and then add that number to the total number of seconds before you do the division. If you're using standard tournament time controls (such as 40 moves in 2 hours, followed by 20 moves per hour afterwards), just figure the number of seconds in a three hour game (10800) and divide it by the 60 moves in those two segments of the game (10800 divided by 60 = 180 seconds on average).

So far so good -- that's not too hateful, is it? Just be sure you have batteries for your calculator and you'll be all set. But there's another slight hitch that requires one more mathematical calculation. Let's look again at our formula so far:

HT[KB] = 1600 * 90

Doing the math, we see that my hash table size should be 144000. But note that the hash table size is given in kilobytes in the formula. The engine selection dialogue allows you to set the table size in megabytes. Whoops. Hey, don't panic. Just divide the number you get by 1000, and that's the number of MB you type in the hash table size box. In this case, 144000 divided by 1000 is 144, so I'd type "144" in the box provided. Tah-dah! This is the optimum, proper, correct, and all-around swell size for the hash tables using this time setting (game in 60) on my machine.

Why wouldn't you want to set the tables much larger or smaller? If you set them too small, they fill up with positions too quickly and then the engine has no more room to add new positions. This dumbs the program down a little, but isn't fatal. But setting the tables too large is a different story. It doesn't matter if you bought a RocketBoy 5000 computer with two gigs of RAM -- if you set the hash tables to 2 gigs for a sixty minute game using an 800 MHz processor, you're going to lobotomize your chess engine. It'll play like Jack Nicholson after they fried him in "Cukoo's Nest". The program will only be using a fraction of the available hash table space. Searching through all of that for a position will be like looking for a needle in a haystack; the program will blow a lot of time in fruitless searching, time better spent in analyzing. So if you must err when setting the hash tables, do it by making them too small.

Let's look at an illustrative example. A lot of people love to play blitz chess (which is like saying that a lot of people love ice cream; I have a natural gift for overstating the obvious), so let's use a blitz game on my 800 MHz machine for our next example. Five minutes equals 300 seconds. 300 divided by 40 is 7.5, which we'll round up to 8. So our formula ends up like this:

HT[KB] = 1600 * 8

making the size 12800. Divide this by 1000 to get 12.8, round it down to 12, and we find that 12 MB RAM is the optimum hash table size for a blitz game on my PC. That's a pretty far cry from 1 gig, eh?

You can also use this formula when setting up the hash table size in ChessBase 8. Of course, you're not playing a game in CB8, only analyzing a position. In this case, I'll assume that you have some idea of how long you want an engine to analyze a position; I typically let the engine run for a minute or so, but you may prefer to have it run longer. Just take the approximate number of seconds you typically allow the engine to run, plug it into the formula, and set the hash tables accordingly.

OK, so far so good. Now we get to the esoteric stuff. According to programmers, chess engines prefer to use RAM in doubling increments rather than using in-between values. There's a technical reason for this, but only smart guys named Franz, Bob, Mark, Mathias (with a varying number of t's), etc., truly understand it, so let's not go there. But if you prefer to follow their advice, there are certain RAM settings that work best for hash tables in chess engines:

1 MB
2 MB
4 MB
8 MB
16 MB
32 MB
64 MB
128 MB
256 MB
512 MB

and so on, in doubled increments. If the number you get from the formula falls between two of these increments, round down to the next lower increment from this list. However, you're really not drastically damaging anything by using an "in-between" value; I used 48 MB hash tables on a 64 MB RAM machine for a long time and still got great analysis from my chess programs. In fact, the "hashtable size" box of the engine dialogue pictured above has a pulldown mode (click the "upside-down triangle" immediately to the right of the white box) that gives you a variety of preset sizes to use with your chess engines, including some of the "in-between" increments.

There's one last trick you need to know to set up your hash tables. Windows uses a certain amount of RAM for the various .dlls and other bits required to run it. So does the user interface of your chess program. This isn't a ton of RAM, but it can detract from the amount of RAM available for hash tables. If the amount you've set cuts into the RAM that's being used by Windows, etc., you'll see a large amount of hard drive activity when your chess program starts using the hash tables; this can take a lot of time and severely affect degrade your chess program's performance. This is because your computer is taking some of the information already stored in RAM and setting it up in a temporary swap file on your hard drive (this accounts for the hard drive activity you notice) to make room for the hash tables.

However, there's a way to set up the hash tables in advance, so that the disk swapping doesn't occur during a game. Set your hash table size in the Engine dialogue and then click on "Infinite analysis" in the Game menu. If any disk swapping occurs it'll happen now, before the game begins. Depending on your available RAM and the hash table size you've set, you may see your hard drive light come on and hear the rattling as Windows sets up the temporary file on your hard drive. As soon as this stops, you'll see your chess engine start analyzing the position normally in the Analysis pane. At this point, you just go back to the Game menu and select "Infinite analysis" a second time to turn it off. Your hash tables are set up and you can now start your game.

How long can you expect this swapping to take? That depends on the amount of RAM your computer has and the size you've set for your hash tables. Back when I had a 64 MB RAM machine and I set the tables to 48 MB, the swapping took anywhere from five to seven minutes. However, on a 512 MB RAM machine, I can set the hash tables for 256 MB and the swapping (if any) takes only a second or two. So the time will therefore vary according to the desired hash table size and the available RAM.

Note that you don't have to use "Infinite analysis" to set up the hash tables before every game. You do this once, before your first game, and won't have to do it again unless you exit the program. If any swapping occurred when you set up the hash tables, you'll notice a much shorter amount of swapping occur when you exit the program. Windows is taking the information stored in the temporary file and putting it back into RAM, but this is a much faster process than taking info from RAM and putting it on the hard drive. As long as the program is running, the hash tables are set. But if you exit the program and later start it again, you'll need to set up the hash tables all over again.

This also applies to ChessBase 8. Hit ALT-F2 to start an engine's analysis. The first time you do this, you may see some hard drive activity and slow analysis as the disk-swapping occurs. But once the hard drive stops banging, your hash tables will be set for the remainder of that session in CB8.

Note that hash tables don't actually change a computer's analysis. A chess engine will give the same result after an eleven-ply search whether or not the hash tables are set correctly. The difference is that with a properly configured set of tables, the program will reach that eleven-ply depth more quickly. As a corollary of that, a chess program (while playing a game) will utilize its available time more efficiently with a properly configured set of hash tables. If the hash tables are set far too low, they'll fill up quickly and not allow the program the chance to store and use additional information. Set them far too large and the program will waste valuable time searcing through "empty space" for information that's not there (see my earler "lobotomy" comment).

Hopefully, this article hasn't been overly technical. The hash table formula may seem daunting at first, but it's actually pretty easy to use. Go ahead and give it a try; although the results will probabaly be transparent in most cases to the average user, there will be increased performance from the chess engine.

Until next week, have fun!

You can e-mail me with your comments, suggestions, and analysis for Electronic T-Notes.