Advanced: Information Theory

From player TemplateRex's Stratego.com forum post

Introduction

Most of you know game theory and the concept of a Nash equilibrium. Another interesting mathematical approach to analyzing games such as Stratego is information theory. A central concept there is the Shannon information entropy.

Example: if a square is occupied by an unknown (and unmoved) piece, with prior probability p_i of being of type i (i = marshal ... flag), then the entropy is computed by the formula (see the linked Wikipedia article):

H = -sum_i p_i log2(p_i)

Here the sum is over all piece types, and log2 is the base-2 logarithm. The entropy counts the number of bits of information of the unknown piece. If each of the 12 piece types is equally likely, then p_i = 1/12 for all i, and H = -log2(1/12) = log2(12) = 3.585... To check this: 2^H = 12 again. In other words, the 12 unknown types can be encoded in 3.585 bits.
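As a quick numerical sanity check of the uniform case (a minimal sketch; the helper name `entropy_bits` is mine, not from the post):

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits, with the convention 0 * log2(0) = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# 12 equally likely piece types: H = log2(12)
uniform = [1 / 12] * 12
H = entropy_bits(uniform)
print(round(H, 3))    # 3.585
print(round(2 ** H))  # 12 -- the number of encodable types
```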

In the initial setup, the prior probabilities are constrained by the piece counts. For completely random setups, p_marshal = 1/40, p_general = 1/40, ... p_bomb = 6/40, p_flag = 1/40, and similarly for the other pieces. You can then compute H = 3.275 bits per square, or 131 bits for the whole setup (= 40 * 3.275).
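The 3.275 figure can be reproduced from the piece counts alone (a sketch; the counts are the standard 40-piece Stratego distribution):

```python
import math

# number of each piece type in a standard 40-piece Stratego army
counts = {"marshal": 1, "general": 1, "colonel": 2, "major": 3,
          "captain": 4, "lieutenant": 4, "sergeant": 4, "miner": 5,
          "scout": 8, "spy": 1, "bomb": 6, "flag": 1}

total = sum(counts.values())  # 40 pieces
probs = [n / total for n in counts.values()]
H = -sum(p * math.log2(p) for p in probs)

print(round(H, 3))       # 3.275 bits per square
print(round(H * total))  # 131 bits for the whole setup
```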

Gravon database

Now you can use the Gravon database to see how randomly actual players choose their setups. I used @Dobby125's excellent blog post for the setup statistics on 85K games. He has the probabilities for all 40 squares and all 12 piece types. You can then use these numbers to compute the entropy for each square, as well as for each piece.

Entropy per piece

First per piece: here's a table of the 12 pieces and their information entropy in the initial setup at Gravon.

rank     n   p      H      H*total  Gravon  Info   Info/n
marshal  1   0.025  0.133    5.32     4.58  0.74   0.74
general  1   0.025  0.133    5.32     4.51  0.82   0.82
colonel  2   0.050  0.216    8.64     7.82  0.83   0.41
major    3   0.075  0.280   11.21    10.89  0.32   0.11
captain  4   0.100  0.332   13.29    12.92  0.37   0.09
lieut    4   0.100  0.332   13.29    12.76  0.53   0.13
sarge    4   0.100  0.332   13.29    12.95  0.34   0.09
miner    5   0.125  0.375   15.00    13.49  1.51   0.30
scout    8   0.200  0.464   18.58    17.92  0.65   0.08
spy      1   0.025  0.133    5.32     4.37  0.95   0.95
bomb     6   0.150  0.411   16.42    15.24  1.18   0.20
flag     1   0.025  0.133    5.32     3.77  1.55   1.55
total   40   1.000  3.275  131.00   121.21  9.79   0.24

The first two columns give the piece rank and quantity "n". The column "p" is simply n / 40. Then follow the entropy H for random placement, the observed entropy in the Gravon database, the information revealed by the non-random placement, and finally the information per piece.

E.g. for a marshal, general, spy or flag, a completely random placement would have an entropy of 5.32 bits (= log2(40)). In the Gravon database, the entropy of these pieces was far lower. For the marshal, e.g., the entropy is 4.58. This means that the marshal is effectively placed on only 2^4.58 = 24 squares, instead of 40. The flag is placed on fewer than 14 squares (2^3.77). Funnily enough, the spy is more restricted than the general, which is in turn more restricted than the marshal.
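The "effective number of squares" 2^H can be tabulated directly from the observed entropies (a sketch using the Gravon column quoted above):

```python
# observed per-piece entropies (bits) from the Gravon column above
gravon_H = {"marshal": 4.58, "general": 4.51, "spy": 4.37, "flag": 3.77}

for piece, H in gravon_H.items():
    # 2^H = effective number of squares the piece is placed on
    print(f"{piece}: {2 ** H:.1f} effective squares out of 40")
```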

Note that for a total setup, on average about 10 bits of information are given away by the non-randomness of the setup. This is a factor of almost 1000 in the number of setups. Of course, there are strategic reasons for this (no flag in the front row, spy behind a lake, etc.).
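The factor-of-almost-1000 is just 2 raised to the revealed information (a quick check of the arithmetic, using the totals from the table above):

```python
revealed_bits = 131.00 - 121.21  # total info given away, from the table
print(round(revealed_bits, 2))   # 9.79 bits
print(round(2 ** revealed_bits)) # ~885 -- almost a factor of 1000 fewer setups
```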

Entropy per square

You can also do this analysis per square (using Dobby125's layout):

      A     B     C     D     E     F     G     H     I     J
10   2.83  2.75  2.83  2.86  2.87  2.87  2.86  2.82  2.75  2.83
 9   2.97  3.13  3.26  3.24  3.22  3.21  3.23  3.24  3.12  2.97
 8   3.16  3.11  3.38  3.37  3.19  3.13  3.38  3.39  3.09  3.10
 7   2.63  2.61  3.36  3.37  2.60  2.62  3.37  3.31  2.59  2.60
                 lake  lake              lake  lake

Each cell gives the entropy of the square's piece type. E.g. on A7, we have on average only 2^2.63 = 6.2 piece types. And indeed, in 94.9% of the setups, A7 is occupied by either a scout, lieut, sarge, captain, bomb or major. The content behind the lakes, e.g. on D7, is much better hidden: on average 2^3.37 = 10.3 piece types. And indeed, only the flag and spy have less than a 5% chance of being there.
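As a consistency check (a sketch; the grid values are copied from the table above), the per-square entropies should sum back to the whole-setup total of about 121 bits:

```python
# per-square entropies (bits), rows 7-10, columns A-J, from the grid above
grid = {
    10: [2.83, 2.75, 2.83, 2.86, 2.87, 2.87, 2.86, 2.82, 2.75, 2.83],
    9:  [2.97, 3.13, 3.26, 3.24, 3.22, 3.21, 3.23, 3.24, 3.12, 2.97],
    8:  [3.16, 3.11, 3.38, 3.37, 3.19, 3.13, 3.38, 3.39, 3.09, 3.10],
    7:  [2.63, 2.61, 3.36, 3.37, 2.60, 2.62, 3.37, 3.31, 2.59, 2.60],
}
total = sum(sum(row) for row in grid.values())
print(round(total, 1))  # ~121.2 -- matches the Gravon total up to rounding
```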

Applications

So what can a human player do with this? Not much, as most players can't do logarithms in their head. But for an AI it might be useful information to keep track of, and to place a value on (weighted per piece type; reducing uncertainty about the marshal is more valuable, of course). When a piece is revealed or excluded, its probability goes to 1 or 0, and the entropy becomes 0 (0 * log(0) is 0, when taking the appropriate limit). Perhaps someone with access to the Gravon game viewer source code can add a piece of code to keep track of this during a game (or maybe @Dobby125 can analyse this with his Python scripts).
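The bookkeeping suggested above could look roughly like this (a minimal sketch; the belief distribution is a hypothetical example, not Gravon data, and the 0 * log2(0) = 0 convention is handled explicitly):

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; terms with p = 0 contribute 0 by convention."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# hypothetical belief about one unknown enemy square
belief = {"scout": 0.5, "miner": 0.3, "bomb": 0.2}
print(round(entropy_bits(belief.values()), 2))  # 1.49 bits of uncertainty

# the piece attacks and is revealed as a scout -> certainty
belief = {"scout": 1.0, "miner": 0.0, "bomb": 0.0}
print(entropy_bits(belief.values()))            # 0.0 -- no information left
```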