Anything wrong with SCRABBLE tile frequency?
Tile Frequency
Have you ever thought there were too many vowels in a SCRABBLE® bag? Have your bingo aspirations been frustrated by drawing a V or W at the wrong time? A good case could be made for having five fewer vowels and just one V and one W.
When Alfred Butts created SCRABBLE® in 1938, he meticulously compiled the frequency of letters appearing on the front page of the New York Times. He then used this frequency distribution to decide how many of each letter should be in a game. Well, almost. If he had stuck strictly with the New York Times distribution, there would have been 9 S-tiles and no J, Q, X, or Z tiles. He deliberately set the number of S-tiles at 4 to make them very valuable, and he set the number of J, Q, X, and Z tiles at 1 each.
The frequency of letters in words found in the Official Word List (OWL), however, does not match the frequency derived from the 1938 New York Times. If the tile distribution were based on the OWL, there would be 5 fewer vowels and only 1 V and 1 W. There would be 2 more each of L and C, and 1 more each of R, M, and P. The tables below show the details.
Table 1. Number of occurrences in the OWL (considering only 2-8 letter words).
E: 66,293    D: 23,190    F: 8,660
S: 52,656    U: 20,725    K: 8,152
A: 45,617    C: 20,268    W: 6,749
I: 42,335    G: 16,690    V: 5,384
R: 40,543    P: 16,680    Z: 2,549
O: 34,625    M: 16,170    X: 1,878
N: 32,620    H: 13,787    J: 1,493
T: 32,325    B: 12,970    Q: 1,014
L: 31,456    Y: 10,362
Table 2. Tile distribution based on the frequencies shown in Table 1.*
A:  8    L: 6    D: 4    B: 2    F: 2    K: 1
E: 12    N: 6    G: 3    C: 4    H: 2
I:  7    R: 7    M: 3            W: 1
O:  6    T: 6    P: 3            V: 1
U:  4                            Y: 2
*Distribution assumes that 4 S-tiles and 1 each J, Q, X, and Z are a given.
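The step from Table 1 to Table 2 is simple proportional arithmetic, and it can be sketched in a few lines of Python. The article does not say how the fractional tile counts were rounded, so the rounding rule here (largest remainder: give every letter the whole-number part of its share, then hand the leftover tiles to the letters with the biggest fractional parts) is an assumption. The figure of 90 tiles is also an assumption: a standard 100-tile set minus 2 blanks, 4 S-tiles, and 1 each of J, Q, X, and Z. The `STANDARD` counts are the usual English-language set and are not taken from the tables above. The names `apportion`, `TABLE_1`, and `STANDARD` are only illustrative.

```python
# Letter counts from Table 1 (2-8 letter OWL words), leaving out the
# letters whose tile counts are fixed by hand: S, J, Q, X, Z.
TABLE_1 = {
    "E": 66293, "A": 45617, "I": 42335, "R": 40543, "O": 34625,
    "N": 32620, "T": 32325, "L": 31456, "D": 23190, "U": 20725,
    "C": 20268, "G": 16690, "P": 16680, "M": 16170, "H": 13787,
    "B": 12970, "Y": 10362, "F": 8660, "K": 8152, "W": 6749, "V": 5384,
}

# Assumption: 100 tiles - 2 blanks - 4 S - 1 each of J, Q, X, Z = 90.
TILES_TO_ALLOCATE = 90

# Standard English-language tile counts for the same 21 letters
# (not from the article's tables; included only for comparison).
STANDARD = {
    "A": 9, "B": 2, "C": 2, "D": 4, "E": 12, "F": 2, "G": 3,
    "H": 2, "I": 9, "K": 1, "L": 4, "M": 2, "N": 6, "O": 8,
    "P": 2, "R": 6, "T": 6, "U": 4, "V": 2, "W": 2, "Y": 2,
}


def apportion(counts, seats):
    """Share out `seats` tiles in proportion to `counts`.

    Uses largest-remainder rounding, which is an assumed rule; the
    article does not state how fractions were rounded.
    """
    total = sum(counts.values())
    # Whole-number part of each letter's share, using integer arithmetic.
    tiles = {letter: count * seats // total for letter, count in counts.items()}
    leftover = seats - sum(tiles.values())
    # Letters ordered by the fractional part of their share, largest first.
    by_remainder = sorted(
        counts, key=lambda letter: (counts[letter] * seats) % total, reverse=True
    )
    for letter in by_remainder[:leftover]:
        tiles[letter] += 1
    return tiles


proposed = apportion(TABLE_1, TILES_TO_ALLOCATE)

# Letters whose count differs from the standard set (proposed minus standard).
changes = {
    letter: proposed[letter] - STANDARD[letter]
    for letter in sorted(proposed)
    if proposed[letter] != STANDARD[letter]
}
print(changes)
```

Under these assumptions the sketch gives the same counts as Table 2, and `changes` shows the shifts described earlier: A down 1, I down 2, and O down 2 (five fewer vowels), V and W down 1 each, L and C up 2 each, and R, M, and P up 1 each. A different rounding rule could move a borderline letter such as I or F by one tile.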
If the SCRABBLE® tile distribution matched Table 2, I’d bet most games would have more bingos and that there would be significantly fewer cases of vowelitis. Do I advocate making a change? Do I advocate tilting at windmills? It’s like the QWERTY keyboard. Change it and increase typing speed by 20%. But it just ain’t gonna happen.
I’ll close this piece by presenting two more tables, which I find interesting. I asked myself: how would Table 2 change if the frequency of tiles were based on only the 2-5 letter words found in the OWL? Tables 3 and 4 provide an answer. I think the tables confirm something most of you probably already know. Es and Is are needed more for bingos than for short words. O, H, W, Y, and K are more readily used in short words than in bingos.
Table 3. Number of occurrences in the OWL (considering only 2-5 letter words).
E: 6,428    U: 2,499    K: 1,460
S: 6,236    D: 2,477    F: 1,228
A: 5,857    C: 2,004    W: 1,137
O: 4,479    P: 2,109    V: 665
R: 3,911    M: 1,998    Z: 370
I: 3,769    Y: 1,933    X: 318
L: 3,428    H: 1,740    J: 317
T: 3,320    G: 1,682    Q: 101
N: 2,876    B: 1,672
Table 4. Tile distribution based on the frequencies shown in Table 3.
A:  9    L: 6    D: 4    B: 3    F: 2    K: 2
E: 10    N: 5    G: 3    C: 3    H: 3
I:  6    R: 6    M: 3            W: 2
O:  7    T: 5    P: 3            V: 1
U:  4                            Y: 3
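To see at a glance which letters gain or lose tiles when only short words are counted, Table 4 can be subtracted from Table 2 letter by letter. The sketch below is plain Python; the names `TABLE_2`, `TABLE_4`, and `shift` are only illustrative, and the counts are copied from the two tables as printed.

```python
# Tile counts copied from Table 2 (based on 2-8 letter words).
TABLE_2 = {
    "A": 8, "E": 12, "I": 7, "O": 6, "U": 4, "L": 6, "N": 6, "R": 7,
    "T": 6, "D": 4, "G": 3, "M": 3, "P": 3, "B": 2, "C": 4, "F": 2,
    "H": 2, "W": 1, "V": 1, "Y": 2, "K": 1,
}

# Tile counts copied from Table 4 (based on 2-5 letter words).
TABLE_4 = {
    "A": 9, "E": 10, "I": 6, "O": 7, "U": 4, "L": 6, "N": 5, "R": 6,
    "T": 5, "D": 4, "G": 3, "M": 3, "P": 3, "B": 3, "C": 3, "F": 2,
    "H": 3, "W": 2, "V": 1, "Y": 3, "K": 2,
}

# Positive values: the letter earns more tiles when only short words count.
shift = {letter: TABLE_4[letter] - TABLE_2[letter] for letter in TABLE_2}

more_in_short_words = sorted(l for l, d in shift.items() if d > 0)
fewer_in_short_words = sorted(l for l, d in shift.items() if d < 0)

print("More tiles in Table 4: ", more_in_short_words)
print("Fewer tiles in Table 4:", fewer_in_short_words)
```

Both tables add up to 90 tiles, so every tile a letter gains is one another letter loses. Besides the letters named above, the comparison shows A and B picking up a tile in the short-word table, and N, R, T, and C giving one up along with E and I.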