Tetris AI Revisited | BASeCamp Programming Blog

One aspect I implemented into BASeTris on a lark was a simple AI. This is able to evaluate the positions a piece can go, decide on the “best” and then use the needed inputs to move that piece to the appropriate location/rotation.

Unfortunately, one of the downsides is that it isn’t super bright. It makes a lot of mistakes and chooses rather unusual places. I’ve attempted to tweak the values slightly but really don’t know whether it improves or makes it worse. Aside from having to sit and watch a game each time, I have no idea if the AI merely got different luck with the pieces or if it actually did better or worse. Basically, I’m not super happy with it’s performance and gameplay skills. It’s interesting to have the Player movements be computer-controlled, but when it constantly does dumb stuff, it’s a bit less impressive.

Recently it occurred to me that if I was able to create something which could “simulate” an entire game and then score the results based on how well the game was played, it could run through the AI and “evolve” the AI by mutating the constants that are used to try to improve it.

Scoring

First, I’ll revisit the scoring. For a given board, a Score is evaluated based on “scores” built from the height, bumpiness, number of holes, and whether any full rows are completed. The AI has constant factors that are used to multiply against each one to “weight” each one and come up with the final score. When attempting to find the best placement for a block, the AI will effectively take each possible position for the block and each possible rotation and come up with a score. One shortcoming of the current “test” algorithm is that it will not be able to ‘see’ under overhangs. for example a particularly tall wall with a T piece on top could be blocking a good move, but the AI would be unable to see it. (Basically, it only knows how to move a block and drop it, it doesn’t know how to manuever to get into particular locations nor even evaluate those locations.)

Basically the AI is given a piece- the next piece, and the current board state. It’s goal is to evaluate the possible places it could put the piece and determine which of those leaves the board in the best state. This is a rather naive implementation (it doesn’t really “prepare” the board for things like tetrises) and is aiming for very short-term goals. Technically because of the next queue, it would be possible to evaluate further, but of course each level of block we evaluate exponentially increases the test space, so I’ve not implemented that.

My idea was to implement a naive evolution algorithm. First step is to have it possible to “play a game” only within code, without any drawing or timing or waiting or any of that. It could just take each piece, find the best state, and move forward directly to that state and then evaluate again, and so on until it decides the game is over.

At that point we would be evaluating each “game”- we’d run for example 100 games for each AI mutation, and use the average “score” evaluated from each game- perhaps based on the number of lines and the number of pieces that were dropped, and such. Then comparing that score to the other scores from AI built in that generation, and taking say the top scorer and using it as a basis for a new generation of 100 mutations. If a generation doesn’t yield a better result than a parent than we discard it- (we are only going to look for a local maximum) and try that generation again. if a parent doesn’t yield a better child after some number of attempts we’ll determine that one as the “winner” mutation.

Of course, all of this is a relatively naive implementation of such AI; it isn’t able to look forward or backward or anything like that. An interesting experiment and learning experience may be to use a neural network to try to create a better AI, inputs being the current game field and the active block information, driving it with proper rewards (based on line clears) and it may be possible to get it to learn the game quite well. It will be necessary to make appropriate tweaks to the game to deal with that, as it would be playing the game itself and using key inputs rather than my naive “find the best piece” algorithm which doesn’t need a game to “run”, just have it play games over and over, instead of gameovering a new game starts.

Of course the AI itself will be nothing more than a curiousity- what the game really needs to utilize it as part of normal gameplay would be some kind of two-player mode. A second player could be a CPU–controlled adversary which hopefully with a Neural network implementation could present a good challenge.

Have something to say about this post? Comment!