patternpythonMinor
Simple 2048 AI in Python
Viewed 0 times
simplepython2048
Problem
A few weeks ago, I wrote a Python implementation of 2048. Since then, I've been working on a simple AI to play the game for me. The code uses expectimax search to evaluate each move, and chooses the move that maximizes the search as the next move to execute.
Similar to what others have suggested, the evaluation function examines monotonicity along a "snake path"; I also penalize boards that don't keep the largest tile at the snake path's terminal.
The AI can get to 2048 (and even 4096 occasionally), but not further. One of the things I've had difficulty with is tuning the evaluation function, and looking at what to try next. I have taken a look at awarding bonuses for zero tiles (for example,
The code for the game and the AI can be found here. You can run
Below is the relevant AI code:
```
from game import *
import math
def aimove(b):
"""
Returns a list of possible moves ("left", "right", "up", "down")
and each corresponding fitness
"""
def fitness(b):
"""
Returns the heuristic value of b
Snake refers to the "snake line pattern" (http://tinyurl.com/l9bstk6)
Here we only evaluate one direction; we award more points if high valued tiles
occur along this path. We penalize the board for not having
the highest valued tile in the lower left corner
"""
if Game.over(b):
return -float("inf")
snake = []
for i, col in enumerate(zip(*b)):
snake.extend(reversed(col) if i % 2 == 0 else col)
m = max
Similar to what others have suggested, the evaluation function examines monotonicity along a "snake path"; I also penalize boards that don't keep the largest tile at the snake path's terminal.
The AI can get to 2048 (and even 4096 occasionally), but not further. One of the things I've had difficulty with is tuning the evaluation function, and looking at what to try next. I have taken a look at awarding bonuses for zero tiles (for example,
z*log(s) where z is the number of zeros and s is the total sum of the tiles on the board; zeros become more valuable deeper into the game) but haven't had much luck improving things. Rather, I feel like I'm guessing at what to do next to improve the heuristic (or I'll look at a board that defied my human strategy, and try and write something to mirror this), and am curious what someone else might suggest. The code for the game and the AI can be found here. You can run
aiplay with a Game instance.Below is the relevant AI code:
```
from game import *
import math
def aimove(b):
"""
Returns a list of possible moves ("left", "right", "up", "down")
and each corresponding fitness
"""
def fitness(b):
"""
Returns the heuristic value of b
Snake refers to the "snake line pattern" (http://tinyurl.com/l9bstk6)
Here we only evaluate one direction; we award more points if high valued tiles
occur along this path. We penalize the board for not having
the highest valued tile in the lower left corner
"""
if Game.over(b):
return -float("inf")
snake = []
for i, col in enumerate(zip(*b)):
snake.extend(reversed(col) if i % 2 == 0 else col)
m = max
Solution
Don't use wildcard imports
PEP8 advises against this:
Rewrite like this:
Use
These
But they will all be evaluated, even if action is "left",
it will be checked against "right", "up", "down", pointlessly.
Avoid shadowing variables
In this code, the parameter
This can be confusing and can lead to errors.
Dealing with long lines
This is just a small tip.
I don't like using
I prefer to put the expression within parentheses,
which is another way of breaking a long line, like this:
Easier way to make deep copies
An easier way:
Other simplifications
This expressions can be written simpler:
When
the expression becomes:
When
the expression becomes:
Since the multiplication with
you can simply write:
The posted code doesn't really use a
It just takes the initial board out of the
but that's all the data a
and it's never used again.
All the calls involving
This is not a normal practice, and confusing.
As such, it would be better to move all the functions of
do
and call the methods like
You point out in a comment that you like to use a class for logical grouping purposes. But classes are not the only thing that can do that, packages can do that too, in fact it's a very good way of using packages.
However, the option is probably to keep the
but make its methods non-static,
and work with instances, instead of calling static methods.
Instead of manipulating a board directly,
let the
and hide within the details of all the operations on the board.
It will be good encapsulation.
For example,
the static call
PEP8 advises against this:
from game import *Rewrite like this:
import math
import itertools
from game import GameUse
elif for mutually exclusive ifsThese
if statements cannot all be true:if action == "left":
b = Game.left(b)
if action == "right":
b = Game.right(b)
if action == "up":
b = Game.up(b)
if action == "down":
b = Game.down(b)But they will all be evaluated, even if action is "left",
it will be checked against "right", "up", "down", pointlessly.
if action == "left":
b = Game.left(b)
elif action == "right":
b = Game.right(b)
elif action == "up":
b = Game.up(b)
elif action == "down":
b = Game.down(b)Avoid shadowing variables
In this code, the parameter
b of aimove is going to be shadowed by the parameters in fitness, and the many other methods where you reuse this name:def aimove(b):
# ...
def fitness(b):
# ...
def search(b, d, move=False):
# ...This can be confusing and can lead to errors.
Dealing with long lines
This is just a small tip.
I don't like using
\ for breaking new lines like this:alpha += .9 * search(c1, d - 1, True) / len(zeros) + \
.1 * search(c2, d - 1, True) / len(zeros)I prefer to put the expression within parentheses,
which is another way of breaking a long line, like this:
alpha += (.9 * search(c1, d - 1, True) / len(zeros) +
.1 * search(c2, d - 1, True) / len(zeros))Easier way to make deep copies
c1 = [[x for x in row] for row in b]
c2 = [[x for x in row] for row in b]An easier way:
from copy import deepcopy
c1 = deepcopy(b)
c2 = deepcopy(b)Other simplifications
This expressions can be written simpler:
(b2[3][0] != m) * abs(b2[3][0] - m)When
b2[3][0] is not equal to m,the expression becomes:
1 * abs(b2[3][0] - m)
abs(b2[3][0] - m)When
b2[3][0] is equal to m,the expression becomes:
0 * abs(b2[3][0] - m)
0 * abs(m - m)
0 * abs(0)
0 * 0Since the multiplication with
(b2[3][0] != m) doesn't change the outcome,you can simply write:
abs(b2[3][0] - m)Game shouldn't be a class, in its present formThe posted code doesn't really use a
Game instance.It just takes the initial board out of the
Game instance it receives,but that's all the data a
Game instance contains,and it's never used again.
All the calls involving
Game are static Game.some_function calls.This is not a normal practice, and confusing.
As such, it would be better to move all the functions of
Game to package scope,do
import game,and call the methods like
game.actions, game.over, and so on.You point out in a comment that you like to use a class for logical grouping purposes. But classes are not the only thing that can do that, packages can do that too, in fact it's a very good way of using packages.
However, the option is probably to keep the
Game class,but make its methods non-static,
and work with instances, instead of calling static methods.
Instead of manipulating a board directly,
let the
Game class contain the board,and hide within the details of all the operations on the board.
It will be good encapsulation.
For example,
the static call
Game.over(b) should be replaced with the instance method call game.over(), with no more board parameter.Code Snippets
from game import *import math
import itertools
from game import Gameif action == "left":
b = Game.left(b)
if action == "right":
b = Game.right(b)
if action == "up":
b = Game.up(b)
if action == "down":
b = Game.down(b)if action == "left":
b = Game.left(b)
elif action == "right":
b = Game.right(b)
elif action == "up":
b = Game.up(b)
elif action == "down":
b = Game.down(b)def aimove(b):
# ...
def fitness(b):
# ...
def search(b, d, move=False):
# ...Context
StackExchange Code Review Q#92488, answer score: 9
Revisions (0)
No revisions yet.