April 29, 2011

Writing a Tic Tac Toe program using AI (Minimax)

tic-tac-toe

Most of us know about [tic tac toe][tick_tac_toe] game. If not, you might know this game in another name. I belonged to the second category. I had played this game many times in my childhood but with another localised name. This game said to be a simplest example of programming with a game tree. Tic-Tac-Toe also seems to be a common interview coding question for Software Engineer - Developer positions.

Let me give a brief idea of what is this game about. It consits of a board containing 9 cells with 3x3 (rowxcolumn). It is a two player game with each player assigned with a marker symbol (X or O). During the first turn the player mark the symbol X (Marker symbol corresponding to the player) to a cell among the available cells, and the second player will mark O (Second player’s symbol) to a cell among the available cells. The game continues until it reaches either one of the conditions: When one column, row or diagonal has X, The player assigned with X wins else if this state arrived for O, the player O wins. If the board contains no free cell left and none of the above conditions arrived, the Game ends with Draw.

Now, Our goal is to write a program to play this Tic-Tac-Toe. First we will write a program conists of Human-Human player game such that it can played between two friends. After writing Human-Human game, we can improve it by replacing one Human player with Artificial Intelligent Player. Writing an Artificial Intelligent Player is the most interesting part of the code.

Artifical Intelligence was one of my subjects in the Semester-8. Since I did not study MiniMax theorem in AI for examination because of the fact that the last minute study didn’t work. While writing the AI player program, I went through the MiniMax theorem and found it very interesting unlike classroom. I will describe more in the later part of this post. The AI player is a simplest application of MiniMax theorem. If teachers could show Tic-Tac-Toe program in the AI classroom, it would have simplified the task of teaching without leaving any complexity.

The complete code for the Tic-Tac-Toe in python is given below. Explanation for the components in the program are described as below.

GAME class

It consists of a class GAME. It is the major class which contains the states and methods required for the game. Initially it sets all the cells in the board (self.board) with ‘-‘ (blank). An empty list, lastmoves is initialized. self.lastmoves is a stack which consists of recently marked positions. It consists of print_board() function which is used to print the cells in the current board. get_free_positions() return a list of unmarked positions on the board. mark() function marks a position with a marker symbol. When it marks a position, it pushes that position to the lastmoves stack.

The revert_last_move() function reset the recent move (Recent marking). It makes use of lastmoves stack for resetting the last move by unmarking it.

The is_gameover() function returns True or False based on whether game is completed or not. A game completes when either of the players win or when the game reaches Draw. win_pos list consists of eight different three-cell combinations that corresponds to winning state. ie (first row, second row, third row, first col, second col, third col and two diagonals). This method sets the self.winner when game is over.

The play() method executes the game play. It takes two player objects as arguments. Each player has a method move() which performs that player’s move. Since there are nine cells in the game board, it executes the loop for a maximum of nine times. It calls the move() method for each of the players alternatively. On each move, it checks whether the game is completed or not.

We have two player classes Human and AI used in this program. Actually we can create a Base class Player and inherit child player classes, AI and Human and implement move() method individually. (Good OOP design). But I haven’t used any base class and inheritance to leave the code simple.

Human class - The Human player

Human class is a player attribute used in the play() function of GAME class. It has a constructor (init() ) and a method move(). When the object is initialized, it sets the marker for the player as self.marker = “X” and sets type of player (Human(H) or Computer(C)). Human class is meant for Human player. The move() function takes an instance of GAME class (current game). The method reads the next move position from the user and marks the position with player marker (X or O) if it is available.

See the lines which is at the end of the lines in the program: player1 = Human(“X”) player2 = AI(“O”)

Now replace player2 as Human (player2 = Human(“O”)) and execute the script. Now you have a two player Human-Human game. Try a game with your friend.

Next target is to code an Artificial Intelligent Player.

AI class - The computer player

We use our brain logic to play Tic Tac Toe game. Everybody plays to win the game. How do we code up an intelligent Player which plays to win. Winning can be achieved by foreseeing the moves of the opponent. If we can foresee all the possible moves of opponent until the end of the game, we can always choose the best move. That should be the basic idea of the intelligent program. Foresee or emulate the next states of game. The technique used for foreseeing and choosing the best move is called MiniMax theorem.

The AI class is similar to Human class. It consists of similar initialization part. But AI class has an additional attribute called opponentmarker which stores the marker for the opponent, which will be used for emulate marking of opponent. The AI class consists of the following methods: move() - Chooses the next move maximized_move() - returns the next best move (To reach winning game end) minimized_move() - returns the next best move for opponent (Means worst move for current player) get_score() - returns the status of game on end. Win, lose or draw.

The move() method calls maximized_move() which returns the best move position and marks that cell as next move. The get_score() function checks whether game has ended. If ended, it returns 1 if the AI player won, -1 if other player won or 0 if draw. These values are scores.

The major players of minimax theorem are Max functioning and Min functioning. Max function maximizes the possibility of win and Min function minimizes. Let us see what is written in the maximized_move(). It gets a list of all available moves and apply the moves one by one. Check if game has ended, if it is ended collect the score and check whether it is better than all moves applied, if it is the best one use that as bestmove. If game has not ended, call minimized_move() so that next opponent’s move is emulated.

The minimize_move() method does exactly same as maximized_move(), except it selects the worst score as best move. (That means, -1 score means opponent win, which is the best case for opponent. In minimize, we are emulating the best move of opponent). The minimized_move() calls maximized_move() for emulating the next state (The AI player’s state).

Hence once maximized is called, it will execute a stack of Max-Min-Max-Min-Max.. until the game over. In such way it find outs the best next move for the intelligent player. Hence we cannot beat the intelligent player. We can only defend until we reach a draw. At each state, it applies all the available moves and emulate the game with all possibilities to which it wins, and selects the best move. When the Max-Min-Max.. recursive call returns at each level, revert_last_move() method is used to reset the last move (Since it is an emulation and has to return to previous state after emulation outcome is obtained).

Now, try running the original program and play the game with Computer. After that change both players to AI and see what happens.

The given program applies Max-Min sequence until end of game for every move. It is very CPU expensive. If we need to reduce the intelligence to make it less CPU expensive, restrict the sequence of Max-Min calls to a restricted number of levels.

Hope you enjoy this post.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197 #!/usr/bin/env python
#Filename: tic-tac-toe.py

#Description: Tic-Tac-Toe two player game


class GAME:

    def __init__(self):
        '''Initialize parameters - the game board, moves stack and winner'''

        self.board = [ '-' for i in range(0,9) ]
        self.lastmoves = []
        self.winner = None

    def print_board(self):
        '''Print the current game board'''
        
        print "\nCurrent board:"

        for j in range(0,9,3):
            for i in range(3):
                if self.board[j+i] == '-':
                    print "%d |" %(j+i),
                else:
                    print "%s |" %self.board[j+i],
    
            print "\n",


    def get_free_positions(self):
        '''Get the list of available positions'''

        moves = []
        for i,v in enumerate(self.board):
            if v=='-':
                moves.append(i)
        return moves

    def mark(self,marker,pos):
        '''Mark a position with marker X or O'''
        self.board[pos] = marker
        self.lastmoves.append(pos)

    def revert_last_move(self):
        '''Reset the last move'''

        self.board[self.lastmoves.pop()] = '-'
        self.winner = None

    def is_gameover(self):
        '''Test whether game has ended'''

        win_positions = [(0,1,2), (3,4,5), (6,7,8), (0,3,6),(1,4,7),(2,5,8), (0,4,8), (2,4,6)]

        for i,j,k in win_positions:
            if self.board[i] == self.board[j] == self.board[k] and self.board[i] != '-':
                self.winner = self.board[i]
                return True

        if '-' not in self.board:
            self.winner = '-'
            return True

        return False

    def play(self,player1,player2):
        '''Execute the game play with players'''

        self.p1 = player1
        self.p2 = player2
    
        for i in range(9):

            self.print_board()
            
            if i%2==0:
                if self.p1.type == 'H':
                    print "\t\t[Human's Move]"
                else:
                    print "\t\t[Computer's Move]"

                self.p1.move(self)
            else:
                if self.p2.type == 'H':
                    print "\t\t[Human's Move]"
                else:
                    print "\t\t[Computer's Move]"

                self.p2.move(self)

            if self.is_gameover():
                self.print_board()
                if self.winner == '-':
                    print "\nGame over with Draw"
                else:
                    print "\nWinner : %s" %self.winner
                return

class Human:
    '''Class for Human player'''

    def __init__(self,marker):
        self.marker = marker
        self.type = 'H'
    
    def move(self, gameinstance):

        while True:
        
            m = raw_input("Input position:")

            try:
                m = int(m)
            except:
                m = -1
        
            if m not in gameinstance.get_free_positions():
                print "Invalid move. Retry"
            else:
                break
    
        gameinstance.mark(self.marker,m)
         
class AI:
    '''Class for Computer Player'''

    def __init__(self, marker):
        self.marker = marker
        self.type = 'C'

        if self.marker == 'X':
            self.opponentmarker = 'O'
        else:
            self.opponentmarker = 'X'

    def move(self,gameinstance):
        move_position,score = self.maximized_move(gameinstance)
        gameinstance.mark(self.marker,move_position)



    def maximized_move(self,gameinstance):
        ''' Find maximized move'''    
        bestscore = None
        bestmove = None

        for m in gameinstance.get_free_positions():
            gameinstance.mark(self.marker,m)
        
            if gameinstance.is_gameover():
                score = self.get_score(gameinstance)
            else:
                move_position,score = self.minimized_move(gameinstance)
        
            gameinstance.revert_last_move()
            
            if bestscore == None or score > bestscore:
                bestscore = score
                bestmove = m

        return bestmove, bestscore

    def minimized_move(self,gameinstance):
        ''' Find the minimized move'''

        bestscore = None
        bestmove = None

        for m in gameinstance.get_free_positions():
            gameinstance.mark(self.opponentmarker,m)
        
            if gameinstance.is_gameover():
                score = self.get_score(gameinstance)
            else:
                move_position,score = self.maximized_move(gameinstance)
        
            gameinstance.revert_last_move()
            
            if bestscore == None or score < bestscore:
                bestscore = score
                bestmove = m

        return bestmove, bestscore

    def get_score(self,gameinstance):
        if gameinstance.is_gameover():
            if gameinstance.winner  == self.marker:
                return 1 # Won

            elif gameinstance.winner == self.opponentmarker:
                return -1 # Opponent won

        return 0 # Draw

        

if __name__ == '__main__':
    game=GAME()     
    player1 = Human("X")
    player2 = AI("O")
    game.play( player1, player2)

Thanks to the Tic-Tac-Toe image which is licensed under CC (source: http://www.flickr.com/photos/public-domain-photo/4250653771/sizes/l/in/photostream/).