Source: analyticsindiamag.com
Video games now play a crucial role in the development and evaluation of AI and ML models, and the approach has been around for decades. The Nimrod, a custom-built digital computer introduced by Ferranti in 1951, is the first known example of AI in gaming: it played the game of Nim and was built to demonstrate the machine's mathematical capabilities.
Gaming environments are now actively used for benchmarking AI agents because they yield clear, measurable results. In one of our articles, we discussed how Japanese researchers used the Mega Man 2 game to assess AI agents. There are several other well-known instances of researchers benchmarking AI with games, such as DeepMind's AlphaGo beating professional Go players and Libratus beating professional Texas Hold'em poker players.
In this article, let's take a look at another simple video game, Snake, and how machine learning algorithms can be applied to play it.
Snake is a classic video game that most of us have played at least once in our childhood. The player steers the snake to maximise the score by eating apples that spawn at random locations. The snake grows by one grid cell every time it eats an apple, and the only rule is that it must avoid collisions (with the walls or its own body) in order to survive.
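The rules above are simple enough to capture in a few lines. Below is a minimal sketch of the game loop; the grid size, class name, and direction encoding are illustrative assumptions, not taken from any particular implementation.

```python
import random
from collections import deque

class SnakeGame:
    """Minimal grid-based Snake, sketching the rules described above."""

    DIRS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

    def __init__(self, size=10, seed=0):
        self.size = size
        self.rng = random.Random(seed)
        self.snake = deque([(size // 2, size // 2)])  # head is snake[0]
        self.score = 0
        self.alive = True
        self._spawn_apple()

    def _spawn_apple(self):
        # Apples spawn at a random free cell.
        free = [(x, y) for x in range(self.size) for y in range(self.size)
                if (x, y) not in self.snake]
        self.apple = self.rng.choice(free)

    def step(self, direction):
        dx, dy = self.DIRS[direction]
        hx, hy = self.snake[0]
        new_head = (hx + dx, hy + dy)
        # Colliding with a wall or with the body ends the game.
        if (not (0 <= new_head[0] < self.size and 0 <= new_head[1] < self.size)
                or new_head in self.snake):
            self.alive = False
            return
        self.snake.appendleft(new_head)
        if new_head == self.apple:   # eating grows the snake by one cell
            self.score += 1
            self._spawn_apple()
        else:
            self.snake.pop()         # otherwise the tail follows the head
```

An agent then only needs to call `step` with a chosen direction each tick and read `score` and `alive` back as feedback.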
Researchers around the globe have been implementing various machine learning algorithms in this cult game. Below, we have mentioned a few implementations of neural network algorithms in the classic Snake game.
Snake Game Using Neural Networks & Genetic Algorithm
In a paper, researchers from the University of Technology, Poland, used a neural network that decides which action to take from any given input. The researchers call this network the DNA; the DNA class is the most important part of the snake, as it is the "brain" that makes every decision.
Each layer of the neural network is represented in the class by a matrix of weights and a separate bias vector. The next step is to create a function that measures a snake's performance, where performance accounts for both the number of moves the snake executed without dying and its score.
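The description above can be sketched as a small feedforward network plus a fitness function. The layer sizes, activation, and the fitness weighting below are assumptions for illustration; the paper's exact formula may differ.

```python
import numpy as np

class DNA:
    """Sketch of the "DNA" idea: one weight matrix and one bias vector per
    layer. The sizes (8 inputs, 6 hidden, 4 output moves) are illustrative."""

    def __init__(self, sizes=(8, 6, 4), rng=None):
        rng = rng or np.random.default_rng(0)
        # One (weights, bias) pair per layer transition.
        self.layers = [(rng.standard_normal((m, n)), rng.standard_normal(n))
                       for m, n in zip(sizes[:-1], sizes[1:])]

    def decide(self, inputs):
        """Forward pass; returns the index of the chosen move."""
        a = np.asarray(inputs, dtype=float)
        for w, b in self.layers:
            a = np.tanh(a @ w + b)
        return int(np.argmax(a))

def fitness(steps, score, step_weight=1.0, score_weight=100.0):
    # Hypothetical weighting: combines moves survived and apples eaten,
    # with apples counting for much more.
    return step_weight * steps + score_weight * score
```

Weighting apples far above raw survival steps discourages snakes that simply circle forever without eating.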
The Implementation
The researchers used a neural network with one hidden layer of six neurons, together with a genetic algorithm, to find out which method and parameters work best. First, they randomly generated a population of snakes, with 2,000 found to be an optimal size. They then let the snakes play in order to record how many steps each executed and how many apples it ate.
From these numbers they calculated each snake's fitness, which shows which snakes performed best and which should have a higher probability of being chosen for breeding. In the selection step, the researchers chose a pair of snakes (the parents) that pass their DNA to a new snake (the child), where the probability of being chosen is proportional to fitness. After choosing the parents, they crossed over the DNA by taking some of the weights from the father and some from the mother and applying them to the child.
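The selection and crossover steps described above can be sketched as follows. Here genomes are flat lists of floats for simplicity; a real implementation would operate on the network's weight matrices, and the 50/50 crossover split is an assumption.

```python
import random

def select_parent(population, fitnesses, rng=random):
    """Fitness-proportional (roulette-wheel) selection: a snake's chance of
    being picked as a parent is proportional to its fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0, total)
    acc = 0.0
    for genome, fit in zip(population, fitnesses):
        acc += fit
        if acc >= pick:
            return genome
    return population[-1]

def crossover(father, mother, rng=random):
    """Child takes each weight from either parent with equal probability."""
    return [f if rng.random() < 0.5 else m for f, m in zip(father, mother)]
```

Each new generation is then produced by repeatedly selecting two parents and crossing them over until the population is refilled.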
After selection and crossover, the next step is mutation, in which the neural network every new snake inherits from its parents is randomly perturbed. The playing, selection, and mutation steps are then repeated over generations to obtain the best results.
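A typical mutation step perturbs a small fraction of the inherited weights with random noise, which keeps the population exploring new behaviours. The mutation rate and noise scale below are illustrative hyperparameters, not values from the paper.

```python
import random

def mutate(genome, rate=0.05, scale=0.5, rng=random):
    """Perturb each gene with probability `rate` by Gaussian noise of the
    given scale; all other genes are copied unchanged."""
    return [g + rng.gauss(0, scale) if rng.random() < rate else g
            for g in genome]
```

With a low rate, most children closely resemble their parents, while the occasional perturbation lets the genetic algorithm escape local optima.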
Snake Game Using Deep Reinforcement Learning
In this research, the researchers developed a refined deep reinforcement learning model to enable an autonomous agent to play the classic Snake game, whose constraints become stricter as the game progresses. They employed a convolutional neural network (CNN) trained with a variant of Q-learning.
Further, they proposed a carefully designed reward mechanism to train the network properly, adopted a training-gap strategy to temporarily bypass training after the location of the target changes, and introduced a dual experience replay method that categorises different experiences for better training efficacy. According to the researchers, the experimental results showed that the agent outperformed the baseline Deep Q-Network (DQN) model and surpassed human-level performance in both game score and survival time.
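The dual-experience-replay idea can be sketched as two buffers that categorise transitions and contribute to each minibatch. The split rule (positive reward vs. other), the sampling ratio, and the class name below are assumptions for illustration, not the paper's exact design.

```python
import random
from collections import deque

class DualReplayBuffer:
    """Sketch of dual experience replay: keep rewarding transitions (e.g.
    those where the snake ate an apple) in a separate pool and draw each
    training minibatch from both pools."""

    def __init__(self, capacity=10000, good_fraction=0.5, seed=0):
        self.good = deque(maxlen=capacity)      # positive-reward transitions
        self.ordinary = deque(maxlen=capacity)  # everything else
        self.good_fraction = good_fraction
        self.rng = random.Random(seed)

    def add(self, transition, reward):
        (self.good if reward > 0 else self.ordinary).append(transition)

    def sample(self, batch_size):
        # Fill a fixed fraction of the batch from the rewarding pool, then
        # top up with ordinary transitions.
        n_good = min(int(batch_size * self.good_fraction), len(self.good))
        batch = self.rng.sample(list(self.good), n_good)
        n_rest = min(batch_size - n_good, len(self.ordinary))
        batch += self.rng.sample(list(self.ordinary), n_rest)
        return batch
```

Because apple-eating transitions are rare early in training, oversampling them like this can give the Q-network a stronger learning signal than uniform replay.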