A neural network in a genetic algorithm develops survival behavior

I have been developing, in my leisure, an original simulation of a biology-inspired neural network (brain) evolving in a natural selection environment. Survival behavior is observed to emerge rapidly. This is not an effort to create an efficient learning system but an exploration of how learning mechanisms like backpropagation and reinforcement learning may emerge in biological brains. Below, I point out the salient features that set this neural network apart from traditional artificial neural networks. Thereafter, I discuss the behaviors I observe and various avenues that I think this research could take.

I conjecture that evolution sets policy gradients for (genetic) survival over multiple generations. These policy gradients are burned into the network topology. The network topology, in turn, directs (something like) reinforcement learning implemented on synaptic weights such that the individual brain learns to optimize reward during the course of one lifetime.

Code: github.com/souvik1982/Brain
Document: Evolving document


Bots with 30 neurons each search for and race after "food".
Shown here is their behavior between generations 1000 and 1100.

Punctuated equilibria leading to best possible 30 neuron brain for chasing food.

Each bot (blue circle) is controlled by a neural network (brain). The brain has 9 sensory neurons that take input from its field of vision, represented here as a grid in front of each bot. The brain also has 3 motor neurons that tell the bot to take a step forward, turn a degree to the left, and turn a degree to the right, respectively. The rest of the neurons in the brain (30 or so) mediate signals from the sensory to motor neurons. The connection topology of the brain is initially random. The neural network operates in time steps. Each neuron has a time-stepped model of signal integration, activation, leakage and refraction. Each synapse has a time-stepped model of Hebbian reinforcement and leakage. The initial behaviors of the bots are purely random and show no correlation between what they see and what they do. Every time a bot bumps into food (green), a daughter bot is generated with the same brain as its parent but with small random mutations in its connection topology. At the same time, the oldest bot in the group has to die. This simple evolutionary rule causes the brain topology, over a few thousand generations, to generate survival-like behavior as the bots search for and dart towards food ever more effectively.

The average time it takes for a bot to get to food decreases through punctuated equilibria over several generations.

Description of the Neural Network

Interesting results and observations

Future directions