This short video is about the problem of training an agent to program an execution unit that moves a piece on an infinite board under changing board rules.
The learning agent communicates with the environment through an interface
The environment contains an infinite board of square fields with a piece on it
Board rules allow shifting the piece only by a certain number of squares in lateral and vertical directions
The board rules change regularly
There is an execution unit in the environment
The execution unit may be programmed to move the piece
The problem is to train the agent to program and launch the execution unit each time the board rules change
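The setup listed above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not the environment used in the video: the names `BoardRules`, `ExecutionUnit`, and `write_program` are hypothetical, the rules are assumed to allow a shift of exactly `step_x` squares laterally or `step_y` squares vertically per move, and a target square is assumed as the goal. The hand-written `write_program` planner merely stands in for what the agent is supposed to learn.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Square = Tuple[int, int]  # (x, y) coordinates on the unbounded board
Move = Tuple[int, int]    # (dx, dy) shift of the piece


@dataclass(frozen=True)
class BoardRules:
    """Hypothetical rule set: one move shifts the piece by exactly
    step_x squares laterally or exactly step_y squares vertically."""
    step_x: int
    step_y: int

    def allows(self, move: Move) -> bool:
        dx, dy = move
        lateral = abs(dx) == self.step_x and dy == 0
        vertical = dx == 0 and abs(dy) == self.step_y
        return lateral or vertical


class ExecutionUnit:
    """Hypothetical execution unit: stores a program (a list of moves)
    and applies it to the piece, rejecting moves the rules forbid."""

    def __init__(self, rules: BoardRules) -> None:
        self.rules = rules
        self.program: List[Move] = []

    def load(self, program: List[Move]) -> None:
        self.program = list(program)

    def run(self, start: Square) -> Square:
        x, y = start
        for move in self.program:
            if not self.rules.allows(move):
                raise ValueError(f"move {move} violates the board rules")
            x, y = x + move[0], y + move[1]
        return (x, y)


def write_program(rules: BoardRules, start: Square,
                  target: Square) -> Optional[List[Move]]:
    """Hand-written stand-in for the agent: build a program that moves
    the piece from start to target, or return None if unreachable."""
    dx, dy = target[0] - start[0], target[1] - start[1]
    if dx % rules.step_x or dy % rules.step_y:
        return None  # target is off the lattice reachable under these rules
    nx, ny = dx // rules.step_x, dy // rules.step_y
    sign_x = 1 if nx >= 0 else -1
    sign_y = 1 if ny >= 0 else -1
    lateral = [(sign_x * rules.step_x, 0)] * abs(nx)
    vertical = [(0, sign_y * rules.step_y)] * abs(ny)
    return lateral + vertical


# The rules change, so the unit has to be reprogrammed each time.
for rules in (BoardRules(2, 3), BoardRules(1, 1)):
    unit = ExecutionUnit(rules)
    program = write_program(rules, (0, 0), (4, -3))
    if program is not None:
        unit.load(program)
        print(rules, len(program), unit.run((0, 0)))
```

Each rule change invalidates the previous program, which is why the agent has to write and launch a new one every time.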
The video below describes the communication cycle and the alphabet used to train the reinforcement learning agent.
Combinatorial Approach to Research and Learning
Combinatorial AI is an approach to building machines able to participate in research and development along with humans. A combinatorial agent proposes new models and tries to prove them on data when learning models under supervision or when doing something new and challenging. It starts from simple elementary models and combines them into more complex ones, learning which models to combine, for what purpose, and how.
Agent’s learning has two sides: the first is reading structured problem descriptions, and the second is building solutions. If trained steadily, step by step, the agent might learn the properties of individual items as well as of systems built from them, follow dependencies, and use formal languages to share knowledge with others. All these activities are manifestations of understanding, which is a feature of human thinking and a key to developing strong AI.
Combinatorial AI is similar to a child, as it needs to learn how to build complex systems from the very basics. It may need to learn math, physics, material properties, chemistry, and even common sense before it builds something really useful. This training scheme is no less important than the AI itself.
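The idea of combining elementary models and checking the result against data can be illustrated with a toy Python sketch. It is an assumption-laden illustration rather than the agent's actual method: the elementary models in `ELEMENTARY` and the helper names `apply_chain` and `find_model` are hypothetical, and the search is plain brute-force enumeration, so it does not learn which models to combine.

```python
from itertools import product
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical elementary models: simple functions on integers.
ELEMENTARY: Dict[str, Callable[[int], int]] = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}


def apply_chain(names: Sequence[str], x: int) -> int:
    """Apply the named elementary models to x, left to right."""
    for name in names:
        x = ELEMENTARY[name](x)
    return x


def find_model(examples: Sequence[Tuple[int, int]],
               max_depth: int = 3) -> Optional[Tuple[str, ...]]:
    """Enumerate chains of elementary models, shortest first, and return
    the first chain that reproduces every (input, output) example."""
    for depth in range(1, max_depth + 1):
        for names in product(ELEMENTARY, repeat=depth):
            if all(apply_chain(names, x) == y for x, y in examples):
                return names
    return None


# Data generated by y = 2 * (x + 1); the search recovers inc, then double.
print(find_model([(1, 4), (2, 6), (3, 8)]))
```

Exhaustive enumeration grows exponentially with chain length, which is one reason an agent would need to learn which combinations are worth trying.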
Proof of Concept: Artificial Agent Learning to Program
At the end of 2018, a proof-of-concept combinatorial agent was trained on a demo problem.
It learns to write and launch a simple but useful program in a formal language
The probability of success approaches 100% over time
The proof-of-concept runs on a laptop PC using one core of a 1.6 GHz CPU
The whole training takes about 15 minutes and about 10 MB of RAM