If artificial intelligence (AI) is synonymous with "learning" today, just a few years ago things looked very different. Although it was possible to recognize printed characters, play chess or make medical diagnoses using logical inferences from expert systems, the AI of the time was laborious and limited, since it required manual programming.
The beginnings of autonomy
At the beginning of the 2010s, technical and algorithmic advances brought improvements in AI performance, in particular in machine learning: a process by which a computer improves itself based on the results it obtains while performing a task.
The most widely used machine learning technique, supervised learning, consists of providing the computer with a learning database built from classification models and labeled examples (e.g., the image of a tree is associated with the "tree" label). The computer can thus end up identifying elements by referring to the characteristics of the thousands or even millions of examples that make up its database.
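The idea of classifying a new input by comparing it to stored labeled examples can be sketched with a minimal nearest-neighbor classifier. The feature vectors and labels below are invented for illustration; real systems use far larger databases and richer features.

```python
# Minimal sketch of supervised learning: a 1-nearest-neighbor classifier.
# The labeled "database" pairs feature vectors with labels (e.g. "tree").
# All data here is made up for illustration.

def distance(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(example, labeled_data):
    # Return the label of the stored example closest to the input.
    features, label = min(labeled_data, key=lambda pair: distance(example, pair[0]))
    return label

# Toy labeled database: (features, label) pairs.
training_set = [
    ((0.9, 0.1), "tree"),
    ((0.8, 0.2), "tree"),
    ((0.1, 0.9), "cat"),
    ((0.2, 0.8), "cat"),
]

print(predict((0.85, 0.15), training_set))  # → tree
```

A new image is labeled by whichever stored example it most resembles; the more labeled examples the database contains, the finer the distinctions the classifier can make.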
Pattern recognition has also advanced: a classification approach that allows the computer to identify different types of computerized "patterns," not just visual ones – objects or images – but also sounds (speech recognition) and other data (medical information, satellite scans, etc.). The problem with pattern recognition is that designing a good feature extractor is difficult, and it has to be reworked for each new application.
Deep learning: a revolution
In the early 2000s, researchers Geoffrey Hinton, Yann LeCun and Yoshua Bengio decided to re-examine the potential of digital artificial neural networks, a technology that mainstream research had abandoned from the late 1990s to the beginning of the 2010s. The trio "invented" deep learning, now the most promising branch of AI, reviving interest in this field of technology.
Inspired by the functioning of the human brain, these artificial neural networks, optimized by learning algorithms (sets of rules), perform calculations organized in layers: each layer's results feed the next, hence the qualifier "deep." While the first layers extract simple features, subsequent layers combine them into increasingly complex concepts.
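The layered computation described above can be sketched as a tiny forward pass: each layer computes weighted sums of its inputs and applies a nonlinearity, and the output of one layer becomes the input of the next. The weights below are arbitrary illustrative values, not trained ones.

```python
import math

def layer(inputs, weights, biases):
    # One layer: a weighted sum per neuron, followed by a
    # sigmoid nonlinearity squashing the result into (0, 1).
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
        for row, b in zip(weights, biases)
    ]

# Two stacked layers: the first layer's output feeds the second ("deep").
x = [0.5, -0.2]                                            # input features
h = layer(x, [[0.4, 0.3], [-0.6, 0.8]], [0.1, -0.1])       # hidden layer
y = layer(h, [[0.7, -0.5]], [0.05])                        # output layer
print(y)
```

In a real network, a learning algorithm would adjust the weights and biases from data; here they are fixed, and only the layered structure is shown.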
The principle of this technology is to let the computer find by itself the best way to solve a problem from a considerable amount of data and indications concerning the expected result. Deep learning can use supervised learning as well as unsupervised learning.
The great revolution brought about by deep learning is that the tasks asked of the computer now largely rest on the same principles and algorithms. Whereas AI knowledge used to be subdivided into several types of applications studied in silos, efforts are now more concerted toward understanding the learning mechanisms themselves.
The 2011-2012 turning point
Five milestones for deep learning
- Graphics Processing Units (GPUs) capable of performing over a trillion operations per second become available for less than $2,000 per card. These very powerful specialized processors, initially designed for video game rendering, prove to be highly efficient for neural network calculations.
- Experiments conducted by Microsoft, Google and IBM, with the collaboration of Geoffrey Hinton’s lab at the University of Toronto, demonstrate that deep networks can halve the error rates of speech recognition systems.
- As part of Google Brain, a deep learning research project led by Google, AI manages to learn to “recognize” a cat image among 10 million digital images from YouTube.
- Google uses artificial neural networks to improve its speech recognition tools.
- Convolutional neural networks – inspired by the visual cortex of mammals – shatter records in image recognition by drastically reducing the error rate. The victory of Geoffrey Hinton's Toronto team at the prestigious ImageNet image recognition competition confirms the potential of deep learning. Most researchers in speech and vision recognition then turn to convolutional networks and other neural networks.
Massive investments from the private sector followed in the subsequent years.
What can a computer learn to recognize through deep learning?
- Visual elements, such as shapes and objects in an image. It can also identify the people in the image and specify the type of scene involved. In medical imaging, this can be used, for example, to detect cancer cells.
- Sounds produced by speech, which can be converted into words. This feature is already included in smartphones and digital personal assistants.
- The most common languages – to translate them.
- Elements of a game, in order to take part in it … and even win against a human opponent.