Homo sapiens has been striving to create artificial intelligence for at least 70 years. Someday we will succeed, and it will be our greatest invention. However, we still do not have a well-accepted definition of what intelligence is. Some say it is related to memory, to how much a person knows. Others say it has something to do with problem-solving. There is no clear answer. Wikipedia is no different: it describes intelligence as a collection of traits, in too many words…
Scientists cannot agree either, but perhaps the most widely supported definition is the following:
Intelligence is the ability to achieve goals in various environments.
Short and simple, but each word is chosen very carefully. Let’s scrutinize each one.
First, intelligence is an ability. All living creatures are able to achieve goals, so by this definition they are all intelligent, but each to an individual degree. This ability is quantitative (how well you perform), but very hard to measure.
The next word is the hardest: environment. No one truly understands it. For example, the environment for chess is the board of 8 by 8 squares, the pieces (16 per side), and the “laws” of movement. There are about 10⁵⁰ possible boards (valid positions of the pieces) and many more ways to play the game (around 10¹²⁰). It is impossible to write a program that wins just by searching through all possible combinations. It is physically impossible: our universe doesn’t have enough matter to build such a computer (no matter how quantum it is).
That is why everyone was so excited when a program defeated the best human player for the first time (Deep Blue vs Garry Kasparov in 1997). The same happened more recently with the harder game of Go (AlphaGo defeated Lee Sedol in 2016, in an environment with ~10¹⁷⁰ possible boards). In recent years, we have seen many examples where a program shows impressive results within its environment. One recognizes images, another makes fake images, another generates text. In every case, there is great performance within a narrow environment (that is why it is sometimes called narrow AI). If you change the chess environment, for example to “pawns move diagonally”, then the best program fails. It does not adapt as we humans do. People know about this limitation and try to make a universal program (MuZero is possibly the best we have for now). But still, it is far from what we want.
That is why the word “various” is in the definition. The same program should win at chess, drive a car, translate a text, and boil eggs. The more varied the environments it can perform in, the better it is. But in truth there is only one environment: the objective reality we call the Universe. What matters is the information the program receives from its sensors and the movements it can make in response. That is how our universe reduces to a chess environment.
The amount of information that even a simple creature perceives on land or at sea is vastly larger than in chess or Go. Let’s send a robot into a field and give it a megapixel (1000 by 1000) camera, grayscale only (8 bits per pixel, that is 2⁸ = 256 shades of gray). A single frame can take any of 256 to the power of 1000 × 1000 possible values, which is around 10²⁴⁰⁰⁰⁰⁰. That is far more than the 10¹⁷⁰ of Go.
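The size of that number can be checked in a few lines. This is only a sketch of the arithmetic; the function name `frame_count_exponent` is hypothetical and chosen for illustration. It counts the bits in one frame and converts the resulting power of two into a power of ten.

```python
import math

def frame_count_exponent(width, height, bits_per_pixel):
    """Return x such that the number of distinct frames is about 10**x."""
    total_bits = width * height * bits_per_pixel  # every extra bit doubles the count
    return total_bits * math.log10(2)             # 2**bits == 10**(bits * log10(2))

exponent = frame_count_exponent(1000, 1000, 8)
print(round(exponent))  # 2408240, i.e. roughly 10^2,400,000
```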
We would like the robot to have a memory, as every creature has. The most obvious way to save information is just to write down every pixel of each frame. Storing 10 frames would cost 10 megabytes. At roughly a dozen frames per second, the robot records about a terabyte of data by the end of the day. Enormous!
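Here is the storage arithmetic as a small sketch. The text does not state a frame rate, so the 12 frames per second used below is an assumption picked to show where a figure of about one terabyte per day can come from; `bytes_per_day` is a hypothetical helper.

```python
BYTES_PER_FRAME = 1000 * 1000    # one byte (8 bits) per grayscale pixel
SECONDS_PER_DAY = 24 * 60 * 60

def bytes_per_day(frames_per_second):
    """Raw storage needed for one day of uncompressed video."""
    return BYTES_PER_FRAME * frames_per_second * SECONDS_PER_DAY

print(10 * BYTES_PER_FRAME)      # 10000000 bytes, i.e. 10 megabytes for 10 frames
print(bytes_per_day(12) / 1e12)  # 1.0368 terabytes (12 fps is an assumed rate)
```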
We need to record smarter. How? We are lucky to live in a mysterious Universe with hidden gems. About 70 years ago Claude Shannon discovered one of them, and it was so huge that it started the information era and helped make modern computing possible. The gem is information theory, and it forms the foundation for numerous codecs that compress images, music, movies, text…
Interestingly, nature has found many more gems through evolution, and they are manifested in the brain. Neuroscience is now trying to uncover these hidden laws of nature.
Returning to the robot, let’s try to save the sequence (as if it were pixel values):
It appears that the sequence is random and we need to store every bit as it is.
Now let’s try again with another.
This one is more interesting. It has patterns, like 0101010 or 01110, which are also called regularities. Information theory says “regularity means compressibility”. The simplest way to save space is just to pick out the patterns and rename them with shorter sequences. As in the picture above, after renaming we have a shorter sequence without loss of information.
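The renaming idea can be sketched in a few lines. Everything below is illustrative: the sequence is a made-up one built from the two patterns mentioned in the text (it is not the sequence from the picture), and the codebook and function names are hypothetical.

```python
def rename_patterns(sequence, codebook):
    """Replace each known pattern with its short symbol."""
    for pattern, symbol in codebook.items():
        sequence = sequence.replace(pattern, symbol)
    return sequence

def restore_patterns(encoded, codebook):
    """Undo the renaming; works because the symbols never occur in the raw bits."""
    for pattern, symbol in codebook.items():
        encoded = encoded.replace(symbol, pattern)
    return encoded

# Hypothetical codebook: symbols must come from outside the {0, 1} alphabet.
CODEBOOK = {"0101010": "A", "01110": "B"}

sequence = "".join(["0101010", "01110", "0101010", "0101010", "01110"])
encoded = rename_patterns(sequence, CODEBOOK)

print(len(sequence), len(encoded))  # 31 5
print(encoded)                      # ABAAB
```

The saving is real only if the codebook is shared in advance or is small compared with the data, and the new symbols themselves cost more than one bit each. Practical codecs account for both.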
Our environment is full of such regularities: recurring data with dependencies and causal relationships. Each video frame is similar to the previous one. Even within a single frame, neighboring pixels are similar, and they form the edges and textures of objects. The same is true for chess: there are patterns of moves and positions that lead to winning. Through years of experience, a human player develops the ability to spot these hidden patterns and doesn’t need to probe every move.
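The similarity of neighboring pixels can be turned into savings by storing differences instead of raw values. The sketch below uses delta coding, which is one simple technique among many and not something the text prescribes. The pixel row is a hypothetical smooth gradient, and Python's standard `zlib` module stands in for a generic compressor.

```python
import zlib

def delta_encode(values):
    """Store each value as its difference from the left neighbor (mod 256)."""
    previous = 0
    deltas = []
    for value in values:
        deltas.append((value - previous) % 256)
        previous = value
    return bytes(deltas)

def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    previous = 0
    values = []
    for delta in deltas:
        previous = (previous + delta) % 256
        values.append(previous)
    return bytes(values)

row = bytes(range(256))       # hypothetical row: a gradient from black to white
deltas = delta_encode(row)    # a 0 followed by 255 ones

assert delta_decode(deltas) == row  # nothing is lost
print(len(zlib.compress(row)), len(zlib.compress(deltas)))
```

Every raw value in the gradient is different, so the compressor finds nothing to reuse. After differencing, the row is almost entirely the same number, and it shrinks to a small fraction of its raw size.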
Our brain constantly tries to find patterns, even if the data is random. It seeks to bring order into chaos. The brain adapts to the environment and strives to be in balance with it. For example, there is a visual illusion in which vertical lines look longer than horizontal ones (even though they are the same length). The illusion arises because our environment has more vertical lines, such as the edges of objects, and the brain adapts to that. At first this was a hypothesis, but people took photos, did the counting, and indeed there are more vertical lines!
Where do these patterns come from? From the laws of nature themselves. Objects are localized in space because atoms form electrical bonds. Living creatures have symmetry because it reduces the computational complexity of what DNA has to encode. Things fall down because of gravity. One-year-old children are surprised if a rolling ball does not fall off the edge. They cannot speak yet but already have a basic understanding of physical laws.
Everyone builds a model of the environment inside their head. We know the names of objects, where things are located, how to get from work to home. We learn causes and effects, how the world changes by itself or because of us. It is as if the brain constantly fills buckets labeled “what, where, why, when, how…”. Everything in our mental model is connected: you hear a word and dozens of associations pop up. Our brain has a rough model of the world, and we can manipulate it as we want.
This model does not record everything, just enough to survive and reproduce. Building the model is expensive; it requires a lot of time and computation. That is why the brain likes to simplify things and to use heuristics. Brain size and speed have limits: just 1.5 kilograms and, by a rough estimate, around 10¹⁵ operations per second. Luckily, we can create artificial systems that are much larger and faster. That is why people want to understand how to teach a machine to learn a model of the environment. With more resources, it will know more, comprehend more, and predict the future to an extent no human ever will.
The purpose of unsupervised learning (a part of the machine learning field) is to understand how to discover patterns in data. Unsupervised learning is extremely hard; we do not even know the right questions to ask. People explore various directions and make many attempts, but the problem still resists. Ultimately, the goal is to make a machine learn a compact and efficient causal model of the world. But no one knows how to do that. One of the central problems is knowledge representation: how information should be stored in a model. This question has remained unsolved for many decades. The usual ways we store information in computers do not work. People tried and got IBM Watson or the robot Sophia: a walking encyclopedia without a clue about what is going on around it.
For some time I thought that neural networks could form the model by learning the probability distribution of the data. The distribution contains all the information and can be compressed. Many people still believe this is the case. But there is something deeper… I lied to you earlier: the first sequence of seemingly random zeros and ones we tried to store was not random. Statistically it looks random, and randomness tests will confirm it. But actually, it is the binary representation (the first 128 bits) of the number pi (3.14159…). And we know many formulas that can generate this number. So it turns out that instead of saving the data, we can save the program that generates the data. And for a long enough sequence it will be much shorter.
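To make “save the program instead of the data” concrete, here is a sketch of one such generating program. It uses Machin's formula, pi = 4 · (4 · arctan(1/5) − arctan(1/239)), with integer arithmetic. The function names, the choice of formula, and the 32 guard bits are my assumptions for illustration. I also assume that “the first 128 bits” includes the leading 11 of the integer part.

```python
def arctan_inverse(x, scale):
    """Approximate scale * arctan(1/x) with the alternating Taylor series."""
    term = scale // x          # scale / x**1
    total = term
    x_squared = x * x
    n = 1
    sign = -1
    while term:
        term //= x_squared     # scale / x**(n + 2)
        n += 2
        total += sign * (term // n)
        sign = -sign
    return total

def pi_bits(count, guard=32):
    """Return the first `count` bits of pi in binary, integer part included."""
    scale = 1 << (count - 2 + guard)   # extra guard bits absorb rounding errors
    pi_scaled = 4 * (4 * arctan_inverse(5, scale) - arctan_inverse(239, scale))
    return bin(pi_scaled >> guard)[2:]

bits = pi_bits(128)
print(len(bits))   # 128
print(bits[:16])   # 1100100100001111
```

The program stays the same size whether you ask for 128 bits or a million, which is exactly why it beats storing the bits once the sequence is long enough.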
This idea originally appeared in the works of Solomonoff and, later, Kolmogorov. The complexity of any data can be measured as the length of the shortest program that generates the data (this is called Kolmogorov complexity). Maybe the brain does not just preserve information from the world, but searches for and saves the programs that generate this information? If so, it could save a lot of space and energy. What if nature, through evolution, has discovered how to store and run programs in living matter?
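Kolmogorov complexity cannot be computed exactly, but the output size of any real compressor gives a rough upper bound on it (up to the fixed size of the decompressor). The sketch below uses Python's standard `zlib` as that stand-in. The two test inputs and the helper name `compressed_length` are hypothetical.

```python
import random
import zlib

def compressed_length(data):
    """Length in bytes after zlib compression: a crude upper bound on complexity."""
    return len(zlib.compress(data, 9))

regular = b"01" * 500                  # 1000 bytes with an obvious pattern
rng = random.Random(0)                 # fixed seed so the run is repeatable
noisy = bytes(rng.getrandbits(8) for _ in range(1000))  # 1000 pattern-free bytes

print(compressed_length(regular), compressed_length(noisy))
```

The patterned input shrinks to a few dozen bytes, while the noisy one does not shrink at all. Note that a generic compressor would also fail on the bits of pi: it only finds repetitions, not the short program behind the data, so the bound it gives can be very loose.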
By the way, physics has the same goal: to find the programs (usually called laws) that describe the world as briefly and accurately as possible. Consider the movement of the planets in the night sky. The first program (named after Ptolemy) explained the data as complex movement on epicycles with the Earth at the center of everything. It was very complicated and not accurate. The next one, named after Newton, introduced universal gravity, which moves the planets around the Sun in elliptical orbits. The next program, named after Einstein, fixed the error in Mercury’s orbit and gives the most accurate predictions of the positions of the planets. This last point is important: a program does not simply describe the data but generates correct predictions of the future. The same goes for the brain; it constantly makes predictions.
The environment was the hardest word in the definition, so it took the most space. The next one is easier: what does it mean to achieve? Once we have a good model, achieving is not that hard. Iterate through the possible actions, get predictions from the model, and just pick the best one. The more actions you have and the faster you can iterate, the better the action you will select. But recall chess: actions are not random. A good model detects patterns in actions too and proposes the best ones first. For some animals, such actions are genetically predetermined; others refine actions through trial and error; and some, like humans, learn by imitation (the most advanced method so far).
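The “iterate, predict, pick the best” loop fits in one function. This is a minimal sketch with a toy world that I made up for illustration: the state is a position on a line, the model simply adds the step, and the score is the negative distance to a goal. None of these names come from a real library.

```python
def choose_action(state, actions, predict, score):
    """Try every action in the model and return the one with the best predicted outcome."""
    return max(actions, key=lambda action: score(predict(state, action)))

# Toy world (hypothetical): walk along a line toward position 10.
GOAL = 10

def predict(position, step):
    return position + step        # the "model": where would this step take me?

def score(position):
    return -abs(GOAL - position)  # the closer to the goal, the better

print(choose_action(3, [-1, 0, 1], predict, score))  # 1
```

A real agent would look more than one step ahead and would not try actions in arbitrary order, which is where the patterns in actions mentioned above come in.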
Consciousness is closely linked to action selection. The more actions you are conscious of, the more likely you are to pick the optimal one. This connects with free will: if you are conscious of only one action, you have nothing to choose from.
The model is at the core of decision-making. Preferably it should have a large memory and work fast. But model development and decision-making are not separate. Humans build the model through interaction with the world, which is called the action-perception loop. Think of kids who touch everything and break stuff just for fun.
The last and most important word in the definition is the goal. Where does it come from? For a machine it is easy: humans define the goals. For an animal, goals arise from needs: from simpler ones, like food and safety, to more complex ones, like socialization and self-realization in humans. Our needs define our behavior and goals, and the goals define the meaning of life. But what needs does a machine have? To get charged? Is it possible that someday such needs will arise spontaneously, like destroying the Earth? Can a machine find the meaning of its existence?
Not just a robot: even a human struggles with defining their true goals. For simple ones, like “go eat” or “go to sleep”, it is easy, but more complicated goals can be distorted, imposed, or simply unconscious.
Unfortunately, a lot of people are overly obsessed with intelligence. They think that you need to be super smart to be successful, to get into a top university, to get a good job, to earn money to fulfill all your needs. They put rational thinking first. I do not diminish its value; science and technology depend on it. But for humans, intelligence is not the most important thing. Read the definition again. Intelligence is the ability to achieve goals in various environments. For a human, defining goals matters most, and only then achieving them. Because what is the point of having strong intelligence if the goals you achieve are not yours but someone else’s, or simply the wrong ones?
But if your goals are truly yours, then intelligence comes in very handy :) And if the goals are super hard, like colonizing the galaxy, then you need a superintelligence. And that is why we need intelligent machines: to help humanity achieve its goals.