I once visited a great library with many, many books. For a moment, a bitter thought came to mind: “So much information, and all of it is disconnected. Words on pages are just blobs of ink, and only inside the human mind do they come together.” After this, I became convinced that we store information in the wrong way!
The internal model of the world lies at the core of intelligence (see “What is intelligence?”). It is somehow encoded and stored in neural activation patterns that we call memories. But how exactly is it stored? And how can we store information in a similar way?
Knowledge representation is one of the key problems in the artificial intelligence field.
After this article, you will know the basics of how computers encode information, how this differs from encoding in the brain, and why we may need to change the way we preserve information to create AI.
Basics of computer codes
80 years ago, when the first computers were created, people chose the most obvious and simple way to encode something: just assign a unique number to every object. Computers use binary numbers, so for N objects we need at least log2(N) bits. For example, the letters you are reading right now are enumerated, and each has its own code. Medium uses the common UTF-8 encoding, which can represent 1,112,064 symbols (Unicode code points). However, a symbol does not take a fixed 21 bits (log2(1112064) ≈ 20.1, rounded up). UTF-8 is a variable-length code: some symbols get shorter codes and some longer. For example, the letter “A” is encoded as 01000001 (1 byte, 8 bits), and the Ukrainian “ї” (as in the word naїve) as 1101000110010111 (2 bytes). The letter “A” has a shorter code because it belongs to the older ASCII set, which UTF-8 keeps at one byte and which covers the characters most common in English text and markup. Other symbols, like ख़, ௵ or emoji 😆😉, get even longer codes (3 or 4 bytes). Because the code length varies, text that consists mostly of such common characters can be written with fewer bits than a fixed-length code would need.
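To make the numbers above concrete, here is a small Python sketch that prints the UTF-8 bytes of a few of the characters mentioned. The helper name `utf8_bits` and the choice of sample characters are mine, just for illustration.

```python
import math

# Number of code points UTF-8 can represent, as quoted in the text.
N_SYMBOLS = 1_112_064

# A fixed-length code would need ceil(log2(N)) bits for every symbol.
fixed_bits = math.ceil(math.log2(N_SYMBOLS))

def utf8_bits(ch: str) -> str:
    """Return the UTF-8 encoding of a character as groups of 8 bits."""
    return " ".join(f"{byte:08b}" for byte in ch.encode("utf-8"))

# Sample characters, written as escapes to avoid any ambiguity.
samples = {
    "Latin A": "A",
    "Cyrillic yi": "\u0457",
    "Tamil year sign": "\u0bf5",
    "emoji": "\U0001F606",
}

print("fixed-length code:", fixed_bits, "bits per symbol")
for name, ch in samples.items():
    n_bytes = len(ch.encode("utf-8"))
    print(f"{name}: {n_bytes} byte(s) -> {utf8_bits(ch)}")
```

The code length grows from one byte for “A” to four bytes for the emoji, while the fixed-length alternative would spend 21 bits on every symbol.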
The same goes for color: each has its own number. But again, to save space, images are usually not stored as a raw collection of pixels. Otherwise, a relatively small image with a 1000x1000 resolution would “weigh” 3 megabytes (10⁶ pixels x 3 color channels x 8 bits per channel = 24 megabits). Thanks to information theory, people have created many codecs and formats to compress and decompress data, like .jpg, .png, .mp4, .mp3, .pdf, .zip… The more redundant (typical) the information is, as in text, the better it can be compressed.
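A quick way to check both the arithmetic and the point about redundancy is Python's standard `zlib` module. This is only a sketch: the repeated phrase and the random bytes are my own stand-ins for “redundant” and “non-redundant” data, not real images.

```python
import os
import zlib

# Raw size of a 1000x1000 image: 3 color channels, 1 byte (8 bits) each.
width, height, channels = 1000, 1000, 3
raw_bytes = width * height * channels        # 3,000,000 bytes = 3 MB
raw_megabits = raw_bytes * 8 / 1_000_000     # 24 megabits

# Redundant data: the same short phrase repeated many times.
redundant = b"the quick brown fox " * 500
# Data without exploitable structure: random bytes of the same length.
random_like = os.urandom(len(redundant))

packed_redundant = zlib.compress(redundant)
packed_random = zlib.compress(random_like)

print("raw image:", raw_bytes, "bytes =", raw_megabits, "megabits")
print("redundant:", len(redundant), "->", len(packed_redundant), "bytes")
print("random:   ", len(random_like), "->", len(packed_random), "bytes")
```

The repeated text shrinks to a tiny fraction of its size, while the random bytes barely shrink at all.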
Over many decades, computer scientists have created many efficient data structures. Each kind of data (strings, numbers, images…) has its own special and clever way to store and manipulate the information. Maybe the best known is the binary search tree, which allows a fast search for numbers. My favorites are the algorithms for strings, like pattern matching, prefix search, etc. (see a marvelous course on algorithms on Coursera). However, the brain, as it seems so far, has a common way to store any data. As if the brain is a universal data structure :)
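As a tiny taste of such specialized structures, here is a sketch of prefix search. Note the swap: instead of a binary search tree it uses binary search over a sorted list (Python's standard `bisect` module), which relies on the same ordering idea. The function name and the word list are made up for the example.

```python
from bisect import bisect_left

def prefix_search(sorted_words: list[str], prefix: str) -> list[str]:
    """Return all words that start with `prefix` from a sorted list."""
    # Binary search finds the first word that is >= prefix.
    start = bisect_left(sorted_words, prefix)
    matches = []
    # Words sharing the prefix sit next to each other in sorted order.
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches

words = sorted(["neuron", "code", "bread", "brain", "codec", "branch"])
print(prefix_search(words, "br"))
print(prefix_search(words, "cod"))
```

The trick works only because the data is strings kept in sorted order, which is exactly the kind of data-specific cleverness the paragraph above describes.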
Another key difference is that computers work sequentially. The processor, running at a frequency of several gigahertz, receives instructions on how to read, change, and store the data back to memory. We must admit that this architecture (processor, memory, input-output devices) is very successful. For many years processors have been getting faster and memory larger and more efficient, but the base architecture has remained the same.
However, the brain challenges this choice. It shows that parallel processing can be superior for many tasks, like object recognition, associative memory, planning… So, do we need a new architecture for AI? Not simply because “let’s do it like the brain”, but an architecture that would be a better fit for human tasks?
What will this new architecture look like? Nobody knows yet. Still, neuroscience has gathered a lot of information about the brain. Let’s try to select the most general brain encoding principles that might be used for new brain-like data structures.
Principles of information encoding in the brain
1. Brain integrates information
It is hard to explain what information is, but we can try to understand what integration is. Imagine you throw a die (the singular of dice) and it lands on one of its six sides. Let’s assume that each time it lands on “two”, the light in the room turns on. Through some unexplained causal link, two pieces of matter are connected: “die shows 2” and “lights on”. It appears that information in the brain is written in such a way as to preserve this connection. Neurons that encode the die state “two” are connected with neurons that encode “lights on”. Even if you step outside the room, you will be almost certain that you hold a light switch in your hand. The brain constantly integrates causes and effects; this helps to make predictions about the future and thus helps to survive. But it is not always good: sometimes a person perceives a mere correlation as cause and effect and lives with wrong predictions (or starts a superstition, like “a black cat has crossed your path, so you are in huge trouble”).
Our perception is sequential, so we link information in time. The states of the die (1–2–3–4–5–6) are joined in our minds because we saw them one after another. We know there are six sides and how each looks, how a die sounds when it hits the table, and that it is more likely to land on six if we blow on it :) Our memory is associative: every perception is encoded in the context of previous experience. That is how information becomes knowledge: it integrates with other information.
Long ago I couldn’t grasp the meaning of the phrase: “The memory and the processor in a computer are separated, but in the brain they are the same”. Now I know it just means that neurons both compute and store information. Billions of neurons work in parallel, and this is their main advantage over the sequential processor. Once some neurons respond to new information, other neurons connect with them and link it with previous experience. A computer, in contrast, just stores new data in a separate memory cell that is not connected with the rest of memory.
However, there is a caveat: reading data from neurons is very different from reading it in a computer. To recall the color of a lemon you need to activate the neurons that encode its name, or shape, or sour taste. Only then will they reactivate the neurons that encode its color. In a computer, you just need to know the address of the memory cell that holds the data, and reading it does not influence the neighboring cells at all. Information in the computer is not integrated, much like in books, and it requires dictionaries (addresses) to read.
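The contrast between the two kinds of reading can be sketched in a few lines of Python. This is a toy under my own assumptions (the class `AssociativeMemory`, its methods, and the addresses are invented for illustration), not a model of real neurons.

```python
# Address-based storage: a value is reachable only through its exact address.
memory = {0x10: "lemon", 0x11: "yellow", 0x12: "sour"}
color_by_address = memory[0x11]

class AssociativeMemory:
    """Toy associative store: features seen together become linked."""

    def __init__(self) -> None:
        self.links: dict[str, set[str]] = {}

    def store(self, features: set[str]) -> None:
        """Link every feature of one experience with all the others."""
        for feature in features:
            self.links.setdefault(feature, set()).update(features - {feature})

    def recall(self, cues: set[str]) -> set[str]:
        """Return the features that are linked to all of the given cues."""
        linked = [self.links.get(cue, set()) for cue in cues]
        return set.intersection(*linked) - cues if linked else set()

brain = AssociativeMemory()
brain.store({"lemon", "oval", "sour", "yellow"})
brain.store({"banana", "long", "sweet", "yellow"})

print(color_by_address)                    # needs the address 0x11
print(brain.recall({"sour"}))              # a taste cue brings back the rest
print(brain.recall({"yellow", "sweet"}))   # two cues narrow it down
```

In the dictionary, knowing “sour” gives you nothing unless you also know where the color is stored. In the associative version, any feature can serve as the way in.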
2. Brain encodes information in a distributed way
Let’s encode each state of the die with exactly one neuron. The neuron that encodes the side with two dots will be connected with the neuron encoding “lights on”. We activate the neuron “two dots” and it reactivates the neuron “lights on”. That is how we get a prediction of what “two dots” means. It is a simple way to explain how information is integrated, but it is wrong. In reality, each side of a die is encoded by dozens to thousands of neurons. One might think it is super wasteful to use many neurons when one can do the job. But look: when only one neuron encodes something, 1000 neurons can encode 1000 different things. This is called local coding. But if each thing is encoded by a group of active neurons, say 20, then the same 1000 neurons can encode C(1000, 20) ≈ 3.4 × 10⁴¹ different things (C(n, k) is the number of combinations, that is, the ways to choose k items out of n). This way of encoding is called distributed coding. It is much more interesting: such a huge capacity! It also has a nice bonus: if some neurons die or fail to become active, no worries, the other neurons will still encode the object.
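The capacity numbers are easy to check with Python's `math.comb`. The variable names are mine; the 1000 neurons and 20 active ones come from the example above.

```python
import math

n_neurons = 1000   # neurons available
k_active = 20      # neurons active for each encoded thing

# Local coding: one dedicated neuron per thing.
local_capacity = n_neurons

# Distributed coding: every choice of k active neurons is a distinct code.
distributed_capacity = math.comb(n_neurons, k_active)

print("local coding:      ", local_capacity)
print("distributed coding:", f"{distributed_capacity:.1e}")
```

The distributed count has 42 digits, compared with the 4 digits of the local one.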
However, mathematically, a network of 1000 binary neurons can encode as many as 2¹⁰⁰⁰ ≈ 10³⁰¹ different states. That is far more than with only 20 active neurons! Then why limit ourselves to a small group of active neurons? When the percentage of active neurons is small, the code is called a sparse distributed code. It appears that sparse coding makes it easier for other neurons to decipher the code. A neuron is less likely to confuse two sparse codes than two dense ones (where around 50% of the neurons are active).
How does this connect to the brain? Sparse distributed codes have been found in many brain areas. Most likely, our thoughts, plans, imagination, and many other things are encoded by sparsely active neural networks. If it is so fundamental, then maybe the computer should encode similarly?
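One way to get a feeling for why sparse codes are harder to confuse is to measure how much two random codes overlap. The sketch below is my own illustration with made-up function names; real neural codes are not random, so treat it as intuition only.

```python
import random

N = 1000  # number of binary neurons

def random_code(n_active: int, rng: random.Random) -> frozenset[int]:
    """Pick `n_active` distinct neurons out of N to be active."""
    return frozenset(rng.sample(range(N), n_active))

def mean_overlap_fraction(n_active: int, trials: int = 200, seed: int = 0) -> float:
    """Average share of active neurons that two random codes have in common."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = random_code(n_active, rng)
        b = random_code(n_active, rng)
        total += len(a & b) / n_active
    return total / trials

sparse = mean_overlap_fraction(20)    # 2% of the neurons are active
dense = mean_overlap_fraction(500)    # 50% of the neurons are active

print(f"2^{N} has {len(str(2 ** N))} decimal digits")
print(f"sparse codes share about {sparse:.0%} of their active neurons")
print(f"dense codes share about {dense:.0%} of their active neurons")
```

Two unrelated sparse codes share almost none of their active neurons, while two unrelated dense codes share about half, so a downstream neuron has far less room to tell dense codes apart.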
Current computer architecture is very different from the brain. Computers use dense codes. A memory cell with a size of 8 bits can store 2⁸ = 256 different codes. This gives a huge capacity but makes memory fragile. If even one bit is flipped, we receive completely different data. For example, 01000001 -> 01000000 changes A -> @. I was surprised to learn that such mistakes actually do occur (in RAM). Cosmic rays can interfere with electronics and change the memory state. The higher above the ground, the more severe the effect. Therefore, space missions require special attention. Over many years, people have developed many methods for detecting and correcting errors. Still, it seems nature uses a very different strategy, distributing information among thousands and millions of neurons.
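The bit flip from the example takes only a couple of lines to reproduce. I also added a parity check, which is the simplest error-detection method I know of; it is my choice for illustration and not necessarily what any particular memory chip does.

```python
def flip_bit(byte: int, position: int) -> int:
    """Flip one bit of an 8-bit value (position 0 is the lowest bit)."""
    return byte ^ (1 << position)

def parity(byte: int) -> int:
    """Return 1 if the number of set bits is odd, otherwise 0."""
    return bin(byte).count("1") % 2

original = ord("A")                  # 01000001
corrupted = flip_bit(original, 0)    # 01000000

print(f"{original:08b} -> {corrupted:08b}")
print(chr(original), "->", chr(corrupted))

# A stored parity bit reveals that something changed, but not which bit.
stored_parity = parity(original)
error_detected = parity(corrupted) != stored_parity
print("error detected:", error_detected)
```

A single parity bit can only detect an odd number of flipped bits; correcting them requires more elaborate codes.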
3. Brain encodes hierarchically
In a mysterious way, nature organizes matter into more and more complex forms: elementary particles form atoms, atoms form molecules and crystals, and these form all other things, like a teacup or a ball. Simple things create complex ones, and short sequences compose long ones. As if there are basic building blocks (dots, lines, figures, sounds, letters…) that form countless objects.
Experimental results strongly suggest that the primary sensory areas of the brain encode these basic building blocks. Higher neuronal levels integrate information from the primary areas and encode more and more complex things. Consider vision. Neurons in the primary visual area are sensitive to oriented lines and simple curves, and neurons in higher areas are selective for more complex objects, like faces or teacups. The same is true for the auditory cortex: short and simple sequences are encoded in the primary area, and the higher levels encode words and songs.
However, all this is far from being totally clear. It seems that encoding is not just a bottom-up process. There are many feedback connections to the primary areas, as if the higher areas constantly clarify what the lower areas are sending. The true role of this feedback mechanism is still unknown. There are many theories, but I half-jokingly decided for myself that the feedback is what creates my imagination :)
So, do not forget: the structure of information itself is hierarchical and compositional (parts form the whole). And the brain has adapted to reflect and leverage this. Interestingly, the algorithmic approach called sparse coding also tries to encode data as a combination of simple building blocks. For example, it can encode images no worse than JPEG 2000. I am pretty sure that in the future, a brain-like way of storage will dominate. Instead of sending a whole movie over the network, it may be better to send instructions on how to compose each picture from basic building blocks stored locally on every computer.
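Here is a toy version of “send the instructions, not the picture”. It uses matching pursuit, a simple greedy algorithm that I picked for illustration (practical sparse coding methods are more involved), and a hand-made dictionary of four building blocks that both sender and receiver are assumed to have.

```python
def dot(u: list[float], v: list[float]) -> float:
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

# Shared dictionary: four unit-length, mutually orthogonal building blocks.
ATOMS = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]

def matching_pursuit(signal: list[float], n_blocks: int) -> dict[int, float]:
    """Greedily describe `signal` using at most `n_blocks` building blocks."""
    residual = list(signal)
    code: dict[int, float] = {}
    for _ in range(n_blocks):
        # Find the block that best matches what is still unexplained.
        scores = [dot(residual, atom) for atom in ATOMS]
        best = max(range(len(ATOMS)), key=lambda i: abs(scores[i]))
        if scores[best] == 0:
            break  # nothing left to explain
        code[best] = code.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, ATOMS[best])]
    return code

def reconstruct(code: dict[int, float]) -> list[float]:
    """Rebuild the signal from block indices and their weights."""
    out = [0.0] * len(ATOMS[0])
    for index, weight in code.items():
        out = [o + weight * a for o, a in zip(out, ATOMS[index])]
    return out

signal = [2.0, 2.0, 1.0, 1.0]
code = matching_pursuit(signal, n_blocks=2)
print("instructions:", code)
print("rebuilt:     ", reconstruct(code))
```

Only two (index, weight) pairs travel instead of four raw values. With a richer dictionary and larger images, the savings are what make the idea attractive.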
4. Brain adapts to different kinds of information
Right in the retina of the eye, there are cells that encode movement, especially at the periphery (useful if someone wants to sneak up and attack you). As a result, at a higher level of the hierarchy, there are neurons that encode moving objects. In the hippocampus and entorhinal cortex, there are cells that encode your body position in the environment. Similarly, there are cells that encode head and eye direction. Any kind of information can be encoded in binary (the computer is the proof), but the hierarchical relationships differ from one kind to another. Therefore, different neural networks specialize in different kinds of information. Genetics helps with this, as if some neurons are born to encode certain things. Still, neurons are very adaptable. At an early age, while the plasticity window is still open, it seems neurons can adapt to anything.
Adaptation happens not only in response to perception: neurons always try to structure and rewrite information more compactly. The brain is more than just an input-output device; it is spontaneously active by itself. That is why our mind is always occupied with many thoughts. The brain replays experience, tests different hypotheses, and tries to predict the future. It is always hungry for work. Such a process helps to improve memory, making it more compact, more structured, and more accessible. It optimizes and extends our model of the world even without perception! Think of a theoretical physicist who uncovers the laws of nature with pen and paper.
The best optimization happens when we comprehend something. Comprehension integrates many pieces of a puzzle and throws away everything that does not fit. It is a cool mechanism to allocate resources effectively and reduce energy consumption. And it often produces the “Eureka” effect :)
Overall, the brain tries to uncover the hidden structure of information and save resources. What I like most is how the brain adapts to visual perception and memory. Images in the brain are encoded very differently from the computer or even artificial neural networks (like a convolutional NN). Visual features of objects (their building blocks) are encoded together with their locations. And it seems the brain uses many coordinate systems, relative either to other objects or to the human body. Currently, this is very poorly understood, but I speculate that it should help with object recognition, efficient memory allocation, and our creativity. Just mentally change the location of some feature (like putting the chair on your fridge), and you have created an entirely new object!
5. Brain filters what to store
Several hundred million sensory receptors are bombarded with gigabits of information every second. However, only a small part of it leaves a trace in memory. Memory capacity and the brain’s computational power are limited, so we need to use them wisely. You enter a kitchen and, instead of shelves, plates, and cutlery, you see the fridge door that you are about to open. Human perception is defined by goals. Goals are defined by needs and values, which differ from person to person. Two people will read this article and each will leave with a unique impression. Attention governs where to look, what to hear, and what to store. Nobody knows exactly how it works, but its role is huge!
As we grow and our model of the world becomes more or less mature, the brain filters information even better. We do not pay attention to things that we expect; that is why we do not see the other stuff in the kitchen. But a surprise quickly draws attention and updates the model. It is often the case that something is so different from the learned model that it is cheaper to ignore or reject it. Otherwise, you would need to heavily rebuild your world model. Maybe that is why speaking about religion or politics is so hard :)
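The “store only what surprises you” idea can be caricatured in code. Everything here (the `WorldModel` class, the kitchen items, treating expectation as a plain set of known things) is my own simplification and says nothing about how attention really works.

```python
class WorldModel:
    """Toy model: remembers what is usually seen in each place."""

    def __init__(self) -> None:
        self.expected: dict[str, set[str]] = {}
        self.stored_events: list[tuple[str, str]] = []

    def perceive(self, place: str, observations: set[str]) -> set[str]:
        """Store only the observations the model did not expect."""
        known = self.expected.setdefault(place, set())
        surprises = observations - known
        for item in sorted(surprises):
            self.stored_events.append((place, item))
        known |= surprises  # update the model with what was new
        return surprises

model = WorldModel()
first = model.perceive("kitchen", {"fridge", "shelves", "plates"})
second = model.perceive("kitchen", {"fridge", "shelves", "plates", "cat"})

print("first visit stored: ", sorted(first))
print("second visit stored:", sorted(second))
```

On the first visit everything is new and gets stored. On the second visit only the unexpected cat leaves a trace.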
You should not leave this article with the idea that our computers are bad and that brain-like computation requires a new architecture. An ordinary computer can perform any computation (it is Turing complete), and all five principles can be implemented on a home PC. Still, once we build a theory of neural architecture and brain-like data structures, we will probably use dedicated hardware. It would be much faster, smaller, and more energy-efficient. Someday, a brain-like computer the size of a tennis ball will be more intelligent than the whole of humanity (pure speculation :).
Certainly, these are not all the principles, but they are some of the most important. In the next article, we will look closer at sparse distributed codes and whether they could be used for a new neural data structure.