AI Territory. New horizons
Do you have this feeling that we live in a fascinating time? That our future holds crazy and marvelous discoveries? That we will create artificial intelligence, uncover the mysteries of the brain, overcome our biological form, maybe even in our lifetime? I have this feeling too… Or… I had it. Some time ago I was infected with the disturbing and annoying idea that we are moving too slowly and in the wrong direction. Not that we need to abandon science and return to mother nature, no, our fate is to peel back the deepest riddles of the universe. But the idea that mainstream AI science asks the wrong questions, holds false values, pursues misleading goals… And it makes me fearful that I will live my life the same way as the two previous generations of AI researchers: believing that AI is around the corner but never truly seeing it. Because of this idea, I decided to start this blog. With this idea I will try to infect you, to ruin your bright view of the future. I do not want to make you pessimistic; rather, I want you to face the problems and start fighting.
My name is Viacheslav Osaulenko, and I am a Ukrainian scientist. Last year I defended a PhD in artificial intelligence. The usual path is to apply for a postdoc, sign a contract with some laboratory or university, and build up your profile. Get published, get cited, get recognized, get funded, get tenure. But my inner feeling tells me not to do this. The science culture of “publish or perish”, the deadlines that make papers superficial, the wild hunt for citations… I am not sure that I want to be part of it.
So I am exploring a different path, and this blog is part of it. With simple words and some math, I will show many promising ideas, exciting approaches, and interesting routes through the AI territory. The goal is to explore faster and to challenge the existing approaches.
Why? What is wrong with the current mainstream deep learning (DL)? One of its main strengths is end-to-end learning, which replaces the need for hand-crafted features. You just provide the right dataset, set up the architecture, the loss function, and the training algorithm (plus many engineering tweaks), and backpropagation does the rest. This strength, however, is also the source of a weakness.
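To make “end-to-end” concrete, here is a minimal toy sketch in PyTorch. It is my own illustration, not code from any system mentioned here: the dataset is random noise standing in for real data, and the architecture, loss, and optimizer are arbitrary placeholders. The point is only that no features are hand-crafted; backpropagation tunes every weight of the model at once.

```python
# A toy sketch of end-to-end learning with backpropagation (PyTorch).
# The "dataset" is random noise standing in for real data; nothing is
# hand-crafted except the choice of architecture, loss, and optimizer.
import torch
import torch.nn as nn

X = torch.randn(256, 20)            # 256 samples, 20 raw input features
y = torch.randint(0, 2, (256,))     # 2 classes

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)     # forward pass through the whole model
    loss.backward()                 # backpropagation computes all gradients
    optimizer.step()                # every weight is updated end-to-end
```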
End-to-end training leads to catastrophic forgetting, where new data overwrites previous experience. Why does this happen? The problem runs very deep and is buried in knowledge representation: the way information is distributed across neurons has to change. I do not argue in favor of symbolic AI; connectionism still has huge potential. But it has to evolve towards end-to-model-to-end learning. The central part must be the Model of the data: a compact model with invariances that preserves information across time and datasets. In other words, we need to add memory. And how this memory should be stored, manipulated, and retrieved is The Question.
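And to make the forgetting problem itself concrete, here is another hedged toy sketch, again my own illustration with made-up synthetic tasks rather than a result from the literature: a single network is trained on task A, then on a conflicting task B, and its accuracy on A collapses because the new gradients overwrite the weights that encoded the old task.

```python
# A toy illustration of catastrophic forgetting (PyTorch).
# Tasks A and B are synthetic Gaussian blobs with conflicting label layouts;
# training on B after A erases what the network learned about A.
import torch
import torch.nn as nn

def make_task(shift):
    # Two clusters of points; opposite shifts make the tasks contradict each other.
    x0 = torch.randn(200, 2) + torch.tensor([shift, 0.0])
    x1 = torch.randn(200, 2) + torch.tensor([-shift, 0.0])
    X = torch.cat([x0, x1])
    y = torch.cat([torch.zeros(200, dtype=torch.long),
                   torch.ones(200, dtype=torch.long)])
    return X, y

def train(model, X, y, steps=300):
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()

def accuracy(model, X, y):
    return (model(X).argmax(dim=1) == y).float().mean().item()

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))

XA, yA = make_task(shift=3.0)    # task A
XB, yB = make_task(shift=-3.0)   # task B, incompatible with A

train(model, XA, yA)
print("accuracy on A after learning A:", accuracy(model, XA, yA))  # close to 1.0

train(model, XB, yB)
print("accuracy on A after learning B:", accuracy(model, XA, yA))  # drops sharply
```

The details of the toy tasks do not matter much; some degree of this effect appears whenever tasks are learned strictly one after another, without replay or other countermeasures.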
This problem is well known in the community. The very people whose “fault” it is that deep learning became mainstream are talking about it. Y. LeCun says that all these AI technologies are good and useful, but unsupervised learning should be our target, calling it the “dark matter” of AI. G. Hinton is also deeply suspicious of backpropagation and thinks maybe we need “to throw it all away and start again”. Y. Bengio favors building causal models of the world, and in one of his interviews says: “what excites me the most now is sort of direction of research where we are not trying to build systems that are going to do something useful, we are just going back to principles about how can a computer observe the world, interact with the world and discover how that world works”. Do not get me wrong, all three of them recognize the importance of deep learning; otherwise, they would not be doing it. But they call for exploring different ways, not for being stuck in the current approaches.
Neuroscientists also point out the limitations of current AI. They do not even consider backpropagation a viable candidate for learning in the brain. Bruno Olshausen argues that perception is not a simple feedforward computation, but is goal-oriented and crucially depends on an internal model. Karl Friston formulated the free energy principle, where the goal is to minimize the difference between the model’s expectations and reality. Christos Papadimitriou distills many ideas from neuroscience into computation by assemblies of neurons, trying to build networks on different principles. Jeff Hawkins, a walking source of great ideas and inspiration, calls for reverse engineering the brain. And many, many more are exploring different approaches, but all of them share the same rhythm. Around this rhythm I will build my future texts: the rhythm of distributed, brain-like computation with memory.
I will write it again: we need to understand how memory is stored, manipulated, and retrieved! Is memory just a probability distribution? Or is the model a collection of programs that generate the data and the predictions? Does it need to discover invariances and hidden structure to compress information as much as possible? Will that help with catastrophic forgetting? Or could it reduce the need for huge datasets and megawatts of power? What about humans, how do we store data? When does information become knowledge?
These are only some of the questions that form the new horizons for AI: the next frontier, which should bridge the gap between biological and artificial systems and lead us to the abstract laws of computation that lie at the center of the AI territory. I believe this will speed up our progress towards a bright future.
I will explore these and many more topics in the following articles. I will do my best to keep the story consistent, so that each post continues the previous one. And you can help me write better with your advice and critique in the comments.
Remember, our future will not be bright by itself; we must fight for it.