Ever since I started understanding science, two questions have always bugged me.
1. How does a newborn baby learn?
2. Who am I, and does everybody else feel like a ‘me’ the way I do?
These two questions made me promise myself … an impossible dream … ‘Coding a system that is conscious to some degree’. Not fake conscious. I mean really conscious !!
I’ve been observing people’s behavior, I’ve read and listened to as much material online as possible, and I’ve been looking at myself in the mirror, trying to understand what being ‘me’ means. I always try to see if I can find some inconsistency in the way the world is … maybe some back door that would let me into an understanding of consciousness.
At times, I just walk past things and look back to make sure that what I saw is still there.
With all that I’ve understood so far about consciousness, from what it could be to how it makes us us, I believe that the key to cracking consciousness is to first solve the problem of making a system self-aware, of making a system understand what I’m talking about when I talk to it.
Lemme elaborate. Here’s the way I see a solution to the problem. No matter how good our algorithms get at detecting this and that, and no matter how close our speech recognition and generation systems get to perfection, I strongly believe they are all no more than a talking parrot (with respect, I understand it’s not easy).
Our systems today do not understand anything besides which word to generate next given some context window. They do not understand what it means when they write “I sat in the car”. And this is so because our text processing and visual algorithms are not designed to work together. Think of an algorithm that can look at a car and write the text “car” in response .. mind that I’m not talking about predicting that it’s a car. I’m talking about having that unique representation/feeling/embedding, whatever you call it, for the word ‘car’.
The key problem to crack before tackling the question of consciousness, according to me, is to first make a system that gets image segmentation right. It shouldn’t need to be trained to identify objects (because I still don’t believe we’re trained to identify objects), only to be able to isolate objects so it can process them individually. I believe our stereo vision isn’t the reason we can isolate objects precisely; it’s something else. Because I don’t think the pixel intensities themselves carry any special meaning, the history does.
When it comes to vision, I believe our eyes have very little to do with what we see. E.g. my brain should have no way to tell something large and far away from something small and up close. The brain has no direct access to the physical world. Nor does the information come with an instruction manual. The brain doesn’t know what to do with it. It has no way of knowing the source of the information … so the information is meaningless, because it literally could mean anything.
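To make the large-and-far versus small-and-close point concrete, here is a tiny sketch. It assumes the simple geometric formula for the angle an object subtends at the eye, and the sizes and distances are made-up numbers of mine, not measurements.

```python
import math

def visual_angle_deg(size_m: float, distance_m: float) -> float:
    """Angle (in degrees) that an object of a given size subtends at the eye."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))

# Hypothetical numbers: a 10 m building seen from 100 m,
# and a 0.1 m toy seen from 1 m.
far_large = visual_angle_deg(10.0, 100.0)
near_small = visual_angle_deg(0.1, 1.0)

print(f"far and large:  {far_large:.2f} degrees")
print(f"near and small: {near_small:.2f} degrees")
```

Both come out at roughly 5.7 degrees, so the angle alone cannot say which of the two is out there. Something else, like memory, has to settle it.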
Anyway, let me get back. I know I’m drifting away from the topic. So if I can make a system that can isolate things precisely, my system (rather than learning to identify those objects) can just keep asking me what’s this and what’s that, and take a note of it. Later, when it is given some piece of text to read, it should be able to know what is meant by the word “car”.
If you don’t believe me, think of the example above. If my brain has no direct access to the physical world, how in the world can my brain still tell something large at a distance from something small up close to me !!!
How does the brain make meanings !!!
Memory !! right? .. it has its history, for that matter. So when you open your eyes and see, your interpretation of the world around you is simply a meaning that was once useful.
Anyway, once my system can take a note of the things it has seen and can hook their meanings to the text it has read, it can understand what is meant by the sentence “I sat in the car” !!! Maybe …
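Here is a very rough sketch of that ask-and-take-a-note idea. It assumes the hard part (isolating objects) is already solved, so the segment ids are just stand-in strings, and `GroundedMemory`, `note` and `ground` are names I made up for illustration.

```python
class GroundedMemory:
    """Remembers which isolated visual segments a human has named."""

    def __init__(self):
        self.notes = {}  # word -> list of segment ids that were given that name

    def note(self, segment_id: str, label: str) -> None:
        """Store the answer to 'what's this?' for one isolated segment."""
        self.notes.setdefault(label.lower(), []).append(segment_id)

    def ground(self, sentence: str) -> dict:
        """Return the words of a sentence that the system has actually seen."""
        words = [w.strip(".,!?\"'").lower() for w in sentence.split()]
        return {w: self.notes[w] for w in words if w in self.notes}

memory = GroundedMemory()
# Hypothetical answers collected while the system kept asking "what's this?"
memory.note("segment_001", "car")
memory.note("segment_002", "car")
memory.note("segment_003", "glass")

grounded = memory.ground("I sat in the car")
print(grounded)
```

This is only a lookup table, of course. It shows the bookkeeping, not the understanding; a real version would need the visual representation itself behind each segment id.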
I don’t really know how to code this for sure, but I absolutely absolutely absolutely love thinking about how we process what we see and possibly make a meaning out of it .. Wooow! even thinking about it makes me jump up !!!
Lemme please explain… Bear with me !!! That will be the end of this article, I promise.
Let’s assume that I’m sitting in my room right now. ‘The room’ is a container. I’ve a glass of water here. It’s a container. There are lots of containers around the world. Windows is a container, your body is a container and so on. These containers are called image schemas and they structure the space around you. They allow you to locate things in space relative to your body. That’s what allows you to talk about things being above, below, in front of or behind you.
Similar to image schemas, there are process schemas, made of neurons in your body, that tell you what a process is. Things like: what is the precondition of the process, what are you starting with, when did the process start, are you iterating it, is the process long or short, have you reached the end .. etc.
These schemas are computed by neural circuitry in your brain. Guess what !! Every single language in the world uses similar schemas to express things. Their computation/representation could be relative to subjects, objects, adverbs, etc., but the same image schemas show up in languages worldwide.
These containers, the so-called image schemas, also allow reasoning. If you know that you have a closet inside your house, and you put your keys in the drawer of your closet, it means that your keys are inside your house. That has to do with the logic of the container schema.
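The container logic from the keys example can be sketched in a few lines. This is just my toy version: the `inside` dictionary and the `is_inside` helper are made up for illustration, and each thing is assumed to sit directly inside at most one container.

```python
# Each thing points at the container it sits directly inside (toy data).
inside = {
    "keys": "drawer",
    "drawer": "closet",
    "closet": "house",
}

def is_inside(thing: str, container: str, inside: dict) -> bool:
    """Follow the chain of containers outward until we find the target or run out."""
    seen = set()  # guards against accidental loops in the data
    current = thing
    while current in inside and current not in seen:
        seen.add(current)
        current = inside[current]
        if current == container:
            return True
    return False

print(is_inside("keys", "house", inside))   # keys -> drawer -> closet -> house
print(is_inside("house", "keys", inside))   # containment does not run backwards
```

Nobody ever told the system that the keys are in the house. It falls out of the structure, which is the whole point of the schema.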
These schemas structure some bigger schemas called frames. Imagine your understanding of a hospital. It has doctors, patients, receptionists and nurses, there are operation theaters, there are instruments, and it has all the other things you know about a hospital..
Every frame is a structure like that. And if I say the word surgery, immediately everything about it will come up. Think of an algorithm that can pair visuals with text in this format. Suddenly you will know what to expect in your visual field… you will know that there is a surgeon, there is a patient, an MRI machine etc. Every word is relative to some frame.
Words activate frames, and frames are the ways in which you structure the world. We cannot think without frames.
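A minimal sketch of "words activate frames", assuming a frame is nothing more than a named set of things you expect to find together. The `FRAMES` and `WORD_TO_FRAME` tables are toy data I typed in from the hospital and surgery examples, and `activate` is a made-up helper.

```python
# Toy frames: each one lists what you would expect to find in it.
FRAMES = {
    "hospital": {"doctor", "patient", "receptionist", "nurse",
                 "operation theater", "instrument"},
    "surgery": {"surgeon", "patient", "MRI machine"},
}

# Which frame a word brings up (hypothetical and far from complete).
WORD_TO_FRAME = {
    "hospital": "hospital",
    "nurse": "hospital",
    "surgery": "surgery",
    "surgeon": "surgery",
}

def activate(word: str) -> set:
    """Return everything the frame behind a word tells you to expect."""
    frame_name = WORD_TO_FRAME.get(word.lower())
    return FRAMES.get(frame_name, set())

expected = activate("surgery")
print(sorted(expected))
```

Hearing the single word "surgery" hands the system a list of things to look for in its visual field before it has seen any of them.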
Another thing frames allow is the generalization of concepts. They allow us to understand things like metaphors. So if one says “we’re spinning our wheels in this relationship”, what happens? You’ve an image of a car, the car isn’t moving, the wheels are turning, you’re putting energy into getting it moving and it’s not working out.
You take this frame of a car and you superimpose it on the frame of relationships, and that makes you understand the meaning of the sentence “we’re spinning our wheels in this relationship”…How beautiful is that !!!! Wooooooooooooooooooooo .. just jams up my brain. Suppppppperb !!
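Superimposing one frame on another can be sketched as a role-to-role mapping. Everything here is my own toy encoding of the spinning-wheels example: the role names, the `CAR_TO_RELATIONSHIP` mapping and the `superimpose` helper are assumptions for illustration, not an established method.

```python
# What is going on in the car frame (source of the metaphor).
car_situation = {
    "vehicle": "stuck",
    "effort": "being spent",
    "progress": "none",
}

# Hypothetical mapping from roles in the car frame to roles in the relationship frame.
CAR_TO_RELATIONSHIP = {
    "vehicle": "relationship",
    "effort": "emotional effort",
    "progress": "progress toward shared goals",
}

def superimpose(situation: dict, mapping: dict) -> dict:
    """Carry what is known about each source role over to its target role."""
    return {mapping[role]: state
            for role, state in situation.items()
            if role in mapping}

meaning = superimpose(car_situation, CAR_TO_RELATIONSHIP)
for role, state in meaning.items():
    print(f"{role}: {state}")
```

The relationship frame ends up with "stuck", "being spent" and "none" without anyone describing the relationship directly. Finding the right mapping automatically is the genuinely hard part, and this sketch simply assumes it.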
If I can someday build a system this way, piece by piece.. I’m sure that there will be a time when my system will just keep talking to me, start asking me questions like a little boy and start understanding the whole world visually, associating it with concepts and texts.. and for every one of its mistakes, I can just talk to it and tell it that the interpretation was wrong and what the actual interpretation is .. and my system would adjust.
It should be way more efficient than training a model for every other thing, and more robust against being misled by false information (which, by the way, is the biggest problem with our current ML models. Give one a single carefully crafted image and it can be tricked into guessing it as something which it isn’t).
There’s a lot to talk about, and I do understand it’s a difficult problem to grasp and solve, but that doesn’t make me stop thinking about it at all. I want to try for it anyway … either I won’t be able to build it or I will .. what else can happen :).
Hoping to build one someday.
BTW, there’s one more thing that I’ve learnt to do during the pandemic. It’s called ‘Lucid dreaming’ 👻. Will definitely talk about it.. but later. Have a great day/night ahead.