Our senses differ in their details. The features that represent audio signals effectively are different from those that represent vision. More generally, we’d expect any sensor type we give to a new Creature to need its own specific way of representing the data it receives from the world, and these representations would differ from sensor to sensor.
The different actuators in our Creature should also get different representations. For example, the motor signals that drive the bulk movement of the Creature (say its wheels) probably have a very different character from those that drive fine motor skills (such as the movement of fingers).
In the DBM framework, there is a natural way to handle this. The approach is called a Locally Connected Deep Boltzmann Machine, or LC-DBM. We briefly encountered this concept in an earlier post, where it made an appearance in this figure (the picture on the right).
Let’s see if we can build an interesting LC-DBM that we can run in hardware.
Embodying Cid in a robot
Imagine we have a robot with two motors. One controls the back left wheel, and the other controls the back right wheel. The robot has a castor in front that’s free to rotate, so we get rear-wheel drive. We’ll assume for this first experiment that these motors have only two settings: off (0) and forward (1). This is not all that restrictive. While there are some things this type of Creature can’t do (reversing, for example), he will be able to get to most places using just these movements.
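To make the four possible motor states concrete, here is a minimal sketch of what each pair of wheel settings does to the robot. The function name and the motion labels are made up for illustration; only the 0 = off, 1 = forward convention comes from the description above.

```python
# Hypothetical helper: maps the two binary wheel commands to a coarse motion.
# Convention from the post: 0 = off, 1 = forward.
def describe_motion(left_wheel: int, right_wheel: int) -> str:
    """Return the coarse motion produced by a pair of binary wheel commands."""
    if left_wheel not in (0, 1) or right_wheel not in (0, 1):
        raise ValueError("wheel commands must be 0 (off) or 1 (forward)")
    motions = {
        (0, 0): "stay still",
        (1, 1): "drive forward",
        (1, 0): "pivot right",  # only the left wheel pushes, so the nose swings right
        (0, 1): "pivot left",   # only the right wheel pushes, so the nose swings left
    }
    return motions[(left_wheel, right_wheel)]


if __name__ == "__main__":
    for left in (0, 1):
        for right in (0, 1):
            print(left, right, describe_motion(left, right))
```

Chaining pivots and forward moves is what lets the Creature reach most places even without a reverse setting.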
We’ll give each of these motors visible units corresponding to two successive times. Now that we’re starting to think about embodiments, what these times are in actual physical units becomes important. In this case, we’ll set the interval between times to be much longer than the response time of the motors — say 450 ms.
The DBM we’ll start with to represent the two motors at two times will look the same as the one we used in experiment #3. Here it is.
This Creature is also going to be equipped with a camera, so we can also have vision neurons. A typical camera that you’d mount on a robot provides a huge amount of information, but we’re going to start off by using only a tiny fraction of it, and in a particularly dumb way. We’ll take the images coming in from the camera and separate them into two regions: the left and right halves of the full image. We’ll average all of the pixels on each side and threshold the result: an average intensity of 128 or higher means 1 (bright), and anything lower means 0 (dark). This mimics the thresholded photodetector ommatidia idea we discussed a couple of posts back, although now we have two of them, one for the left side of the creature’s vision and one for the right side.
Again we’ll have two successive times. Typical cameras provide around 30 frames per second, which is a lot faster than the time step we set for the motors. So we’ll average the camera results over 15 frames (about half a second at 30 frames per second), so that the time step for vision roughly matches the 450 ms we chose for the motors. Again, this is not the smartest thing we could do, but we can improve it later! With these choices, here’s the DBM we will use for the vision system.
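As a side note on the frame averaging just described, here is one way it could look in code. It assumes we already have a (left, right) pair of mean intensities for each frame, and that we average intensities over the window before thresholding; the post does not say whether averaging happens before or after the threshold, so that ordering is my assumption. All names are hypothetical.

```python
# A sketch under stated assumptions: input is a list of per-frame
# (left_mean, right_mean) intensity pairs on a 0..255 scale.
def window_bits(per_frame_means, threshold=128):
    """Average the per-frame means over one window, then threshold each side."""
    count = len(per_frame_means)
    left_avg = sum(left for left, _ in per_frame_means) / count
    right_avg = sum(right for _, right in per_frame_means) / count
    return (int(left_avg >= threshold), int(right_avg >= threshold))


def stream_to_bits(per_frame_means, frames_per_window=15, threshold=128):
    """Split a frame stream into consecutive windows and emit one bit pair each.

    Trailing frames that do not fill a whole window are dropped.
    """
    bits = []
    last_start = len(per_frame_means) - frames_per_window
    for start in range(0, last_start + 1, frames_per_window):
        window = per_frame_means[start:start + frames_per_window]
        bits.append(window_bits(window, threshold))
    return bits


if __name__ == "__main__":
    stream = [(255.0, 0.0)] * 15 + [(0.0, 0.0)] * 15
    print(stream_to_bits(stream))  # one bit pair per window
```

Two consecutive windows give the two successive times that the vision DBM needs.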
Now let’s equip our Creature with a speaker and microphone. As with the vision system, an audio system we can mount on a robot can provide very rich data, but we’ll ignore most of it for the time being. By analogy with the simple vision system, let’s again choose two audio neurons. This time, instead of thresholding the intensity of the left and right halves of the incoming images, we’ll threshold the intensity of two different frequencies, one low (100 Hz) and one high (1000 Hz). Each input will be 0 if the Fourier component of the signal at that frequency, taken over a 450 ms window, is below a threshold, and 1 if it’s above. The idea is that if a frequency is present, the corresponding audio neuron will be on; otherwise it will be off.
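Here is a rough sketch of how the two audio bits might be computed from one window of samples. It evaluates a single Fourier component directly at each frequency rather than running a full FFT. The sample rate, the threshold value, the amplitude normalization and the function names are all assumptions of mine; the post only fixes the two frequencies and the 450 ms window.

```python
import math


def tone_magnitude(samples, sample_rate, freq_hz):
    """Magnitude of the Fourier component at freq_hz, scaled so that a pure
    sine of amplitude A spanning whole cycles gives approximately A."""
    if not samples:
        raise ValueError("need at least one sample")
    real = 0.0
    imag = 0.0
    for n, x in enumerate(samples):
        angle = 2.0 * math.pi * freq_hz * n / sample_rate
        real += x * math.cos(angle)
        imag -= x * math.sin(angle)
    return 2.0 * math.hypot(real, imag) / len(samples)


def audio_bits(samples, sample_rate, threshold, freqs=(100.0, 1000.0)):
    """Return (low_bit, high_bit): 1 where the component exceeds the threshold."""
    return tuple(
        int(tone_magnitude(samples, sample_rate, f) > threshold) for f in freqs
    )


if __name__ == "__main__":
    sample_rate = 8000                         # hypothetical sample rate in Hz
    window = sample_rate * 450 // 1000         # samples in a 450 ms window
    hum = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(window)]
    print(audio_bits(hum, sample_rate, threshold=0.5))
```

A 450 ms window holds a whole number of cycles at both 100 Hz and 1000 Hz, which keeps the two components from leaking into each other in this simple test.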
Here’s the DBM for the audio system.
Finally, let’s add a weapons system. We’ll mount a missile launcher on the robot. Because firing a missile is serious business, we’ll make it so that both weapons neurons have to be on simultaneously for a firing event, so 00, 01 and 10 mean ‘don’t fire’, and 11 means ‘fire’. Again we’ll have two times, separated by 450 ms. Here’s the weapons system DBM.
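As an aside, the firing rule is just a logical AND over the two weapons neurons at a given time. A tiny sketch, with a hypothetical function name:

```python
# Hypothetical helper: the post's rule is that only the state 11 fires.
def should_fire(neuron_a: int, neuron_b: int) -> bool:
    """Return True only when both weapons neurons are on at the same time."""
    if neuron_a not in (0, 1) or neuron_b not in (0, 1):
        raise ValueError("weapons neurons must be 0 or 1")
    return neuron_a == 1 and neuron_b == 1


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            label = "fire" if should_fire(a, b) else "don't fire"
            print(f"{a}{b}: {label}")
```

Requiring both bits means a single stray neuron flipping on is not enough to launch anything.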
Connecting sensors and actuators at higher levels of the LC-DBM
OK so we have built four different DBMs for audio, vision, motor and weapons. But at the moment they are completely separate. Let’s fix that!
Here is an architecture that brings all four together, by combining the different modalities higher up in the LC-DBM.
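To give a feel for what “locally connected” means at the bottom of such a network, here is a small sketch that builds a connectivity mask between the 16 visible units (four modalities, each with two neurons at two times) and a first hidden layer. The number of hidden units per modality is a placeholder I made up; the figure, not this code, defines the real architecture. The sketch only shows the idea that each modality’s visible units connect to their own hidden units, with mixing left to higher layers.

```python
MODALITIES = ["motor", "vision", "audio", "weapons"]
VISIBLE_PER_MODALITY = 4  # 2 neurons x 2 time steps, as described in the post
HIDDEN_PER_MODALITY = 2   # hypothetical placeholder, not taken from the figure


def local_mask():
    """Return a visible-by-hidden 0/1 matrix with one block per modality.

    mask[v][h] == 1 means visible unit v may connect to hidden unit h.
    """
    n_visible = len(MODALITIES) * VISIBLE_PER_MODALITY
    n_hidden = len(MODALITIES) * HIDDEN_PER_MODALITY
    mask = [[0] * n_hidden for _ in range(n_visible)]
    for m in range(len(MODALITIES)):
        v_start = m * VISIBLE_PER_MODALITY
        h_start = m * HIDDEN_PER_MODALITY
        for v in range(v_start, v_start + VISIBLE_PER_MODALITY):
            for h in range(h_start, h_start + HIDDEN_PER_MODALITY):
                mask[v][h] = 1
    return mask


if __name__ == "__main__":
    for name, start in zip(MODALITIES, range(0, 16, VISIBLE_PER_MODALITY)):
        print(f"{name:8s}", local_mask()[start])
```

A fully connected layer would have every entry set to 1; here only the diagonal blocks are on, so cross-modality couplings such as vision to motor can only appear in a higher layer.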
This network can be embedded in hardware. I created this embedding by staring at it for a few minutes. There are probably much better ways to do it. But this one should work. Here it is!
So that’s cool! That’s enough for now. Next time we’ll think about the different experiments we can subject this New and Enhanced Cid to.