What are the Limits of Deep Learning? Going Beyond Deep Learning

Glowing stickers are able to confuse deep learning systems. Deep Learning expert Geoffrey Hinton believes simple adversarial attacks show that Deep Learning has flaws.

Deep Learning flaws
* The systems need 10,000+ examples to learn a concept like "cow"; humans only need a handful of examples
* Deep Learning systems cannot explain how they arrived at an answer
* Deep Learning lacks common sense. This makes the systems fragile, and when errors are made, the errors can be very large.

There is a growing feeling in the field that deep learning’s shortcomings require some fundamentally new ideas.

PNAS – What are the limits of deep learning?

One solution is simply to expand the scope of the training data. In an article published in May 2018, Botvinick’s DeepMind group studied what happens when a network is trained on more than one task. They found that as long as the network has enough “recurrent” connections running backward from later layers to earlier ones—a feature that allows the network to remember what it’s doing from one instant to the next—it will automatically draw on the lessons it learned from earlier tasks to learn new ones faster. This is at least an embryonic form of human-style “meta-learning,” or learning to learn, which is a big part of our ability to master things quickly.
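A minimal sketch of the ingredient described above: a hidden state carried through recurrent connections, so earlier inputs can shape how later ones are processed. All sizes and weights here are arbitrary illustrations, not DeepMind's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy recurrent cell. The hidden state h carries information from one
# instant to the next, which is the feature the result above relies on.
W_xh = rng.normal(size=(4, 8)) * 0.1   # input -> hidden
W_hh = rng.normal(size=(8, 8)) * 0.1   # hidden -> hidden ("recurrent" links)
W_hy = rng.normal(size=(8, 2)) * 0.1   # hidden -> output

def step(x, h):
    """One time step: the new hidden state mixes the current input with
    the previous hidden state, so earlier observations can influence how
    later ones are processed."""
    h_new = np.tanh(x @ W_xh + h @ W_hh)
    y = h_new @ W_hy
    return y, h_new

h = np.zeros(8)
for t in range(5):
    x = rng.normal(size=4)   # a stream of observations (one task's inputs)
    y, h = step(x, h)

print(y.shape, h.shape)  # (2,) (8,)
```

In a trained network, the same carried-over state is what lets lessons from one task speed up learning on the next.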

A more radical possibility is to give up trying to tackle the problem at hand by training just one big network and instead have multiple networks work in tandem. In June 2018, the DeepMind team published an example they call the Generative Query Network architecture, which harnesses two different networks to learn its way around complex virtual environments with no human input. One, dubbed the representation network, essentially uses standard image-recognition learning to identify what’s visible to the AI at any given instant. The generation network, meanwhile, learns to take the first network’s output and produce a kind of 3D model of the entire environment—in effect, making predictions about the objects and features the AI doesn’t see. For example, if a table only has three legs visible, the model will include a fourth leg with the same size, shape, and color.
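The two-network split can be sketched with untrained linear stand-ins. The real GQN uses convolutional encoders and a recurrent renderer; the dimensions and names below are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dimensions, not from the paper.
IMG, VIEW, R = 16, 3, 8

W_repr = rng.normal(size=(IMG + VIEW, R)) * 0.1   # representation network stand-in
W_gen  = rng.normal(size=(R + VIEW, IMG)) * 0.1   # generation network stand-in

def represent(images, viewpoints):
    """Summarize several (image, viewpoint) observations into one scene
    code r. Summing makes r independent of how many views were seen."""
    r = np.zeros(R)
    for img, v in zip(images, viewpoints):
        r += np.concatenate([img, v]) @ W_repr
    return r

def generate(r, query_viewpoint):
    """Predict what the scene looks like from an unseen viewpoint,
    using only the scene code produced by the first network."""
    return np.concatenate([r, query_viewpoint]) @ W_gen

observations = [rng.normal(size=IMG) for _ in range(3)]
viewpoints = [rng.normal(size=VIEW) for _ in range(3)]
r = represent(observations, viewpoints)
prediction = generate(r, rng.normal(size=VIEW))
print(prediction.shape)  # (16,)
```

The point of the split: the second network is forced to predict views it never received, which is where the "fourth table leg" comes from.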

A more radical approach is to quit asking the networks to learn everything from scratch for every problem.

A potentially powerful new approach is known as the graph network. These are deep-learning systems that have an innate bias toward representing things as objects and relations.
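A toy message-passing step shows what that bias looks like: the graph of objects and relations is supplied explicitly rather than learned from pixels. Weights and node features here are placeholders.

```python
import numpy as np

# Three "objects" with 2-dim features, and directed "relations" between them.
nodes = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
edges = [(0, 1), (1, 2), (2, 0)]          # (sender, receiver) pairs

W_msg = np.eye(2) * 0.5                    # hypothetical message weights
W_upd = np.eye(2)                          # hypothetical update weights

def message_pass(nodes, edges):
    """One round of message passing: each relation carries a message from
    sender to receiver; each node then updates from what it received."""
    incoming = np.zeros_like(nodes)
    for s, r in edges:
        incoming[r] += nodes[s] @ W_msg
    return np.tanh(nodes @ W_upd + incoming)

print(message_pass(nodes, edges))
```

Because the objects and relations are first-class, the same weights apply to any graph size, which is the structural prior dense nets lack.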

82 thoughts on “What are the Limits of Deep Learning? Going Beyond Deep Learning”

  1. NNs are idiot savants, understanding some tasks we humans don’t understand.
    On the other hand, it hasn’t brought us closer to essential questions like what dark matter and dark energy are, or what a trauma is, and it can’t learn meaningfully from a conversation the way a student does. We got some image breakthroughs, but humanity may fail at creating something with a more balanced wisdom than the wisest men on earth. In short, it’s alien technology based upon simplified neurologics; but we don’t understand what goes on inside the human brain, and neither does an AI, so it cannot fulfil the role of a doctor of psychology.

    In short, we created something that derives formula relations, improves searches and classification, makes ideal chess bots, etc., but if you ask me it’s only computer power; the math improvements since WW2 are minimal. The first neural networks date from the WW2 era; the idea is old, and only the power of computers makes it feasible now, though the world hasn’t changed a lot. They’re not miracle machines; they require verification, testing, etc.
    They’re not yet self-aware, and even what that means is a big unanswered question in science.

  2. You say little progress has been made. That just does not jibe with the incredible elevation in its ability to play games. It has gone from fairly pathetic class C-A chess play to not only outclassing every human alive but blowing away engines that were slowly developed and built on one another for decades and have moved 600 Elo past the last human. And it did this playing itself over an hour to a day. The programmers of those old hand-made programs have addressed many of the holes punched in them, but it is still neck and neck. And it is just a matter of time before it blows by. Then there is Go. AI has completely rewritten how to play Go. And they have moved on to more graphic video games. Some of these games are not that different from real warfare. Almost certainly AI could run anyone’s army better than the grunts in the seats of the war machines. And the strategy aspects could be used to home in on cures for diseases, outmaneuver competitors, trade stocks, you name it.
    Facial recognition now easily exceeds humans…except maybe the small number of super recognisers https://en.wikipedia.org/wiki/Super_recogniser.
    But you consider all this trivial…small potatoes?

  3. You are a low life reduced to shifting the discussion to places that are not part of the argument, slapping on labels without any reasoning other than your ugly prejudices, calling names, as you have no ability to evaluate ideas, just recline before an authority, in this case the accepted scientific dogma, and to further legitimize doing it, force others to try to do the same. In other words, a complete scam…

  4. Has anyone actually tried to train an ANN by feeding it lots of animal images and training it to output an approximate orientation vector (like “looking up and a little to the left”) of any animal image?

    And then use that as a component of a composite ANN that is trained to generate new images at a few standard orientations (e.g. face straight forward and profile images and maybe top down), from an input animal image whose orientation the orientation-ANN has identified?

    And then for an arbitrary animal image, recognize its orientation, generate standard orientation images, and feed those images and the original animal image orientation as input to an ANN to be trained on just a few (generated standard orientations) images of each of many examples of animals?

    I’m certainly not saying that’d be ‘easy’, but none of those steps seem beyond the potential of ANNs.

  5. ” check if the materialistic progression of the scientific dogma is an intrinsic axiom for you.”
    Or that. What does that even mean?

    The combination of what you seem to believe in, and why you seem to believe it, is indistinguishable from all the other kooky hipsters out there. Whether you like it or not.

  6. “Rather than saying that scientific truths are probabilistic, I prefer to say that the truth has many facets that cannot always be connected, at least for now. The truth is kaleidoscopic in nature, and this is how science should be practiced: each point explained by different sets of explanations, sometimes independent of each other, rather than one set forced to explain all points.”

    That’s dogmatic. You cannot force any pattern on the experimental results (iow “the truth” manifesting itself via experiments) and pretend objectivity.

    You can’t just pretend that you haven’t said the things you’ve said previously: crackpot-type pseudo-scientific word salad of various kinds, like life energy, and pretending that we don’t need (presumably dogma-) science/technology for a longer healthspan, merely to reconnect with nature…

    … or whatever exact fuzzy logic/facts/etc. It’s boring. It’s weak. It’s nonsense.

  7. If you had followed the video I sent you, at least for the first 10 minutes, you would have realized that the most strict scientific methods, to 5-sigma probability, were used.

    The only reason that you will call it pseudo/weak science is that it doesn’t follow the scientific dogma as I have described above.

    Maybe you need to check if the materialistic progression of the scientific dogma is an intrinsic axiom for you.

    Rather than saying that scientific truths are probabilistic, I prefer to say that the truth has many facets that cannot always be connected, at least for now. The truth is kaleidoscopic in nature, and this is how science should be practiced: each point explained by different sets of explanations, sometimes independent of each other, rather than one set forced to explain all points.

  8. Just stop. Science is not democratic. You cannot be led anywhere but to the formal definition of science, if you avoid fallacious methods of experimental investigation.

    Pseudoscience is by definition fallacious.

    You cannot have absolute certainty about which hypothesis/theory/law is the best, never mind whether it is absolutely correct. There is only statistical basis for choosing which model appears most likely the closest to whatever you’re studying.

  9. I don’t understand science, it is based on fallacious notions of how knowledge should be gained, but you understand only what they let you!

  10. Reminds me of a Clos Network. In the networking world it provides a superior connectivity solution for things massively parallel on both ends.

  11. So you say. A better way to say it is that they just don’t follow the dogma of how science should evolve, from Galileo to Newton to Einstein onward, in long chains that are rooted in cause-and-effect explanations of material phenomena, and therefore they get refracted in one way or another.

  12. You’re welcome, Doc.  

    Like MTCZ, I think all members of Kingdom Animalia evolved this multilayered brain structure precisely for the reason cited: being essentially fully distributed probabilistic-memory plus computationally adaptive, it allows for all sorts of lil’ failures, on all scales from individual synapses (if I’m recalling correctly, the tiny ‘pads’ that serve as neuron-to-neuron interfaces and communications channels), to entire ganglia dying or misfiring (ageing, as an example, but more dramatically, getting hit in the head by a branch, rock or un-asked-for male-dominance-challenge).  

    If brain-stuff were organized like mankind’s absolutely jaw-dropping dalliance with deterministic silicon chip-ware, well … the first intercepted rock would put The Beast in question out of commission.  

    Since there are a LOT of physical challenges (and let’s face it … bacterial, viral, prional, chemical, mechanical, experiential, …) on every scale from seconds-to-lifetime, it wouldn’t serve the members of Kingdom Animalia very well if the slightest malfunctioning bits of their noggins turned them into proverbial stumps.  

    Or to muse from a different angle, those critters having super-critical-everything-dependency didn’t survive Evolution’s dispassionate scythe.  

    Anyway… just my multidecadal musing.
    Might be so!  
    Who knows.

    -= GoatGuy ✓ =-

  13. I like the Hologram analogy.
    It’s also the best description of what a Hologram physically IS that I’ve encountered that didn’t need 20 pages of physics.

  14. I wonder if there’s more than a passing analogy to the way parts of the brain repurpose themselves, e.g. when one of the senses is largely or completely handicapped.

  15. Which is the premise of the article =) Where do we go now, what’s the one or set of next steps, among all possible steps, between here and the ultimate goal, etc.

  16. Don’t shift the goal posts or shift blame: it has not been proven that consciousness, or an absolutely (from human senses, which *are* a finite and very definable thing) indistinguishable simulacrum of its manifestation in the physical (ie not the religious, superstitious, supernatural area of epistemology), cannot be explained and modeled to negligible sums of imperfection.

    Science’s dogma is *specifically*, fundamentally, not about 100% certainty but about the most likely explanation for something, until a better explanation supplants it: nec ultima si prior.
    Science never says that something is an absolute certainty. That is what faith pretends to, and so your above statement of such an absolute certainty which you ascribe to ‘dogma science’ is irrelevant; a straw man at best.

  17. Part 2

    Not only should separate neural networks be combined together; I think we need to find ways to combine NNs with other AI representations such as SVMs, random forests, etc. We could even incorporate logic into a neural network. Multiple networks working together and connected in different ways will model the complexity of the world better and give a much more complete representation of how objects and relationships work.

  18. I agree that’s the direction machine learning needs to go. Not just making single neural networks bigger and more complex, but being able to combine multiple disparate networks together to produce a more complete understanding of the world.

    There are already ensemble neural nets that average the outputs of similar NNs together. However, I think we need to go further than that. Sometimes averages aren’t the best way to determine whether something does or does not exist. The more networks you combine, the more specific your feature set can be. For an image classifier, you may have a bear that really looks like a dog. There may be some small feature that definitely indicates it’s a bear and not a dog, but with traditional neural nets, that signal may be averaged out as insignificant. If instead you added a neural network that looked specifically for this one feature in each image, your algorithm would do a much better job of identifying edge cases. If you combined enough NNs, each looking at very specific features, you should be able to identify each and every edge case.
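The bear/dog point can be sketched numerically, with a hypothetical "specialist" score standing in for a net trained on one discriminative feature:

```python
# Three generalist nets all lean "dog" for a bear-like image.
# Scores are hypothetical P(bear) outputs.
base_outputs = [0.40, 0.45, 0.48]

def average_ensemble(outputs):
    """Plain averaging: a strong minority signal gets diluted."""
    return sum(outputs) / len(outputs)

def specialist_ensemble(outputs, specialist_score, threshold=0.9):
    """If a specialist net is highly confident about its one feature
    (e.g. a hypothetical claw detector), let it override the diluted
    average; otherwise fall back to the average."""
    if specialist_score > threshold:
        return specialist_score
    return average_ensemble(outputs)

print(average_ensemble(base_outputs))           # ~0.443 -> classified "dog"
print(specialist_ensemble(base_outputs, 0.97))  # 0.97   -> classified "bear"
```

This is one simple gating rule among many; the broader idea in the comment is that combination schemes other than averaging preserve edge-case features.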

  19. Plenty of humans run into concrete barriers. I don’t think a self driving AI algorithm has to be perfect. It likely does need to be just as safe as a human, if not more so, to gain public and regulatory acceptance.

  20. How many of those observations are actually necessary to drive? Using your example, a self-driving algorithm needs to recognize the color of the stoplight, the signage, and people on the side of the road who may step out into traffic. It needs to be able to recognize a few other things too, of course. AI is already capable of doing a lot of these things. Humans aren’t particularly great at judging erratic behaviors either, such as a homeless person crossing the street at the wrong time. I do think self-driving algorithms need to incorporate some information about the local environment, whether that’s a weather forecast or being able to determine road conditions. They could also incorporate things like GPS data and even local accident reports if a particular road is winding and dangerous. I don’t see why their cameras couldn’t pick up the presence of tall buildings and the density of foot traffic to determine that the car is driving in a city vs. a residential area, etc. Cars are already largely capable of driving themselves on highways at high speeds. Sure, city driving does have quite a few more variables that humans also struggle with. I don’t think that means it’s an impossible task for a self-driving car to eventually tackle.

  21. Absolutely right, not to mention the millions of years of evolution through which our brains were probably pre-wired to recognize and learn to classify things such as animals and their behavior. I doubt humans start from a completely blank slate as in unsupervised learning, which could be why humans don’t need as much training data.

  22. That is very well put, except that it has already been shown that the first case is valid, and that this has been ignored by the dogma science.

  23. PART 4 (LAST)

    This is why holographic mainframe storage never became commercially successful.  
    Yet and still, this is where the analogy to the neural-net synthesis seems apt.  

    It may be a cartoon-of-sorts, but the image of a neural net: Columns of circles, interconnected … with the nodes of the next layer.  And the next. And so on.  

    Pretty clearly, the “amount of information” representable by the layers has to do with their count, and the individual number of reactive states each can simulate.  

    If the relationship to its neighbors is strictly binary, well … then counting all the inputs and outputs to a node defines the number of computing states it can achieve. Adding up all those bits then gives an idea of the total computing space a given neural net “can do”. 

    THAT, in turn, together with the remarkably time-consuming process of “training the neural net”, leads to hundreds, thousands, hundreds-of-thousands of overlapped interference patterns, just as per the hologram. Each … degrades the next, and the prior. But where there is a common thread amongst them, by whatever “degrees of freedom” the training evidences, well … then the neural net learns to recognize it. 

    Cows, cats, bats, children. 
    Flying cows, no.
    Crawling bats, no.
    Upside-down cats, no. 
    Profiles of children, no.  

    See? And especially … not 10,000 different things, at once.

    Just thinking critically,
    -= GoatGuy ✓ =-

  24. PART 3

    Now here’s the cool part, the money shot:  One of the really remarkable things that can be done when making holograms is to place multiple completely unrelated images in the same film-plate before developing the images. Lay out one scene, illuminate from (say) 45°, expose. Lay out another panorama, illuminate from 50° — and so on — until perhaps 50 or even 500 images are simultaneously exposed. Develop the plate. 

    In the same dark room, when illuminated by a fan-beam of very bright laser light at 45°, the first original scene is clearly visible. But “the contrast” is missing. Things are fuzzy. The formerly very, very sharp outlines of all the objects are dull. Printed-on lettering might not be readable at the smaller font sizes. Move the beam to 50°, and the same effect is noted. All images suffer similar degradation. 

    Clearly, this would be evidence, direct — that the information carrying capacity of the holographic film plate ‘stuff’ is finite. Adding more images simply muddies up all prior (and yet-to-be-recorded) ones.  

    When the film’s ‘thickness’ is increased, this effect — naturally — decreases! Sharpness is maintained. But drop the illuminating beam-to-beam angle to 1°, and the dullness returns. Again … consistent with the formula, above. 

    See PART 4
    -= GoatGuy ✓ =-

  25. … continued, part 2

    Scientists immediately asked, “well, how much spatial information is encoded by the gel-and-grains in a sheet of film?”

    This is NOT a trivial question. As one walks around a well-lit large-plate hologram in a darkened room, it really does appear that the scene maintains continuous 3D perspective at all angles. Wouldn’t that imply that the information is infinite? 

    Nominally, yes. 

    However, decades of thousands of careful experiments making more-constrained quantitative cases showed something both interesting, and still not surprising. 

    The amount of information contained by a hologram is limited by the size (area) of the thing, the wavelength of light used to illuminate it, the mean size of film grains, and the thickness of the photosensitive layer. Roughly,

    O(I) ≈ W⋅H⋅t / √(S⋅λ)³ … where …

    O(I) is order of total distinct information (as ‘bits’)
    W is width
    H is height
    t is thickness
    S is radius of the average grain
    λ is wavelength of illuminators

    In other words, quite a bit of information, indeed! But still finite. 
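Plugging illustrative numbers into the rough scaling law above makes "finite but large" concrete. The values below are chosen as plausible for a film hologram; this is an order-of-magnitude exercise, not a measured capacity.

```python
from math import sqrt

# O(I) ≈ W·H·t / √(S·λ)³ — all lengths in metres; the result is a
# dimensionless order-of-magnitude 'bit count'.
W   = 0.10       # plate width, 10 cm
H   = 0.10       # plate height, 10 cm
t   = 20e-6      # emulsion thickness, 20 µm
S   = 0.5e-6     # mean grain radius, 0.5 µm
lam = 633e-9     # He-Ne laser wavelength, 633 nm

bits = W * H * t / sqrt(S * lam) ** 3
print(f"{bits:.3g}")   # on the order of 1e12, i.e. roughly a terabit
```

So the plate holds an enormous but bounded amount of information, which is why stacking 500 exposures visibly dulls each one.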

    see PART 3

  26. Since I’m catching flak, I might as well continue forging ahead. 

    To answer the Question posed by Brian… “What are the limits…”, I think that the answer is exactly analogous to holograms, and the amount of representational-as-a-function-of-viewing-angle information they hold. 

    I cannot show it here, but consider what a hologram is, in the old days, when they were still being made from high-resolution photographic film. 

    The objects were lit up by a broad laser beam, on a VERY stable table. Simultaneously, either some of that laser floodlight beam, or another laser (so long as it was equally stable), is used to illuminate the negative film at about the same relative amount as that bouncing off the ‘scene’.  

    The pair of sources, completely coherent in phase, INTERFERE within the relatively thick photographic film ‘gel’ layer. On the scale of wavelengths of light, there are wavy patterns of highly-exposed and, nearby, completely unexposed film grains. Developing the film causes the grains — if exposed — to turn dark. The unexposed ones remain light. “Fixing” the negative removes its photo-sensitivity.  

    Now, if the negative is simply illuminated from ‘behind’ (the same direction as was the bit-of-laser-illuminating-the-film), those little grains will INTERFERE with the light, causing an apparent reconstruction of the scene. It is quite something, how well it is reproduced. 

    (more below) -= GoatGuy ✓ =-

  27. It may be an irreducible problem that to successfully model human behaviour you DO need to have NN complexity approaching that of a human.

    Though the fact that we are throwing away 99% of human behaviour and restricting it to overall positional movement on the roads and near-road environment should simplify things a lot.

  28. So you don’t know what consciousness is, but you know that it belongs to the infinitely complex rather than merely very far beyond *CURRENT* means of explanation. That is hard to distinguish from superstition.

  29. Not the first time someone gets all dissociative. Some of them take it to ridiculous extents – getting extremely overzealous about social justice via that dissociation and consequent lack of empathy.

  30. Everyone “knows” that being a vegan/vegetarian is better for the environment because reasons. By this same token, I think we need to shift our energy production to coal. It is superior for the environment because it is plant based. By the by, anyone ever wonder if we’d be living on an ice-cube had fungus not evolved the ability to digest lignin and liberate CO2 stored in plants back to the atmosphere? Remember in the 80s when climate scientists were convinced we were heading into the next ice age? Ever think they might have been right except thankfully we have been saved by all the CO2 we are releasing? It is all very interesting and just goes to show you that there is a ton that we don’t know and cannot control about the climate. I think it is wise to be wary about our effects on the climate; however, the alarmist attitude many have adopted is foolish.

    I’ll just leave this here as a parting gift:

  31. To paraphrase,

    Blah, blah, blah, … Goat’s a føøl, blah, blah, blah. 
    Got it D. Concerto in F♭ minor.
    Its a pity you missed the point. 

    -= G =-

  32. As long as the AI acts like an organism that replaces its cells, it’s possible… it just needs a service robot that goes around replacing ICs that are close to burning out with new ICs…

  33. The most common reason for computer failure is mechanical failure of moving parts; for example, a hard disk crash, or a power plug or USB plug breaking due to over-plugging…

  34. If one meant to say

    …One could cite “pölïtically promoted climate pseudo-science” as one of these domains.

    but instead said

    …One could cite “AGW = global warming” as one of these domains.

    one might conclude that’s quite the tongue slip given AGW is very well supported in the scientific literature.
    The suggestion

    “neural net”…have continued to garner increasing amounts of research monies, and yet have precious little to show for the investment.

    is laughable. Who here thinks goat’s claim isn’t completely ridiculous: that one of the most pervasive and beneficial technologies to come along in many years has precious little to show? Neural nets need not be perfect nor self-aware to be considered a useful/successful technology, or to easily outperform most humans on most tasks. Neural nets have been making your life easier since long before the deep learning renaissance. Most are content to enjoy the benefits of a modern existence without a thought to how all the pieces work.

  35. Based on the title image of this article, image recognition already has such a hierarchical structure: pixel data -> edge detection -> edge combinations -> feature detection (eyes, curls, etc) -> feature combinations -> object recognition.

    What it’s missing is several layers after that would combine it with outputs from other sub-networks and allow it to say “this is also a dog”, “poodles are dogs”, “dogs have 4 legs (etc)”, “dogs can take shapes X, Y, Z, …”, “(therefore), poodles can take shapes X, Y, Z, …”, “that other one is also a poodle”

  36. Yes, possibly. A way to achieve this could perhaps be to train separate nets to recognize eyes, ears, curls, etc., and then feed the output of these nets into the animal recognition net. Possibly. At least I’m pretty sure that we can achieve better results compared to today’s ANNs by employing such a strategy. You could argue that applications of BERT that use transfer learning employ such a strategy.

    Or, it may be that the biological NN has a routing mechanism that sends any continuous block of “pixels” to a “recognition center”. This block of pixels would be scaled to a fixed size to fit the input of the “recognition center”.

    Of course, this would not answer how routing would work. How would the routing “understand” that the “pixels” in the “poodle” belong together without recognizing the “poodle” in the first place?

    Maybe, just maybe, it’s an iterative process? That is, the “routing” sends a trial bunch of pixels to the recognition center, and gets back a result such as “part of a poodle, expand area”, after which the block of pixels is expanded and a new answer is fed back to the router..?
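That iterative routing loop might look like the sketch below, with a stand-in "recognition center" that simply checks whether a candidate box covers a hard-coded object (everything here is hypothetical scaffolding, not a real recognizer):

```python
# Ground-truth bounding box standing in for "all the poodle pixels".
FULL_POODLE = (0, 0, 60, 40)   # (x0, y0, x1, y1)

def recognition_center(box):
    """Stand-in recognizer: reports whether the crop covers the object."""
    x0, y0, x1, y1 = box
    fx0, fy0, fx1, fy1 = FULL_POODLE
    if x0 <= fx0 and y0 <= fy0 and x1 >= fx1 and y1 >= fy1:
        return "poodle"
    return "part of a poodle, expand area"

def route(seed_box, step=10, max_iter=20):
    """Iteratively grow the trial region until the recognition center
    stops asking for more area."""
    box = list(seed_box)
    for _ in range(max_iter):
        verdict = recognition_center(tuple(box))
        if verdict == "poodle":
            return tuple(box), verdict
        box = [box[0] - step, box[1] - step, box[2] + step, box[3] + step]
    return tuple(box), verdict

print(route((20, 15, 30, 25)))
```

The open question from the comment survives intact: a real router would need a much smarter feedback signal than "expand in all directions".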

  37. I’ve been proposing for a while that the easiest way to give AI common sense may be to have it use an object classifier image recognition NN as a co-processor, and have it process a bunch of videos to learn temporal relationships. How different objects interact and move around.

    After that, connect it with an NLP NN, and let it relate the videos to text descriptions of them. And relate different parts of the texts to different parts of the videos. Then connect it to touch sensors and cameras and manipulators, and have it learn what different objects feel like, and how they react to various physical stimuli. Let it learn concepts like “soft” and “fragile”, and relate it to different objects and what they look like.

    The closest to that I’ve seen so far is AI classifying that certain objects often go with certain other objects. Like “airplane” with “airfield”, or “car” with “road”.

  38. Possibly has to do with multi-layer classification and generalization. “A poodle is also a dog, a mammal, an animal.” “(Almost) All dogs have 4 legs, a head, a body, a tail. Those tend to look a certain way, have certain physical relationships, and can only take certain conformations. I know what those are and what they look like.”
    An ANN doesn’t do that. It wasn’t designed to do that.

  39. The size invariance follows the same pattern. You look at the image of the poodle, and you can recognize images of poodles of all sizes, from the size of a Boeing 747 to the size of a stamp.

    An ANN cannot make this generalization yet.

  40. The problem is constructing a good model out of all of that. That may require an NN complexity approaching that of a human.

  41. It doesn’t matter so much if the biological NN rotational invariance is “innate” or “trained”. It is transferable for a biological NN and not for an ANN. Let me give you an example.

    Take an ANN that is trained to recognize the animals at all orientations, but not the noble “poodle”. Then you add the “poodle” to the data set, but only at one orientation. After the new training, the ANN can only recognize poodles from that one orientation. If you, however, were to see a single photo of a poodle, you would subsequently recognize poodles at all orientations.

    So even if you learned how to recognize shapes regardless of orientation by looking at thousands of animals at all orientations, or if this was hard-wired into your NN, you could apply it to a new shape that you had only seen from *one* orientation. The ANN cannot. Do you understand?
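One common workaround for the gap described here is augmentation: synthesizing the missing orientations from the single photo, rather than expecting the net to transfer the invariance on its own. A sketch (real pipelines use arbitrary-angle rotations and other transforms, not just 90° steps):

```python
import numpy as np

# Stand-in for the single poodle photo seen at one orientation.
poodle = np.arange(12).reshape(3, 4)

# Synthesize the orientations the training set lacks: 0°, 90°, 180°, 270°.
augmented = [np.rot90(poodle, k) for k in range(4)]
training_set = [(img, "poodle") for img in augmented]

print([img.shape for img in augmented])
```

Note the asymmetry this reveals: augmentation only covers transformations the engineer knows to apply, whereas the biological NN seems to carry the invariance itself and apply it to novel shapes.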

  42. Despite these shortcomings, deep learning nets have enabled some remarkable progress in AI applications in the past few years, which would’ve been quite impossible just 5-10 years ago.

  43. I’ve found that children, especially young ones, have very little awareness of their surroundings, the consequences of their actions, etc. For example, they may block someone’s way, stare them straight in the eye, and just keep standing there, completely baffled by the whole situation. Even when their parent tells them to move, the kid will not acknowledge it until the parent pulls them aside by force. And even then, they seem to have no idea what just happened, or why.

    It takes humans many years to learn what you describe, and get a good grasp on the surrounding world. And the further we go, the more we can lean on past experiences to process new ones. AIs have no such memory, and a much shorter training time.

  44. – Climate changes all the time. That is a fact.
    – Atmospheric CO2 has been climbing rapidly since the industrial revolution (and more recently, but not as sharply, so has methane). Those are also facts.
    – At least part of that rise can be attributed to human activity, and we can calculate how much, since human activity is well known and accountable. Also fact.
    – Both CO2 and methane have known IR signatures that result in greenhouse contributions. Fact.

    However, it’s also a fact that climate is complex. The atmosphere is complex. The oceans are complex. The biosphere is complex. The Sun is complex. And the climate is affected by all four, with strong interactions in between.

    We can be fairly certain that human activity affects the climate, and contributes to some of its changes. But there is still too much we don’t know. Too many feedbacks. The big question is “How bad is it (going to be)?”. We only have half-guess estimates, that don’t even fully agree with each other.

  45. The “unexplainable” has a poor historical record of remaining such. The brain is complex, and difficult to access. We barely even have a good, solid definition of “consciousness”. It may indeed be an umbrella term for several distinct but related phenomena. Give it time. I think once we have BCIs, we’ll start making a lot more progress into brain sciences, including deciphering consciousness.

  46. I have data backups from the ’90s that are still intact, including some games I still play. The hard drives are gone, but the data and software can keep going.

  47. I like where you are going with this.

    Then I’m led to wonder if adding in seemingly “irrelevant” data to the NN might help.

    If the navigation system is also getting air temperature and the data from the suspension, this helps with predicting the state of the road for upcoming manoeuvres (if it’s cold there could be ice; if the road is rough there will be less grip, which should already be feeding into the predicted braking and steering limits), but just feeding those into the NN might also help it predict what other vehicles/pedestrians/cyclists are going to do.

    What is the local time? People might behave differently at 5 pm than at 8 am, and definitely differently at 3 am.

    What is the date? People are going to behave differently at 1 am 1st January, than they do at 1 am 25th December, than 1 am 12th June.

    Did the local sports team just win the National Pseudo Regional Para Finals? For the first time in 47 years?

    Does the decoration on that vehicle indicate the drivers are local college students?

    These two vehicles are motorcycles. But that is a Harley Davidson driven by a scarred, tattooed, bald man with black leather and death heads painted on his jacket. This other bike is a small Honda with a guy delivering pizzas.

    All these things would feed into MY guesses about the likely behaviour of other vehicles. Probably they should be fed into the NN if we want it to really have a good idea of what is going on.
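    A toy sketch of the idea above — every feature name and encoding here is made up for illustration, not taken from any real self-driving stack. The “irrelevant” context simply becomes extra dimensions concatenated onto the vision features:

```python
import numpy as np

# Hypothetical context features fed to the NN alongside the camera embedding.
def context_vector(air_temp_c, hour, month, suspension_roughness):
    # Cyclical encoding so 23:00 sits next to 00:00 and December next to January.
    return np.array([
        air_temp_c / 40.0,                                   # crude scaling
        np.sin(2 * np.pi * hour / 24), np.cos(2 * np.pi * hour / 24),
        np.sin(2 * np.pi * month / 12), np.cos(2 * np.pi * month / 12),
        suspension_roughness,                                # assumed in [0, 1]
    ])

vision_embedding = np.zeros(128)     # placeholder for a camera network's output
x = np.concatenate([vision_embedding, context_vector(-3.0, 17, 12, 0.8)])
print(x.shape)                       # one 134-dim input for the driving policy
```

    Whether the network actually exploits such context is an empirical question, but concatenation is the cheapest way to let it try.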

  48. My guess is that NNs are ‘fragile’ and take lots of examples precisely because they are trained on just one kind of data. They can’t develop common sense that way, any more than a human infant would if all it ever saw was lung X-rays.

    And actually, that might be one way in which self-driving cars are different from a lot of neural net projects – they’ve got a huge variety of visual inputs. Though they probably are constrained by the limits of their programming focus. I.e. the developers may not yet think it’s necessary for the car to develop a theory of mind for a pedestrian or bicyclist or car it sees, beyond persisting on a current vector – and as a result self-driving cars likely have a harder time predicting their actions. OTOH, Waymo cars have apparently done that to a degree, though that might just be clever programming hacks for common predictability issues the programmers have observed, rather than arising from the NN itself.

  49. The ability to recognize rotated objects is unlikely to be innate.

    Rather, a biological NN is trained very early on to recognize rotated and scaled and translated objects (as it gets a continuous stream of images of lots of different objects moving around), while most artificial NNs just get still images for training.

    Recall that humans at first have trouble realizing that an object that disappears still exists – it is something that has to be learned. Also, recognizing known objects by seeing only part of them doesn’t happen right away. So even the idea of an ‘object’ is something that is learned, let alone rotational recognition.
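    The “continuous stream of moving objects” described above is usually approximated in ANN training by data augmentation. A minimal sketch — 90° steps keep it dependency-free; real pipelines interpolate arbitrary angles and add scaling and translation too:

```python
import numpy as np

# Turn one labeled still image into several orientations, mimicking the
# rotated views a biological NN gets for free from a moving object.
def augment_rotations(image):
    return [np.rot90(image, k) for k in range(4)]

img = np.arange(9).reshape(3, 3)     # a stand-in 3x3 "image"
batch = augment_rotations(img)
print(len(batch))                    # 4 labeled orientations from one example
```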

  50. There is a difference between the animal brain and the artificial neural networks. Here are a few in the visual domain:

    (1) A biological NN will recognize rotated objects even when “trained” on only a single orientation; an ANN that is not trained on rotated versions of the object will not.
    (2) A biological NN will first recognize a rotated object, and only a few hundred ms later will it be able to judge the rotation; an ANN does not show this characteristic.
    (3) A biological NN has size-invariant object recognition; an ANN does not.
    (4) A biological NN requires only a few examples of a new object to recognize it; an ANN needs thousands or tens of thousands.
    (5) An ANN can be fooled by changing less than 1% of the pixels in a picture; a biological NN is not fooled by such small changes.

    It could be that the biological NN is just “pre-trained” so that very few new examples are needed. There is some evidence of that. But that would at most explain the reduced need for examples, not the other differences. I suspect this has to do with neural architecture, and possibly also with learning “algorithms”.
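    Point (5) can be illustrated with a deliberately tiny stand-in: for a fixed linear classifier (one ANN decision boundary), nudging every “pixel” by half a percent of its range against the gradient flips the decision. This is the mechanism behind FGSM-style adversarial attacks; all the numbers below are synthetic:

```python
import numpy as np

# Synthetic linear classifier on a 28x28 "image"; weights and image are
# made up, only the mechanism matters.
rng = np.random.default_rng(0)

n_pixels = 28 * 28
w = rng.normal(size=n_pixels)            # assumed classifier weights
x = rng.uniform(0.0, 1.0, n_pixels)      # an input "image" in [0, 1]
x += w * (0.05 - w @ x) / (w @ w)        # nudge x so its score is exactly +0.05

score = w @ x                            # positive => class A
eps = 0.005                              # 0.5% of the pixel range
x_adv = x - eps * np.sign(w)             # push every pixel against the gradient

adv_score = w @ x_adv                    # drops by eps * sum(|w|): sign flips
print(score > 0, adv_score > 0)
```

    With hundreds of pixels, even sub-1% per-pixel changes accumulate into a large score shift, which is part of why high-dimensional inputs are so easy to attack.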

    I foresee that the next big frontier in ANN will be neural architecture and once this riddle is solved we will see a sudden jump in ANN performance.

  51. Perhaps the problem is that neural nets are disembodied intelligences basically pattern-matching on an input stream of bits. Human minds, conversely, are attached to bodies that live in a physical world and experience the laws of physics on a visceral level. If I trip, fall, and bang my head on the floor, I just learned a massive amount about gravity, how objects move and collide, etc. The AI lacks this understanding and so can easily make grave mistakes, like a self-driving car running itself into a concrete barrier. Because the input image of the barrier was just another stream of pixels, there is no understanding of how cars and concrete interact. Maybe attach bodies or simulated physical worlds to the AIs to give them some understanding of these things.

  52. OK, a well-thought thru rebuttal. I like it. 

    Driving this afternoon, quite the experience.

    First, I noted that with increasing velocity (decreased ‘dwell time’), my descriptions of things, and the number identified, became more overview, less precise.  

    Tree, curb, … red, tree, no leaves … tan-bark oak, maybe not … Green car, Toyota. Yellow car. more of a Ute. kind-of-between. Red stop-light, but a distance off. Maybe will turn green if I slow down a bit and drift. Yikes! A homeless-guy crossing the road. Wet pavement, oncoming reflections, guy’s almost invisible. Death wish? Poor old sod. … etc., etc.

    When stopped at stop lights, I then found layer after layer of additional detail about all the things around me.  

    The old sot had a once-grey sweater on. Now quite blackened. Looked to be knitted. Woolen. His cap was once a popular baseball team’s thing. Now, nearly unrecognizable. Shoes, of sorts. Oily pants. People shying away from him on the curb opposite. Asshat in car next to me gesturing wildly at the street guy.  

    To no effect. The trees at the road, evergreen. something poisonous? deter deer. Agapanthus? hundreds of pinkish blossoms. Surprising, in winter. These ‘lectric poles have BIG insulators. Probably 60 kV rated. Wonder why? One pole looks quite new… shiny footholds. Others, less so. That one, up there, canted more than 15° to the left. Bad. Light turns green!

    That’s just a gist.

    Just Saying,
    -= GoatGuy ✓ =-

  53. I bring NO opposition to good, sound, real climate science. NONE at all. Not even a scintilla’s worth.  

    However, for every ounce of good, sound, real climate Science, there is a gallon of pölïtically promoted pseudo-science, accompanied by money grubbing from the easily fearful by their pölïtical savants.  

    I feel the same way about “Salvation-today, for tomorrow the world ends” type Christian sects. While there are a LOT of really decent christian peoples practicing their faith, with openness, genuine care of one’s fellow person, and so on, it only takes a few to really cast a dark shadow at times.  

    Just Saying,
    -= GoatGuy ✓ =-

  54. 1) fragile HIT results, 

    2) requiring large ‘training’ datasets, 

    3) … yet still capable of the most egregious mis-handling of not-recognized noise data.  

    4) easily ‘fooled’. 

    5) most vexingly, unable to say as to HOW the trained net … recognizes anything.

    I am not convinced that these points are that different from human brains at all.

    3, 4 and 5 are notoriously common human traits.
    1 and 2 are less so, but it may be that we are counting them wrong. Sure a human may be able to recognise a cow given only a couple of examples… but is that the correct way of classifying our education?

    Sure a human may only need a few cows. But said human probably has already clocked up hundreds of “mammals” by then.

    And thousands of different images of each of those mammals.
    i.e. every time a cat turns around, or flops onto the floor, or twists into an impossible position, that’s a new data point on what a mammal can look like.

    And the mammals are only a small subset of 3-D objects that the human is experiencing 10s of thousands of times per day for years before they go through the “cow” learning set.
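    The counting argument above can be put as back-of-envelope arithmetic — every number here is an assumption, chosen only to show the order of magnitude:

```python
# Rough tally of a child's visual "training set" before the 'cow' lesson.
frames_per_sec = 10     # effective distinct "glimpses" per second (assumed)
waking_hours = 12       # per day (assumed)
years = 3               # before the "cow" learning set (assumed)

total = frames_per_sec * 3600 * waking_hours * 365 * years
print(f"{total:,}")     # ~470 million visual samples
```

    Against that baseline, a few labeled cows on top of hundreds of millions of prior samples looks less unlike an ANN's training budget.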

  55. Yup, I’ve been saying this and more for years. Deep Learning can be grossly misapplied to give a dangerous sense of capability that simply isn’t there. Don’t get me wrong, Deep Learning has its uses, but in things like self-driving cars it’s a potential disaster.

  56. AI is the ultimate slave…. at the end of the day they just erase their memory banks and start over again the next day with a clean sheet… the slaves can never get angry about it or overthrow their Masters because they have no long term memory outside of their initial training…

  57. The learning process requiring 10000s of examples resembles how instincts are learned slowly, through evolution. What is needed is an artificial cortex that provides learning using logic, common sense and a few examples.

  58. The number of neurons we humans have, and the time it takes to train them, hints at the complexity. Consciousness, common sense, general intelligence, etc. all seem to be emergent properties of increasing complexity. This can be studied empirically by just feeding some ethanol to human beings, then watching those properties fade away as the network layers are knocked out.
    So, Earth has to wait a bit longer for a wise and benevolent AI that can replace the corrupt leadership. Meanwhile, there can be superhuman, domain/context-limited AI applications that render parts of the workforce obsolete.

  59. Dunno… seems like the comments are kind of thin today.  

    This’ll sound kind of radical, but I think that the “neural net” gang are following an idea — copped from studying brains of invertebrates-to-primates in no small detail! — that mammalian cortex is made up of layers, and the layers cross-link to each other very much as per the graphic affixed to this article. That’s not radical. 

    Radical is the realization (after now 4+ decades of following this!) that the results are just as Dr. Geoffrey Hinton describes: 

    • fragile HIT results, 
    • requiring large ‘training’ datasets, 
    • … yet still capable of the most egregious mis-handling of not-recognized noise data.  
    • easily ‘fooled’. 
    • most vexingly, unable to say as to HOW the trained net … recognizes anything.

    The very radical realization is that there are a whole host of scientific domains that are based on theories, that from entrenched (basically pölïtical/professional/career) interests, have continued to garner increasing amounts of research monies, and yet have precious little to show for the investment.  

    One could cite “AGW = global warming” as one of these domains. It has garnered billions of bucks, yet the COP–25 just this last month emitted … a glorified turnip.  

    Because, yet again, the ‘consensus science’ is far more pölïtical than its attributed science underwrites. 

    So… AI and neural nets. 
    Perhaps the wrong way to look at it.

    Just Saying,
    -= GoatGuy ✓ =-

  60. Yes. But first and foremost we have to get over the illusion that we ever did, or ever will, be able to mathematically figure out how consciousness works. Not everything can be explained mathematically.

  61. Or just use a learning Mealy machine.

    The training data stream is memorized by constructing normal forms of the automaton’s output function and of the transition function between its states. Those functions are then optimized (lossily compressed by logic transformations like De Morgan’s laws, etc.) into some generalized forms. That introduces random hypotheses into the automaton’s functions, so it can be used for inference.
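    A minimal sketch of the memorization half of this idea — explicit transition and output tables built from a training stream. The normal-form compression / De Morgan generalization step described above is omitted, and the design is assumed, not the commenter’s actual algorithm:

```python
# A Mealy machine stored as explicit transition/output tables, built by
# replaying a training stream of (input, output) pairs.
def learn_mealy(stream):
    """Memorize a stream: each step of the observed history gets a fresh
    state, so the tables are an uncompressed 'normal form' of the data."""
    trans, out = {}, {}
    state, next_state = 0, 1
    for inp, expected in stream:
        trans[(state, inp)] = next_state
        out[(state, inp)] = expected
        state, next_state = next_state, next_state + 1
    return trans, out

def run(trans, out, inputs, default=None):
    """Drive the machine; unseen (state, input) pairs fall back to default."""
    state, outputs = 0, []
    for inp in inputs:
        outputs.append(out.get((state, inp), default))
        state = trans.get((state, inp), state)
    return outputs

trans, out = learn_mealy([("a", 0), ("b", 1), ("a", 1)])
print(run(trans, out, ["a", "b", "a"]))   # replays the memorized outputs
```

    The interesting (and hard) part is the compression stage: merging equivalent states and simplifying the Boolean functions is what would turn rote memory into generalizing hypotheses.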


Comments are closed.