Years of study have gone into the problem of how to make artificial intelligence “robust” to attack and less prone to failure. Yet the field is still coming to grips with what failure in AI actually means, as pointed out by a blog post this week from the DeepMind unit of Google.
The missing element may seem obvious to some: it would really help if there were more human involvement in setting the boundary conditions for how neural networks are supposed to function.
Researchers Pushmeet Kohli, Sven Gowal, Krishnamurthy Dvijotham, and Jonathan Uesato have been studying the problem, and they identify much work that remains to be done, which they sum up under the title “Towards Robust and Verified AI: Specification Testing, Robust Training, and Formal Verification.”
There’s a rich history of verification testing for computer programs, but those approaches are “not suited for modern deep learning systems.”
Why? In large part because scientists are still learning about what it means for a neural network to follow the “specification” that was laid out for it. It’s not always clear what the specification even is.
“Specifications that capture ‘correct’ behavior in AI systems are often difficult to precisely state,” the authors write.
The notion of a “specification” comes out of the software world, the DeepMind researchers observe. It is the intended functionality of a computer system.
As the authors wrote in a post in December, in AI there may not be just one spec; there may be at least three. There is the “ideal” specification, what the system’s creators imagine it could do. Then there is the “design” specification, the “objective function” explicitly optimized for a neural network. And, lastly, there is the “revealed” specification, the way that the thing actually performs. They call these three specs, which can all vary quite a bit from one another, the wish, the design, and the behavior.
Designing artificial neural networks can be seen as an effort to close the gap between wish, design, and behavior. As they wrote in the December essay, “A specification problem arises when there is a mismatch between the ideal specification and the revealed specification, that is, when the AI system doesn’t do what we’d like it to do.”
They propose various routes to test and train neural networks that are more robust to errors, and presumably more faithful to specs.
One approach is to use AI itself to figure out what befuddles AI. That means using a reinforcement learning system, like Google’s AlphaGo, to find the worst possible ways that another reinforcement learning system can fail.
The authors did just that, in a paper published in December. “We learn an adversarial value function which predicts from experience which situations are most likely to cause failures for the agent.” The agent in this case refers to a reinforcement learning agent.
“We then use this learned function for optimisation to focus the evaluation on the most problematic inputs.” They claim that the method leads to “large improvements over random testing” of reinforcement learning systems.
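The idea of steering evaluation toward likely failures can be illustrated with a toy sketch. This is not the authors’ method: the one-dimensional “difficulty” setting, the hypothetical `toy_agent_fails` agent, and the nearest-neighbour failure predictor are all assumptions made here for illustration, standing in for the learned adversarial value function described in the paper.

```python
import random

def toy_agent_fails(difficulty):
    """Hypothetical agent (an assumption): fails only in a narrow band of conditions."""
    return 0.90 <= difficulty <= 0.95

def predicted_failure_rate(candidate, history, k=5):
    """Estimate failure probability as the failure rate of the k nearest past trials."""
    nearest = sorted(history, key=lambda rec: abs(rec[0] - candidate))[:k]
    return sum(failed for _, failed in nearest) / len(nearest)

def prioritized_candidates(candidates, history, budget):
    """Pick the `budget` candidates that the predictor rates most likely to fail."""
    ranked = sorted(
        candidates,
        key=lambda c: predicted_failure_rate(c, history),
        reverse=True,
    )
    return ranked[:budget]

# Past experience: the agent was already tried on a coarse grid of conditions.
history = [(i / 100, toy_agent_fails(i / 100)) for i in range(101)]

# New conditions we could test, and a small evaluation budget.
candidates = [i / 1000 for i in range(1000)]
budget = 20

# Adversarial evaluation: spend the budget where failures are predicted.
adversarial_picks = prioritized_candidates(candidates, history, budget)
adversarial_hits = sum(toy_agent_fails(c) for c in adversarial_picks)

# Baseline: spend the same budget on uniformly random conditions.
random_picks = random.Random(0).sample(candidates, budget)
random_hits = sum(toy_agent_fails(c) for c in random_picks)

print("failures found (prioritized):", adversarial_hits)
print("failures found (random):", random_hits)
```

In this toy setting the failures are rare, so a random sample mostly misses them, while ranking by predicted failure rate concentrates the whole budget on the problematic band.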
Another approach is to train a neural network to avoid a whole range of outputs, to keep it from going entirely off the rails and making really bad predictions. The authors claim that a “simple bounding technique,” something called “interval bound propagation,” is capable of training a “verifiably robust” neural network. That work won them a “best paper” award at the NeurIPS conference last year.
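To give a feel for what interval bound propagation computes, here is a minimal sketch, assuming a small fully connected network with ReLU activations and an input that may be perturbed by at most `eps` in each coordinate. The function names and the tiny example network are hypothetical, and the sketch covers only the bounding step, not the training procedure the authors build on top of it.

```python
def affine_bounds(W, b, lower, upper):
    """Propagate the box [lower, upper] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        center, radius = bias, 0.0
        for w, lo, hi in zip(row, lower, upper):
            center += w * (lo + hi) / 2.0       # image of the box midpoint
            radius += abs(w) * (hi - lo) / 2.0  # worst-case spread around it
        out_lo.append(center - radius)
        out_hi.append(center + radius)
    return out_lo, out_hi

def relu_bounds(lower, upper):
    """ReLU is monotonic, so it can be applied to each bound directly."""
    return [max(0.0, v) for v in lower], [max(0.0, v) for v in upper]

def output_bounds(layers, x, eps):
    """Bound every output over all inputs within eps of x (per coordinate)."""
    lower = [v - eps for v in x]
    upper = [v + eps for v in x]
    for i, (W, b) in enumerate(layers):
        lower, upper = affine_bounds(W, b, lower, upper)
        if i < len(layers) - 1:  # no activation after the final layer
            lower, upper = relu_bounds(lower, upper)
    return lower, upper

def is_verified(layers, x, eps, true_class):
    """True if the true class provably outscores every other class."""
    lower, upper = output_bounds(layers, x, eps)
    return all(
        lower[true_class] > upper[j]
        for j in range(len(lower))
        if j != true_class
    )

# Hypothetical two-layer network: a list of (weights, biases) pairs.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
]

print(is_verified(layers, [2.0, 0.0], eps=0.1, true_class=0))
print(is_verified(layers, [2.0, 0.0], eps=1.0, true_class=0))
```

The check is conservative: when it returns `True`, no perturbation inside the box can change the prediction, but a `False` result only means the bounds were too loose to decide.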
They’re now moving beyond just testing and training a neural network to avoid disaster; they’re also starting to find a theoretical basis for a guarantee of robustness. They approached it as an “optimisation problem that tries to find the largest violation of the property being verified.”
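One common way to write such an optimisation problem down is shown here. The notation is introduced for illustration and is an assumption, not taken from the post: $f$ is the network, $\mathcal{X}$ is the set of allowed inputs (for instance, every perturbation within some small distance of a test input), and $g$ is a function whose value is positive exactly when the output violates the property.

```latex
v^\star \;=\; \max_{x \in \mathcal{X}} \; g\bigl(f(x)\bigr),
\qquad
\text{property verified} \iff v^\star \le 0 .
```

Solving this exactly is generally intractable for large networks, so in practice a verifier computes an upper bound $\bar{v} \ge v^\star$; if $\bar{v} \le 0$, the property is guaranteed to hold for every input in $\mathcal{X}$.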
Despite those achievements, at the end of the day, “much work is needed,” the authors write, “to build automated tools for ensuring that AI systems in the real world will do the ‘right thing’.”
Some of that work is to design algorithms that can test and train neural networks more intensely. But some of it probably involves a human element. It’s about setting goals for AI, in the form of the objective function, that match what humans want.
“Building systems that can use partial human specifications and learn further specifications from evaluative feedback would be required,” they write, “as we build increasingly intelligent agents capable of exhibiting complex behaviors and acting in unstructured environments.”