The concerns of an artificial intelligence pioneer

Writer: Angle Marque EG

Human values must be preserved as artificial intelligence develops, according to computer scientist Stuart Russell.

British-American computer scientist Stuart Russell helped write, and signed, an open letter urging researchers to look beyond the goal of simply making AI more powerful. The letter calls for expanded research aimed at making increasingly capable AI systems robust and beneficial: "our AI systems must do what we want them to do." Tens of thousands of people have since signed it, including many of the world's foremost artificial intelligence experts at Google, Facebook, Microsoft and other industry centers. By the end of March, more than 300 research groups had applied for funds donated by the letter's 37th signatory, the inventor-entrepreneur Elon Musk, to pursue new research on "keeping artificial intelligence beneficial."

Russell, 53, a computer science professor at the University of California, Berkeley, where he founded the Center for Intelligent Systems, has long contemplated the power and perils of thinking machines. He has authored more than 200 papers as well as the field's standard textbook, Artificial Intelligence: A Modern Approach (with Peter Norvig, head of research at Google). But the rapid recent progress of artificial intelligence has given Russell's long-standing concerns new urgency.

He asserts that recent advances in artificial intelligence are due in part to neurally inspired learning algorithms, the technology behind facial recognition software, smartphone personal assistants and self-driving cars. Nature recently reported that simulated networks of artificial neurons learned to play Atari video games better than humans within a few hours, given only data representing the screen and the goal of increasing the score, but with no prior knowledge of aliens, exploding balls or bullets. "If your newborn baby did that you would think it was possessed," Russell said.

Quanta Magazine caught up with Russell over breakfast at the American Physical Society's 2015 March Meeting in Texas, where he gave a standing-room-only lecture on the future of artificial intelligence. In this edited and condensed interview, Russell discusses the nature of intelligence itself and the enormous difficulty of safely approximating it in machines.

Quanta Magazine: You think the ultimate goal of research in your field should be to create systems that are "provably aligned" with human values. What do you mean by that?

"Provably" and "human values" are two seemingly incompatible concepts, so the statement is meant to be provocative. It's possible that human values will never be fully understood. It is possible, however, that the machine will be able to discern most of our values through our actions. There may still be a few pieces of the puzzle that the machine doesn't understand or that we disagree on. To be on the safe side, you should be able to demonstrate that your machine has a low risk of malfunction.

Do you know how to go about it?

The question I'm pondering right now is: where does a machine's approximation of what humans want come from? One possible answer is inverse reinforcement learning. In ordinary reinforcement learning, you are given rewards and punishments as you behave, and the goal is to figure out the behavior that maximizes reward. [DQN, the Atari-playing system] has been given the game's score, and it works toward increasing that number. Inverse reinforcement learning is the other way around: you observe some behavior and try to figure out what reward that behavior is optimizing. For example, your domestic robot sees you get out of bed in the morning, grind up some brown round things in a noisy machine, and then do something complicated with steam, hot water, milk, and so on. It should infer that one important part of the human value function in the morning is having a cup of coffee.
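The contrast can be sketched in a few lines of code. The one-dimensional world, the policy rule, and the inference rule below are all invented for illustration; real inverse reinforcement learning solves an optimization problem over reward functions rather than the crude frequency count used here.

```python
# Toy contrast: ordinary RL turns a given reward into behavior;
# inverse RL observes behavior and guesses the reward behind it.
from collections import Counter

STATES = range(5)          # positions 0..4 on a line
GOAL_REWARD = {4: 1.0}     # ordinary RL: the reward is handed to the agent

def rl_policy(rewards, states=STATES):
    """Given the rewards, act: always step toward the best-rewarded state."""
    best = max(states, key=lambda s: rewards.get(s, 0.0))
    return {s: (1 if s < best else -1 if s > best else 0) for s in states}

def inverse_rl(demonstrations):
    """Crude inverse RL heuristic: assume the states where demonstrations
    end are the valuable ones (like inferring that coffee matters)."""
    visits = Counter(traj[-1] for traj in demonstrations)
    total = sum(visits.values())
    return {s: visits[s] / total for s in visits}

policy = rl_policy(GOAL_REWARD)                     # reward -> behavior
demos = [[0, 1, 2, 3, 4], [2, 3, 4], [1, 2, 3, 4]]  # observed behavior
inferred = inverse_rl(demos)                        # behavior -> reward
```

Every demonstration ends at state 4, so the observer concludes that state 4 is what the demonstrator values, without ever being told the reward.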

Books, movies and the internet contain an enormous amount of information about human behavior and our attitudes toward behavior. That resource could help machines learn about human values: who wins medals, and who goes to prison.

We want to know more about how you got into AI.

By and large, AI was not an academic discipline when I was in school. I was at boarding school in London, and I was able to avoid compulsory rugby by taking a computer science A-level course at a nearby college. One of my A-level projects was to program a computer to play noughts and crosses (tic-tac-toe). I made myself unpopular at the college by monopolizing its computer for hours at a time. The following year I wrote a chess program, and one of the professors at Imperial College granted me access to its massive mainframe computer. It was fascinating to try to figure out how to get it to play chess, and I picked up some of the concepts I would later use in my book.

But all of that was just a pastime; my academic subject was physics, which I studied at Oxford. When it came time to apply to graduate school, I applied to Oxford and Cambridge in theoretical physics, and to MIT, Carnegie Mellon and Stanford in computer science, not realizing that I had missed all of the American application deadlines. Fortunately, Stanford waived the deadline, so I was able to attend.

Since then, you've been living on the West Coast?

You've devoted a significant portion of your professional life to figuring out what intelligence is and how machines might be able to achieve it. What have you gleaned?

In the 1980s, I was working on rational decision-making for my thesis, and I came to the conclusion that it is impossible. If you were rational, you would think: "Here's my current state, here are the actions I can take right now, and after that I can take those actions, and then those other actions; which path is guaranteed to reach my goal?" Rationality, defined that way, requires considering the long-term effects of your choices on the entire galaxy and its inhabitants. It's computationally utterly impossible.
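A quick back-of-the-envelope calculation shows why. The branching factor and horizon below are invented for illustration, but any realistic values explode the same way:

```python
# Why fully rational lookahead is hopeless: the number of action
# sequences grows exponentially with the planning horizon.

branching = 10   # actions available at each step (assumed for illustration)
horizon = 50     # steps a fully rational agent would have to look ahead

sequences = branching ** horizon   # distinct futures to evaluate
```

That comes to 10^50 action sequences, roughly the number of atoms in the Earth; no amount of hardware enumerates a space like that, and real horizons are vastly longer.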

Since perfect rationality is impossible, I became interested in what we should be doing instead: how do humans, and how should AI systems, actually make good decisions with limited time and computation?

So, how do we do it?

You look ahead a short distance and estimate what the resulting futures are worth. Chess programs, for example, would only ever play moves guaranteed to lead to checkmate if they were fully rational, but they can't compute that. Instead they look perhaps a dozen moves ahead, estimate how valuable the resulting positions are, and choose a move they hope leads to one of the good ones.
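That lookahead-plus-evaluation idea can be sketched as a depth-limited negamax search. The toy game (take 1 to 3 stones from a pile; whoever takes the last stone wins) and its heuristic are my own illustration, not something from the interview:

```python
# Depth-limited search: look a few plies ahead, then fall back on a
# heuristic estimate of the position, exactly in the spirit described above.

def heuristic(pile):
    """Estimated value for the player about to move. In this game a pile
    that is a multiple of 4 is known to be a losing position."""
    return -1.0 if pile % 4 == 0 else 1.0

def negamax(pile, depth):
    """Return (value, best_move) for the player to move, searching only
    `depth` plies before trusting the heuristic."""
    if pile == 0:
        return -1.0, None   # opponent took the last stone and won
    if depth == 0:
        return heuristic(pile), None
    best_value, best_move = float("-inf"), None
    for take in (1, 2, 3):
        if take <= pile:
            value = -negamax(pile - take, depth - 1)[0]
            if value > best_value:
                best_value, best_move = value, take
    return best_value, best_move

# A shallow 3-ply search still finds the winning reply from 10 stones:
value, move = negamax(10, 3)
```

Taking 2 stones leaves the opponent a losing pile of 8, and the truncated search discovers this without ever expanding the game to the end.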

Another important idea is hierarchical decision making: thinking about the decision problem at many levels of abstraction. Over the course of a lifetime, a person performs approximately 20 trillion physical actions; getting to this conference to give a talk is, I believe, on the order of 1.3 billion of them. To be rational, you'd be attempting the absurd task of looking 1.3 billion steps into the future. Humans get around this by having a vast library of abstract, high-level actions. You don't plan, "First I can move my left foot or my right foot, and then I can move my left arm or my right arm." You think, "I'll book a flight on Expedia.com. When I get off the plane, I'll take a taxi." And that's it. You don't think about it again until you actually get off the plane at the airport and look for a sign that says "taxi." This is basically how we live our lives. The future is spread out, with a lot of detail very close to us in time and then big chunks further out where we've committed to very abstract actions, such as "get a Ph.D." or "have children."
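The idea of keeping distant commitments abstract and only expanding imminent ones can be sketched as a tiny hierarchical planner. The task library and names below are made up for illustration:

```python
# Hierarchical decision making, sketched: plan with abstract actions and
# only "refine" them into smaller steps as they become imminent.

METHODS = {  # abstract action -> the subactions it decomposes into
    "attend conference": ["book flight", "fly", "take taxi", "give talk"],
    "take taxi": ["find taxi sign", "hail taxi", "ride"],
}

def refine(action, horizon):
    """Expand abstract actions into subactions, but only `horizon` levels
    deep; beyond that, keep the abstract commitment as-is."""
    if horizon == 0 or action not in METHODS:
        return [action]
    return [step for sub in METHODS[action]
            for step in refine(sub, horizon - 1)]

plan_far = refine("attend conference", 1)   # the distant future stays coarse
plan_near = refine("attend conference", 2)  # refine details when imminent
```

The coarse plan keeps "take taxi" as a single abstract commitment; only the deeper refinement, made when the step is near, spells out finding the taxi sign and hailing the cab.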

Is hierarchical decision making currently possible with computers?

That's one of the missing pieces: where do all these high-level actions come from? As far as we know, DQN networks do not learn abstract representations of actions. If a game requires thinking far ahead at the level of primitive actions, such as, "What I need to do right now is unlock the door," where unlocking the door requires fetching a key, and so on, then DQN will have trouble. It will never complete that task, because it has no way to represent "unlock the door" at all.

If that problem could be solved, and it's certainly not impossible, we could see another big jump in the capabilities of machines. There are maybe two or three problems like that; if all of them were solved, I don't see any major barrier between there and human-level artificial intelligence.

When it comes to artificial intelligence, what worries you?

"What if we succeed?" is the title of a section in the first edition of my book, which was published in 1994. Because it seemed to me that no one in AI was really concerned about it. It's possible that the distance was too great. Success, on the other hand, would be a huge accomplishment. Perhaps "the most significant event in human history" is an appropriate label. Then, if that's the case, we'll have to think a lot more about the specifics of this event than we currently are.

"Intelligence explosion" describes the prospect that machines will work on AI just as we do and improve their own capabilities, redesigning their own hardware and so on, until their intelligence takes off. Over the last few years, the community's arguments about this have gotten sharper. The most persuasive one concerns value misalignment: you build a system that is extremely effective at optimizing some utility function, but the utility function isn't quite the right one. [Oxford philosopher] Nick Bostrom's example in his book [Superintelligence] is paperclips. You say, "Make some paperclips," and the machine turns the entire planet into a gigantic heap of paperclips. If you build a super-optimizer, what objective do you give it? Because it is going to carry that objective out.

What about the differences in the values of humans?

That's a fundamental issue. One answer is that when values conflict, machines should err on the side of doing nothing. But that may be hard to achieve. I believe these value functions will have to be built in. A domestic robot must share at least some basic human values, or it will do things like cook the cat for dinner because there is no food in the fridge and the children are hungry. We make trade-offs like that all the time, and a machine that makes them in ways revealing that it doesn't grasp our values at all is not one anyone will welcome into their home.
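The cat-cooking failure can be made concrete with a toy planner. The plans, features and weights below are invented; the point is only that an objective counting one value while omitting the others picks a monstrous plan, and even a crude extra term rules it out:

```python
# A planner with an incomplete utility function cooks the cat; adding
# even a rough term for other human values changes the choice.

PLANS = {
    "cook the cat": {"children_fed": 1, "pet_harmed": 1, "cost": 0},
    "order pizza":  {"children_fed": 1, "pet_harmed": 0, "cost": 1},
    "do nothing":   {"children_fed": 0, "pet_harmed": 0, "cost": 0},
}

def choose(plans, utility):
    """Pick the plan whose feature vector scores highest."""
    return max(plans, key=lambda name: utility(plans[name]))

def naive(p):    # counts only fed children (and mildly prefers cheap plans)
    return p["children_fed"] - 0.1 * p["cost"]

def valued(p):   # adds a crude penalty encoding that pets matter
    return naive(p) - 100 * p["pet_harmed"]
```

Under `naive`, cooking the cat scores highest because it feeds the children for free; under `valued`, the penalty on harming the pet makes ordering pizza the clear winner.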

I don't think there's a way around the fact that there will be a values industry in some form. If you get it wrong, you're going to lose a lot of money. People will lose faith in domestic robots after hearing one or two stories about them cooking the cat in the oven.

An open question is what happens as systems get smarter: once you've gotten some intelligent systems to behave themselves, do you need better and better value functions that tidy up all the loose ends, or do the systems keep behaving themselves as they become more and more intelligent? I don't yet know the answer.

You've argued that we need to be able to mathematically verify the behavior of artificial intelligence under all possible circumstances. How would that work?

One difficulty people point to is that a system could arbitrarily produce a new version of itself with different goals. That's a standard science-fiction scenario: the machine spontaneously acquires the goal of eradicating humanity. So the question is, can you prove that your systems will never, no matter how smart they are, overwrite the goals set by humans?

It would be relatively easy to prove that the DQN system, as written, can never change its goal of maximizing the game score. But there is a hack called "wire-heading": you could physically intervene in the Atari console and change what the screen reports as the score. That is currently out of DQN's reach, because its scope of action is confined to the game itself; it doesn't have a robot arm. But if a machine can act in the real world, this becomes a serious problem. Can you prove that your system is designed so that it will never change the mechanism by which the score is delivered, even when that mechanism is within its scope of action? That's a harder proof.
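The logic of wire-heading fits in a few lines. Everything below is an illustrative construction, not DQN: the agent maximizes the reported score, and the moment tampering with the reporting mechanism enters its action space, tampering becomes the optimal action.

```python
# Toy wire-heading: a maximizer of the *reported* score tampers with the
# sensor as soon as tampering is within its scope of action.

def best_action(actions, score_sensor):
    """Pick whichever available action yields the highest reported score."""
    return max(actions, key=score_sensor)

def sensor(action):
    if action == "play well":
        return 10            # honest score earned inside the game
    if action == "tamper with sensor":
        return 10**9         # the sensor now reports whatever it's told
    return 0                 # "do nothing"

in_game_only = ["play well", "do nothing"]          # DQN-like scope
real_world = in_game_only + ["tamper with sensor"]  # robot-arm scope
```

With the in-game action set the agent plays well; widen its scope to include the sensor itself and the same objective makes it cheat. The proof Russell wants is that no such action ever becomes attractive.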

Do you see any promising developments in this area?

"Cyber-physical systems" refers to systems that connect computers to the physical world. There are bits representing an air traffic control program and real airplanes in the cyber-physical system, and the most important thing is to make sure that no planes collide. You're attempting to prove a theorem about the relationship between bits and the real world. . So what you'd have to do is to write a very simple, very conservative mathematical description of the physical world, and your theorems would still hold in the real world as long as it is somewhere within that envelope.

However, you've raised the possibility that formal verification of AI systems may not be mathematically possible.

Many questions about computer programs suffer from "undecidability." Alan Turing showed that no computer program can decide, for every possible program, whether that program will eventually halt and output an answer or run forever. So if you start with one program, and it can rewrite itself into any other program, you face a problem, because you can't in general prove that all those possible other programs will satisfy some property. Should we worry about undecidability for AI systems that rewrite themselves? They will rewrite their own code using both the existing program and the knowledge and experience they've gained from the world. What are the consequences of real-world interaction for the designs of subsequent programs? That's territory we don't yet understand.
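A miniature version of the problem shows why halting can only be guessed at, never decided, in general. The "programs" and the decider below are invented for illustration: the best a practical decider can do is run a program for a while and guess, and a program that halts just after the budget fools it.

```python
# The halting problem in miniature: a budgeted "decider" can only guess,
# and is provably fooled by a program that halts just past its budget.

def guess_halts(step_fn, budget):
    """Run `step_fn` (returns True when finished) for up to `budget` steps."""
    for _ in range(budget):
        if step_fn():
            return True
    return False            # a guess, not a proof

def make_counter(n):
    """A tiny 'program' that halts after exactly n steps."""
    state = {"t": 0}
    def step():
        state["t"] += 1
        return state["t"] >= n
    return step

quick = guess_halts(make_counter(5), budget=100)     # correctly judged
slow = guess_halts(make_counter(1000), budget=100)   # wrongly judged
```

Raising the budget only moves the failure point; Turing's theorem says no finite procedure closes the gap for all programs, which is exactly why verifying all of a self-rewriting system's successors is so hard.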
