Review: Human Compatible: Artificial Intelligence and the Problem of Control by Stuart Russell

Professor Russell’s book starts out with an entertaining journey through the history of AI and automation, as well as cautionary thinking about them. This discussion is well informed – he is a renowned AI academic and co-author of a comprehensive and widely used AI textbook.

With this historical background in place, the remainder of the book argues two main points: (1) the current approach to AI development is having dangerous side effects, and it could get much worse; and (2) we need to build AIs that learn to satisfy human preferences.

Concerning the dangers of AI, the author first addresses current perils: misuse of surveillance, persuasion, and control; lethal autonomous weapons; eliminating work as we know it; and usurping other human roles. I found this part of the book an informative and well-reasoned analysis.

Beyond AI’s current perils, the author next addresses the possibility of AIs acquiring superhuman intelligence and eventually ruling and perhaps exterminating humankind. The author believes this is a definite possibility, placing him in basic agreement with works such as Bostrom’s Superintelligence and Tegmark’s Life 3.0. AI’s existential threat is the subject of continuing debate in the AI community, and Russell attempts to refute the arguments made against his position.

Russell bases his case for AI’s existential threat on two basic premises. The first is that in spite of all the scientific breakthroughs required to initiate superintelligence (well documented by Russell), you cannot rule out humans achieving these breakthroughs. While I appreciate this respect for science and engineering, clearly some human achievements are more within reach than others. Humans understanding human intelligence, let alone creating human-level machine intelligence, seems to me too distant to speculate about except in science fiction.

Russell’s second premise is that unless we change course, superintelligence will be achieved using what he calls the standard model, which creates AIs by optimizing them to meet explicit objectives. This would pose a threat to humanity, because a powerful intellect pursuing explicitly defined objectives can easily spell trouble – for example, if an AI decides to fix global warming by killing all the people.

I don’t follow this reasoning. I find it contradictory that an AI would somehow be both superintelligent and bound by fixed, concrete objectives. In fact, in the last part of the book, Russell goes to great pains to illustrate how human behavior, and presumably human-level intelligence, is far more complicated than sequences of explicit objectives.

In the last part of the book Russell advocates developing provably beneficial AI, a new approach that would build AIs that learn to satisfy human preferences instead of optimizing explicit objectives. While I can see how this would be an improvement over homicidal overlords, I don’t think Russell makes the case that this approach would be even remotely feasible.

To suggest how we might grapple with provably beneficial AI, he spends a good deal of time reviewing mathematical frameworks that address human behavior, such as utility theory and game theory, giving very elementary examples of their application. I believe these examples are intended to make the math accessible to a general audience, which I applaud. However, what they mainly illustrate is how much more complicated real life is than these trivial examples. Perhaps this is another illustration of Russell’s faith that human ingenuity can reach almost any goal as long as it knows where to start – like scaling up a two-person game to billions of interacting people.

I was very pleased to read Russell’s perspective on the future of AI. He is immersed in the game, and he is definitely worth listening to. However, I have real difficulty following his extrapolations from where we are today to either superintelligence or provably beneficial AI.

Learning Common Sense from Video

Common sense makes humans very efficient learners, so machine learning researchers have been working on ways to imbue machines with at least some ‘common sense’. In a previous blog we discussed using pictures to train natural language processing systems, in a sense giving the systems partial ‘knowledge’ of what words represent in the physical world. ML systems can get even closer to common sense with a little help from video ML models and human teachers.

In my latest iMerit blog I discuss an innovative deep learning architecture that applies the concept of attention, commonly used in sequence models for language processing, to analyze motion patterns in video using only 30 percent of the computations used in previous approaches.
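For readers unfamiliar with attention, here is a minimal sketch of the general mechanism – scaled dot-product self-attention applied to a short sequence of per-frame feature vectors. This illustrates only the basic idea, not the architecture described in the blog; the frame count, feature size, and random weights are all assumptions.

```python
import torch
import torch.nn.functional as F

# Hypothetical inputs: 8 video frames, each summarized by a 64-dim feature vector.
frames = torch.randn(8, 64)

# Randomly initialized projections stand in for learned query/key/value weights.
Wq, Wk, Wv = (torch.randn(64, 64) for _ in range(3))
q, k, v = frames @ Wq, frames @ Wk, frames @ Wv

# Each frame attends to every other frame, letting the model relate motion
# across time in a single step instead of scanning frame by frame.
weights = F.softmax(q @ k.T / 64 ** 0.5, dim=-1)  # (8, 8) frame-to-frame attention
attended = weights @ v                            # (8, 64) motion-aware frame features
```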

Next I discuss training such a video analysis system to learn the basic language of movement. For this training the human teacher goes beyond typical training data annotation, drawing on knowledge of the physical world to improvise representative examples of the basic concepts of movement. It is hoped that this will give the ML system a bit of ‘common sense’, allowing it to more easily learn new video analysis tasks.

Learning Words with Pictures

Natural language processing (NLP) machines have made great progress by learning to recognize complex statistical patterns in sentences and paragraphs. Work with modern deep learning models such as the transformer has shown that sufficiently large networks (hundreds of millions of parameters) can do a good job processing language (e.g., translation), without having any information about what the words mean.

We humans make good use of meaning when we process language. We understand how the things, actions, and ideas described by language relate to each other. This gives us a big advantage over NLP machines – we don’t need the billions of examples these machines need to learn language.

NLP researchers have asked the question, “Is there some way to teach machines something about the meaning of words, and will that improve their performance?” This has led to the development of NLP systems that learn not just from samples of text, but also from digital images associated with the text, such as the one above from the COCO dataset. In my latest iMerit blog I describe such a system – the Vokenizer!

Navigating the Cost Terrain with Minibatches

Training a machine learning system requires a journey through the cost terrain, where each location in the terrain represents particular values for all of the ML system’s parameters, and the height of the terrain is the cost – a mathematical value that reflects how well the ML system performs for that parameter set (smaller cost means better performance). For a very simple ML system with only two parameters, we can visualize the cost terrain as a mountainous territory with peaks and valleys, plateaus and saddle points. (Deep learning cost terrains are a lot like this, only instead of three dimensions they can have millions!)

Training mathematically explores the cost terrain, taking steps in promising directions and hoping not to fall off a cliff or get lost on a plateau. Our guide on this journey is gradient descent, which calculates the best next step in the search for the best ML system parameters – the ones that lie in the lowest valley of the cost terrain.
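As a minimal sketch of what “stepping downhill” means, here is plain gradient descent on an assumed two-parameter cost surface – a simple bowl chosen for illustration, not any particular ML system:

```python
import numpy as np

# Assumed toy cost terrain: a bowl whose lowest point is at (3, -1).
def cost(w):
    return (w[0] - 3.0) ** 2 + 2.0 * (w[1] + 1.0) ** 2

def gradient(w):
    # Direction of steepest ascent; we step the opposite way.
    return np.array([2.0 * (w[0] - 3.0), 4.0 * (w[1] + 1.0)])

w = np.array([0.0, 0.0])        # starting location in the cost terrain
learning_rate = 0.1             # how big each step is
for step in range(100):
    w = w - learning_rate * gradient(w)

print(w, cost(w))               # w ends up near (3, -1), the lowest valley
```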

Gradient descent can be very cautious and look at all the training samples before taking a step. This makes sure the step is a very good one, but progress is slow because it takes a long time to look at all the training samples. Or, it can make a guess and take a step after every training sample it looks at. These snap decisions produce rapid steps in the cost terrain, but with a lot of motion and little progress, because each step is all about one training sample, and we want the ML system to give good average performance across all the training samples.

The best way to efficiently navigate the cost terrain is a compromise between slow deliberation and snap judgement, called minibatching. This approach takes a step using a small subset of the training set – enough to get a pretty good idea of where to go, but a small enough sample that the calculations can be done quickly using modern vector processors.
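Here is a minimal sketch of the minibatch idea, using an assumed toy linear model and synthetic data; the batch size, learning rate, and model are illustrative choices, not the blog’s example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed synthetic training set: 10,000 samples, 20 features, linear targets.
X = rng.normal(size=(10_000, 20))
true_w = rng.normal(size=20)
y = X @ true_w + 0.1 * rng.normal(size=10_000)

w = np.zeros(20)
batch_size, learning_rate = 64, 0.05

for epoch in range(5):
    order = rng.permutation(len(X))           # shuffle so each minibatch is a fresh sample
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        # Gradient of mean squared error, estimated from this minibatch only.
        grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(idx)
        w -= learning_rate * grad
```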

Read my latest iMerit blog to get a better idea of how minibatching works:

Learning Without a Teacher

Machine learning applications generally rely on supervised learning, learning from training samples that have been labeled by a human ‘teacher’. Unsupervised learning learns what it can from unlabeled training samples. What can be learned this way are basic structural characteristics of the training data, and this information can be a useful aid to supervised learning.

In my latest iMerit blog I describe how the long-used technique of clustering has been incorporated into deep learning systems, to provide a useful starting point for supervised learning and to extrapolate what is learned from labeled training data.
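As a rough sketch of the idea (not the specific systems discussed in the blog), here is how clustering can assign “pseudo-labels” to unlabeled feature vectors; the data, feature size, and cluster count are assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Assumed stand-in for unlabeled data: feature vectors (e.g., a network's
# intermediate-layer outputs) for 1,000 unlabeled samples.
features = rng.normal(size=(1_000, 32))

# Cluster the unlabeled samples; the cluster ids act as rough pseudo-labels
# that a later supervised stage can start from or be regularized by.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(features)
pseudo_labels = kmeans.labels_   # one cluster id per unlabeled sample
```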

Inside an AI

Artificial intelligence gets its name from the fact that AIs perform tasks associated with human intelligence, such as recognizing faces or understanding language or playing chess. For these tasks, we can measure AI performance and compare it to human performance, using a single ‘yardstick’, such as accuracy or word error rate or games won.

But can artificial intelligence and human intelligence be compared in a general way, using a single yardstick? Is there a general intelligence scale upon which, for example, humans might average 500, today’s best AIs 275, and future superintelligent AIs 1000?

Of course, it is difficult to measure even human intelligence on a single scale. It is generally acknowledged that measures like IQ tests, while useful as predictors of particular capabilities, do not capture the breadth of human intelligence.

However, putting aside the fundamental difficulty of quantifying intelligence, human or otherwise, can we compare human and artificial intelligence, beyond performance on specific tasks? Should we talk about humans being smarter than AIs, or vice versa? I would say ‘No’. Today human and artificial intelligence are so different that it doesn’t make sense to try to compare them along a single scale.

One striking difference between AIs and humans shows up in the way deep neural networks work. These networks, at the heart of today’s most advanced AIs, learn patterns from huge masses of data, and use these patterns to ‘understand’ things like visual images or language. However the way these networks ‘perceive’, ‘learn’, and ‘understand’ the world is decidedly non-human.

Let’s consider machine translation as an example. First, a little history. In 1954 an IBM 701 computer was programmed with a dictionary and rules that allowed translation of Russian sentences into English. The results were so encouraging that it was predicted that the problem of automatic machine translation would be completely solved in 3 to 5 years.

However in the next 10 years little progress was made. Research in machine translation came to be considered such a long shot that funding was drastically curtailed. Critics at the time pointed out that human translation requires complex cognitive processing that would be extremely difficult or impossible to program into computers. When humans interpret language, we don’t just hear it as sounds or see it as symbols; we understand it as objects, actions, ideas, and relationships, and this understanding is key to how we process language.

In the next decades, researchers in machine translation tried to get closer to human understanding by developing complex models of linguistic structure and meaning. While this enabled machine translators to gradually improve, human translators performed much better.

In more recent years deep neural networks began to be applied to machine translation, greatly improving performance. In 2016, Google announced the GNMT system, a deep neural network that reduced translation errors by 60% compared to previous methods.

How did GNMT achieve this quantum jump in performance? Did Google engineers finally figure out how to program the kind of understanding into their computers that humans need to make good translations?

The answer to this last question is: “No, quite the opposite!” The designers of GNMT did away with any attempt to incorporate human-like knowledge. GNMT relies on none of the complex models of language structure and meaning used by previous methods.

Instead, GNMT uses a type of neural network called Long Short-Term Memory (LSTM). Basically, LSTM [Note 1] allows sequences of output numbers (translated sentences) to be calculated from sequences of input numbers (sentences to be translated). The calculations in GNMT are controlled by hundreds of millions of parameters. Millions of examples are used to set these parameters, through a trial-and-error adjustment procedure.
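For readers who want to see the shape of such a system, here is a heavily simplified encoder-decoder sketch in PyTorch. It is not GNMT – the real system is vastly larger and more elaborate – just a minimal illustration of sequences of input numbers being mapped to sequences of output numbers through an LSTM; all sizes are assumptions.

```python
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    """Minimal LSTM encoder-decoder: input word ids -> scores over output words."""
    def __init__(self, vocab_in=10_000, vocab_out=10_000, hidden=256):
        super().__init__()
        self.embed_in = nn.Embedding(vocab_in, hidden)
        self.embed_out = nn.Embedding(vocab_out, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, vocab_out)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.embed_in(src_ids))          # read the source sentence
        out, _ = self.decoder(self.embed_out(tgt_ids), state)    # generate conditioned on it
        return self.proj(out)                                    # scores over the output vocabulary

# Example: a batch of one 5-word source sentence and a 6-word target prefix.
model = TinySeq2Seq()
scores = model(torch.randint(0, 10_000, (1, 5)), torch.randint(0, 10_000, (1, 6)))
print(scores.shape)   # torch.Size([1, 6, 10000])
```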

As an illustration of how different a deep neural network translator is from a human translator, consider how such a system typically represents a word to be translated. A translator with a 10,000-word vocabulary, for example, might represent each word by a string of 10,000 ones and zeroes, with a one in the position corresponding to the word’s place in the vocabulary list, and all the rest of the string zeroes. This way of representing words is called ‘one-hot’ encoding [Note 2]. Experimentation has shown it works well with deep neural networks doing language processing [Note 3].

For example, if the word ‘elephant’ is the 2897th word in the translator’s vocabulary, what the machine translator ‘sees’ when presented with the word is a string with 9,999 zeroes, with a single one in position 2897 of the string. All it ‘knows’ about ‘elephant’ at this point, is that it is word number 2897 in its vocabulary.
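Here is what that representation looks like in code – a minimal sketch, with the vocabulary size and the (hypothetical) position of ‘elephant’ taken from the example above:

```python
import numpy as np

vocab_size = 10_000
elephant_position = 2897               # hypothetical: 'elephant' is word 2897 in the vocabulary

one_hot = np.zeros(vocab_size)
one_hot[elephant_position - 1] = 1.0   # a single one, 9,999 zeroes everywhere else

# This is everything the translator 'knows' about the word at this point.
print(one_hot.sum(), int(np.argmax(one_hot)) + 1)   # 1.0, 2897
```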

Contrast this with a human translator, who probably remembers many things about elephants, as soon as the word is encountered.

The power of a deep neural network comes from its ability to find patterns in word occurrence by analyzing millions of translated documents. Machine translation has always used patterns of word occurrence, for example, which words are more likely to follow other words. However deep neural networks take this to a whole new level, recognizing extremely complex patterns that link and relate many words to many other words.

That these huge deep neural network translators can be built and perform so well is a tribute to years of creative engineering and systematic experimentation in AI. Decades ago, no one really knew that it would be possible to train a network with hundreds of parameters, let alone hundreds of millions. And it was equally unexpected that good machine translation could be done by using only patterns of word occurrence, without any reference to word meaning.

GNMT is an engineering marvel, to be sure. However, its mechanistic translation incorporates nothing about what words refer to in the real world. It ‘knows’ the Japanese sentence ‘Watashi no kuruma wa doko desu ka?’ translates to ‘Where is my car?’, but it has no idea what a car is (other than the words ‘car’ tends to occur with) or that the question refers to a location on planet Earth that the questioner is likely to walk to.

Today, we compare machine and human translation, and the machines are looking very good. But what does this tell us about how artificial and human intelligence compare? Is this an example of AI catching up to human intelligence? No, it is only machine translation catching up to human translation.

Note 1: Four years is a long time in AI, and further progress has been made since GNMT. Transformer architectures have replaced LSTM as the architecture of choice for many applications in language processing. The evolution from LSTM to Transformer is an example of a fascinating aspect of AI deep learning progress: simpler architectures often perform better, when more compute power becomes available. GNMT’s LSTM is an example of a ‘bi-directional recurrent neural network with memory states and attention’, which is as complicated as it sounds – sentences are sequentially processed through a neural network that updates states that represent how words depend on other words that come before and after, and how far ahead or backwards words make a difference. Transformers do away with a lot of that, and just take in whole sentences at once.

Note 2: Generally, encoding just means transforming one representation to another, according to a set of rules, like converting ‘elephant’ to a string of ones and zeroes. Encoding is used in a couple of other senses in GNMT. The diagram at the start of this blog shows that an LSTM-type neural network can be divided into a front-end encoder and a back-end decoder. The encoder and decoder in this case describe mapping from the input language to the neural network’s internal representation (encoding), and mapping from the internal representation to the output language (decoding). These mappings are what the neural network learns by training on millions of examples. Also note that language translation itself is a form of encoding – a transformation of the input language to the output language.

Note 3: Although one-hot encoding is frequently used in LSTM neural networks, GNMT actually uses a more sophisticated technique that encodes word segments instead of complete words. The network learns to break up words in ways that maximize its ability to make good guesses for word translations outside its vocabulary.
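To make the word-segment idea concrete, here is a toy greedy segmenter. It is not GNMT’s actual wordpiece algorithm – real systems learn their piece inventory from data – just an assumed illustration of how a word outside the vocabulary can still be encoded from known pieces:

```python
# Hypothetical piece inventory; a real system learns these pieces from its training data.
pieces = {"elephant", "ele", "phant", "ant", "s", "e", "l", "p", "h", "a", "n", "t"}

def segment(word, pieces):
    """Greedily match the longest known piece at each position."""
    segments, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in pieces:
                segments.append(word[i:j])
                i = j
                break
        else:
            segments.append(word[i])   # unknown character: fall back to a single symbol
            i += 1
    return segments

print(segment("elephants", pieces))   # ['elephant', 's'] – covered even if 'elephants' was never seen
```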

AI’s Superpower

Just putting ‘artificial’ and ‘intelligence’ together in the same term is enough to get people pretty excited.

For some, ‘Artificial Intelligence’ can only be a misnomer. True intelligence is uniquely human: biologically evolved, embodied, necessarily shaped by environment and social relationships, non-algorithmic, and beyond the reach of mere computation. Anything that becomes possible for human-built computers is, by definition, not really intelligence.

For others, natural intelligence is simply computation performed on a relatively slow biological computer that took hundreds of thousands of years to evolve. It is only a matter of time before the exponential improvement in computing technology allows AI to surpass the power of human brains.

The loaded nature of the term AI has also led to a variety of definitions, and identification of subcategories such as Narrow AI and Artificial General Intelligence. Sometimes AI is differentiated from terms such as ‘machine learning’ or ‘automation’.

I prefer a simple and pragmatic definition for AI – technology that can perform tasks previously requiring human intelligence. This definition does not address the limits or scope of AI; it simply acknowledges that we have developed, and will continue to develop, systems that perform tasks previously requiring human intelligence.

This definition will be too broad for some people’s taste. After all, electronic calculators fit the definition, and nobody considers them AI. However, I think of AI as a pursuit rather than a destination, with a leading edge that continues to advance. In practice, when we talk about AI, we are usually talking about technology near the leading edge.

In a previous post, I addressed why I think AI is neither comparable to human intelligence nor a threat to humans. But I also think the leading edge of AI, deep neural networks, is very impressive.

Deep neural networks map patterns in data to outputs that represent some useful interpretation of the data, such as the identity of a face or the translation of a spoken sentence. In a sense, this capability is pretty simple; these AIs can be dismissed as mere ‘curve fitters’. What makes deep neural networks so useful?
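To make the ‘curve fitter’ label concrete before answering that question, here is a miniature example – a tiny network (an assumed toy setup, not any production AI) adjusting its parameters to fit a sine curve from example points:

```python
import torch
import torch.nn as nn

# Assumed toy data: 200 points sampled from a sine curve.
x = torch.linspace(-3.0, 3.0, 200).unsqueeze(1)
y = torch.sin(x)

# A small network: in the end, just a flexible curve with adjustable parameters.
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)

for step in range(2000):
    loss = ((net(x) - y) ** 2).mean()   # how far the fitted curve is from the data
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(loss.item())   # small after training: the network has 'fit the curve'
```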

Here are three things that give these AIs ‘superpowers’:

  • Patterns are everywhere
  • Data is abundant
  • AI learning extends human programming.

Patterns are everywhere

Recognizing patterns is central to the way we humans live, work, and play. For example:

  • Patterns in our environment tell us what we can eat, where we can find food, when we need to take shelter, and how to turn the wheel of our car
  • Social patterns bond children to mothers, attract mates, expose cheaters
  • Humans impose patterns on their environment – constellations in the stars, orbital mechanics – to enrich understanding and guide exploration
  • Patterns of language – spoken, written, schematic – communicate ideas and directions, and preserve the growing body of human knowledge
  • Patterns are used by detectives to fight crime, and by financial analysts to make money
  • We amuse and enrich ourselves through patterns in music and art, and in puzzles and games.

Data is abundant

Much of our reality these days is represented digitally, on the web or in databases. This gives unprecedented access to information about the patterns central to our lives. If only we had enough eyes and brains to examine and digest this huge volume of data! But this task is a perfect fit for deep neural networks: feed them enough data and they can discover extremely complex patterns.

For example, automatic speech-to-text recognition has been revolutionized by deep neural networks. One such network, with 5 billion connections, is possible only because it could be trained with lots of data: 3 million audio samples, together with 220 million text samples drawn from a 495,000-word vocabulary.

AI learning extends human programming

Obviously, it takes humans to program deep neural networks. But these networks are programmed to ‘learn’, in the sense that they adjust their own parameters during the training process.

The fact that very large deep neural networks can be trained and give good results is a relatively recent discovery in AI.  Why these networks work so well is not well understood theoretically, but extensive experimentation has led to innovative designs and good results. This work has been carried out by a thriving, innovative community of AI researchers and engineers, who are building and extending a shared body of open-source software, datasets, and results.

One of the things observed in these experiments is that as deeper neural networks have become feasible, human engineers have needed to do less preprocessing of the inputs to the networks, to identify important features in the data. By letting the networks ‘learn’ what features are important, better results are obtained with less human programming.

An example is automatic speech-to-text recognition, mentioned above. For decades engineers developed these systems using approaches that drew on linguistic analysis of human vocalization and language: speech as composed of elemental sounds, phonemes, which are then built up into words and sentences, all governed by language syntax and semantics. Up through the early 2000s, systems mirrored this analysis: sounds were mapped to phonemes and possible words, sometimes using neural networks, and then symbolic or statistical models of language were used to predict, correct, and make sense of words and sentences.

As effective deep neural networks became available, engineers put more and more of the linguistic analysis burden on the networks. Eventually, networks were trained to directly map sound (digital time-frequency plots) to words, resulting in a dramatic improvement in accuracy.

Artificial Intelligence?

Whether or not AI is really ‘intelligent’, AI research and development continues to move the limit of machine capability.

How Will AI Impact Organizations?

We can think of AI as software that helps organizations more effectively reach their goals, e.g., by reducing costs and increasing revenues.

Gaining benefits from AI, or any other innovative technology, requires organizational change. New strategies. New job descriptions. New workflows. New org charts. Training.

What makes AI different? Are the challenges faced by organizations adopting AI different from those encountered in adopting other software innovations?

After all, using computers to revolutionize organizations is nothing new. IBM developed the SABRE reservation database for American Airlines back in 1964, replacing manual file cards with a system that could handle 83,000 reservation requests. Pretty disruptive!

So how is AI changing organizations? Let’s take an example – the financial industry. AI’s ability to find patterns in mountains of data can help financial organizations:

  • Make more accurate and objective credit decisions by accounting for more complex relationships among a wider variety of factors
  • Improve risk management using forecasts based on learning patterns in high volumes of historical and real-time data
  • Quickly identify fraud by matching continuously monitored data to learned behavioral patterns
  • Improve investment performance by rapidly digesting current events, monitoring and learning market patterns, and making fast investment decisions
  • Personalize banking with smart chatbots and customized financial advice
  • Automate processes to read documents, verify data, and generate reports.

To make these improvements using AI, a financial organization needs to undertake the sort of activities needed to introduce any new software into their operations and products, such as:

  • Establish strategic priorities and budgets
  • Clarify and communicate objectives and plans with stakeholders
  • Work with software developers/vendors/users to establish and carry out software/system development projects
  • Create/modify procedures and organizations to take advantage of the new software
  • Hire, train, retrain the workforce as needed
  • Monitor results and adapt as required.

What are the special challenges AI brings to these activities?

The first challenge is AI’s high profile. Managers feel compelled to catch the wave of the future, and workers fear they will lose their jobs. As a consequence:

  • Managers may undertake AI projects with unrealistic expectations. AI can be extremely effective, but only when there is access to large volumes of data relevant to an operational role that truly benefits the organization
  • Employees essential to successful adoption of the new systems may stand in the way or quit if they see AI as a threat.

Clearly due diligence is required in the first case, and effective employee engagement in the second.

A second challenge is the “all or nothing” aspect of AI. To reap the benefits of AI, the core AI technology must be fully integrated with an organization’s IT infrastructure and business operations. Notice in the financial organization example above how many aspects of the organization could be affected by AI. To successfully integrate AI, an organization must be “all in”, and that requires particularly high levels of communication, investment, and cross-organizational participation.

A third challenge is that with successful adoption of AI, the requirement for personal growth and change is pervasive, up and down the organization. Leaders, engineers, and operators all need to learn and embrace the changes brought about by AI. For many, this can be an exciting opportunity for career growth and more fulfilling jobs. Others will mourn the lost relevance of hard-won experience. The organization must be prepared to invest in training, re-training, and professional development. The more AI takes over routine data gathering and analysis, the more important ‘soft’ skills will be to every worker.

Finally, a fourth challenge is that even a very capable AI can produce unintended results. For example, although AI-based analysis can lend objectivity to credit decisions, training AIs on historical data can perpetuate past biases. Also, when highly trained AIs encounter situations they have never seen before, the results can be unpredictable. This means AIs need human supervisors, and those supervisors are dealing with a whole new kind of employee!

Next: What kinds of jobs will AI impact?