Google's head of AI reflects on deep learning's "golden decade" and analyzes its future directions

  Since the advent of computers, people have hoped to go further and create "thinking machines." In 1956, the American computer and cognitive scientist John McCarthy first proposed the concept of "artificial intelligence," envisioning machines that could use language to form concepts, reason about and solve a wide range of problems, and continuously improve themselves.
  In the fifty years that followed, scientists built a variety of artificial intelligence systems based on logic, rules, and neural networks. Yet these efforts to hand-encode human knowledge in machine-readable form, despite consuming enormous human and material resources, did not allow machines to make significant progress in learning on their own.
  It was not until around 2011 that artificial intelligence and machine learning began to deliver results across many fields, not only making headway on a series of long-standing problems but also opening up new computing experiences and modes of interaction.
  Recently, Jeff Dean, who heads artificial intelligence at Google, published an in-depth analysis of deep learning's progress over the past decade, covering the computing hardware and open-source software frameworks that have driven machine learning, its many applications, and its future directions. The article appears in the special issue on artificial intelligence and society in Dædalus, the journal of the American Academy of Arts and Sciences.
  Dean developed a strong interest in neural networks as an undergraduate and did in-depth work on parallel neural network training. At the time, he concluded that it would take roughly a million times the computing power of 1990-era computers before neural networks could begin to make real headway on challenging problems.
  In the early 2000s, researchers began to run deep learning algorithms on graphics processing units (GPUs), which offer much higher floating-point throughput than CPUs. Deep learning then advanced rapidly in areas such as image recognition, speech recognition, and language understanding, prompting researchers to design specialized hardware that matches the needs of deep learning algorithms even better than GPUs do.
  For the purpose of building dedicated hardware, deep learning algorithms have two key characteristics: they tolerate reduced numerical precision well, and their computation consists almost entirely of sequences of linear algebra operations on matrices and vectors. Chips and systems specialized for low-precision linear algebra can therefore greatly improve the performance of deep learning algorithms.
  An early example of such a chip was Google's first Tensor Processing Unit (TPUv1), which targeted 8-bit integer computation for deep learning inference and delivered an order-of-magnitude improvement in speed and performance per watt over contemporary CPUs and GPUs, bringing substantial gains to Google's speech recognition, language translation, and image classification systems.
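  To make the low-precision idea concrete, below is a minimal NumPy sketch of 8-bit quantized inference, not the TPU's actual implementation, just the general technique it relies on: weights and activations are mapped to signed 8-bit integers with a scale factor, the heavy matrix-vector product is done in integer arithmetic, and the result is rescaled to floating point at the end.

```python
# Minimal sketch of 8-bit quantized matrix arithmetic (illustrative only).
import numpy as np

def quantize(x, num_bits=8):
    """Map a float array to signed integers plus a per-tensor scale factor."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for signed 8-bit
    scale = np.max(np.abs(x)) / qmax
    q = np.round(x / scale).astype(np.int32)  # int32 container for 8-bit values
    return q, scale

# A toy fully connected layer: y = W @ x
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 512)).astype(np.float32)
x = rng.normal(size=512).astype(np.float32)

qW, sW = quantize(W)
qx, sx = quantize(x)

# Integer matrix-vector product, rescaled back to float at the end.
y_quant = (qW @ qx) * (sW * sx)
y_fp32 = W @ x

print("max relative error:", np.max(np.abs(y_quant - y_fp32)) / np.max(np.abs(y_fp32)))
```

  Accepting this small approximation error is what lets dedicated hardware replace expensive floating-point units with far cheaper and more power-efficient integer multipliers.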
  Alongside dedicated hardware, open-source software frameworks have emerged that accelerate deep learning research, making it easier to express deep learning models and computations and helping deep learning reach a wider range of fields.
  In 2015, Google developed and open-sourced TensorFlow, a framework for expressing machine learning computations that drew on ideas from earlier systems such as Theano and DistBelief while emphasizing scale and performance, giving individuals and organizations around the world far greater access to machine learning development.
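  As a hedged illustration of what "expressing a machine learning computation" looks like in TensorFlow's current eager-mode API, the toy example below fits a linear model by gradient descent; the data and hyperparameters are invented for the example.

```python
# Toy TensorFlow example: differentiable tensor operations plus gradient descent.
import tensorflow as tf

# Synthetic data: y = 3x + 2 plus a little noise.
x = tf.random.normal([100, 1])
y = 3.0 * x + 2.0 + 0.1 * tf.random.normal([100, 1])

w = tf.Variable(tf.zeros([1, 1]))
b = tf.Variable(tf.zeros([1]))
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

for step in range(200):
    with tf.GradientTape() as tape:
        pred = tf.matmul(x, w) + b
        loss = tf.reduce_mean(tf.square(pred - y))
    grads = tape.gradient(loss, [w, b])
    optimizer.apply_gradients(zip(grads, [w, b]))

print("learned w, b:", w.numpy().ravel(), b.numpy())
```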
  Thanks to steadily improving hardware and widely available open-source tools, research output in machine learning and its applications has proliferated. Machine learning researchers now collaborate actively with scientists in neuroscience, climate science, molecular biology, and health care on a series of important problems that bring both social benefit and human progress.
  For example, machine learning can help researchers understand genetic makeup and ultimately address gene-based diseases more effectively; it can help them better understand weather and the environment, in particular forecasting daily weather and climate hazards such as floods; and it offers new ways to detect and diagnose disease. Applied to medical images, computer vision can help doctors analyze and assess some serious conditions more quickly and accurately than doctors can on their own.
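  As an illustrative sketch only (a hypothetical "finding present / absent" classifier, not a clinically validated model), the kind of computer-vision model used for medical image analysis can be expressed in a few lines of tf.keras:

```python
# Hypothetical binary classifier for grayscale medical scans (sketch only).
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 1)),       # grayscale scan
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),    # probability of a finding
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# In practice: model.fit(train_images, train_labels, validation_data=...),
# followed by careful clinical evaluation before any deployment.
```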

  It is also worth noting that today's machine learning technology gives scientists a more accurate picture of how diseases spread, improving the world's chances of preventing them.
  Finally, Dean pointed to the future of machine learning. "There are some interesting research directions emerging in the machine learning field, and they become even more interesting when you combine them," he says.
  The first is sparsely activated models, such as sparsely gated mixture-of-experts models, which show how to build models of very large capacity in which only a subset of the model is "active" for any given example. The researchers behind these models report roughly 9 times greater training efficiency, 2.5 times greater inference efficiency, and higher accuracy.
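  The sketch below shows the core of that idea in NumPy: a gating network picks the top-k experts for each input, and only those experts do any computation. The sizes and the softmax gate here are illustrative assumptions, not the published architecture.

```python
# Minimal sketch of a sparsely gated (top-k) mixture-of-experts layer.
import numpy as np

rng = np.random.default_rng(0)
d_model, num_experts, top_k = 64, 8, 2

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.normal(scale=0.02, size=(d_model, d_model)) for _ in range(num_experts)]
gate_w = rng.normal(scale=0.02, size=(d_model, num_experts))

def moe_layer(x):
    """x: (d_model,) input for one example; returns the gated mix of k experts."""
    logits = x @ gate_w                        # one score per expert
    top = np.argsort(logits)[-top_k:]          # indices of the k best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                   # softmax over the selected experts
    # Only the selected experts run; the other experts are skipped entirely.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_layer(rng.normal(size=d_model))
print(y.shape)  # (64,)
```

  Because each example touches only k of the experts, total model capacity can grow with the number of experts while the compute per example stays roughly constant.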
  The second is automated machine learning: techniques such as neural architecture search or evolutionary architecture search can automatically learn aspects of a machine learning model and its components to optimize accuracy on a given task. This usually means running many automated experiments, each of which may itself involve a great deal of computation.
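  A toy sketch of the evolutionary variant is shown below: candidate architectures are mutated and kept only if they score better. The search space and the evaluate() placeholder are hypothetical; in a real system, evaluate() would train and validate each candidate, which is where the heavy computation comes from.

```python
# Toy (1+1) evolutionary architecture search over a tiny, made-up search space.
import random

SEARCH_SPACE = {
    "num_layers": [2, 4, 8, 16],
    "hidden_units": [64, 128, 256, 512],
    "activation": ["relu", "gelu", "swish"],
}

def random_architecture():
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def mutate(arch):
    child = dict(arch)
    key = random.choice(list(SEARCH_SPACE))
    child[key] = random.choice(SEARCH_SPACE[key])
    return child

def evaluate(arch):
    # Placeholder for "train this architecture and return validation accuracy".
    return random.random()

best, best_score = random_architecture(), 0.0
for _ in range(100):
    child = mutate(best)
    score = evaluate(child)
    if score > best_score:          # keep the better of parent and child
        best, best_score = child, score

print("best architecture found:", best)
```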
  The third is multi-task machine learning. Multi-task training at a modest scale of a few to a few dozen related tasks, or transfer learning in which a model is trained on a large amount of data for related tasks and then fine-tuned on a small amount of data for a new task, has proven very effective on a wide variety of problems.
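  The transfer-learning recipe looks roughly like the hedged tf.keras sketch below: start from a backbone pretrained on a large dataset (ImageNet here), freeze it, and train only a small task-specific head on the new task's limited data. The new task and its number of classes are placeholders.

```python
# Sketch of transfer learning with fine-tuning in tf.keras.
import tensorflow as tf

# Backbone pretrained on ImageNet (weights are downloaded on first use).
backbone = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
backbone.trainable = False                     # reuse pretrained features as-is

num_new_classes = 5                            # placeholder for the new task
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_new_classes, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# With only a small labeled dataset for the new task:
# model.fit(new_task_images, new_task_labels, epochs=5)
# Optionally unfreeze the backbone afterwards and continue at a lower learning rate.
```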
  If these three directions are combined into systems running on large-scale machine-learning accelerator hardware, a single model trained this way could be composed of many components with different structures, handle millions of tasks, and automatically learn to complete new tasks successfully.
  Building a system that can independently handle new tasks across all application domains of machine learning will require expertise and advances in many areas, spanning machine learning algorithms, responsible artificial intelligence, distributed systems, and computer architecture. It is a genuinely big challenge, but one that would give AI an enormous boost.
  It is important to note that while AI can help us in many aspects of our lives, all researchers and practitioners should ensure that these methods are developed responsibly, carefully scrutinizing issues of bias, fairness, privacy, and other effects they may have on people and on society.