Hardware is gating the rate of progress in Artificial Intelligence. Bill Dally is the Chief Scientist and Senior Vice President of Research at Nvidia.
Moore’s law is no longer providing performance gains, but improvement is being made with specialized chips.
Powerful new AI chips are being developed to improve the speed and efficiency of today’s systems.
The Nvidia Turing Chip was introduced last year. It achieves its performance through specialization.
Nvidia Turing has a Tensor Core at its heart. It performs a 4x4 matrix multiplication, which amounts to 128 operations in a single instruction. Deep learning acceleration is mainly this matrix multiplication.
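The 128 figure can be checked with a little arithmetic. The sketch below assumes the operation is a 4x4 matrix multiply-accumulate (D = A*B + C) and that each multiply and each add counts as one operation. Both are assumptions made for illustration, and the function name is hypothetical. It is plain Python that models the counting, not how the hardware executes it.

```python
def matmul_accumulate_4x4(a, b, c):
    """Compute D = A*B + C for 4x4 matrices and count scalar operations.

    Assumption: each multiply and each add counts as one operation.
    """
    n = 4
    d = [[0.0] * n for _ in range(n)]
    multiplies = 0
    adds = 0
    for i in range(n):
        for j in range(n):
            acc = c[i][j]  # start from the accumulator input
            for k in range(n):
                acc += a[i][k] * b[k][j]  # one multiply plus one add
                multiplies += 1
                adds += 1
            d[i][j] = acc
    return d, multiplies + adds


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ones = [[1.0] * 4 for _ in range(4)]

d, ops = matmul_accumulate_4x4(identity, ones, ones)
print(ops)  # 4 * 4 * 4 multiply-adds, counted as 2 operations each
```

Under these assumptions the count is 64 multiply-adds, or 128 operations, which matches the figure quoted above.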
Nvidia has improved inference performance by 110X in 6 years, and single-chip inference has improved by 66X in 6 years.
Nvidia chips went from 28 nanometers to 16 nanometers. This process shrink accounted for 28% of the improvement.
Turing added integer tensor cores.
NVIDIA Chief Scientist Bill Dally gave a talk 6 months ago at the GPU Technology Conference Israel 2018 in Tel Aviv, where he discussed accelerated platforms and the future of computing.
Hot Chips gave a 2018 tutorial on deep learning accelerators. It provides a very brief introduction to deep neural nets and their applications in computer vision, speech recognition, and other areas. It reviews the two key computational elements of deep neural nets, inference and training, with regard to their compute and memory requirements. Finally, it reviews popular target architectures for supporting these applications, including CPUs, GPUs, and custom DNN accelerators, and discusses common micro-architectures for accelerating typical computational patterns as well as computational considerations around batch sizes, quantization and pruning. The second portion of the tutorial focuses on the problem of accelerating inference in edge devices.
Nvidia is projecting forward, and most of the energy of future chips goes to data movement and memory. Most of the work is not processing and calculation.
They need to get to 70 femtojoules or less per MAC (multiply-accumulate operation). They are considering analog approaches.
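To put the 70 femtojoule target in more familiar units, the sketch below converts energy per MAC into throughput per watt. It assumes the 70 fJ budget covers everything spent per MAC, including data movement, and that one MAC counts as two operations. The function and variable names are illustrative only.

```python
FEMTOJOULE = 1e-15  # joules


def macs_per_second_per_watt(energy_per_mac_fj):
    """One watt is one joule per second, so MACs per second per watt
    is the reciprocal of the energy per MAC in joules."""
    return 1.0 / (energy_per_mac_fj * FEMTOJOULE)


target_fj = 70.0  # target quoted in the text
tera_macs = macs_per_second_per_watt(target_fj) / 1e12
tera_ops = 2 * tera_macs  # assumption: 1 MAC = 1 multiply + 1 add

print(f"{tera_macs:.1f} TMAC/s per watt")
print(f"{tera_ops:.1f} TOPS per watt")
```

Under these assumptions, 70 fJ per MAC works out to roughly 14 trillion MACs per second for each watt consumed.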
AI hardware needs to put most of its work in simulation and training.
Nvidia eats its own dog food and uses its AI chips to improve its own hardware and applications.
Nvidia has an evolution of models.
Bill feels that further specialization for AI chips would be to target verticals. Nvidia can target and execute on 4 verticals, but there is