Artificial intelligence is having a real impact on many industries. It now exceeds humans at some image recognition and speech recognition tasks, it is approaching human levels for language translation, and it is beating experts at all sorts of games. It is being used in medicine, media and entertainment, and security. And autonomous vehicles promise to drastically reduce the roughly 1.3 million road traffic deaths each year, the large majority of which are caused by human error.

"Unless you've been sleeping under a rock, you've noticed that there is an AI revolution going on," Bill Dally, Nvidia's Chief Scientist and head of research, said at the recent VLSI Symposia[1]. "Every aspect of human life and commerce is going to be deeply impacted by AI."

Despite these advances, deep learning remains "completely gated by hardware" because the jobs keep getting bigger. ImageNet is now considered a small dataset, and some cloud data centers train on more than one billion images using upwards of 1,000 GPUs, Dally said. Microsoft's ResNet-50 neural network requires 7.72 billion operations to process a single low-resolution (225x225) image. In his talk, Dally discussed some of the ways that circuit design can increase the efficiency of training and inference to meet these growing requirements.
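
Multiplying those figures out shows why such jobs are gated by hardware. The sketch below is a rough, back-of-the-envelope calculation using the numbers above plus the V100's 120 trillion operations per second quoted later in the article; counting only forward passes is a deliberate simplification, since training also requires backward passes and weight updates.

```python
# Rough scale of a billion-image training job, using figures from the article.
OPS_PER_IMAGE = 7.72e9      # ResNet-50, one 225x225 image (from the article)
IMAGES = 1e9                # "more than one billion images"
GPU_OPS_PER_SEC = 120e12    # Tesla V100 Tensor Core peak (quoted below)

forward_ops = OPS_PER_IMAGE * IMAGES                # ~7.7e18 ops per pass
seconds_on_one_gpu = forward_ops / GPU_OPS_PER_SEC

print(f"one forward pass over the dataset, one GPU: {seconds_on_one_gpu / 3600:.1f} hours")
print(f"the same pass across 1,000 GPUs: {seconds_on_one_gpu / 1000 / 60:.1f} minutes")
# Training adds backward passes (roughly 2x more ops) and many epochs,
# which is why even 1,000-GPU clusters stay busy for days.
```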

The arithmetic in deep neural networks largely consists of convolutions and matrix multiplication. Training requires at least half-precision (FP16) and the "state-of-the-art," Dally said, is the Tesla V100 with its Tensor Cores that deliver 120 trillion operations per second with very high efficiency. CPUs and FPGAs are orders of magnitude off, he said, and even custom chips would deliver at best 30 percent better performance per watt.
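
The Tensor Core arrangement Dally points to multiplies FP16 operands but accumulates the products at higher precision (FP32). The NumPy sketch below illustrates why that accumulation precision matters; the vector length and random data are arbitrary assumptions, and a CPU loop is standing in for the actual hardware.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4096  # assumed dot-product length; longer sums show the effect more clearly
a = rng.standard_normal(K).astype(np.float16)
b = rng.standard_normal(K).astype(np.float16)

# Pure FP16: each product and the running sum are rounded to half precision.
acc16 = np.float16(0)
for x, y in zip(a, b):
    acc16 = np.float16(acc16 + x * y)

# Tensor-Core-style mixed precision: FP16 inputs, FP32 products and accumulation.
acc32 = np.float32(0)
for x, y in zip(a, b):
    acc32 += np.float32(x) * np.float32(y)

ref = np.dot(a.astype(np.float64), b.astype(np.float64))  # FP64 reference
print("FP16-accumulate error:", abs(float(acc16) - ref))
print("FP32-accumulate error:", abs(float(acc32) - ref))
```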

The V100 is also at the heart of what is now the world's fastest supercomputer[2]. Summit has 4,608 nodes, each with two IBM Power9 CPUs and six Tesla V100 GPUs.
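
Multiplying out that node count gives a sense of the machine's theoretical deep learning throughput. A quick check using the per-GPU figure quoted above (this is a peak number, not sustained performance):

```python
# Summit's aggregate Tensor Core throughput, from the figures above.
NODES = 4608
GPUS_PER_NODE = 6
V100_OPS_PER_SEC = 120e12   # per-GPU Tensor Core peak, quoted above

total_gpus = NODES * GPUS_PER_NODE          # 27,648 V100s
peak_ops = total_gpus * V100_OPS_PER_SEC    # ~3.3e18 ops/sec

print(f"{total_gpus:,} GPUs, roughly {peak_ops / 1e18:.1f} exaops peak for deep learning")
```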
