New Microsoft AI Supercomputer Will Reveal How Far Deep Learning Can Take Us?

Microsoft has built a new artificial intelligence supercomputer that delivers 38 petaflops, making it the fifth most powerful computer in the world.

* This is about being able to do a hundred exciting things in natural language processing at once and a hundred exciting things in computer vision, and when you start to see combinations of these perceptual domains, you’re going to have new applications that are hard to even imagine right now.
* The latest OpenAI models have 17 billion parameters, where previous systems maxed out at about 1 billion. The new Microsoft system could be used to scale models to trillions of parameters.
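To give a sense of what those parameter counts mean in hardware terms, here is a back-of-envelope sketch of the memory needed just to store model weights in half precision (2 bytes per parameter). Optimizer state, gradients, and activations add several times more in practice, so these figures are illustrative lower bounds only.

```python
# Rough fp16 weight-storage arithmetic for the parameter counts above.
BYTES_PER_PARAM_FP16 = 2
GIB = 1024 ** 3

def weight_memory_gib(num_params: float) -> float:
    """Return fp16 weight storage in GiB for a model with num_params parameters."""
    return num_params * BYTES_PER_PARAM_FP16 / GIB

print(f"17 billion params: {weight_memory_gib(17e9):.1f} GiB")   # ~31.7 GiB
print(f"1 trillion params: {weight_memory_gib(1e12):.1f} GiB")   # ~1862.6 GiB
```

Even the weights of a trillion-parameter model would not fit on any single GPU, which is why training at this scale requires distributing the model itself across many accelerators.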

A new class of models developed by the AI research community has proven that some of those tasks can be performed better by a single massive model — one that learns from examining billions of pages of publicly available text, for example. This type of model can so deeply absorb the nuances of language, grammar, knowledge, concepts and context that it can excel at multiple tasks: summarizing a lengthy speech, moderating content in live gaming chats, finding relevant passages across thousands of legal files or even generating code from scouring GitHub.

Microsoft has developed its own family of large AI models, the Microsoft Turing models, which it has used to improve many different language understanding tasks across Bing, Office, Dynamics and other productivity products.

The supercomputer developed for OpenAI is a single system with more than 285,000 CPU cores, 10,000 GPUs and 400 gigabits per second of network connectivity for each GPU server.
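The article gives per-server, not aggregate, network figures. A rough sketch of the aggregate network capacity these numbers imply, under the stated assumption of 8 GPUs per server (a hypothetical, typical density for this class of machine; the server count is not given in the source):

```python
# Aggregate-bandwidth arithmetic for the figures quoted above.
# ASSUMPTION: 8 GPUs per server is hypothetical, not from the article.
GPUS = 10_000
GPUS_PER_SERVER = 8          # assumption
GBPS_PER_SERVER = 400

servers = GPUS // GPUS_PER_SERVER
aggregate_tbps = servers * GBPS_PER_SERVER / 1000

print(f"{servers} servers -> {aggregate_tbps:.0f} Tbps aggregate network capacity")
```

Under that assumption, the machine would have on the order of 500 terabits per second of total network capacity, which is what makes synchronizing gradients across thousands of GPUs feasible.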

Microsoft also unveiled a new version of DeepSpeed, an open-source deep learning library for PyTorch that reduces the amount of computing power needed for large distributed model training. The update is significantly more efficient than the version released just three months ago: on the same infrastructure, it lets people train models more than 15 times larger, and 10 times faster, than they could without DeepSpeed.
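DeepSpeed's savings come largely from ZeRO-style partitioning of optimizer state and gradients across data-parallel workers. A minimal configuration sketch, assuming ZeRO stage 2 with fp16 training; the field values here are illustrative, not the settings Microsoft used:

```python
# A minimal DeepSpeed configuration sketch (illustrative values only).
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 16,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2, "overlap_comm": True},
}

# In a real training script (requires GPUs and the deepspeed package):
#   import deepspeed
#   engine, optimizer, _, _ = deepspeed.initialize(
#       model=model, model_parameters=model.parameters(), config=ds_config)
#   loss = engine(batch)
#   engine.backward(loss)
#   engine.step()

print(ds_config["zero_optimization"]["stage"])
```

The engine returned by `deepspeed.initialize` wraps the model and handles the gradient partitioning and communication scheduling transparently, which is why existing PyTorch training loops need only small changes.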

SOURCES- Microsoft
Written by Brian Wang

4 thoughts on “New Microsoft AI Supercomputer Will Reveal How Far Deep Learning Can Take Us?”

  1. Computing power has always been as much of a limit as architecture. Generally hardware comes first, then architectures are developed afterwards to optimize hardware resources.

  2. For training, many of the ongoing projects trying to harness the summed distributed power of crypto-currency mining rigs installed everywhere probably have much better potential than supercomputers such as this. 10,000 GPUs is not much in the crypto-mining world, and the hardware is essentially free because it is already out there. All that is needed is better-optimized software for running really large training sets over distributed nodes.
    Inference is another matter, but that domain is owned by specialty hardware like FPGAs.
