Google AI Architecture Pathways Can Learn Millions of Things

Google Research’s next-generation AI architecture is called Pathways. It is intended to enable a single model to learn millions of things instead of just one. This would be a massive phase change and leveling up of neural network and AI capabilities.

Pathways is part of the improvements Google is making to its AI. Google’s new AI models are helping to create information experiences that are truly conversational, multimodal and personal. This improves Google Assistant.

The Multitask Unified Model, or MUM for short, has improved searches for vaccine information. Google will also introduce new ways to search with images and words simultaneously.

In October, Google introduced a new AI architecture called Pathways. AI models are typically trained to do only one thing.

With Pathways, a single model can be trained to do thousands, even millions of things. From MUM to Pathways to BERT and more, these deep AI investments are helping us lead in search quality. They’re also powering innovations beyond Search. For example, DeepMind’s protein folding system, AlphaFold, was recently recognized by Nature & Science Magazine as a defining breakthrough.

To illustrate the scale of the team’s achievement, it took scientists more than 50 years to figure out the structure of 150,000 proteins. The DeepMind team has now expanded that number to 1 million, and they think they will get to more than 100 million this year. Philipp will talk in great detail about our advertising business, which also benefits from our investments in AI.

Pathways is a new way of thinking about AI that addresses many of the weaknesses of existing systems and synthesizes their strengths. To show you what I mean, let’s walk through some of AI’s current shortcomings and how Pathways can improve upon them.

Today’s AI models are typically trained to do only one thing. Pathways will enable us to train a single model to do thousands or millions of things.

Today’s AI systems are often trained from scratch for each new problem – the mathematical model’s parameters are initialized literally with random numbers. Imagine if, every time you learned a new skill (jumping rope, for example), you forgot everything you’d learned – how to balance, how to leap, how to coordinate the movement of your hands – and started learning each new skill from nothing.

Yet that is how most machine learning models are trained today. Rather than extending existing models to learn new tasks, we train each new model from nothing to do one thing and one thing only (or we sometimes specialize a general model to a specific task). The result is that we end up developing thousands of models for thousands of individual tasks. Not only does learning each new task take longer this way, but it also requires much more data, since we’re trying to learn everything about the world and the specifics of that task from nothing (completely unlike how people approach new tasks).

Pathways could handle more abstract forms of data, helping find useful patterns that have eluded human scientists in complex systems such as climate dynamics.

Today’s models are dense and inefficient. Pathways will make them sparse and efficient.

Another problem is that most of today’s models are “dense,” which means the whole neural network activates to accomplish a task, regardless of whether it’s very simple or really complicated.

This, too, is very unlike the way people approach problems. We have many different parts of our brain that are specialized for different tasks, yet we only call upon the relevant pieces for a given situation. There are close to a hundred billion neurons in your brain, but you rely on a small fraction of them to interpret this sentence.

AI can work the same way. We can build a single model that is “sparsely” activated, which means only small pathways through the network are called into action as needed. In fact, the model dynamically learns which parts of the network are good at which tasks — it learns how to route tasks through the most relevant parts of the model. A big benefit to this kind of architecture is that it not only has a larger capacity to learn a variety of tasks, but it’s also faster and much more energy efficient, because we don’t activate the entire network for every task.
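The routing idea described above can be illustrated with a toy sketch. This is not Pathways itself, whose internals are not described here; it is a generic mixture-of-experts-style layer written under several assumptions. The class name SparseLayer, the top_k parameter and the random (untrained) router weights are all hypothetical and chosen only for illustration. In a real system the router would be learned during training.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SparseLayer:
    """Toy sparsely activated layer (hypothetical, for illustration only).

    A router scores every expert for a given input, but only the top_k
    highest-scoring experts are actually evaluated.
    """

    def __init__(self, dim, num_experts, top_k=1, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one weight vector per expert. Random here; a real model
        # would learn these weights.
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each expert is a dim x dim linear map, also random in this sketch.
        self.experts = [[[rng.gauss(0, 1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, x):
        """Return the indices of the chosen experts and all router probabilities."""
        scores = [sum(w * v for w, v in zip(row, x)) for row in self.router]
        probs = softmax(scores)
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        return ranked[:self.top_k], probs

    def forward(self, x):
        """Evaluate only the chosen experts and mix their outputs."""
        chosen, probs = self.route(x)
        norm = sum(probs[i] for i in chosen)
        out = [0.0] * len(x)
        for i in chosen:
            weight = probs[i] / norm  # renormalize over the chosen experts
            for r, row in enumerate(self.experts[i]):
                out[r] += weight * sum(w * v for w, v in zip(row, x))
        return out, chosen


layer = SparseLayer(dim=4, num_experts=8, top_k=1)
output, used = layer.forward([1.0, 0.5, -0.3, 2.0])
print("experts evaluated:", used, "out of", len(layer.experts))
```

With top_k=1 only one of the eight experts does any work for a given input, which is the source of the compute savings: total capacity grows with the number of experts while the cost per input stays roughly that of a single expert.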

For example, GShard and Switch Transformer are two of the largest machine learning models we’ve ever created, but because both use sparse activation, they consume less than 1/10th the energy that you’d expect of similarly sized dense models — while being as accurate as dense models.

Pathways will enable a single AI system to generalize across thousands or millions of tasks, to understand different types of data, and to do so with remarkable efficiency – advancing us from the era of single-purpose models that merely recognize patterns to one in which more general-purpose intelligent systems reflect a deeper understanding of our world and can adapt to new needs.

AI and Google’s Ad Revenue

How AI impacts Google’s ads products was described in the recent earnings call.

Performance Max is a new campaign type that went global in November and has been quickly embraced by advertisers. It brings the best of Google Ads, AI and automation together to let brands promote their businesses across all Google services from a single campaign, helping them drive more online sales, leads and/or foot traffic. Google is radically simplifying its products and making them easier for customers to use. French children’s wear retailer Petit Bateau tested Performance Max (PMAX) over a three-week period: return on ad spend jumped 35%, click-through rates increased 40%, and valuable insights were gleaned into what messaging resonated most.

Google also developed insights tools. Four new features were launched in Q4, including demand forecasts, which use ML to help businesses predict forward-looking trends and better understand what goods to stock and what services to offer when.

In Search, advertisers are leaning more into automation: using responsive search ads to create and select the best-performing creatives, matching with more relevant search queries using broad match keywords, and setting optimized bids with auction-time signals.

Google also offers smart bidding. It is using more AI to help advertisers measure their results and bid intelligently, for example with data-driven attribution, which uses advanced ML to more accurately understand how each marketing touchpoint contributed to a conversion while respecting user privacy. Broad match keywords are a big part of this.

Google has responsive ads on display and discovery. They use text, image and video assets from advertisers and predict the best combination of assets to show in any size or format on Google properties or the display network. Google is helping advertisers lean into automation and identify new opportunities as a central part of their recovery and growth strategies.

Written By Brian Wang,