What AI and power plants have in common

We're excited to bring Transform 2022 back in person on July 19 and virtually from July 20-28. Join leaders in AI and data for in-depth discussions and exciting networking opportunities. Sign up today!

The history of artificial intelligence (AI) development over the past five years has been dominated by scale. Tremendous progress has been made in natural language processing (NLP), image understanding, speech recognition and more by taking strategies developed in the mid-2010s and putting more computing power and more data behind them. This has produced some interesting power dynamics in the use and distribution of AI systems, including one that makes AI look a lot like the power grid.

For NLP, bigger is better

The current state of the art in NLP is powered by neural networks with billions of parameters trained on terabytes of text. Simply keeping these networks in memory requires multiple state-of-the-art GPUs, and training them requires supercomputing clusters far beyond the reach of all but the largest organizations.
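As a back-of-the-envelope illustration of why such models don't fit on a single GPU, here is a rough memory estimate. The 175-billion-parameter figure and fp16 (2 bytes per parameter) storage are assumptions for illustration, not a claim about any particular model, and the estimate covers weights only; training needs several times more for gradients and optimizer state.

```python
def inference_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate GiB needed just to hold the model weights in memory.

    Assumes fp16 storage by default (2 bytes per parameter); training
    requires far more, since gradients and optimizer state must also fit.
    """
    return num_params * bytes_per_param / 1024**3


# An illustrative 175-billion-parameter model, fp16 weights only:
weights_gb = inference_memory_gb(175e9)
print(f"{weights_gb:.0f} GiB")  # ~326 GiB -- several 80 GiB GPUs for weights alone
```

Even before any activations or batching overhead, the weights alone exceed what any single commodity GPU can hold, which is why serving these models means sharding them across machines.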

One could, using the same techniques, train a much smaller neural network on much less text, but the performance would be much worse. So much worse, in fact, that the difference becomes one of kind rather than merely of degree: there are tasks such as text classification, summarization and entity extraction where large language models excel and small language models do no better than chance.

As someone who has been working with neural networks for about a decade, I'm genuinely surprised by this development. It was not technically obvious that increasing the number of parameters in a neural network would lead to such a drastic improvement in capability. Yet here we are in 2022, training neural networks nearly identical to architectures first published in 2017, but with orders of magnitude more computation, and achieving better results.

This points to a new and interesting dynamic in the field. State-of-the-art models are too computationally expensive for almost any company – let alone an individual – to create or even deploy. For a company to use such models, it must use one created and hosted by someone else – much the same way electricity is generated and distributed today.

Sharing AI as a metered utility

Every office building needs electricity, but no office building can house the infrastructure to generate its own electricity. Instead, they are hooked up to a centralized power grid and pay for the electricity they use.

Similarly, a multitude of companies can benefit from integrating NLP into their operations, even if few have the resources to create their own AI models. This is exactly why some companies have built large AI models and made them available through easy-to-use APIs. By giving businesses a way to "plug in" to the proverbial NLP power grid, the cost of training these state-of-the-art models is amortized across many customers, enabling...
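The "plug in" pattern above can be sketched in a few lines: instead of hosting a model, a client sends text over HTTP to a hosted endpoint and pays per request. The endpoint URL, payload shape and bearer-token auth here are illustrative assumptions, not any real provider's API.

```python
import json
import urllib.request


def build_request(text: str, api_url: str, api_key: str) -> urllib.request.Request:
    """Build an HTTP POST that sends `text` to a (hypothetical) hosted NLP endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            # The API key is how the provider meters usage, like an electricity meter.
            "Authorization": f"Bearer {api_key}",
        },
    )


def classify(text: str, api_url: str, api_key: str) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(text, api_url, api_key)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The client needs no GPUs at all; the provider bears the fixed cost of training and serving the model, and recovers it through metered per-request billing across its customer base.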
