Are Large Language Models (LLMs) the Future of AI or Just a Stepping Stone? One thing is clear: we are far from achieving human-level intelligence. So what might come next?
The world's fastest supercomputer can perform a quintillion (10^18) calculations per second. It consists of 74 cabinets, each weighing around 8,000 lbs, and consumes about 21 MW.
The human brain can perform around 10 quadrillion synaptic operations per second, weighs approximately 3 lbs, and its power consumption is roughly that of a light bulb (about 20 W).
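Using the rough figures above, a back-of-the-envelope comparison of operations per watt makes the efficiency gap concrete. All the numbers here are order-of-magnitude estimates, not precise measurements:

```python
# Back-of-the-envelope efficiency comparison using the rough
# figures quoted above (all order-of-magnitude estimates).

supercomputer_ops = 1e18      # ~1 quintillion calculations per second
supercomputer_watts = 21e6    # ~21 MW

brain_ops = 1e16              # ~10 quadrillion synaptic ops per second
brain_watts = 20              # roughly a light bulb

super_efficiency = supercomputer_ops / supercomputer_watts  # ops per watt
brain_efficiency = brain_ops / brain_watts

print(f"Supercomputer: {super_efficiency:.1e} ops/W")
print(f"Brain:         {brain_efficiency:.1e} ops/W")
print(f"Brain advantage: ~{brain_efficiency / super_efficiency:,.0f}x per watt")
```

On these estimates the brain comes out roughly four orders of magnitude more efficient per watt, which is the gap the rest of this piece keeps returning to.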
The brain operates through networks of neurons, synapses and chemical processes. It learns through experience and interaction with the world and builds complex, dynamic models of reality over time.
Large Language Models (LLMs)
LLMs are mathematical models based on statistical patterns in data. They don't have sensors to interact and learn from the real world.
As the name implies, Large Language Models work only on text. But by combining them with other models, such as convolutional neural networks (CNNs) or Vision Transformers (ViTs), they can become multimodal and interpret images and video.
The underlying mechanics of LLMs and the human brain are vastly different. You don't need deep technical detail to understand that fact; it's obvious. It's estimated that the power required to run ChatGPT for a single day is around 1 GWh.
Despite the hype, there is some way to go before current AI is anywhere near as efficient or capable as the human brain. So are LLMs (with added multimodal capabilities) the best way forward?
That question is difficult to answer. Scientists don't fully understand how the brain works, and engineers don't understand in detail what is going on under the hood of an LLM.
LLMs: The Way Forward
AI research is evolving, and it's possible that future LLMs or hybrid models could overcome some current limitations.
Incorporating memory systems could give models the ability to store and retrieve information over longer periods. This approach could enhance context retention, establish meaningful relationships and improve reasoning.
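To make the store-and-retrieve idea concrete, here is a toy sketch of an external memory that keeps facts and recalls the most relevant one by word overlap. The `Memory` class, the stored facts, and the use of Jaccard similarity are all my own illustrative assumptions; a real system would use learned embedding vectors rather than word sets:

```python
# Toy sketch of an external memory for a language model: store facts,
# then retrieve the most relevant one by word overlap (Jaccard similarity).
# A real system would use learned embeddings; this is illustration only.

def jaccard(a, b):
    """Similarity between two word sets: |intersection| / |union|."""
    return len(a & b) / len(a | b) if a | b else 0.0

class Memory:
    def __init__(self):
        self.facts = []

    def store(self, text):
        self.facts.append(text)

    def recall(self, query):
        # Return the stored fact whose words best overlap the query.
        q = set(query.lower().split())
        return max(self.facts, key=lambda f: jaccard(set(f.lower().split()), q))

mem = Memory()
mem.store("the user prefers metric units")
mem.store("the meeting is on tuesday")
print(mem.recall("which units does the user prefer"))
# → the user prefers metric units
```

The point of the sketch is the shape of the mechanism: facts persist outside the model's context window and are pulled back in on demand, which is what longer-term context retention requires.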
Current models link text and images but can't fully understand how different modalities interact. Advances could allow AI to learn more integrated representations of the world. Integrating multimodal models with real-world sensors, perhaps via robotics (Embodied AI), would give them the ability to learn like humans via real-world feedback. Self-supervised learning techniques that learn in similar ways to the human brain might help.
However, none of the above gets around the fundamental issue: 21 MW versus a light bulb, 300 tonnes versus 3 lbs. The amount of power (and water) needed to train, and then run, LLMs is vast. Infrastructure costs run into billions of pounds. Given there is currently no killer use case, who, in the longer term, is going to pay?
The Hallucination Issue
Here's a question: can you trust a human? Well, you may say that depends, and you would be correct. It depends on a range of factors, many of which are entirely human: their demeanour, the look in their eyes.
Now can you trust an LLM? No, I suggest you can't, and unless its output is non-critical, that's a big problem. Why? Because, for better or worse, until you can, you need to keep a human in the loop.
Of course, significant research is underway to resolve the issue. Will the next generation of AI be any better? That's an open question.
A Different Approach?
First, a disclaimer. Science still lacks a definitive understanding of consciousness. Since intelligence comprises multiple elements, we don't have a well-defined target. What does Artificial General Intelligence (AGI) actually mean?
The next technological step will probably involve advances across multiple dimensions of AI research. Here are some potential directions:
Neuromorphic Computing: Mimics the structure and functionality of biological neurons and synapses, with the aim of delivering efficiency and power savings. It's possible these systems could learn from fewer examples.
Spiking Neural Networks (SNNs): A type of neural network that aims to mimic the way biological neurons communicate. They use spikes of electrical activity to process and transmit information.
As SNNs only transmit information when a threshold is reached, they can deliver lower computational overhead and improved energy efficiency.
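The threshold mechanism is easy to sketch. Below is a minimal leaky integrate-and-fire neuron, a common SNN building block; the threshold and leak values are illustrative, not taken from any specific model:

```python
def lif_neuron(inputs, threshold=1.0, leak=0.9):
    """Leaky integrate-and-fire: accumulate input current, leak a
    fraction of the membrane potential each step, and emit a spike (1)
    only when the potential crosses the threshold."""
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = potential * leak + current  # integrate with leak
        if potential >= threshold:
            spikes.append(1)   # fire...
            potential = 0.0    # ...and reset
        else:
            spikes.append(0)   # stay silent: nothing sent downstream
    return spikes

print(lif_neuron([0.3, 0.3, 0.3, 0.3, 0.0, 0.9, 0.9]))
# → [0, 0, 0, 1, 0, 0, 1]
```

The efficiency argument is visible in the output: most timesteps produce no spike, so no information is transmitted and downstream neurons do no work, unlike a conventional network where every unit computes on every step.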
Neuro-Symbolic AI: Combines the strengths of symbolic reasoning with neural-network-based machine learning, and could be the next step.
Quantum AI: Quantum computing (QC) could transform AI development. It exploits quantum mechanics to handle specific AI computations.
The technology could dramatically speed up computations, simulations and specific algorithms. Unfortunately, although QC technology has been around for many years, breakthroughs to date have been limited.
Embodied AI Systems: AI integrated within physical or virtual environments, experiencing and interacting with multiple modalities (vision, sound, tactile inputs).
This approach takes us closer (as mentioned above) to how humans learn. It could deliver improvements in reasoning through direct interaction.
Brain-Computer Interfaces (BCIs): Sorry, but I am not going there. It’s possible in theory, but the ethical and risk implications will probably stop mainstream development dead in its tracks.
All of the above might be augmented by Optical and Photonic Computing, using light (photons) rather than electrons to perform networking and computation. New semiconductors (or technologies from the past that never found a use case) could be a better option than current digital semiconductors.
So What's Next?
It depends on if, and when, the hype dies down. There is far too much focus on personalities.
Sustaining AI development requires massive investment. If the LLM hype cycle drives us into another AI winter, what then? Who will fund the development of next-generation technology?
Assuming LLMs are not (as discussed above) the end game, then focus/investment needs to switch, at least in part, to other potential ways forward. There are also geopolitical issues at play.
As a thought experiment, what would you do if, as a superpower, you were behind - years behind? Then perhaps rather than try to compete head-on, you could try a different approach. That might make sense.
Perhaps some of the technologies mentioned above will be combined. Maybe scaled-down technologies, such as Small Language Models (SLMs) or Edge applications, will be where the true value lies.
What’s The Use Case?
Taking a simplistic view, you might conclude that LLMs were introduced to the world in late 2022 as a technology, not a product. Users were the guinea pigs who led the way to potential applications.
There is no doubt LLMs are useful tools. Unfortunately, this fact is often obscured by the hype surrounding agents and AGI. Nearly three years on, there is still no killer use case generating sufficient revenue to justify the investment made in these companies to date. Before technology companies develop whatever comes after LLMs, they might want to consider the use case first.