A Lesson from History
The first commercial passenger airship took flight in 1910, with the first international passenger airship service (Graf Zeppelin) launching in 1928. Meanwhile, a bicycle shop owner flew the first powered aircraft in late 1903, with the first passenger aircraft flight (Ilya Muromets) in 1913.
Airships and aircraft developed in parallel, but in the early days of aviation, it was the airship that captured the public imagination. Aircraft were bit-part players, perceived as dangerous and uncomfortable, serving no more than a niche market.
Looking back through the lens of history, airships were huge monstrosities, and their limitations were blindingly obvious. Any fool could see they were dangerous. But history is hindsight with 20/20 vision—those alive at the time did not see it that way.
What’s my point? Well, let’s jump back to today. We have many Large Language Models (LLMs). They are now well-known, well-funded, and well-hyped. We also have Small Language Models (SLMs) running along, largely unnoticed in the background.
Is the hype surrounding LLMs obscuring the potentially transformative impact of SLMs? When or if the AI bubble bursts, will SLMs be the way forward?
LLMs are large (obviously—the clue is in the name), with hundreds of billions of parameters or more. They have limitations (hallucinations), and their supporting infrastructure is vast (and expensive). So far, LLMs lack a killer commercial application. Some claim LLMs are potentially dangerous. Meanwhile…
A Clarification
Before going any further, it’s important to address the hype. Many talk about AI, but that is a blanket term. Under the AI banner is machine learning, and under that lies deep learning. LLMs (and SLMs) exist at the deep learning level.
Machine learning has many applications, some of which have been installed and active for decades. Visual recognition systems are a machine learning application. Some large-scale insurance claim processing and credit scoring applications also use machine learning.
What Is an SLM?
An SLM has the same basic core architecture as an LLM. There are some structural variations, but the main difference is scale (number of parameters, layers, etc.). Parameters are variables that the model learns from training data.
LLMs have hundreds of billions of parameters or more, while an SLM typically has millions to a few billion. These models are therefore significantly smaller, requiring far less memory and computational power than LLMs.
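To get a feel for that difference in scale, a back-of-the-envelope calculation is enough: the memory needed just to hold a model’s weights is roughly parameters × bytes per parameter. (The function name and the example model sizes below are illustrative assumptions, not references to specific products.)

```python
def model_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Rough memory needed to hold the weights alone.
    Ignores activations, caches, and runtime overhead."""
    return num_params * bytes_per_param / 1e9

# A 3-billion-parameter SLM in 16-bit precision (2 bytes/param):
print(model_memory_gb(3e9, 2))    # 6 GB - within reach of a single consumer GPU

# A hypothetical 175-billion-parameter LLM at the same precision:
print(model_memory_gb(175e9, 2))  # 350 GB - multiple server-class accelerators
```

Even this crude estimate shows why SLMs can run on commodity hardware while LLMs demand data-centre infrastructure.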
With fewer parameters, SLMs can still perform well on specific, well-defined tasks, but they are weaker at generalisation and contextual understanding, and they struggle with complex reasoning.
Like all potential solutions, SLMs have downsides. While they deliver fundamentally different ways of interacting with and processing data, they still need training, they still hallucinate, and, as customised solutions, they require specialist skills to build, train, and deploy.
There are many potential applications for SLMs. Before ChatGPT (and the associated hype cycle) arrived on the scene, they addressed market demand for advanced NLP solutions in areas such as customer support chatbots and virtual assistants, advanced search and recommendation engines, and language translation.
Early SLMs were used to improve predictive text and autocorrect features on smartphones, then voice assistants on smart speakers and earbuds, and, as time progressed, specialised document retrieval.
SLM vs LLM Applications
LLMs (and SLMs) work with text; they don’t directly understand numbers, instead inferring their meaning from patterns in the training data. If precise calculations, statistical modelling, or error-free numerical reasoning are required, LLMs/SLMs are not the best choice.
Let’s start with LLMs. It’s well known that their applications include content generation, language translation, information retrieval, summarisation, code generation, chatbots, educational content generation, personalised learning content, and educational tools.
The major issue with LLMs is scale. The supporting infrastructure is significant (and expensive). Does the current range of applications (although undoubtedly useful) justify the investment?
There is lots of promise (hype) about where the current LLM technology will lead, but to date, that promise has not been delivered. At present, there is no killer use case. Of course, that could soon change—unless, like the Hindenburg, it all goes up in flames before we get there.
Generally, more parameters in a model allow for a better understanding of context, handling complex requests, and adapting to new situations. For applications like a chatbot designed to handle a wide range of topics, LLMs are often the best choice.
LLMs are also generally more capable in areas like broad code generation. However, for more specific code generation tasks or chatbots with a narrow focus, SLMs can be effective and may be preferred due to their efficiency.
The choice between an SLM and an LLM depends heavily on the application.
LLMs (Strengths):
Greater versatility and breadth: Better for tasks requiring broad knowledge, complex contexts, and generalisation to new situations.
Richer output: More nuanced and comprehensive responses.
Stronger in complex tasks: Generally more capable in areas including broad code generation, complex text summarisation, and handling open-ended queries.
SLMs (Strengths):
Efficiency and speed: Faster response times and lower computational costs, making them suitable for applications where latency and resource usage are critical.
Specialisation and focus: Can be highly effective for specific, well-defined tasks and domains, potentially outperforming LLMs in specific use cases.
Lower complexity: Simpler (and cheaper) to deploy and potentially easier to fine-tune for specific use cases.
Data privacy and security: SLMs can handle private or sensitive data on-premises or within tightly controlled environments.
On-device or edge use: SLMs can run directly on user devices (e.g. IoT hardware).
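For on-device use, the deciding question is usually whether the quantised weights fit in the device’s memory at all. A minimal sketch of that check follows; the function, the 4-bit quantisation figure, and the 50% RAM headroom are illustrative assumptions rather than any standard rule.

```python
def fits_on_device(num_params: float, bits_per_param: int,
                   device_ram_gb: float, headroom: float = 0.5) -> bool:
    """Check whether a model's quantised weights fit within the
    fraction of device RAM we are willing to spend on them."""
    weights_gb = num_params * bits_per_param / 8 / 1e9
    return weights_gb <= device_ram_gb * headroom

# A 1-billion-parameter SLM quantised to 4 bits (~0.5 GB) on an 8 GB phone:
print(fits_on_device(1e9, 4, 8))   # True

# A 70-billion-parameter LLM at 4 bits (~35 GB) on the same phone:
print(fits_on_device(70e9, 4, 8))  # False
```

The same arithmetic explains why aggressive quantisation is central to edge deployment: shrinking bits per parameter is often the only lever that makes a given model fit.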
SLMs are not searching for a killer use case; their applications exist here and now. They don’t need vast supporting infrastructure and, compared to LLMs, are relatively easy to implement.
The question isn’t necessarily which technology will “win”, but how each will find its space. It may be that SLMs are the workhorses solving specific business problems, while LLMs continue to push the envelope.
In the end, unlike the aircraft analogy, the future might hold room for both technologies.
What Does Business Need?
It depends. Assuming privacy and security issues can be addressed, LLMs cope much better with coding applications and with situations where contextual understanding and, to a point, reasoning are required.
SLMs are best for specific, customised business applications—on-site or at the edge. Their flexible, chat-style interface (rather than point-and-click) enables natural language querying, quick data lookups, and guided customer service workflows, and they can complement existing systems.
SLM-based applications can remove much of the user frustration with existing software systems, where users must work the way the software dictates rather than in the way that completes the task most efficiently.
History shows it is best to be cautious about technological predictions. The "obvious" winner often isn’t obvious until after the fact. I simply suggest that the focus should remain on solving real problems rather than chasing the most hyped solution.