
24.06.2024

Trends and key technical developments in Large Language Models (LLMs)

It's hard to imagine another field into which resources have been poured as intensively as the development of large language models. Just a couple of years ago, the field was dominated by a few major developers and a handful of smaller ones. Now, the landscape has changed dramatically.

Today's LLM scene is bursting with abundance, and the most challenging decision is choosing a suitable model, as all the major tech companies have their own language models, or in some cases, several. Smaller developers have also grown rapidly. Companies like Anthropic, Stability AI, Cohere, and Mistral are great examples of this growth. This development is extraordinary, and it extends beyond standalone chatbots like ChatGPT. For instance, Google is incorporating LLM-based features into virtually every one of its offerings, including Android, Search, and Gmail.

Below, we will discuss the trends and key technical developments in the large language models sphere.

The rise of open-source models

A couple of years ago, only a few small open-source language models were available for those who didn't want to pay for API access. Models like OpenAI's GPT series were restricted: their weights, source code, and training data were hidden behind an API. Today, the market has transformed significantly. A substantial share of today's major language models has been released as open source, some under very permissive licenses.

Open-source models are now approaching parity with closed ones. This shift was highlighted in a leaked Google memo arguing that the company should open up its work because a business built on closed models is not sustainable in the long term.

Reducing inference costs

The cost of running inference on language models has fallen significantly in recent years. Whether models are accessed through hosted services such as OpenAI's GPT APIs or run independently using open-source solutions, this cost reduction is one of the most significant changes in the industry.

The primary driver of this cost reduction is the falling price of computing power per unit of data processed. Advances in hardware efficiency, optimized algorithms, and the scalability of cloud computing have all contributed to this trend. Consequently, it is now possible to achieve better results with less computational expense. Lower inference costs mean that sophisticated models can be adopted more broadly across various industrial applications.
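To make the point concrete, here is a minimal sketch of what cheaper inference can look like in practice: loading an open-weight model in 4-bit precision so it fits on a single consumer GPU. The model name and quantization settings are illustrative assumptions, not recommendations.

```python
# A minimal sketch of cost-conscious inference: an open-weight model loaded in
# 4-bit precision so it fits on a single consumer GPU.
# Assumes: pip install transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative open-weight model

quant_config = BitsAndBytesConfig(load_in_4bit=True)  # 4-bit weights cut memory use roughly 4x vs fp16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers on the available GPU/CPU automatically
)

prompt = "Summarise this maintenance report in two sentences: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```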

Integration into applications

Another big trend is embedding language models into other applications. Tools such as Microsoft's Copilot, which assists with coding, and HubSpot's content creation and marketing automation tools are prime examples of this trend.

Companies can thus provide more intelligent, context-aware functionalities that improve user experience. For instance, Copilot leverages LLMs to assist developers by suggesting code snippets and detecting errors. Similarly, HubSpot's content assistants use language models to generate marketing content, streamline email campaigns, and provide marketing insights, making marketing efforts more efficient and personalized.
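As a rough illustration of what such an integration looks like from the application side, the sketch below wraps a hosted chat-completion call in a small helper that drafts follow-up emails. The model name and the helper function are illustrative assumptions, not part of any particular product.

```python
# A minimal sketch of embedding an LLM call inside an application feature,
# here a hypothetical "draft a follow-up email" helper in a CRM-style tool.
# Assumes: pip install openai and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

def draft_followup_email(customer_name: str, meeting_notes: str) -> str:
    """Return a short follow-up email drafted from free-form meeting notes."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "You write concise, friendly follow-up emails for a sales team."},
            {"role": "user",
             "content": f"Customer: {customer_name}\nNotes: {meeting_notes}\nDraft the email."},
        ],
        temperature=0.4,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(draft_followup_email("Acme Oy", "Interested in a predictive maintenance pilot, asked for pricing."))
```

The application code stays a thin wrapper: the domain knowledge lives in the system prompt and the surrounding product, while the model supplies the language generation.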

Technical trends in LLMs

Emerging multimodality

One of the most fascinating technical trends right now is the emergence of multimodal models. These models go beyond traditional language processing: they can ingest and understand various modalities such as images, video, audio, or other sensor data (for instance, a robot's sensor or movement data). The output can also be multimodal: images, video, audio, or even robot movements. Vision-Language-Action (VLA) models are at the forefront of this trend, revolutionizing robotics by allowing robots to partially guide themselves based on the world understanding provided by LLMs.
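As a small illustration of multimodal input, the sketch below sends a text question and an image together to a vision-capable chat model. The model name and image URL are placeholders, not recommendations.

```python
# A minimal sketch of multimodal input: an image plus a text question sent to a
# vision-capable chat model in one request.
# Assumes: pip install openai and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "What machine part is shown, and does it look visibly worn?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/conveyor-bearing.jpg"}},  # placeholder image
            ],
        }
    ],
)
print(response.choices[0].message.content)
```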

The rise of small LLMs

Small language models are becoming more common. Among the largest models, the competition is still largely a race for parameter count, and while these large models can handle more complex tasks, running them locally or on a developer's machine can be challenging. At Smartbi, we are seeing more companies take on the burden of running heavy LLMs in their own server rooms. Small language models, by contrast, are lighter, more agile, and less compute-intensive. They specialize in specific tasks, such as routing customer inquiries, domain-specific technical translation, or assisting with predictive maintenance.
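As an example of how lightweight this can be, the sketch below runs a small open-weight model locally with the Hugging Face transformers pipeline and uses it to route a customer inquiry. The model choice and prompt are illustrative assumptions.

```python
# A minimal sketch of running a small open-weight model locally and using it to
# route a customer inquiry. The model name is an illustrative example of a
# compact instruction-tuned model, not a recommendation.
# Assumes: pip install transformers accelerate
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/Phi-3-mini-4k-instruct",  # a few billion parameters; runs on one GPU or even CPU
    device_map="auto",
)

prompt = (
    "Classify the following customer inquiry as 'billing', 'technical', or 'other'. "
    "Answer with one word.\n"
    "Inquiry: The dashboard stopped updating after last night's firmware update.\n"
    "Answer:"
)
result = generator(prompt, max_new_tokens=5, do_sample=False)
print(result[0]["generated_text"])
```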

LLM as a thinking aid

There is a heated debate about whether language models qualify as general AI or merely replicate patterns from their training data. Some critics argue that models like GPT-3 or GPT-4 are sophisticated parrots, mimicking the vast amounts of text they were exposed to during training. We argue, however, that modern LLMs are evolving beyond simple task execution into powerful thinking aids. These models can actively support and improve human cognitive processes: they assist in brainstorming, offer out-of-the-box solutions, and help with problem-solving by contributing new ideas.

Getting started with LLMs

When choosing an LLM, the most crucial step is understanding the need and the purpose the model will serve. Is it an automation tool, a thinking aid, or something else? Here are a few steps to get started; a small sketch of what steps 2 to 4 can look like in practice follows the list:

  1. Identify the need and use case
  2. Choose a suitable language model, either open-source or commercial
  3. Train the model to suit your needs
  4. Implement and measure results
  5. Iterate and continuously improve
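
The following sketch illustrates steps 2 to 4 under simple assumptions: a hosted model is chosen, adapted to an inquiry-routing task with a task-specific prompt rather than full training, and measured against a handful of labelled examples. The model name, prompt, and test cases are all illustrative.

```python
# A minimal sketch of steps 2-4: pick a model, adapt it to the task, and
# measure results on a few labelled examples.
# Assumes: pip install openai and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # step 2: illustrative hosted model; an open-weight model works the same way

SYSTEM_PROMPT = (  # step 3: lightweight adaptation through instructions instead of full training
    "Route each customer inquiry to exactly one team: 'billing', 'technical', or 'sales'. "
    "Answer with the team name only."
)

# step 4: a tiny labelled test set to measure results before rolling out
test_cases = [
    ("My invoice from May is missing a line item.", "billing"),
    ("The sensor gateway keeps dropping its connection.", "technical"),
    ("Can we get a quote for 50 more licenses?", "sales"),
]

def classify(inquiry: str) -> str:
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "system", "content": SYSTEM_PROMPT},
                  {"role": "user", "content": inquiry}],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()

correct = sum(classify(text) == label for text, label in test_cases)
print(f"Accuracy on test set: {correct}/{len(test_cases)}")
```

In practice, the test set would be larger, and the prompt, model choice, or a fine-tuning approach would be iterated until the measured results are good enough to deploy (step 5).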

By following these steps, you can effectively harness the power of LLMs to meet your specific industrial needs, whether in operations or management. The world of language models is rapidly evolving, offering exciting opportunities for businesses that are willing to explore it.

Get in touch

Could AI be applied to your business case?

Subscribe to the newsletter and learn how AI can solve business challenges.