
Find Your Best LLM: Unify Helps Select the Right LLM Quickly and Cost-Efficiently

JadeWorrall

With the help of the Unify LLM router, companies can easily and cost-effectively determine which LLM provides the best outcome for a given prompt, based on the output quality, cost, and speed of existing models.
Intel® Liftoff member Unify is a central gateway to all Large Language Model (LLM) endpoints, with a single API to query all endpoints using one key. Beyond accessing all models, the Unify router sends each prompt to the best LLM, providing the optimal combination of quality, speed, and cost.
Unify’s live benchmarks provide an unbiased view of endpoint performance, bringing clarity to the LLM space and enabling informed decisions.
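
As a rough sketch of what querying many models through one gateway might look like, here is a minimal example, assuming an OpenAI-compatible chat-completions interface; the base URL and the model identifiers are illustrative assumptions, not taken verbatim from Unify’s documentation:

```python
# Minimal sketch: one key, one client, many model endpoints behind a gateway.
# Assumptions (not verbatim from Unify's docs): the gateway speaks the
# OpenAI-compatible chat-completions protocol at this base URL, and models
# are addressed as "model@provider" strings.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.unify.ai/v0",  # assumed gateway URL
    api_key="YOUR_UNIFY_KEY",            # single key for every endpoint
)

# The same call works for any model/provider pair behind the gateway.
for model in ["llama-3-8b-chat@together-ai", "gpt-4o@openai"]:  # illustrative IDs
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Summarize RAG in one sentence."}],
    )
    print(model, "->", response.choices[0].message.content)
```

The practical effect is that switching models becomes a one-string change rather than a new provider sign-up.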


The Challenge

Each model is trained on different datasets and with different training schemes, making some more adept at handling specific tasks than others. As a result, finding the right LLM for a task often involves manual and costly testing to understand how each model handles different kinds of prompts.

With the abundance of specialized LLMs, rigorous testing is becoming increasingly unsustainable, particularly on a per-prompt basis. For each model, you need to sign up with a given provider, experiment and test, make sense of the results, and then repeat the same process with other models to compare across various setups, tasks, and prompt variations.

While a simple solution could be to use a general, all-purpose model such as GPT, the issue with this approach is twofold. First, using large, proprietary models may not be an option when the processed data is sensitive. Second, smaller expert models may deliver better results faster and at lower cost, especially when the tasks aren’t complex.

Unify proposes a hassle-free alternative, allowing users to automatically pick the right model for a task while ensuring optimal outputs on each prompt. 

[Figure: Selecting the right LLM with the Unify platform means striking the right mixture of quality, throughput, and cost.]

The Solution

The Unify router determines which LLM provides the best outcome for a given prompt, based on the output quality, cost, and speed of existing models.

At a high level, the router is trained to assign a quality score to each LLM for a given task, based on a representative benchmark dataset. The router then uses Unify's live, runtime benchmarks of LLM endpoints to gather data on the speed and cost of a model across multiple endpoint providers. The quality score, endpoint cost, and speed then serve as the basis for a routing scheme, steering the router’s selection towards the preferred metric(s) depending on the use case.
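
To make the routing idea concrete, here is a minimal, illustrative scoring sketch, not Unify’s actual algorithm: each candidate endpoint gets a weighted combination of normalized quality, cost, and speed, and the router picks the highest-scoring one. All endpoint names, metric values, and weights below are invented for the example:

```python
# Illustrative routing sketch: pick the endpoint that maximizes a weighted
# score. The weights, metric values, and endpoint names are invented for this
# example; Unify's trained router is more sophisticated than this.

# Per-endpoint metrics: quality in [0, 1] (from a benchmark dataset),
# cost in $ per 1M tokens, speed in tokens/second (from live benchmarks).
endpoints = {
    "small-expert@provider-a":  {"quality": 0.78, "cost": 0.20, "speed": 180.0},
    "mid-size@provider-b":      {"quality": 0.85, "cost": 0.90, "speed": 95.0},
    "large-general@provider-c": {"quality": 0.91, "cost": 5.00, "speed": 40.0},
}

def route(endpoints, w_quality=1.0, w_cost=0.5, w_speed=0.2):
    """Return the endpoint name with the best quality/cost/speed trade-off."""
    def normalizer(values):
        lo, hi = min(values), max(values)
        return lambda v: (v - lo) / (hi - lo) if hi > lo else 0.0

    norm_q = normalizer([m["quality"] for m in endpoints.values()])
    norm_c = normalizer([m["cost"] for m in endpoints.values()])
    norm_s = normalizer([m["speed"] for m in endpoints.values()])

    def score(m):
        # Higher quality and speed raise the score; higher cost lowers it.
        return (w_quality * norm_q(m["quality"])
                - w_cost * norm_c(m["cost"])
                + w_speed * norm_s(m["speed"]))

    return max(endpoints, key=lambda name: score(endpoints[name]))

print(route(endpoints))                              # quality-leaning defaults
print(route(endpoints, w_quality=0.2, w_cost=1.0))   # cost-sensitive routing
```

Shifting the weights steers the same pool of endpoints toward quality, cost, or speed, which is the essence of configuring a routing scheme per use case.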

You can use the default router configurations as ready-to-use routing schemes, but you can also customize and train your own router:


  1. Create a profile by signing up for a Unify account.
  2. Upload a dataset of prompts that represent the task you want to use LLMs for (a sketch of such a dataset appears below this list).
  3. Use the trained router through the online interface or via the API to start routing optimally for your use case.
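
For step 2, a representative prompt dataset can be as simple as one prompt per line; the JSONL layout below is an illustrative guess, not Unify’s required upload schema:

```python
# Illustrative prompt dataset for training a custom router. The JSONL schema
# (one {"prompt": ...} object per line) is an assumption for this sketch, not
# Unify's documented upload format.
import json

prompts = [
    "Summarize this support ticket in two sentences: {ticket}",
    "Extract the invoice number and total amount from: {document}",
    "Classify the sentiment of this product review: {review}",
]

with open("router_training_prompts.jsonl", "w") as f:
    for p in prompts:
        f.write(json.dumps({"prompt": p}) + "\n")
```

Once trained, the custom router is then queried through the same single-key API shown earlier, in place of a fixed model identifier.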


To customize further, users will soon be able to provide their own endpoints to route across when training a router. 

[Figure: Unify's LLM router determines the best possible output across the LLMs of choice.]


“The Unify API can be used across hardware devices, including Intel® hardware. In order to optimize our platform constantly, we have received a lot of support working with Intel® during both the Intel® Ignite program and also Intel® Liftoff. We will soon add Intel® Gaudi® AI accelerator-based endpoints to our platform, which will give us another boost in terms of training our AI platform as efficiently as possible.” - Daniel Lenton, CEO at Unify

Joining the Intel® Liftoff program has given Unify access to unique expertise, resources, and collaboration opportunities with fellow cohort members. The program's mentorship has helped shape the startup's value proposition, both with Intel members and with other founders in the community.

Unify was also part of Intel® Ignite, where Kevin Crain (CTO at Intel Ignite) put the company in touch with the Intel® Liftoff team. Unify also had exclusive early access to versions of the Intel® Gaudi® AI accelerator architecture, which greatly aided the development of their LLM platform.

For any company or individual deploying LLMs, Unify is a straightforward answer to the ongoing issue of model selection and an efficient way to optimize the performance of LLM pipelines for any use case, balancing quality, speed, and cost.

Are you at the forefront of innovation, harnessing the power of Large Language Models (LLMs) to drive your projects forward? If this sounds like you, we would love to help you accelerate towards your goals. Apply to the Intel® Liftoff Program for Startups and let’s get started.

About the Author
As a Software Tools Ecosystem Specialist at Intel, I’ve had the privilege of working on the dynamic GenAI initiative. My focus is on driving engagement with the software developer audience. I'm a proud team member of the Intel® Liftoff for Startups and the Developer Engagement, Relations, and Studio team.