The GenAI hype cycle may be plateauing, but people are talking to - and about - Large Language Models (LLMs) more and more. And unlike almost every technology that’s come before, LLMs are listening and learning, reshaping everything from business reporting to content production.
That was the theme of a conversation Intel® Liftoff’s Rahul Nair hosted with Daniel Han (Dan H.) from Unsloth.ai and Daniel Whitenack (let’s call him Dan W.) from Prediction Guard. The conversation took place at this year’s mlcon 3, a key touchpoint in the AI & ML developer conference circuit.
Their talk was titled "Transforming Human Interaction: How Language Models Impact Communication" - a topic that they’re all uniquely well-placed to tackle. Here are some highlights and surprises from their chat.
How Humans Are Using LLMs: Key Use Cases
Dan W. began by talking about the many and varied use cases that people are finding for LLM technology.
Chatbots are probably the most well known of these, but by no means the only use case. Users are also relying on LLMs to assist them with content generation and publishing - and even reports. Dan W. ventured the interesting notion that LLMs can actually replace dashboards by providing instantaneous natural language answers to business-critical questions.
Less glamorous than both of these is what Dan W. calls “information extraction”: internal, often bureaucratic communication - doctors’ transcripts, for example - that needs to be recorded accurately and confidentially.
Secure, Scalable GenAI with Prediction Guard
All of these use cases (especially the last one) depend on alignment between the LLM’s capabilities and the organization’s requirements and rules. Prediction Guard ensures this alignment by building safeguarded, private and performant AI models for organizations seeking to make the most of LLM technology without the attendant risks. Describing Prediction Guard’s participation in the Intel® Liftoff Program, Dan W. notes: "We’ve been really fortunate to interact with Intel® Liftoff and the Intel® Developer Cloud to optimize both for CPU and Intel® GAUDI processors, for GAUDI 2 processors for inference."
Tuned Up: Unsloth.ai Makes Finetuning 2X Faster, with 70% Less Memory
Alignment isn’t the end of the story: LLMs also need to be finetuned for optimal functionality at scale. This is where Unsloth.ai excels, making finetuning simpler, faster and less memory intensive. With Unsloth, organizations can “train their own ChatGPT in 24 hrs instead of 30 days.”
Crucial to their method, as Dan H. explains, is accuracy: Unsloth doesn’t trade accuracy away through approximations. Instead, they compute everything exactly, rewriting the underlying kernels and optimizing for the hardware to make LLMs leaner, faster and more reliable. As he puts it, “we’re hardware agnostic and perform hardware optimizations across different platforms," and Intel® GPUs and Triton kernels have played a crucial role in optimizing their solutions.
He’s been surprised by some of the emerging applications for LLMs, especially the use of LLMs for mathematical predictions. People show a preference for converting numbers into text, effectively turning an LLM into a linear regression or decision tree system. “It’s weird, but it seems to work.”
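One way to picture this numbers-as-text approach: tabular features are serialized into a natural-language prompt, and the LLM’s completion is parsed back into a prediction. The sketch below shows only the prompt-construction step; the field names and template are illustrative assumptions, not details from the talk.

```python
# Sketch: serializing numeric features into a text prompt so an LLM can
# stand in for a regression model. Field names and the prompt template
# are hypothetical, chosen only to illustrate the idea.

def features_to_prompt(row: dict, target_name: str) -> str:
    """Turn a row of numeric features into a natural-language question."""
    described = ", ".join(f"{name} is {value}" for name, value in row.items())
    return f"Given that {described}, predict {target_name}:"

prompt = features_to_prompt(
    {"square_footage": 1850, "bedrooms": 3, "year_built": 1998},
    target_name="sale price in USD",
)
print(prompt)
# The prompt would then be sent to an LLM, and the text completion
# parsed back into a number.
```

The LLM call itself is omitted here; in practice the completion would be post-processed (e.g. stripped and cast to a float) before being used as a prediction.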
Scaling & Trust: Critical Challenges in LLM Deployment
The hardware and software underlying LLMs is scarce and expensive, which means that even if a model “works”, it can be difficult to scale it to a level that’s useful for business functions.
As Dan W. points out, only a small portion of AI budgets is spent on actual usage; the bulk is earmarked for infrastructure, engineering and other technical demands. He also noted that trust is a major obstacle to the adoption of Generative AI, and drew on lessons he’s learned from a completely different domain to frame the right way to think about trust and AI:
- Understanding: does the system know what you want?
- Motive: “who’s in control of this thing, and where does my data go?”
- Ability: does it actually work? And how do we evaluate it?
- Character: does it work the way you want it to?
- Track record: can we count on this thing to be consistent?
If you get the impression that these all feel a little…human…you’re not alone. Dan W. cautions against anthropomorphizing too much - even though LLMs can behave in ways that are strikingly human-like.
Democratizing Access to GenAI & LLMs: a Bright Future for Intel® Liftoff Startups
Rahul ended the session on an aspirational note, raising the possibility of making LLMs more readily available across the globe.
The panel discussed the potential of developments like edge computing and AI PCs to address challenges like latency as the use of GenAI accelerates and compute demands continue to soar.
Both startups shared their visions for a multi-modal future, pushing the boundaries of what’s possible in their respective domains. At Intel® Liftoff for Startups, we can’t wait to see what they achieve, and we look forward to many more productive conversations like this one.