The 3 Essential Rules for Applying GPT-4 in an Enterprise Context
Chief Commercial Officer
The capabilities of large language models (LLMs) are breathtaking. But while the potential benefits are exhilarating, implementing them effectively requires careful planning.
At Faculty, we’ve worked at the cutting edge of AI for nearly a decade. Recent advances, such as GPT-4 from our partners OpenAI, have outpaced any other development in the field.
From building hundreds of AI solutions for private and public sector organisations, we’ve seen most of what can go right. And what can go wrong.
If your organisation is ready to get on board with LLMs, here are three essential rules to ensure your solution is a success.
1. You need to find the right problem
LLMs are experiencing a Cambrian Explosion moment.
For pretty much every business problem in the world, someone, somewhere will be experimenting with an LLM-powered solution.
Capabilities like computer vision, which took off during the acceleration of deep learning a decade ago, went through similar moments of explosive creativity, albeit not on the same scale.
But the brutal truth is that most of those experiments failed to deliver solutions that scaled. And the same will be true with LLMs.
It’s through this kind of experimentation, and the failures that go with it, that society finds out where the value of an innovation lies. It’s an essential, even beautiful process.
But of course, it’s preferable not to invest in one of the experiments that fail.
You can massively increase your chances of success by picking the right problem to target.
The things that LLMs are good at are now well known. They can generate highly convincing text from simple prompts. They can summarise and classify text, retrieve information from it, generate images from text and transcribe speech to text. And, of course, they can provide engaging chat interfaces.
It’s these capabilities that are fuelling the rapidly expanding list of use cases. Anything from financial forecasting, to legal and medical advice, to automating call summaries in contact centres.
At face value, all of these seem like plausible applications. But in the long run, some will prove more so than others.
Each use case’s viability will depend on how closely the problem fits the important details of how LLMs work. Whether the problem centres on natural language or on structured data, for instance, is vital.
So too is whether the problem can tolerate small errors, or whether individual errors can have a significant impact. Judge applications along these two dimensions, and some will emerge as far more credible use cases than others.
Ultimately, to choose the right use case, you need to be mindful of how well the details of your problem map onto how LLMs work. But that’s just the start.
[Figure: Use case examples grouped by the importance of natural language (y-axis) and sensitivity to individual errors (x-axis). The top-right quadrant represents the most viable use cases for LLMs.]
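The quadrant framework above can be sketched as a simple scoring function. Everything here is illustrative: the dimension names, the 0.5 thresholds, and the example scores are assumptions for the sketch, not a real assessment methodology.

```python
# Illustrative sketch: placing candidate use cases on the two dimensions
# discussed above. Scores and thresholds are hypothetical.

def quadrant(nl_importance: float, error_tolerance: float) -> str:
    """Place a use case on the quadrant chart.

    nl_importance: 0-1, how central natural language is to the problem.
    error_tolerance: 0-1, how well the process absorbs individual errors.
    """
    if nl_importance >= 0.5 and error_tolerance >= 0.5:
        return "strong candidate"      # top-right quadrant
    if nl_importance >= 0.5:
        return "needs human review"    # language-heavy but error-sensitive
    if error_tolerance >= 0.5:
        return "consider simpler ML"   # structured data, errors tolerable
    return "poor fit"

# Hypothetical scores for two of the use cases mentioned above.
candidates = {
    "call summarisation": (0.9, 0.8),
    "financial forecasting": (0.2, 0.3),
}
for name, (nl, tol) in candidates.items():
    print(f"{name}: {quadrant(nl, tol)}")
```

The point of scoring rather than debating in the abstract is that it forces an explicit, comparable judgement for each candidate use case.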
2. You need to integrate LLMs into an end-to-end system
The ‘chasm of death’ between a PoC and production technology in AI is well documented.
Even if a use case is well suited to an LLM, many applications will never make it beyond a fun but ultimately pointless widget. Treated as standalone tools, they fail to take root and inevitably wither and die.
To succeed, your AI solution must be plugged into a business process in a thoughtful way, and then implemented in a well-designed, well-engineered, and properly maintained software system.
In our experience, some of the most successful implementations are Intelligent Decision Systems. They use data and AI to infer causal models of the processes that drive business outcomes.
They help users to make confident, informed decisions on when and how to intervene. And they log the impact of their actions in a way that creates a closed feedback loop.
If embedded thoughtfully, LLMs have the potential to supercharge these Intelligent Decision Systems.
For example, by synthesising and summarising unstructured data, and by letting users query it in natural language, they can streamline the process by which decision-makers build an understanding of important cause-and-effect relationships.
They can also provide a mechanism for capturing decisions made in natural language, and actioning them by triggering a process in connected software.
But if you are not clear about exactly where and how an LLM enhances your decision flow, you risk implementing technology that will soon be discarded.
Having your LLM-powered solution embedded into the right part of the decision flow is essential. But so too is building the correct technical infrastructure around it.
This includes some common AI components, like data processing pipelines and model validation, as well as others more specific to LLMs, such as prompt pipelines and model fine-tuning.
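As a rough illustration of what a prompt pipeline component might look like, here is a minimal sketch: a prompt template with input validation in front of a pluggable model call. All names here (`build_prompt`, `call_model`, the character limit) are hypothetical stand-ins, not a real Faculty or OpenAI API.

```python
# Minimal sketch of a prompt pipeline: template, validation, model call.
# Everything is illustrative; swap call_model for your actual LLM client.

from string import Template

SUMMARY_PROMPT = Template(
    "Summarise the following call transcript in three bullet points:\n\n"
    "$transcript"
)

MAX_CHARS = 12_000  # illustrative context limit, not a real model constraint

def build_prompt(transcript: str) -> str:
    """Validate and normalise input before it reaches the model."""
    transcript = transcript.strip()
    if not transcript:
        raise ValueError("empty transcript")
    if len(transcript) > MAX_CHARS:
        transcript = transcript[:MAX_CHARS]  # naive truncation for the sketch
    return SUMMARY_PROMPT.substitute(transcript=transcript)

def call_model(prompt: str) -> str:
    # Stand-in for a real LLM call (e.g. a chat-completion request).
    return "- point one\n- point two\n- point three"

print(call_model(build_prompt("Customer rang about a delayed refund...")))
```

Keeping templates, validation, and the model call as separate, testable stages is what makes a prompt pipeline maintainable once it is in production.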
3. You need to build in AI safety from the outset
For every tweet or article showcasing an LLM doing something incredible, there is another showing one behaving unpredictably. In some cases, disastrously so.
Snapchat found this out the hard way when its ‘My AI’ chatbot offered advice to a user posing as a 13-year-old girl on how to hide the smell of alcohol and marijuana, and how to ‘set the mood’ for a sexual encounter.
LLMs’ current limitations are well documented. Hallucinations, bias, unpredictability, and opacity around training data – they all present commercial, legal and reputational risks.
“The right way to think of the models that we create is a reasoning engine, not a fact database.” – Sam Altman, CEO, OpenAI
As this class of technology improves, many of these problems will be quickly ironed out. GPT-4 is already less likely to hallucinate or cover restricted topics than GPT-3 after only a few months of iteration.
But it’s still essential to assess the risks around any application of LLMs, and to build AI safety into the way you deploy the technology from the ground up.
Part of the solution here is to make sure that LLMs are used in safety-appropriate ways. For example, treating them as a source of first reference rather than final reference on matters of fact. As Sam Altman, CEO of OpenAI, said: “The right way to think of the models that we create is a reasoning engine, not a fact database.”
AI safety can also be achieved through a set of technical capabilities. For example, bias detection can be used to ensure that model outputs don’t inadvertently draw on protected characteristics.
Automatic alerts can tell administrators if a system starts behaving in ways that were not intended. Building human feedback into the loop can prevent your model drifting in undesirable directions. And auditing and logging of system responses will make sure that if problems begin to arise you can trace them to their root.
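The auditing and alerting measures above can be sketched in a few lines. This is a toy illustration, assuming a naive keyword blocklist as a stand-in for real moderation and bias-detection tooling; the terms and record format are hypothetical.

```python
# Illustrative sketch: every model response is logged with its prompt
# (audit trail), and a simple checker flags output that matches a
# blocklist so administrators can be alerted. Real systems would use
# proper moderation models, not keyword matching.

import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm_audit")

BLOCKLIST = {"alcohol", "weapon"}  # illustrative terms only

def audited_response(prompt: str, response: str) -> dict:
    """Log a prompt/response pair and flag blocklisted content."""
    flagged = [term for term in BLOCKLIST if term in response.lower()]
    record = {
        "ts": time.time(),
        "prompt": prompt,
        "response": response,
        "flagged": flagged,
    }
    log.info(json.dumps(record))  # structured audit trail for tracing
    if flagged:
        # In production this would page an administrator, not just warn.
        log.warning("alert: flagged terms %s", flagged)
    return record

rec = audited_response("How do I relax?", "Try a walk, not alcohol.")
```

Because every record is logged before the alert check runs, problems can be traced to their root even when the flagging logic itself misses them.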
These measures, and others, need to be implemented with care. But if done correctly, they don’t just reduce the risk of undesirable outcomes. They make that risk more transparent and controllable.
Ready to get started?
So yes, there is a lot to bear in mind to get LLMs right. But if you make sure they’re applied to the right problems, the potential is astronomical.
It’s by implementing them thoughtfully into an end-to-end system, and building in AI safety from day one, that you can ensure lasting and reliable value.
The good news is that all of the above is easily navigated with the right expertise in your corner.
Our mission at Faculty is to help our customers implement intelligent decision systems to take on their most important challenges. LLMs like GPT-4 have the potential to accelerate that like nothing we’ve seen before.
How could your organisation benefit from better decision making powered by LLMs? Let’s find out.