While the concept of artificial intelligence (AI) is not novel, tools such as ChatGPT, which leverage generative AI (GenAI), have thrust it into the limelight. As a result, the term “AI” quadrupled in usage in 2023, with the Cambridge Dictionary even declaring “hallucinate” (when used in the context of AI) to be its “word of the year.”
These terms have become popular for good reason, with a plethora of impressive public-facing models captivating imaginations. Tools such as Midjourney have allowed people to create photorealistic art from simple descriptions, fueling conversation about what can be achieved with GenAI. These conversations have raced far beyond curiosities as institutions like the UK’s Department of Education and industry-leading businesses aim to take advantage of the technology.
The challenge is not just figuring out what GenAI can do but also getting to grips with the new terminology and acronyms that come with it.
In this post, we set out the key terms you need to know in the rapidly evolving field of GenAI. From Large Language Models (aka LLMs) to prompt engineering, we’ve got you covered. Let’s dive right in.
Generative AI – your essential glossary
Bias is the tendency for AI to replicate the prejudices and disparities inherent in the data it was trained on. For example, a search for “office worker” in a stock photo library will return a huge number of mostly male, mostly white people in suits. As a result, an AI trained on images from around the web will produce images of mostly male and white office workers.
However, It is possible to avoid this by training the AI with a larger and more diverse variety of images, such as photos of women and people of colour.
ChatGPT is arguably the most popular generative AI system out there at the moment. It takes OpenAI’s foundation model, GPT, and tunes it for use in a chat interface. Like Google Bard and Llama 2, it is one of many LLMs available to users (more on LLMs below).
Embeddings are numerical representations of the meaning of things like words, images, or audio. Despite their huge complexity, the neural networks (see more on this term below) that underpin generative AI are just mathematical models that manipulate numbers. As a result, processing something like a word requires converting it into a numerical representation that is meaningful.
Real embeddings are lists of thousands of numbers, but if we imagine a four number embedding, it would look something like this:
Cat and dog are similar concepts – they are small, mammalian household pets. As a result, they have similar embeddings. Since apples are not like cats or dogs, most of their embedding values are different from the first two but would be similar to a pear, for example.
Fine tuning is the act of further training a foundation AI model in order to make it more suited to a specific task. This involves taking pairs of example inputs and desired outputs and using them to adjust the foundation model by forcing it to adapt to match the desired output.
For example, if someone using an LLM to write marketing emails found that it produced dry, stiff-sounding emails that did not match the company style, they could collate a dataset of all subject headings and email body text from past marketing emails. Using the subjects as inputs and the email body as target outputs, they could tune the model to closely match their preferred brand style.
Foundation models are the incredibly large, general purpose models that have driven the generative AI revolution. The vast amount of computing power and data required to train state-of-the art AI means that we increasingly rely on providers like OpenAI (the creators of GPT) to create models that we can adapt to our specific needs rather than starting from scratch.
Guardrails are mechanisms used to try to ensure that the output of an LLM stays within defined boundaries. As an example, we could create a simple model that refuses to answer if a prompt asks for instructions on illegal activities. Guardrails can be used to defend against hallucinations, bias, and offensive outputs.
Generative AI is a subset of AI that can produce new data. Traditional AI (sometimes called predictive or discriminative AI) is limited to tasks like identifying people in images or discriminating between positive and negative sentiment in tweets. A generative AI model can create entirely new images of people or even write tweets of its own.
Hallucination is the tendency for LLMs to produce coherent but factually incorrect text. LLMs are essentially highly sophisticated predictive text machines that generate sequences of words based on patterns they see in the data used to train them. The huge corpus of text used to train state-of-the art LLMs means they often produce factual statements, but this is not always the case.
For example, words like ‘carnivore’ and ‘antelope’ will appear often enough alongside the word ‘lion’ in human-created text that, when asked what a lion eats, an LLM will produce a correct answer by just stringing together words it knows go together. When asked how a lion’s digestive system differs from a human’s, and the LLM may find itself in uncharted territory. As a result, it may start making up plausible text sequences with no relation to reality.
A Large Language Model (LLM) is a type of generative AI that uses a special type of neural network called a transformer to predict how a text sequence should continue. The key difference between an LLM and your phone’s predictive text is that these transformers are capable of holding a great deal of context in memory, meaning they can create long, coherent, and relevant text responses to any input.
Neural networks are machine learning models conceptually based on the way biological brains work. Simply put, they are very flexible mathematical functions that take in a list of numbers, transform them in some way, and output another list. The transformations are not programmed but are instead learned in a trial and error fashion by providing the network with thousands to millions of examples of an output you would expect from an particular input.
An example of this would be recognising cars in images. One could gather thousands of images, label them as containing cars or not, and then set up a neural network to receive an image as a list of pixel brightnesses and output a number from 0 to 1 based on how sure it is that there is a car. By showing it thousands of example images, the neural network will learn a strategy to reliably transform the input pixel values into an accurate guess.
A Prompt is the text a user provides to a generative AI to begin its output. This may take the form of a description for an image-generating AI or a simple question for an LLM trained to work through a chat interface.
Example of a prompt:
Please write an AI-themed blog post called “Understanding the terminology.”
Prompt engineering is the practice of experimenting with different prompts to get better outputs. Often, approaches such as providing general instructions on how to behave (system prompting) or giving examples before asking the LMM to complete a task (few-shot prompting) can produce much better results than the most natural prompt a person might think to write.
Here is an example of how you could edit a very simple prompt to give more instruction and context to the LLM and obtain better outputs.
You are a knowledgeable blogger with a casual style writing blog posts for work. You have been asked to write a glossary of AI terms. They should be in alphabetical order and be complete paragraphs that start with the term to be defined.
Retrieval Augmented Generation (RAG) is a way to have an LLM access a knowledge base when writing its responses. Since embeddings represent meanings, a database of the embeddings of a series of documents can be used to search for documents with similar meanings to a prompt. Then the LLM is sent both the relevant documents and the prompt and is able to craft a response with added contextual knowledge.
A textbook with detailed definitions of AI terms would be a useful resource for a RAG approach to generating this blog. For each term, we could prompt something like “please define retrieval augmentation generation.” The system would then search the textbook (crucially by meaning, not keyword) for passages related to RAG and then use them to produce the desired blog-style paragraph.
Reinforcement learning from human feedback (RLHF) is a method of tuning foundation models by having users of a system rate the responses of the model. The model will adapt itself in response to that feedback, constantly chasing positive ratings. RLHF is one of the methods used to create ChatGPT from the underlying GPT model.
A Token is the smallest unit of text that an LLM understands. Texts used to train an LLM would have been broken down into words or even single letters, and the model’s job would be to take a sequence of these tokens and learn to predict the next token.
For example, a word-based tokenizer would convert
Write a blog about generative AI terms
Into the following tokens:
Write, a, blog, about, generative, AI, terms
The LLM would then predict a logical continuation of that sequence of tokens, producing something like
Sure, !, here, is, a, blog, post, about , generative, AI, terms…
Feeling more up-to-date on all the GenAI terminology?
Hopefully, now that you have our handy GenAI glossary by your side, you will find it easier to navigate through the constantly changing sea of GenAI terms and acronyms.
If there are any words we missed that are still proving to be a mystery to you, or if you would like to discuss anything covered in this post in more detail, please reach out to our team for a chat.