Building intelligent customer service through data science

Customer services can be frustrating. From negotiating complex, multi-option menus over the phone to reading long lists of FAQs online, we can often end up bored and irritated. Fortunately, forward-thinking businesses that understand the power of technology are starting to provide better alternatives.

2017-07-13

Data Science

Octopus Energy is one such company. It has a single e-mail address, hello@octopus.energy, to which customers can send any query they like — and with the guarantee that they will receive a response from a real person. Simplifying the experience for customers has huge value, and you only need to look at Octopus Energy’s ratings on TrustPilot to see that it works.

After welcoming its first customers around a year ago, it now provides electricity and gas to roughly 100,000 homes across the UK. As it continues to grow, it is important that its customer services model can scale efficiently.

Because Octopus does not ask customers to specify the nature of their problem, members of the customer services team must read each new message and decide who should deal with it based on its content. This classification step, which is currently a bottleneck in the process, is ripe for replacement by an AI system. The goal of my ASI Fellowship project was to build such a system.

I was given access to all Octopus Energy’s customer messages in order to solve this problem. Using SherlockML (ASI Data Science’s data science platform), I processed and cleaned the data into a format that was ready for some machine learning analysis. I then built a pipeline to classify messages, and trained it on a data set that I constructed.

The first stage of the pipeline converts the text of an email into a long list of numbers that machine learning algorithms can work with. The main component of this step is a ‘tf-idf’ (term frequency-inverse document frequency) transformation. Tf-idf assigns each word in an email a score that increases with how often the word appears in that email, and decreases with the number of emails in the whole collection that contain it.

Intuitively, this process produces a large score when a word is very common in an email, but appears in few other emails. This allows the computer to pick out which words are likely to distinguish different types of message.
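To make the scoring concrete, here is a minimal sketch of one common tf-idf variant in plain Python. It is an illustration rather than the code used in the project: the function name `tfidf`, the whitespace tokenisation, the exact formula (term count divided by email length, times the log of the inverse document frequency) and the three example emails are all assumptions.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {word: score} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    Real libraries often add smoothing and normalisation on top of this.
    """
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    # Document frequency: the number of documents that contain each word.
    doc_freq = Counter(word for tokens in tokenised for word in set(tokens))
    scores = []
    for tokens in tokenised:
        counts = Counter(tokens)
        scores.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


# Hypothetical example emails, invented for illustration.
emails = [
    "please update my direct debit",
    "my meter reading is attached",
    "my meter reading for march",
]
scores = tfidf(emails)
```

In this toy collection, "my" appears in every email and so scores zero everywhere, while "debit" appears in only one email and scores higher than "meter", which appears in two.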

The next step in the pipeline is to classify the message. For this, I trained a Support Vector Machine (SVM) classifier on past messages from customers. Once the emails have been converted into numbers, they can be imagined as points in a high-dimensional space. SVMs attempt to find surfaces in this space that best separate the different types of message. For example, messages from customers submitting their meter readings will occupy a different region of this abstract space from messages from customers who want to change their direct debit details.
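The idea of a separating surface can be sketched with a small linear SVM trained by subgradient descent on the hinge loss. This is a simplified stand-in, not the project's classifier: it handles only two classes (a multi-class version could train one such model per message type), it uses raw word counts instead of tf-idf scores, and the function names, parameter values and example messages are all hypothetical.

```python
from collections import Counter, defaultdict


def featurise(text):
    """Bag-of-words counts (a stand-in for tf-idf features)."""
    return Counter(text.lower().split())


def train_linear_svm(examples, epochs=50, lam=0.01, lr=0.1):
    """Minimise (lam / 2) * |w|^2 + hinge loss by subgradient descent.

    `examples` is a list of (features, label) pairs, label in {+1, -1}.
    """
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            score = sum(weights[w] * v for w, v in features.items()) + bias
            # Regularisation step: shrink every weight slightly.
            for w in list(weights):
                weights[w] *= 1 - lr * lam
            # Hinge-loss step: only examples inside the margin (or on the
            # wrong side of the surface) move the separating hyperplane.
            if label * score < 1:
                for w, v in features.items():
                    weights[w] += lr * label * v
                bias += lr * label
    return weights, bias


def predict(model, text):
    weights, bias = model
    features = featurise(text)
    score = sum(weights.get(w, 0.0) * v for w, v in features.items()) + bias
    return 1 if score >= 0 else -1


# Hypothetical training messages: +1 = meter reading, -1 = direct debit.
training = [
    ("my meter reading is 4521", +1),
    ("here is the gas meter reading", +1),
    ("meter reading for this month", +1),
    ("please change my direct debit", -1),
    ("update direct debit details", -1),
    ("direct debit date change", -1),
]
model = train_linear_svm([(featurise(text), label) for text, label in training])
```

After training, words such as "meter" and "reading" carry positive weights and "direct" and "debit" carry negative ones, so a new message falls on one side of the surface or the other depending on which words it contains.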

Using SherlockML’s ability to spin up large servers in the cloud, I trained a large number of candidate pipelines, each with different settings of the many tuneable parameters, and chose the one that performed best on held-out data that was not used for training.
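Choosing between candidate pipelines in this way can be sketched as a small grid search scored on held-out data. Everything here is assumed for illustration: the word-vote classifier is a deliberately simple stand-in for the real pipeline, and the two parameters (`lowercase`, `min_length`), the example messages and the `"unknown"` fallback label are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product


def tokenise(text, lowercase, min_length):
    if lowercase:
        text = text.lower()
    return [word for word in text.split() if len(word) >= min_length]


def train(examples, **params):
    """Count how often each word appears under each label."""
    votes = defaultdict(Counter)
    for text, label in examples:
        for word in tokenise(text, **params):
            votes[word][label] += 1
    return votes


def predict(votes, text, default, **params):
    """Each known word votes for the labels it was seen with."""
    tally = Counter()
    for word in tokenise(text, **params):
        tally.update(votes.get(word, {}))
    return tally.most_common(1)[0][0] if tally else default


def accuracy(votes, examples, default, **params):
    hits = sum(predict(votes, text, default, **params) == label
               for text, label in examples)
    return hits / len(examples)


def grid_search(train_set, heldout_set, grid, default="unknown"):
    """Train one model per parameter combination; keep the best scorer."""
    best_score, best_params = None, None
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        model = train(train_set, **params)
        score = accuracy(model, heldout_set, default, **params)
        if best_score is None or score > best_score:
            best_score, best_params = score, params
    return best_score, best_params


# Hypothetical data; the held-out messages are never used for training.
train_set = [
    ("Meter reading attached", "meter"),
    ("my meter reading is 4521", "meter"),
    ("Change my direct debit", "billing"),
    ("direct debit details", "billing"),
]
heldout_set = [
    ("METER READING for March", "meter"),
    ("Direct Debit date", "billing"),
]
grid = {"lowercase": [False, True], "min_length": [1, 4]}
best_score, best_params = grid_search(train_set, heldout_set, grid)
```

On this toy data the case-sensitive settings fail on the held-out messages because the capitalised words were never seen in training, so the search selects a setting with `lowercase=True`. Scoring on messages the model has not seen is what makes the comparison between settings meaningful.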

The pipeline worked well, and I worked with Octopus Energy’s tech team to connect it to their existing infrastructure. Furthermore, the classifier can continue to improve over time through a process called ‘online learning’, in which it learns from new messages and from its past mistakes.
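One simple way to picture online learning is a classifier that updates its weights one message at a time, and only when it gets a message wrong. The sketch below uses a multi-class perceptron over word counts, which is a swapped-in technique chosen for brevity. How the deployed system performs its updates is not described here, and the class name, labels and messages are hypothetical.

```python
from collections import defaultdict


class OnlinePerceptron:
    """Multi-class perceptron over words, updated one message at a time."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.weights = {label: defaultdict(float) for label in self.labels}

    def _score(self, label, words):
        label_weights = self.weights[label]
        return sum(label_weights.get(word, 0.0) for word in words)

    def predict(self, text):
        words = text.lower().split()
        # max() returns the first label on ties, so results are deterministic.
        return max(self.labels, key=lambda label: self._score(label, words))

    def learn(self, text, true_label):
        """Predict first, then adjust the weights only after a mistake."""
        predicted = self.predict(text)
        if predicted != true_label:
            for word in text.lower().split():
                self.weights[true_label][word] += 1.0
                self.weights[predicted][word] -= 1.0
        return predicted


# Hypothetical stream of messages arriving over time, with the label a
# human agent eventually assigned to each one.
stream = [
    ("my meter reading is 4521", "meter"),
    ("change my direct debit", "billing"),
    ("meter reading for march", "meter"),
    ("direct debit details", "billing"),
]
classifier = OnlinePerceptron(["billing", "meter"])
mistakes = sum(classifier.learn(text, label) != label for text, label in stream)
```

Here the classifier misclassifies the first two messages, corrects its weights each time, and then handles the remaining two correctly without any retraining from scratch.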

With this new system in place, Octopus Energy can continue to provide its amazing customer services — but at scale.

Angus Williams took part in the Faculty Fellowship in May 2017. Prior to the Fellowship, he completed a PhD in astronomy at the University of Cambridge.
