A leading process engineering firm with annual revenue of more than £2 billion, operating in 140 countries.
Unplanned machine outages cause major disruption and carry significant costs in any manufacturing process. In this case, the client’s machines process up to 6 tons of raw food per hour, so any unplanned interruption would cause millions of pounds of lost productivity as well as potential food spoilage, compounding the financial impact. Predicting when and how a machine might fail would allow the client to schedule work proactively, instead of scrambling to minimise the impact once an outage occurs.
The dataset available to Faculty was a series of unstructured log files from around 2,000 machines, dating back to 2001. The first step was to parse, ingest and clean the data, which was done using Faculty Platform. Once this was complete, we were able to enrich the dataset with external information, such as geographical data, and with details of the machines, such as submodel or year of manufacture.
In the course of preparing to tackle the main aim of the project – predicting machine errors before they occur – Faculty was able to confirm one of the client’s hypotheses. While all machines in the dataset are built with similar components, they are dispatched into different environments to process different commodities. As a result, they log different kinds of error report. For instance, in machines sorting rice in India, the ejector system is more likely to fail, while in machines sorting wheat in the USA, the feeding system is more exposed to faults. Faculty investigated these effects using unsupervised clustering techniques and found strong associations between environmental information and error profiles. These insights raise the possibility of tailored maintenance packages that help customers keep their machines performing optimally.
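To make the idea of clustering error profiles concrete, here is a minimal sketch. It is not Faculty's actual method: the source does not name the clustering algorithm, so plain k-means is assumed here, and the machine IDs, the error counts and the "optics" category are all hypothetical. Each machine's error counts are normalised into proportions, and machines with similar proportions are grouped together.

```python
import math

def error_profile(counts, categories):
    """Turn raw error counts into proportions so busy and quiet machines are comparable."""
    total = sum(counts.get(c, 0) for c in categories)
    if total == 0:
        return [0.0] * len(categories)
    return [counts.get(c, 0) / total for c in categories]

def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from every centroid chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical data: error counts per subsystem for four invented machines.
CATEGORIES = ["ejector", "feeding", "optics"]
machines = {
    "rice-01": {"ejector": 40, "feeding": 5, "optics": 5},
    "rice-02": {"ejector": 70, "feeding": 10, "optics": 20},
    "wheat-01": {"ejector": 4, "feeding": 30, "optics": 6},
    "wheat-02": {"ejector": 10, "feeding": 80, "optics": 10},
}
profiles = [error_profile(c, CATEGORIES) for c in machines.values()]
labels = kmeans(profiles, k=2)
clusters = dict(zip(machines, labels))
```

In this toy data the ejector-heavy machines land in one cluster and the feeding-heavy machines in the other. Comparing cluster membership with environmental attributes such as country or commodity is what would reveal the kind of association described above.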
By looking at the pattern of fatal and non-fatal errors, we were able to devise an alerting system that predicts the status of the machine in 30 minutes’ time. Possible outputs were ‘fully operational’, ‘non-critical error’ and ‘critical error’. This means that operators could be warned of errors 30 minutes in advance, giving them the chance to intervene.
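One way to frame this as a supervised learning problem is to attach, to every observation time, a label describing what happens over the following 30 minutes. The sketch below illustrates that labelling step only. The event format and the severity codes are hypothetical, as is the choice to label with the worst error logged in the 30-minute window after the observation; the source does not say how the target was constructed.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

HORIZON = timedelta(minutes=30)
# Index = severity code (assumed): 0 none, 1 non-critical, 2 critical.
LABELS = ("fully operational", "non-critical error", "critical error")

def label_at_horizon(events, t, horizon=HORIZON):
    """Label observation time t with the worst error logged in (t, t + horizon].

    events: list of (timestamp, severity) tuples sorted by timestamp.
    """
    times = [ts for ts, _ in events]
    lo = bisect_right(times, t)            # first event strictly after t
    hi = bisect_right(times, t + horizon)  # one past the last event in the window
    worst = max((sev for _, sev in events[lo:hi]), default=0)
    return LABELS[worst]

# Hypothetical log for one machine.
events = [
    (datetime(2020, 1, 1, 10, 10), 1),  # non-critical error
    (datetime(2020, 1, 1, 10, 40), 2),  # critical error
]
early = label_at_horizon(events, datetime(2020, 1, 1, 10, 0))
later = label_at_horizon(events, datetime(2020, 1, 1, 10, 15))
quiet = label_at_horizon(events, datetime(2020, 1, 1, 11, 0))
```

A classifier trained on features computed only from data up to time t, with these labels as the target, would then produce one of the three statuses as its forecast.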
Even when tuned to minimise false alarms, the algorithm still catches 20% of these errors 30 minutes in advance, before they affect the process.
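The trade-off behind this figure can be sketched as picking an alert threshold: a stricter threshold means fewer false alarms, but also fewer errors caught. The example below assumes the model outputs a risk score per prediction, which the source does not confirm, and uses invented scores and outcomes. It finds the lowest threshold at which the share of alerts that are genuine (precision) stays above a chosen floor, and reports the fraction of real errors caught (recall) at that threshold.

```python
def recall_at_precision(scores, labels, min_precision):
    """Return (threshold, recall) for the lowest threshold whose precision
    is at least min_precision, or None if no threshold qualifies.

    scores: model risk scores; labels: 1 if an error really followed, else 0.
    An alert is raised when score >= threshold.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return None
    ranked = sorted(zip(scores, labels), reverse=True)
    best = None
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Everything sharing this score is alerted on together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        if tp / (tp + fp) >= min_precision:
            # Thresholds only decrease, so the last match is the lowest one.
            best = (threshold, tp / total_pos)
    return best

# Invented scores and outcomes, for illustration only.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
strict = recall_at_precision(scores, labels, min_precision=1.0)
relaxed = recall_at_precision(scores, labels, min_precision=0.75)
```

With these made-up numbers, allowing no false alarms at all catches two of the five errors, while tolerating one false alarm in four alerts catches three. The operating point for a real deployment would depend on how costly a false alarm is compared with a missed outage.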