Flagging extremist content online

February 16, 2018

Introduction

It is well known that online propaganda from terrorist organisations plays a key part in radicalisation in Europe and the UK. In collaboration with the UK Home Office, we recently developed and tested an AI algorithm designed to detect such propaganda on the web.

Following media coverage of the classifier (see, for example, BBC News and The Guardian), there has been much discussion of how it works and how well it performs. In this post, we’ll do our best to outline the general approach we took to design the classifier without exposing the inner workings of the algorithm. We also specify precisely the metrics used and the performance achieved by the classifier.

Binary classification

Let’s briefly review the basic problem – binary classification. Given a media file (audio or video), we wish to determine which of two classes it belongs to. In this case, the classes are “extremist” and “non-extremist”.

We attempt to do this following a two-step process. First, we train a model on a large number of labelled media files (i.e., the class of the media file is known); then, we test the model on media files it has never seen to validate its performance.
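The two-step process above relies on keeping the test files strictly separate from the training files. The following is a minimal sketch of such a hold-out split, not the pipeline we actually used; the function name `split_labelled_files`, the 20% test fraction and the toy file names are all assumptions made for illustration.

```python
import random


def split_labelled_files(labelled_files, test_fraction=0.2, seed=0):
    """Shuffle (file, label) pairs and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = list(labelled_files)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test_set = shuffled[:n_test]    # never shown to the model during training
    train_set = shuffled[n_test:]   # used to fit the model
    return train_set, test_set


# Hypothetical labelled data: label 1 = extremist, 0 = non-extremist.
labelled_files = [(f"file_{i}", i % 2) for i in range(10)]
train_set, test_set = split_labelled_files(labelled_files)
print(len(train_set), "training files,", len(test_set), "test files")
```

Because no file appears in both sets, performance measured on the test set reflects how the model behaves on media it has never seen.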

The success of the classifier depends on the choice of model. To be effective, the model must be able to learn the features of the input that have significant power to discriminate between extremist and non-extremist content.

Our model was designed to be able to learn multiple underlying signals in extremist propaganda. It then learns to combine all these signals in such a way as to achieve the high degree of performance reported. But how do we quantify the performance?

ROC curves, sensitivity and specificity

Let’s take a step back. The output of the algorithm is not a simple “yes” or “no”. The raw output is a continuous probability of belonging to the extremist class. This gives us flexibility to choose where to draw the line when it comes to attaching a label to a given input.

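For example, we may choose to say a video is extremist if the output probability is > 0.5, and non-extremist otherwise. Each choice of probability threshold (0.5 in this example) fixes the values of two important metrics at once: the true positive rate and the true negative rate.

[Figure: ROC curves, plotting true positive rate against true negative rate (both in %), for a random classifier, a ‘useful classifier’, a perfect classifier and our classifier (blue), with our chosen operating point marked by a red dot.]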

The true positive rate (or TPR, and also known as sensitivity or recall) is the fraction of extremist videos that are correctly identified as extremist. Similarly, the TNR or true negative rate (also known as specificity) is the fraction of non-extremist videos that are correctly identified as non-extremist. As we vary the probability threshold, we move along what is called an ROC curve (see figure above).
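To make the two definitions concrete, here is a small sketch that computes both rates at a given threshold. The function name `rates_at_threshold` and the toy probabilities and labels are invented for illustration and are not taken from our data; the sketch also assumes both classes are present, so neither denominator is zero.

```python
def rates_at_threshold(probs, labels, threshold=0.5):
    """Return (TPR, TNR) when files with probability > threshold are flagged."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        flagged = p > threshold
        if y == 1:          # truly extremist
            if flagged:
                tp += 1
            else:
                fn += 1
        else:               # truly non-extremist
            if flagged:
                fp += 1
            else:
                tn += 1
    tpr = tp / (tp + fn)    # fraction of extremist files correctly flagged
    tnr = tn / (tn + fp)    # fraction of non-extremist files correctly passed
    return tpr, tnr


# Toy example: label 1 = extremist, 0 = non-extremist.
probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
tpr, tnr = rates_at_threshold(probs, labels, threshold=0.5)
print(f"TPR = {tpr:.2f}, TNR = {tnr:.2f}")
```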

If we set the probability threshold high, then the model must be more confident that a given media file is extremist content for it to get classified as such. This means that the model is less likely to classify any media file, extremist or not, as extremist. Hence we expect the true negative rate to go up (fewer innocent videos will get incorrectly labelled as extremist), but the trade-off is that we also expect the true positive rate to go down (because fewer extremist videos will get correctly labelled as extremist).

On the other hand, reducing the threshold means more media files from both classes will get marked as extremist, and hence we expect the true negative rate to go down and the true positive rate to go up. We choose the threshold with this trade-off in mind, based on the needs of the user (e.g. very large volumes of videos might require an extremely high true negative rate).
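The trade-off can be seen by sweeping the threshold and recording both rates at each step, which traces out the points of an ROC curve. This is only an illustrative sketch: the function name `roc_points`, the toy data and the threshold grid are assumptions, not values from our classifier.

```python
def roc_points(probs, labels, thresholds):
    """Return a list of (threshold, TPR, TNR) tuples, one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in thresholds:
        tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p > t)
        tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p <= t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


# Toy example: label 1 = extremist, 0 = non-extremist.
probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
points = roc_points(probs, labels, thresholds=[0.1, 0.3, 0.5, 0.7, 0.9])
for t, tpr, tnr in points:
    print(f"threshold={t:.1f}  TPR={tpr:.2f}  TNR={tnr:.2f}")
```

Reading the output from top to bottom, the true negative rate never decreases and the true positive rate never increases as the threshold rises.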

Consider a random classifier, that is, one which assigns each video a probability drawn uniformly at random between 0 and 1. If we give a random classifier a threshold of 0.7, about 70% of all videos get classified as non-extremist regardless of their true class, resulting in a true positive rate of 30% and a true negative rate of 70%. The same argument holds at every threshold, so the two rates always sum to 100%. Consequently the ROC curve is simply a straight line joining the points (100, 0) and (0, 100), as shown in the figure above.
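A quick simulation illustrates the random-classifier case. It assumes the probabilities are drawn uniformly between 0 and 1 and uses an arbitrary 50/50 class balance; the sample size and seed are likewise chosen only for illustration.

```python
import random

rng = random.Random(0)
n_files = 100_000

# Arbitrary true labels (1 = extremist) and scores that ignore them entirely.
labels = [1 if rng.random() < 0.5 else 0 for _ in range(n_files)]
probs = [rng.random() for _ in range(n_files)]

threshold = 0.7
pos = [p for p, y in zip(probs, labels) if y == 1]
neg = [p for p, y in zip(probs, labels) if y == 0]

tpr = sum(1 for p in pos if p > threshold) / len(pos)
tnr = sum(1 for p in neg if p <= threshold) / len(neg)
print(f"TPR is roughly {tpr:.2f}, TNR is roughly {tnr:.2f}")
```

Up to sampling noise, the two rates come out near 0.3 and 0.7, and repeating the run with other thresholds keeps their sum near 1.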

A good classifier is one that can simultaneously achieve good true positive and true negative rates, and so the further to the top right the ROC curve lies, the better. We illustrate a model that is better than random (i.e. has more discriminatory power) with the grey solid line labelled ‘useful classifier’. Finally, the perfect classifier has an ROC curve that looks like a right angle with its corner pointing north-east.

The blue curve in the figure is the ROC curve for our classifier after generating predictions for about 100,000 media files (none of which were used for training). We were delighted with the overall performance of our model. We optimised for a high true negative rate for a number of reasons, one of which is that given the enormous volume of non-extremist content uploaded to internet platforms, even a small percentage of false alarms would quickly overwhelm and discredit the system. From our ROC curve, we chose the point at which TNR = 99.995% and TPR = 94%, as shown by the red dot in the figure.
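One simple way to pick an operating point for a required true negative rate is to take the lowest threshold that meets the target, since lower thresholds keep the true positive rate as high as possible. The sketch below shows that idea on toy data; it is not a description of how our threshold was actually selected, and the function name `threshold_for_tnr` and all numbers in it are hypothetical.

```python
def threshold_for_tnr(probs, labels, target_tnr):
    """Return (threshold, TPR, TNR) for the lowest threshold meeting target_tnr."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    # Candidate thresholds: each non-extremist score, from lowest to highest.
    for t in sorted(neg):
        tnr = sum(1 for p in neg if p <= t) / len(neg)
        if tnr >= target_tnr:
            tpr = sum(1 for p in pos if p > t) / len(pos)
            return t, tpr, tnr
    raise ValueError("target_tnr must be at most 1.0")


# Toy example: label 1 = extremist, 0 = non-extremist.
probs = [0.4, 0.8, 0.9, 0.1, 0.2, 0.3, 0.6]
labels = [1, 1, 1, 0, 0, 0, 0]
threshold, tpr, tnr = threshold_for_tnr(probs, labels, target_tnr=0.75)
print(f"threshold={threshold}, TPR={tpr:.2f}, TNR={tnr:.2f}")
```

Raising the target to 1.0 on the same toy data pushes the threshold up to the highest non-extremist score and lowers the true positive rate, which is the trade-off described above.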

Conclusion

With a TNR of 99.995%, only 0.005% of innocuous videos are flagged as extremist. For a site with 5 million daily uploads (the scale of the largest media hosting platforms in the world), almost all of them innocuous, that is an average of only 250 false alarms per day, while 94% of extremist videos are still caught.
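The figure of 250 follows directly from the true negative rate. The short calculation below reproduces it, under the simplifying assumption that essentially all 5 million daily uploads are non-extremist; the variable names are ours.

```python
daily_uploads = 5_000_000      # assumed to be almost entirely non-extremist
true_negative_rate = 0.99995   # TNR of 99.995%

# The false positive rate is the complement of the true negative rate.
false_positive_rate = 1 - true_negative_rate
false_alarms_per_day = round(daily_uploads * false_positive_rate)
print(false_alarms_per_day, "innocuous videos flagged per day")
```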

Such a low number of false positives could be manually screened by a single analyst. We consider this reduction a significant step in the fight against online extremism, and we will help technology platforms to leverage our tool in order to remove the vast majority of Daesh content online with only a minimal impact on their business.

 
