Harmful Content Solutions

Powered by a high-quality dataset of harmful speech samples, crafted in line with evolving E.U. standards on AI and online harm, CaliberAI's tools augment human capability to detect language that carries a high risk of being harmful.

Harmful content

We define harmful content as any language that attacks, abuses or discriminates against a person or a group on the basis of an identity attribute (e.g. sexism, LGBTQI+ discrimination, racism, religious discrimination, ableism), in line with evolving international best standards (e.g. U.N., E.U.). We also track threats of violence and non-identity-based abuse (e.g. non-identity-based forms of online bullying).

Test potentially harmful content

This is an early release of CaliberAI, and you may notice false positives or false negatives as we refine our predictive models.
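As a minimal sketch of how a classification result like the one this demo produces might be consumed downstream: the JSON shape, field names (`harmful_probability`, `categories`), and review threshold below are illustrative assumptions, not CaliberAI's documented API schema.

```python
import json

# Hypothetical response shape -- field names are assumptions for illustration,
# not CaliberAI's actual API schema.
SAMPLE_RESPONSE = json.dumps({
    "text": "example input",
    "harmful_probability": 0.87,
    "categories": ["identity-based abuse"],
})

REVIEW_THRESHOLD = 0.7  # assumed cut-off; tune to your editorial policy


def needs_human_review(response_json: str,
                       threshold: float = REVIEW_THRESHOLD) -> bool:
    """Flag text for editorial review when the model's estimated
    probability of harm meets or exceeds the threshold."""
    result = json.loads(response_json)
    return result["harmful_probability"] >= threshold


print(needs_human_review(SAMPLE_RESPONSE))  # True: 0.87 >= 0.7
```

Thresholding like this keeps a human in the loop: the model routes high-risk language to an editor rather than making publication decisions on its own.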

What makes us different

Unique data

Unique, carefully crafted datasets train multiple machine learning models for production deployment.

Expert led

Expert annotation, overseen by a diverse, publisher-led team with deep expertise in news, law, linguistics and computer science.

Explainable outputs

Pre-processing and post-processing with eXplainable AI outputs.

Contact sales

Get a closer look at how our solutions work and learn how CaliberAI's technology can integrate with your technology stack and editorial workflow.

Get in touch with sales@caliberai.net