Our Quest for Better Machine Learning Model Quality

In my previous blog post, I went over how quality and trust are major challenges in today’s machine learning ecosystem. If you haven’t had a chance to read it, I highly recommend that you do before jumping into this post. It sets the stage for understanding why we decided to build Snitch AI.

About two years ago, my colleague Olivier Blais was building machine learning models for various companies at Moov AI, our partner services company.

A problem that frequently occurred with customers was that, once presented with a complete model, they were hesitant to put it into production. Often, they did not understand the technology involved and were unwilling to trust it. It didn’t help that there were few ways to “prove” that the model worked as it should.

Sure, Olivier could show them the training data along with the accuracy and loss values and prove that the model had indeed learned from the training set. But those metrics only went so far.

After all, if the model had overfitted, how would we know for sure?
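To make that question concrete, here's a minimal sketch of the kind of check a data scientist might run by hand: comparing training accuracy against held-out accuracy. This is purely illustrative (a generic scikit-learn example, not Snitch AI's actual validation suite):

```python
# A generic overfitting check: compare training accuracy to held-out
# accuracy. A large gap is a classic symptom of overfitting.
# (Illustrative only -- not Snitch AI's actual analysis.)
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained decision tree will happily memorize the training set.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
gap = train_acc - test_acc
print(f"train={train_acc:.2f} test={test_acc:.2f} gap={gap:.2f}")

# A gap of more than a few points suggests the model learned noise,
# not signal -- exactly the failure mode that training accuracy alone hides.
```

A check like this is easy enough in isolation; the trouble is that it's only one of many checks a trustworthy model needs.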

Even worse, what if there was an issue with the initial data itself? If that were the case, the model would have learned under wrong assumptions. This could lead to disastrous outcomes. Olivier was not satisfied with this and wanted to do better.
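As a rough illustration of what catching those "wrong assumptions" early can look like (a generic pandas sketch with made-up data, not Snitch AI's actual checks), a few basic sanity checks can surface missing values, duplicate rows, and uninformative columns before training ever starts:

```python
# Basic data sanity checks on a toy dataset.
# (Generic illustration -- not Snitch AI's actual data validation.)
import pandas as pd

df = pd.DataFrame({
    "age": [25, 32, None, 32],   # contains a missing value
    "plan": ["a", "b", "b", "b"],
    "const": [1, 1, 1, 1],       # a column carrying no information
})

issues = {
    "missing_values": int(df.isna().sum().sum()),
    "duplicate_rows": int(df.duplicated().sum()),
    "constant_columns": [c for c in df.columns if df[c].nunique() <= 1],
}
print(issues)
```

Each finding here (a missing value, a duplicated row, a constant column) is a small crack in the foundation a model would otherwise be built on.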

Olivier’s mission: find how to prove machine learning model quality

He started digging around, looking for ways to measure and prove model quality. He found various techniques, frameworks, and scientific papers offering many different approaches.

Each approach validated one piece of the picture, bit by bit building the trust he needed to convince stakeholders to embrace AI. But these techniques were disjointed: a data scientist had to stitch the various tests together to get a complete picture.

Once stitched together, though, this battery of tests did wonders to ensure that the model was really doing what it was supposed to do. The topic became something of a passion project for Olivier, and he gave a talk at ODSC East in April 2020 explaining different techniques for ensuring model robustness.

His interest in this topic did not stop there. He began working with the Standards Council of Canada to establish an international standard for ensuring ML model quality and robustness. This would allow for the creation of more explainable, higher-quality AI.

The ISO SC42 committee now has a standard in development, under Olivier’s leadership, to address this very concern. While standards are a great way to ensure a minimum level of quality for ML models, they are only useful if the tools used to develop ML models can help assess and ensure that a model meets or exceeds them.

A solution for the community to build more robust and high-quality AI/ML

This is where Snitch AI comes in. In May 2020, we began developing a platform to help data science team managers assess the quality of the models their teams are building.

Our platform also gives data scientists the details they need to understand the assessment and remedy any issues, ensuring model robustness.

We’re quite aware that there isn’t a single way to develop ML models, so we try to be as framework-agnostic as possible, supporting scikit-learn, TensorFlow, and XGBoost, with PyTorch support on the horizon.

Our approach to AI model validation

We opted for an end-of-line approach, where data scientists only need to provide the model and the accompanying datasets to get an accurate assessment. There is no need to use our framework or libraries when training your model. This means less friction for data science teams, who can continue to work however they please. It also means that submitting a model to Snitch AI for assessment takes only a few moments.

Going forward, we’re working hard on adding new features to our quality and drift analyses to give teams even more insight into their models. We’re also working on making integration with existing MLOps pipelines much simpler by creating an API to allow automation as well as providing plug-ins for many popular CI tools.
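As a rough illustration of one common building block of drift analysis (a standard statistical technique, not necessarily the method Snitch AI uses), a two-sample Kolmogorov–Smirnov test can flag a feature whose production distribution has shifted away from what the model saw in training:

```python
# A generic drift check: compare a feature's training distribution to
# its production distribution with a two-sample Kolmogorov-Smirnov test.
# (Illustrative only -- not Snitch AI's actual drift analysis.)
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5000)
# Simulate production data whose mean has drifted upward.
prod_feature = rng.normal(loc=0.5, scale=1.0, size=5000)

stat, p_value = ks_2samp(train_feature, prod_feature)
drifted = p_value < 0.01  # reject "same distribution" at the 1% level
print(f"KS statistic={stat:.3f} p={p_value:.1e} drifted={drifted}")
```

Running a check like this per feature, on a schedule, is exactly the kind of chore that belongs in an automated pipeline rather than a notebook.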

Scientific validation of our scientific validation approach

2021 also marks the start of a research project in collaboration with the University of Montreal to test, validate, and improve our analysis techniques. We are closely following developments within the ISO SC42 committee and will integrate the new standards as they become available.

Our goal is to become the de facto tool in every data science team’s toolbox, helping them ensure that they are producing robust, high-quality models that will bring the expected value to stakeholders.

There’s still a lot of work to be done to ensure that AI is both trustworthy and robust, providing the expected outcomes. AI as a whole is only in its infancy and we’ve only seen the tip of the iceberg as to its potential.

One thing we can assure you: Snitch AI will stay at the forefront of developments in model quality assurance, making sure your AI is the best it can be without your entire team needing to become robustness experts.

After all, you don’t want to have to go through all the whitepapers Olivier went through to have a positive and durable impact on your models. 🙂

So, what are you waiting for? Get started now with a free trial and join the movement for building more robust and high-quality AI.

Christian Mérat

Christian is Snitch AI's Chief Operating Officer. He's really good at turning a product’s vision into reality. He can overcome any technical obstacle using his innate ability to leverage the right technology to meet business needs. In the last decade, he strongly contributed to the commercial success of Sharegate and Officevibe thanks to his strategic vision and cutting-edge technological expertise.
