How Does Mislabeled Training Data Affect ML System Performance?

iMerit is a remarkable company of more than 4,000 people that specializes in annotating the data needed to train machine learning systems.

I am writing a series of blog posts for them on various aspects of machine learning. In my latest post I explain how inaccuracies in training data labels (‘label noise’) affect ML system performance. It turns out that what matters is not so much how many errors there are as how those errors are structured.
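To make that last point concrete, here is a minimal Python sketch. It is not taken from the iMerit post: the three-class setup, the 20% noise rate, and the helper names (`symmetric_noise`, `structured_noise`, `majority_label_per_class`) are all assumptions chosen for illustration. It corrupts the same number of labels in two ways, once spread evenly across classes and once concentrated on a single class, and then checks which label a learner would most often see for each true class.

```python
import random
from collections import Counter

CLASSES = [0, 1, 2]  # hypothetical three-class problem


def symmetric_noise(labels, rate, rng):
    """Flip a fixed fraction of labels, chosen uniformly, to a random other class."""
    noisy = list(labels)
    k = round(rate * len(labels))
    for i in rng.sample(range(len(labels)), k):
        noisy[i] = rng.choice([c for c in CLASSES if c != labels[i]])
    return noisy


def structured_noise(labels, rate, rng, source=0, target=1):
    """Flip the same number of labels, but only from `source` to `target`."""
    noisy = list(labels)
    k = round(rate * len(labels))
    source_idx = [i for i, y in enumerate(labels) if y == source]
    for i in rng.sample(source_idx, min(k, len(source_idx))):
        noisy[i] = target
    return noisy


def error_rate(clean, noisy):
    """Fraction of labels that differ from the clean labels."""
    return sum(c != n for c, n in zip(clean, noisy)) / len(clean)


def majority_label_per_class(clean, noisy):
    """For each true class, the label that appears most often after corruption."""
    result = {}
    for c in CLASSES:
        counts = Counter(n for y, n in zip(clean, noisy) if y == c)
        result[c] = counts.most_common(1)[0][0]
    return result


rng = random.Random(0)
clean = [c for c in CLASSES for _ in range(1000)]  # 1000 examples per class

sym = symmetric_noise(clean, 0.2, rng)
struct = structured_noise(clean, 0.2, rng)

print("symmetric :", error_rate(clean, sym), majority_label_per_class(clean, sym))
print("structured:", error_rate(clean, struct), majority_label_per_class(clean, struct))
```

Both corrupted label sets have the same overall error rate by construction. In the symmetric case the correct label still dominates within every class, so a learner can average the noise away. In the structured case most examples of class 0 now carry the label 1, so a learner that fits these labels would be pushed toward systematically confusing the two classes.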

Author: Tom Robertson

Tom Robertson, Ph.D., is an organizational and engineering consultant specializing in harmonizing human and artificial intelligence. He has been an AI researcher, an aerospace executive, and a consultant in Organizational Development. An international speaker and teacher, he has presented in a dozen countries and has served as visiting faculty at École des Mines d’Alès in France and Portland State University.
