A Data Scientist’s Top 3 Tips About Training Data

May 17, 2023

Posted by Mighty AI on May 17, 2023 in Computer Vision, Machine Learning & AI, Natural Language Processing

People frequently tell us they learned a lot listening to Mighty AI Principal Data Scientist Angie Hugeback’s interview with Sam Charrington on his “This Week in Machine Learning & AI” podcast.

We’re not surprised. The AI industry continues to shift, but a conversation grounded in data science and statistics principles remains relevant.

So we’re opening the vault to resurface Angie’s secrets about training data for computer vision and natural language models. Here’s a sneak peek:

Generating high-quality, accurate training datasets is job number one in scaling AI models. And anyone who has experience in the field knows that your model is only as good as your training data (and the humans training it).
Your training data represents only a subset of the space in which you expect your model to function. You need good coverage, so make sure the data represents your expected inputs as well as possible if you want the most accurate results.
Quality in computer vision and natural language models is essential, but believe it or not, you can get away with lower-quality training data in certain situations. Wanna know what they are? Then skip on over to the full episode. (While you’re at it, treat your ears to several informative interviews about AI and machine learning. We guarantee you’ll learn a ton!)

Hungry for more? Check back next week to get a list of our favorite podcasts in this space.

Note: Before January 10, 2017, Mighty AI was known as Spare5. We haven’t edited this podcast episode to reflect our new name.

image credit: Corey Blaz via Unsplash

Tags: computer vision, data science, machine learning, natural language, training data
