In today’s episode, we welcome Manu Sharma and Brian Rieger from Labelbox, a private company that we believe offers the leading training data solution for machine learning. We have had many conversations on this show about artificial intelligence from a hardware and algorithm perspective, but data is just as important. Most production AI systems are based on supervised learning, which requires large quantities of data to be labeled so that the algorithms can understand and compartmentalize it. In other words, data without labels can’t be used by most AI algorithms.
While large internet companies like Google and Facebook have built custom tools in-house to help label and sort through their large troves of data, most enterprises have very few options. Labelbox aims to fill this gap by providing a scalable and easy-to-use tool to help companies convert their raw data into labeled data fit for machine learning algorithms. Today on the show, Manu and Brian get into the history of Labelbox, as well as the services it provides to its clients and the machine learning community. We talk about the tiers and iterations of Labelbox, its pricing structures, the various industries it supports, and what makes it stand out against its competition. We also cover some fascinating ground around human-in-the-loop systems, how a machine learning startup would train its AI, and the difference between software 1.0 and 2.0. In our conversation, we also speak about Labelbox in relation to computer vision, drone technology, and labor ethics. Join us to get a taste of the many ways data and AI will continue to penetrate life and industry well into the foreseeable future.
Key Points From This Episode:
How Manu and Brian became friends through building an optimization system for airfoils.
Manu’s experience working with data insights where he realized the need for data labeling.
The connection between the rise of different machine learning algorithms and data labeling.
Three labeling problems Labelbox solves in areas of tools, distribution, and management.
Labelbox’s two formats: on-premises and cloud-based.
Labelbox’s tiered pricing structures, which correspond to the decisions it makes.
Different industries that utilize computer vision which benefit from Labelbox.
How Labelbox helped KeepTruckin build a dashcam data capturing system.
How Labelbox believes it can take accuracy from 90% to 100%.
Key differentiators of Labelbox regarding software, human support, and data management.
The story of Expensify and how an AI-powered app trains its AI using humans and data.
Data processing as the key differentiator between software 1.0 and 2.0.
Whether the rise of transfer learning is detrimental to Labelbox as a business.
The relevance of Labelbox and machine learning to modern drone technology and uses.
Ethical considerations around human data labeling work conditions.
The plus side of human data labeling: skills development and accessibility.
Some of the third-world, English-speaking regions where human data labeling is burgeoning.
Situations where human data labeling falls under a mixture of experts and outsourced labor.
“We’re seeing a massive adoption of deep learning technology across every industry where cameras or the human eye are involved in making decisions.” — @Riegerb
“Labelbox is the only software platform that a customer would ever need in order to build, create, and manage the training of datasets for operating a machine learning pipeline.” @manuaero
“What’s really fascinating is that in software 2.0, the way the machines learn is through a form of labeled data, and these labels are essentially decisions.” @manuaero