Chris Swart has 6 years of experience delivering Natural Language Processing (NLP) services across the email, complaint, pharma, and sales industries. He is interested in cost-effective dataset creation with distant supervision and in building semi-supervised datasets to get the best bang for the buck from models. He cofounded Comtura on a mission to help sales teams weaponise their customers' voice to sell more. At Comtura he leads the machine learning team.
Weak supervision helps reduce your annotation costs by programmatically labelling unlabelled data. I will talk about Comtura’s experiences with two open-source weak supervision libraries, Snorkel and Weasel. At Comtura, I work on modelling the sales process to predict relevant information from sales conversations and understand deals better.
I will walk you through our journey of going from manual annotation to weak supervision and how it has helped improve our annotation efficiency.
Manual annotation usually labels only a single document at a time, whereas with weak supervision a single labelling function that generates weak labels can label 2-3% of your samples, potentially widening the reach of your efforts. Your coverage and your confidence in your labelling functions can increase at a much faster pace than with traditional manual annotation. Labelling functions also allow you to encapsulate the knowledge of domain experts.
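To make the idea of a labelling function concrete, here is a minimal sketch in plain Python. It does not use the Snorkel or Weasel APIs. The label names, the keyword list, the `lf_mentions_price` and `coverage` helpers, and the example sentences are all hypothetical and chosen only for illustration. A labelling function either votes for a label or abstains, and coverage is the fraction of samples on which it votes.

```python
from typing import Callable, Sequence

# Hypothetical label values for this sketch: a labelling function either
# assigns a label or abstains when its rule does not apply.
ABSTAIN = -1
PRICING = 1


def lf_mentions_price(text: str) -> int:
    """Weakly label a sentence as PRICING if it contains a pricing keyword."""
    keywords = ("price", "pricing", "discount", "quote")  # illustrative only
    lowered = text.lower()
    return PRICING if any(k in lowered for k in keywords) else ABSTAIN


def coverage(lf: Callable[[str], int], docs: Sequence[str]) -> float:
    """Fraction of documents for which the labelling function does not abstain."""
    if not docs:
        return 0.0
    labelled = sum(1 for doc in docs if lf(doc) != ABSTAIN)
    return labelled / len(docs)


# Made-up example sentences, not real sales data.
docs = [
    "Can you send over a quote for 50 seats?",
    "Let's schedule the next call for Tuesday.",
    "We need a discount to get this past procurement.",
    "Who else is involved in the decision?",
]

weak_labels = [lf_mentions_price(doc) for doc in docs]
print(weak_labels)
print(coverage(lf_mentions_price, docs))
```

One rule written once is applied to every sample, which is where the wider reach comes from. On a real corpus the coverage of a single function would typically be far lower than in this toy example, and a library such as Snorkel would combine the votes of many such functions into a single weak label.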