Research project and paper that use a novel semi-supervised learning method to identify incorrect tags in large NLP datasets. Presented at the Conference on Computational Natural Language Learning (2020).