Multilingual Hate Speech Detection @ WISoC Laboratory

Worked under Prof. Yashvardhan Sharma and developed machine learning models using transformers, stacked embeddings, and word vectors to detect hate speech in tweets. We also used back-translation as a data augmentation technique to obtain more training data. Models considered:

  • XLM-RoBERTa
  • FAIR Stacked Embeddings
  • ULMFiT
  • FastText Word Embeddings
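Back-translation, mentioned above, can be sketched roughly as follows. This is a minimal illustration and not the project's actual pipeline: the `translate` callable, the `src`/`pivot` language codes, and the toy lookup-table translator are all assumptions made for the example. In practice `translate` would wrap whichever machine translation model or service is available.

```python
from typing import Callable, List, Tuple

# Hypothetical translator signature: (text, source_lang, target_lang) -> text
Translator = Callable[[str, str, str], str]


def back_translate(
    examples: List[Tuple[str, int]],
    translate: Translator,
    src: str,
    pivot: str,
) -> List[Tuple[str, int]]:
    """Return new (text, label) pairs produced by a src -> pivot -> src round trip.

    The label is carried over unchanged, on the assumption that the
    round trip preserves the meaning of the text.
    """
    seen = {text for text, _ in examples}
    augmented = []
    for text, label in examples:
        pivoted = translate(text, src, pivot)
        round_trip = translate(pivoted, pivot, src)
        # Keep only paraphrases that differ from every text seen so far.
        if round_trip and round_trip not in seen:
            augmented.append((round_trip, label))
            seen.add(round_trip)
    return augmented


def toy_translate(text: str, src: str, tgt: str) -> str:
    """Stand-in translator backed by a tiny lookup table (illustration only)."""
    table = {
        ("en", "fr"): {"this is awful": "c'est affreux"},
        ("fr", "en"): {"c'est affreux": "it is awful"},
    }
    # Unknown text is returned unchanged.
    return table.get((src, tgt), {}).get(text, text)


data = [("this is awful", 1), ("hello", 0)]
extra = back_translate(data, toy_translate, src="en", pivot="fr")
print(extra)
```

Only the first example yields a new paraphrase here; the second survives the round trip unchanged and is therefore skipped rather than duplicated.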

Results

Achieved a weighted F1 score of 0.90 for coarse-grained hostility detection and 0.54 for fine-grained hostility identification.
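For readers unfamiliar with the metric, a weighted F1 score is the mean of the per-class F1 scores, with each class weighted by its number of true instances. The sketch below shows the computation for a single-label task. The labels and predictions are invented for illustration and are not the project's data, and the fine-grained setting may have been scored differently (for example, per label in a multi-label setup).

```python
from collections import Counter
from typing import Hashable, Sequence


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Support-weighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("y_true must not be empty")

    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_true / total) * f1
    return score


# Invented toy labels, for illustration only.
y_true = ["hostile", "hostile", "non-hostile", "non-hostile", "non-hostile"]
y_pred = ["hostile", "non-hostile", "non-hostile", "non-hostile", "hostile"]
print(round(weighted_f1(y_true, y_pred), 2))  # 0.6
```

In this toy case the "hostile" class has F1 = 0.5 (support 2) and the "non-hostile" class has F1 = 2/3 (support 3), giving 0.4 × 0.5 + 0.6 × 2/3 = 0.6.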

The paper was published at CONSTRAINT 2021 and presented at AAAI 2021.

Click here for more information on the project and a link to the paper.