Toxic Comments Classification, and 'Non-toxic' Chat Application
1/ Project description:
I recently participated in a Kaggle competition on toxic comment classification, sponsored by the Conversation AI team, a research initiative founded by Jigsaw and Google (both part of Alphabet) that works on tools to help improve online conversation. Inspired by the idea of keeping the online environment productive, respectful, and free of profane, vulgar, or offensive language, I'd like to introduce a chat tool that is free of such toxic comments.
In addition, I'd like to present other models built in Python and the result that my team and I achieved in the Kaggle competition on toxic comment classification. (At the time of writing, our team had a prediction accuracy score of 0.9869, placing in the top 5%: 171st of 4,231 participants.)
2/ Non-toxic Chat application Introduction:
- Non-toxic Chat application link: https://huytquoc.shinyapps.io/NonToxicChat/
- User Guides:
i/ Modify your user ID, or keep the ID generated by the system, as displayed in the "Your User ID" text box in the right-hand panel.
ii/ Type your chat text into the text box under the chat log, and click the "Send" button when you're ready. The "Send" button may be blocked, however, if your text is detected as carrying a high risk of toxic content or inappropriate language.
iii/ Toxic Analysis Chart: While the user is typing, the pre-built machine learning model analyzes the text and predicts the probability of toxic content in 6 categories: "toxic", "severe toxic", "obscene", "threat", "insult", and "identity hate", as defined by the original Kaggle classification challenge. The application considers all of these categories.
iv/ Toxic text blockage: If a high risk is detected, the "Send" button is disabled until the chat text is modified so that it carries low or no toxic-content risk.
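The gating behaviour can be sketched as follows. This is a minimal illustration in Python; the actual application is implemented in R/Shiny, and the 0.5 threshold here is an assumed value, not necessarily the one used in the app.

```python
# Sketch of the "Send" button gating logic (illustrative only; the real
# app is written in R/Shiny, and the 0.5 threshold is an assumption).

TOXIC_CATEGORIES = [
    "toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate",
]

def send_allowed(probabilities, threshold=0.5):
    """Return True if the chat text may be sent.

    `probabilities` maps each toxic category to the model's predicted
    probability; sending is blocked if any category exceeds the threshold.
    """
    return all(probabilities.get(cat, 0.0) <= threshold
               for cat in TOXIC_CATEGORIES)

print(send_allowed({"toxic": 0.02, "insult": 0.01}))  # True: low risk, send enabled
print(send_allowed({"toxic": 0.91}))                  # False: high risk, send blocked
```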
3/ Keras in R - Neural Network classification model:
i/ Data Description:
The training data includes a "comment_text" field and 6 labels into which comments are classified: "toxic", "severe_toxic", "obscene", "threat", "insult", and "identity_hate".
Given the scope of this report, however, I'll focus on the "toxic" label and on a model that predicts whether a comment contains toxic content.
Prediction label "toxic":
As shown above, the "toxic" label has 2 prediction classes: "0" (negative/non-toxic) and "1" (positive/toxic).
The training data is highly imbalanced across these classes, which matters for the construction of the training data and for model selection later in the training process.
ii/ Model Selection:
In this project, I use the fastText model for text classification (more details about the model can be found at https://arxiv.org/abs/1607.01759). It is reported to be a simple and efficient model for text classification; more importantly, it achieves accuracy competitive with, and training times far better than, other popular models such as BoW, n-grams, n-grams with TF-IDF, char-CNN, and char-CRNN.
In addition, the reported benchmark shows that fastText can train on more than one billion words in less than ten minutes on a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.
iii/ Building Train and Test datasets:
Step 1: Cleaning up the texts:
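The original cleaning code is in R; an equivalent sketch in Python is shown below. The specific rules (lowercasing, stripping URLs, keeping letters only, collapsing whitespace) are assumptions about a typical clean-up pipeline, not the report's exact code.

```python
import re

def clean_text(text):
    """Basic comment clean-up: lowercase, drop URLs, keep letters and
    apostrophes only, collapse whitespace. (Illustrative rules only,
    not the report's exact cleaning code.)"""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # strip URLs
    text = re.sub(r"[^a-z']+", " ", text)      # keep letters and apostrophes
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace

print(clean_text("Check THIS out: https://example.com !!!"))  # "check this out"
```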
Step 2: Building Tokenizer:
The "text_for_tokenizing" corpus, cleaned during the previous step, is used to fit a tokenizer. This tokenizer serves as the baseline for all later text processing, including n-gram creation and the processing of new texts for classification.
In this exercise, I cap the vocabulary at 20,000 words.
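A tokenizer of this kind (the report uses Keras's text tokenizer in R) keeps only the most frequent words, here capped at 20,000, and maps each to an integer index. A minimal pure-Python sketch of that behaviour:

```python
from collections import Counter

MAX_WORDS = 20_000  # vocabulary cap used in the report

def fit_tokenizer(texts, max_words=MAX_WORDS):
    """Build a word -> integer index from the most frequent words.
    Index 0 is reserved for padding, as in Keras's tokenizer."""
    counts = Counter(word for t in texts for word in t.split())
    return {w: i + 1 for i, (w, _) in enumerate(counts.most_common(max_words))}

def texts_to_sequences(texts, word_index):
    """Convert texts to integer sequences, dropping out-of-vocabulary words."""
    return [[word_index[w] for w in t.split() if w in word_index]
            for t in texts]

word_index = fit_tokenizer(["the cat sat", "the dog sat"])
print(texts_to_sequences(["the cat"], word_index))
```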
Step 3: Building ngram matrix:
The tokenizer trained in the previous step is used to generate word-sequence vectors for the input comment texts.
Before word sequence vector transformation:
In this project I use unigrams (1-grams), so the next step converts the word-sequence vectors into a matrix in which each row represents a sentence as the sequence of its words' (in-dictionary) indices. The chart below shows that the average sentence length is about 30 words; nevertheless, I chose 400 as the maximum sentence length for building the matrix.
And, the word sequence matrix is created as follows:
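To form the matrix, each sequence is padded or truncated to the fixed length of 400. Below is a minimal sketch in Python of what Keras's `pad_sequences` does by default (zero-padding on the left, keeping the last tokens of over-long sentences); the original report uses the Keras function in R.

```python
MAXLEN = 400  # maximum sentence length chosen in the report

def pad_sequences(sequences, maxlen=MAXLEN):
    """Left-pad with zeros, or keep only the last `maxlen` tokens,
    mirroring the default behaviour of Keras's pad_sequences."""
    return [[0] * (maxlen - len(s)) + s[-maxlen:] for s in sequences]

matrix = pad_sequences([[5, 8, 2]], maxlen=6)
print(matrix)  # [[0, 0, 0, 5, 8, 2]]
```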
Step 4: Creating Train and Test datasets:
In this exercise, I use a ratio of 80:20 to split the dataset into Train and Test datasets.
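The 80:20 split can be sketched as a shuffled hold-out split; the random seed below is arbitrary and only an illustration of the idea.

```python
import random

def train_test_split(rows, labels, test_ratio=0.2, seed=42):
    """Shuffle the indices, then hold out the last `test_ratio` fraction
    as the test set (seed is arbitrary, for reproducibility)."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_ratio))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

x_tr, y_tr, x_te, y_te = train_test_split(list(range(100)), [0] * 100)
print(len(x_tr), len(x_te))  # 80 20
```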
iv/ Model Construction:
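The report builds the model with Keras in R; the sketch below is an equivalent fastText-style classifier in Python Keras. The embedding dimension (50) is an assumed value; the vocabulary cap (20,000) and sequence length (400) come from the steps above. The fastText idea is captured by averaging the word embeddings (global average pooling) before a sigmoid output for the binary "toxic" label.

```python
from tensorflow.keras import Input, Model, layers

MAX_WORDS, MAXLEN, EMBED_DIM = 20_000, 400, 50  # EMBED_DIM is an assumed value

inputs = Input(shape=(MAXLEN,), dtype="int32")
x = layers.Embedding(MAX_WORDS, EMBED_DIM)(inputs)  # learn word vectors
x = layers.GlobalAveragePooling1D()(x)              # fastText: average the word vectors
outputs = layers.Dense(1, activation="sigmoid")(x)  # probability of "toxic"

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```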
v/ Training - Validation chart:
The training and validation loss converge after 5 epochs.
The trained model is then used to predict on the test dataset, which was never seen during training, and to evaluate the model.
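The two metrics reported below, accuracy and balanced accuracy, can be computed from the predictions as follows; the 0.5 decision threshold is an assumption.

```python
def evaluate(y_true, y_prob, threshold=0.5):
    """Accuracy and balanced accuracy (mean of sensitivity and
    specificity), the two metrics reported for the imbalanced
    toxic/non-toxic test set. Threshold of 0.5 is an assumed value."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    sensitivity = sum(pos) / len(pos)      # recall on the toxic class
    specificity = neg.count(0) / len(neg)  # recall on the non-toxic class
    return accuracy, (sensitivity + specificity) / 2

acc, bal = evaluate([0, 0, 0, 1], [0.1, 0.2, 0.6, 0.9])
print(acc, bal)  # accuracy 0.75, balanced accuracy ~0.833
```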
Classification result on the Toxic class:
- The model performs very well, with an accuracy score of 0.9971.
- With a balanced accuracy score of 0.9890, the model performs very well on both the "negative" and "positive" classes, even though the dataset is highly imbalanced toward the negative class.
Annex A - Github repository:
- Toxic classification model: https://github.com/huytquoc/tx_classification_by_fastText
- Friendly Chat application: https://github.com/huytquoc/ShinyChat
Annex B - Models in Python:
As mentioned above, I've built classification models in Python (Jupyter notebooks) with Logistic Regression and LightGBM. I used random forests for feature selection and undersampling on the imbalanced training dataset before training.
The overall score on the submission dataset for competition evaluation is 0.9860 on all 6 toxic categories.
Github repository: https://github.com/huytquoc/Toxic_Comments_Classification
Annex C - Conclusion and Next Plan:
Regarding next steps, I'd like to continue improving the model and will report results on the following:
- Improve the training dataset, including more features with higher n-gram models
- Improve the prediction scores across the 6 categories: 'toxic', 'severe toxic', 'obscene', 'threat', 'insult', and 'identity hate'.
- Improve the prediction ability so that the model can recognize word context (sentiment analysis) and evaluate the toxic-content probability of whole sentences, not only of specific words.
Thank you for reading. For any comments or questions, please email [email protected].