Toxic Comments Classification, and 'Non-toxic' Chat Application

Huy Tran
Posted on Mar 14, 2018

1/ Project description:

I recently took part in a Kaggle competition on toxic comment classification, sponsored by the Conversation AI team, a research initiative founded by Jigsaw and Google (both part of Alphabet) that is working on tools to help improve online conversation. Inspired by the idea of keeping the online environment productive, respectful, and free of profane, vulgar, or offensive language, I'd like to introduce a chat tool that keeps such toxic comments out.

For the toxic comment analysis in this project, I present a neural network classification model built in R with the Keras for R package and its text-processing functions.

In addition, I present other models built in Python and the results my team and I have achieved in the Kaggle Toxic Comment Classification competition. (At this point, our team has a prediction accuracy score of 0.9869, which places us in the top 5%: 171st among 4,231 participants.)

2/ Non-toxic Chat application Introduction:

i/ Modify your User ID if you wish; otherwise you will use the system-generated ID displayed in the "Your User ID" text box in the right-hand side panel.

ii/ Type your chat text into the text box under the Chat log, and click the "Send" button when you're ready. However, the "Send" button may be disabled if your chat text is detected as having a high risk of containing toxic content or inappropriate language.

  • Highlights:

i/ Toxic Analysis Chart: While the user is typing, the pre-built machine learning model analyzes the text and predicts the probability of toxic content in 6 different categories: "toxic", "severe toxic", "obscene", "threat", "insult", and "identity hate", as defined by the original Kaggle classification challenge.

In this application, the system will consider all those kinds of toxic comments.

ii/ Toxic text blockage: If a high risk is detected, the "Send" button is disabled until the chat text is modified so that it contains low-risk or no toxic content.
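The blocking rule can be sketched in a few lines of Python. This is only an illustration: the post does not state the cut-off used by the application, so `BLOCK_THRESHOLD` and the function name `should_block_send` are assumptions, and the rule "block when any category is high" is one plausible reading of "considering all those kinds of toxic comments".

```python
# Category names follow the Kaggle challenge labels.
CATEGORIES = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Assumption: the post does not state the cut-off the application uses.
BLOCK_THRESHOLD = 0.5


def should_block_send(probabilities, threshold=BLOCK_THRESHOLD):
    """Return True when any category probability reaches the threshold."""
    return any(probabilities.get(name, 0.0) >= threshold for name in CATEGORIES)


# Made-up model outputs for two chat messages.
low_risk = {name: 0.02 for name in CATEGORIES}
high_risk = dict(low_risk, insult=0.91)

print(should_block_send(low_risk))   # False
print(should_block_send(high_risk))  # True
```

In the chat application, a check like this would run each time the prediction is refreshed, enabling or disabling the "Send" button accordingly.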

3/ Keras in R - Neural Network classification model:

i/ Data Description:


The train data includes "comment_text" and 6 labels, including "toxic", "severe_toxic", "obscene", "threat", "insult", and "identity_hate", that the comments are classified into.

However, to limit the scope of this report, I focus on the "toxic" label and on a model that predicts whether a text comment contains toxic content.

Prediction label "toxic":

As shown above, the "toxic" label has 2 prediction classes - "0": negative/non-toxic, "1": positive/toxic.

The class distribution indicates that the training data is highly imbalanced across the prediction classes. This matters for constructing the training data and selecting the model later in the training process.
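A quick way to check the imbalance is to compute the share of each class. The sketch below uses made-up labels rather than the competition data, and `class_distribution` is a hypothetical helper name.

```python
from collections import Counter


def class_distribution(labels):
    """Return the fraction of rows that fall into each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: counts[cls] / total for cls in sorted(counts)}


# Made-up labels: 9 non-toxic comments ("0") and 1 toxic comment ("1").
toy_labels = [0] * 9 + [1]
dist = class_distribution(toy_labels)
print(dist)
```

When one class dominates like this, plain accuracy can look high even for a model that rarely predicts the minority class, which is why the imbalance has to be kept in mind during evaluation.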

ii/ Model Selection:

In this project, I use the fastText model for text classification (more details about the model can be found at ). It is reported to be a simple and efficient model for text classification and, more importantly, to perform considerably better in terms of accuracy and training time than other popular models, such as BoW, ngrams, ngrams TFIDF, char-CNN, and char-CRNN.

In addition, the reported benchmark shows that the fastText model can train on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.
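The core idea of a fastText-style classifier is to average the embeddings of the words in a text and pass that average to a linear classifier. The sketch below shows that forward pass in plain Python for a single binary label. It is not the model used in this post: the embeddings and weights are made-up numbers, and the single sigmoid output unit is an assumption suited to the binary "toxic" label.

```python
import math


def fasttext_style_score(token_ids, embeddings, weights, bias):
    """Average the token embeddings, then apply a logistic output unit."""
    dim = len(weights)
    if not token_ids:
        pooled = [0.0] * dim
    else:
        pooled = [
            sum(embeddings[t][d] for t in token_ids) / len(token_ids)
            for d in range(dim)
        ]
    logit = sum(p * w for p, w in zip(pooled, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Made-up 2-dimensional embeddings for a 3-word vocabulary.
toy_embeddings = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [2.0, 2.0]}
toy_weights = [1.5, -0.5]

score = fasttext_style_score([1, 2], toy_embeddings, toy_weights, bias=0.0)
print(score)
```

Because the model only averages and multiplies, training and prediction stay cheap even on large vocabularies, which is consistent with the speed figures quoted above.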

iii/ Building Train and Test datasets:

Step 1: Cleaning up the texts:


Before cleaning:

After cleaning:
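The cleaning rules themselves are not listed in this post, so the following is only an illustrative sketch of a typical clean-up pass. Every rule in it (lowercasing, dropping URLs, replacing non-letter characters, collapsing whitespace) is an assumption, and `clean_text` is a hypothetical function name.

```python
import re


def clean_text(text):
    """Apply a few common clean-up rules (all assumed, not taken from the post)."""
    text = text.lower()                         # normalize case
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"[^a-z\s]", " ", text)       # replace non-letters with spaces
    return re.sub(r"\s+", " ", text).strip()    # collapse repeated whitespace


cleaned = clean_text("Hello!!!  Visit http://example.com NOW... it's GREAT")
print(cleaned)
```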

Step 2: Building Tokenizer:

The "text_for_tokenizing" data, cleaned up during the previous text cleaning step, is used to build a tokenizer. This tokenizer serves as the baseline for all later text processing, including the n-gram creation process and the processing of new texts for classification.

In this exercise, I use a maximum of 20,000 words for text processing.
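Conceptually, building the tokenizer means counting word frequencies and giving the most frequent words an integer index. The plain-Python sketch below stands in for the Keras tokenizer used in the post; `build_word_index` is a hypothetical helper, and reserving index 0 for padding is an assumption borrowed from common practice.

```python
from collections import Counter

MAX_WORDS = 20000  # vocabulary cap stated in the post


def build_word_index(texts, max_words=MAX_WORDS):
    """Map the most frequent words to integer ids starting at 1 (0 is left for padding)."""
    counts = Counter(word for text in texts for word in text.split())
    most_common = counts.most_common(max_words)
    return {word: rank for rank, (word, _count) in enumerate(most_common, start=1)}


# Made-up cleaned texts; the cap is lowered to 3 so the effect is visible.
toy_texts = ["you are nice", "you are rude", "nice day"]
word_index = build_word_index(toy_texts, max_words=3)
print(word_index)
```

Words outside the cap ("rude" and "day" in this toy example) simply get no index, so they are ignored when texts are later converted to sequences.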

Step 3: Building ngram matrix:

The tokenizer trained in the previous step is then used to generate word sequence vectors for the input comment texts.


Before word sequence vector transformation:


In this project, I use 1-grams (unigrams), so the next step is to convert those word sequence vectors into a matrix in which each row represents a sentence as the sequence of dictionary indices of its words. As shown below, the average sentence length is about 30 words; however, I chose 400 as the maximum sentence length when building the matrix.
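The conversion from texts to a fixed-width matrix can be sketched as follows. This is a plain-Python stand-in, not the Keras code used in the post: the word index is made up, `texts_to_padded_matrix` is a hypothetical helper, and padding and truncating at the front of the sequence is an assumption that mirrors the default behaviour of Keras `pad_sequences`.

```python
MAX_LEN = 400  # maximum sentence length chosen in the post


def texts_to_padded_matrix(texts, word_index, max_len=MAX_LEN):
    """Convert texts to id sequences, then left-pad with 0 or truncate to max_len."""
    matrix = []
    for text in texts:
        ids = [word_index[w] for w in text.split() if w in word_index]
        ids = ids[-max_len:]  # keep only the last max_len tokens
        matrix.append([0] * (max_len - len(ids)) + ids)
    return matrix


# Made-up word index; max_len is lowered to 4 so the rows are easy to read.
toy_index = {"you": 1, "are": 2, "nice": 3}
toy_texts = ["you are nice", "nice you", "you you are are nice nice"]
matrix = texts_to_padded_matrix(toy_texts, toy_index, max_len=4)
for row in matrix:
    print(row)
```

With an average length of about 30 words, a width of 400 means most rows are mainly padding, but very few comments are cut short.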

And, the word sequence matrix is created as follows:

Step 4: Creating Train and Test datasets:

In this exercise, I split the dataset into Train and Test datasets at an 80:20 ratio.
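An 80:20 split can be sketched as a seeded shuffle followed by a cut. The post does not say whether the split was random or stratified by class, so the simple random split, the seed value, and the helper name `train_test_split` below are all assumptions.

```python
import random


def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle the rows reproducibly and hold out test_ratio of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Ten made-up row ids stand in for the comment matrix rows.
train_rows, test_rows = train_test_split(range(10))
print(len(train_rows), len(test_rows))  # 8 2
```

On a highly imbalanced dataset, a stratified split that keeps the same class proportions in both parts is a common alternative worth considering.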

iv/ Model Construction:

v/ Training - Validation chart:

The training and validation losses converge after 5 epochs.

vi/ Evaluation:

The trained model is used to predict on the test dataset, which it never saw during training, and the predictions are used to evaluate the model.

Classification result on the Toxic class:


-  The model performs very well, with an Accuracy score of 0.9971.

-  With a balanced Accuracy score of 0.9890, the model performs very well on both the "negative" and "positive" classes, even though the dataset is highly imbalanced, with the Negative class in the majority.
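For reference, the two scores are computed differently: accuracy is the share of all correct predictions, while balanced accuracy is the average of the recall on the positive class (sensitivity) and the recall on the negative class (specificity). The sketch below uses a made-up confusion matrix, not the results reported above.

```python
def accuracy_scores(tp, fp, tn, fn):
    """Return (accuracy, balanced accuracy) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall on the positive (toxic) class
    specificity = tn / (tn + fp)   # recall on the negative (non-toxic) class
    balanced = (sensitivity + specificity) / 2
    return accuracy, balanced


# Made-up counts for an imbalanced test set of 100 comments.
acc, bal = accuracy_scores(tp=8, fp=5, tn=85, fn=2)
print(acc, bal)
```

Because balanced accuracy weights both classes equally, it drops noticeably when the minority class is predicted poorly, even if overall accuracy stays high.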

Annex A - GitHub repository:

  1. Toxic classification model:
  2. Friendly Chat application:

Annex B - Models in Python:

As mentioned above, I've also built classification models in Python, in a Jupyter notebook, using Logistic Regression and LightGBM. For these, I used forests of trees for feature selection and undersampling of the imbalanced training dataset before training.
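Random undersampling can be sketched as follows: rows from the larger class are dropped at random until the classes are the same size. The post does not state the target class ratio, so the full 1:1 balance, the seed, and the helper name `undersample` are assumptions for illustration.

```python
import random


def undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class matches the smallest one."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(group) for group in by_class.values())

    rng = random.Random(seed)
    sampled = []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            sampled.append((row, label))
    rng.shuffle(sampled)  # avoid leaving the classes in separate blocks
    return sampled


# Made-up data: 8 non-toxic comments and 2 toxic ones.
toy_rows = [f"comment {i}" for i in range(10)]
toy_labels = [0] * 8 + [1] * 2
balanced = undersample(toy_rows, toy_labels)
print(len(balanced))  # 4
```

Undersampling should be applied to the training data only, so that the test data keeps the real class distribution.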

The overall score on the submission dataset for competition evaluation is 0.9860 on all 6 toxic categories.

GitHub repository:

Annex C - Conclusion and Next Plan:

As next steps, I'd like to continue improving the model and will report the results on the following:

  • Improve the training dataset by including more features from higher-order n-gram models.
  • Improve the prediction scores across the 6 categories: 'toxic', 'severe toxic', 'obscene', 'threat', 'insult', and 'identity hate'.
  • Improve the prediction ability so that the model recognizes word context (sentiment analysis) and can evaluate the toxic content probability of whole sentences, not only of specific words.

Thank you for reading. Please send any comments or questions to [email protected].
