RANDOM FOREST-BASED CLASSIFIER FOR AUTOMATIC SARCASM CLASSIFICATION ON TWITTER DATA USING MULTIPLE FEATURES

Authors

  • Christopher Eke, Department of Computer Science, Federal University of Lafia, Nasarawa State
  • Azah Anir Norman
  • Liyana Shuib
  • Fatokun Faith B.
  • Zalizah Awang Long

DOI:

https://doi.org/10.31436/jisdt.v4i2.345

Keywords:

Natural language processing, Sarcasm detection, Classification algorithm, Random Forest, GloVe embedding

Abstract

Sarcasm is a form of nonliteral language commonly used on social networks and microblogging websites to convey implicit information in a message. Because the intended meaning differs from the literal one, sarcasm can lead to the misclassification of tweets. This paper focuses on sarcasm detection in tweets using textual features. The textual features comprise the Neural language fusion and Natural language features, which include sentiment-related features, semantic and syntactic features, punctuation-related features, and GloVe embedding features. These features were extracted separately from the target tweet and then fused into a single feature set for that tweet. The proposed predictive model attained an accuracy of 86.9% with a random forest classifier, outperforming the other models employed in the experiment: decision tree (DT, 83.9%), support vector machine (SVM, 80.5%), k-nearest neighbors (KNN, 83.1%), and logistic regression (LR, 82.9%).
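
The extract-then-fuse step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny sentiment lexicon, the three-dimensional toy embedding table, and all function names are hypothetical stand-ins, and the semantic and syntactic feature groups are omitted for brevity. It only shows the general idea of computing each feature group separately and concatenating the results into one vector per tweet.

```python
import string

# Hypothetical toy resources (assumptions for illustration only); a real
# pipeline would load a sentiment lexicon and pretrained GloVe vectors.
POSITIVE = {"love", "great", "wonderful"}
NEGATIVE = {"hate", "terrible", "awful"}
TOY_EMBEDDINGS = {
    "love": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "hate": [-0.7, 0.3, 0.2],
}
EMBED_DIM = 3


def tokenize(tweet):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(string.punctuation) for tok in tweet.lower().split())
    return [tok for tok in tokens if tok]


def sentiment_features(tokens):
    """Counts of positive and negative lexicon words."""
    pos = sum(tok in POSITIVE for tok in tokens)
    neg = sum(tok in NEGATIVE for tok in tokens)
    return [float(pos), float(neg)]


def punctuation_features(tweet):
    """Counts of exclamation marks, question marks, and ellipses."""
    return [
        float(tweet.count("!")),
        float(tweet.count("?")),
        float(tweet.count("...")),
    ]


def embedding_features(tokens):
    """Average of the embedding vectors of known tokens (zeros if none)."""
    vectors = [TOY_EMBEDDINGS[tok] for tok in tokens if tok in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * EMBED_DIM
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def fuse_features(tweet):
    """Extract each feature group separately, then concatenate them."""
    tokens = tokenize(tweet)
    return (
        sentiment_features(tokens)
        + punctuation_features(tweet)
        + embedding_features(tokens)
    )
```

A fused vector built this way has a fixed length (eight values in this sketch), so vectors for many tweets can be stacked into a matrix and passed to a classifier such as scikit-learn's `RandomForestClassifier`.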

Published

2022-12-01

How to Cite

Eke, C., Norman, A. A., Shuib, L., Faith B., F., & Long, Z. A. (2022). RANDOM FOREST-BASED CLASSIFIER FOR AUTOMATIC SARCASM CLASSIFICATION ON TWITTER DATA USING MULTIPLE FEATURES. Journal of Information Systems and Digital Technologies, 4(2). https://doi.org/10.31436/jisdt.v4i2.345