A TECHNIQUE FOR PERCEIVING ABUSIVE BANGLA COMMENTS


Md Gulzar Hussain
Tamim Al Mahmud

Abstract

Most research on detecting abusive comments or text has been conducted in English, some of it aimed at detecting humiliating or insulting text, whereas only a few works address the Bangla language. Detecting abusive Bangla text will help prevent cyber crimes such as online blackmail, harassment, and cyberbullying, which are becoming a major concern in Bangladesh. In this paper, our goal is to detect abusive Bangla comments gathered from social sites where people share their views, feelings, and opinions. To classify whether a Bangla comment is abusive or not, we propose a root-level algorithm together with unigram string features to achieve better results. For this work, we collected several comments from the popular social media platform Facebook.
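As a rough illustration of what unigram string features can look like in practice, the sketch below splits each comment into single-word tokens and feeds the counts to a small multinomial Naive Bayes classifier. It is a minimal sketch under stated assumptions, not the root-level algorithm proposed in the paper: the classifier choice, the punctuation set, the names `unigrams` and `UnigramNaiveBayes`, and the transliterated toy comments (with `BADWORD` standing in for an abusive term) are all hypothetical.

```python
import math
import re
from collections import Counter

# Punctuation to strip, including the Bangla danda (an assumed set, not the paper's rule).
_PUNCT = re.compile("[।,.!?;:()\"'-]+")


def unigrams(comment):
    """Split a comment into unigram (single-word) tokens."""
    return _PUNCT.sub(" ", comment).split()


class UnigramNaiveBayes:
    """Multinomial Naive Bayes over unigram counts with add-one smoothing."""

    def fit(self, comments, labels):
        self.word_counts = {}             # label -> Counter of unigram frequencies
        self.doc_counts = Counter(labels)  # label -> number of training comments
        self.vocab = set()
        for comment, label in zip(comments, labels):
            tokens = unigrams(comment)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, comment):
        total_docs = sum(self.doc_counts.values())
        tokens = unigrams(comment)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each unigram.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data; BADWORD is a placeholder for an abusive term.
train_comments = [
    "tumi khub bhalo",
    "khela ta darun chilo",
    "tui ekta BADWORD",
    "BADWORD BADWORD chup kor",
]
train_labels = ["clean", "clean", "abusive", "abusive"]

model = UnigramNaiveBayes().fit(train_comments, train_labels)
print(model.predict("tui BADWORD"))
print(model.predict("khela bhalo"))
```

Unigram counts ignore word order, which keeps the feature space small and is often a reasonable starting point when the labeled corpus is modest; a real system would need a far larger annotated set of Bangla comments than this toy list.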


Area :
Articles
