Arabic text classification using deep feature and bidirectional long-short-term memory

Azal Minshed Abid *

Department of Computer Science, College of Education, Mustansiriyah University, Iraq.
 
Research Article
Global Journal of Engineering and Technology Advances, 2022, 13(01), 098–107.
Article DOI: 10.30574/gjeta.2022.13.1.0179
Publication history: 
Received on 16 September 2022; revised on 22 October 2022; accepted on 25 October 2022
 
Abstract: 
Due to the increased demand for automatic document organization, text classification is essential on both academic and commercial platforms. The aim of text classification is to automatically assign text documents to one or more predefined categories, which helps to solve a variety of challenges, many of them related to data management. In this paper, we propose a new model for Arabic text classification. The model consists of two main phases. The first phase extracts three sets of features: statistical features, Latent Semantic Analysis (LSA) features, and a combination of both. The second phase feeds each feature set separately into a Bidirectional Long Short-Term Memory (BI-LSTM) network for classification. The performance of the proposed model was evaluated on the CNN Arabic corpus. The experimental results showed solid performance, especially for the combined features, where the average precision, recall, and F-measure reached 94, 91, and 91.94, respectively.
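
The first phase can be illustrated with a minimal sketch. The abstract does not define the statistical features, so TF-IDF weighting is used here purely as an assumption, and LSA is computed as a truncated SVD of the weighted document-term matrix. The function names, the toy documents, and the choice of k = 2 are hypothetical and are not taken from the paper. The BI-LSTM classifier of the second phase is not shown.

```python
import numpy as np

def term_document_matrix(docs):
    """Build a (documents x vocabulary) count matrix from whitespace tokens."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    index = {tok: j for j, tok in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for tok in doc.split():
            counts[i, index[tok]] += 1
    return counts, vocab

def tfidf(counts):
    """Assumed 'statistical' feature: TF-IDF with a smoothed IDF term."""
    n_docs = counts.shape[0]
    doc_freq = (counts > 0).sum(axis=0)
    idf = np.log((1 + n_docs) / (1 + doc_freq)) + 1.0
    tf = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)
    return tf * idf

def lsa(weighted, k):
    """LSA feature: document coordinates in the top-k singular directions."""
    u, s, _ = np.linalg.svd(weighted, full_matrices=False)
    return u[:, :k] * s[:k]

# Toy documents (hypothetical, not from the CNN Arabic corpus).
docs = [
    "الاقتصاد ينمو بسرعة",
    "الفريق يفوز بالمباراة",
    "الاقتصاد والاسواق المالية",
    "المباراة تنتهي بالتعادل",
]

counts, vocab = term_document_matrix(docs)
stat_features = tfidf(counts)                        # feature set 1
lsa_features = lsa(stat_features, k=2)               # feature set 2
combined = np.hstack([stat_features, lsa_features])  # feature set 3
```

Each of the three matrices has one row per document, so under these assumptions any of them could be passed to a sequence classifier as the input representation.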
 
Keywords: 
BI-LSTM; Text classification; CNN Arabic corpus; LSA; SVD
 