Diagnostic of pathology on the vertebral column machine learning - Cluster K-nearest Neighbor (CKNN) part (I)

Aissa Boudjella 1, 2, 3, *, Sarah Arab 3, Manal Y. Boudjella 4, Sarah Khiter 3 and Bachir Bellebna 3

1 Bircham International University, Avda Sierra-2 (Urb. Guadamonte), Villanueva de la Cañada - Madrid 28691, Spain.
2 Bircham International University, 1221 Brickell Av., Suite 900 - Miami, Florida 33131, USA.
3 Etablissement Hospitalier Universitaire d'Oran, 1 Novembre 1954, Oran, Algeria.
4 University of Sciences and Technology of Oran Mohamed Boudiaf, USTO-MB, BP 1505 El M’naouar, 31000 Oran, Algeria.
 
Research Article
Global Journal of Engineering and Technology Advances, 2020, 05(03), 020-028.
Article DOI: 10.30574/gjeta.2020.5.3.0107
Publication history: 
Received on 25 November 2020; revised on 04 December 2020; accepted on 06 December 2020
 
Abstract: 
In this investigation, we developed a graphical user interface (GUI) application for diagnosing pathology of the vertebral column, based on the Cluster K-Nearest Neighbor (CKNN) classifier. The system is implemented and simulated in Anaconda, and its performance is tested on a real dataset with 6 features and two classes. The abnormal and normal classes contain 210 and 100 instances, respectively. Performance is compared across test sizes from 10% to 50% while the number of nearest neighbors k varies from 1 to 19. The results show that the accuracy depends on both independent parameters, the test size and k. Training accuracy, in the range 82.5% to 100%, is higher than test accuracy, in the range 70% to 84%. When k varies from 1 to 4, training accuracy exceeds 90%, while test accuracy remains lower, in the range 74% to 82.5%. Increasing the test size and/or k does not significantly affect the accuracy. When k is larger than 1, the training accuracy is approximately 0.925±0.05 and the test accuracy (except for k=6 and k=17) is about 0.79±0.05. The prediction of the class label may be optimized by tuning the dataset size together with the k-neighbors parameter. The GUI can help medical doctors diagnose patients effectively, reach a decision rapidly, and obtain predictions in less time.
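The experiment described above, a sweep over test sizes and k with training and test accuracy recorded for each pair, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not describe the clustering step of CKNN, so the sketch uses a plain k-nearest-neighbor majority vote; the data are synthetic Gaussian samples that only reproduce the class sizes (210 abnormal, 100 normal) and the feature count (6); and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def make_synthetic(n_abnormal=210, n_normal=100, n_features=6, seed=0):
    """Synthetic stand-in for the dataset: two Gaussian blobs with the
    class sizes and feature count quoted in the abstract (assumption)."""
    rng = random.Random(seed)
    data = []
    for label, n, center in ((1, n_abnormal, 1.0), (0, n_normal, -1.0)):
        for _ in range(n):
            features = [rng.gauss(center, 1.5) for _ in range(n_features)]
            data.append((features, label))
    rng.shuffle(data)
    return data


def split(data, test_size):
    """Hold out the last `test_size` fraction of the already shuffled data."""
    n_test = max(1, round(len(data) * test_size))
    return data[:-n_test], data[-n_test:]


def predict(train, x, k):
    """Plain KNN: majority vote among the k training points closest to x
    (Euclidean distance). Ties go to the label encountered first."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, samples, k):
    """Fraction of `samples` whose predicted label matches the true label."""
    hits = sum(predict(train, x, k) == y for x, y in samples)
    return hits / len(samples)


def sweep(data, test_sizes=(0.1, 0.2, 0.3, 0.4, 0.5), ks=range(1, 20)):
    """Map (test_size, k) to (training accuracy, test accuracy)."""
    results = {}
    for ts in test_sizes:
        train, test = split(data, ts)
        for k in ks:
            results[(ts, k)] = (accuracy(train, train, k),
                                accuracy(train, test, k))
    return results
```

Calling `sweep(make_synthetic())` returns one accuracy pair per combination of test size and k. With k=1 each training point is its own nearest neighbor, so the training accuracy is 1.0, which is consistent with the upper end of the training range reported above. The numbers obtained on synthetic data are not comparable to the paper's results.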
 
Keywords: 
Vertebral column; Accuracy; Test Size; Machine learning; Cluster K-Nearest Neighbor classifier.
 