CLASSIFICATION OF ARRHYTHMIA DISEASES BY THE CONVOLUTIONAL NEURAL NETWORK METHOD BASED ON ECG IMAGES

ABSTRACT


INTRODUCTION
Arrhythmia is a heart disorder characterized by an abnormal heart rate or rhythm: too fast (tachycardia), too slow (bradycardia), or irregular [1]. Of the various types of arrhythmia, the most common is atrial fibrillation [2], in which the heart beats irregularly and rapidly, increasing the sufferer's risk of stroke and heart failure. Rohmantri and Surantha describe arrhythmia as a life-threatening heart disorder [3]. One way to identify and diagnose this disorder as early as possible is to use an electrocardiogram (ECG) [4].
An electrocardiogram is used to check the heart's condition and to assess the effectiveness of cardiac treatment by measuring and recording the heart's electrical activity [5]. The device that records and detects the heart's electrical impulses is called an electrocardiograph. Klein, as cited by Rohmantri and Surantha, describes an ECG as a diagram produced by an electrocardiograph sensor that records electrical impulses in the heart [6]. The sensor output takes the form of graphs or signals that can be used to diagnose arrhythmias. In addition to ECG signals, a phonocardiogram (PCG) can be used to diagnose heart disease [7], [8].
An ECG signal for a single heartbeat consists of a P wave, a QRS complex, and a T wave, whose peaks are labeled P, Q, R, S, and T [9], [10]. However, reading ECG recordings manually takes a long time, and the shortage of experts makes manual classification difficult. An automated classification method was therefore used to overcome this problem.
ECG signal classification methods such as Support Vector Machine (SVM) [11], [12], Neural Network (NN) [13], and Deep Learning Convolutional Neural Network (CNN) are widely proposed in arrhythmia classification research [14]. Each method has advantages and disadvantages. Wibawa et al. stated that the SVM method can solve the over-fitting problem, although its optimal parameters are difficult to choose [15]. The NN method tolerates noisy data well, but training is slow and the model is difficult to interpret. Finally, the CNN method automatically learns filters that extract specific and relevant features from the input data, although model training can take a long time. This is consistent with Bajaj and Kumar's statement that CNN has the advantage of combining feature extraction and classification [16]. The feature extraction part automatically extracts features from the ECG signal images, while the classification part uses the extracted features to classify the signals and is expected to produce reasonably good accuracy [17].
In previous research [18], Fansyuri implemented a 1-dimensional CNN to classify heart disease classes from ECG signal data in five scenarios. The study used the PTB database as its primary dataset and, to add classes, the BIDMC Congestive Heart Failures, MIT-BIH Normal Sinus Arrhythmia, and China Physiological Signal Challenge 2018 datasets. The fifth trial scenario gave the best results, with accuracy, sensitivity, specificity, precision, F1-score, and error of 100.00%, 99.98%, 100.00%, 100.00%, and 00.00%, respectively. Other research with the CNN method has classified image-based data using a 2D CNN [19], [20].
Based on the description of the problem above, this study aimed to determine the performance of the Convolutional Neural Network method in classifying arrhythmia disease based on ECG signal images. The CNN method was chosen because the input consists of ECG signal images and because, according to previous research, CNN can combine feature extraction and classification with reasonably good accuracy. The model and the GUI application were developed in Python. This research differs from previous research in the type of CNN model used, namely 2D CNN, and in the number of classes, namely 17 (including normal sinus rhythm and pacemaker rhythm), based on ECG images from the MIT-BIH Arrhythmia Database.

RESEARCH METHODS
This research examines the performance of a CNN as a method for arrhythmia classification. The data are ECG signals from the MIT-BIH Arrhythmia Database, obtained through the Mendeley Data website [21]. The processed data comprised 1000 ECG image fragments in 17 classes: Normal Sinus Rhythm (NSR), Pacemaker Rhythm (PR), and 15 types of arrhythmic heart disorders. The steps taken to classify arrhythmia diseases are as follows.
a. Performing data pre-processing. Images (.JPG) converted from numerical data (.MAT) using Matlab were split into training data (75%) and testing data (25%) and then uploaded to Google Drive so that they could be accessed from Google Colab. The images assigned to the testing data were selected randomly, taking into account the number of ECG signal images for each patient within the same class. Both datasets then went through several further pre-processing stages: gray-scale conversion, resizing, cropping, augmentation, and image data labeling.
b. Establishing a Convolutional Neural Network architecture to classify disease classes based on images. Each CNN layer arranges its neurons in three dimensions: width, height, and depth. The size of a layer's output depends on the filtering result of the previous layer and the number of filters used. In general, CNN layers are divided into two types, namely:

b.1. Feature Extraction Layer
This layer receives the image directly as input and processes it with convolutional layers, max-pooling layers, and the ReLU (Rectified Linear Unit) activation function. The ReLU function is defined as f(x) = max(0, x); max pooling, in turn, keeps the largest value within each pooling window.
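As an illustration only, ReLU and max pooling can be sketched in plain Python on a toy feature map. The 2 x 2 window with stride 2 and the numbers in `conv_output` are assumptions made for the example, not settings or data reported in this study.

```python
def relu(x):
    """ReLU activation: f(x) = max(0, x)."""
    return max(0.0, x)


def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2 on a list-of-lists feature map.

    Assumes even height and width; the window size and stride are
    illustrative choices, not values taken from the paper.
    """
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            window = [
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ]
            row.append(max(window))  # keep the largest value in the window
        pooled.append(row)
    return pooled


# Toy 4 x 4 convolution output (made-up numbers).
conv_output = [
    [-1.0, 2.0, 0.5, -3.0],
    [4.0, -2.0, 1.5, 0.0],
    [0.0, -1.0, -0.5, -2.0],
    [3.0, 1.0, -4.0, 2.5],
]
activated = [[relu(v) for v in row] for row in conv_output]
pooled = max_pool_2x2(activated)
print(pooled)  # [[4.0, 1.5], [3.0, 2.5]]
```

ReLU sets every negative response to zero, and pooling then halves each spatial dimension while keeping the strongest response in each window.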

b.2. Classification Layer
This type of layer is composed of neurons that are fully connected to the neurons of the preceding layer. Its output is transformed by a Sigmoid or Softmax activation function into class scores that are used for classification.
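A minimal sketch of the Softmax function used in the classification layer is given below. The raw scores in `logits` are made-up values for 17 classes and do not come from the trained model.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for 17 classes (illustrative numbers only).
logits = [0.1] * 17
logits[3] = 4.0

probs = softmax(logits)
predicted_class = probs.index(max(probs))  # index of the most probable class
print(predicted_class, round(sum(probs), 6))
```

The class with the largest raw score also receives the largest probability, so the predicted label is the index of the maximum.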
c. Compiling the model with a learning rate of 0.001.
d. Conducting model training to determine the accuracy of the CNN architecture that has been created. Training ran for 80 epochs with a batch size of 20 images. In each epoch, the training data are drawn at random in batches of 20 images until all training samples have been used.
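The batch arithmetic implied by these settings can be checked with a short calculation. It assumes 750 training images (75% of the 1000 fragments), that augmentation does not change the number of samples drawn per epoch, and that the final partial batch is kept rather than dropped.

```python
import math

n_train = 750      # training images: 75% of 1000 fragments
batch_size = 20    # images per batch
epochs = 80        # passes over the training data

# Number of batches needed to see every training image once.
steps_per_epoch = math.ceil(n_train / batch_size)

# Size of the final, partial batch in each epoch (assumed to be kept).
last_batch = n_train - (steps_per_epoch - 1) * batch_size

# Total number of weight updates over the whole training run.
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)  # 38 10 3040
```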
e. Testing the model by classifying arrhythmia disease images on testing data using the CNN method.
f. Measuring the performance of the classification model using several parameters, namely accuracy, precision, recall, and f1-score. Calculating these parameters requires a contingency table called a confusion matrix. For a two-class problem, the table is a 2 x 2 matrix, as shown in Table 1. Based on Table 1, the four possible outcomes are as follows.
• True Positive (TP), where the model predicts data in the Positive class and the actual data is in a Positive class.
• False Positive (FP), where the model predicts data in the Positive class, but the actual data is in the Negative class.
• True Negative (TN), where the model predicts data in the Negative class and the actual data is in the Negative class.
• False Negative (FN), where the model predicts data in the Negative class, but the actual data is in the Positive class.
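The accuracy, precision, recall, and f1-score levels can be calculated from the confusion table using the following formulas.
(a) $\text{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$
(b) $\text{Precision} = \dfrac{TP}{TP + FP}$
(c) $\text{Recall} = \dfrac{TP}{TP + FN}$
(d) $\text{F1-score} = \dfrac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$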
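The four counts defined above can be turned into these metrics with a few lines of Python. This is a sketch using the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not results from this study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts for one class treated as "positive" against all others.
metrics = binary_metrics(tp=8, fp=2, tn=85, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a multi-class setting such as the 17 classes here, the same counts are obtained per class by treating that class as positive and all other classes as negative.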
The following is a diagram of the research steps in classifying arrhythmia diseases in the heart.

Results
Image data processing was done with the Python programming language and the CNN method. The first stage after data collection is image data pre-processing, which converts the images into a form that is easier for the model to process and can thereby improve the accuracy of the analysis. An example of numerical data (.MAT) converted into image data (.JPG) is shown in Figure 2.

Figure 2. Visualization of ECG signal data
Because the images output by Matlab were too large, they were resized to make the image dataset easier to process. All ECG images (Figure 2), initially 1058 x 530 pixels, were reduced to 1058 x 265 pixels. The resized color (RGB) images were then converted to gray-scale to simplify the model input. Gray-scale images can also be filtered to obtain a cleaner image, as shown in Figure 3. Before the labeling process, each gray-scale image was cropped from 1058 x 265 pixels to 265 x 265 pixels so that the characteristics of each ECG image could be identified for its class. Cropping was centered on the midpoint of the image and was done without equalizing the wavelength of each ECG signal fragment (Figure 3).
Data augmentation is the stage in which the image data are altered or modified so that the model treats each altered image as a new sample, even though a human can tell that it shows the same signal (Figure 5). Augmentation aimed to improve the accuracy of the trained CNN model, because the additional data help the model generalize better.
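The centered crop described above comes down to simple box arithmetic, sketched here. The helper name `center_crop_box` is hypothetical; the returned tuple follows the (left, upper, right, lower) convention used by common imaging libraries such as Pillow, which is an assumption about tooling that this paper does not state.

```python
def center_crop_box(width, height, crop_w, crop_h):
    """Return (left, upper, right, lower) for a crop centred on the image midpoint."""
    left = (width - crop_w) // 2
    upper = (height - crop_h) // 2
    return (left, upper, left + crop_w, upper + crop_h)


# Resized gray-scale image is 1058 x 265; the model input is 265 x 265.
box = center_crop_box(1058, 265, 265, 265)
print(box)  # (396, 0, 661, 265)
```

With these sizes the crop keeps the full height and the middle 265 columns, so signal content to the left of column 396 and to the right of column 661 is discarded.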

Figure 5. Illustration of image data augmentation
The last data pre-processing stage is labeling the training and testing images. The labels are integers from 0 (zero) to 16 (sixteen), one per class in the dataset, stored as an array. The label assigned to each class is shown in Table 2.
The CNN architecture can affect the accuracy of the model. The architecture used in the model training process is shown in Table 3. The model takes as input a 265 x 265 pixel gray-scale ECG signal image in JPG format (Figure 4), and ECG signal fragments from one patient can fall into two or more arrhythmia classes. After training, the model reached an accuracy of 1.0000 with a loss of 0.0014 on the training data, and an accuracy of 0.8120 with a loss of 1.9445 on the testing data. Figure 6 shows the accuracy and loss on the training and testing data at each epoch.

Figure 6. Accuracy and loss graph of model training results
Next, the model was tested by classifying the ECG signal image in the testing data. The test was carried out by inputting image information into the model. Then the model provides prediction results that best match the image information obtained. The Python script used for the testing process is as follows.

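# Assumed imports (not shown in the original script)
import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# `picture` is the path of a test image; `model` is the trained CNN
img = load_img(picture, target_size=(265, 265))  # load and resize the image
img = img_to_array(img)                          # convert to a NumPy array
img = np.expand_dims(img, axis=0)                # add the batch dimension
p = model.predict(img)                           # class scores for the batch
result = p[0]                                    # scores for this one image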
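The vector `result` holds one score per class, so the predicted class is the index of its largest entry. The sketch below shows that decoding step in plain Python. The class names are placeholders, because the label order of Table 2 is not reproduced here, and the scores are made up.

```python
def decode_prediction(probabilities, class_names):
    """Return the most probable class name and its score."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]


# Placeholder names for labels 0..16 (the real order is given in Table 2).
class_names = [f"class_{i}" for i in range(17)]

# Hypothetical model output for one image (illustrative numbers only).
result = [0.01] * 17
result[5] = 0.84

label, confidence = decode_prediction(result, class_names)
print(label, confidence)  # class_5 0.84
```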
The results of testing the CNN model against the testing data are presented visually through a confusion table (confusion matrix) to make it easy to read and evaluate. The confusion matrix is shown in Figure 7.

Figure 7. Confusion matrix of testing data
The performance of a classification model can be measured in several ways, one of which is to compute the accuracy, precision, recall, and f1-score values. Based on Figure 7, these values on the testing data can be found using the confusion table formulas. They can also be computed systematically with a Python script that outputs a classification report. The classification report for the testing data shows that the processed dataset has more than 2 classes (multi-class) and is imbalanced: one class, NSR (Normal Sinus Rhythm), has far more images than the other classes, namely 70. Therefore, model performance was measured by the macro-F1 value, because macro averaging treats all classes as equally important regardless of the number of images in each class.
Thus, the macro-average f1-score is 73%, with macro-average precision and recall of 80% and 71%, respectively, as read from the classification report for the testing data. The macro precision of 80% means that, averaged over the classes, 80% of the images that the CNN model assigned to a class actually belong to that class. The macro recall of 71% means that, averaged over the classes, the CNN model (Table 3) correctly identified 71% of the ECG images that actually belong to each class. Finally, the macro f1-score of 73%, which combines precision and recall, indicates that the CNN model performs well on the testing data.
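The difference between macro averaging and support-weighted averaging on an imbalanced test set can be illustrated with a toy example. Only the majority-class size of 70 images mirrors the NSR class in this study; the other supports and all per-class f1-scores are made-up values, and the helper names are hypothetical.

```python
def macro_average(per_class_scores):
    """Unweighted mean: every class counts equally."""
    return sum(per_class_scores) / len(per_class_scores)


def weighted_average(per_class_scores, supports):
    """Mean weighted by the number of test images in each class."""
    total = sum(supports)
    return sum(s * n for s, n in zip(per_class_scores, supports)) / total


# Three illustrative classes: one large class and two small ones.
f1_scores = [0.95, 0.60, 0.40]
supports = [70, 10, 5]

macro = macro_average(f1_scores)
weighted = weighted_average(f1_scores, supports)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
```

Because the large class dominates the weighted mean, the weighted score hides the poor results on the small classes, whereas the macro score exposes them.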
On the testing data, an accuracy of 81% was obtained, which means that 81% of the images in the arrhythmia disease testing data were classified into their correct class by the CNN model. The accuracy can be computed using confusion table formula (a), based on Figure 7. The confusion matrix shows that 203 images were classified correctly and 47 incorrectly, so the accuracy is 203 / (203 + 47) = 203 / 250 = 0.812. The accuracy obtained on the testing data is therefore 81.2%.

Discussion
Based on the research conducted, the CNN model used to classify arrhythmia diseases into 17 classes based on ECG images performs well. The dataset obtained through Mendeley Data [21] was converted from numerical data (.MAT) to images (.JPG) using Matlab. The images then went through the pre-processing stage, which converted the color images to gray-scale and also included resizing, cropping, augmentation, and labeling. The CNN model architecture used for ECG image detection and classification involved 7 convolution layers, 7 pooling layers, 2 dropout layers, 2 dense layers, and 1 flatten layer, as well as ReLU and Softmax activation functions.
Performance was measured by the f1-score of 73%, together with a precision of 80%, a recall of 71%, and an accuracy of 100% on training data and 81% on testing data. These results can be compared with those of Rohmantri and Surantha [3], who used CNN-2D to classify heart disorders into 7 classes and obtained an f1-score of 98%, a precision of 98%, a recall of 98%, and an accuracy of 98%. In that study, the CNN model architecture used for classification involved 2 convolution layers, 1 pooling layer, 2 dropout layers, 2 dense layers, and 1 flatten layer, as well as ReLU and Softmax activation functions.
Based on the results of model testing and evaluation, the prediction errors on the ECG testing images are likely due to model training parameters that were not yet well chosen. These parameters include the input image size, the training and testing data split, the number of epochs, kernel size, batch size, and learning rate. An arrhythmia disease classification application with a Graphical User Interface (GUI) was built in Python so that the classification model can be used quickly, as shown in Figure 8.

CONCLUSIONS
For the classification of arrhythmia diseases from ECG images (17 classes), the CNN classification workflow was divided into several stages: data pre-processing, CNN model construction, model compilation, training and testing, and evaluation. The CNN model architecture used in the detection and classification of ECG images involves 7 convolution layers, 7 pooling layers, 2 dropout layers, 2 dense layers, and 1 flatten layer, as well as ReLU and Softmax activation functions. The detection and identification results based on ECG images gave a precision of 80%, a recall of 71%, an f1-score of 73%, and an accuracy of 100% on training data and 81% on testing data. With the f1-score as the measurement reference, the CNN model performs well in classifying ECG images. The model in this study used an input image of 265 x 265 pixels, a learning rate of 0.001, a 4 x 4 filter kernel, 80 epochs, and a data split of 750 training images and 250 testing images.