CLASSIFICATION OF STUNTING IN CHILDREN UNDER FIVE YEARS IN PADANG CITY USING SUPPORT VECTOR MACHINE

Stunting is a nutritional problem in children characterized by a height that is more than two standard deviations below the median of the child growth standards determined by the WHO. Stunting is influenced by many factors. If the conditions of these factors are known, it can be predicted whether a child is stunted or not. In this study, the prediction of stunting was carried out using the Support Vector Machine (SVM) classification method. SVM is a method for finding the best hyperplane to separate two or more classes. The parameters of the SVM model that must be determined are the cost and gamma values. With cost = 10 and gamma = 5, the classification achieved 100% accuracy.


INTRODUCTION
Children under five years of age are the group most vulnerable to malnutrition [1]. One form of malnutrition is stunting. Stunting is a condition in which toddlers have a length or height that is more than two standard deviations below the median of the growth standard defined by the WHO. According to nutritional status monitoring by the Ministry of Health of the Republic of Indonesia, stunting is a problem in Padang, West Sumatra, Indonesia. In 2015, 21.1% of children under the age of five were stunted, and this figure increased to 22.6% in 2016 [2], [3]. Both figures exceed the WHO threshold, under which a region is categorized as having no nutritional problem only if the rate of stunting is less than 20% [4].
Stunting is a nutritional problem that is influenced by several factors. These include factors related to mothers and mothers-to-be, the condition of infants and toddlers, and the socioeconomic conditions of the family and environment [5], [6]. If the conditions of the factors that influence the incidence of stunting are known, it can be predicted whether a child is stunted or not.
A data mining classification approach can be used to predict whether a child is stunted or not. One of the methods available for data mining classification is the Support Vector Machine (SVM). The SVM method is a classification approach that has been developed in computer science since the 1990s and is becoming increasingly popular [7], [8]. The principle of this method is to determine the best separating function that can separate two or more classes. In SVM, the separating function is known as a hyperplane. SVM classifiers can be applied to data that can be separated linearly and to data that cannot. For data that cannot be separated linearly, classification is assisted by kernel functions. Several kernel functions are commonly used, such as linear, polynomial, Radial Basis Function (RBF), and sigmoid [9]. Illustrating how the hyperplane is determined when kernel functions are used is complicated.
The SVM method is a classification method with a complex process because the hyperplane to be determined depends on many predictor variables. The more predictors are used, the higher the dimension of the space in which the hyperplane lies, so the function becomes complicated to present explicitly. In addition to the predictor variables, the amount of data used is also very influential in the SVM classification process: the more observations are used, the more parameters must be determined.
In this study, to clarify how classification by the SVM method works, the researchers illustrate the classification process with simple examples that use the linear kernel function. Afterwards, the researchers determine the best parameters to classify the incidence of stunting in toddlers in the city of Padang.

Data Source
This study used primary data collected directly from 440 toddlers. The population of interest is children under five years of age in Padang, West Sumatra. The survey was conducted at several Posyandu (maternal and children's health services), daycares, and housing areas, which were chosen using a purposive sampling technique. The primary data cover ten factors that are predicted to affect nutritional status. Data collection was done by interview, using a questionnaire as a guide. The survey was conducted from August to September 2017.

Research Variables
The variables used in this study consist of a response variable and predictor variables. The response variable is the incidence of stunting in toddlers ($Y$), which has two categories: stunting ($Y = 1$) and not stunting ($Y = 2$). The predictor variables are the variables that affect the incidence of stunting. In this study, there are several continuous variables and several categorical variables. The continuous variables are the child's age ($X_2$), the length of the baby at birth ($X_3$), and the number of dependents ($X_5$). The categorical variables are gender ($X_1$), average income ($X_4$), employment status ($X_6$), breastfeeding status ($X_7$), mother's behavior status ($X_8$), mother's last education ($X_9$), and mother's knowledge status ($X_{10}$).

Definition of Stunting
Stunting was measured by the number of standard deviations that a child's length or height lies below or above the median of the 2006 WHO growth reference population for the child's age and sex [10]. A number of standard deviations (z-score) less than −2 was considered stunting [11].
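The z-score rule can be sketched in a few lines. This is a simplified illustration only: it assumes the reference median and standard deviation for the child's age and sex are already known, and the numbers used below are hypothetical placeholders rather than values from the WHO tables. The function names are likewise illustrative.

```python
def height_for_age_z(height_cm, ref_median_cm, ref_sd_cm):
    """Number of standard deviations the height lies above (+) or below (-) the reference median."""
    return (height_cm - ref_median_cm) / ref_sd_cm


def is_stunted(z_score, cutoff=-2.0):
    """A z-score below the cutoff (-2 in the text) is classified as stunting."""
    return z_score < cutoff


# Hypothetical reference values, for illustration only (not WHO figures).
z = height_for_age_z(height_cm=80.0, ref_median_cm=87.0, ref_sd_cm=3.0)
print(round(z, 2), is_stunted(z))
```

A child exactly at the cutoff (z = −2) is not classified as stunted under this strict inequality, which matches the "less than −2" wording.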

Classification
Classification is a way of categorizing objects based on their characteristics [10]. In classification, there is a dependent variable and there are independent variables (predictors). The dependent variable is divided into several specified classes or categories. The independent variables are the variables presumed to affect the result of the classification.

Hyperplane
In a $p$-dimensional space, a hyperplane is a flat subspace of dimension $p - 1$. For example, in a two-dimensional space, a hyperplane is a subspace of one dimension; in other words, the hyperplane is a line. In three-dimensional space, the hyperplane is a subspace of two dimensions; in other words, the hyperplane is a plane. It is not easy to visualize the shape of a hyperplane in a space of more than three dimensions [12]. In two-dimensional space, the hyperplane is a line given by Eq. (1):
$$\beta_0 + \beta_1 X_1 + \beta_2 X_2 = 0 \tag{1}$$
where $X_1$ and $X_2$ are the variables and $\beta_0$, $\beta_1$, $\beta_2$ are the parameters [13].
In three-dimensional space, a hyperplane is a plane that can be expressed by Eq. (2):
$$\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 = 0 \tag{2}$$
Furthermore, Eq. (2) can be extended to a hyperplane in $p$-dimensional space, which can be expressed by Eq. (3):
$$\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p = 0 \tag{3}$$
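A hyperplane splits the space into two sides, and this is what makes it usable as a class separator. The sketch below assumes the hyperplane is written as $\beta_0 + \beta_1 X_1 + \dots + \beta_p X_p = 0$ (the symbol names are assumed notation) and reports the side on which a point falls. The coefficients and points are hypothetical.

```python
def hyperplane_value(beta0, beta, x):
    """Evaluate beta0 + beta_1*x_1 + ... + beta_p*x_p for a point x."""
    return beta0 + sum(b * xi for b, xi in zip(beta, x))


def side(beta0, beta, x):
    """Return +1 or -1 for the side of the hyperplane on which x lies, or 0 if x is on it."""
    v = hyperplane_value(beta0, beta, x)
    return (v > 0) - (v < 0)


# Hypothetical line in two dimensions: 1 + 2*X1 - X2 = 0
beta0, beta = 1.0, [2.0, -1.0]
print(side(beta0, beta, [0.0, 0.0]))  # value is  1.0, so side +1
print(side(beta0, beta, [0.0, 3.0]))  # value is -2.0, so side -1
print(side(beta0, beta, [1.0, 3.0]))  # value is  0.0, so the point is on the line
```

The same two functions work unchanged for any dimension, since only the length of `beta` and `x` changes.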

Support Vector Machine (SVM)
Support Vector Machine (SVM) is a machine learning classification method that finds the best hyperplane to serve as a separator between data classes. In the SVM method, the best hyperplane is the one that maximizes the margin, that is, the distance between the hyperplane and the closest data points of each class.

Hard Margin
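Hard margin is the term used in SVM when a hyperplane can completely separate the two data classes. Suppose the data lie in a $p$-dimensional space and are denoted by $x_i \in \mathbb{R}^p$, with class labels denoted by $y_i \in \{-1, +1\}$ for $i = 1, 2, 3, \dots, n$, where $n$ is the number of observations. Assume the two classes can be completely separated, so that the hyperplane can be defined in the form of Eq. (4) [12]:
$$\beta_0 + \beta^{T} x = 0 \tag{4}$$
A sample $x_i$ that belongs to the class $y_i = -1$ satisfies the inequality (5):
$$\beta_0 + \beta^{T} x_i \le -1 \tag{5}$$
while a sample $x_i$ that belongs to the class $y_i = +1$ satisfies the inequality (6):
$$\beta_0 + \beta^{T} x_i \ge +1 \tag{6}$$
Inequalities (5) and (6) can be combined into
$$y_i \left( \beta_0 + \beta^{T} x_i \right) \ge 1 \tag{7}$$
Then the inequality (7) can be written as
$$y_i \left( \beta_0 + \beta^{T} x_i \right) - 1 \ge 0 \tag{8}$$
The largest margin is obtained by maximizing the distance between the hyperplane and its nearest data points. This can be formulated as a Quadratic Programming (QP) problem: find the vector $\beta$ that minimizes
$$\frac{1}{2} \lVert \beta \rVert^{2} \tag{9}$$
subject to the inequality constraint
$$y_i \left( \beta_0 + \beta^{T} x_i \right) - 1 \ge 0, \quad i = 1, 2, \dots, n \tag{10}$$
Problems (9) and (10) can be restated using Lagrange multipliers, by incorporating the constraint function, multiplied by the Lagrange multiplier ($\alpha_i \ge 0$), into the objective function, which gives
$$L(\beta, \beta_0, \alpha) = \frac{1}{2} \lVert \beta \rVert^{2} - \sum_{i=1}^{n} \alpha_i \left[ y_i \left( \beta_0 + \beta^{T} x_i \right) - 1 \right] \tag{11}$$
The dual formulation of this problem, usually called the Wolfe dual, is written in Eq. (12):
$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^{T} x_j \tag{12}$$
subject to $\alpha_i \ge 0$ and $\sum_{i=1}^{n} \alpha_i y_i = 0$.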
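As a hedged illustration of the hard-margin conditions, the sketch below checks the constraint $y_i(\beta_0 + \beta^{T} x_i) \ge 1$ for every observation and computes the margin width $2 / \lVert \beta \rVert$ for two candidate hyperplanes. The toy data, the candidate values, and the function names are hypothetical and are not taken from the study; $\beta$ and $\beta_0$ are assumed notation for the hyperplane parameters.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_hard_margin(beta0, beta, X, y, tol=1e-12):
    """True if y_i * (beta0 + beta . x_i) >= 1 holds for every observation."""
    return all(yi * (beta0 + dot(beta, xi)) >= 1 - tol for xi, yi in zip(X, y))


def margin_width(beta):
    """Distance between the hyperplanes beta0 + beta.x = +1 and = -1, i.e. 2 / ||beta||."""
    return 2.0 / math.sqrt(dot(beta, beta))


# Hypothetical, linearly separable toy data in two dimensions.
X = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]]
y = [+1, +1, -1, -1]

# Two candidate hyperplanes given as (beta0, beta).
cand_a = (-3.0, [1.0, 1.0])
cand_b = (-1.0, [0.5, 0.5])

for beta0, beta in (cand_a, cand_b):
    print(satisfies_hard_margin(beta0, beta, X, y), round(margin_width(beta), 3))
```

Both candidates satisfy the constraints, but the second has the smaller norm and therefore the wider margin, which is the one that minimizing $\frac{1}{2}\lVert\beta\rVert^2$ prefers.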

Soft Margin
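With the soft-margin SVM, most of the data are classified correctly, but misclassification is allowed for some points around the separating boundary [13]. In the soft-margin case, the classification problem can be interpreted as finding the best hyperplane that separates the two classes while allowing classification errors around the hyperplane. The dual formulation for this case is:
$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^{T} x_j \tag{14}$$
subject to $0 \le \alpha_i \le C$ and $\sum_{i=1}^{n} \alpha_i y_i = 0$, where $C$ is the cost parameter.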
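To make the idea of allowed classification errors concrete, the sketch below uses the standard soft-margin primal form, in which each observation gets a slack $\xi_i = \max(0, 1 - y_i(\beta_0 + \beta^{T} x_i))$ and the objective is $\frac{1}{2}\lVert\beta\rVert^2 + C \sum_i \xi_i$. The text above gives only the dual, so the primal form is brought in here as the well-known equivalent. The data, the hyperplane, and the function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def slacks(beta0, beta, X, y):
    """Slack per observation: 0 if the margin constraint holds, positive otherwise."""
    return [max(0.0, 1.0 - yi * (beta0 + dot(beta, xi))) for xi, yi in zip(X, y)]


def soft_margin_objective(beta0, beta, X, y, cost):
    """0.5 * ||beta||^2 + cost * (sum of slacks)."""
    return 0.5 * dot(beta, beta) + cost * sum(slacks(beta0, beta, X, y))


# Hypothetical data: the last point carries label -1 but lies on the +1 side.
X = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [1.5, 1.5]]
y = [+1, +1, -1, -1]
beta0, beta = -1.0, [0.5, 0.5]

print(slacks(beta0, beta, X, y))
for cost in (0.01, 10):
    print(cost, soft_margin_objective(beta0, beta, X, y, cost))
```

A large cost makes the single violating point dominate the objective, while a small cost lets the margin term dominate; this is the trade-off that the cost parameter controls.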

SVM on Nonlinearly Separable Data
If the data cannot be separated linearly, the soft-margin SVM cannot find a hyperplane that keeps the number of misclassified data points small. The SVM method is therefore modified by incorporating kernel functions, which gives the nonlinear SVM. The kernel is used to map the data into a higher-dimensional space in which the data can be separated linearly [13].
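A kernel is a function that is used to measure the similarity of two observations [12]. The basis of the kernel is mapping the data into a new, higher-dimensional space using a function $\phi$. The kernel function can be formulated as follows [13]:
$$K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$$
where $\phi(x_i)$ and $\phi(x_j)$ are the images of $x_i$ and $x_j$ in a Hilbert space. Eq. (14) can then be transformed into Eq. (17):
$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \tag{17}$$
with
$$0 \le \alpha_i \le C \quad \text{and} \quad \sum_{i=1}^{n} \alpha_i y_i = 0 \tag{18}$$
To predict the class of an object $z$, a discriminant function is used whose basic operation is a weighted sum of the similarities between object $z$ and a collection of previously selected objects (the support vectors). Let $S$ be the set of support vectors; the function is formulated as follows [14]:
$$f(z) = \sum_{i \in S} \alpha_i y_i K(z, x_i) + \beta_0$$
If $f(z)$ is positive, then object $z$ belongs to class +1, and if $f(z)$ is negative, object $z$ belongs to class −1. In general, the class of an object $z$ can be predicted using the function:
$$\hat{y}(z) = \operatorname{sign}\left( f(z) \right)$$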
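The discriminant function can be sketched directly from its definition, $f(z) = \sum_{i \in S} \alpha_i y_i K(z, x_i) + \beta_0$. The example below uses the linear kernel $K(u, v) = u^{T} v$ as the simplest case. The support vectors, multipliers, and intercept are hypothetical values chosen so that $\sum_i \alpha_i y_i = 0$; they are not results from the study. Assigning the value $f(z) = 0$ to class +1 is an arbitrary choice made here.

```python
def linear_kernel(u, v):
    """K(u, v) = u . v"""
    return sum(a * b for a, b in zip(u, v))


def decision_value(z, support_vectors, alphas, labels, beta0, kernel):
    """f(z) = sum over support vectors of alpha_i * y_i * K(z, x_i), plus beta0."""
    return sum(a * yi * kernel(z, s)
               for a, yi, s in zip(alphas, labels, support_vectors)) + beta0


def predict(z, support_vectors, alphas, labels, beta0, kernel):
    """Class +1 if f(z) >= 0, otherwise class -1 (the tie rule is an assumption)."""
    f = decision_value(z, support_vectors, alphas, labels, beta0, kernel)
    return 1 if f >= 0 else -1


# Hypothetical support vectors and parameters (note 0.25*(+1) + 0.25*(-1) = 0).
support_vectors = [[2.0, 2.0], [0.0, 0.0]]
labels = [+1, -1]
alphas = [0.25, 0.25]
beta0 = -1.0

for z in ([3.0, 3.0], [-1.0, 0.0]):
    print(z, decision_value(z, support_vectors, alphas, labels, beta0, linear_kernel),
          predict(z, support_vectors, alphas, labels, beta0, linear_kernel))
```

Swapping `linear_kernel` for another kernel function changes the similarity measure without changing the structure of the prediction rule.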

Application of SVM to Predict the Classification of Stunting Events
To classify the data using the SVM method, it is necessary to determine the value of the cost parameter ($C$) and the kernel to be used. According to Vapnik in [9], a popular kernel for classification is the Radial Basis Function (RBF) kernel, so this research uses the following kernel function:
$$K(x_i, x_j) = \exp\left( -\gamma \lVert x_i - x_j \rVert^{2} \right)$$
Thus, to determine the best hyperplane for data classification using the SVM method, it is necessary to determine the values of two parameters, namely cost ($C$) and gamma ($\gamma$), that result in good accuracy. Several values are tried in order to choose the best parameters. A trial-and-error approach can be used to find the values of $C$ and $\gamma$ [15]. Based on [16], the range of values that can be used in SVM is $10^{-2}$ to $10^{2}$. The values of $C$ used in this study are 0.01, 0.1, 1, 10, and 100, and the values of $\gamma$ are 1, 2, 3, 4, and 5.
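The sketch below shows the RBF kernel in the common form $K(u, v) = \exp(-\gamma \lVert u - v \rVert^2)$ and enumerates candidate (cost, gamma) pairs from the values listed above. It assumes that every cost value is paired with every gamma value, which the text does not state explicitly. The two points used to show the effect of gamma are hypothetical.

```python
import math
from itertools import product


def rbf_kernel(u, v, gamma):
    """K(u, v) = exp(-gamma * ||u - v||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


costs = [0.01, 0.1, 1, 10, 100]
gammas = [1, 2, 3, 4, 5]

# Full grid of candidate pairs (an assumption about how the values are combined).
grid = list(product(costs, gammas))
print(len(grid))  # 25

# Effect of gamma on the similarity of two fixed, hypothetical points.
u, v = [0.0, 0.0], [0.5, 0.5]
for g in gammas:
    print(g, round(rbf_kernel(u, v, g), 4))
```

For a fixed pair of points, the kernel value shrinks as gamma grows, so a larger gamma makes each observation's influence more local.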
To determine the best parameter values, the 440 data points are divided into two parts, namely training data and testing data. The distribution of the training data and testing data can be seen in Table 2. The split is carried out randomly to obtain 330 training observations and 110 testing observations. The split is then repeated several times to determine the best parameters for the classification of stunting. Table 3 displays the average accuracy of ten candidate parameter combinations, which were obtained after 100 repetitions.
Based on Table 3, it can be concluded that the best parameters for forming the hyperplane in this study are cost = 10 and gamma = 5. These values are then used to predict the classification of the incidence of stunting in toddlers in the city of Padang. Table 4 presents the prediction results of the RBF kernel with cost = 10 and gamma = 5. Based on Table 4, 140 observations are correctly predicted as stunting and 300 observations are correctly predicted as not stunting. Overall, the accuracy obtained in this study is:
$$\text{Accuracy} = \frac{140 + 300}{440} \times 100\% = 100\%$$
The accuracy value obtained is 100%. This shows that the classification of the incidence of stunting produced by the SVM method in this study has a very high level of accuracy.
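The repeated random split and the accuracy calculation can be sketched as follows. The split sizes (330 and 110), the 100 repetitions, and the counts 140 and 300 come from the text; the random seed and the function names are arbitrary choices made for this illustration, and no model is fitted here.

```python
import random


def split_indices(n, n_train, rng):
    """Shuffle the indices 0..n-1 and cut them into a training part and a testing part."""
    idx = list(range(n))
    rng.shuffle(idx)
    return idx[:n_train], idx[n_train:]


def accuracy(correct_stunting, correct_not_stunting, total):
    """Share of observations whose class is predicted correctly."""
    return (correct_stunting + correct_not_stunting) / total


rng = random.Random(0)  # the seed is arbitrary, used only for reproducibility
splits = [split_indices(440, 330, rng) for _ in range(100)]

train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))  # 330 110
print(accuracy(140, 300, 440))        # 1.0
```

In a full experiment, each of the 100 splits would be used to fit the model on the training part and score it on the testing part, and the scores would be averaged per parameter pair.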

CONCLUSIONS
From the results of the research, it can be concluded that:
1. Based on the classification illustration using the linear kernel function, the parameters that must be determined are $\alpha$ and $\beta_0$. The number of parameters that must be determined depends on the amount of data used. The value of $\alpha$ is obtained by maximizing the margin that separates the two classes, using the dual formulation:
$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^{T} x_j$$
The value of $\beta_0$ is then obtained from the constraint function by choosing a certain $x_i$ that is a support vector.
2. Based on the research results, the best parameters for predicting the classification of stunting in children under five in the city of Padang are cost = 10 and gamma = 5, with the Radial Basis Function (RBF) kernel. The accuracy obtained in this classification is 100%, which means that all observations are correctly predicted.
3. In this study, the researchers used the Support Vector Machine (SVM) method to predict a classification with two classes. In future research, it is suggested that the SVM method be used to predict classifications with more than two classes, using different parameter values.