# kNN (k-nearest neighbor)

2022-09-07

3.2. kNN

k-nearest neighbor (kNN) is a supervised classification algorithm that can be applied successfully to large databases. In the kNN algorithm, an object is classified by a majority vote of its neighbors: it is assigned to the class most common among its k nearest neighbors (Dudani, 1976). Distances are calculated with the Euclidean, Manhattan, Minkowski, or Chebyshev metrics (Lee, Wan, Rajkumar, & Isa, 2012). In this study, classification was performed using the distance parameter and k = 5.

3.3. Support vector machine

The Support Vector Machine (SVM) is a classifier based on statistical learning theory. Its purpose is to estimate the most appropriate decision function that can distinguish two or more classes (Cortes & Vapnik, 1995). An example is shown in Fig. 1: by finding the decision function, the SVM determines the most suitable hyperplane separating the dataset. Eq. (1) gives the support vector equations for a linearly separable binary classification problem.
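The kNN rule of Section 3.2 (majority vote among the k nearest neighbors under a chosen distance metric) can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function names `minkowski`, `chebyshev`, and `knn_predict` and the toy data are hypothetical, and the default Euclidean metric is an assumption, since the text does not state which distance was used.

```python
from collections import Counter

def minkowski(a, b, p=2.0):
    """Minkowski distance; p=1 gives Manhattan, p=2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def chebyshev(a, b):
    """Chebyshev distance: the largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_X, train_y, query, k=5, dist=minkowski):
    """Assign `query` to the class most common among its k nearest neighbors."""
    # Sort the training points by distance to the query and keep the k closest.
    neighbors = sorted(zip(train_X, train_y),
                       key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    # With two classes and odd k (such as k = 5) the vote cannot tie.
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated classes in 2-D.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
           (1.0, 1.0), (0.9, 1.1), (1.1, 0.9), (1.2, 1.2)]
train_y = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(knn_predict(train_X, train_y, (0.15, 0.15), k=5))
```

Passing `dist=chebyshev`, or `dist=lambda a, b: minkowski(a, b, p=1)` for the Manhattan metric, swaps the distance without changing the voting step.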
Table 2
The statistical significance of each target variable.
Omnibus tests of model coefficients for Citology, Hinselmann, Schiller, and Biopsy; columns: Chi-square, df, Sig.
Table 3
Effect of adding Gaussian noise to the data set on classification performance.
Columns: Noise rate, Accuracy (%).
Fig. 1. An example optimal plane and support vectors (Chapelle & Vapnik, 2000).
Once Eq. (3) is obtained, it is solved by Lagrange optimization. The decision function of the support vector machine for a two-class problem is given in Eq. (4) (Chapelle & Vapnik, 2000; Vapnik, 1998).
The coefficients $\alpha_i$ are the solution of the problem. For a detailed discussion of SVM, see Chapelle & Vapnik (2000).
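To make the role of the coefficients concrete, the sketch below evaluates a two-class decision function in the standard textbook form f(x) = sign(Σ αᵢ yᵢ ⟨xᵢ, x⟩ + b) with a linear kernel. The equations of the paper are not reproduced in this text, so this form is an assumption rather than a copy of Eq. (4). The helper names and the two-point dataset are hypothetical; for that dataset the coefficients (0.25, 0.25) and b = 0 follow by hand from the margin conditions f(x₊) = +1 and f(x₋) = −1.

```python
def dot(u, v):
    """Inner product of two equal-length vectors (the linear kernel)."""
    return sum(a * b for a, b in zip(u, v))

def svm_score(x, support_vectors, labels, alphas, b):
    """Compute sum_i alpha_i * y_i * <x_i, x> + b."""
    return sum(a * y * dot(sv, x)
               for a, y, sv in zip(alphas, labels, support_vectors)) + b

def svm_classify(x, support_vectors, labels, alphas, b):
    """Return +1 or -1 according to the sign of the score."""
    return 1 if svm_score(x, support_vectors, labels, alphas, b) >= 0 else -1

# Hypothetical problem with one support vector per class.
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]
alphas = [0.25, 0.25]   # hand-derived for this toy problem
b = 0.0

print(svm_classify((2.0, 0.0), support_vectors, labels, alphas, b))
print(svm_classify((-3.0, 1.0), support_vectors, labels, alphas, b))
```

Only the support vectors enter the sum, which is why the classifier depends on a subset of the training data once the coefficients are known.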
3.4. AutoEncoders
An autoencoder is an artificial neural network trained by unsupervised learning; it performs a kind of encoding and decoding of its input without performing classification (Baştürk, Yüksei, Badem, & Çalışkan, 2017; Kaynar, Yüksek, Görmez, & Işık, 2017). Fig. 2 shows a general autoencoder model.
As shown in Fig. 2, the model consists of three layers, as in artificial neural networks: the input, hidden, and output layers (Le et al., 2011). Each layer consists of a specified number of neurons. The autoencoder operations are performed in the layer designated as the hidden layer. The number of hidden layers is one of the important factors affecting the performance of the model. The autoencoder model has two important features: the input and output layers are chosen to be equal in size, and the number of input-layer neurons is often greater than the number of hidden-layer neurons. Thanks to these two features, the autoencoder model performs better than feed-forward neural networks.
In Eq. (5) and Eq. (6), s denotes the activation function used, such as the Gaussian, sigmoid, or tanh function (Bengio, 2009); w represents the weights between the input and the hidden layer, and b represents the bias value (Baştürk et al., 2017). The value z is the reconstruction of the input: it is computed from the hidden-layer values y, which are in turn obtained from the input x (Chen et al., 2018). The overfitting problem of artificial neural networks is also present in the autoencoder structure. Regularization is used to overcome this problem; the operations performed in this method are shown in Eq. (8).
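The encode and decode steps described above can be sketched as a single forward pass. The equations themselves are not reproduced in this text, so the sketch assumes the common formulation y = s(Wx + b) and z = s(W'y + b') with a sigmoid for s, and it assumes an L2 weight penalty as the regularization term, which may differ from the paper's Eq. (8). All names, layer sizes, and values are hypothetical.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def affine(W, v, b):
    """Compute W v + b for a matrix W stored as a list of rows."""
    return [sum(w * vj for w, vj in zip(row, v)) + bi
            for row, bi in zip(W, b)]

def encode(x, W, b):
    """Hidden representation y = s(W x + b)."""
    return [sigmoid(t) for t in affine(W, x, b)]

def decode(y, W_out, b_out):
    """Reconstruction z = s(W' y + b')."""
    return [sigmoid(t) for t in affine(W_out, y, b_out)]

def regularized_loss(x, z, W, W_out, lam):
    """Squared reconstruction error plus an assumed L2 weight penalty."""
    recon = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    penalty = sum(w * w for M in (W, W_out) for row in M for w in row)
    return recon + lam * penalty

random.seed(0)
n_in, n_hidden = 4, 2   # input and output sizes are equal; hidden is smaller
W = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b = [0.0] * n_hidden
W_out = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
b_out = [0.0] * n_in

x = [0.2, 0.9, 0.4, 0.7]
y = encode(x, W, b)
z = decode(y, W_out, b_out)
print(len(y), len(z), regularized_loss(x, z, W, W_out, lam=0.01))
```

Training would adjust the weights and biases to minimize this loss over the data set; the sketch only evaluates it once for randomly initialized weights.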