The Generalized Regression Neural Network (GRNN) is a supervised artificial neural network (ANN) model based on radial basis functions and is commonly used for prediction. Its strengths are that it is easy to model and produces fast, accurate results. However, the GRNN prediction mechanism keeps one neuron in the pattern layer for every sample in the training data set. For very large training sets, the pattern layer therefore grows in proportion to the number of samples, and both the number of computations and the memory requirement increase. In this study, the k-means clustering algorithm, which has already been tried as a pre-processor in the literature, is used to reduce the space and time complexity of the GRNN. Unlike earlier studies, test data that fall between clusters, which degraded the performance of those studies, are identified so that they do not become outliers. As a result, the memory requirement and the number of computations in the pattern layer are reduced, the negative effect of clustering on performance is largely removed, and almost the same prediction results as the standard GRNN are obtained with approximately 90% fewer training samples.
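The idea of shrinking the pattern layer with k-means can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`grnn_predict`, `kmeans`, `reduce_training_set`), the smoothing parameter `sigma`, and the choice to give each cluster center the mean target of its members are all assumptions made for illustration. The abstract does not describe how test data lying between clusters are detected, so that step is left out here.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def grnn_predict(x, patterns, targets, sigma=0.1):
    """GRNN output: Gaussian-kernel weighted average of the stored targets.

    Each entry of `patterns` plays the role of one pattern-layer neuron.
    """
    weights = [math.exp(-sq_dist(x, p) / (2.0 * sigma ** 2)) for p in patterns]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed: fall back to the nearest pattern's target.
        nearest = min(range(len(patterns)), key=lambda i: sq_dist(x, patterns[i]))
        return targets[nearest]
    return sum(w * t for w, t in zip(weights, targets)) / total


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centers, member index lists)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    groups = [[] for _ in range(k)]
    dim = len(points[0])
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for i, p in enumerate(points):
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(i)
        for c, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous center
                centers[c] = [
                    sum(points[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]
    return centers, groups


def reduce_training_set(points, targets, k, seed=0):
    """Replace the training set by at most k (center, mean target) pairs.

    Assumption for this sketch: a cluster is represented by its center and
    the average target of its members; empty clusters are dropped.
    """
    centers, groups = kmeans(points, k, seed=seed)
    patterns, values = [], []
    for center, members in zip(centers, groups):
        if members:
            patterns.append(center)
            values.append(sum(targets[i] for i in members) / len(members))
    return patterns, values
```

With, say, 200 training samples and `k=20`, `reduce_training_set` leaves at most 20 pattern-layer entries, which matches the order of reduction (about 90%) mentioned in the abstract. Calling `grnn_predict` once with the full set and once with the reduced set on the same query point gives a quick way to compare the two.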
In a general regression neural network (GRNN), the number of neurons in the pattern layer is proportional to the number of training samples in the dataset. The use of a GRNN in applications that have relatively large datasets becomes troublesome due to the architecture and speed required. The great number of neurons in the pattern layer requires a substantial increase in memory usage and causes a substantial decrease in calculation speed. Therefore, there is a strong need for pattern layer size reduction. In this study, a self-organizing map (SOM) structure is introduced as a pre-processor for the GRNN. First, an SOM is generated for the training dataset. Second, each training record is labelled with the most similar map unit. Lastly, when a new test record is applied to the network, the most similar map units are detected, and the training data that have the same labels as the detected units are fed into the network instead of the entire training dataset. This scheme enables a considerable reduction in the pattern layer size. The proposed hybrid model was evaluated by using fifteen benchmark test functions and eight different UCI datasets. According to the simulation results, the proposed model significantly simplifies the GRNN’s structure without any performance loss.
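The three steps described above (build a SOM, label each training record with its most similar unit, then feed only the records of the units closest to a test record into the GRNN) can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's model: it uses a one-dimensional chain of units with a Gaussian neighbourhood, and the names `train_som`, `label_records`, `som_grnn_predict` and the parameters `n_units`, `n_best` and `sigma` are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som(data, n_units=10, epochs=30, lr0=0.5, radius0=3.0, seed=0):
    """Train a 1-D SOM (a chain of units); returns the unit weight vectors."""
    rng = random.Random(seed)
    units = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays from 1 towards 0
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        for x in rng.sample(data, len(data)):  # one shuffled pass
            bmu = min(range(n_units), key=lambda u: sq_dist(x, units[u]))
            for u in range(n_units):
                # Units close to the best-matching unit on the chain move more.
                h = math.exp(-((u - bmu) ** 2) / (2.0 * radius ** 2))
                units[u] = [w + lr * h * (xi - w) for w, xi in zip(units[u], x)]
    return units


def label_records(data, units):
    """Label every training record with the index of its most similar unit."""
    return [min(range(len(units)), key=lambda u: sq_dist(x, units[u])) for x in data]


def grnn_predict(x, patterns, targets, sigma=0.02):
    """GRNN output: Gaussian-kernel weighted average of the stored targets."""
    weights = [math.exp(-sq_dist(x, p) / (2.0 * sigma ** 2)) for p in patterns]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed: fall back to the nearest pattern's target.
        nearest = min(range(len(patterns)), key=lambda i: sq_dist(x, patterns[i]))
        return targets[nearest]
    return sum(w * t for w, t in zip(weights, targets)) / total


def som_grnn_predict(x, data, targets, units, labels, n_best=2, sigma=0.02):
    """Predict with only the records labelled by the n_best closest units.

    Returns (prediction, number of pattern-layer neurons actually used).
    """
    ranked = sorted(range(len(units)), key=lambda u: sq_dist(x, units[u]))
    chosen = set(ranked[:n_best])
    idx = [i for i, lab in enumerate(labels) if lab in chosen]
    if not idx:
        # The chosen units own no records: fall back to the full training set.
        idx = list(range(len(data)))
    pred = grnn_predict(x, [data[i] for i in idx], [targets[i] for i in idx], sigma)
    return pred, len(idx)
```

The second return value of `som_grnn_predict` reports how many training records were fed to the GRNN for a given query, so the size of the reduced pattern layer can be compared directly with the size of the full training set.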