To address the low efficiency of traditional data-intensive extensible computing systems in data source utilization and data transmission, an application-layer throughput optimization model based on parallel flow number prediction is proposed. Firstly
Keywords: C order model; Second order model; Parallel flow number prediction; Throughput
Copyright © 2016 Universitas Ahmad Dahlan. All rights reserved.
Introduction
In recent years, the rapid development of science in many fields has made stored data increasingly large and complex, and large-scale data has become the prevailing trend. For large data transfers, parallel TCP streams are known to deliver better transmission performance: parallel data flows obtain a larger share of the available bandwidth by simulating the behavior of a single data flow [1][2][3][4][5][6]. However, the point of network congestion is very hard to predict owing to the independence of some parameters in the time-space domain. Selecting the number of parallel flows is therefore extremely difficult, and the choice depends on several network parameters, such as available bandwidth, packet loss rate, bottleneck link capacity, and data size.
Several prediction models for the parallel flow number have been proposed. The full second order throughput prediction model in [7] was verified on the GridFTP data transmission system; building on it, a partial C order throughput prediction model was proposed in [8] and likewise verified on the GridFTP data transmission system. According to [7], the prediction accuracy of the full second order model is higher than that of the partial second order model. Following this idea, the partial C order network throughput prediction model is studied here and a full C order network throughput prediction model is derived. Experiments and tests show, however, that the algorithm of the full C order model has poor real-time performance because it requires additional network prediction parameters. To address this problem, a low-sampling throughput optimization algorithm framework is designed.
In addition, following the practice recorded in the related literature, simulation experiments are conducted in a GridFTP data transmission system to evaluate the prediction performance of the related models and to verify the validity of the proposed algorithm.
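To make the idea of parallel flow number prediction concrete, the following minimal Python sketch fits a second order throughput model from three sampled transfers and reads off the predicted optimal flow number. It rests on an assumption that this section does not state: that the full second order model takes the form Th(n) = n / sqrt(a·n² + b·n + c), where n is the number of parallel flows. The function names and the sample values are hypothetical and serve only as an illustration; this is not the algorithm proposed in this paper.

```python
import math


def fit_full_second_order(samples):
    """Fit (a, b, c) of the ASSUMED model Th(n) = n / sqrt(a*n^2 + b*n + c).

    samples: three (n, throughput) pairs measured at distinct flow numbers.
    """
    if len(samples) != 3:
        raise ValueError("exactly three samples are required")
    # Linearise the model: n^2 / Th^2 = a*n^2 + b*n + c
    rows = [[n * n, n, 1.0, (n / th) ** 2] for n, th in samples]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 system
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        if abs(rows[col][col]) < 1e-12:
            raise ValueError("samples do not determine the model")
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    a, b, c = (rows[i][3] / rows[i][i] for i in range(3))
    return a, b, c


def predict_throughput(n, a, b, c):
    """Throughput predicted by the assumed model for n parallel flows."""
    return n / math.sqrt(a * n * n + b * n + c)


def optimal_flow_number(a, b, c):
    """Flow number maximising the assumed model.

    Setting dTh/dn = 0 reduces to b*n + 2*c = 0, so n* = -2c / b.
    An interior maximum exists only when b < 0 and c > 0.
    """
    if b >= 0 or c <= 0:
        raise ValueError("model has no interior maximum")
    return -2.0 * c / b


if __name__ == "__main__":
    # Hypothetical coefficients used only to generate illustrative samples
    true_a, true_b, true_c = 0.01, -0.08, 0.4
    samples = [(n, predict_throughput(n, true_a, true_b, true_c))
               for n in (1, 4, 16)]
    a, b, c = fit_full_second_order(samples)
    print("predicted optimal flow number:", round(optimal_flow_number(a, b, c)))
```

Under this assumed form, three sample transfers suffice to determine the coefficients, which shows why a model with more parameters needs more sampling and may respond more slowly in real time.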