In this paper, we have proved that the HDFS I/O operations performance is getting increased by integrating the set associativity in the cache design and changing the pipeline topology using fully connected digraph network topology. In read operation, since there is huge number of locations (words) at cache compared to direct mapping the chances of miss ratio is very low, hence reducing the swapping of the data between main memory and cache memory. This is increasing the memory I/O operations performance. In Write operation instead of using the sequential pipeline we need to construct the fully connected graph using the data blocks listed from the NameNode metadata. In sequential pipeline, the data is getting copied to source node in the pipeline. Source node will copy the data to next data block in the pipeline. The same copy process will continue until the last data block in the pipeline. The acknowledgment process has to follow the same process from last block to source block. The time required to transfer the data to all the data blocks in the pipeline and the acknowledgment process is almost 2n times to data copy time from one data block to another data block (if the replication factor is n).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.