Achieving high system execution performance is challenging due to the variety of application features, the unpredictable availability of memory resources, and the inconsistent use of the cache application programming interface (API). Ineffective cache replacement policies can cause a range of performance problems, including long application execution times, low memory utilization, frequent replacements, and even program execution failures caused by memory shortages. Spark currently uses the least recently used (LRU) cache replacement policy. Although LRU is a popular classical strategy, it ignores environmental conditions and workload characteristics, and it therefore cannot perform well across a variety of situations. The model in this paper introduces a new cache replacement technique called least partition weight (LPW). LPW considers several parameters, such as partition size, computational cost, and reference count, when deciding which partition to evict. Once the LPW algorithm was integrated into Spark, it was evaluated against LRU and other state-of-the-art methods. The recommended model substantially improves the cache replacement process through automatic data partitioning and a weighted-tree technique. Therefore, this review paper focuses on examining previous research to understand the cache replacement process and presents an improved version of cache replacement using LPW in conjunction with parallel computation, which will be described in the paper's next edition.
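
The weight-based eviction idea described above can be illustrated with a minimal sketch. The weight formula below (recomputation cost times reference count, divided by partition size) is an assumption for illustration only; the paper does not specify the exact combination of parameters, and the class and function names (`Partition`, `lpw_weight`, `LPWCache`) are hypothetical rather than part of Spark's API.

```python
from dataclasses import dataclass

@dataclass
class Partition:
    pid: str          # partition identifier
    size: int         # bytes this partition occupies in the cache
    cost: float       # cost to recompute the partition if evicted
    ref_count: int    # expected number of future references

def lpw_weight(p: Partition) -> float:
    # Assumed weight: cheap-to-recompute, rarely referenced, large
    # partitions receive the lowest weight and are evicted first.
    return (p.cost * p.ref_count) / p.size

class LPWCache:
    """Toy cache that evicts the partition with the least weight."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.parts: dict[str, Partition] = {}

    def used(self) -> int:
        return sum(p.size for p in self.parts.values())

    def insert(self, p: Partition) -> list[str]:
        # Evict least-weight partitions until the new one fits.
        evicted = []
        while self.parts and self.used() + p.size > self.capacity:
            victim = min(self.parts.values(), key=lpw_weight)
            evicted.append(victim.pid)
            del self.parts[victim.pid]
        if p.size <= self.capacity:
            self.parts[p.pid] = p
        return evicted
```

Unlike LRU, which would evict whichever partition was accessed longest ago, this sketch ranks partitions by a composite weight, so a large, cheap-to-recompute partition is preferred for eviction over a small, expensive one even if the latter was used less recently.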