“…2) Generality of proposed hardware: Designing controllers for CXL memory is currently under active development. CXL is being investigated to provide persistent memory [54], pooled remote memory to expand memory capacity [50], near-memory accelerators on CXL [34], and dynamic tiered memory [35]. Efficient implementation of these designs requires controllers for data persistence, address translation, and data coherence.…”
As cloud workloads increasingly adopt the fault-tolerant Function-as-a-Service (FaaS) model, demand for improved performance has increased. Alas, the performance of FaaS applications is heavily bottlenecked by the remote object store in which FaaS objects are maintained. We identify that the upcoming CXL-based cache-coherent disaggregated memory is a promising technology for maintaining FaaS objects. Our analysis indicates that CXL's low-latency, high-bandwidth access characteristics, coupled with compute-side caching of objects, provide significant performance potential over an in-memory RDMA-based object store. We observe, however, that CXL lacks the requisite level of fault tolerance necessary to operate at an inter-server scale within the datacenter. Furthermore, its cache-line granular accesses impose inefficiencies for object-granular data store accesses. We propose Āpta, a CXL-based object-granular memory interface for maintaining FaaS objects. Āpta's key innovation is a novel fault-tolerant coherence protocol for keeping the cached objects consistent without compromising availability in the face of compute server failures. Our evaluation of Āpta using 6 full FaaS application workflows (totaling 26 functions) indicates that it outperforms a state-of-the-art fault-tolerant object caching protocol on an RDMA-based system by 21-90% and an uncached CXL-based system by 15-42%.
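The abstract's core idea — an object-granular store in which the memory side remains the authoritative copy while compute servers cache objects, so a compute server failure never loses data — can be illustrated with a small sketch. This is our own simplified model, not Āpta's actual protocol: all class and method names here are hypothetical, and the real design operates over CXL hardware rather than a Python dictionary.

```python
# Hypothetical sketch of an object-granular store with compute-side caching.
# The memory side holds the authoritative copy of every object and tracks
# which compute servers have cached it, so a writer can invalidate stale
# copies. A crashed compute server is simply dropped from the sharer sets;
# since the memory side is the source of truth, availability is preserved.

class ObjectStore:
    def __init__(self):
        self.objects = {}   # key -> value (authoritative copy, memory side)
        self.sharers = {}   # key -> set of compute-server ids caching it

    def get(self, server, key):
        # Record the reader as a sharer; it now caches the object.
        self.sharers.setdefault(key, set()).add(server)
        return self.objects[key]

    def put(self, server, key, value):
        # Invalidate all other cached copies before the write takes effect,
        # keeping cached objects consistent with the authoritative copy.
        invalidated = self.sharers.get(key, set()) - {server}
        self.objects[key] = value
        self.sharers[key] = {server}
        return invalidated

    def fail(self, server):
        # A crashed compute server loses its cache; no object data is lost
        # because the authoritative copy lives on the memory side.
        for sharer_set in self.sharers.values():
            sharer_set.discard(server)
```

In this toy model, writes are serialized at the memory side and invalidations are returned synchronously; the real protocol must additionally tolerate failures that occur mid-invalidation, which is precisely the fault-tolerance gap in baseline CXL coherence that the paper targets.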
“…As a result, an opportunity was created for specialized accelerators, which offer significant performance improvements over CPUs [4]–[6] for two distinct reasons: (1) they are specifically optimized for a given application (or small set of applications), taking full advantage of its underlying characteristics, such as data access patterns and parallelization opportunities; and (2) they may be installed close to where data is stored (i.e., memory devices [7]–[10]), becoming Near-Data Processing (NDP) devices, henceforth designated Near-Data Accelerators (NDAccs), which grants them much higher bandwidth and lower latency to memory than the CPU.…”