The CosMiC system is a user-level process migration environment. Process migration is defined as the mechanism to checkpoint the state of an unfinished process, transfer the state from one machine to another, and resume process execution on the new machine. The main purposes of process migration are (1) to utilize the CPU power and balance load on all machines in an environment; (2) to provide fault tolerance by migrating a process from a failed machine to another machine. CosMiC provides an extensible architecture to allow an application to choose its own checkpointing mechanism. It is equipped with four checkpoint libraries, namely, libckp, libfcp, libft and libst. They provide different strategies for state saving and restoring. Libckp is a transparent checkpoint library that checkpoints the entire process state. It requires minimum user involvement and no modifications to the source code. Libfcp is a file checkpoint library that saves and restores file contents. Libft is a critical data checkpoint library. Users select critical data to be checkpointed using a set of application programming interfaces (APIs). Libst is a strong-type checkpoint library. It saves and restores architecture-independent checkpoints in a heterogeneous environment. In this paper, we describe our experience in incorporating these different checkpointing mechanisms into the CosMiC system.

Figure 1. Numbers of idle machines in a day (total 131 hosts).
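
To make the checkpoint, transfer, and resume cycle concrete, the following is a minimal sketch of critical-data checkpointing in the spirit of libft, in which the user decides which values are saved. It is not the libft API: the functions `save`, `restore`, and `run_sum`, the `stop_at` parameter, and the JSON file format are all assumptions introduced here for illustration. The checkpoint is written as text, so it is also readable on a machine with a different architecture, which loosely mirrors the goal stated for libst.

```python
import json
import os
import tempfile


def save(path, state):
    """Write the user-selected critical data to a checkpoint file.

    Hypothetical helper, not part of CosMiC. The data goes to a temporary
    file first and is then renamed, so a crash in the middle of a save
    cannot leave a half-written checkpoint behind.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def restore(path, state):
    """Load a checkpoint into `state` if one exists; return True on success."""
    if not os.path.exists(path):
        return False
    with open(path) as f:
        state.update(json.load(f))
    return True


def run_sum(limit, ckpt_path, stop_at=None):
    """Sum 0..limit-1, checkpointing after every step.

    `stop_at` (an assumption of this sketch) simulates the process being
    interrupted so that it can be resumed later, possibly elsewhere.
    """
    # Only these two values are critical; nothing else is saved.
    state = {"i": 0, "total": 0}
    restore(ckpt_path, state)

    while state["i"] < limit:
        if stop_at is not None and state["i"] == stop_at:
            return None  # the process stops here, unfinished
        state["total"] += state["i"]
        state["i"] += 1
        save(ckpt_path, state)
    return state["total"]


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    run_sum(10, ckpt, stop_at=4)   # first run is interrupted
    print(run_sum(10, ckpt))       # second run resumes from the checkpoint
```

In this sketch the transfer step is simply the checkpoint file being visible to the resuming process; in a real migration the file would have to be copied or shared between the two machines.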