CMS expects to manage several petabytes of data each year, distributing them over many computing sites around the world and enabling data access at those centers for analysis. CMS has identified the distributed sites as the primary location for physics analysis, supporting a wide community of potentially as many as 3000 users. This represents an unprecedented scale of distributed computing resources and number of users. An overview of the computing architecture, the software tools and the distributed infrastructure deployed is reported. Summaries of the experience in establishing efficient and scalable operations to prepare for CMS distributed analysis are presented, followed by the user experience in their current analysis activities.

Journal of Grid Computing manuscript.
Distributed Analysis in CMS
The Compact Muon Solenoid (CMS) is one of the general-purpose experiments at the CERN Large Hadron Collider (LHC). CMS computing relies on different grid infrastructures to provide computational and storage resources. The major grid middleware stacks used for CMS computing are gLite, Open Science Grid (OSG) and ARC (Advanced Resource Connector). The Helsinki Institute of Physics (HIP) hosts one of the Tier-2 centers for CMS computing. CMS Tier-2 centers operate software systems for data transfers (PhEDEx), Monte Carlo production (ProdAgent) and data analysis (CRAB). To provide the Tier-2 services for CMS, HIP uses tools and components from both the ARC and gLite grid middleware stacks. Interoperation between grid systems is a challenging problem, and HIP uses two different solutions to provide the needed services. The first solution is based on gLite-ARC grid-level interoperability, which allows ARC resources to be used in CMS without modifying the CMS application software. The second solution is based on developing specific ARC plugins for the CMS software.