David Dykstra scite author profile

The CMS experiment at the LHC has established an infrastructure using the FroNTier framework to deliver conditions data (calibration, alignment, etc.) to processing clients worldwide. FroNTier is a simple web service approach providing client HTTP access to a central database service. The system for CMS has been developed to work with POOL, which provides object relational mapping between the C++ clients and various database technologies. Because of the read-only nature of the data, Squid proxy caching servers are maintained near clients and these caches provide high performance data access. Several features have been developed to make the system meet the needs of CMS, including careful attention to cache coherency with the central database and the low latency loading required for the operation of the online High Level Trigger. The ease of deployment, stability of operation, and high performance make the FroNTier approach well suited to the GRID environment being used for CMS offline, as well as for the online environment used by the CMS High Level Trigger (HLT). The use of standard software, such as Squid and various monitoring tools, makes the system reliable, highly configurable and easily maintained. We describe the architecture, software, deployment, performance, monitoring and overall operational experience for the system.
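
The caching idea in the abstract above can be illustrated with a small sketch. It is not the FroNTier or Squid implementation: the class `ReadThroughCache`, the helper `make_counting_origin`, the example URL, and the use of a fixed expiry time to bound staleness are all assumptions made for illustration. It only shows why read-only data requested by URL lets a cache near the clients absorb most of the load that would otherwise reach the central service.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Toy stand-in for a caching proxy placed near clients (hypothetical)."""

    def __init__(
        self,
        origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._origin = origin      # function that queries the central service
        self._ttl = ttl_seconds    # assumed expiry window bounding staleness
        self._clock = clock        # injectable clock, useful for testing
        self._store: Dict[str, Tuple[float, bytes]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        """Return the payload for url, contacting the origin only when needed."""
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now < entry[0]:
            self.hits += 1
            return entry[1]
        # Missing or expired entry: fetch once and remember it until expiry.
        self.misses += 1
        payload = self._origin(url)
        self._store[url] = (now + self._ttl, payload)
        return payload


def make_counting_origin():
    """Build a fake central service that counts how often it is queried."""
    calls = {"n": 0}

    def origin(url: str) -> bytes:
        calls["n"] += 1
        return f"payload for {url}".encode()

    return origin, calls


if __name__ == "__main__":
    origin, calls = make_counting_origin()
    cache = ReadThroughCache(origin, ttl_seconds=300.0)
    for _ in range(1000):  # many clients asking for the same conditions data
        cache.get("/conditions?tag=example")
    print("origin queries:", calls["n"], "cache hits:", cache.hits)
```

In this sketch, repeated requests for the same URL within the expiry window reach the origin only once; a shorter expiry trades extra origin load for fresher data.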

Scaling HEP to Web Size with RESTful Protocols: The Frontier Example

Dykstra

2011

The World Wide Web has scaled to an enormous size. The largest single contributor to its scalability is the HTTP protocol, particularly when used in conformity to REST (REpresentational State Transfer) principles. High Energy Physics (HEP) computing also has to scale to an enormous size, so it makes sense to base much of it on RESTful protocols. Frontier, which reads databases with an HTTP-based RESTful protocol, has successfully scaled to deliver production detector conditions data from both the CMS and ATLAS LHC detectors to hundreds of thousands of computer cores worldwide. Frontier is also able to re-use a large amount of standard software that runs the Web: on the clients, caches, and servers. I discuss the specific ways in which HTTP and REST enable high scalability for Frontier. I also briefly discuss another protocol used in HEP computing that is HTTP-based and RESTful, and another protocol that could benefit from it. My goal is to encourage HEP protocol designers to consider HTTP and REST whenever the same information is needed in many places.

Advances in Grid Computing for the Fabric for Frontier Experiments Project at Fermilab

Herner

Bhat

et al. 2017

Distributed Analysis in CMS

et al. 2010

CMS expects to manage several Pbytes of data each year, distributing them over many computing sites around the world and enabling data access at those centers for analysis. CMS has identified the distributed sites as the primary location for physics analysis to support a wide community of users, with potentially as many as 3000 users. This represents an unprecedented scale of distributed computing resources and number of users. An overview of the computing architecture, the software tools and the distributed infrastructure deployed is reported. Summaries of the experience in establishing efficient and scalable operations to prepare for CMS distributed analysis are presented, followed by the user experience in their current analysis activities. (Journal of Grid Computing manuscript.)

Comparison of the Frontier Distributed Database Caching System to NoSQL Databases

Dykstra

2012

One of the main attractions of non-relational "NoSQL" databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also has high scalability and wide-area distributability for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.