Interest in scalable data processing solutions based on the
Apache Hadoop ecosystem is growing steadily in the High Energy Physics
(HEP) community. This drives the need for increased reliability and availability
of the central Hadoop service and underlying infrastructure provided to the
community by the CERN IT department. This paper reports on the overall status
of the Hadoop platform and the related Hadoop and Spark service at CERN,
detailing recent enhancements and features introduced in areas including
service configuration, availability, alerting, monitoring, and data protection,
to meet the new requirements posed by the user community.