68 Communications of the ACM | June 2010 | Vol. 53 | No. 6
Contributed Articles
Data-oriented scientific processes depend on fast, accurate analysis of experimental data generated through empirical observation and simulation. However, scientists are increasingly overwhelmed by the volume of data produced by their own experiments. With improving instrument precision and the complexity of the simulated models, data overload promises to only get worse. The inefficiency of existing database management systems (DBMSs) for addressing the requirements of scientists has led to many application-specific systems. Unlike their general-purpose counterparts, these systems require more resources, hindering reuse of knowledge. Still, the data-management community aspires to general-purpose scientific data management. Here, we explore the most important requirements of such systems and the techniques being used to address them.
Observation and simulation of phenomena are keys for proving scientific theories and discovering facts of nature the human brain could otherwise never imagine. Scientists must be able to manage data derived from observations and simulations. Constant improvement of observational instruments and simulation tools gives modern science effective options for abundant information capture, reflecting the rich diversity of complex life forms and cosmic phenomena. Moreover, the need for in-depth analysis of huge amounts of data relentlessly drives demand for additional computational support.
Microsoft researcher and ACM Turing Award laureate Jim Gray once said, "A fourth data-intensive science is emerging. 
The goal is to have a world in which all of the science literature is online, all the science data is online, and they interoperate with each other."9 Unfortunately, today's commercial data-management tools are incapable of supporting the unprecedented scale, rate, and complexity of scientific data collection and processing.
Despite its variety, scientific data does share some common features:
Needed are generic, rather than one-off, DBMS solutions automating storage and analysis of data from scientific collaborations.
By Anastasia Ailamaki, Verena Kantere, and Debabrata Dash
Managing Scientific Data
Key insights
Managing the enormous amount of scientific data being collected is the key to scientific progress.
Though technology allows for the extreme collection rates of scientific data, processing is still performed with stale techniques developed for small data sets; efficient processing is necessary to be able to exploit the value of huge scientific data collections.
Proposed solutions also promise to achieve efficient management for almost any other kind of data.
Lack of complete solutions using commercial DBMSs has led scientists in all fields to develop or adopt application-specific solutions, though some have been added on top of commercial DBMSs; for example, the Sloan Digital Sky Survey (SDSS-1 and SDSS-2; http://www.sdss.org/) uses SQL Server as its backend. Moreover, the resulting sof...