We live in a world in which data are generated in ever-increasing quantities, and the information derived from them has become a main driving force for scientific discovery and innovation. We can fully exploit these vast amounts of data only if we have the methodologies to store and process them and to turn them, through analysis and modeling, into valuable and accessible information that leads to understanding and provides a basis for informed decisions. Society is becoming increasingly reliant on data and on the tools and methods used to acquire and analyze them.

Important sources of data that are only starting to be explored include social media, online full-text scientific literature, online video material, online click and interaction patterns, financial transactions, customer behavior, sensors, and scientific instrumentation. Being able to employ such data will improve the efficiency of our government, the quality of our health care, and the effectiveness of our military; it will steer innovation in business and, last but not least, catalyze new discoveries in science.

We view data science as a complex interplay of three processes, all driven by data.

Each of these components, including the data themselves, relies on various scientific disciplines. The groups in the DSRC have an excellent scientific track record in these disciplines. In the DSRC we study how these disciplines can be further elaborated and integrated in a data science context.