Papers
Topics
Authors
Recent
Search
2000 character limit reached

Advancements in Big Data Processing in the ATLAS and CMS Experiments

Published 8 Mar 2013 in cs.DC, cs.DB, and hep-ex | (1303.1950v1)

Abstract: The ever-increasing volumes of scientific data present new challenges for distributed computing and Grid technologies. The emerging Big Data revolution drives exploration in scientific fields including nanotechnology, astrophysics, high-energy physics, biology and medicine. New initiatives are transforming data-driven scientific fields enabling massive data analysis in new ways. In petascale data processing scientists deal with datasets, not individual files. As a result, a task (comprised of many jobs) became a unit of petascale data processing on the Grid. Splitting of a large data processing task into jobs enabled fine-granularity checkpointing analogous to the splitting of a large file into smaller TCP/IP packets during data transfers. Transferring large data in small packets achieves reliability through automatic re-sending of the dropped TCP/IP packets. Similarly, transient job failures on the Grid can be recovered by automatic re-tries to achieve reliable six sigma production quality in petascale data processing on the Grid. The computing experience of the ATLAS and CMS experiments provides foundation for reliability engineering scaling up Grid technologies for data processing beyond the petascale.

Authors (1)
Citations (2)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.