The LHC Will Revolutionize Physics. Can it Revolutionize the Internet Too?

by Ian O'Neill on September 4, 2008

One gigabyte per second? No problem. The LHC computing grid could revolutionize how we handle data over the Internet (CERN)


We already know that the Large Hadron Collider (LHC) will be the biggest, most expensive physics experiment ever carried out by mankind. Colliding relativistic particles at energies previously unimaginable (up to the 14 TeV mark by the end of the decade) will generate millions of particles, both known and yet to be discovered, that need to be tracked and characterized by huge particle detectors. This historic experiment will require a massive data collection and storage effort, re-writing the rules of data handling. Every five seconds, LHC collisions will generate the equivalent of a DVD-worth of data: a data production rate of roughly one gigabyte per second. To put this into perspective, an average household computer with a very good connection may be able to download data at a rate of one or two megabytes per second (if you are very lucky! I get 500 kilobytes/second), some 500 to 1,000 times slower. So, LHC engineers have designed a new kind of data handling method that can store and distribute petabytes (millions of gigabytes) of data to LHC collaborators worldwide (without getting old and grey whilst waiting for a download).
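For the curious, the arithmetic behind those numbers can be checked in a few lines of Python. This is only a back-of-envelope sketch: the 4.7 GB DVD capacity and the use of decimal units are my assumptions, not figures from CERN, and the function names are made up for illustration.

```python
# Back-of-envelope check of the data rates quoted above.
# Assumptions (not from the article): a single-layer DVD holds 4.7 GB,
# and all units are decimal (1 GB = 1000 MB).

DVD_GB = 4.7            # assumed DVD capacity, in gigabytes
SECONDS_PER_DVD = 5     # "every five seconds" from the article
HOME_MB_PER_S = 1.0     # low end of "one or two megabytes per second"


def lhc_rate_gb_per_s(dvd_gb=DVD_GB, seconds=SECONDS_PER_DVD):
    """Data production rate implied by one DVD every `seconds` seconds."""
    return dvd_gb / seconds


def home_download_seconds(data_gb, home_mb_per_s=HOME_MB_PER_S):
    """Seconds a home connection needs to download `data_gb` gigabytes."""
    return data_gb * 1000 / home_mb_per_s


rate = lhc_rate_gb_per_s()
print(f"Implied LHC rate: {rate:.2f} GB/s")

# How long a home connection needs to fetch one second of LHC output.
minutes = home_download_seconds(rate) / 60
print(f"Home download time for 1 s of LHC data: {minutes:.1f} minutes")
```

Under these assumptions, a single second of LHC output would keep a good home connection busy for around a quarter of an hour.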

In 1990, the European Organization for Nuclear Research (CERN) revolutionized the way in which we live. The previous year, Tim Berners-Lee, a CERN physicist, wrote a proposal for electronic information management. He put forward the idea that information could be transferred easily over the Internet using something called “hypertext.” As time went on, Berners-Lee and collaborator Robert Cailliau, a systems engineer also at CERN, pieced together a single information network to help CERN scientists collaborate and share information from their personal computers without having to save it on cumbersome storage devices. Hypertext enabled users to browse and share text via web pages using hyperlinks. Berners-Lee then went on to create a browser-editor and soon realised this new form of communication could be shared by vast numbers of people. By May 1990, the CERN scientists called this new collaborative network the World Wide Web. In fact, CERN was responsible for the world’s first website, http://info.cern.ch/, and an early example of what this site looked like can be found via the World Wide Web Consortium website.

So CERN is no stranger to managing data over the Internet, but the brand new LHC will require special treatment. As highlighted by David Bader, executive director of high performance computing at the Georgia Institute of Technology, the current bandwidth allowed by the Internet is a huge bottleneck, making other forms of data sharing more desirable. “If I look at the LHC and what it’s doing for the future, the one thing that the Web hasn’t been able to do is manage a phenomenal wealth of data,” he said, meaning that it is easier to save large datasets on terabyte hard drives and then send them in the post to collaborators. Although CERN had addressed the collaborative nature of data sharing on the World Wide Web, the data the LHC will generate will easily overload the small bandwidths currently available.

How the LHC Computing Grid works (CERN/Scientific American)

This is why the LHC Computing Grid was designed. The grid handles vast LHC dataset production in tiers. The first (Tier 0) is located on-site at CERN near Geneva, Switzerland. Tier 0 consists of a huge parallel computer network containing 100,000 advanced CPUs that have been set up to immediately store and manage the raw data (1s and 0s of binary code) pumped out by the LHC. It is worth noting at this point that not all the particle collisions will be detected by the sensors; only a very small fraction can be captured. Even so, this comparatively small number of detected particles still translates into huge output.

Tier 0 distributes portions of this output, blasting it through dedicated 10 gigabit-per-second fibre optic lines to 11 Tier 1 sites across North America, Asia and Europe. This allows collaborators such as the Relativistic Heavy Ion Collider (RHIC) at the Brookhaven National Laboratory in New York to analyse data from the ALICE experiment, comparing results from the LHC lead ion collisions with their own heavy ion collision results.
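It is easy to trip over bits and bytes here, so a quick sketch may help. A 10 gigabit-per-second line carries 1.25 gigabytes per second at best. The snippet below converts between the two and estimates how long one terabyte would take to cross such a link. It assumes decimal units and an ideal line with no protocol overhead, so real transfers would be slower; the function names are purely illustrative.

```python
# Rough link-capacity arithmetic for a dedicated fibre line.
# Assumptions (mine, not CERN's): decimal units, 8 bits per byte,
# and an ideal link with no protocol overhead or contention.

def link_gbytes_per_s(gigabits_per_s):
    """Convert a line rate in gigabits/s to gigabytes/s."""
    return gigabits_per_s / 8


def transfer_hours(data_tb, gigabits_per_s):
    """Hours needed to move `data_tb` terabytes over the link."""
    seconds = data_tb * 1000 / link_gbytes_per_s(gigabits_per_s)
    return seconds / 3600


print(link_gbytes_per_s(10))                 # 1.25
print(f"{transfer_hours(1, 10) * 60:.1f} minutes per terabyte")
```

In other words, one of these lines running flat out is only just fast enough to keep pace with a data stream of about one gigabyte per second.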

From the Tier 1 international computers, datasets are packaged and sent to 140 Tier 2 computer networks located at universities, laboratories and private companies around the world. It is at this point that scientists will have access to the datasets to perform the conversion from the raw binary code into usable information about particle energies and trajectories.
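The fan-out of the three tiers can be pictured as a simple tree. The toy model below is only a sketch: the site names are invented, and handing out Tier 2 sites to Tier 1 sites in round-robin order is my assumption for illustration, not how the real grid assigns them.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One computing site in the toy grid (names are hypothetical)."""
    name: str
    tier: int
    children: list = field(default_factory=list)


def build_grid(n_tier1=11, n_tier2=140):
    """Build Tier 0 -> Tier 1 -> Tier 2 using the article's site counts."""
    root = Site("CERN", 0)
    root.children = [Site(f"T1-{i}", 1) for i in range(n_tier1)]
    # Assumption: Tier 2 sites are attached to Tier 1 sites round-robin.
    for j in range(n_tier2):
        parent = root.children[j % n_tier1]
        parent.children.append(Site(f"T2-{j}", 2))
    return root


def count_sites(site):
    """Walk the tree and count how many sites sit in each tier."""
    counts = {}
    stack = [site]
    while stack:
        current = stack.pop()
        counts[current.tier] = counts.get(current.tier, 0) + 1
        stack.extend(current.children)
    return counts


grid = build_grid()
print(count_sites(grid))
```

With 140 Tier 2 sites shared among 11 Tier 1 sites, each Tier 1 site in this toy model feeds 12 or 13 downstream networks.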

The tier system is all well and good, but it wouldn’t work without a highly efficient type of software called “middleware.” When trying to access data, the user may want information that is spread throughout the petabytes of data on different servers in different formats. An open-source middleware platform called Globus will have the huge responsibility of gathering the required information seamlessly, as if that information were already sitting inside the researcher’s computer.
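To give a flavour of what middleware does, here is a toy catalogue that hides where the pieces of a dataset live and what format each piece is in. To be clear, this is not the Globus API: every class, function, site and dataset name below is hypothetical, and short in-memory strings stand in for remote servers.

```python
import csv
import io
import json


def parse_csv(payload):
    """Read energies from CSV text with an 'energy_gev' column."""
    reader = csv.DictReader(io.StringIO(payload))
    return [float(row["energy_gev"]) for row in reader]


def parse_json(payload):
    """Read energies from a JSON list of {'energy_gev': ...} objects."""
    return [float(item["energy_gev"]) for item in json.loads(payload)]


class ToyCatalog:
    """Hypothetical stand-in for middleware: maps a dataset to its pieces."""

    def __init__(self):
        self._pieces = {}

    def register(self, dataset, site, parser, payload):
        """Record that `site` holds one piece of `dataset` in some format."""
        self._pieces.setdefault(dataset, []).append((site, parser, payload))

    def fetch(self, dataset):
        """Gather every piece and return one uniform list of energies."""
        if dataset not in self._pieces:
            raise KeyError(f"unknown dataset: {dataset}")
        energies = []
        for _site, parser, payload in self._pieces[dataset]:
            energies.extend(parser(payload))
        return energies


catalog = ToyCatalog()
catalog.register("run-001", "site-a", parse_csv, "energy_gev\n45.2\n91.1\n")
catalog.register("run-001", "site-b", parse_json, '[{"energy_gev": 125.0}]')
print(catalog.fetch("run-001"))  # [45.2, 91.1, 125.0]
```

The researcher calling `fetch` never sees which site held which piece or how it was stored, which is the "seamless" part of the job.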

It is this combination of the tier system, fast connections and ingenious software that could be expanded beyond the LHC project. In a world where everything is becoming “on demand,” this kind of technology could make the Internet transparent to the end user. There would be instant access to everything, from data produced by experiments on the other side of the planet to high definition movies viewed without waiting for the download progress bar. Much like Berners-Lee’s invention of HTML, the LHC Computing Grid may revolutionize how we use the Internet.

Sources: Scientific American, CERN
