The LHC And The Largest Computing Grid Ever
The Large Hadron Collider will soon begin its second multi-year run. The discovery of the Higgs boson, and the discoveries the LHC may yet make, are certainly fascinating, yet few people know about the remarkable computing resources at the LHC's disposal, which were as integral to the achievement as anything else.
Like most people, you probably haven't given much thought to what it took, from a computing perspective, to discover the Higgs boson in 2012. Perhaps physicists used one super-duper computer to do all the heavy lifting. Maybe it was a bunch of these networked together. In actuality, it took the world's largest distributed computing grid to pull it off: the Worldwide LHC Computing Grid (WLCG).
Even back in the 1990s, when the LHC was being designed and built, the engineers realized that the copious amounts of data it would produce would require computing capacity far beyond what CERN alone could manage and process. How much data? Well, imagine protons traveling at 99.9999991% of the speed of light, circling the LHC's 27 km underground tunnel 11,000 times per second. Wrapped around the beam, like cocoons, are the four main experiment detectors (ATLAS and CMS, for example). This setup can produce an astounding 600 million collisions per second, each of which has to be vetted to determine which are ho-hum, been-there-seen-that and which could spawn the next revolution in physics.
As you probably suspect, the really interesting collisions are quite rare. The CERN computers sift through them and flag only about one in 10,000 as deserving to be stored for a deeper look. Of these, only one in 100 merits even more serious attention. That means only one in a million collisions is of any interest at all. The expression "needle in a haystack" comes to mind. I don't know how many pieces of hay are in a haystack (I actually tried to find out... and failed). A million pieces of hay may not make a gargantuan haystack, but remember: a new haystack is being made every second of operation.
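The arithmetic of that two-stage filtering can be sketched in a few lines (the rates are the figures quoted above, not official trigger specifications):

```python
# Back-of-the-envelope sketch of the two-stage filtering described above.
# Rates are the article's round numbers, not official trigger specs.

COLLISIONS_PER_SECOND = 600_000_000  # raw collision rate
FIRST_PASS_KEEP = 1 / 10_000         # fraction flagged and stored for a deeper look
SECOND_PASS_KEEP = 1 / 100           # fraction of those meriting serious attention

stored = COLLISIONS_PER_SECOND * FIRST_PASS_KEEP
interesting = stored * SECOND_PASS_KEEP

print(f"Stored per second:      {stored:,.0f}")       # 60,000
print(f"Interesting per second: {interesting:,.0f}")  # 600
print(f"Overall keep rate: 1 in {COLLISIONS_PER_SECOND / interesting:,.0f}")  # 1 in 1,000,000
```

Even after throwing away 999,999 of every million collisions, the grid still has hundreds of keepers arriving every second.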
The Worldwide LHC Computing Grid is a networked collaboration of computer centers around the globe: 174 facilities in 40 countries. CERN's data is shuttled through concentric rings of these datacenters; each conceptual ring is referred to as a Tier. In its entirety, the grid includes over a quarter of a million processor cores and over 400 petabytes of tape and disk storage. In its four years of operation, it has gathered and pored over an amazing 60 million gigabytes (60 petabytes) of data.
CERN's local computer network, called Tier 0, consists of 73,000 processor cores. It takes the raw data from the detectors, aggregates it in its datacenter, copies it to long-term tape storage, and then distributes it to Tier 1.
Tier 1 consists of about 15 computer centers located in the US, the UK, Italy, Spain, and elsewhere. They serve as critical backup and analysis facilities for CERN, as well as data re-processing centers whenever required.
Each Tier 1 center is connected to one or more of the 160 Tier 2 sites. These computer centers were originally slated for data analysis and simulations, but they have also begun to handle some data re-processing. In addition, Tier 2 has proved very valuable for absorbing unanticipated peak loads.
Finally, the grid also includes Tier 3 nodes, which range from small university computer clusters down to single PCs that are regularly authorized to use the grid.
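The tiered fan-out described above can be modeled as a simple tree: data enters at Tier 0 and cascades outward ring by ring. This toy sketch uses made-up site names and counts purely for illustration; it is not the real WLCG topology:

```python
# Toy model of the WLCG's tiered fan-out described above.
# Site names and the shape of the tree are illustrative only.

from dataclasses import dataclass, field

@dataclass
class Site:
    name: str
    tier: int
    children: list = field(default_factory=list)

    def distribute(self, dataset: str) -> list:
        """Fan a dataset out from this site to every downstream site."""
        deliveries = [f"{dataset} -> {self.name} (Tier {self.tier})"]
        for child in self.children:
            deliveries.extend(child.distribute(dataset))
        return deliveries

# Tier 0 feeds a national Tier 1 center, which feeds university Tier 2 sites.
tier2 = [Site("uni-cluster-a", 2), Site("uni-cluster-b", 2)]
tier1 = Site("national-center", 1, children=tier2)
tier0 = Site("CERN", 0, children=[tier1])

for line in tier0.distribute("raw-dataset-001"):
    print(line)
```

The real grid repeats this pattern at scale: one Tier 0, about 15 Tier 1 centers, and some 160 Tier 2 sites beneath them.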
This grid can be accessed by any of the 10,000 scientists authorized to use it, day or night, 365 days a year. As a web engineer involved in supporting servers that need to be online all the time, I can appreciate the tremendous effort required not only to provide uninterrupted access to a huge community, but to do so through the inevitable software and hardware problems and upgrades.
My description of the WLCG thus far has covered Run 1 of the LHC, which ended in 2012 with the announcement of the discovery of the Higgs boson. The scientists and engineers have not been goofing off since then (OK, maybe a few of them have). In preparation for Run 2 of the LHC (set to begin in the spring of 2015), the engineers have installed powerful new electromagnets designed to bring its twin proton beams to a combined collision energy of 13 teraelectronvolts (TeV). This 5 TeV increase will not only roughly double the collision energy of Run 1, it will also produce almost three times as many collisions.
To handle this vast increase in data production, the following major projects have been implemented:
- A 30% increase in CPU capacity and data storage.
- Adoption of multi-core processors, with 2, 4, or 8 cores in one package.
- A rewrite of much of the code to run multi-threaded, so the work can spread across the new processors' cores.
- The addition of a new data-access protocol (XRootD) to transfer huge datasets efficiently.
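To give a flavor of the multi-threading rewrite in the list above, here is a minimal sketch (emphatically not CERN's actual software) of the kind of change involved: moving from processing events one at a time to farming independent events out across a pool of workers that can keep every core busy.

```python
# Illustrative sketch, NOT CERN's actual code: the shift from serial
# event processing to a worker pool that uses all available cores.
# Each collision event is independent, which is what makes this easy.

from concurrent.futures import ProcessPoolExecutor

def reconstruct(event: int) -> int:
    """Stand-in for per-collision reconstruction work."""
    return sum(i * i for i in range(event % 1000))

def process_serial(events):
    # The old pattern: one event at a time on one core.
    return [reconstruct(e) for e in events]

def process_parallel(events, workers=4):
    # The new pattern: independent events fanned out to separate
    # processes, with no shared state to coordinate.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct, events, chunksize=256))

if __name__ == "__main__":
    events = list(range(10_000))
    assert process_serial(events) == process_parallel(events)
    print("serial and parallel results agree")
```

Because each collision is physically independent of the next, this workload parallelizes cleanly; the hard part, as the article notes, was rewriting a large existing codebase to exploit that.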
After tireless work over the past few years, the Worldwide LHC Computing Grid is now ready to handle the extra demands that will be placed on it as the LHC makes a bold attempt to eclipse its historic, Nobel-prize-winning discovery of 2012. We could conceivably make ground-breaking discoveries about supersymmetry, magnetic monopoles, even new dimensions of existence.
I really hope we do, even if it makes most people forget about the wondrous computing grid that made it all possible.