The Canadian Tier-1 Data Centre is a well-established national and international facility that plays an essential and unique role for a large number of researchers seeking to unlock the secrets of the universe through their participation in the ATLAS experiment at the Large Hadron Collider (LHC) at CERN (European Organization for Nuclear Research).
The ATLAS experiment studies proton-proton collisions from the LHC at the highest energy ever achieved in the laboratory, allowing scientists to probe the fundamental constituents of matter and their interactions. The ATLAS detector records these collisions to search for new particles and phenomena, of which the best known is the Higgs boson. However, the search for the Higgs boson and other new phenomena is complicated by the presence of background events that are 10 billion times more frequent. Discoveries thus require an enormous amount of data collection and detailed analysis. An international network of high-performance computing facilities linked by high-speed networks, the Worldwide LHC Computing Grid (WLCG), stores and processes this data. Ten Tier-1 Data Centres play a central role in the WLCG; one of these is in Canada, the Canadian ATLAS Tier-1 Data Centre, located at Simon Fraser University (SFU). It is maintained and operated 24/7 by dedicated Tier-1 personnel. The Centre provides critical national and international resources to ~150 Canadian and ~3000 international scientists for the storage of ATLAS data, as well as computing capacity for data processing, simulation, and physics analyses.
Following the decision to consolidate high-performance computing in Canada into a small number of data centres managed by Compute Canada, the ATLAS Tier-1 Data Centre was moved from TRIUMF (its original location from 2007 to 2018) to Compute Canada's new state-of-the-art facility at SFU. During the move, the Tier-1 Data Centre underwent a full hardware technology refresh and a significant expansion. Following several major milestones, the relocation was successfully completed on time and on budget in the fall of 2018, with only a few hours' interruption to ATLAS distributed computing operations. The Tier-1 Data Centre at SFU has been in full 24/7 operation since then, staffed by the existing dedicated TRIUMF Tier-1 personnel.
The Canadian Tier-1 Data Centre is run by a consortium of Canadian universities led by SFU. The centre has received $25.6M in capital funds (CFI and BCKDF) and $7.1M for O&M from CFI to date. The Tier-1 Centre is part of Canada's major collaboration in the ATLAS experiment at CERN's Large Hadron Collider (LHC), including significant contributions to the detector design and construction. The ATLAS detector is best known for its central role in the 2012 discovery of the Higgs boson.
The ATLAS detector generates vast amounts of data, about 10 petabytes per year. This is the equivalent of the data storage of 200,000 Blu-ray discs, which if stacked in their cases would form a tower 1200 metres tall. The storage, processing and physics modelling of this data is as crucial to particle physics discovery as the ATLAS detector itself. To manage this huge amount of data, the ATLAS collaboration operates a sophisticated distributed network of ten large-scale Tier-1 computing centres and about 100 smaller Tier-2 facilities used primarily for data analysis and simulations. The ATLAS network is part of the Worldwide LHC Computing Grid (WLCG), the global network that stores, distributes and analyses all LHC data. WLCG is the world's largest and most advanced scientific computing grid.
The Canadian Tier-1 centre is a founding member of the WLCG, and a unique facility in Canada. All of ATLAS' Tier-1 Centres are run by national physics laboratories or other facilities capable of providing the experimental physics culture and infrastructure to support ATLAS' stringent 24/7 data storage, distribution, reprocessing and security requirements. Since coming fully online in 2007, the Canadian Tier-1 centre, supported by ten highly qualified TRIUMF staff, has performed with close to 100 percent uptime, supplying fully ten percent of ATLAS' global storage and computational Tier-1 resources. This exceptional performance has enabled the centre to provide additional capacity during critical times in ATLAS' science program, including the data reprocessing and modelling that enabled the 2012 confirmation of the discovery of the Higgs boson.
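The disc comparison above can be checked with a few lines of arithmetic. The sketch below is only a rough illustration: the 50 GB disc capacity and the 6 mm case thickness are assumptions chosen because they reproduce the quoted figures, and are not stated in the text.

```python
# Back-of-the-envelope check of the disc comparison.
# Assumptions (not stated in the text): decimal units, a 50 GB
# dual-layer disc, and a case about 6 mm thick.
GB_PER_PB = 1_000_000            # 1 PB = 10^6 GB in decimal units

annual_data_pb = 10              # from the text: about 10 PB per year
disc_capacity_gb = 50            # assumed capacity of one disc
case_thickness_mm = 6            # assumed thickness of one disc case

discs_needed = annual_data_pb * GB_PER_PB // disc_capacity_gb
tower_height_m = discs_needed * case_thickness_mm / 1000

print(f"{discs_needed:,} discs, tower of about {tower_height_m:,.0f} m")
```

With these assumed values the result matches the 200,000 discs and 1200 metres quoted above; a different disc capacity or case thickness would change both numbers proportionally.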
How It Works
ATLAS' success depends on its highly structured, secure and fault-tolerant globally distributed computing system for capturing, processing and analyzing ATLAS data. As a core part of this network, the Canadian ATLAS Tier-1 centre is a large-scale, data-intensive facility that is maintained 24/7, with at least one highly qualified staff member on call at all times.
Light Path to CERN
The Canadian Tier-1 centre is connected to CERN via dedicated high-speed, high-bandwidth light fibre links provided by CANARIE, Canada's advanced research network. In 2001, TRIUMF and CANARIE were involved in one of the first transatlantic light path tests to CERN, demonstrating the potential for operating ATLAS computing as a globally distributed network.
Distributed ATLAS Software
ATLAS has an online system of distributed, grid-based software, and the TRIUMF group provides user support to ATLAS as a whole and to members of the ATLAS-Canada collaboration. TRIUMF's user-support specialists developed high-level tools for the CERN Virtual Machine File System (CVMFS) and played a key role in its deployment and wide-scale adoption. This provides ATLAS' thousands of scientists with easier, more efficient access to, and use of, ATLAS' complex online analysis software, and ensures that the entire collaboration works with validated software configurations.
A Tiered Approach to Big Data
ATLAS' computing demand is always increasing, whether for long-term data storage or the reprocessing of ever greater amounts of raw data. The TRIUMF-ATLAS Tier-1 Centre has grown from an initial 112 cores (each the equivalent of a powerful desktop computer) and about 10 terabytes of storage in 2006, to 7700 cores, 11 petabytes of disk and 31 petabytes of tape storage, making it one of the largest computing capacities in Canada dedicated to a single scientific project.
The Canadian Tier-1 centre's capacity will continue to grow significantly in the coming years. The larger Compute Canada location at Simon Fraser University will provide the TRIUMF/SFU-operated Tier-1 centre with cost efficiencies, infrastructure support and room to grow as ATLAS prepares for its High Luminosity Era, when the experiment will generate ten times as much data.
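To put that growth in perspective, the figures quoted above can be turned into simple growth factors. This is a rough sketch only: it assumes decimal units and compares the initial storage figure against the combined disk and tape capacity, which the text does not explicitly do.

```python
# Growth of the Tier-1 Centre, using the figures quoted in the text.
TB_PER_PB = 1000                      # decimal units, an assumption

cores_2006, cores_now = 112, 7700
storage_2006_tb = 10                  # "about 10 terabytes" in 2006
disk_now_pb, tape_now_pb = 11, 31

core_growth = cores_now / cores_2006
# Assumption: compare 2006 storage with today's disk plus tape total.
total_storage_now_tb = (disk_now_pb + tape_now_pb) * TB_PER_PB
storage_growth = total_storage_now_tb / storage_2006_tb

print(f"cores: x{core_growth:.0f}, storage: x{storage_growth:,.0f}")
```

Under these assumptions the core count has grown roughly 69-fold and total storage roughly 4,200-fold.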
ATLAS' global, distributed computing network meets the detector's computational needs by operating in a nested, tiered fashion, with different roles and responsibilities for each layer.
Tier-0: Located at CERN, the Tier-0 centre is the hub-of-the-wheel in the ATLAS computing network, collecting and distributing ATLAS raw data 24/7 when the experiment is running. CERN keeps the primary copy of all ATLAS raw data, and a secondary copy is distributed among the Tier-1 centres.
Tier-1: Ten Tier-1 centres operate 24/7, receiving raw ATLAS data for storage and reprocessing it with the latest calibrations and reconstruction algorithms. Data from the ATLAS detector is collected as electronic signals, and the Tier-1 centres use algorithms to reconstruct the raw data as particles with particular trajectories and energies. This reprocessed data is distributed back to CERN, to the other Tier-1 centres, and to Tier-2 centres worldwide.
Tier-2: About 100 ATLAS Tier-2 centres, most based at universities, are the primary sites for ATLAS-team scientists to model and analyze the data. In Canada, Compute Canada provides resources for university-based Tier-2 centres. The TRIUMF Tier-1 centre provides support and expert knowledge for the Canadian Tier-2 centres, and TRIUMF scientists have been responsible for coordinating ATLAS' distributed computing efforts in Canada since the beginning of the project.
Tier-1 centres also provide significant computing resources for large-scale simulations to supplement the capacity of Tier-2 centres. These simulations involve billions of events in order to yield statistically reliable results and to properly model all of the physics processes needed to make discoveries. The Tier-1 centres also store all ATLAS data indefinitely, providing built-in redundancy, the equivalent of external hard drives for a personal computer.
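The tiered model described above can be summarised as a small data structure. The sketch below is purely illustrative: the `Tier` class, the `TIERS` table and the `tiers_with_role` helper are hypothetical names invented here, the role strings paraphrase the text, and none of this reflects actual ATLAS or WLCG software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    site_count: int             # approximate number of centres, from the text
    roles: tuple[str, ...]      # responsibilities, paraphrased from the text
    sends_to: tuple[str, ...]   # tiers that receive this tier's data


# Hypothetical encoding of the three tiers described in the text.
TIERS = {
    "Tier-0": Tier(
        "Tier-0", 1,
        ("collect raw data", "keep primary copy of raw data"),
        ("Tier-1",),
    ),
    "Tier-1": Tier(
        "Tier-1", 10,
        ("store secondary copy of raw data", "reprocess raw data",
         "large-scale simulation"),
        ("Tier-0", "Tier-1", "Tier-2"),
    ),
    "Tier-2": Tier(
        "Tier-2", 100,
        ("analysis", "simulation"),
        (),
    ),
}


def tiers_with_role(keyword: str) -> list[str]:
    """Return the names of tiers having a role that mentions `keyword`."""
    return [
        tier.name
        for tier in TIERS.values()
        if any(keyword in role for role in tier.roles)
    ]


print(tiers_with_role("raw data"))
print(tiers_with_role("simulation"))
```

Querying for "raw data" returns Tier-0 and Tier-1, while "simulation" returns Tier-1 and Tier-2, mirroring the division of labour in the text: raw data lives at the top two tiers, and simulation is shared between the lower two.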