Spectra TFinity Tape Library Manages Supercomputing to the Stars
Australian Square Kilometre Array Pathfinder (ASKAP)
Established 14 years ago, Pawsey Supercomputing Centre in Perth, Western Australia, is an unincorporated joint venture in which the federal and state governments, four universities, the CSIRO and collaborating organisations work together in a scientific research consortium. Researchers and radio astronomers from these organisations and institutions are using two custom-built radio telescopes, also in WA, to investigate the origin of the universe and other astronomical events.
The telescopes – the Australian Square Kilometre Array Pathfinder (ASKAP) and the Murchison Widefield Array (MWA) – are used as scientific demonstrators for the design of the Square Kilometre Array (SKA), which will be the world’s largest radio telescope.
Observations from these two precursor telescopes generate immense amounts of data, requiring large-scale storage and a supercomputing data system accessible to astronomy researchers. Storing and managing this data represents a substantial challenge, handled through a hierarchical storage management (HSM) system, which Pawsey Supercomputing Centre oversees.
Pawsey Supercomputing Centre in Perth, WA
This big data needs a sophisticated storage facility to host it, and it must also be accessible to the institutions, radio astronomers and researchers involved. The same facility also requires high-capacity, scalable data storage for the future SKA project, which is estimated to capture 10 times more data than current global internet traffic.
Data Collection
The MWA and ASKAP radio telescopes are located at the Murchison Radio Observatory (MRO), which sits 850 km north of Perth in a radio-quiet zone. The MWA telescope comprises more than 2,000 dipole antennas, distributed over 7 sq km of the Murchison outback. Nearby, the CSIRO-operated ASKAP telescope, a collection of 36 dish antennas, rapidly surveys the sky using Phased Array Feeds.
Murchison Widefield Array (MWA)
Spectra TFinity tape library
The telescopes collect radio waves from distant regions of the universe, looking for evidence of the origins of galaxies and investigating how stars and planets formed after the Big Bang, which researchers currently believe created the universe.
The MWA produces roughly 60 GB of data per second, which is about 8,700 times faster than the Australian Internet can download. This data is processed and reduced by systems housed at the MRO, then streamed down to Pawsey Supercomputing Centre in Perth. “In just one day we get around 45 terabytes from MWA,” said Paul Newman, HSM specialist at Pawsey Supercomputing Centre. “It’s expected that ASKAP produces around 14 terabytes a day. They will be streaming about 400 megabytes a second from their observations. That’s equivalent to about one standard DVD every 30 seconds.”
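The figures quoted above can be put on a common footing with a little arithmetic. The following is a back-of-envelope sketch only, and it rests on assumptions that are not in the article: decimal units (1 TB = 10^12 bytes), a single-layer DVD of 4.7 GB, and daily volumes spread evenly over 24 hours.

```python
# Back-of-envelope check of the data rates quoted in the article.
# Assumptions (not from the article): decimal units, a 4.7 GB
# single-layer DVD, and daily volumes averaged over a full 24 hours.

SECONDS_PER_DAY = 24 * 60 * 60
MB, GB, TB = 1e6, 1e9, 1e12
DVD_BYTES = 4.7 * GB  # assumed capacity of a "standard DVD"


def average_rate(bytes_per_day: float) -> float:
    """Average sustained rate in bytes per second for a daily volume."""
    return bytes_per_day / SECONDS_PER_DAY


def seconds_per_dvd(bytes_per_second: float) -> float:
    """Time needed to fill one assumed DVD at the given rate."""
    return DVD_BYTES / bytes_per_second


# Figures quoted in the article.
mwa_raw_rate = 60 * GB           # raw MWA output per second
mwa_archived_per_day = 45 * TB   # MWA volume arriving at Pawsey per day
askap_per_day = 14 * TB          # expected ASKAP volume per day
askap_stream_rate = 400 * MB     # quoted ASKAP streaming rate per second

# How much the on-site processing shrinks the raw MWA stream.
mwa_raw_per_day = mwa_raw_rate * SECONDS_PER_DAY
reduction_factor = mwa_raw_per_day / mwa_archived_per_day

print(f"MWA raw volume per day:   {mwa_raw_per_day / TB:,.0f} TB")
print(f"MWA reduction factor:     {reduction_factor:.0f}x")
print(f"MWA average archive rate: {average_rate(mwa_archived_per_day) / MB:.0f} MB/s")
print(f"ASKAP average rate:       {average_rate(askap_per_day) / MB:.0f} MB/s")
print(f"DVD at ASKAP average:     {seconds_per_dvd(average_rate(askap_per_day)):.1f} s")
print(f"DVD at 400 MB/s:          {seconds_per_dvd(askap_stream_rate):.1f} s")
```

Under these assumptions, 14 terabytes a day averages out to roughly one DVD every 29 seconds, which lines up with the DVD comparison in the quote. The 400 megabytes a second figure is higher than that daily average, so it may describe the rate while observations are streaming rather than a round-the-clock average.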
Layered Storage and Tape Libraries
The Centre's approach was to design its data facilities as a layered storage system. After the telescopes collect the data on-site, it passes through a dedicated network of 10-gigabit optical fibre links and streams down to the Pawsey Centre.
Pawsey has two 12-frame Spectra TFinity tape libraries with 32 IBM TS1150 tape drives per library. Researchers have access to a massive 40 petabytes of data across the two tape libraries in an active archive environment. “After data arrives at the MWA disk cache at Pawsey, it is transferred onto the Centre's infrastructure through a data migration facility (DMF), which sets up the multi-layered hierarchical storage management. The DMF migrates data off primary disk and writes it to two copies on tape, one copy in each library, then restores it back to disk automatically when a researcher requires it,” said Paul.
“Pawsey’s TFinity tape library is a core part of the HSM system. The hierarchy has tiers of storage, each including one tape tier and one disk. The data moves between tiers as required to make room for new data. Our data will always stay on tape, but it can be called into the other servers if a researcher needs a particular file.”
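The behaviour described here (two tape copies, one per library; disk space freed to make room for new data; automatic recall when a file is requested) can be illustrated with a toy model. This is a minimal sketch and not how DMF is implemented or configured: the class names `TapeLibrary` and `ToyHSM`, the oldest-first eviction rule, and the choice to write the tape copies at ingest time are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TapeLibrary:
    """Hypothetical stand-in for one tape library: a name and its files."""
    name: str
    files: dict = field(default_factory=dict)

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def read(self, path: str) -> bytes:
        return self.files[path]


@dataclass
class ToyHSM:
    """Toy hierarchical storage: a small disk cache over two tape libraries."""
    libraries: tuple          # the two TapeLibrary instances
    disk_capacity: int        # disk cache size in bytes
    disk: dict = field(default_factory=dict)  # insertion order = oldest first

    def disk_used(self) -> int:
        return sum(len(data) for data in self.disk.values())

    def ingest(self, path: str, data: bytes) -> None:
        # Simplification: write one copy to each library straight away,
        # so the data always exists on tape before any disk copy is dropped.
        for library in self.libraries:
            library.write(path, data)
        self._make_room(len(data))
        self.disk[path] = data

    def read(self, path: str) -> bytes:
        # If the disk copy was released, recall the file from tape first.
        if path not in self.disk:
            data = self._recall(path)
            self._make_room(len(data))
            self.disk[path] = data
        return self.disk[path]

    def _make_room(self, needed: int) -> None:
        # Assumed policy: release the oldest disk copies until the file fits.
        while self.disk and self.disk_used() + needed > self.disk_capacity:
            oldest = next(iter(self.disk))
            del self.disk[oldest]

    def _recall(self, path: str) -> bytes:
        # Either library can serve the recall, since each holds a copy.
        for library in self.libraries:
            if path in library.files:
                return library.read(path)
        raise FileNotFoundError(path)


# Usage: a 10-byte disk cache and two 8-byte files.
hsm = ToyHSM(
    libraries=(TapeLibrary("library-A"), TapeLibrary("library-B")),
    disk_capacity=10,
)
hsm.ingest("obs1.dat", b"12345678")
hsm.ingest("obs2.dat", b"abcdefgh")   # obs1.dat is released from disk
on_disk_after_ingest = list(hsm.disk)
recalled = hsm.read("obs1.dat")       # recalled from tape; obs2.dat released
on_disk_after_recall = list(hsm.disk)
```

In the usage example, both files remain on both tape libraries throughout, while the disk cache only ever holds whichever file was needed most recently.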
Expansion for SKA
The data management at Pawsey is comprehensive and gives researchers straightforward access for scientific analysis. Equally important, the storage capacity of the TFinity libraries will be relatively easy to increase when the SKA begins producing data. From the current capacity of 40 petabytes, Pawsey can expand its TFinity storage up to 100 petabytes once the SKA telescope is operational.
Dipole antenna in the MWA.
Paul said, “Tape libraries generally are very expandable and can be upgraded with denser media. This flexibility matches our current and future uses for data storage. The ASKAP and MWA were planned as part of a five-year project, so it is quite likely that the Spectra TFinity tape library will still be here at the end of that and possibly beyond - for up to 10 years. This Pawsey infrastructure is integral to the SKA project because it will continue to store all the radio astronomy data for the precursor projects. Our data will then be called on for the construction and implementation of the international SKA telescope.” www.spectralogic.com