The CI team behind Swift likes to describe the philosophy of the parallel scripting language as "programming in the large." The idea is that Swift lives one level above a project's core family of applications, coordinating data inputs and outputs and allocating tasks according to whatever resources are available. This directorial ability means researchers can use Swift for many kinds of science, from analyzing materials in real time on Argonne's Advanced Photon Source to finding genetic signatures for cancer.
In an article for the Software Sustainability Institute, Daniel S. Katz, a CI senior fellow and one of the original developers of Swift, describes how the language is used for yet another type of research project: creating a high-resolution model of the Arctic. By processing satellite imagery into digital models, the ArcticDEM team is building the most detailed topographic maps of the region north of 60 degrees latitude, including all of Greenland and Alaska. This work is computationally intensive, requiring techniques that assemble overlapping satellite images into cohesive elevation models.
The ArcticDEM project needed a way to bundle over 500,000 single-node tasks into a smaller set of 100- to 1000-node jobs. Swift manages the grouping of the ArcticDEM tasks and also handles the queue of unfinished tasks, feeding new tasks to the running jobs as earlier tasks finish. As a result, the ArcticDEM project's jobs are large enough to avoid burdening the HPC scheduler and still small enough to fit in otherwise unused nodes, increasing the utilization of the machine. Most importantly, Swift has enabled the ArcticDEM project to use more than 18 million node-hours of compute time on Blue Waters since 2016. To date, the project has produced four data releases towards its goal of complete Arctic coverage.
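To make the bundling idea concrete, here is a minimal sketch in Python of one multi-node job draining a shared queue of single-node tasks. It is a toy simulation, not Swift code and not a description of how Swift is implemented; the function name `run_bundled_job`, the task tuples, and the durations are all hypothetical and chosen only for illustration.

```python
import heapq
from collections import deque


def run_bundled_job(tasks, num_nodes):
    """Simulate one multi-node job working through single-node tasks.

    tasks: iterable of (task_id, duration) pairs (hypothetical data).
    num_nodes: number of nodes held by the job.
    Returns (makespan, assignments), where assignments maps each node
    to the list of task ids it ran, in order.
    """
    pending = deque(tasks)                      # queue of unfinished tasks
    running = []                                # heap of (finish_time, node, task_id)
    assignments = {node: [] for node in range(num_nodes)}

    # Start one task on every node, as long as tasks remain.
    for node in range(num_nodes):
        if not pending:
            break
        task_id, duration = pending.popleft()
        heapq.heappush(running, (duration, node, task_id))
        assignments[node].append(task_id)

    makespan = 0
    while running:
        # The node whose task finishes first becomes free ...
        finish_time, node, _ = heapq.heappop(running)
        makespan = finish_time
        # ... and immediately picks up the next queued task, if any.
        if pending:
            task_id, duration = pending.popleft()
            heapq.heappush(running, (finish_time + duration, node, task_id))
            assignments[node].append(task_id)

    return makespan, assignments


if __name__ == "__main__":
    demo_tasks = [("a", 3), ("b", 1), ("c", 1), ("d", 1)]
    total_time, per_node = run_bundled_job(demo_tasks, num_nodes=2)
    print(total_time, per_node)
```

In this sketch, the scheduler sees a single two-node job rather than four separate submissions, and a node that finishes a short task is refilled from the queue instead of sitting idle until the longest task completes.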
The article also introduces Parsl, a prototype Python module that uses Swift functionality. For more on Swift, visit the project's website.