
Execution Management


ManyJobs is a Python-based tool for managing ManyTasks on widely distributed supercomputing resources. To date it has been used in production on the following machines: Abe@NCSA, Lonestar@TACC, Ranger@TACC, QueenBee@LONI, Kraken@NICS, Steele@RCAC, (eric, louie, poseidon, painter, oliver)@LONI, and the CCS cluster at Tulane.


A pilot job allows jobs to be executed without queuing each one individually. It is started through the regular Grid resource manager and provides a container for many sub-jobs: applications submit these sub-jobs through the pilot job rather than through the resource manager. A major advantage of this approach is that the waiting time at the local resource manager, which usually contributes significantly to the overall time-to-completion, is incurred only once, for the pilot job, rather than for each sub-job. It also provides application-level control over sub-job execution.
The SAGA BigJob framework is a SAGA-based pilot job implementation. Unlike other common pilot job systems, SAGA BigJob (i) natively supports MPI jobs and (ii) works on a variety of back-end systems, which reflects the general advantage of a SAGA-based approach.
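
The effect of a pilot job on queue waiting time can be illustrated with a small toy model. The sketch below is not the SAGA BigJob or ManyJobs API: the `ToyPilot` class, its `start` and `submit` methods, and the timing figures are all hypothetical, and the model assumes for simplicity that jobs run one after another.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class ToyPilot:
    """Hypothetical pilot-job container (not the SAGA BigJob API)."""
    queue_wait: float          # assumed one-time wait in the batch queue (seconds)
    elapsed: float = 0.0       # simulated wall-clock time (seconds)
    started: bool = False
    results: List[Any] = field(default_factory=list)

    def start(self) -> None:
        # The pilot itself goes through the resource manager exactly once.
        self.elapsed += self.queue_wait
        self.started = True

    def submit(self, task: Callable[[], Any], run_time: float) -> None:
        # Sub-jobs go to the pilot, so no further queue wait is added.
        if not self.started:
            raise RuntimeError("pilot must be started before sub-jobs are submitted")
        self.results.append(task())
        self.elapsed += run_time


def time_without_pilot(n_jobs: int, queue_wait: float, run_time: float) -> float:
    # Simplifying assumption: each job is queued and run one after another.
    return n_jobs * (queue_wait + run_time)


pilot = ToyPilot(queue_wait=600.0)
pilot.start()
for i in range(4):
    pilot.submit(lambda i=i: i * i, run_time=60.0)

print(pilot.results)                       # [0, 1, 4, 9]
print(pilot.elapsed)                       # 840.0
print(time_without_pilot(4, 600.0, 60.0))  # 2640.0
```

With these assumed numbers, the queue wait is paid once (600 s) instead of four times, which is the saving the pilot-job approach is aiming for. Real systems overlap queueing and execution, so actual gains will differ.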