Core Concepts
UniFaaS is a parallel programming framework that adapts a federated function-as-a-service (FaaS) model to enable developers to compose distributed, scalable, and high-performance scientific workflows that span federated cyberinfrastructure.
Abstract
UniFaaS is a general-purpose parallel programming framework that leverages a federated function-as-a-service (FaaS) model to enable the composition of distributed, scalable, and high-performance scientific workflows across federated cyberinfrastructure.
Key highlights:
- UniFaaS provides a unified programming interface to express task parallelism and compose dynamic dependency graphs, which can be deployed across distributed resources seamlessly.
- UniFaaS implements a data manager that transparently handles data transfers between computers on behalf of users, using widely used transfer mechanisms such as Globus and rsync.
- UniFaaS uses an observe-predict-decide approach to improve performance: it monitors task characteristics, predicts task performance, and schedules tasks with a dynamic heterogeneity-aware algorithm.
- UniFaaS supports elasticity, automatically scaling resources to match workflow characteristics.
- UniFaaS is designed to be modular, allowing users to easily plug in any appropriate schedulers or data transfer mechanisms for their workflows.
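The observe-predict-decide loop in the highlights above can be illustrated with a minimal sketch. This is not the UniFaaS API or its scheduling algorithm: the class names, the mean-of-history predictor, and the earliest-predicted-finish rule are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Endpoint:
    """Hypothetical compute resource with a fixed number of workers."""
    name: str
    workers: int
    busy_until: list = field(init=False)  # time (s) at which each worker is free

    def __post_init__(self):
        self.busy_until = [0.0] * self.workers


class ObservePredictDecide:
    """Toy scheduler: observe runtimes, predict from history, decide placement."""

    def __init__(self, endpoints, default_runtime=1.0):
        self.endpoints = endpoints
        self.default_runtime = default_runtime  # used when no history exists
        self.history = {}  # (function, endpoint name) -> list of runtimes

    def observe(self, function, endpoint, runtime):
        # Record a measured runtime for this function on this endpoint.
        self.history.setdefault((function, endpoint), []).append(runtime)

    def predict(self, function, endpoint):
        # Assumed predictor: mean of past runtimes, else a default guess.
        samples = self.history.get((function, endpoint))
        return mean(samples) if samples else self.default_runtime

    def decide(self, function):
        # Pick the endpoint whose earliest-free worker gives the
        # earliest predicted finish time, then reserve that worker.
        best = None
        for ep in self.endpoints:
            worker = min(range(ep.workers), key=ep.busy_until.__getitem__)
            finish = ep.busy_until[worker] + self.predict(function, ep.name)
            if best is None or finish < best[0]:
                best = (finish, ep, worker)
        finish, ep, worker = best
        ep.busy_until[worker] = finish
        return ep.name, finish


# Two heterogeneous endpoints: cluster_a is faster and has more workers.
sched = ObservePredictDecide([Endpoint("cluster_a", 2), Endpoint("cluster_b", 1)])
sched.observe("dock", "cluster_a", 10.0)
sched.observe("dock", "cluster_b", 25.0)
placements = [sched.decide("dock") for _ in range(5)]
```

In this toy run the first four tasks land on the faster endpoint, and the fifth goes to the slower one only once waiting for the faster endpoint would finish later. Because the predictor and the decision rule are separate methods, either could be swapped out, which mirrors the modular design described in the highlights.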
Stats
The drug screening workflow consists of 24,001 functions. The total computation time is 1,447 hours with an average of 220 seconds per task. The total size of the input, intermediate, and output data is 480.64 GB.
The montage workflow consists of 11,340 functions. The total computation time is 108 hours with an average of 6.4 seconds per task. The total size of the input, intermediate, and output data is 673.49 GB.
Quotes
"UniFaaS can improve the performance of a real-world drug screening workflow by as much as 22.99% when employing an additional 19.48% of resources and a montage workflow by 54.41% when employing an additional 47.83% of resources across multiple distributed clusters, in contrast to using a single cluster."