Containerization can improve reproducibility of HPC applications, but may introduce performance overheads. This study evaluates the performance impact of running the HPX/Kokkos-based astrophysics application Octo-Tiger in Singularity containers on a homogeneous CPU-based supercomputer (Fugaku) and a heterogeneous CPU-GPU cluster (DeepBayou).
ScaleFold is a systematic training method that incorporates optimizations addressing the key factors that prevent AlphaFold training from scaling to more compute resources, enabling training to be completed in 10 hours.
This work identifies and mitigates the impact of slow-performing nodes in a large supercomputer cluster using machine learning, proxy applications, and scheduling prioritization.
Wilkins is an in situ workflow system designed to provide scalable and efficient execution of diverse scientific tasks without requiring any modifications to user task codes.
This study provides a comprehensive analysis of GPU acceleration for computational fluid dynamics (CFD) simulations on HPC systems, evaluating the impact on simulation speed, power consumption, and cost.
Matrix operations can be transformed into equivalent graph representations, enabling domain experts to implement various types of matrix computations using a unified graph programming interface. This graph engine-based scientific computing paradigm achieves performance comparable to the best-performing implementations while greatly simplifying the development of scientific computations on large-scale platforms.
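To make the matrix-to-graph idea concrete, here is a minimal sketch of a matrix-vector product written as a graph computation. It is an illustration only, not the interface of the graph engine described above: the function names `matrix_to_edges` and `graph_matvec` are hypothetical, and the sketch assumes one common mapping, in which each nonzero entry A[i][j] becomes a weighted edge from vertex j to vertex i.

```python
def matrix_to_edges(matrix):
    """Represent a matrix as weighted edges (hypothetical helper).

    Assumed mapping: each nonzero A[i][j] becomes an edge j -> i with weight A[i][j].
    """
    edges = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value != 0:
                edges.append((j, i, value))
    return edges


def graph_matvec(edges, x, num_rows):
    """Compute y = A x as one gather step over the edges.

    Each destination vertex accumulates weight * (value held by the source vertex).
    """
    y = [0] * num_rows
    for src, dst, weight in edges:
        y[dst] += weight * x[src]
    return y


A = [[2, 0, 1],
     [0, 3, 0],
     [4, 0, 5]]
x = [1, 2, 3]

edges = matrix_to_edges(A)
y = graph_matvec(edges, x, len(A))
print(y)  # [5, 6, 19]
```

Only nonzero entries produce edges, so the work in the gather step is proportional to the number of nonzeros, which is why sparse matrices fit this representation naturally.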