
Kilometer-Level Coupled Climate Modeling on a 40-Million-Core Supercomputer: An Eight-Year Journey of Model Development and Optimization


Core Concepts
A non-intrusive approach is used to port and optimize the Community Earth System Model (CESM) 2.2 for kilometer-level coupled climate modeling on a 40-million-core Sunway supercomputer, achieving a simulation speed of 222 simulated days per day.
Abstract
The authors present their eight-year journey of enabling high-resolution and ultra-high-resolution climate modeling on Sunway supercomputers, focusing on porting and optimizing the Community Earth System Model (CESM) 2.2.

Key highlights:
- A hierarchical grid system with different spatial resolutions is adopted, enabling progressive development of the final, ultra-high-resolution coupled model.
- A non-intrusive proxy toolkit called O2ATH is developed to facilitate OpenMP offloading to the heterogeneous Sunway manycore architecture without requiring significant manual code modifications.
- The O2ATH toolkit is widely adopted across the major component models in CESM, including the atmosphere (CAM), ocean (POP), and sea ice (CICE) components, achieving significant performance improvements.
- The initialization stage of the model is optimized by reforming the MPI communication pattern, reducing the time complexity of mapping algorithms, and balancing I/O and communication overheads.
- A suite of tools is developed to support the porting and optimization process, including profiling, hardware fault detection, bit-accurate result validation, and binary static call map analysis.
- The final coupled model, with a 5-km atmosphere and 3-km ocean resolution, scales to 101,800 nodes (39,702,000 cores) and achieves a simulation speed of 222 simulated days per day, enabling multi-year or even multi-decadal ultra-high-resolution climate modeling.
Stats
- The final coupled model, with a 5-km atmosphere and 3-km ocean resolution, scales to 101,800 nodes (39,702,000 cores) and achieves a simulation speed of 222 simulated days per day.
- The atmosphere-only (CAM) component achieves a simulation speed of 340 simulated days per day at 5-km resolution.
- The ocean-only (POP) component achieves a simulation speed of 265 simulated days per day at 3-km resolution.
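
The figures above can be cross-checked with a little arithmetic. The sketch below derives the number of cores per node from the quoted totals and converts simulated days per day (SDPD) into simulated years per day (SYPD), a unit often used for climate model throughput. The 365-day year and the function name `sdpd_to_sypd` are assumptions made for illustration, not values or names taken from the paper.

```python
# Figures quoted in the summary.
NODES = 101_800
CORES = 39_702_000
SDPD = {"coupled": 222, "CAM (atmosphere-only)": 340, "POP (ocean-only)": 265}

DAYS_PER_YEAR = 365  # assumption: a no-leap model calendar

# Cores per node implied by the quoted node and core counts.
cores_per_node = CORES / NODES


def sdpd_to_sypd(sdpd, days_per_year=DAYS_PER_YEAR):
    """Convert simulated days per day to simulated years per day."""
    return sdpd / days_per_year


if __name__ == "__main__":
    print(f"cores per node: {cores_per_node:.0f}")
    for name, rate in SDPD.items():
        print(f"{name}: {rate} SDPD = {sdpd_to_sypd(rate):.2f} SYPD")
```

Under these assumptions the node and core counts imply 390 cores per node, and the coupled model's 222 SDPD corresponds to roughly 0.61 SYPD.
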
Quotes
"We form a non-intrusive yet efficient workflow to port CESM 2.2 to a 40-million-core heterogeneous supercomputer, in around three weeks. Maintaining the consistency of the code, we improve from simulating 1.79 days to 222 days per day (enabling multi-year or even multi-decadal ultra-high-resolution climate modeling)."
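
To make the quoted improvement concrete, the following sketch computes the speedup from 1.79 to 222 simulated days per day and the wall-clock time each rate would need for a ten-year simulation. The 365-day year and the ten-year target are illustrative assumptions, and `wall_clock_days` is a hypothetical helper name.

```python
BASELINE_SDPD = 1.79    # simulated days per day before optimization (from the quote)
OPTIMIZED_SDPD = 222.0  # simulated days per day after optimization (from the quote)
DAYS_PER_YEAR = 365     # assumption: a no-leap model calendar


def wall_clock_days(simulated_years, sdpd, days_per_year=DAYS_PER_YEAR):
    """Wall-clock days needed to simulate the given number of years."""
    return simulated_years * days_per_year / sdpd


speedup = OPTIMIZED_SDPD / BASELINE_SDPD

if __name__ == "__main__":
    print(f"speedup: {speedup:.0f}x")
    for label, rate in (("baseline", BASELINE_SDPD), ("optimized", OPTIMIZED_SDPD)):
        print(f"{label}: {wall_clock_days(10, rate):.1f} wall-clock days per simulated decade")
```

The speedup is about 124 times. A simulated decade would take roughly 2,039 wall-clock days (about 5.6 years) at the baseline rate, against about 16.4 days at the optimized rate, which is why the authors describe multi-decadal runs as newly feasible.
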

Deeper Inquiries

How can the non-intrusive porting approach be extended to other complex scientific applications beyond climate modeling?

The non-intrusive porting approach used in the context of climate modeling can be extended to other complex scientific applications by following a similar methodology. This approach involves minimizing manual code modifications and utilizing tools and frameworks to facilitate the transition to a new computing platform. To extend this approach to other applications, the following steps can be taken:
- Identify Key Components: Identify the key components of the scientific application that need to be ported to a new computing platform. This could include algorithms, data structures, and computational kernels.
- Develop Proxy Toolkits: Develop proxy toolkits similar to O2ATH that can facilitate the offloading of computations to heterogeneous architectures. These toolkits should provide a seamless interface between the existing codebase and the new platform.
- Optimize Communication: Focus on optimizing communication patterns and data transfers between different processing elements. This can help improve the overall performance of the application on the new platform.
- Parallelization Strategies: Implement parallelization strategies for different components of the application, similar to the strategies used for CAM, POP, and CICE in the climate modeling context. This can help leverage the computational power of the new platform effectively.
- Develop Supporting Tools: Develop supporting tools for profiling, debugging, and performance analysis. These tools can help identify bottlenecks, optimize code, and ensure the efficient execution of the application on the new platform.

By following these steps and adapting the non-intrusive porting approach to the specific requirements of other scientific applications, it is possible to successfully migrate complex scientific models to new computing architectures while maintaining performance and consistency.

What are the potential limitations or challenges of the hierarchical grid system approach, and how can it be further improved?

The hierarchical grid system approach, while effective for enabling multi-resolution modeling in climate simulations, may have some limitations and challenges:
- Grid Generation Complexity: Generating and managing grids at multiple resolutions can be complex and computationally intensive. This complexity can increase with the number of resolution levels and the need for seamless transitions between grids.
- Data Transfer Overhead: Moving data between grids of different resolutions can introduce overhead in terms of memory usage and computational resources. Ensuring efficient data transfer mechanisms is crucial to minimize this overhead.
- Interpolation Errors: Interpolating data between grids of different resolutions can introduce errors, especially at boundaries. Managing these interpolation errors and ensuring accuracy across resolutions is a key challenge.
- Scalability: Scaling the hierarchical grid system to even higher resolutions or larger computational domains can pose scalability challenges. Ensuring that the system can efficiently handle increased computational demands is essential.

To further improve the hierarchical grid system approach, the following strategies can be considered:
- Optimized Grid Generation: Develop more efficient grid generation algorithms that can handle multiple resolutions seamlessly and reduce computational overhead.
- Adaptive Grid Refinement: Implement adaptive grid refinement techniques that dynamically adjust grid resolution based on simulation requirements. This can help optimize computational resources and accuracy.
- Error Analysis and Correction: Implement robust error analysis techniques to identify and correct interpolation errors between grids. This can help improve the overall accuracy of the simulation results.
- Parallelization and Optimization: Parallelize grid operations and data transfer processes to leverage the computational power of modern supercomputers. Optimizing these operations can improve overall performance.

By addressing these limitations and implementing these improvements, the hierarchical grid system approach can be enhanced to support even more complex and high-resolution simulations in various scientific applications.
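
The interpolation-error concern raised above can be seen in a one-dimensional toy problem. The sketch below is not the regridding scheme used in CESM; it assumes a simple block-average coarsening and a piecewise-constant refinement purely for illustration, and the function names are hypothetical. It measures how much detail is lost when a smooth field is moved to a coarser grid and back.

```python
import math


def coarsen(fine, factor):
    """Average each block of `factor` fine cells into one coarse cell."""
    return [sum(fine[i:i + factor]) / factor
            for i in range(0, len(fine), factor)]


def refine(coarse, factor):
    """Copy each coarse value into `factor` fine cells (piecewise constant)."""
    return [value for value in coarse for _ in range(factor)]


def round_trip_error(fine, factor):
    """Maximum absolute error after coarsening and refining again."""
    restored = refine(coarsen(fine, factor), factor)
    return max(abs(a - b) for a, b in zip(fine, restored))


# A smooth test field sampled at 64 cell centers over one period.
N = 64
field = [math.sin(2 * math.pi * (i + 0.5) / N) for i in range(N)]

if __name__ == "__main__":
    for factor in (1, 2, 4, 8):
        print(f"factor {factor}: max round-trip error = "
              f"{round_trip_error(field, factor):.4f}")
```

In this toy setup the round-trip error grows with the resolution ratio, while the block average keeps the domain mean unchanged up to rounding. Real coupled models use more elaborate mapping weights, but the trade-off between resolution ratio and interpolation error has the same character.
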

What are the implications of the achieved ultra-high-resolution climate modeling capabilities on our understanding of climate change and extreme weather events?

The achieved ultra-high-resolution climate modeling capabilities have significant implications for our understanding of climate change and extreme weather events:
- Improved Accuracy: Higher-resolution models can provide more detailed and accurate simulations of climate processes, allowing for better predictions of future climate trends and extreme weather events.
- Fine-Scale Phenomena: Ultra-high-resolution models can capture fine-scale phenomena such as convective storms, mesoscale weather systems, and ocean eddies, which are crucial for understanding regional climate variability.
- Impact Assessments: These models enable more precise assessments of the impacts of climate change on specific regions, ecosystems, and human populations. This can inform adaptation and mitigation strategies.
- Extreme Event Prediction: By simulating extreme weather events such as hurricanes, heatwaves, and heavy rainfall at finer resolution, researchers can better understand the dynamics and drivers of these events.
- Feedback Mechanisms: Ultra-high-resolution models can help elucidate complex feedback mechanisms in the climate system, such as cloud-radiation interactions, ocean-atmosphere coupling, and land surface processes.
- Policy and Planning: The detailed insights provided by these models can support policymakers, urban planners, and emergency responders in making informed decisions related to climate resilience, disaster preparedness, and infrastructure development.
- Scientific Advancements: Advancements in ultra-high-resolution climate modeling contribute to the broader scientific understanding of Earth's climate system, fostering interdisciplinary research and collaboration.

Overall, ultra-high-resolution climate modeling has the potential to substantially improve our understanding of climate change and extreme weather events, leading to more informed decision-making and proactive measures to address the challenges posed by a changing climate.