Extreme-scale Computing and Data Handling - the Heart of Progress in Weather and Climate Prediction

17 April 2019
Author(s): Peter Bauer, Michael C. Morgan, Siham Sbill

Past improvements

Progress in environmental monitoring and numerical weather and climate prediction has been intimately connected with progress in supercomputing. Over the last few decades, advances in computing power have enabled us to increase the skill and detail of our forecasts: by increasing spatial resolution, by enhancing realism through more detailed representations of physical processes and additional Earth-system components, and by investing in ensemble techniques to characterize the uncertainty of both initial conditions and forecasts (Bauer et al., 2015).

Better models and better data assimilation techniques have allowed us to exploit information on the Earth system more effectively. As a computing task, data assimilation is as costly as producing the forecasts themselves, and this cost grows with model enhancements and with the increasing volume and diversity of the assimilated observations. As prediction systems improve and forecast users expect ever more specialized products, the volume and diversity of the output data will grow at similar rates to the computing cost, or even faster.

In the past, this growth in cost was mostly offset by a comparable growth in computing and data handling capabilities, arising from the ability to engineer more transistors onto microprocessors (Moore's law) and from higher clock speeds at constant power (Dennard scaling), while processor prices fell. As transistor density reaches physical limits and clock speeds stagnate to contain electric power consumption, further performance growth can only be expected from enhanced parallelism and from new processor technologies that combine such parallelism with greater power efficiency. Much of this technology is currently derived from commodity devices such as mobile phones.

Future challenge

It has been predicted that, in ten years, typical operational weather prediction and climate projection workloads with high-resolution, coupled Earth-system model ensembles will require at least a factor of 1 000 more computing and data handling capability than today (Wehner et al., 2011). These needs can no longer be met through the evolution of hardware technology alone. They will require complementary fundamental developments in mathematical, numerical and statistical methods, as well as in programming techniques that allow the diverse range of computing tasks of numerical prediction models to be mapped optimally onto emerging processor types, ranging from CPUs and GPUs to FPGAs and highly specialized ASIC (application-specific integrated circuit) devices (Schulthess, 2015). This range may widen even further in the future, and the significant challenge for any application will be to exploit the potential of such hardware.

A key upper limit imposed on HPC systems is the affordable electric power level. Present peta-scale systems (supercomputers performing 10^15 floating-point operations per second at peak) consume O(10^6) watts, which translates to O(10^6) USD per year for power and cooling. At present, most HPC centres are built on the assumption that their overall power budget will not exceed O(20 MW), which falls far short of accommodating the factor-1000 increase mentioned above. Simply buying larger computers is therefore not an option, for reasons of affordability alone.
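The arithmetic behind this cost estimate is simple. The following back-of-the-envelope sketch assumes a sustained 1 MW draw, as quoted above, and an electricity price of roughly USD 0.10 per kWh (the price is an assumption for illustration, not a figure from the article):

```python
# Back-of-the-envelope estimate of the annual power-and-cooling bill of a
# petascale HPC system. The 1 MW draw is the order of magnitude quoted in
# the text; the electricity price of ~0.10 USD/kWh is an assumption.
power_mw = 1.0                # sustained system draw, O(10^6) W
price_usd_per_kwh = 0.10      # assumed average electricity price
hours_per_year = 24 * 365

annual_cost = power_mw * 1_000 * hours_per_year * price_usd_per_kwh
print(f"Annual power/cooling cost: ~{annual_cost:,.0f} USD")
# ~876,000 USD per MW, i.e. O(10^6) USD per year, as stated above
```

Scaling such a system by a factor of 1 000 would push the bill toward O(10^9) USD per year, which is why hardware growth alone cannot close the gap.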

Data communication on the HPC system is a central concern in this enterprise: moving data across a chip consumes about ten times more energy than performing a calculation, while moving data between chips costs another ten times more than moving it on the same chip (Kogge and Shalf, 2013). A further concern is how observational input and model output data are managed along the prediction workflow to enable efficient pre- and post-processing, again aiming to minimize data movement and reduce storage needs while ensuring resilient forecast production. And while the computing and data handling challenges grow drastically, the requirements for data usability and fast user access are becoming ever more stringent.
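A toy energy model makes the consequence of this hierarchy concrete. The sketch below assumes the relative costs cited above (1x for an arithmetic operation, ~10x for an on-chip data move, ~100x for an off-chip move); the absolute picojoule value and the per-point operation counts are illustrative assumptions only:

```python
# Toy energy model illustrating why data movement dominates on HPC systems.
# Relative costs follow the ~1x / 10x / 100x hierarchy cited in the text
# (Kogge and Shalf, 2013); the absolute value E_FLOP = 1 pJ is illustrative.
E_FLOP = 1.0        # energy of one floating-point operation (pJ, assumed)
E_ON_CHIP = 10.0    # moving an operand across the chip, ~10x a FLOP
E_OFF_CHIP = 100.0  # moving an operand between chips, ~10x on-chip

def sweep_energy(n_points, flops=10, on_chip_moves=5, off_chip_moves=1):
    """Rough energy (pJ) for one sweep of a stencil-like model kernel."""
    per_point = (flops * E_FLOP
                 + on_chip_moves * E_ON_CHIP
                 + off_chip_moves * E_OFF_CHIP)
    return n_points * per_point

total = sweep_energy(n_points=10**9)
compute_share = 10 * E_FLOP / (10 * E_FLOP + 5 * E_ON_CHIP + E_OFF_CHIP)
print(f"Total: {total:.2e} pJ; arithmetic is only {compute_share:.0%} of it")
```

Even with these mild assumptions, the actual calculations account for well under a tenth of the energy budget, which is why algorithms that reduce data movement matter as much as faster processors.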

More users wanting more information faster creates tremendous challenges for data handling. Meeting them will require investing in a mixture of centralized and cloud computing solutions that allow applications to be moved closer to where the large volumes of forecast data sit, and costly, user-driven data analytics and post-processing to be distributed more evenly across a wider range of platforms.

And the role of artificial intelligence?


The re-emergence of artificial intelligence methods, driven by large-scale commercial applications, has created opportunities to contribute to the much-needed efficiency gains. Big companies like IBM and Microsoft (supporting AccuWeather) advertise their ability to deliver highly specialized solutions for customers, taking the available output from operational national and international centres in addition to their own forecasting products. This has become feasible because purpose-built processors are being assembled at larger scale and deep-learning software is able to sift through vast amounts of data from both models and observations to extract the meteorological information to be forecast.

The replacement of physics-based prediction systems by deep learning seems unlikely, as the number of degrees of freedom and the non-linearity of the Earth system would require very complex neural networks that would be difficult to train and risk being inefficient to run on computers (Düben and Bauer, 2018). Part of the challenge for neural networks targeting globally valid forecasts across medium-range, seasonal and climate timescales will be to produce physically consistent forecasts that maintain closed budgets and conserve fluxes. Dealing with biases and errors in training data adds significantly to that challenge.

However, the use of such techniques for observational data pre-processing and model output post-processing can help to distribute the data handling workload more evenly along the workflow, to extract useful information more effectively from large data volumes, and to reduce the computational burden of selected prediction model components by replacing them with surrogate neural networks. These applications are areas of active research at present but have already been tested in the past (Lee et al., 2018; Hsieh and Tang, 1998).
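The surrogate idea can be illustrated in a few lines. In the minimal sketch below, a small neural network is trained to emulate a toy stand-in for an expensive parametrization, so that the scheme can later be replaced by a cheap forward pass; the toy function, network size and training setup are all assumptions for illustration, not an operational configuration:

```python
# Minimal sketch of a surrogate model component: a one-hidden-layer network
# is fitted to a toy "parametrization" f(x), standing in for an expensive
# physics scheme (e.g. a radiation column). Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def toy_parametrization(x):
    # Stand-in for the expensive scheme the surrogate should emulate.
    return np.sin(3 * x) + 0.5 * x**2

# Training data: inputs the "real" scheme was run on, plus its outputs.
x = rng.uniform(-1, 1, size=(2000, 1))
y = toy_parametrization(x)

# Network parameters, trained by plain gradient descent on squared error.
W1 = rng.normal(0, 1, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.1, (32, 1)); b2 = np.zeros(1)
lr = 0.05

for step in range(3000):
    h = np.tanh(x @ W1 + b1)            # hidden layer
    pred = h @ W2 + b2                  # surrogate output
    err = pred - y
    # Backpropagation of the mean-squared-error loss.
    dW2 = h.T @ err / len(x); db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h**2)
    dW1 = x.T @ dh / len(x); db1 = dh.mean(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

# At run time the surrogate is just two matrix products and a tanh:
x_test = np.linspace(-1, 1, 5).reshape(-1, 1)
emulated = np.tanh(x_test @ W1 + b1) @ W2 + b2
print(np.c_[toy_parametrization(x_test), emulated])  # truth vs. surrogate
```

In an operational setting, the physical consistency constraints discussed above (closed budgets, conserved fluxes) would have to be built into the training of such emulators, which is one of the open research questions.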

Present research efforts

The above challenges pose a fundamental obstacle to the future advancement of both weather and climate prediction capabilities. Increased awareness of this problem has led to large-scale research and innovation efforts in many developed countries, supported by significant governmental and public-private funding. Programmes funded by the Department of Energy in the United States of America (US) and by the European Commission are examples, complemented by many national weather forecasting agency efforts in the US, Japan, China and Europe.

Due to the complexity of the challenge, researchers need to collaborate closely with the computing technology industry, and weather and climate science needs to collaborate closely with impact sectors such as water, energy, food and agriculture, and risk management. These collaborations and cutting-edge science-technology research are the central focus of the ExtremeEarth project, which has been proposed as a European Flagship project promising breakthrough prediction capabilities in this new era.

These challenges clearly contribute to further widening the capability gap between more and less developed countries as they require a unique level of expertise, co-design between research and industry and significant technological support for both software and hardware. This is where international collaboration fostered by organizations like WMO will be crucial to produce sustainable economies of scale and to support knowledge-transfer between different areas of expertise and across countries and continents.

The role of WMO

The need for a concerted effort between weather and climate science and computational science requires a visible WMO strategy. The aim of this effort would be to develop and share methodologies and technologies for the cost-effective production of forecasts and the collection/dissemination of large data volumes with increasingly complex high-resolution prediction systems across all scales.

Detailed recommendations for a revised strategy include:

  • The establishment of scientific methodologies exploring enhanced parallelism and reduced data movement when employing extreme-scale HPC infrastructures;
  • Support of standardization of portable code structures and programming models ensuring efficiency and code readability, and exploiting the future range of processor and system-level technologies, including metrics for code testing, performance analysis and benchmarking;
  • Designing portable data handling frameworks for observational data pre-processing and model output post-processing as well as product dissemination;
  • Supporting open and distributed, cloud-based computing and data management infrastructures dealing with all steps in the forecast production workflow including easy access, information discovery and visualization for end-users;
  • Adapting artificial intelligence methods (such as deep learning) to facilitate increasingly diverse observational data processing, user-dependent information extraction from increasingly complex model output data, and development of surrogate model components resulting in reduced computational cost;
  • Establishing capacity-building and training between applied science and computational science to facilitate uptake of new technologies and methodologies by the community.

Almost all application areas in the weather and climate prediction community will benefit from this strategy, as the new computing and data handling capabilities will enable new scientific discovery, cost-effective operation and enhanced knowledge transfer from experts to a wide user base.

References

Bauer, P., A. Thorpe and G. Brunet, 2015: The quiet revolution of numerical weather prediction. Nature, 525, 47-55.

Düben, P. and P. Bauer, 2018: Challenges and design choices for global weather and climate models based on machine learning. Geoscientific Model Development, 11, 3999-4009.

Hsieh, W.W. and B. Tang, 1998: Applying neural network models to prediction and data analysis in meteorology and oceanography. Bulletin of the American Meteorological Society, 79, 1855-1870.

Kogge, P. and J. Shalf, 2013: Exascale computing trends: Adjusting to the “New Normal” for computer architecture. Computing in Science and Engineering, doi: 10.1109/MCSE.2013.95.

Lee, Y.-J., C. Bonfanti, L. Trailovic, B.J. Etherton, M.W. Govett and J.Q. Stewart, 2018: Using deep learning for targeted data selection: Improving satellite observation utilization for model initialization. 17th Conference on Artificial and Computational Intelligence and its Applications to the Environmental Sciences.

Schulthess, T.C., 2015: Programming revisited. Nature Physics, 11, 369–373.

Wehner, M.F., L. Oliker, J. Shalf, D. Donofrio, L.A. Drummond, R. Heikes, S. Kamil, C. Kono, N. Miller, H. Miura, M. Mohiyuddin, D. Randall and W.-S. Yang, 2011: Hardware/software co-design of global cloud system resolving models. Journal of Advances in Modeling Earth Systems, 3, M10003, 22 pp.

Authors:

Peter Bauer, ECMWF, UK

Michael C. Morgan, University of Wisconsin-Madison, USA

Siham Sbill, National Weather Service, Morocco
