Many codes used in eSTICC have been optimized or ported to new High-Performance Computing (HPC) platforms with support from CSC Finland. Driven by the rise of new energy-efficient technologies deployed in state-of-the-art HPC platforms, such as General-Purpose Graphics Processing Units (GPGPU) and Many Integrated Cores (MIC), this work involves new programming paradigms.
Example activities are:
Optimization of NorESM table look-up routines
Tests with idealized cases were run on different platforms (Cray XT 30 and HP ProLiant Cluster) and with different compiler suites (Cray, Intel and GNU). At best, run times about 50% faster were achieved for the 5D case. As a further step, real-world tests will help to profile and optimize similar look-up tables implemented in NorESM; this is work in progress. This directly supports the earth system modelling activities in eSTICC (CSC, MetNor).
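To illustrate why a 5D table look-up is a worthwhile optimization target, the following minimal sketch performs multilinear interpolation in an N-dimensional table. It assumes that the look-up interpolates linearly between grid points, which the text above does not state; the function names `locate`, `lookup` and `build_table` are hypothetical and are not taken from NorESM.

```python
from bisect import bisect_right
from itertools import product


def locate(axis, x):
    """Return (lower grid index, fractional weight) for x on a sorted axis."""
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))      # clamp so that i + 1 stays valid
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, min(1.0, max(0.0, t))        # clamp to the table range


def lookup(table, axes, point):
    """Multilinear interpolation in a nested-list table (assumed scheme)."""
    cells = [locate(axis, x) for axis, x in zip(axes, point)]
    result = 0.0
    # Visit all 2**N corners of the enclosing cell: 32 corners in 5D.
    for corner in product((0, 1), repeat=len(axes)):
        weight = 1.0
        value = table
        for (i, t), upper in zip(cells, corner):
            weight *= t if upper else 1.0 - t
            value = value[i + upper]
        result += weight * value
    return result


def build_table(axes, f, prefix=()):
    """Tabulate f on the grid spanned by axes (helper for the example)."""
    if len(prefix) == len(axes):
        return f(*prefix)
    return [build_table(axes, f, prefix + (x,)) for x in axes[len(prefix)]]


# Idealized 5D case: tabulate a linear function, then interpolate.
axes = [[0.0, 1.0, 2.0]] * 5
table = build_table(axes, lambda *p: sum(p))
value = lookup(table, axes, (0.5, 1.5, 0.25, 1.0, 2.0))
print(value)
```

Each look-up in this scheme touches 2^5 = 32 table entries and performs five axis searches, so in a routine called many times per time step even small savings per call add up.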
Optimization of FLEXINVERT
eSTICC helped to improve the serial performance of FLEXINVERT, an atmospheric Bayesian inversion framework. For a small test case, the computing time was reduced to less than one third of the original. This was achieved by profiling the code, rearranging data and optimizing loops. This directly supports activities in WP2 (CSC, NILU).
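The kind of change meant by rearranging data and optimizing loops can be sketched on a matrix-vector product, an operation typical of inversion codes. This is only an illustration of the principle, not the actual modification made to the inversion code, which the text does not detail; all function names are hypothetical. In the first version the inner loop jumps between separately stored columns, while the second stores each row contiguously so that the inner loop walks through adjacent elements.

```python
def matvec_naive(cols, x):
    """y = H x with H stored as a list of columns (strided inner loop)."""
    n_rows = len(cols[0])
    y = []
    for i in range(n_rows):
        s = 0.0
        for j in range(len(x)):
            s += cols[j][i] * x[j]     # jumps to a different column each step
        y.append(s)
    return y


def to_row_major(cols):
    """Rearrange the data once into a flat, row-major list."""
    n_rows, n_cols = len(cols[0]), len(cols)
    flat = [cols[j][i] for i in range(n_rows) for j in range(n_cols)]
    return flat, n_rows, n_cols


def matvec_rearranged(flat, n_rows, n_cols, x):
    """y = H x with contiguous rows (unit-stride inner loop)."""
    y = []
    for i in range(n_rows):
        row = flat[i * n_cols:(i + 1) * n_cols]
        s = 0.0
        for h, xj in zip(row, x):
            s += h * xj
        y.append(s)
    return y


# Small example: H is 2 x 3, given column by column.
cols = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
x = [1.0, 0.0, 2.0]
flat, n_rows, n_cols = to_row_major(cols)
y_naive = matvec_naive(cols, x)
y_fast = matvec_rearranged(flat, n_rows, n_cols, x)
```

Both versions compute the same result; only the order of memory accesses differs. In a compiled language the contiguous variant is usually easier for the compiler to vectorize and is friendlier to the cache. Note that Fortran stores arrays column by column, so there the favourable loop order is the reverse of the one shown here.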
Optimization of Elmer(/Ice) on Intel Xeon and Xeon Phi platforms
Through CSC's role as a partner in the Intel Parallel Computing Center (IPCC), we could take advantage of early access to the Xeon Phi processor (Knights Landing, KNL) hardware as well as direct support from Intel in porting the community ice sheet code to this platform. First benchmark tests show that the code performs similarly on KNL and on a standard compute node equipped with Xeon CPUs. The Elmer code base as a whole is also currently being optimized within IPCC at the MPI level, which will feed back into the performance of the ice dynamics module, Elmer/Ice. Later activities concentrate on a threaded and SIMD approach to bulk assembly, which is achieved by rearranging data structures and adding to the code base the missing components of the bilinear forms needed in FEM (CSC, Intel Corp.).
Porting of GPGPU ocean model
CSC and FMI received direct support from NVIDIA (J. Appleyard) for the GPGPU (OpenACC) version of NEMO. Testing this version on CSC’s Bull cluster (equipped with K40 accelerator cards) yielded valuable insight into the principles of porting an OpenACC code (using the PGI compiler, enabling card-to-card communication). However, because the setup of CSC’s Bull cluster gave a speed-up of only a factor of 2 over a pure CPU version, this platform was excluded from serious consideration for production runs (CSC, FMI, NVIDIA).
Porting to Nordic High Performance Computing (NHPC)
Elmer/Ice was ported to the Nordic High Performance Computing (NHPC) facility at the University of Iceland. The installation gives Icelandic glaciologists from the existing NCoE SVALI, at the Icelandic Meteorological Office (IMO) and the University of Iceland, easy access to HPC simulations.