In November 2019, the Science and Technology Facilities Council (STFC) Hartree Centre and Scientific Computing Department exhibited at the international conference Supercomputing 2019 (SC19) in Denver, USA. In this blog post, Research Software Engineer Tim Powell shares some thoughts and insights from the Hartree Centre team.
The variety of experiences one can have at Supercomputing is vast, and I think this is a good reflection of the direction high performance computing (HPC) is going. The disciplines adopting HPC and the techniques available for acquiring computing power are both growing more diverse. When discussing the themes of SC19 with a colleague (in the stationery room of all places) I accidentally summed it up quite well: “Supercomputing 2019 was tall and broad.”
So let’s look at each aspect of this assessment – first up: “tall”. The next phase of supercomputing is exa-scale. There were a significant number of talks, birds-of-a-feather sessions, and panels discussing exa-scale computing and its applications, software, and hardware.
Our Chief Research Officer, Vassil Alexandrov, gives his account of Supercomputing 2019 and the current exa-scale landscape here:
“Supercomputing 2019 was a busy time for me, as always! In the discussions and talks I attended, I felt that this year’s content was of an even higher quality than previous years, and I noted that there were more precise presentations delivered by researchers.
One area which I paid particular attention to was the discussion around exa-scale. The US National Labs are making big moves with their Exascale Computing Project. They are investing $1.8 billion in hardware and a similar amount in the development of software. The current US roadmap is to have their first machine, Frontier, in place in Q3 of 2021 at an estimated cost of $400 million, with another two machines to be delivered in 2022, each costing $600 million. All three machines are expected to be exa-scale and are rumoured to use a combination of AMD, Intel, Cray, and NVIDIA technology.
Europe is also heading towards exa-scale computing – eight centres across Europe are going to host large peta-scale and pre-exa-scale machines as part of its programme to develop exa-scale capabilities, with machines expected to reach 150-200 peta-flops. Japan is about to install its Post-K supercomputer, which is based on ARM processors and is likely to be a very efficient machine. The expectation is for it to be operational in early 2020, so I am excited to see what the results will be when it is up and running. China is also a player but that is behind closed doors at the moment. It will be interesting to see what they reveal.
Throughout SC19, it was clear that the software challenges are going to be harder than the hardware challenges. My opinion is that we are still a few years off from having true exa-scale machines.”
Now, let’s talk about how SC19 was “broad”.
The different applications of HPC were more obvious this year than in previous years. Multitudes of National Laboratories and Research Institutes from around the globe were displaying use cases on their stands in the exhibition hall, and a large variety of topics was discussed in talks and panels. There was, quite literally, something for everyone – assuming you have an interest or involvement in computation, that is!
I think this is largely due to the growth in access to data, and new techniques such as machine learning and artificial intelligence (AI) requiring disciplines that traditionally don’t use HPC to access more computing resource. Additionally, with the massively growing offering of cloud computing resource, the barrier to entry has been significantly reduced and it is easier than ever to provision a cluster-in-the-cloud.
So tall is more powerful computing, and broad is more computing applications. This all culminates in a bigger impact for High Performance Computing, which was again echoed at SC19 with a series of talks in the 1st HPC Impact Showcase.
My personal highlight this year at SC19 was participating in the Building the Future panel at the 6th SC Workshop on Best Practices for HPC Training and Education. The all-day workshop focused on common challenges in enhancing HPC training and education, and allowed the global community to share experiences and resources to address them. The Building the Future panel focused on how we as trainers and educators can best prepare for the future of HPC and the training and education needs it will bring. The key take-away from my talk was that there is a diverse future of applications for HPC and we need to help bring the power of HPC to non-HPC experts who are only just finding uses for it.
On the following day I was fortunate enough to attend the Early Careers Program, which is aimed at people in the first few years of their career in HPC and delivers a variety of activities, talks, and panels. It was great to see STFC represented by Catherine Jones and Alison Kennedy. As a Research Software Engineer (RSE) I particularly enjoyed panels and talks involving RSEs and members of the RSE Societies around the globe. It’s great to see that managing research software properly is being put on the international stage at conferences as big as SC! I also noted that in a series of talks on cloud computing, a lot of time was given over to discussing the advantages (rarely the disadvantages) of tailor-made HPC in the cloud.
As a team, we had great fun facilitating a very popular build-your-own Lego supercomputer activity, in the form of our very own Scafell Pike! Needless to say, our limited supplies disappeared quicker and quicker each morning as the word spread. Our HPiC Raspberry Pi cluster was also present, boasting some new and updated demos developed by our recent summer placement students James and Lizzie!
I also spoke to some of my colleagues to get their own perspectives on SC19. Aiman Shaikh, Research Software Engineer, discussed her first time at the conference:
“I really enjoyed being part of the Women in HPC workshop, and attending technical talks around containers in HPC and LLVM compilers. The networking events held by different vendors were also a great opportunity to meet people. There was so much going on everywhere that it was difficult to keep pace with everything!
HPC and Cloud Operations at CERN was a very interesting talk by Maria Girone, who discussed the technologies used at CERN, software and architecture issues, and how they are investigating machine learning (ML) for object detection and reconstruction.
The Women in HPC workshop was really good, especially the keynote from Bev Crair, Lenovo, on “the butterfly effect of inclusive leadership”. Bev said that diverse teams lift performance by inviting in creativity, which I completely agree with. In another inspiring and motivating talk, Hai Ah Nam from Los Alamos National Lab spoke about surviving difficult events and minimising their impact on your career. Hai explained that we cannot stop unforeseen events in life but we can focus on how to tackle them. The Women in HPC networking events, often joined by many diverse groups of people, provided a great chance to network with attendees from all different backgrounds.
The journey of exploration did not end after SC, as afterwards I went to the Rockies with some colleagues. It was a fun-filled few days of walking, and with so little light pollution we could see the Milky Way at night!”
SC19 was a new experience for Research Software Engineer Drew Silcock too:
“Attending SC19 for the first time really exposed me to the wider scientific computing community. I gained an understanding of the various technologies used by scientists and engineers and the purposes they are used for. Many are scaling their applications with standard MPI + OpenMP stacks, but I attended several interesting workshops and talks about alternative technologies and approaches. Of particular interest to me are all topics relating to the development of programming languages and compilers, so I very much enjoyed hearing from people working on and with the LLVM compiler toolchain, additions to the C++ standard and the development of domain-specific languages for scientific computing.
In terms of trends, it’s exciting to see how many people are starting and continuing to use Python for scientific computing. Cloud services are also becoming increasingly relevant, especially for new companies without on-premises capabilities. As machine learning models get bigger and bigger, there is more effort being put into bridging the gaps between the HPC and ML communities to ensure that they can benefit each other.”
Jony Castagna, an NVIDIA Deep Learning Ambassador with 10 years’ experience in HPC and several years’ experience in Deep Learning, shared his thoughts:
“We’re seeing fast-growing applications of Deep Learning for science. Three different approaches have been identified: supporting or accelerating current algorithms, for example via AI preconditioners or matrix solvers built on Neural Networks (NN); solving partial differential equations using NN while enforcing physical information (via Physics-Informed Neural Networks, PINN); and fully replacing physical equations with NN trained on numerical simulation data. This last approach in particular seems the most attractive, as it appears to show the capability of NN to learn the physics from data and extrapolate further at higher speed. For example, in the work of Kadupitiya, Fox and Jadhao, a simple NN was used to predict the contact density of ions in nanoconfinement using training data from Molecular Dynamics (MD) simulations. A strong match between prediction and MD simulation was presented.
Increasing use of the C++17 standard library has emerged for performance portability. Many paradigms, like Kokkos, RAJA, HPX, etc., have been presented as possible solutions for targeting different architectures. However, NVIDIA doesn’t appear to be looking to standardise heterogeneous programming; they expect the hardware to become more homogeneous between CPU and GPU. We’d like to test NN with DL_MESO to see how well they perform in reproducing coarse-grained simulations. We have also applied for an ECAM2 project to port DL_MESO to C++17 and use Kokkos for performance portability. This will allow us to compare performance with the current CUDA version and understand how well Kokkos can perform.”
High Performance Software Engineer James Clark concluded:
“On Sunday I presented at the Atos Quantum Workshop. This was a showcase of how the Hartree Centre is using our Quantum Learning Machine, such as our joint training and access programme with Atos and our ongoing project work with Rolls-Royce.
I also talked about our future plans to develop quantum software that can take advantage of both quantum computing and HPC.
One of the most interesting developments in HPC this year was how far ARM CPUs have come. RIKEN and Fujitsu’s Fugaku is one of the major success stories, with the first deployment of the new SVE (Scalable Vector Extension) instructions. Fujitsu announced that Cray will be bringing their ARM CPUs to the rest of the world. NVIDIA also announced that their GPGPUs will be supported on ARM platforms, with a number of ARM CPUs listed as supported on release. I am looking forward to seeing how the increased competition in the hardware space turns out, especially with AMD’s Rome CPUs and Intel’s Xe GPUs. The future of HPC looks to be very interesting and it’s an exciting time to be involved.”
I couldn’t have said it better myself!