Exascale and the Next Frontier of Supercomputing

February 2021

2021 could be a turning-point year as society enters an exciting era in supercomputing.

Could 2021 mark the start of the exascale era? It is an important moment in the history of computing. The petascale era, in which the fastest supercomputers can perform 10^15 FLOPS (floating point operations per second), is giving way to a new exascale age. Exascale supercomputers are 1,000 times more powerful than petascale systems; they can perform a billion billion operations per second (10^18 FLOPS). These machines could solve previously inconceivable problems, from highly accurate, hyperlocal weather and climate simulations to rational drug design for new vaccines, virtual jet-engine design, and beyond.
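
To put those numbers in perspective, here is a back-of-envelope sketch; the workload size is purely illustrative and not tied to any real application.

```python
# Back-of-envelope comparison of petascale vs. exascale throughput.
PETAFLOPS = 1e15   # floating point operations per second at petascale
EXAFLOPS = 1e18    # 1,000x more: a billion billion operations per second

work = 1e18        # hypothetical workload: 10^18 floating point operations

print(f"At petascale: {work / PETAFLOPS:,.0f} seconds")  # ~1,000 seconds
print(f"At exascale:  {work / EXAFLOPS:,.0f} seconds")   # ~1 second
```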

The idea that modeling and simulation has “all been done” is, in some experts’ minds, a misconception. In truth, according to Mark Parsons, professor and director of the EPCC, the supercomputing center at the University of Edinburgh, we are only at the “end of the beginning.” Parsons believes the scale of the next generation of systems, the advent of coupled simulation and AI (artificial intelligence) solutions, and advances in quantum computing over the next few years mean we are entering the most exciting, productive period for supercomputing in 20 years.

Is the best yet to come in 2021 and beyond? And what exactly does that mean?

Exascale Computing

In the simplest and most literal sense, exascale computing refers to computing systems capable of a certain speed/throughput threshold—specifically 10^18 FLOPS. But in a broader sense, according to Youssef Marzouk, professor in the Dept. of Aeronautics and Astronautics at MIT and codirector of MIT’s Center for Computational Science and Engineering, it is a threshold that embodies a new frontier of computing at scale. The new frontier holds promise, but not without challenges.

“Building computing hardware that can, in principle, achieve this threshold is only part of the challenge,” Marzouk says. “The other part is to create software that can actually use this hardware effectively. My main interest is to use exascale computing to simulate complex physical systems with the goal of solving open problems in science and engineering. Here, the challenges are manifold. Being able to write efficient programs for proposed exascale computing systems is a challenge unto itself; new programming languages and parallel compiler technologies are needed to make this possible.”

Another piece of the exascale puzzle is what goes in the codes themselves. “In a physical setting, we can develop simulations with billions of atoms or billions of grid cells, but how do we know that they are actually predictive?” asks Marzouk. “How can we test or ensure that they represent our physical reality, at least within some bounds or sense of uncertainty? There are so many interesting issues underlying this question at the intersection of mathematics, statistics, computational science, and physical modeling.”
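
One common way to frame that question is to propagate uncertain inputs through a simulation many times and report bounds on the prediction. The toy sketch below illustrates the idea; the model, parameter, and numbers are placeholders, not a real exascale code.

```python
# Toy illustration of the predictive-uncertainty question raised above:
# push an uncertain input through a (stand-in) model and report bounds.
import numpy as np

def toy_model(conductivity):
    # Placeholder for a billion-cell physics simulation.
    return 100.0 / conductivity

rng = np.random.default_rng(1)
samples = rng.normal(loc=2.0, scale=0.1, size=10_000)  # uncertain input parameter
outputs = toy_model(samples)

low, high = np.percentile(outputs, [2.5, 97.5])
print(f"95% of predictions fall between {low:.1f} and {high:.1f}")
```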

The first almost-exascale supercomputer exists today: RIKEN’s Fugaku, which holds the title of world’s fastest supercomputer. The race is now on to take it to the next level. Martin Berzins, professor of computer science at the University of Utah’s Scientific Computing and Imaging Institute, says there has traditionally been a 1,000x increase in the performance of the world’s fastest machine every decade, accompanied by a comparable shift in achievable application performance. “The first petaflop computer—10^15 FLOPS—was the U.S. DOE (Dept. of Energy) Roadrunner at Los Alamos National Laboratory in 2008,” Berzins says. “Difficulties in manufacturing have led to a 13-year wait for the probable first exascale machine with 10^18 FLOPS in 2021.”
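
A bit of arithmetic (ours, not Berzins’) shows what that slowdown means: a 1,000x jump per decade corresponds to roughly a doubling of peak performance every year, while spreading the same jump over 13 years implies a noticeably slower pace.

```python
# Rough growth-rate arithmetic behind the quoted timeline (illustrative only).
historical_rate = 1000 ** (1 / 10)   # 1,000x per decade: roughly 2x per year
# Roadrunner (10^15 FLOPS, 2008) to a likely first exascale machine
# (10^18 FLOPS, 2021) is the same 1,000x jump spread over 13 years:
actual_rate = 1000 ** (1 / 13)       # roughly 1.7x per year

print(f"Historical trend:      ~{historical_rate:.2f}x per year")
print(f"2008-2021 in practice: ~{actual_rate:.2f}x per year")
```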

Berzins and many others theorize that the first exascale machine will likely be the DOE Oak Ridge National Laboratory’s Frontier sometime this year. “The Intel Aurora architecture at the U.S. DOE Argonne National Laboratory is likely to follow in 2022,” Berzins adds. “However, there are also three possible exascale machines being developed in China—one or more of which may come online in the same timeframe.”

Doug Kothe, director of the ECP (Exascale Computing Project) at Oak Ridge National Laboratory, says capable (e.g., affordable, usable, and useful) exascale computing will arrive in the U.S. within two years. But he also says achieving exascale is about more than reaching 10^18 FLOPS. “For the U.S. DOE, it is a capable computing ecosystem (applications, software stack, hardware) specifically codesigned to provide breakthrough solutions addressing our nation’s most critical challenges in scientific discovery, energy assurance, economic competitiveness, and national security,” Kothe explains. “This ecosystem is not just a matter of ensuring and relying upon more powerful computing systems, but rather it must foster and support more valuable and rapid insights from a wide variety of applications, which requires a much higher level of inherent efficacy in all methods, software tools, and exascale-enabled computing technologies.”

The DOE ECP defines a capable exascale ecosystem as: “An environment where supercomputers can solve science problems 50x faster (or more complex) than on the 20 petaflops U.S. systems in use at the ECP’s inception (circa 2016), with a software stack that meets the needs of a broad spectrum of applications and workloads, all within a power envelope of 20-40 MW and sufficiently resilient such that user intervention due to hardware or system faults is on the order of a week on average.”
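
Read literally, that definition boils down to a few checkable thresholds. The sketch below is just one way to express them; the function name and example numbers are hypothetical, not an official ECP benchmark.

```python
# A minimal sketch of the quoted "capable exascale" thresholds. The function
# and example figures are hypothetical, not an official ECP tool.
BASELINE_PETAFLOPS = 20          # circa-2016 U.S. reference systems
REQUIRED_SPEEDUP = 50            # 50x the baseline, i.e., roughly 1 exaflop
POWER_ENVELOPE_MW = (20, 40)     # megawatts
MIN_DAYS_BETWEEN_FAULTS = 7      # user intervention roughly once a week

def meets_capable_exascale(peak_petaflops, power_mw, days_between_faults):
    fast_enough = peak_petaflops >= BASELINE_PETAFLOPS * REQUIRED_SPEEDUP  # >= 1,000 PF
    within_power = POWER_ENVELOPE_MW[0] <= power_mw <= POWER_ENVELOPE_MW[1]
    resilient = days_between_faults >= MIN_DAYS_BETWEEN_FAULTS
    return fast_enough and within_power and resilient

# Hypothetical system: 1,100 petaflops, drawing 29 MW, faulting about weekly.
print(meets_capable_exascale(1_100, 29, 7))  # True
```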

Applications of Exascale

The deployment of the first exascale supercomputers will be an important milestone for the global supercomputing community, which has been working toward this goal for more than a decade. “These systems will be built from the next generation of CPU and GPU technologies, which are more powerful and more power efficient than anything we’ve seen before,” says the University of Edinburgh’s Parsons. “Within 12 months, these technologies will also be available to the commercial and consumer markets. It’s an important moment in computing generally.”

 

“If we are successful in meeting exascale targets, the results should be that we also have high-performing ML systems within lower energy and cost budgets.”

Simon David Hammond, Sandia National Laboratories

AI and ML (machine learning) are becoming increasingly important focal points within supercomputing. “Over the last 30 years, the predominant use of supercomputers has been to support the most challenging modeling and simulation applications,” Parsons says. “There has been little focus on AI and machine-learning technologies. However, over the past few years, as data science challenges have grown across the commercial and academic sectors, the use of supercomputers to process large amounts of data has grown. Couple to this (the fact that) it is now commonplace for the largest supercomputers to contain large numbers of accelerators based on GPUs. This makes these systems ideal platforms for the largest AI and machine-learning problems. At the same time, the features of GPUs that make them very good at the training step of deep learning are gradually being added to CPUs (low and mixed-precision floating point arithmetic and matrix operations, for example). This means that exascale systems, which—with one honorable exception—are based on a node design with a CPU plus multiple GPUs, will be the perfect platform for the largest AI training challenges.”
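
To make that last point concrete, here is a small CPU-side sketch of the mixed-precision idea: operands stored at low precision, products accumulated at higher precision. It is a NumPy stand-in for what GPU matrix units do in hardware, not a claim about any particular chip, and the sizes and values are arbitrary.

```python
# Mixed-precision pattern: low-precision storage, higher-precision accumulation.
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((256, 256)).astype(np.float16)  # low-precision operands
b = rng.standard_normal((256, 256)).astype(np.float16)

# Casting float16 to float32 is lossless, so multiplying and accumulating in
# float32 mirrors the usual low-precision-in, wide-accumulate arithmetic.
mixed = a.astype(np.float32) @ b.astype(np.float32)

reference = a.astype(np.float64) @ b.astype(np.float64)  # high-precision reference
print("max abs difference:", float(np.max(np.abs(mixed - reference))))
```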

Simon David Hammond, research scientist in the Scalable Computer Architectures group at Sandia National Laboratories, says exascale’s application in the AI and ML realms is an interesting area of discussion in the industry today. “Exascale computing is pushing the limits of computational power within specific energy, power, and cost targets,” Hammond says. “By redesigning high-performance systems to meet exascale targets, the entire computing industry is much more aggressively trying to find new designs to meet these targets, and, as they do, many of these techniques will be equally useful within the AI and machine-learning technology space.”

Hammond sees HPC (high-performance computing) and ML as being intertwined; both fields are looking to extend their capabilities extensively in the coming years. “If we are successful in meeting exascale targets, the results should be that we also have high-performing ML systems within lower energy and cost budgets,” Hammond adds. “The reverse is also true; aggressive pursuit of high-performance AI should also generate low-level technologies that can be used in exascale systems. The question is also whether ML will be used within the scientific domain to support activities that today are thought of as HPC—for instance, replacing high-performance computing simulations with machine-learning inference.”
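
The surrogate idea in that last sentence can be sketched in a few lines: fit a cheap model to a handful of expensive simulation runs, then answer new queries by inference. The example below is a toy; “expensive_simulation” is a made-up stand-in, not a real HPC code or a production workflow.

```python
# Toy surrogate: train a cheap model on a few expensive runs, then infer.
import numpy as np

def expensive_simulation(x):
    # Placeholder for a costly, high-fidelity physics solve.
    return np.sin(x) + 0.1 * x**2

train_x = np.linspace(0.0, 2.0, 20)   # a small budget of full simulations
train_y = expensive_simulation(train_x)

surrogate = np.poly1d(np.polyfit(train_x, train_y, deg=6))  # cheap polynomial fit

query = 1.37
print("full simulation:   ", expensive_simulation(query))
print("ML-style surrogate:", surrogate(query))  # near-instant inference
```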

Hammond says exascale-class systems will impact a huge swathe of science and engineering industries, and that’s what makes exascale so important. “Some of the most exciting areas will be in much more capable engineering design capabilities; for instance, designing much higher efficiency and cleaner engines, more efficient planes, cars, and trucks, and novel approaches to energy generation, such as efficient wind turbine designs for renewables,” he explains. “By using virtual design capabilities, we should be able to optimize designs and do this faster and at lower cost than ever before. Another area which has the potential to be truly transformative is in medicine, either through the development of more efficient drugs or through the creation of drugs which more closely map to an individual’s personal genetics. I could imagine a world where virtual drug design and evaluation helps to reduce development costs and improve medical outcomes.”

A third area that stands to gain significantly from exascale systems is material design. “If we can utilize these huge machines to evaluate novel approaches to evaluating material properties, we could find future buildings would be stronger and cheaper to construct,” concludes Hammond. “Similarly, we could find materials that are stronger, yet lighter, reducing wasted energy in transportation and plane design. Exascale has the potential to provide dramatic changes to our lives.”

In the U.S., the ECP supports 24 mission-critical exascale applications that can be considered “first movers” for countless other applications in terms of formulating, implementing, and disseminating best practices and lessons learned for how to best exploit exascale computing for delivering on key science and engineering challenges.

8 HPC Trends to Watch in 2021

Doug Kothe, director of the Exascale Computing Project, Oak Ridge National Laboratory, lists eight trends in HPC to watch in the coming year:

1. Exascale computing
2. Accelerated-node computing
3. AI for science
4. Edge computing
5. Cybersecurity
6. Convergence of AI and HPC
7. Machine learning at exascale (“ExaLearn”)
8. COVID-19 (i.e., virtual drug design, epidemiological modeling, molecular docking, AI-trained surrogates for drug screening, biomolecular quantum-classical molecular dynamics simulations of proteins, and graph analytics for optimal vaccine distribution)

These 24 mission-critical exascale applications include the ability to:

• Predict the microstructural evolution of novel chemicals and materials for energy applications
• Accelerate the widespread adoption of additive manufacturing by enabling the routine fabrication of qualifiable metal alloy parts
• Demonstrate commercial-scale transformational energy technologies that curb fossil fuel plant CO2 emissions
• Address fundamental science questions
• Reduce major uncertainties in earthquake hazard and risk assessments to ensure the safest and most cost-effective seismic designs
• Forecast water resource availability, food supply changes, and severe weather probabilities with confidence
• Optimize power grid planning and secure operation with very high reliability
• Develop treatment strategies and pre-clinical cancer drug response models and mechanisms for certain cancers

Simon McIntosh-Smith, professor of high-performance computing and head of the HPC Research Group in the Dept. of Computer Science at the University of Bristol, says exascale computing will enable breakthroughs in a wide range of application areas, but there are four in particular that excite him the most. First, there’s weather and climate modeling. McIntosh-Smith says the resolution and sophistication of the models will increase dramatically, enabling hyperlocal weather predictions, better forecasting of dangerous weather extremes like storms, floods, snow, and heat waves, and a better understanding of humanity’s impact on our climate. Second, there’s rational drug design. Exascale may allow us to design and optimize drugs and vaccines for very specific needs.

Third, McIntosh-Smith says exascale could enable “in silico” gas turbine design, in which high-fidelity multi-physics simulations become so accurate and so trusted that new generations of jet engines could be designed entirely in simulation, without relying on the slow and expensive process of building a series of physical prototypes. Finally, McIntosh-Smith looks forward to fusion-energy simulation, which could lead to limitless clean energy.

From helping scientists answer some of the biggest questions about physics, chemistry, and human health to facilitating breakthroughs in energy, vehicle design, and manufacturing, exascale computing is an exciting part of what’s to come in the next decade. 2021 and 2022 could be watershed years in ushering in the era of exascale, with supercomputers like Frontier and Aurora just around the corner in the U.S., and international competition not far behind. It’s a compelling time in the history of computing, and society will benefit in myriad ways from achieving exascale.

“Even though we’re facing numerous challenges—for example, the slowing of Moore’s Law and the end of Dennard scaling—the future of computing is even more exciting than the past,” adds McIntosh-Smith. “Today, it’s hard to imagine what computing will be like by 2030 or 2040, as we’re seeing an explosion of different technologies and approaches in order to surmount these challenges. But, however we get there, tremendous new things are going to be possible, literally making the world a better place.”
