
A New Frontier: Leveraging U.S. High-Performance Computing Leadership in an Exascale Era
Thursday, September 15, 2022, 11:00 AM to 12:00 PM EST
Dirksen Senate Office Building, Room SD-562, 50 Constitution Ave NE, Washington, DC

Event Summary

High-performance computing (HPC) refers to supercomputers that, through a combination of massive processing capability and storage capacity, can rapidly solve complex computational problems across a diverse range of industrial, scientific, and technological fields. On May 30, 2022, the United States led the world into the exascale era with the introduction of the world's fastest supercomputer, Frontier, capable of executing over one quintillion (10¹⁸) operations per second. The advent of exascale computing unlocks the door to potentially solving heretofore intractable challenges across a range of business and scientific fields, from aerospace and biotechnology to clean energy and semiconductor design. In short, leadership in HPC represents an essential national strategic capability that exerts a tremendous impact on a nation's industrial competitiveness, national security, and ability to meet social challenges such as earthquake, hurricane, and weather forecasting.
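
For a rough sense of what one quintillion operations per second means, here is a minimal back-of-envelope sketch. The world-population figure (about eight billion) and the one-calculation-per-second pace are illustrative assumptions, not figures from the report.

```python
# Back-of-envelope scale check for "over one quintillion (10^18) operations per second."
# Assumptions (illustrative, not from the report): ~8 billion people, each doing one
# calculation per second, 365.25-day years.

frontier_ops_per_second = 1e18
world_population = 8e9
seconds_per_year = 365.25 * 24 * 3600  # ~3.16e7 seconds

human_seconds = frontier_ops_per_second / world_population  # ~1.25e8 seconds of everyone working
human_years = human_seconds / seconds_per_year              # ~4 years

print(f"Everyone on Earth computing one operation per second would need about "
      f"{human_years:.1f} years to match one second of Frontier.")
```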

Watch ITIF's event at the Dirksen Senate Office Building (SD-562), at which it released a new report exploring the promise of HPC in the exascale era, examining some of the latest cutting-edge applications of HPC, and articulating steps policymakers should take to keep the United States at the leading edge of this globally competitive, yet truly foundational, information technology.

Event Transcript

Stephen Ezell: All right. Well, good morning and thank you for joining us today. I’m Stephen Ezell, the Vice President of Global Innovation Policy here at the Information Technology and Innovation Foundation, and we’re pleased to have you with us today as we release a new report entitled “A New Frontier: Sustaining U.S. High-Performance Computing Leadership in an Exascale Era.” The report contends that leadership in high performance computing represents an essential strategic national capability and serves as a core enabler of U.S. industrial and economic competitiveness, as well as national security and defense capability.

To kick us off today, we're delighted to have U.S. Senator Marsha Blackburn with us to provide opening keynote remarks. Senator Blackburn was elected to the U.S. Senate in 2018 after representing Tennessee's seventh Congressional District in the House. In addition to her Armed Services, Veterans Affairs, and Judiciary Committee assignments, she serves as Ranking Member on the Senate Commerce Subcommittee on Consumer Protection, Product Safety, and Data Security. The first female senator to represent the Volunteer State, Senator Blackburn dedicates her public service to promoting opportunities for women and making America a more prosperous place to live. And a key way she promotes that is through the Oak Ridge National Laboratory (ORNL) in Tennessee, which now houses Frontier, the world's fastest, exascale-capable high-performance computer and a key enabler of industrial, biomedical, and defense innovation. So excited to hear about that, Senator. Thank you for joining us. And the floor is yours.

Senator Blackburn: Thank you. Yes. We were just talking a little bit about Summit, which was at Oak Ridge, and many Tennesseans did not realize that there was so much work being done in supercomputing, and that it was being done in Tennessee and being done at Oak Ridge, until we got to COVID-19. And during that time, as Summit started to crunch all of that data and help with predicting the spread, that's when people started to look at that. Now, one of the interesting things there also is that as they looked at what Summit was doing, it started to make sense to them what they were hearing from some of the health IT companies in the Nashville area that were working on predictive diagnosis. And it has been very interesting for us to follow that, and we've had such a nice conversation with your panelists about Summit and Frontier. And I know that some of them were able to be at Oak Ridge for the unveiling, the ribbon cutting on Frontier.

This is something that, as I talk with researchers and innovators, they talk about: they feel like they've hit a lot of the low-hanging fruit, but to move forward and design the experiments they want to perform, they need tools that are going to allow them to function and to research in a bigger universe and examine the more complex systems that will help them move as fast as is possible. And I think GE is first in line on the industrial use of Frontier. So the pressure is on. And we know that supercomputing and systems like Summit and Frontier take out some of the guesswork and will eliminate some of that trial and error. That is a very good thing.

DOE has such a capacity at Oak Ridge. We're pleased with that, and we know that some of their facilities at other places are working with the private sector, and we are seeing industry after industry begin to move to high-performance computing in a way that I fully believe will open the door to more innovative and efficient processes. Isn't it nice that we'll be able to solve problems more quickly and more accurately? Global governments have seen what's happening, and they're trying to jump on board also. In Portugal, we saw Volkswagen (VW) launch a pilot project for bus and traffic optimization. Of course, VW has their U.S. plant in Chattanooga, Tennessee, so we watched that closely. Mitsubishi Electric Company worked with a Japanese software company to make waste collection and disposal more efficient. This is something that, through the pandemic, we have seen local governments cite as a problem that needed to be solved.

Of course, all this global interest in high-performance computing means we are in global competition to stay dominant and to keep our adversaries at bay. China in particular is far too close for comfort. China is investing in and building quantum applications at an alarming rate, and we need to do more here in the United States to make sure our researchers can increase their own pace as we try to make certain we stay ahead of China. And no doubt your panelists are going to talk with you about that today. Before the pandemic hit, most of the debate over this issue [supercomputing] focused on innovation in defense technologies and nuclear programs. But over the past few years, those of us who consistently work on this issue have been able to force a new discussion about how innovation in the private sector can have a make-or-break impact on energy security and also on national security.

And today with me is Jamie Susskind, who handles these issues in our office; Jamie is here in your audience today. If I could pick one word to describe how Tennesseans felt during the heat of the pandemic, I would say that word was vulnerable. Not because of the virus, but because of how quickly our systems and supply chains broke down. All of a sudden, their access to medicines, food, basic technology was just gone. But now here in Congress we're slowly but surely changing course and focusing on programs and funding that will help both the public and the private sectors work collaboratively so we can triage these vulnerabilities and use the most powerful tools we have at our disposal. As a member of both the Commerce and the Armed Services Committees, I have the opportunity to meet with advocates from all sectors. Not all of my colleagues have this opportunity. So it's important for you all to keep putting yourselves in front of policymakers and explaining that the work you're doing, the things you're working on in the lab, have real-world applications.

As I said, these are applications that will help solve problems sooner. I want to end by making it clear that one of my goals in the coming year is to make sure that the U.S. government offers even more support to research, development, and deployment of high-performance computing and applications. Earlier this year I sent a letter to Secretary Granholm urging the DOE to work more closely with the private sector to get that development and deployment timeline down to between one and three years. We do not have time for latency and lag when we are looking at this research, development, and deployment. We're focusing on directing more investment toward developing new technology. And I'm also behind legislation that would offer funding for key research infrastructure and partnerships in addition to that development support. You all need to stay involved. You need to stay in contact, stay in touch with us, and let us hear from you as you continue to advocate for the research and development that is necessary. Thank you so much for letting me step in.

Stephen Ezell: Senator Blackburn, that was fantastic.

Senator Blackburn: Thank you.

Stephen Ezell: Thanks so very much. A round of applause for the Senator. And I should also note that on pages 22 and 23 of the report, we have a comprehensive analysis of the use of supercomputing at ORNL with regard to developing COVID-19 vaccines and therapeutics, with researchers there performing over 8,000 simulations toward developing COVID-19 drugs and therapeutics, which resulted in 77 candidates being identified and a number of those entering clinical trials.

So let me just say a few more words about the report before I introduce our panelists and turn the conversation over to them. As I noted, the report explains what HPC is and why it matters before turning to examine HPC applications across a variety of industries, from aerospace and auto manufacturing to biotechnology and clean energy. Throughout, the report endeavors to find very specific examples of impact: from helping P&G save over $1 billion in operations over the past decade to what I thought was one of the coolest examples in the report. Boeing has now developed for the Air Force the next-generation jet fighter trainer aircraft, the Boeing T-7A Red Hawk. Now, the aircraft was designed entirely virtually on high-performance computers, and the Air Force selected it before a single plane was even built, right? This shows the power we now have to design in modeling and simulation (M&S)-based environments.

And then when we make the first prototype for a plane or nuclear reactor, we're deploying it in a more confirmatory than exploratory way. So if you've ever flown in a Boeing 777, when they built that back in the 1990s, they had to make 77 physical wing prototypes for this thing. By the time they built the 787 Dreamliner, they only had to build 11, so it was a seven-fold increase in efficiency. And since every wind tunnel test you have to do costs millions of dollars, this is a massive advantage in efficiency, time to market, speed of innovation, and so on, right? So there are many examples like that throughout the report, but it's there for you to read. So let me turn the discussion now over to our panel. And I note for our audience watching online that you can submit questions, which we'll take at the end, via the Slido app, and the presentations you'll see here are also available online for PDF download.
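
As a rough sketch of that prototype arithmetic, the per-test cost below is a purely hypothetical placeholder standing in for the "millions of dollars" per wind tunnel test mentioned in the remarks, not a figure from the report.

```python
# Prototype-reduction arithmetic from the 777 vs. 787 wing example above.
# The assumed per-prototype test cost is a hypothetical placeholder.

prototypes_777 = 77
prototypes_787 = 11
assumed_cost_per_prototype_usd = 3e6  # hypothetical: "millions of dollars" per physical test

reduction_factor = prototypes_777 / prototypes_787                      # 7x fewer prototypes
illustrative_savings = (prototypes_777 - prototypes_787) * assumed_cost_per_prototype_usd

print(f"Physical wing prototypes reduced by a factor of {reduction_factor:.0f}")
print(f"Illustrative savings at the assumed cost: ${illustrative_savings / 1e6:.0f}M")
```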

First, today we're going to hear from Mr. Justin Hotard, who leads Hewlett Packard Enterprise's, or HPE's, high-performance computing and artificial intelligence business group. The HPC and AI team at HPE provides digital transformation and AI capabilities to customers, helping them to address complex problems through data-intensive workloads. Justin previously served HPE in roles as senior vice president for corporate transformations and as president of HPE Japan. He holds an MBA from MIT.

Next, we're delighted to be joined by Mr. Rick Arthur, who is the senior director and senior principal engineer for computational methods research at General Electric Research. Rick leads pathfinding efforts in applying computational methods to drive innovation in diverse industrial sectors, from healthcare to air and rail transport to energy. Rick represents GE in government policy and project discussions, including as a member of the Department of Energy's Advanced Scientific Computing Advisory Committee. He holds BS and master's degrees in computer engineering and an MBA from SUNY Albany.

Next we'll hear from Ram Ramaswamy, who is the director of the Geophysical Fluid Dynamics Laboratory at NOAA. He's been a central figure in climate change research for several decades. From 1992 to 2021, Ram was a lead author, coordinating lead author, or review editor for each of the major assessment reports produced by the Intergovernmental Panel on Climate Change. He's also been a coordinating lead author for the World Meteorological Organization assessments on stratospheric ozone and climate issues.

Last, but not least, we’re delighted to have Bob Sorensen with us today. Bob is Hyperion Research’s senior vice president for research and chief analyst for quantum computing. Before joining Hyperion, Bob served 33 years for the U.S. government as a senior science and technology analyst covering global advanced computing developments. Bob has degrees in electrical engineering and computer science from George Washington University and the University of Rochester. And he requires me to say that he strongly prefers C to Python. With that, thank you all for being here today. Justin, the floor is yours.

Justin Hotard: Thank you very much, Stephen, and it's a pleasure to be with all of you today. First, I thought it'd be helpful to maybe start with what a supercomputer is and what makes it different, because many of you probably... Well, most of you tuning in remotely are probably doing so on a computer today, and many of you in the room probably have mobile phones. And many of you are used to accessing the Internet or using cloud services or applications like email. A supercomputer is fundamentally different. A supercomputer is a unique system that allows you to run massive computations, massive problems, all at once. Think of it as stitching 30,000 laptops together in parallel to run one application at the same time. What's also different about a supercomputer is that, because they're so performance-intensive, we can't just cool them with air; we actually have to run water through them.
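
To make the "many machines on one problem at once" idea concrete, here is a toy sketch that splits one computation across several worker processes on a single machine. It is only an analogy: real exascale systems coordinate tens of thousands of nodes over MPI and specialized interconnects, not Python's multiprocessing.

```python
# Toy analogy for parallel supercomputing: split one big sum across several workers,
# then combine the partial results. Real HPC codes use MPI across many nodes.
from multiprocessing import Pool

def partial_sum(bounds):
    """Sum the integers in [start, stop) -- one worker's share of the problem."""
    start, stop = bounds
    return sum(range(start, stop))

if __name__ == "__main__":
    n = 10_000_000
    workers = 8                                     # stand-in for "30,000 laptops"
    chunk = n // workers
    pieces = [(i * chunk, (i + 1) * chunk) for i in range(workers)]
    pieces[-1] = (pieces[-1][0], n)                 # last worker takes any remainder

    with Pool(workers) as pool:
        total = sum(pool.map(partial_sum, pieces))  # combine the partial results

    print(total == n * (n - 1) // 2)                # sanity check against the closed form
```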

So it's a very unique system, very different than what you might see if you walked into a typical cloud service data center or enterprise application data center. What's important about that is that it allows us to do things we can't do anywhere else in the world. In fact, supercomputers have been with us for a few decades. They run our weather forecasts. And, as you'll probably hear from one of my esteemed colleagues here, they help us design airplanes. And as Senator Blackburn mentioned, they helped accelerate the planning and the mitigation of the COVID-19 pandemic. In fact, on supercomputers, we identified the spike protein that led us to accelerate vaccine research and ultimately vaccine production, and we accelerated our understanding of how the spread would occur. Supercomputers have an incredible impact already on our lives. They've helped us design and accelerate research in areas like Alzheimer's. They've also helped us to better plan and protect our nation and our citizens through defense applications both at home and overseas.

But the key thing that's different now is we've entered a new era. We've entered the exascale era. This was an incredible milestone, and in our industry we consider it like putting a man on the moon. And it started really with a vision over a decade ago and with investments made over a decade ago. And to give you a perspective on how significant an accomplishment this was, the fastest computer that was running 13 years ago was a thousand times slower than Frontier. The fastest computer running just over 20 years ago was a million times slower than Frontier. In fact, the Frontier exascale system is as fast as the next seven fastest computers in the world combined.

It's a massive step, and it took great partnership and great vision between the United States Department of Energy, a great partner like Oak Ridge National Laboratory, and a broad set of industry partners including Hewlett Packard Enterprise and many others. These things are not done overnight. They are the result of bipartisan and broad commitment and shared vision. And what's really important is, while this is a great milestone and while we've won the race to exascale, which is really important for our country, for our society, it's actually the beginning. The exascale era is the beginning of a new era. It's an era where we're going to be able to get incredible insight from supercomputers. The types of things that we're working on with the United States Department of Energy already in exascale are problems like helping better predict major weather events, things like hurricanes, and preventing forest fires.

We're also looking at alternative energy sources. How do we accelerate those? How do we invest in and better understand renewables? We're looking at things around protecting the electrical grid. I think we've all seen the traumatic impacts that weather events or wildfires have had on the electrical grids in places like Texas and California; supercomputers, and particularly exascale supercomputers, give us the ability to understand that, and they actually give us a time machine to look ahead at what might happen and how to prevent and better respond to those unexpected events. These are foundational to making our society better, to improving our climate, and ultimately to protecting our borders.

And this journey is also very critical because you heard Senator Blackburn talk about the quantum future. Quantum is a critical part of our future innovation, but the bridge to quantum will actually come from supercomputers, and exascale supercomputing technology will help us massively accelerate AI, things like autonomous driving, and areas like integrating traditional modeling and simulation with machine learning so we can more accurately predict major climate events or the characteristics of nuclear fission; it will also help us accelerate the journey to quantum. And this is why ongoing investment is so critical, because while we're leading today, it is critical for us to continue to lead. Senator Blackburn touched on the race against China. This is a competitive race that we must continue to win. Winning once is not enough. We have to continue to lead.

And as I touched on earlier, the public-private partnership is a critical element of success. The ecosystem between the United States government and many of our industry partners is what will make the difference in allowing us to continue to lead. We need to continue to invest. We need your support to continue to look at investing in new technologies. Many, many technologies that we brought into the supercomputer today would not have been made available if we didn't have the foresight and the vision that we had with the United States Department of Energy. There's a lot of great technology investment happening, but as we all know, a lot of that gets put into social media or cryptocurrency or other areas that have commercial applications.

And while many of the applications that we're delivering now and will continue to deliver with supercomputers will have commercial value in areas like autonomous driving and accelerating cancer research, it's really critical that we continue to invest in bringing some of these to market together, in a public-private partnership, to be able to innovate in things that bridge areas where we can accelerate technology to enhance our lives and enhance our livelihood. So what I would urge all of you today to consider, and I look forward to hearing from my esteemed colleagues, is to continue to work with us to understand how we can accelerate these technologies, how we can invest together, and how we can find new applications and new uses for this technology. Because supercomputing, AI, and, in the future, quantum computing will be things that make our climate healthier, make our world safer, and ultimately improve our livelihood. Thank you.

Stephen Ezell: Thank you, Justin. That was fantastic. Let me bring up Rick's presentation. One of the other examples from the report I thought was cool was that Pfizer ran 16 million modeling and simulation runs over the past year, over 90 percent of those as it worked to develop COVID-19 vaccines and therapeutics. Another cool example in there was on cancer research. There's a set of genes called RAS genes that account for over 30 percent of cancers, but thus far we've been unable to develop any drug targets against RAS genes because scientists have lacked a detailed molecular-level understanding of how RAS genes engage and activate proximal signaling proteins. There's now a big National Cancer Institute research initiative using exascale computing on this. Now they have the computing capability to try and tackle this particular problem of developing innovative drugs targeting RAS gene proteins. So very interesting. Okay, Rick, the floor is yours.

Rick Arthur: So a lot of the stuff I want to talk about is very visual, and so there are slides here. Thank you for the opportunity to talk about our exciting achievements in computational modeling and our aspirations for exascale. Just as Justin mentioned with public-private partnering, GE has engaged a broad range of government agencies and national lab facilities to take on science and engineering challenges in our high-tech products.

These extensive partnerships and the technical diversity of the partners are necessary to deliver critical products and services for transportation, energy, and healthcare, as well as a wide variety of past efforts in legacy GE businesses. Every minute, one-third of the electrons coming out of an electrical plug are generated on GE equipment; every minute, 30 aircraft take off with GE jet engines; and every minute, over 16,000 medical scans are performed on GE equipment. So this is equipment that, unlike digital or consumer products, is high cost and high consequence. And it is critical infrastructure. So its operation and reliability across the life cycle are essential, and the manufacturing and field servicing of that equipment is just as important as, or more important than, the design. So we see here a picture of a gentleman performing a service call on a wind turbine.

So here we have what we would call a beta test, which is quite a bit more than a web app widget. And commercial success depends upon competitively leveraging world-class skills, tools, and knowledge to deliver, as I mentioned, reliability and confidence in products like this, as well as value, capability, and performance. So one of those tools is models, and you and I use models to think about all the decisions we make: the route you took to get to this event today, how to make dinner, or how to design and operate a wind turbine. Increasingly, computational power has become an essential tool for numerical models, which are now critical for modern competitiveness. And we need models to clearly see, so we can understand bounding constraints and causes and effects, and we need to deeply understand in order to predict what-if counterfactuals to guide our focus.

So consider driving through a foggy night: because of limited clarity, you drive more slowly and cautiously, or else you risk making mistakes, perhaps causing or coming to harm. When our understanding is flawed or insufficient, we make decisions more slowly in order to react and readjust. Uncertainty and ignorance result in wasted time, effort, and resources, [thus] endangering individuals and even jeopardizing wider systems such as our economy. So, like a car's headlights and fog lamps, lane guides, and GPS navigation, these are tools that inform us, help us see, and let us act with more speed and confidence moving forward when we are surrounded by volatility, uncertainty, complexity, and ambiguity, the hallmarks of what the U.S. Army War College calls the VUCA world.

"To see" in the practice of medicine: we're all well aware of high-tech machines that lend our human eyes the power to see into the body with superhuman detail, far more safely and more capably than the now-disappearing practice of exploratory surgery. So just as an X-ray, a CT scan, a microscope, or a telescope enables the fields of medicine and science, computers have become essential to see the detail and to tame the complexity and vast data, through simulations and data analysis, in the modern practice of science and engineering.

Turbomachinery is at the heart of jet propulsion and power generation, which of course are core to GE businesses. Compress air, add fuel, burn it, expand the high-pressure gas, and you get your thrust and energy. Air flows over these shaped metal fins called blades, which are arranged around a rotor across many layers called stages. And it's important to understand effects like turbulence, the red [part of the image] showing separated flow as opposed to smooth flow, to enable efficiency, performance, stability, cooling, noise emissions, and many critical factors. Over the past decade, GE has engaged with a number of the national labs on a variety of projects spanning the entire flow from the engine's fan intake to its exhaust.

For wind farms to reliably generate the expected power, it becomes necessary to understand wake effects, that is, the shape and the size of the wake behind each turbine. These effects can reduce the downstream efficiency of a turbine by up to 40 percent. To the right is a study we did on the Summit machine that Senator Blackburn mentioned, on coastal low-level jets that produce a wind velocity profile that is of importance to the design and reliable operation of offshore wind farms.

We're advancing toward previously infeasible studies of wind patterns spanning kilometers in height over hundreds of square kilometers of territory, down to a resolution of air flow over individual blades, to form new control strategies for better performance and reliability. The RISE engine design shown here seeks to greatly increase propulsive efficiency, which is necessary to enable hydrogen as a route to zero-carbon-emission fuel in transportation. It's a stepping stone because, when you compare hydrogen as a fuel source, the price per kilometer traveled, the aircraft's range, the amount of volume on the aircraft needed to store the fuel, and so on are currently insufficient for hydrogen-powered propulsion, unlike terrestrial power plants, which can already burn hydrogen, for example.

This is a radical design change you see on the wing here, from the ducted fan that we're all used to seeing. This is not a turboprop; this is a completely new design, and understanding it is requiring us to perform even more advanced analytics. At the right here, you see the comparative size of what we can test in a wind tunnel. There just aren't wind tunnels of the scale necessary for our production size. So, as mentioned relative to the Boeing example, we have to sell these engines with performance specs that are not actually finalized until we run certification tests. So as we can better project and have confidence in what those will be, we can reduce waste and errors made along the way. And certainly, as the success of wind farms has grown, we now have wind farms that are interacting with other wind farms, and modeling the complex interactions between those is yet another step up in complexity.

So in summary, to be competitive in modern high-tech industries and drive speed, agility, and confidence, you must employ state-of-the-art models to see through the VUCA fog and drive disciplined understanding of your products and their operations. Computational modeling is now core to the state of the art, with the models' predictive accuracy determined by the completeness of the physics embodied in the models, the numerical correctness and robustness of those models, and our ability to assert confidence bounds to shape options from which model-guided decisions can be made toward competitive success. Thank you, and I look forward to your questions.

Stephen Ezell: Excellent, Rick, that was great. I mean, amazing potential, a potential 40 percent increase in the output of [energy from] wind turbines and entire wind farms. And I think your presentation also highlights a key point for U.S. industrial and economic competitiveness. GE and Boeing…you have to sell your engines and aircraft to customers before you've ever manufactured a single one. You're selling them on design specs that you've built in modeling and simulation environments, and it's the quality of that computing architecture that gives you the confidence to put forward a set of performance specifications you're confident you'll be able to achieve, at a profitable price point. And that's a key dimension of industrial competitiveness in a global age.

Also, I want to highlight one other thing you were kind enough to share with me, Rick. In the report you said, "Researchers can only model a universe that will fit within the size of the largest computer they can access. Therefore top leadership-class computers set the threshold for what phenomena can be perceived, studied, and understood by researchers. Exascale computing will give them the ability to develop more accurate data derived models with far greater completeness, accuracy, and fidelity in scale and scope." And that's really what we're talking about at the frontier of biomedical research or astrophysics or predicting weather and climate, as we are going to hear now from Ram Ramaswamy.

Ram Ramaswamy: Okay, yeah, good morning everyone, and thank you, Stephen, for the introduction. I think it's really great that the two esteemed panelists before me already articulated very well the need for HPC and modeling. These two play a very critical role in weather and climate forecasting. I want to talk about weather and climate modeling and the frontiers there, both the advances that have occurred on the science side and the opportunities that exist courtesy of the advances in HPC. What you see on the left-hand side with the animations going on is the atmosphere, and on the right side the ocean, and these are two critical components of the system that we want to try to understand and then predict. So this is the earth system. On the left-hand side you see the schematic of the earth system. On the upper right you see all the interactions that have to be accounted for, with the computational needs. So computational needs in weather and climate modeling take two forms. One is representing the interactions between the atmosphere, land, the biosphere, cryosphere, and the ecosystems, because there's interaction back and forth between the components.

And then on the other side, if you look at the lower right-hand panel, you see a three-dimensional picture, with three axes, because each one of them is important. One is resolution: obviously, the greater the resolution, the more detail you capture for a region. The other is complexity, because these systems are moving at different time scales, so you want to capture each one accurately and the interactions between them accurately as well. And then there is the simulation time, whether you want to simulate for weather purposes, like for a few days, or for climate out to seasons and decades; each thing adds to the requirements on the HPC. And the more advanced the HPC, the more you can actually get out of scientific models. So on the upper panel you see the latest model that we have: three kilometers, global.

So this is three-kilometer resolution that you're getting information at. And what you actually are seeing are clouds. And you can even see water vapor forming out of the oceans, which is the basis for hurricanes and tropical storms. And then you see the giant movements of the air in the northern hemisphere and southern hemisphere. This is what you're able to capture at three kilometers. And this, by the way, has been run on the systems at Oak Ridge; that is where it comes from. And with this you can capture not just the weather but the details of the weather: the hurricanes, the storms, and the clouds, even the tiny wisps of clouds that you see at the three-kilometer scale. This is what you're able to do. And this is something which is not just about the highest resolution; it should also be a state-of-the-art model. It's actually being tried out now on an experimental basis. It is giving experimental forecasts, and we're just waiting to be able to run it on advanced HPCs.

This is an example of the outputs from a coupled climate system. This is now a coarser-resolution model than the one shown on the previous slide because we can't go higher than 25 kilometers in trying to simulate the whole coupled system on an operational basis. The top panel is showing the observed precipitation on an annual mean basis over the U.S., and what you want to see from the diagram is the blue, which indicates where there is relatively more precipitation. The yellows and the browns indicate relatively less precipitation; that's where the West is going dry. And then in the upper panels, the northwest is showing the increase in precipitation because of the mountains there. Now on the left-hand side of the panel you see the simulation from a model. So the top one is observed, and the left-hand-side bottom panel is the model simulation, with a hundred-kilometer grid box.

This is kind of where we were about seven years ago with a hundred kilometers, that’s all we could do. You can see it captures some of the basic features, but it really is not very satisfactory. And then the bottom right is what we can do right now with 25 kilometers on an operational basis. And you see that captures the observed panel on the top rather well. And so this is the test of how far you can go with HPC combined with really state of the art scientific principles. And the next one is sort of even better because this is now actually showing a weather forecast. It’s actually Hurricane Ida, the remnants of Hurricane Ida as it moved through Tennessee and then to the northeast to Pennsylvania, New Jersey, New York, and New England. And again, the top panel is showing the observed precipitation which happened.

And then the bottom three panels are showing forecasts with a model that has a nesting of three kilometers over the continental U.S. And you can see the details it captures, even five days out, five and a half days, and then three and a half days and one and a half days ahead. Each one is reasonably close to the observed precipitation. And so this gives you the potential for what's in store if you're able to do this on a routine basis with the HPCs that are coming on board. And so I come to a final slide, which is: where is this all leading? With the high-resolution models, we first of all can do seamless prediction. So we are not talking about just a few days or a few weeks, but really across the spectrum: seasons, then interannual, and then decades.

And that's something that is now going into the prototypes of these models, ready to go. And what it can give us is the frequency of occurrence of extremes, the locations, the duration, and also the multiplicity of consequences that happen when extremes occur. And so what this means, of course, is not only protection of life and property and early warnings so that people can adapt, but also savings in terms of economics, and it also affects security: national security, water security, energy security, food security. Really everything. There is so much benefit to be had when we get to [really] advanced computing. So, as a bottom line, the faster the computing, the better the models, the more accurate the predictions, and the earlier the predictions, which is what society wants. So I will end there. Thank you, Stephen.
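
To put those resolutions in perspective, here is a rough grid-count sketch. It assumes Earth's surface area of roughly 5.1 × 10⁸ km² and idealized uniform square cells; real model grids, vertical levels, and time-step costs differ, so treat it only as an order-of-magnitude illustration.

```python
# Rough horizontal grid-column counts for the resolutions discussed above
# (100 km, 25 km, and 3 km), assuming idealized uniform square cells.

earth_surface_km2 = 5.1e8  # approximate surface area of Earth

for resolution_km in (100, 25, 3):
    columns = earth_surface_km2 / resolution_km**2
    print(f"{resolution_km:>3} km grid: ~{columns:,.0f} columns")

# A 3 km grid has roughly (100/3)^2, or about 1,100x, more columns than a 100 km grid,
# before counting the shorter time steps that finer grids also require.
```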

Stephen Ezell: Thanks, Ram, that was great. And the report talks about this merging of weather forecasting, which is about ten days out, with longer-term climate forecasting, and how we get more accurate predictions, especially in that not-just-ten-days but one-to-two-month timeframe, to help plan events, et cetera. There's also a really cool set of examples in the report about earthquake forecasting that they've done out in California. They did 250,000 simulations of earthquake potential under the ground in California and found, for instance, that in the week after a 7.0 magnitude earthquake, it would be 300 times more likely than previously expected for there to be another earthquake of that magnitude on the fault line in the following weeks.

I should also note, Ram, that a 2021 NOAA report on priorities for weather research found that the United States could use on the order of a hundred times the current operational computing capacity (for weather forecasting). So that points to the importance for policymakers of continuing to invest in these technologies. All right, Bob, I'll turn it over to you. Give your take on the global and technological landscape of the HPC world.

Bob Sorensen: Great, Stephen, thanks for inviting me. This is a return trip for me; I was here in 2016 when you did an earlier version of the report. I think we were in this exact same room, and I noticed that of the four panelists, three of them have retired since then. But I think what that does is highlight the fact that while the personalities may change in this sector, the technology lives on. As Justin said, this is a 40- or 50-year journey, but it hasn't been one that's been static. It changes all the time. The HPC of today is not much like the HPC of ten or 20 years ago. And so what I thought I'd hit on today is really what's changed since we were here in 2016. What are some of the topics that we didn't talk about then, or maybe touched upon but hadn't really come to fruition?

And the first one, which is fascinating here, is we've talked a lot today about modeling and simulation, and that's really been the bread and butter of high-performance computing over the years: the idea of modeling a physical phenomenon that was too expensive or too dangerous, perhaps in a nuclear test, or where you simply couldn't get the data you really needed to do something interesting. I think it's fascinating that BMW doesn't really need to build cars to crash them into walls anymore. They're perfectly content to sell systems that have been crash tested entirely computationally. It's interesting. So mod sim in the past has kind of been the bread and butter of HPC. Well, since 2016, we've seen explosions in other interesting and compelling use cases in HPC that really have changed the state of where the sector's going. We started looking at something new, which we just colloquially called Big Data.

The idea of looking at a profusion of data that was collected, whether it be from the Internet or private sources or telemetry or remote-sensing instrumentation, and starting to do interesting analysis on it to reach some interesting conclusions. And one of my personal favorites is Bank of America: when they were starting out in their big data efforts, they realized that they really didn't need to do a lot of background checking on people's financial status to figure out if they were a good risk for a loan, because they found things that people did that indicated they were very good risks. And one of the things they found, which was an absolute guarantee of incredible credit, was if you bought those little things that you put under your furniture if you have hardwood floors so you don't scrape the floor.

And they found that people who do that never, ever default on their credit. People who tend to use their credit card to pay their bail bondsmen generally have issues with credit going forward. So this was a big data issue, and what it did is it added a certain amount of diversity to what HPCs [could do for you], because in terms of big data you had tons and tons of data out there, and it forced you to buy big machines that could do fast computations, whether because of a timing issue or just because the data sets were so large you needed big machines to handle them. Well, then along came the AI explosion, machine learning and deep learning: the idea of no longer telling a machine what to do, but having the machine figure out what it needs to do. And we've seen all of that come through; it's been a huge application space.

But again, the larger and larger big data and machine learning applications require HPCs. They require specific architectures. So what we started to see now is a bifurcation in what an HPC is. Is it a big data machine? Is it a mod sim machine, or is it an AI machine? And with some of these large systems, like Frontier, one of the reasons those machines are expensive is that they're designed for a one-size-fits-all architecture right now, and that's expensive and that's complicated. And so one of the things that we believe going forward is that we're going to start to see more use case-specific systems: the idea that I don't build a machine to do all of these diverse workloads; I now have workload-specific kinds of applications. I want to build a mod sim machine, I want to build an AI machine. Those machines can be less expensive in some sense because they don't have to do it all; they only have to fit the specific workloads at hand.

So that's a change we think we're going to see going forward. One of the other things that's really happened in the HPC world, which has upset some people and delighted others, is the opportunity to move from what we call on-prem HPC, the idea of having a big $600 million system sitting in the basement somewhere in your facility, to migrating some of those workloads to cloud environments. And it's the cloud providers we're all used to: AWS, Google, Microsoft Azure, Oracle, and, say, Alibaba out of China. If people here either live near Ashburn or have had a flight into Dallas and looked out the window, you'll have seen these large data centers; Ashburn is one of the largest collections of these data centers in the world right now. And those cloud service providers are making it much more attractive, with lower barriers to entry, to explore HPC.

You don't need to buy a big system, and you don't need a lot of complicated experts to help run those systems. You can move a job to the cloud, learn what you need to do, and then essentially decide whether you grow into a cloud or an on-prem environment based on the economics you can justify. But we're seeing real vitality in the HPC cloud migration, not only from the on-prem folks but also from what we call cloud natives: folks that basically start a research program, they're not HPC users, but they go right to the cloud and use a pay-as-you-go model to lower barriers to entry. And so not only does HPE build on-prem systems that rule the world, they also have cloud computing capabilities. It's no longer a question of, do I run my workload on prem or in the cloud? I have to have both.

I have to be able to balance the two. That's something that we weren't talking about six years ago. The other thing we've seen, and it touches a little more on the political as opposed to the technological, is that there's been an increasing emphasis on indigenous capability around the world. Up until Frontier came out, the fastest machine in the world was running in Japan; it cost them about $1.1 billion. It was called the Fugaku system. It was a brilliant engineering design, mainly because they wanted to build a custom machine that was entirely Japanese-built. They didn't want to use U.S. parts or U.S. components except when they absolutely had to. We've certainly seen that U.S. systems rely primarily on U.S. components. We've seen China moving very aggressively toward reducing its dependence on U.S. technology. Part of that is in some sense driven by concerns about U.S.-China high-tech trade friction and export controls.

The most recent announcement, a couple weeks ago, concerned certain components from Nvidia, which makes some of the critical computational engines for AI, not being allowed to be sold into China above a certain level of sophistication. So China has really embarked on an indigenous HPC design capability, very aggressively. We've actually seen it in Europe as well. There are a lot of interesting startups going on in Europe, and there are some EU programs to develop indigenous capability. So even Europe, which basically consumes 30 percent of all HPC used in the world but produces less than one percent, wants a part of that. They want to have control of their systems. So we're starting to see a certain amount of bifurcation, or even diversification, or stovepiping if you will, in what used to be a very collaborative and global community, and that's adding a certain overlay to it, certainly in a political sense.

It duplicates effort, and it may result in people going off on an evolutionary track that may not be acceptable. So that's a double-edged sword in terms of what that really means. One of the things that we could talk a little bit about is this international competition issue, and I just want to weigh in with my two cents here. I almost feel like it denigrates the capabilities of the Frontier system to call it the fastest machine in the world, because that decision is based on something called LINPACK, which is a rather simple test of peak performance. It's a very simple job. It's not simple to run sometimes, but it's the metric that a lot of organizations use to say, we're the most powerful one. But what it doesn't do is do justice to the sophistication of the design and the value added of that particular system.

These machines are bears to program. Sometimes you may have 40 million or more computational cores that you have to control and get working in concert to reach that theoretical peak number. And so it's very difficult to extract performance, and you just can't build a system and get that performance for free. There's a very sophisticated infrastructure of programming, of knowledge, of expertise, and of specific applications to make that happen. So the Semiconductor Industry Association came out a couple months ago, and they did some very good analytical work, and they basically said that if you looked at the top ten HPCs in the world that actually exist, versus the ones that are on the list, China would have seven of those ten systems. China has made a political decision not to list their most powerful systems on the TOP500 list anymore, simply because they don't want to draw attention or basically increase trade tensions.

So, seven out of ten: should we be nervous? Frankly, one has to look at the capabilities of what the computer can do to deliver useful workloads to the scientists and engineers who are using it. It's not about theoretical peak. And the reason I like to warn people about that is it goes back to the issue I talked about earlier of workload composition. We're on an unsustainable trajectory, and I hope you agree with me on this one, Justin, when it takes about $600 million to build HPC systems. Argonne National Laboratory is hopefully going to take possession of another DOE exascale-class system. It's going to be exascale capable, and it's going to come in at about 60 megawatts to power that system. We'd like to use a rule of thumb of about $1 million per megawatt per year for an HPC system. So for a $600 million system like the one at Argonne, to be called Aurora, it's $600 million to buy the machine plus $300 million worth of electricity across a five-year lifespan.
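
The power-cost arithmetic in that remark can be written out directly; this sketch just restates the rule of thumb given above (roughly 60 megawatts and about $1 million per megawatt per year over a five-year lifespan).

```python
# Electricity-cost rule of thumb for an exascale-class system, per the remarks above.

power_mw = 60                # ~60 MW to power the system
cost_per_mw_year_usd = 1e6   # ~$1 million per megawatt per year
lifespan_years = 5

electricity_cost_usd = power_mw * cost_per_mw_year_usd * lifespan_years
print(f"Estimated electricity cost: ${electricity_cost_usd / 1e6:.0f}M over {lifespan_years} years")
# -> roughly $300M in power for a ~$600M machine, the "unsustainable trajectory" point.
```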

These are unsustainable trajectories when it comes to next-generation systems. That's why I talk about the idea of smaller systems targeted at specific workloads. So we are not in a position where we want to say China may have the most powerful systems when we may have a collection of smaller, more capable systems doing the jobs they need to do; they just don't run LINPACK as well. So we need to really understand that we're going to have a different kind of trajectory going forward. I hope that ten years from now we're not here talking about zettaflop systems, because that's going to be very expensive, and that is not what really benefits any scientific researcher. And again, the Frontier system is a brilliant design, a great architecture, but what I'm enthusiastic about is not only what Oak Ridge can do, but what HPE can do to take that technology and deliver it to the scientists and engineers at the companies we've talked about today to do valuable workloads and such.

And to me that's one of the greatest values of what the Department of Energy does. It's not just meeting their own important missions at the Office of Science and some of the National Nuclear Security Administration organizations. It's procuring systems that ultimately contribute to the overall health of the U.S. HPC competitive sector, which ultimately ends up driving our larger commercial, national security, and economic vision for what this country can do. So it's a lot more than building a single system. It's about being the pointy end of the spear in an ecosystem that thrives and will continue on a trajectory that may be slightly different. But again, the sector always changes, and that's the one thing we have to be completely clear about. There we go. Thanks.

Stephen Ezell: Excellent Bob, that was great. And I think you highlight a key point that too many in the media and the policy makers focus on these big shiny objects, the fastest this or that, but where the United States has truly excelled is in building machines that are capable of useful application, to use Rick’s term in the report, and in building democratized access to these types of tools through things like the National Center for Supercomputing Applications (NCSA) in Illinois and ensuring that researchers and students and companies can use these tools in effective ways.

One other point I wanted to mention is that semiconductors and high-performance computers, apart from biotechnology amidst the pandemic, truly represent the world's most important industry. We wrote a report last year called "An Allied Approach to Semiconductor Leadership," which talked about how like-minded countries have to work together in developing next-generation semiconductor architectures and on issues like standards, workforce training, and workforce education. And I am hopeful that in initiatives like the EU-U.S. Trade and Technology Council we can really focus on collaboration with our allies. Because what you say, Bob, about the shift toward indigenous efforts to build these systems is real and we have to be aware of it, but I think it's important that we try and shift that trajectory into more of an allied, as opposed to a bifurcated, approach.

We have just a few minutes left, but we do want to take any questions if we have any from the audience or online. So if you have a question, raise your hand and identify yourself; if not, I have one or two quick ones to get through. Maybe I'll just quickly ask: of course, in the CHIPS and Science Act we did see a 40 percent increase in the authorization for the Advanced Scientific Computing Research (ASCR) program. That does need to be complemented, of course, with similar increased investments in the sister Advanced Simulation and Computing program run at the National Nuclear Security Administration. That's got to be funded in the NDAA. And of course we have to be sure that all the great things that were authorized in CHIPS and Science actually get appropriated by this august body this fall. But I wanted to ask: beyond increases in investment, do any of you have further policy recommendations, whether they pertain to workforce skills or other topics?

Justin Hotard: Well, maybe I can offer a couple of things, Stephen. I think, first of all, those investments are particularly important. It's key that we appropriate them to advancing technology. Think of it as pre-commercial technology: things that we envision could go into the general market in five or ten years, and accelerating those so we can take advantage of them in supercomputers. But there are also two things I think are really critical. One is, if you look at the CHIPS and Science Act, it's a phenomenal first step. It's taking the most powerful components inside a computer, the processor and some of the other key technologies, and making sure we have sovereignty and support for local investment. But it's not enough. If you crack open a computer (I wouldn't recommend cracking open your phone, because I don't think they usually like that very much when you do it), what you'd see is a bunch of little chips on a board.

And today, even if I make the highest-end semiconductor in the United States, I still need to depend on China and on other countries that may have geopolitical instability in some cases to actually source the other components to build the computer. So we have to take that next step to get to the smaller components, things that are not as advanced. We also have to make sure we understand how to mine and source many of those materials. Some of those materials are being mined and sourced here today, but they're being mined and sourced by global companies that may or may not have interests aligned to our national security or to our own national competitiveness. So obviously we want to enable competitiveness, we want to enable markets to compete, but we've got to recognize that it's a whole ecosystem that we have to develop. The last point I'll make is that education matters.

And to the point we discussed, you can get a sense of this from Ram: he has incredible competency and knowledge on how to build these models, and he's one of the world's foremost experts on it. But we need more people who actually understand how to use this software and this technology. It's really critical. And so continuing to support STEM development at the elementary, middle, and high school levels, continuing to support it at the universities, and encouraging programs that promote this education are really critical, because we need more scientists and more engineers across the government and across the private sector for those of us in the industry to be able to enable and to realize a lot of these insights.

Rick Arthur: I would also add that the photo op is always the big machine, and those are just glorified heaters without good software. And so we tend to emphasize the hardware investments and can often overlook and take for granted the corresponding software investments needed to make them useful. And this is actually an area where I would say the U.S. has a strong lead over China; China's ecosystem for software has been lagging severely. And so we need to maintain that lead and encourage education and further investment.

Bob Sorensen: Yeah, I'm just going to go quickly and say I'm hoping that they rename this one "CHIPS Act One, Scene One," because there is an awful lot left to do, and the thing that concerns me most about the CHIPS Act right now is that it favors domestic semiconductor producers that are already in production. And I think what a lot of people don't understand is that there's this concept now in building chips where you have what we call fabless companies, like Nvidia or AMD. They don't make their chips; they send them out to foundries. And the two biggest foundries in the world right now, the most successful, are TSMC in Taiwan and Samsung in South Korea.

This is a huge vulnerability on our part. Most of our dependence is really on Taiwan right now, and the CHIPS Act does not, in my view, address the issue of what we're going to do with these fabless companies, which really can't just go to Intel and say, "Would you build chips for us?" These companies can't go to existing chip makers in the United States because they are completely invested in TSMC and Samsung outside the country. So the CHIPS Act needs to address that very large and critical segment of the semiconductor sector: U.S. companies that design chips but then send them out overseas to be made.

In fact, it's interesting: TSMC doesn't only build chips for the United States. I talked about indigenous capability in China, and Chinese fabless enterprises do design their chips, but then they send them out to TSMC as well. So there are some complexities here from an international perspective that really are going to cause some concern when people start to unravel the realities of the CHIPS Act. It's not just about encouraging current semiconductor makers; it's about encouraging a much more vibrant ecosystem here in the U.S. that encourages foundries as well as chip-producing organizations.

Stephen Ezell: The key point there is, as we talked about earlier, so many chips today are developed for specific purposes, for autonomous vehicles, for AI runs. A lot of startup companies want to design innovative chips, but since we haven't had a shared commons or prototyping environment, they'll call TSMC and say, "Hey, I want to do a prototype of this." And they say, "Well, what's your run?" And they go, "We only have a few because we're testing and prototyping it." This is exactly the point of the National Semiconductor Technology Center (NSTC) that's part of the CHIPS Act: to provide a shared national test bed so that innovative companies can prototype these chips, test them out, and hopefully support the development of innovative new companies. Ram, I'm going to give you the last word here.

Ram Ramaswamy: I was just going to pick up on a thread of thought from Justin, in the category of workforce education and training, or maybe even academic education. I think one of the dimensions that these kinds of computers coming online are opening up is the dimension of data: data management, the volume of data, and the optimization, utilization, and visualization of it. It's something that we are just starting, I think, to get our hands around. I think more and more there's going to be a need to actually have the expertise to handle those kinds of data sets. And this is where the artificial intelligence and machine learning tools are increasingly becoming available to exploit it. But I think there's a need to actually have the appropriate experts coming online to be able to handle this. I think this is true no matter which area we are doing computing in, and this is going to be fundamentally important.

Stephen Ezell: Thanks. One final thought. As I wrote the report and compiled all these examples, it really struck me that in so many cases it was access to supercomputing facilities at a university that was a key driver of regional innovation. There are great stories in there about the Texas Advanced Computing Center at the University of Texas and how it's a key biotechnology hub today, or in San Diego with the earthquakes, and NCSA at the University of Illinois. So there's a recommendation in here that as Congress hopefully pursues the $10 billion for the regional technology and innovation hubs program that's contemplated in the CHIPS Act, we make certain that the bids selected connect somehow to regional supercomputing resources, because they provide a critical enabler and asset to support this broader innovation ecosystem that we're trying to advance at the regional level.

All right. With that, thank you for your indulgence for an extra ten minutes. I want to thank our panelists today for a great set of remarks. Thank you. Thank you for watching online and have some lunch. See you next time.

Speakers

Marsha Blackburn (@MarshaBlackburn)
U.S. Senator (R-TN)
Keynote Speaker

Rick Arthur
Senior Director, Advanced Computational Methods Research
GE Research
Panelist

Stephen Ezell (@sjezell)
Vice President, Global Innovation Policy, and Director, Center for Life Sciences Innovation
Information Technology and Innovation Foundation
Moderator

Justin Hotard (@justinhotard)
EVP & General Manager, High Performance Computing & Artificial Intelligence
Hewlett Packard Enterprise
Panelist

Ram Ramaswamy
Director
Geophysical Fluid Dynamics Laboratory (GFDL)
Panelist

Bob Sorensen
SVP for Research and Chief Analyst for Quantum Computing
Hyperion Research
Panelist