The next big thing, Nicolas Dubé, VP and chief technologist for HPE’s HPC business unit, told the virtual audience at SFE21, is something that will connect HPC and (broadly) all of IT into what Dubé calls the Internet of Workflows (IoW).
The IoW, said Dubé, is about “applying those principles to a much broader set of scientific fields because we’re convinced that is where this is going.”
Presented here are six takeaways from Dubé’s talk, which briefly touched on recent relevant advances as well as a list of requirements for developing the IoW.
First the Basics. The effort to achieve exascale and the needs of heterogeneous computing generally were catalysts in producing technologies needed for IoW. Dubé also noted the “countless silicon startups doing accelerators” to tackle diverse workloads. Still, much more work is needed. Here’s a snippet on MCM’s expected impact on memory.
White Hats & Data Sovereignty. A key issue, currently not fully addressed, is data sovereignty. Dubé agrees it’s a critical challenge now and will be even more so in an IoW world. He didn’t offer specific technology or practice guidelines.
New Runtimes for a Grand Vision. It’s one thing to dream of IoW; it’s another to build it. Effective parallel programming for diverse devices and the availability of reasonably performant runtime systems able to accommodate device diversity are both needed.
Chasing Performance Portability…Still. Tight vertical software integration as promoted by some (pick your favorite target vendor) isn’t a good idea, argued Dubé. This isn’t a new controversy and maybe it’s a hard-stop roadblock for IoW. We’ll see. Dubé argues for openness and says HPE (Cray) is trying to make the Cray Programming Environment a good choice.
“A Combinatorial Explosion of Configurations”. Now there’s an interesting turn of phrase. The avalanche of new chips from established vendors and newcomers is a blessing and a curse. Creating systems to accommodate the new wealth of choices is likewise exciting but daunting and expensive. Dubé argues we need to find ways to cut the costs of silicon innovation and subsequent systems to help bring the IoW into being.
Worldwide Data Hub? If one is going to set goals, they may as well be big ones. Creating an infrastructure with reasonable governance and practices to support an IoW is a big goal. Data is at the core of nearly everything, Dubé argued.
An interesting article on the delicate relationship between chip designs and foundries
The tight linkage between chip designs and chip manufacturing processes has caused its share of havoc in the IT sector, and it is getting worse now that Moore’s Law has slowed and Dennard scaling died a decade ago. Wringing more performance out of devices while trying to keep a lid on power draw is causing loads of trouble as chip makers try to advance the state of the art. When there are failures to meet chip process targets set by the foundries of the world, chips drive off the roadmap page and smash on the floor.
Fluid dynamics simulations are critical for applications ranging from wind turbine design to aircraft optimization. Running these simulations through direct numerical simulations, however, is computationally costly. Many researchers instead turn to large-eddy simulations (LES), which generalize the motions of a given fluid in order to reduce the computational costs – but these generalizations lead to tradeoffs in accuracy. Now, researchers are using supercomputers at the High-Performance Computing Center Stuttgart (HLRS) to help make those more accurate simulations accessible to more researchers.
The recent news that Intel will turn to TSMC to mass produce CPU products signals a new era in the processor IDM/foundry arena. The production is slated to start in the second half of 2021 and will cover some of Intel’s low- and mid-tier CPU products. Yole Développement’s reports “Computing for Datacenter Servers 2021” and “Processor Quarterly Market Monitor” cover the market space where these events are occurring. Meanwhile, speculation over Intel’s motivation is rampant, as are theories of what this means for the firm’s long-term strategy.
In quarterly earnings reports this year, the CEO and founder of NVIDIA (a Liqid partner) noted that the company’s recent advances in delivering a new compute platform designed with AI in mind, along with its acquisition of a leading networking company this year, are all aimed at the central goal of advancing what is increasingly known as data center-scale computing. For providers of high-performance computing solutions, both those built around NVIDIA’s tech and those competing with the GPU goliath, this need for data center-scale computing has been defined by and has escalated alongside the data performance requirements of artificial intelligence and machine learning (AI+ML), something I discuss further in a recent article.
Computer scientists developed a deep learning method to create realistic objects for virtual environments that can be used to train robots. The researchers used TACC’s Maverick2 supercomputer to train the generative adversarial network. The network is the first that can produce colored point clouds with fine details at multiple resolutions.