LANL Faces Of Innovation: Gary Grider, Supercomputing Scientist

Los Alamos National Laboratory High Performance Supercomputing Division leader Gary Grider. Photo Courtesy LANL

Intro: From supercomputers to artificial lungs, Los Alamos National Laboratory’s mission is to provide science and technology to meet national security challenges. The Faces of Innovation series focuses on seven scientists and engineers who are pioneering new technology and programs at Los Alamos. Their groundbreaking ideas, experiments, and data have big implications for national security. This article originally appeared in National Security Science Magazine.

LANL NEWS

In the world of supercomputers, “fastest” traditionally equates to “best.” But Los Alamos’ High Performance Supercomputing Division leader, Gary Grider, is shaking up tradition.

Rather than continuing to aspire to the fastest computers, Grider chooses to focus the division’s efforts on computing efficiency, a more relevant and timely consideration for U.S. national security applications.

For decades, the TOP500 list, a notable world ranking of supercomputers by speed, was the gold standard for determining who could boast the top computer. Los Alamos played prominently in the competition, earning first-place rankings several times over.

A computer’s speed is measured in floating point operations per second (flops), and a related figure, flops per watt, relates that speed to the electricity the machine uses. That criterion has influenced the supercomputing industry, but it has become less relevant for mission-centric computing: simulating nuclear weapon performance as part of the national program for monitoring the health and reliability of the U.S. nuclear stockpile. For those simulations, the bread and butter of the Laboratory’s national security mission, Grider explains, “the target of flops per watt has led to inefficient use of supercomputers: think 1 percent efficient for our needs.”

Supercomputing has reached a fork in the road, with the TOP500 chasers speeding in one direction and the Grider team focusing on extreme-scale computing environments that achieve higher efficiency. Grider’s team calls itself the Efficient Mission-Centric Computing Consortium (EMC3) and, in addition to the Laboratory, it includes Mellanox Technologies, DDN Storage, nCorium, and Marvell.

EMC3 recently brought Marvell’s new ThunderX2 ARM processors to Los Alamos. Rather than focusing on speed, the ThunderX2 answers the call for more-efficient extreme-scale weapons simulations.

The ThunderX2 offers high memory bandwidth and a tolerance for complex problem solving, both strategically targeted to Laboratory and EMC3 needs. In addition to being efficient, the ThunderX2 was rapidly deployable: weapons applications were moved to it quickly from previous processors. This was a result of careful planning and execution, both in the design of the processor and in the deployment strategy.

The ThunderX2 is the first in Grider’s planned family of more efficient processors. Marvell and the Lab are allying to create a variety of new architectural components (pieces of hardware and software) that will focus on higher-efficiency, more stockpile-valuable computing in the coming decade.