Intel’s Plan to Remain the Supercomputing King
As I wrote on Monday, this is a big week for supercomputing. The latest list of the world’s 500 most powerful supercomputers was released, and while the Top 10 didn’t change, some important barriers, like the 10 petaflop level, were broken.
And while it was Fujitsu, using SPARC chips, that made the top of the list, you couldn’t help noticing how many machines used chips from Intel. Of the 500 supercomputers on the list, 384 of them use chips from the semiconductor giant.
At the SC11 Supercomputing conference in Seattle today, Intel is making some important disclosures about what it is doing to maintain its role as the chip vendor of choice, and also offering its competitive response to a potential threat from the graphics chip specialist Nvidia.
As I’ve explained a few times before, the graphics chips, or GPUs, that Nvidia makes are starting to make some inroads into supercomputing and high-performance computing environments, thanks to their ability to handle floating point computations at a high rate of speed. Sometime next year, at the Oak Ridge National Laboratory in Tennessee, a machine called Titan, using a combination of chips from Advanced Micro Devices and Nvidia, is expected to break the 20 petaflop barrier when it begins operation.
The narrative that has emerged recently is that GPUs are generally better at the floating point operations that are increasingly used in supercomputing — better in many cases than traditional x86 chips from Intel and AMD. Even so, if you add up the number of systems on the Top 500 list using Intel and AMD chips, you’d hit a percentage that’s just shy of 90.
In a presentation today (on what just happens to be the 40th birthday of the Intel microprocessor, hence the two people I saw today outside the “Today” show at Rockefeller Center on my way to work), Rajeeb Hazra, Intel’s general manager of Technical Computing, detailed Intel’s response. First off, Intel is supporting a new technology, called PCI Express 3.0, that will speed up the ability of chips inside a supercomputer to share data. In systems this big, and working on such large amounts of data at once, the processors spend a lot of time tapping their feet and waiting for data to work on. Engineers call this latency, and the point of the new interconnect technology is to cut that waiting by doubling the bandwidth available. In lab tests, the result is a 2.1-fold improvement in raw FLOPS (floating point operations per second), and a 70 percent improvement in real-world workload tests. In supercomputing terms, that’s real progress, and it effectively means getting answers to big questions faster.
Another advance that Intel talked about today is a chip bearing the codename “Knights Corner.” It’s a coprocessor, meaning it’s an additional chip that would be added to a computer to boost its performance. Intel says it can do a full teraflop (a trillion floating point operations a second), and that’s just the result of demonstrations from the first silicon. When in full production, it will probably do even better.
And not only will it do a teraflop on a single chip, it will perform those calculations to what engineers call “double precision,” which means each number is stored in 64 bits rather than 32, so results carry roughly 15 or 16 significant decimal digits instead of about seven. As John Hengeveld, Intel’s director of technical computer marketing, told me last week, the rule of thumb in these matters says that moving from single to double precision makes you wait about four times as long for the answer.
Why is that important, when an off-the-shelf GPU from Nvidia can do 2 teraflops, though only at single precision? Programming. If you’re a scientist who 10 years ago wrote a program to simulate weather patterns or nuclear explosions or some other classic supercomputing problem to run on systems running Intel chips, there’s nothing new to learn in terms of programming. While the GPUs are great, there are new programming rules to learn.
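To make the single-versus-double precision distinction concrete, here is a minimal Python sketch. It is only an illustration, not anything Intel or Nvidia showed: the helper names `to_single` and `accumulate` are hypothetical, and single precision is approximated by rounding a running total to 32 bits after every addition. It says nothing about speed; it only shows how much accuracy is at stake.

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step: float, count: int, single: bool) -> float:
    """Add `step` to a running total `count` times.

    When `single` is True, the total is rounded to single precision after
    every addition, approximating a 32-bit accumulator.
    """
    total = 0.0
    for _ in range(count):
        total += step
        if single:
            total = to_single(total)
    return total


if __name__ == "__main__":
    count = 200_000
    exact = 20_000.0  # 0.1 added 200,000 times, in exact arithmetic
    for label, single in (("single", True), ("double", False)):
        result = accumulate(0.1, count, single)
        print(f"{label}: {result:.6f} (error {abs(result - exact):.6f})")
```

Run as a script, the single-precision total drifts visibly away from 20,000 while the double-precision total stays very close to it. That kind of drift is why long-running simulations tend to insist on double precision.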
Finally, Intel is reiterating its plan to keep working on the exascale problem, which is the next great summit in supercomputing. Right now the world’s top supercomputer maxes out at 10.51 petaflops, and a candidate to top the list next year will go north of 20 petaflops, or 20 quadrillion floating point operations per second. Sometime this decade (say, about 2018 or so) the hope is that supercomputers will break the exaflop barrier, where machines will perform a quintillion or more floating point operations per second.
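For a sense of how far away an exaflop is, the arithmetic can be sketched in a few lines of Python. The 10.51 and 20 petaflop figures and the 2018 target come from the paragraph above; the function names and the seven-year span (2011 to 2018) are assumptions made for illustration.

```python
PETA = 10**15  # one petaflop: a quadrillion floating point operations per second
EXA = 10**18   # one exaflop: a quintillion floating point operations per second


def scale_factor(current_flops: float, target_flops: float = EXA) -> float:
    """How many times faster the target is than the current machine."""
    return target_flops / current_flops


def annual_growth(factor: float, years: float) -> float:
    """Constant yearly multiplier needed to achieve `factor` over `years`."""
    return factor ** (1.0 / years)


if __name__ == "__main__":
    top_2011 = 10.51 * PETA   # today's top machine, per the article
    next_year = 20 * PETA     # next year's candidate, per the article
    years = 2018 - 2011       # assumed span to the exascale target

    factor = scale_factor(top_2011)
    print(f"Exaflop vs. 10.51 petaflops: {factor:.1f}x")
    print(f"Exaflop vs. 20 petaflops:    {scale_factor(next_year):.1f}x")
    print(f"Yearly growth needed over {years} years: {annual_growth(factor, years):.2f}x")
```

Under those assumptions, an exaflop machine is roughly 95 times faster than today's leader, which works out to performance nearly doubling every year through 2018.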
The fundamental problem there isn’t the computing so much as it is power, as in electrical power. Already some of these machines consume as much power as a small city. Getting to exascale will require chips and other components that can run full out at speeds we can as yet only imagine, while consuming far less power than they would otherwise be expected to. Think in terms of a Prius that could win the Indy 500 (and not just by a hair, but by a long mile) and do it day after day without really using much more gas than the other cars. It’s kind of like that.
Anyhow, Intel has said that it plans to enable exascale supercomputing that will require only a doubling of the power needed, rather than, say, 10 times as much. To that end, it said today it will open its fourth research lab in Europe. This one is in Barcelona and joins one in Paris; another in Juelich, Germany; and a third in Leuven, Belgium. They’ll all have a lot of work to do between now and 2018.
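What a doubling of power would demand from the hardware can be sketched as a ratio. This is a back-of-the-envelope illustration only: it assumes the baseline is today's 10.51-petaflop leader and that "doubling" is measured against that machine's power draw, neither of which Intel spelled out. The function name is hypothetical.

```python
EXA = 10**18                   # one exaflop, in operations per second
BASELINE_FLOPS = 10.51e15      # today's top machine, per the article (assumed baseline)


def required_efficiency_gain(performance_factor: float, power_factor: float) -> float:
    """Improvement in operations per watt needed to get `performance_factor`
    times the speed on `power_factor` times the power."""
    return performance_factor / power_factor


if __name__ == "__main__":
    performance_factor = EXA / BASELINE_FLOPS
    for power_factor in (2, 10):
        gain = required_efficiency_gain(performance_factor, power_factor)
        print(f"{power_factor}x the power -> {gain:.1f}x more work per watt")
```

On those assumptions, holding power to a doubling means each watt has to do close to 48 times as much work as it does today, versus under 10 times if the power budget were allowed to grow tenfold.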