The root problem is that as integrated circuits grow more complex and pack in more components, only some of that added complexity can be absorbed by shrinking the components. The rest has to come by other means.
Simply put, it is very, very hard to cram more and more digital functionality (compute, RAM, NAND flash, etc.) into a given volume of space. This might sound somewhat counterintuitive, considering how small modern computer chips are, but just think about it for a moment. Your desktop PC case probably has a volume of 50 litres or more, yet the CPU, GPU, RAM, and handful of other chips that actually constitute the computer probably account for less than 1 percent of that volume.

It's not that the designers are happy with that inefficiency; it's that they don't have a way around it. It simply isn't possible to extract the heat the circuits generate with current technologies.
In recent years we've seen the rise of one method of increasing density: stacking one die or chip on top of another. Even there, though, chip companies seem to be struggling to go beyond a two-high stack, with logic (CPU, GPU) at the bottom and memory on top, or perhaps four- or eight-high stacks in the case of high-bandwidth memory (HBM). Despite fairly regular announcements from various research groups and semiconductor companies, and the maturation of 3D stacking (TSVs) and packaging technologies (PoP), multi-story logic chips are rare beasts indeed.

Anyone who builds computers regularly (any of you hang out here?) knows that processor power dissipation reached its current level, around 130-150 watts, in the middle of the last decade. In fact, processor performance seems to have hit its zenith around that time. There are a number of reasons for this, but for this discussion it boils down to two:
First, as chips get smaller, there is less surface area making contact with the heatsink, water block, or whatever, which puts some fairly stringent limits on the absolute amount of thermal energy the chip can shed. Second, as chips get smaller, hot spots—clusters of transistors that see more action than other parts of the chip—become denser and hotter. And because these hot spots are also getting physically smaller as transistors shrink, they fall afoul of the first issue as well. The smaller the hot spot, the harder it is to ferry the heat away.

Now consider stacking these chips two, three, or more high, and the problem becomes intractable. Not only do you need to extract the heat, you need to connect all the signals. Ars has a map of an LGA 1155 Ivy Bridge part. That 1155 isn't just a number intended to sound cool: it's the number of pins on the part. How do you stack parts with 1155 electrical interconnects each, especially if the pins don't line up? There are hundreds of power pins on that part alone! That's why the biggest stacks you typically see are two chips: a processor and a memory (RAM). RAM is pretty much the opposite of a CPU, with most pins dedicated to data transfer rather than power delivery.
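The hot-spot problem is really just arithmetic: hold the power roughly constant while the hot spot's area shrinks, and the heat flux density explodes. Here's a minimal sketch; the 20 W figure and the side lengths are illustrative assumptions, not measurements of any real chip.

```python
# Illustrative only: how heat flux density grows as a hot spot shrinks
# at constant power. The numbers are assumed, not taken from a real part.

def heat_flux(power_w: float, side_mm: float) -> float:
    """Heat flux in W/cm^2 for a square hot spot of the given side length."""
    area_cm2 = (side_mm / 10.0) ** 2  # convert mm to cm, then square
    return power_w / area_cm2

# A hypothetical 20 W hot spot shrinking with each process generation:
for side in (4.0, 2.0, 1.0):  # side length in mm
    print(f"{side} mm -> {heat_flux(20.0, side):.0f} W/cm^2")
```

Halving the side length quadruples the flux, which is why ever-smaller hot spots outrun conventional heatsinks even when total chip power stays flat.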
An obvious answer is liquid cooling, but exactly how to do that is not always trivial. Back in the 1980s, the pioneering supercomputer company Cray immersed the entire circuitry of its machines in an electrically inert 3M liquid called Fluorinert, circulated through a pumping system that refrigerated the fluid.
A few years ago, IBM's research into liquid cooling resulted in the creation of two "hot water cooled" supercomputers: one at ETH in Zurich and SuperMUC at the Leibniz Supercomputing Centre in Germany. More recently, we covered the Solar Sunflower, where instead of using conventional copper water blocks to ferry heat away, water flows through micron-scale microfluidic channels carved out of a silicon wafer. The Solar Sunflower story has more details if you want them, but the general gist is this: microfluidic liquid cooling could be really, really useful, both in the absolute amount of thermal power that can be dissipated and in relieving those hard-to-reach hot spots.

This rather famous graph (from the Ars article) shows computing efficiency (in operations per joule of energy) on the vertical axis and computing density (a measure of how much space a system uses to do its computations) on the horizontal axis. The computers with the highest efficiency and highest computing density are all at the right end of the diagonal - and they're all natural systems. The graph is a bit old now, but the latest supercomputers haven't moved the bar much higher.
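To get a feel for why water is such an attractive coolant, the basic heat-capacity relation Q = ṁ·c·ΔT tells you how much thermal power a given flow can carry off. A quick sketch, with an assumed flow rate and temperature rise (round numbers for illustration, not IBM's specs):

```python
# Back-of-the-envelope: heat a water flow can carry away, Q = m_dot * c * dT.
# Flow rate and temperature rise are assumed illustrative values.

C_WATER = 4186.0  # specific heat of water, J/(kg*K)

def heat_removed_w(flow_l_per_min: float, delta_t_k: float) -> float:
    """Thermal power (W) absorbed by a water flow for a given temperature rise."""
    mass_flow_kg_s = flow_l_per_min / 60.0  # ~1 kg per litre of water
    return mass_flow_kg_s * C_WATER * delta_t_k

# Even a modest 0.5 L/min flow warming by 30 K carries away roughly a kilowatt:
print(f"{heat_removed_w(0.5, 30.0):.0f} W")
```

That generous margin is what makes "hot water" cooling plausible: even inlet water at 60 °C can still absorb serious power, and the warm outlet water can be reused for building heat.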
Quoting Ars again:
By this point you can probably tell where things are going. IBM Research in Zurich is working on a technology that solves both power delivery and cooling in vertically stacked electronics, with the eventual goal of enabling the creation of skyscraper CPUs, GPUs, or whatever other IC you might fancy.

Power delivery means they're using an electrically conductive fluid to both bring power and remove heat. Making the fluid electrically conductive isn't really difficult; keeping it corrosion-resistant will be the big consideration. You don't need much corrosion to take out a part with microscopic features, after all.
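How much power can a flowing electrolyte actually deliver? Faraday's law gives an upper bound: the current is limited by how many charge-carrying ions flow past the electrodes per second, I = n·F·c·Q. A hedged sketch with assumed round numbers (the concentration, flow rate, and cell voltage below are illustrative, not IBM's figures):

```python
# Upper bound on electrical power from a flowing redox electrolyte,
# from Faraday's law: I = n * F * c * Q. All inputs are assumed values.

F = 96485.0  # Faraday constant, C/mol

def max_power_w(conc_mol_l: float, flow_ml_min: float,
                electrons_per_ion: int, cell_voltage_v: float) -> float:
    """Ideal-case electrical power (W) delivered by the electrolyte flow."""
    flow_l_s = flow_ml_min / 1000.0 / 60.0          # mL/min -> L/s
    current_a = electrons_per_ion * F * conc_mol_l * flow_l_s
    return current_a * cell_voltage_v

# 1 mol/L electrolyte at 10 mL/min, one electron per ion, ~1 V cell:
print(f"{max_power_w(1.0, 10.0, 1, 1.0):.2f} W")
```

Real cells extract only a fraction of this ideal figure, but the scaling is the interesting part: power delivery grows with the same flow rate that does the cooling, which is exactly the coupling IBM is after.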
Somewhat unfortunately, IBM calls this research program "towards five-dimensional scaling." Not three dimensions, like you might expect of a project tasked with stacking 2D chips on top of each other to form 3D piles, or perhaps four dimensions if you had done a module in marketing at university and were feeling exceedingly generous, but five.
Fortunately, however, IBM's five-dimensional scaling tech doesn't require an understanding of string theory. Rather, the fourth and fifth dimensions are rather mundane: number four is power delivery, and cooling is number five.
Basically, IBM needs to start with its microfluidic cooling tech and then modify the cooling medium so that it also carries soluble redox couples (i.e., a compound that can be oxidised to produce some electricity, and then reduced again to recharge). Then, instead of just providing microfluidic channels on the chip for cooling, there also need to be a few extra bits to complete the transformation into a redox flow battery.

OK, So I Get The 5D Part, But Why Call It Blood?
Well, the original inspiration for the research was biological efficiency. That is, despite how small our transistors are getting and how fast our interconnects are becoming, animal brains are several orders of magnitude ahead of our machines in both computing efficiency and density. Some of the world's largest supercomputers have a total processing power that approximates that of a small mammal, but they require about 10,000,000 watts to get there. The human brain, by comparison, uses maybe 20 or 30 watts at full bore. A supercomputer is also just slightly larger than a mammalian brain. [Shown in the graph above - SiG]

It's not recognizable blood in the sense of animal blood you've seen before; they're not using hemoglobin and it isn't red. It's a green or bright blue fluid (depending on which version they're working with), so it's blood only in some metaphorical sense.
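Those power figures make the gap easy to quantify. Taking the article's numbers at face value (a ~10 MW supercomputer versus a ~20 W brain doing roughly comparable computation), the ratio is pure arithmetic:

```python
# Rough ratio from the figures quoted above. These are the article's
# ballpark numbers, not precise measurements of either system.

supercomputer_w = 10_000_000  # ~10 MW for a large supercomputer
brain_w = 20                  # ~20 W for a human brain at full bore

ratio = supercomputer_w / brain_w
print(f"The brain is ~{ratio:,.0f}x more power-efficient")
```

Call it five to six orders of magnitude, before even accounting for the difference in physical volume.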
Those gaping gulfs in efficiency and density may begin to be bridged by neuromorphic (brain-like) chips, and other bleeding-edge advances in CMOS logic, but they'll only get us so far. To get towards biological levels of efficiency (and perhaps intelligence as well), we need some way of cramming millions of computer chips into a space the size of a shoe box—or, er, a human skull. Animals use blood for both energy delivery and cooling of the most efficient computers in the world, so why shouldn't IBM?