4

Moore's law has successfully predicted up to now that integrated-circuit transistor density doubles every two years. However, computer performance depends on additional factors such as architecture, chip design, and software. What physics-related factors will enhance or limit the ultimate performance of future computers?

David Z
  • 76,371
  • Moore's law was an observation, not a prediction; and the doubling period has varied in time from circa 18--30 months: it's a heuristic. – dmckee --- ex-moderator kitten Jul 18 '11 at 16:41
  • Fair enough. I understand that after Moore observed the trend, he predicted it would continue for 10 years, which it did. Also the time frame has indeed been revised. – Michael Luciuk Jul 18 '11 at 17:09
  • I'd call it Moore's conjecture. – Georg Jul 19 '11 at 09:27
  • I would call it a 'Moore trust agreement', which allows the industry to earn money while not spending too many resources on R&D. 250nm tech is so simple & easy that we might have been there in the early 80s. – BarsMonster Jul 19 '11 at 14:04

3 Answers

13

Starting from the 90nm process node, we've started to see sad signs of stagnation:

1) Most of the delay in logic circuits is in the interconnect, not in the transistors.

2) Most of the energy dissipated is due to quantum tunneling (leakage), not transistor switching. By far.

3) As a consequence of #2, transistor gate scaling has slowed down significantly, as has the gate dielectric thickness (it's already at about 1.5nm, so there is not much left to reduce). We are already near quantum limits in transistor sizing.

Even if we could make 11nm transistors today, it would not make things 3 times faster and 9 times smaller than what we have now.

There are a few ways to significantly improve CPU technology:

1) Superconducting interconnect: this would improve CPU performance by a large margin and would allow much larger CPUs. The problem is that physicists haven't yet discovered a suitable superconductor that could be patterned into 50-500nm lines and does not require something like -100 °C. Whoever finds a way to make such interconnect work at room temperature will become extremely rich.

2) 3D transistor stacking: i.e. instead of a 2D array of 1000x1000 transistors we would have a 100x100x100 3D array, and the interconnect length would be 10 times shorter (see the sketch after this list). This is being actively researched; the problem is that some stages of manufacturing a layer of modern high-performance transistors require temperatures of 1000 °C and higher, which would destroy the transistors on the lower levels.

3) Optical interconnect: making tiny LEDs and photodiodes, with transparent channels out of SiO2, would also allow for faster interconnect. This is also being actively worked on.
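(A quick back-of-the-envelope Python sketch of the scaling claim in point 2; the array sizes are the illustrative numbers from the answer, not real chip dimensions.)

    # Side length of an N-transistor array in 2D vs 3D.
    # In 2D the side grows as N**(1/2); in 3D as N**(1/3), so the worst-case
    # interconnect length (proportional to the side) shrinks accordingly.
    N = 1_000_000  # one million transistors, as in the example above

    side_2d = N ** (1 / 2)  # 1000 x 1000 array
    side_3d = N ** (1 / 3)  # 100 x 100 x 100 array

    print(side_2d, side_3d, side_2d / side_3d)  # 1000.0, ~100.0, ~10x shorter wires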


All these fancy things like graphene transistors, quantum dots, and fancy HEMT structures are indeed a bit faster (and a lot more expensive or complex to manufacture), but they do not solve the interconnect problem, which is the major one. Individual transistors on a chip can switch much faster than 4GHz, but we aren't really limited by transistor switching speed at the moment.
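To make the interconnect point concrete, here is a rough Python estimate comparing the distributed RC delay of an on-chip wire (Elmore approximation, delay ~ 0.5 * r * c * L^2) with a single gate delay. The per-millimetre resistance and capacitance and the gate delay are assumed ballpark values for a roughly 90nm-class process, not figures taken from this answer:

    # Rough estimate: distributed RC wire delay vs. a single gate delay.
    # All three parameters below are assumed order-of-magnitude values.
    r = 500.0           # ohms per mm of minimum-width wire (assumption)
    c = 0.2e-12         # farads per mm of wire capacitance (assumption)
    gate_delay = 5e-12  # ~5 ps per logic gate (assumption)

    for length_mm in (0.1, 1.0, 10.0):
        wire_delay = 0.5 * r * c * length_mm ** 2  # Elmore delay of a distributed RC line
        print(f"{length_mm:5.1f} mm wire: {wire_delay * 1e12:8.1f} ps "
              f"(~{wire_delay / gate_delay:.0f} gate delays)")

With these assumptions a 0.1 mm wire costs a fraction of a gate delay, while a 10 mm cross-chip wire costs on the order of a thousand gate delays, which is why the wires, not the transistors, set the limit.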

BarsMonster
  • 2,371
  • Lately it's largely been power and heat removal that has become the limitation. That's largely why CPU clock rates have stagnated: we couldn't cool them if we pushed them faster. 3D doesn't really help here; heat removal shows an area-versus-volume scaling, as does the transfer of data on/off chip. From a programmer/algorithm-designer point of view, the capability to manipulate bits in data residing within the chip is growing much faster than the capability to move bits on/off chip. So most applications become data-transfer-rate bound. – Omega Centauri Jul 19 '11 at 15:44
  • @Omega heat dissipation is not 'fixed' - it's always a compromise. You can run nearly any CPU at 10% of its heat dissipation for something like 40-50% of its speed. So, as delays in 3D stacking are much lower, one can lower the voltage and get the same performance at much lower power consumption. Agree on IO, but it's a marketing problem, not an engineering one - we could easily have 512-bit memory IO at the moment (like we already have in GPUs) - there are no technical obstacles here, only the desire to keep costs low. – BarsMonster Jul 19 '11 at 16:29
  • Yes, you can trade speed off against power. The I/O problem is real, however: there are only so many bits that can be communicated per unit time. Large transfer sizes may improve the efficiency, but the communications bottleneck is fundamental. Memory access even in general-purpose CPUs is by cache lines already, so there are no magic communications silver bullets waiting in the wings. – Omega Centauri Jul 19 '11 at 19:51
6

The first thing that comes to mind is that the speed of light limits the rate at which different components of a computer, or even a single chip, can communicate with each other. For example, if you have, say, 10 cm of wire running between your motherboard and hard drive, it will necessarily take a minimum of about a third of a nanosecond to fetch data from the drive. Right now HDD access technology is nowhere near that fast, at least not for personal computers, but it's conceivable that at some point in the future, we could have some sort of super-fast drive access for which this limit becomes an issue.

You could make a similar argument for the CPU itself: if CPU features are sized on the order of tens of nanometers, it takes a signal a minimum of about $10^{-17}\text{ s}$ to travel across them. So we will not be able to make a computer that runs faster than $10^{17}\text{ Hz}$ without exploiting either subatomic technology or advanced parallel processing ;-) Again, clearly we are nowhere close to this being a problem in PCs.

Large supercomputers do have to deal with these issues, though, since different parts of the system can be separated by several meters, which corresponds to tens of nanoseconds of signal travel time. This means that the supercomputer as a whole would be limited to operating frequencies on the order of hundreds of MHz. In practice, modern supercomputers operate as a large cluster of individual nodes each acting quasi-independently, so the light speed limits aren't an issue.
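A short Python sketch of the light-speed arithmetic above, using the same illustrative distances as in this answer:

    # Minimum one-way signal travel time over a straight-line distance at light speed.
    c = 3.0e8  # speed of light, m/s

    def light_delay(distance_m):
        return distance_m / c

    print(light_delay(0.10))   # 10 cm of wire: ~3.3e-10 s (about a third of a nanosecond)
    print(light_delay(10e-9))  # ~10 nm CPU feature: ~3e-17 s, i.e. a ~1e17 Hz ceiling
    print(light_delay(10.0))   # 10 m across a machine room: ~3.3e-8 s (tens of nanoseconds)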

David Z
  • 76,371
  • 1
    Good answer. To add to your point about light-speed delays, rapidly switching circuits can only run as fast as their slowest component, even within a single bus of parallel wires. This means that bus traces have to be the same length or else terminate in buffers, which leads to the curious meandering circuit traces now common on PCBs, where a wire is significantly longer than it strictly topologically needs to be to connect two components. Related to BarsMonster's second point, multilayer PCBs, allowing traces to pass over and under each other, can ameliorate this design constraint somewhat. – Richard Terrett Jul 19 '11 at 08:38
  • @Richard Fortunately, this is no longer a requirement in many cases - modern chips have a tunable delay for each pin, so you can compensate for some 10cm of trace-length difference. And on the chip die, speed is limited not by the speed of light but rather by the RC constant - how long it takes to charge the target capacitance through the resistance (= trace resistance). – BarsMonster Jul 19 '11 at 14:02
  • 1
    @BarsMonster - Both points are very interesting, thanks for bringing them to my attention. Looks like I have some reading to do. Great site, btw! – Richard Terrett Jul 19 '11 at 15:16
0

The other answers have addressed the physical part in detail. When the physical limits become an obstacle, we will see a paradigm change. I will continue with other factors that may be important in the future. I suspect that we are still limited to a 2D (planar) chip design, and even 3D is a repackaging of a 2D design.

Most of the work can be done in parallel: physical simulations, database operations, pattern recognition, data-file processing, etc.

AMD-ATI and Nvidia have done a great job with their high-end graphics cards. By now they have thousands of parallel processors on a chip. The software has also progressed, with OpenCL and PyOpenCL, Nvidia CUDA and ATI Brook+, NoSQL databases such as CouchDB, and map-reduce algorithms. Clusters of graphics cards behave as a supercomputer.
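As a concrete illustration of this data-parallel style, here is a minimal PyOpenCL sketch (vector addition is a hypothetical stand-in workload, not something from the answer) in which each of a million GPU work-items handles one element independently:

    import numpy as np
    import pyopencl as cl

    a = np.random.rand(1_000_000).astype(np.float32)
    b = np.random.rand(1_000_000).astype(np.float32)

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags

    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

    prg = cl.Program(ctx, """
    __kernel void add(__global const float *a,
                      __global const float *b,
                      __global float *out)
    {
        int i = get_global_id(0);  /* each work-item handles one element */
        out[i] = a[i] + b[i];
    }
    """).build()

    prg.add(queue, a.shape, None, a_buf, b_buf, out_buf)  # one work-item per element

    out = np.empty_like(a)
    cl.enqueue_copy(queue, out, out_buf)
    print(np.allclose(out, a + b))  # True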

I think that in the future we will see 3D neural networks, implemented in hardware with on-the-fly programmable coefficients, that will try to mimic the brain (the most efficient parallel computer per watt, and the only one that is really intelligent). Heat removal is a major issue in any case.

New technology will make its entrance; maybe this one:
Superconducting Niobium Chip Smashes Silicon Power Consumption Standards - a superconducting logic chip with a clock speed of 6 GHz beats silicon energy efficiency by two orders of magnitude. (arXiv: Ultra-Low-Power Superconductor Logic)

(Note: I've been playing around since the 8080 chip, and I've seen a huge evolution; I'll see more, if I manage to stay alive. ;)

Helder Velez
  • 2,643