Next Generation Networking Systems Laboratory

Energy-Efficient, High-Performance Computing of the Future

The phenomenal advances in computing technology over the past two decades were enabled by Dennard scaling [1], whereby exponential improvements in the power efficiency, performance, and cost-effectiveness of silicon technology tracked the Moore’s Law increase in the number of devices integrated on each chip. As we approach atomic-scale lithography, the end of Dennard scaling puts future growth of the computing industry in jeopardy. Multicore processors have provided a temporary respite from the stagnation of CPU clock frequencies, but they create daunting challenges for programmability and drive today’s system architectures towards extremely unbalanced communication-to-computation ratios. Last year, an Intel research team announced an 80-core Polaris chip capable of 1 TeraFlops. It is expected that computer chips in 2020 may contain 1000 cores built from ultra-high-density nanoscale devices and exceed 10 TeraFlops in performance. This trend poses new challenges in communication and power requirements.

Bell, Szalay, and Gray [2] convincingly argue the importance of balanced computational systems. Amdahl’s law suggests that a system with balanced computation and communication performs best under most circumstances. A 10 TeraFlop chip would require an interconnect bandwidth of 100 Tb/s for a balanced architecture. This is twenty times larger than the average 5 Tb/s of Internet traffic in the U.S. today! A further implication of Amdahl’s law is that the 10 TeraFlop system will need 10 TeraBytes of RAM with 100 Tb/s of I/O bandwidth. In practice, the number of pins on DRAM chips has remained relatively unchanged due to packaging constraints, and the signaling rate is ultimately limited by power constraints. These growing imbalances and constraints have driven computer architectures over the past several decades to evolve through performance-compromising solutions, for instance by emphasizing locality and hierarchy of data access over limited bandwidths.
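The balance arithmetic in the paragraph above can be checked with a short Python sketch. The two ratios are assumptions read off the figures quoted in the text (10 bits/s of interconnect bandwidth and 1 byte of RAM per Flop/s), and the function and parameter names are hypothetical, chosen only for illustration.

```python
def balanced_requirements(flops, bits_per_flop=10.0, bytes_per_flop=1.0):
    """Return the interconnect bandwidth (bit/s) and RAM size (bytes)
    of a balanced system delivering `flops` floating-point ops per second.

    The default ratios are assumptions inferred from the numbers in the
    text (10 TeraFlops -> 100 Tb/s and 10 TeraBytes), not measured values.
    """
    return {
        "interconnect_bps": flops * bits_per_flop,
        "memory_bytes": flops * bytes_per_flop,
    }


if __name__ == "__main__":
    chip_flops = 10e12            # 10 TeraFlops chip from the text
    us_internet_bps = 5e12        # average U.S. Internet traffic quoted in the text

    req = balanced_requirements(chip_flops)
    ratio = req["interconnect_bps"] / us_internet_bps

    print(f"Interconnect bandwidth: {req['interconnect_bps'] / 1e12:.0f} Tb/s")
    print(f"RAM: {req['memory_bytes'] / 1e12:.0f} TB")
    print(f"Multiple of U.S. Internet traffic: {ratio:.0f}x")
```

Changing `bits_per_flop` shows how strongly the bandwidth requirement depends on the balance ratio one assumes.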

Photonic interconnects offer a disruptive technology solution that fundamentally changes the design considerations for computing architectures. Optics provide ultra-high throughput, minimal access latency, and low power dissipation that remains independent of capacity and distance. For distances exceeding a few millimeters, the energy efficiency of electrical signaling is typically ~10 pJ/bit (equivalent to ~10 mW per Gb/s), and future generations may reduce this to ~2 pJ/bit [3]. At 10 pJ/bit, 1000 watts is required just to support an interconnect bandwidth of 100 Tb/s. Optical communication links based on nanoscale silicon photonics have been shown to operate at 100 fJ/bit, and future modifications are expected to bring this down to 10 fJ/bit [4, 5]. In addition to energy efficiency, the optical technology platform directly addresses many of the fundamental physical problems of interconnects, including precise clock distribution [6], bit rate transparency, and power reduction, without concerns [7] for impedance, crosstalk, voltage isolation, pin inductance, signal distortion, and repeater-induced latency. Increasing the DRAM bandwidth by a factor of ~100 on each pin could revolutionize future computing systems. Multiwavelength photonic interconnects in the DRAM I/O can be envisioned to achieve this performance in a scalable fashion.
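The power figures above follow from multiplying energy per bit by bit rate. A minimal Python sketch of that calculation, using only the energy-per-bit values quoted in the text; the function name and the dictionary labels are hypothetical:

```python
def interconnect_power_watts(energy_per_bit_joules, bandwidth_bps):
    """Power (W) = energy per bit (J/bit) x bit rate (bit/s)."""
    return energy_per_bit_joules * bandwidth_bps


if __name__ == "__main__":
    bandwidth_bps = 100e12  # 100 Tb/s balanced interconnect from the text

    # Energy-per-bit figures quoted in the text (1 pJ = 1e-12 J, 1 fJ = 1e-15 J).
    technologies = {
        "electrical, typical (10 pJ/bit)": 10e-12,
        "electrical, future (2 pJ/bit)": 2e-12,
        "silicon photonics, shown (100 fJ/bit)": 100e-15,
        "silicon photonics, future (10 fJ/bit)": 10e-15,
    }

    for label, energy in technologies.items():
        power = interconnect_power_watts(energy, bandwidth_bps)
        print(f"{label}: {power:g} W")
```

Under these figures, the same 100 Tb/s link budget falls from about 1000 W with typical electrical signaling to about 10 W at 100 fJ/bit.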

The opportunity for optical interconnects [8, 9] in board-to-board and rack-to-rack communications is already well documented. Exciting opportunities exist in wavelength routing, which reconfigures the high-capacity connectivity of multiple wavelengths to reduce contention and increase system-wide throughput [10, 11]. Recent advances in silicon nanophotonic technologies compatible with nanoelectronics offer new possibilities for realizing future computing systems with a fundamentally new architecture. CMOS-compatible silicon photonics offer a practical platform for routing and transporting multi-wavelength optical signals that interconnect electronic processors and memories, using the same CMOS fabrication processes that are used in foundries today.

Potentially, the high yield, high uniformity, and high-volume productivity of CMOS processes can be applied to silicon photonics to realize low-cost, high-quality optical interconnection of computing processors and memories. Figure 1(b) shows an example of such a system in three layers: optical interconnection, memory, and processor planes, all fabricated by CMOS-compatible processes. Such a system could grow to ~1000 cores optically interconnected by 100 to 1000 wavelengths, at which point a flattened computing architecture becomes possible. By contrast, Figure 1(a) shows an example of a state-of-the-art supercomputing center networked as a multi-tier, hierarchically interconnected system in which parallel programming must consider the locality of processors and memories.

Practical silicon photonics should also operate athermally to tolerate core-to-core temperature variations of more than 15 °C. As Figure 1(c) shows, the thermally compensated slotted waveguide design [13] achieves temperature-independent operation across a 100 °C temperature range. Figure 2 shows the optical interconnection crossbar (a) architecture and (b) fabricated chip, in which silicon photonic micro-resonators provide wavelength-dependent routing and interconnection.

Our efforts will cover the following five tasks:

  • Task 1: Computing System Architecture and Application Studies
  • Task 2: Nanotechnologies
  • Task 3: System-on-Chip Integration
  • Task 4: Testbed and Application Studies, and Project Evaluation
  • Task 5: Outreach, Transformative Research and Education


Workshop at HP Research Lab: October 8, 2007

Mini Workshop Agenda


Workshop on Computing of the Future at Crown Plaza Hotel @SFO: February 29, 2008

Workshop Agenda

Weekly Seminars at UC Davis, 2008

Weekly Agenda

UC Davis | ECE Department | CS Department | CITRIS