Figure 1. A conceptual illustration of a possible future computing system combining detail-oriented and artificial-intelligence based computing combining von Neumann and non-von-Neumann architectures. In both cases, 3D integrated nano technologies will be essential.

Large-scale computing platforms, also called warehouse-scale computing, have significantly transformed our lives over the past decades. Emerging applications are even more data intensive. However, scaling to an exascale system for both floating-point (e.g. for climate modeling) and fixed-point (e.g. deep neural network for pattern recognition) are facing severe challenges since the Dennard’s power-scaling law failed to keep up with the Moore’s law in 2006.

Recently, bio-inspired neuromorphic hardware systems such as IBM’s TrueNorth based on low-leakage 28 nm CMOS technology achieved 176,000 times higher energy efficiency than a state-of-the-art computing system based on a von-Neumann architecture . However, this and most of the other approaches accompanied the following shortcomings:

  1. The hardwares could only do what it was trained to do, and the training processes were time/energy consuming.
  2. Electronic solutions include long electrical wires with large capacitance and high interconnect energy consumption; the TrueNorth chip running at slow speeds consumed 2.3 pJ/bit with an additional 3 pJ/bit for every cm transmission.
  3. Electronic interconnect topologies are typically in four directions and required a number of repeaters.
  4. Limited scalability owing to the 2D interconnection topology with a single hierarchy.
Figure 2. Proposed 3D Neuromorphic Nanocomputing architecture . (A) A canonical cortical microcircuit. (B) The cortex’s two-dimensional sheet. (C) The long range connections between cortical regions. (D) Structure of a neurosynaptic core with axons as inputs, neurons as outputs. Multicore networks at (E) chipscale and (F) multichip scale are both created by connecting a neuron on any core to an axon on any core.(G) Functional view of core as a crossbar where horizontal lines are axons, crosspoints are individually programmable synapses. (H) Functional chip architecture is a two dimensional array of cores where long-range connections are implemented by sending spike events (packets) over a mesh routing network to activate a target axon. (I) Hiearachical interconnection network. (J) Optically interconnected neurons. (K) 3D integrated neuromorphic system. (L) multi-3D chip neural networks.

In a recent article , D.A.B. Miller reviews attojoule photonics and practical ~ 10 fJ/b interconnect solutions exploiting quantum impedance conversion where signal transmission need not be subject to charging of capacitance but rather exploit close integration with electronics with ~1 fF capacitance. This 10 fJ/b energy efficiency assumes ~19 dB (80x) link loss budget and ~30% wall plug efficiency of the light source. Hence, if such a nanophotonic repeater is integrated with electronics to form a neuron with a fanout of 10-100 on low-loss waveguides, it is conceivable to realize a nanophotonic neuron capable of communicating with other neurons at ~10 fJ/b nearly independent of the communication distance. Further, through 2D and 3D photonic integrated circuits with reconfigurable nanoMEMS (NEMS) photonic interconnection fabric, we can form hierarchical synaptic interconnection with changing weight values remembered by latching NEMS component while consuming little static energy. A combination of nanophotonic-nanoelectronic-NEMS technologies can possibly lead to a bio-inspired neuromorphic computing system as illustrated in Figure 2.

A. Nanophotonic Neurons

Figure 3. (a) A simple example of a nonlinear model of a neuron which includes synapses, weighted addition, and nonlinear activation function. (b) A nonlinear activation function (e.g. sigmoid function) for five different slope parameter. (c) A proposed nanophotonic neuron with the spiking electronic circuit, nanophotonic detectors (PD1 for excitatory and PD2 for inhibitory), a nanophotonic LED, and other elements such as resistors and capacitance can be tuned to achieve desired sigmoid response function including temporal and rate coding, (d) Illustration of the response that achieves the desired integration and all-or-nothing response encoded in spikes.

As illustrated in Figure 3, a nanophotonic neuron including the spiking nanoelectronic circuit can acquire the necessary functionality (spiking, integration, thresholding, reset) when synaptic connections are made with other neurons. A nanophotonic light source (e.g. a nanolaser or a nano-LED) and nanophotonic detectors will be integrated very closely with nano-transistors to achieve quantum impedance conversion. We expect ~ 10 fJ/b operation with ~80× fanout.

B. Nanophotonic-NEMS Synaptic Neural Networks

Figure 4. Nanophotonic Neural Network (e.g. 3 layer) with Nanophotonic Neurons (inset diagram on the left) at each node, and the self-optimizing nanophotonic neural network with 2×2 NEMS-MZI between each layer.

Figure 4 shows an example of the proposed Nanophotonic Neural Network with Nanophotonic Neurons at each node, and the self-optimizing nanophotonic neural network with 2×2 NEMS-MZI between each layer. An important part of a neural computation scheme is some method of providing this necessary set of weighted connections from the neuron outputs and the inputs to the next layer of neurons. The recent “universal linear optical component” that can be built up using a network of 2×2 Mach-Zehnder interferometer blocks connected in a mesh, which we proved to be able to implement any linear transform from its inputs to its outputs.
Importantly, unlike many previous optical schemes, this can be accomplished entirely without having to throw away any of the optical power unless necessary for the desired linear mapping. Furthermore, this network can be set up without calculations and without calibration of the components, based on training it with appropriate sets of input light beams, and based on a sequence of local feedback loops using local mostly-transparent detectors. This ability to train the network can be used to self-configure, self-stabilize, and even self-learn, to realize an on-line and self-learning system.

C. 3D Nano-Integrated-Neuromorphic-Circuit

Figure 5. (a) A multi-layer nanophotonic neural network computing platform in a 3D nano-integrated circuit, (b) hierarchical clustering for scalable computing.

Deep neural networks with many stages or layers can be achieved by 2D and 3D integration of the nano-circuits as illustrated in Figure 5 (a), and can be further scaled out by hierarchical clustering as shown in Figure 5 (b).

Von Neumann computing with massively parallel nano-photonic interconnects with nano-electronic logics and NEMS non-volatile memory can also be constructed in a similar manner to complete the system of Figure 1. The complete system can exploit neuromorphic computing to analyze the workload, accelerate floating-point or detailed processing, and to make intelligent decisions based on the learned experiences.

[1] R. H. Dennard, et al., IEEE Journal of Solid-State Circuits, vol. 9, no. 5, pp. 256-268, 1974.
[2] P. A. Merolla et al., Science, vol. 345, no. 6197, pp. 668-673, 2014.
[3] D. A. B. Miller, Journal of Lightwave Technology, 2017.
[4] D. A. B. Miller, Optics Letters, vol. 14, no. 2, pp. 146-148, 1989.
[5] D. A. B. Miller, Photonics Research, vol. 1, no. 1, pp. 1-15, 2013.
[6] D. Silver et al.,Nature, vol. 529, no. 7587, pp. 484-489, 2016.
[7] Y. Shen et al., Nat Photon, vol. 11, no. 7, pp. 441-446, 2017.