{"id":655,"date":"2020-03-21T12:26:37","date_gmt":"2020-03-21T19:26:37","guid":{"rendered":"https:\/\/sierra.ece.ucdavis.edu\/?p=655"},"modified":"2020-03-22T11:32:59","modified_gmt":"2020-03-22T18:32:59","slug":"computing-architecture-algorithm-and-testbed-studies-for-reconfigurable-computing-with-photonic-interconnects-and-ai","status":"publish","type":"post","link":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/2020\/03\/21\/computing-architecture-algorithm-and-testbed-studies-for-reconfigurable-computing-with-photonic-interconnects-and-ai\/","title":{"rendered":"Computing: Reconfigurable Computing with Photonic Interconnects and AI"},"content":{"rendered":"\n<p><strong>Introduction<\/strong><\/p>\n\n\n\n<p>Current high-performance\ncomputing (HPC) systems are increasingly exploiting heterogeneous computing\nnodes to improve performance in terms of latency and energy utilization for completing\nspecific computation tasks [1, 2]. While the communication patterns driven by\nmodern workloads exhibit temporal bursts and spatial non-uniformity [3, 4],\ntoday\u2019s interconnection networks based on electronic switches and optical\nfibers are inherently rigid: they cannot change the network topology or link\nbandwidth to adequately cope with the significant variations in traffic\npatterns. It is therefore desirable to design a bandwidth-reconfigurable\ninterconnection network that can adapt its connectivity to the various traffic\ndemands [5-7]. <\/p>\n\n\n\n<p>There have been recent\nadvances in silicon photonic (SiPh) integrated reconfigurable wavelength\nrouting and space switching that allow the connectivity to be redefined in both\nthe spectral and spatial domains on demand. Indeed, wavelength-and-space selective\nswitching fabrics that can reconfigure the bandwidth between selected pairs of\ninput and output ports have been demonstrated [8, 9]. 
Recently, we proposed and\ndemonstrated a SiPh bandwidth-reconfigurable all-to-all interconnection switch,\n\u2018Flexible Low-Latency Interconnect Optical Network Switch (Flex-LIONS),\u2019\nenabled by the combination of all-to-all interconnection using an arrayed waveguide\ngrating router (AWGR) and multi-wavelength selective switches [10]. While\nFlex-LIONS has superior performance in terms of scalability and energy\nconsumption when compared with other proposed architectures (see [10] for more\ndetails), the network and application layers still need specific reconfiguration\npolicies and algorithms to take advantage of this physical-layer reconfiguration\ncapability. In particular, we are interested in exploiting\nemerging AI techniques to address the challenges related to reconfiguration\npolicies. <\/p>\n\n\n\n<p><strong>Reconfigurable Architecture with Machine-learning-based Cognitive Control Plane<\/strong><\/p>\n\n\n\n<p>Figure 1\nshows the architecture that we are currently investigating. We call this\narchitecture Hyper-Flex-LIONS: it leverages\nFlex-LIONS to enhance a\nDragon-Fly-like topology with unique optical reconfiguration capabilities\nwithin a group and between groups, based on an <em>observe-analyze-act<\/em> cycle\nthat exploits deep learning techniques. Optical reconfiguration is achieved using\nthe SiPh Flex-LIONS technology discussed below. 
<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"314\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing1.png\" alt=\"\" class=\"wp-image-656\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing1.png 624w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing1-300x151.png 300w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing1-465x234.png 465w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><figcaption> Figure 1. Architecture of Hyper-Flex-LIONS, an optical reconfigurable Dragon-Fly with DRL-based reconfiguration for    ,    , and    .  <\/figcaption><\/figure><\/div>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing2.png\" alt=\"\" class=\"wp-image-657\" width=\"382\" height=\"396\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing2.png 836w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing2-289x300.png 289w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing2-768x798.png 768w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing2-465x483.png 465w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing2-481x500.png 481w\" sizes=\"auto, (max-width: 382px) 100vw, 382px\" \/><figcaption> Figure 2. (Top) Flex-LIONS (<em>N<\/em>=4, &nbsp;<em>b<\/em>=3) architecture with AWGR, MRR add-drop filters and multi-wavelength MRR crossbar switch. (Bottom) Microscope image of fabricated eight-port SiPh Flex-LIONS (<em>N<\/em>=8, <em>b<\/em>=3) and transmission spectra of 8\u00d78 AWGR from input port 4. 
<\/figcaption><\/figure><\/div>\n\n\n\n<p>Figure 2 illustrates the working principle of\nFlex-LIONS. The SiPh Flex-LIONS has an <em>N<\/em>-port AWGR and <em>b<\/em>\nmicroring resonator (MRR) add-drop filters at each AWGR input\/output port. For\nuniform traffic, all MRR add-drop filters can be set off-resonance so that each\ninput port provides <em>N<\/em> wavelength division multiplexing (WDM) signals to\ninterconnect with all <em>N<\/em> output ports according to the all-to-all\nwavelength routing property of the AWGR [12]. For other traffic patterns,\nthe MRR filters can be tuned into resonance to select specific wavelength\nchannels to be switched by the multi-wavelength switch (for the SiPh chip shown\nin Figure 2, the multi-wavelength switch is implemented as\nan MRR crossbar [10]). This effectively creates a different topology and\nincreases by a factor of <em>b<\/em> the bandwidth between the port pairs\nconnected through the multi-wavelength switch. <\/p>\n\n\n\n<p>Figure 3\ndepicts the network control and management (NC&amp;M) framework of Hyper-Flex-LIONS. Each group is equipped with\na group manager that manages the data plane operations within the group, using\na software-defined networking (SDN) paradigm. Meanwhile, Hyper-Flex-LIONS\nemploys an inter-group manager at a higher level of the hierarchy, which is responsible for\nmanaging inter-group reconfiguration. The group and inter-group managers apply\nadvanced machine learning (ML) technologies (i.e., deep reinforcement learning, DRL) at\ndifferent time-scales to achieve knowledge-based cognitive networking, forming\na hierarchical <em>observe-analyze-act<\/em> paradigm similar to the interworking of\nbrain and reflex. 
<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing3-1024x474.png\" alt=\"\" class=\"wp-image-658\" width=\"540\" height=\"249\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing3-1024x474.png 1024w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing3-300x139.png 300w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing3-768x356.png 768w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing3-465x215.png 465w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing3-695x322.png 695w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing3-920x425.png 920w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing3.png 1377w\" sizes=\"auto, (max-width: 540px) 100vw, 540px\" \/><figcaption> Figure 3. (a) Hierarchical control and management architecture and (b) workflow of Hyper-Flex-LIONS. <\/figcaption><\/figure><\/div>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Preliminary Results<\/strong><\/p>\n\n\n\n<p>We used the OMNeT++ simulator and TensorFlow to simulate the DRL-based\nreconfigurable Flex-LIONS architecture. We assumed 16 Top-of-Rack (ToR)\nswitches interconnected with one 16-port Flex-LIONS. We considered four\npossible topologies that the DRL algorithm can choose from. We used time-varying\ntraffic consisting of four traffic patterns: adversarial, neighbor exchange,\ninter-group all-to-all, and intra-group all-to-all (a group is composed of four racks). The\nfour patterns appear periodically. For the training process, the four changing\ntraffic patterns and the four network topologies are all set as part of the\nDNNs\u2019 input features. 
The DNN models consist of two convolutional layers and\nfive fully connected layers, each with 128 neurons.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing4-1024x387.png\" alt=\"\" class=\"wp-image-659\" width=\"560\" height=\"212\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing4-1024x387.png 1024w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing4-300x113.png 300w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing4-768x290.png 768w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing4-465x176.png 465w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing4-695x263.png 695w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing4.png 1267w\" sizes=\"auto, (max-width: 560px) 100vw, 560px\" \/><figcaption> Figure 4. (Left) Reward of DRL scheme for different learning rates. (Right) Average end-to-end delay for different injection rates for fixed topologies and DRL-based scheme under time-varying traffic.  <\/figcaption><\/figure><\/div>\n\n\n\n<p>Figure 4 (Left) shows that the reward value converges\nduring training, which indicates that the DRL agent learns to maintain the\nlowest network end-to-end delay. In addition, the convergence behavior depends\non the learning rate. We compared our DRL-based reconfigurable\narchitecture with several fixed-topology networks in terms of average end-to-end delay [see\nFigure 4 (Right)]. 
The proposed DRL-based\nreconfigurable architecture achieves the lowest average network latency\nat every simulated packet injection rate.<\/p>\n\n\n\n<p><strong>Ongoing testbed work<\/strong><\/p>\n\n\n\n<p>Figure 5\nshows our ongoing effort to evaluate the proposed architecture and\nreconfiguration algorithms on a real testbed that combines research-grade\nphotonic interconnect prototypes with\ncommercial top-of-rack switches, servers, and open-source software\nfor the network control and management plane and for application management.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"968\" height=\"443\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing5.png\" alt=\"\" class=\"wp-image-666\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing5.png 968w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing5-300x137.png 300w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing5-768x351.png 768w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing5-465x213.png 465w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/Computing5-695x318.png 695w\" sizes=\"auto, (max-width: 968px) 100vw, 968px\" \/><figcaption> Figure 5. Ongoing testbed work involving SiPh Flex-LIONS prototype, ToR switches, servers and SDN cognitive control plane. <\/figcaption><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Current Research Opportunities<\/strong><\/p>\n\n\n\n<p>We are currently seeking Master\u2019s students, PhD students and\npostdoctoral researchers with a variety of skills who are interested in\nworking on architectures, algorithms and hands-on testbed work for\ndemonstrating innovative ideas and solutions in the context of the above\nresearch topic. 
Interested candidates should send their resumes to <a href=\"mailto:sbyoo@ucdavis.edu\">sbyoo@ucdavis.edu<\/a> or <a href=\"mailto:rproietti@ucdavis.edu\">rproietti@ucdavis.edu<\/a>. <\/p>\n\n\n\n<p>REFERENCES<\/p>\n\n\n\n<p>[1] Mittal, S., Vetter, J. S.: &#8216;A survey of CPU-GPU\nheterogeneous computing techniques&#8217;, ACM Computing Surveys (CSUR) 47.4 (2015):\n69.<\/p>\n\n\n\n<p>[2] Schulte, M. J., Ignatowski, M., Loh, G. H., et al.:\n&#8216;Achieving exascale capabilities through heterogeneous computing&#8217;, IEEE Micro\n35.4 (2015): 26-36.<\/p>\n\n\n\n<p>[3] Roy, A., Zeng, H., Bagga, J., et al.: &#8216;Inside the social\nnetwork&#8217;s (datacenter) network&#8217;, ACM SIGCOMM Computer Communication Review. Vol.\n45. No. 4. ACM, 2015.<\/p>\n\n\n\n<p>[4] Zhang, Q., Liu,\nV., Zeng, H., et al.: &#8216;High-resolution measurement of data center microbursts&#8217;.\nProceedings of the 2017 Internet Measurement Conference. ACM, 2017.<\/p>\n\n\n\n<p>[5] Cao, Z., Proietti,\nR., Clements, M., et al.: &#8216;Experimental demonstration of flexible bandwidth\noptical data center core network with all-to-all interconnectivity&#8217;, Journal of\nLightwave Technology 33.8 (2015): 1578-1585.<\/p>\n\n\n\n<p>[6] Proietti, R., Liu,\nG., Xiao, X., et al.: &#8216;FlexLION: A Reconfigurable All-to-All Optical Interconnect\nFabric with Bandwidth Steering&#8217;. 2019 Conference on Lasers and Electro-Optics\n(CLEO). IEEE, 2019.<\/p>\n\n\n\n<p>[7] S. Salman, C. Streiffer, H. Chen, T. Benson, and A.\nKadav, \u201cDeepConf: Automating data center network topologies management with\nmachine learning,\u201d in Proc. of NetAI, (2018), pp. 8\u201314.<\/p>\n\n\n\n<p>[8] Seok, T. J., Luo,\nJ., Huang, Z., et al.: &#8216;MEMS-Actuated 8\u00d78 Silicon Photonic\nWavelength-Selective Switches with 8 Wavelength Channels&#8217;. 2018 Conference on\nLasers and Electro-Optics (CLEO). 
IEEE, 2018.<\/p>\n\n\n\n<p>[9] Khope, A. S. P.,\nSaeidi, M., Yu, R., et al.: &#8216;Multi-wavelength selective crossbar switch&#8217;,\nOptics Express 27.4 (2019): 5203-5216.<\/p>\n\n\n\n<p>[10] Xiao, X., Proietti, R., Liu, G., Lu, H., Zhang, Y., Yoo,\nS.J.B., &#8220;Experimental Demonstration of SiPh Flex-LIONS for Bandwidth-Reconfigurable\nOptical Interconnects&#8221;, ECOC, 2019.<\/p>\n\n\n\n<p>[11] Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel,\nIgor Mordatch, \u201cMulti-Agent Actor-Critic for Mixed Cooperative-Competitive\nEnvironments,\u201d arXiv:1706.02275.<\/p>\n\n\n\n<p style=\"text-align:left\">[12] Proietti, R., Cao, Z., Nitta, C. J., et al.: &#8216;A scalable, low-latency, high-throughput, optical interconnect architecture based on arrayed waveguide grating routers&#8217;, Journal of Lightwave Technology 33.4 (2015): 911-920.<\/p>\n\n\n\n<p>[13] Guojun Yuan, Roberto Proietti, Xiaoli Liu, Alberto Castro, Dawei Zang, Ninghui Sun, CheYu Liu, Zheng Cao, and S. J. 
Ben Yoo, \u201cARON: Application-Driven Reconfigurable Optical Networking for HPC Data Centers\u201d, European Conference on Optical Communications (ECOC), 2016<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Current high-performance computing (HPC) systems are increasingly exploiting heterogeneous computing nodes to improve performance in terms of latency and<span class=\"more-link\"><a href=\"https:\/\/sierra.ece.ucdavis.edu\/index.php\/2020\/03\/21\/computing-architecture-algorithm-and-testbed-studies-for-reconfigurable-computing-with-photonic-interconnects-and-ai\/\">Continue Reading<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":656,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20,21,23],"tags":[],"class_list":["entry","author-hluucdavis-edu","post-655","post","type-post","status-publish","format-standard","has-post-thumbnail","category-computing","category-high-performance-computing","category-scalable-modular-datacenters"],"_links":{"self":[{"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/posts\/655","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/comments?post=655"}],"version-history":[{"count":10,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/posts\/655\/revisions"}],"predecessor-version":[{"id":768,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/posts\/655\/revisions\/768"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.
php\/wp-json\/wp\/v2\/media\/656"}],"wp:attachment":[{"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/media?parent=655"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/categories?post=655"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/tags?post=655"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}