{"id":143,"date":"2020-03-20T10:50:09","date_gmt":"2020-03-20T17:50:09","guid":{"rendered":"https:\/\/lab.ece.ucdavis.edu\/?p=143"},"modified":"2020-03-24T14:08:31","modified_gmt":"2020-03-24T21:08:31","slug":"service-provisioning-in-multi-domain-sd-eon-with-machine-learning-and-game-theory-approaches","status":"publish","type":"post","link":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/2020\/03\/20\/service-provisioning-in-multi-domain-sd-eon-with-machine-learning-and-game-theory-approaches\/","title":{"rendered":"Networking: AI-Assisted Self-Driving Autonomic Optical Networking"},"content":{"rendered":"\n<p><strong>Introduction<\/strong>&nbsp;<\/p>\n\n\n\n<p>The current Internet is composed of heterogeneous&nbsp;multi-AS (autonomous system)&nbsp;networks (wireless networks, optical&nbsp;networks,&nbsp;etc.),&nbsp;as shown in Fig.&nbsp;1.&nbsp;Among these, 5G and passive optical networking&nbsp;(PON)&nbsp;have been&nbsp;recognized as the building blocks for&nbsp;next-generation&nbsp;access networks,&nbsp;while&nbsp;elastic optical networking&nbsp;(EON) is emerging as one of the most promising&nbsp;technologies for future&nbsp;backbone networks&nbsp;due to its&nbsp;fine-grained and agile spectrum allocation schemes.&nbsp;With the rapid development of datacenter networks and the explosion of cloud-driven applications, the future Internet is expected to support&nbsp;dynamic,&nbsp;high-capacity, and quality-of-transmission-aware end-to-end services across domains.&nbsp;Due to the inherent complexity of optimizing service provisioning in optical networks and the heterogeneity and autonomy of&nbsp;ASes,&nbsp;current networking designs that rely on manually defined rules are becoming the main factor restricting network-wide performance and hindering the evolution of the Internet.&nbsp;Therefore, we envision a&nbsp;powerful network control and management&nbsp;(NC&amp;M)&nbsp;system&nbsp;equipped with self-learning, self-adapting, and self-healing capabilities to 
meet&nbsp;the challenges of the next-generation Internet.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-1024x811.png\" alt=\"\" class=\"wp-image-628\" width=\"563\" height=\"446\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image.png 1024w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-300x238.png 300w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-768x608.png 768w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-465x368.png 465w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-631x500.png 631w\" sizes=\"auto, (max-width: 563px) 100vw, 563px\" \/><\/figure><\/div>\n\n\n\n<p>Fig. 1 Internet Infrastructure [1].&nbsp;<\/p>\n\n\n\n<p><strong>Research&nbsp;<\/strong><strong>Activities<\/strong>&nbsp;<\/p>\n\n\n\n<p><strong><em>Network Architectural Design<\/em><\/strong><strong><em>:<\/em><\/strong>&nbsp;In this&nbsp;research, we perform architectural studies on self-driving autonomic optical networking systems, including the required system function modules&nbsp;and&nbsp;working flows.&nbsp;Fig. 2&nbsp;shows the block diagram of the proposed DRL-based autonomic networking framework. The framework is built on the basis of the&nbsp;software-defined networking (SDN)&nbsp;architecture, with&nbsp;decoupled data and NC&amp;M planes. The data plane adopts EON technologies to provision dynamic and flex-grid (e.g., at a granularity of 6.25&nbsp;GHz) optical connections for clients from metro networks, datacenters, and research facilities. Optical performance monitoring functionalities (e.g., monitoring of optical signal-to-noise ratio) are also employed for sensing the states of data plane operations. 
The NC&amp;M plane employs a remote and centralized SDN controller for service provisioning management. The SDN controller utilizes advanced network modeling languages and SDN protocols to communicate with SDN agents (locally attached to data plane equipment) for collecting service requests, distributing service schemes, and querying device conditions and monitoring data on demand.&nbsp;We design the service provisioning mechanism based on the principle of&nbsp;deep reinforcement learning&nbsp;(DRL). Specifically, upon an event (e.g., reception of a service request) that triggers a specific DRL application (e.g., DRL-based&nbsp;routing and spectrum assignment&nbsp;or failure restoration), the SDN controller makes the feature engineering module generate an EON state representation for the corresponding DRL agent. The feature engineering module retrieves various network state data (such as pending requests, in-service connections, and resource utilization) from the traffic engineering database and tailors the data to meet the demand of the DRL agent. The&nbsp;deep neural networks&nbsp;(DNNs)&nbsp;of the DRL agent take the state data as input and output a service provisioning policy to the SDN controller. Here, a service provisioning policy can be a probability distribution over a set of available service schemes. The SDN controller in turn selects a service scheme according to the policy. Based on the service provisioning outcome, the corresponding feedback is sent to the reward system. The reward system translates the feedback into an immediate reward for the DRL agent. The reward enables the DRL agent to quantitatively measure the quality of the action taken (i.e., the service scheme selected). For example, an agent gets a reward of \u20181\u2019&nbsp;if a request&nbsp;is successfully serviced, and \u20180\u2019&nbsp;otherwise. 
The service provisioning sample (i.e., the state, action, and reward tuple) is stored in the experience buffer, which afterward produces training signals to update the DNNs. In particular, the DRL agent tunes the DNNs to reinforce actions (i.e., increase the corresponding probabilities) leading to higher long-term cumulative rewards. This way, through repeated service provisioning practice, the DRL agent can progressively learn effective policies. Meanwhile, as the DRL agent performs training constantly upon new observations, it is able to adapt to gradual network evolutions. Different DRL agents can also work in collaboration through knowledge transfers for faster convergence and improved network-wide performance. Eventually, the DRL-based service provisioning design enables a fully autonomic EON system with self-learning and self-adapting capabilities. Note that, with slight modifications, the proposed framework is also applicable for networks using different data plane technologies (e.g., packet networks).&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-3.png\" alt=\"\" class=\"wp-image-631\" width=\"513\" height=\"501\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-3.png 882w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-3-300x293.png 300w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-3-768x751.png 768w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-3-465x454.png 465w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-3-512x500.png 512w\" sizes=\"auto, (max-width: 513px) 100vw, 513px\" \/><\/figure><\/div>\n\n\n\n<p>Fig.&nbsp;2.&nbsp;Schematic of self-driving autonomic optical networking 
system.&nbsp;<\/p>\n\n\n\n<p><strong><em>Service Provisioning Policy Design:<\/em><\/strong>&nbsp;We have studied the design of autonomic routing, modulation, and spectrum assignment (RMSA) agents&nbsp;(called&nbsp;DeepRMSA)&nbsp;based on the proposed autonomic networking framework.&nbsp;Fig. 3 shows an example of RMSA operations in an EON,&nbsp;where two&nbsp;lightpath&nbsp;requests,&nbsp;R_1&nbsp;(from node 1 to node 4) and&nbsp;R_2&nbsp;(from node 2 to node 5), arrive sequentially, each demanding a bandwidth of 2 or 4&nbsp;frequency slots (FS\u2019s). For the sake of clarity, we reduce the optimization dimension of the RMSA problem by fixing the routing paths as 1-2-4 and 2-4-5, respectively, and omitting the modulation format assignment procedure. Based on the spectrum utilization state on each link, two FS-blocks (i.e., [1,5] and [9,10]) are available on path 1-2-4. However, the only correct&nbsp;policy is to allocate FS-block [9,10]&nbsp;to&nbsp;R_1&nbsp;(in which case both requests are successfully serviced), since otherwise&nbsp;R_2&nbsp;will be blocked due to the lack of spare spectrum on link 2-4. 
Note that more practical RMSA problems, which involve realistic-scale topologies and larger link capacities while allowing flexible routing and modulation format choices, would be much more complicated than the above example.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-2-1024x991.png\" alt=\"\" class=\"wp-image-630\" width=\"427\" height=\"413\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-2.png 1024w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-2-300x290.png 300w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-2-768x743.png 768w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-2-465x450.png 465w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-2-517x500.png 517w\" sizes=\"auto, (max-width: 427px) 100vw, 427px\" \/><\/figure><\/div>\n\n\n\n<p>Fig. 3. An example of RMSA operations in EON.&nbsp;<\/p>\n\n\n\n<p>In our design, the&nbsp;DeepRMSA&nbsp;agents read&nbsp;the information of&nbsp;lightpath&nbsp;requests (source and destination nodes, bandwidth requirements, service duration)&nbsp;and&nbsp;the&nbsp;spectrum utilization state on the K&nbsp;shortest candidate routing paths. For each path, we compute the number of&nbsp;FS\u2019s required based on an&nbsp;impairment-aware model, the total number of available FS\u2019s, the average size of available FS&nbsp;blocks (consecutive available FS\u2019s), and the size and starting position of the first available FS&nbsp;block. We adopt a fully-connected&nbsp;DNN&nbsp;consisting of five hidden layers of 128 neurons&nbsp;and&nbsp;single-layer policy and value heads. 
An agent receives a reward of 1 if a&nbsp;lightpath&nbsp;request&nbsp;is successfully serviced and -1 if it is rejected.&nbsp;For more details of the training process, please refer to our work in [2]. Fig. 4&nbsp;plots the results of request blocking probability, where SP-FF and KSP-FF are baseline algorithms representing the state of the art.&nbsp;It can be seen that&nbsp;DeepRMSA&nbsp;beats both baselines after around 30,000&nbsp;training epochs and eventually achieves a blocking reduction of 45.9%&nbsp;compared with KSP-FF. The average spectrum utilization ratios from&nbsp;DeepRMSA&nbsp;(after&nbsp;200,000 training epochs), KSP-FF, and SP-FF are 32.6%, 30.4%, and&nbsp;27.2%,&nbsp;respectively. Since&nbsp;DeepRMSA&nbsp;can accommodate more requests, it utilizes the largest amount of spectrum resources.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-1.png\" alt=\"\" class=\"wp-image-629\" width=\"438\" height=\"330\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-1.png 488w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-1-300x226.png 300w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-1-465x351.png 465w\" sizes=\"auto, (max-width: 438px) 100vw, 438px\" \/><\/figure><\/div>\n\n\n\n<p>Fig.&nbsp;4. Results of request blocking probability during the learning process.&nbsp;<\/p>\n\n\n\n<p><strong><em>System Design and Implementation:<\/em><\/strong>&nbsp;This&nbsp;research&nbsp;also aims at&nbsp;designing and implementing a self-driving autonomic prototype system for verifying the proposed designs, as shown in Fig. 5 for&nbsp;a two-domain, seven-node SD-EON&nbsp;testbed. 
The first domain has a star-ring architecture that consists&nbsp;of four nodes, while the second domain has a three-node&nbsp;ring architecture. Each node is connected to other nodes by&nbsp;spools of single-mode fiber (SMF) or dispersion-shifted fiber&nbsp;(DSF) of different lengths (15, 20, and 25 km). A 10&nbsp;GBd&nbsp;16-QAM coherent transmitter generates the testing signal used for&nbsp;data training and prediction. This signal is multiplexed with&nbsp;20 dense wavelength division multiplexing&nbsp;(DWDM) on-off keying (OOK) signals at 10 Gb\/s with 50 GHz spacing, which serve as the background traffic. The signal at the output of the multiplexer&nbsp;is injected into the testbed. Optical spectrum analyzer&nbsp;(OSA)-based OPMs are placed at the input of each node&nbsp;to monitor the optical power and the spectrum occupancy of&nbsp;the background traffic.&nbsp;<\/p>\n\n\n\n<p>The training and evaluation datasets were&nbsp;collected by enumerating each of the possible routing paths. We applied random&nbsp;routing for the background traffic and random attenuations&nbsp;(0 dB\u20137 dB for each WSS) for all the signals to purposely introduce&nbsp;perturbations to the network, allowing us to sample the&nbsp;entire input space of the unknown target function that correlates&nbsp;the&nbsp;quality of transmission (QoT)&nbsp;and the OPM readings. The launch power of each fiber span varies from \u22127&nbsp;dBm&nbsp;to 12&nbsp;dBm&nbsp;depending on the randomly&nbsp;applied attenuation and routing at each WSS node. At each run,&nbsp;we measured the actual Q-factor of the testing signal at Node G&nbsp;and recorded this value as the label of the current dataset. We implemented the domain manager-level ANNs and&nbsp;broker-level ANNs with 25 hidden units. For benchmarking purposes,&nbsp;we also implemented an omniscient ANN that can access&nbsp;all the OPM data of the two domains. 
Fig.&nbsp;6&nbsp;shows the training and Q-factor prediction performance&nbsp;of the omniscient and hierarchical estimators for one of&nbsp;the routing paths. Both ANNs converge properly&nbsp;without overfitting. The Q-factor deviations for the omniscient&nbsp;and hierarchical estimators are 0.5 and 0.52 dB, respectively. From Fig.&nbsp;6, the hierarchical estimator appears to converge more slowly and to have&nbsp;a slightly higher out-of-sample&nbsp;error than the&nbsp;omniscient ANN. These results indicate that the&nbsp;proposed hierarchical estimator can achieve nearly ideal&nbsp;QoT&nbsp;prediction performance (with a small penalty) while supporting&nbsp;the autonomy and privacy of each autonomous domain.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-5-1024x579.png\" alt=\"\" class=\"wp-image-633\" width=\"472\" height=\"266\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-5.png 1024w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-5-300x170.png 300w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-5-768x434.png 768w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-5-465x263.png 465w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-5-695x393.png 695w\" sizes=\"auto, (max-width: 472px) 100vw, 472px\" \/><\/figure><\/div>\n\n\n\n<p>Fig. 
5.&nbsp;Our field-test experimental testbed.&nbsp;DSP: digital signal processing; DAC: digital-to-analog converter; IQM: I\/Q modulator; OF-Agents:&nbsp;OpenFlow&nbsp;agents; WSS: wavelength selective switch; Co-Rx: coherent receiver; SFP: small form-factor pluggable.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-4.png\" alt=\"\" class=\"wp-image-632\" width=\"493\" height=\"343\" srcset=\"https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-4.png 918w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-4-300x208.png 300w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-4-768x534.png 768w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-4-465x323.png 465w, https:\/\/sierra.ece.ucdavis.edu\/wp-content\/uploads\/2020\/03\/image-4-695x483.png 695w\" sizes=\"auto, (max-width: 493px) 100vw, 493px\" \/><\/figure><\/div>\n\n\n\n<p>Fig. 6. Inter-domain learning performance: (a) MSE&nbsp;vs. training iterations for omniscient estimator; (b)&nbsp;QoT&nbsp;prediction accuracy for&nbsp;omniscient estimator; (c) MSE vs. training iterations for hierarchical estimator;&nbsp;(d)&nbsp;QoT&nbsp;prediction accuracy for hierarchical estimator.&nbsp;<\/p>\n\n\n\n<p><strong>References<\/strong>&nbsp;<\/p>\n\n\n\n<p>[1]&nbsp;https:\/\/esdn.upc.edu\/en&nbsp;<\/p>\n\n\n\n<p>[2]&nbsp;&#8220;DeepRMSA: A Deep Reinforcement Learning Framework for Routing, Modulation and Spectrum Assignment in Elastic Optical Networks&#8221;, X. Chen et al.,&nbsp;<em>IEEE JLT<\/em>, 2019.&nbsp;<\/p>\n\n\n\n<p>[3]&nbsp;&#8220;Hierarchical learning for cognitive end-to-end service provisioning in multi-domain autonomous optical networks&#8221;,&nbsp;G. 
Liu et al.,&nbsp;<em>IEEE JLT<\/em>, 2019.&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction&nbsp; Current Internet is composed of heterogeneous&nbsp;multi-AS (autonomous system)&nbsp;networks (wireless, optical&nbsp;networks&nbsp;etc.)&nbsp;as shown in Fig.&nbsp;1.&nbsp;Wherein, 5G and optical passive networking&nbsp;(PON)&nbsp;have been&nbsp;recognized<span class=\"more-link\"><a href=\"https:\/\/sierra.ece.ucdavis.edu\/index.php\/2020\/03\/20\/service-provisioning-in-multi-domain-sd-eon-with-machine-learning-and-game-theory-approaches\/\">Continue Reading<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":438,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[27,24],"tags":[],"class_list":["entry","author-hluucdavis-edu","post-143","post","type-post","status-publish","format-standard","has-post-thumbnail","category-cognitive-machine-learning-networking","category-networking"],"_links":{"self":[{"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/posts\/143","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/comments?post=143"}],"version-history":[{"count":11,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/posts\/143\/revisions"}],"predecessor-version":[{"id":851,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/posts\/143\/revisions\/851"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/media\/438"}],"wp:attachment":[{"href":"https:\/\/sierra.ece
.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/media?parent=143"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/categories?post=143"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sierra.ece.ucdavis.edu\/index.php\/wp-json\/wp\/v2\/tags?post=143"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}