The explosive volume growth of deep-learning (DL) applications has ushered in a new era in computing, with neuromorphic photonic platforms promising to merge ultra-high speed and energy efficiency credentials with brain-inspired computing primitives. The transfer of deep neural networks (DNNs) onto silicon photonic (SiPho) architectures requires, however, an analog computing engine that can perform tiled matrix multiplication (TMM) at line rate to support DL applications with a large number of trainable parameters, similar to the approach followed by state-of-the-art electronic graphics processing units. Herein, we demonstrate an analog SiPho computing engine that relies on a coherent architecture and can perform optical TMM at the record-high speed of 50 GHz. Its potential to support DL applications, where the number of trainable parameters exceeds the available hardware dimensions, is highlighted through a photonic DNN that can reliably detect distributed denial-of-service attacks within a data center with a Cohen's kappa score of 0.636.
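For illustration, the tiling principle behind TMM can be sketched in software. The following minimal NumPy sketch uses illustrative tile and matrix sizes (not the engine's actual mapping) to show how a weight matrix larger than the available hardware dimensions is decomposed into hardware-sized tiles whose partial products are accumulated sequentially.

```python
import numpy as np

def tiled_matmul(W, x, tile_rows, tile_cols):
    """Compute W @ x by decomposing W into hardware-sized tiles.

    Each (tile_rows x tile_cols) tile models one pass through a photonic
    crossbar whose physical dimensions are smaller than the full layer.
    Partial products are accumulated between passes.
    """
    out = np.zeros(W.shape[0])
    for r in range(0, W.shape[0], tile_rows):
        for c in range(0, W.shape[1], tile_cols):
            tile = W[r:r + tile_rows, c:c + tile_cols]          # load one tile onto the hardware
            out[r:r + tile_rows] += tile @ x[c:c + tile_cols]   # accumulate the partial result
    return out

# Illustrative dimensions: a 128x96 layer mapped onto a hypothetical 4x4 engine.
W = np.random.randn(128, 96)
x = np.random.randn(96)
assert np.allclose(tiled_matmul(W, x, 4, 4), W @ x)
```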
The emergence of demanding machine learning and AI workloads in modern computational systems and Data Centers (DC) has fueled a drive towards custom hardware designed to accelerate Multiply-Accumulate (MAC) operations. In this context, neuromorphic photonics has recently attracted attention as a promising technological candidate that can transfer photonics' low-power, high-bandwidth credentials into neuromorphic hardware implementations. However, the deployment of such systems necessitates progress both in the underlying constituent building blocks and in the development of deep-learning training models that can take into account the physical properties of the employed photonic components and compensate for their non-ideal performance. Herein, we present an overview of our progress in photonic neuromorphic computing based on coherent layouts, which exploit the phase of the light traversing the photonic circuitry both for sign representation and for matrix manipulation. Our approach breaks through the direct trade-off between insertion loss and modulation bandwidth of state-of-the-art coherent architectures and allows high-speed operation within reasonable energy envelopes. We present a silicon-integrated coherent linear neuron (COLN) that relies on electro-absorption modulators (EAM) both for its on-chip data generation and for weighting, demonstrating a record-high 32 GMAC/sec/axon compute line rate and an experimentally obtained accuracy of 95.91% in the MNIST classification task. Moreover, we present our progress on component-specific neuromorphic circuitry training, considering both the photonic link thermal noise and its channel response. Finally, we present our roadmap for scaling our architecture using a novel optical crossbar design towards a 32×32 layout that can offer >32 GMAC/sec/axon computational power at ~0.09 pJ/MAC.
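As a numerical illustration of the coherent sign-encoding idea described above, a weighted sum with signs carried by a 0/π optical phase can be modeled as follows. The function and values are illustrative stand-ins, not the chip's actual transfer function.

```python
import numpy as np

def coherent_weighted_sum(x, w):
    """Model a coherent linear neuron where signs are carried by optical phase.

    Inputs and weights are encoded as field amplitudes |x|, |w| with a 0 or pi
    phase shift representing their sign; coherent detection of the summed field
    recovers the signed dot product.
    """
    x_field = np.abs(x) * np.exp(1j * np.pi * (np.asarray(x) < 0))  # phase 0 or pi
    w_field = np.abs(w) * np.exp(1j * np.pi * (np.asarray(w) < 0))
    return np.real(np.sum(x_field * w_field))  # detected output of the coherent sum

x = np.array([0.5, -0.3, 0.8])
w = np.array([-1.0, 0.2, 0.4])
assert np.isclose(coherent_weighted_sum(x, w), np.dot(x, w))
```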
Neuromorphic computing has emerged as a highly promising compute alternative, migrating from von Neumann architectures towards mimicking the human brain in order to sustain computational power increases within a reduced power consumption envelope. Electronic neuromorphic chips like IBM's TrueNorth, Intel's Loihi and Mythic's AI platform reveal a tremendous performance improvement in terms of computational speed and density; at the same time, neuromorphic photonic layouts are constantly gaining ground by exploiting their large component portfolio to enable GHz-bandwidth, low-energy neurons. Progressing in tight synergy with appropriate training techniques, this evolution has already started to translate into performance improvements in end-to-end applications, highlighting the practical perspectives of the new neural network hardware when effectively synergized with new training frameworks. Herein, we present a complete portfolio of neuromorphic photonic subsystems and architectures, highlighting their utilization in practical application scenarios for time-series classification and fiber transmission links. Our work spans feed-forward and recurrent photonic NN models, demonstrating experimental results together with the required training methods for bridging the gap between software-deployed NNs and the photonic hardware. We report on the experimentally validated performance of a 10 GHz photonic time-series classification engine, presenting also preliminary results on how photonic neurons can replace DSP modules in end-to-end fiber transmission schemes. The perspectives of these layouts to yield energy and area efficiency benefits are discussed through a detailed energy and area breakdown of neuromorphic photonic technologies, highlighting a promising roadmap when plasmo-photonic hardware is adopted.
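A minimal sketch of the kind of hardware-aware training referred to above, assuming a simple additive-Gaussian-noise model of the photonic link; the noise level, layer sizes and function names are illustrative and not taken from the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_photonic_layer(x, W, sigma=0.01):
    """Forward pass of a hypothetical photonic layer for hardware-aware training.

    Additive Gaussian noise (sigma is an assumed, illustrative value) stands in
    for the thermal/receiver noise of the optical link; training with this noise
    in the loop pushes the learned weights to tolerate it at inference time.
    """
    y = x @ W                                 # ideal linear (MAC) operation
    y += rng.normal(0.0, sigma, y.shape)      # injected link noise during training
    return np.tanh(y)                         # software stand-in for the optical activation

x = rng.normal(size=(8, 16))   # batch of 8 samples, 16 features (illustrative)
W = rng.normal(size=(16, 4))
print(noisy_photonic_layer(x, W).shape)  # (8, 4)
```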
The ever-increasing energy consumption of Data Centers (DC), along with the significant waste of resources observed in traditional DCs, has forced DC operators to invest in solutions that considerably improve energy efficiency. In this context, rack- and board-scale resource disaggregation is under heavy research as a groundbreaking innovation that could amortize the energy and cost impact caused by the vast diversity in resource demand of emerging DC workloads. However, disaggregation, by breaking apart the critical CPU-to-memory path, introduces a challenging set of requirements for the underlying network infrastructure, which has to support low-latency and high-throughput communication between a large number of nodes.
In this paper we present our recent work on optical interconnects for enabling resource disaggregation both at rack level and at board level. To this end, we have demonstrated the Hipoλaos architecture, which efficiently integrates Spanke-based switching with AWGR-based wavelength routing and optical feed-forward buffering into high-port-count switch layouts. The proof-of-concept Hipoλaos prototype, based on the 1024-port layout, provides a latency of 456 ns, while system-level evaluations reveal sub-μs latency performance for a variety of synthetic traffic profiles. Moving towards high-capacity board-level interconnects, we present the latest achievements realized within the context of the H2020-STREAMS project, where single-mode optical PCBs hosting Si-based routing modules and mid-board optics are exploited towards a massive any-to-any, buffer-less, collision-less and extremely low-latency routing platform with 25.6 Tb/s throughput. Finally, we combine the Hipoλaos and STREAMS architectures in a dual-layer switching scheme and evaluate its performance via system-level simulations.
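The AWGR-based wavelength routing mentioned above relies on the device's cyclic routing property. A common convention (which may differ in detail from the actual device used) can be expressed as follows; port and channel numbers below are purely illustrative.

```python
def awgr_output_port(input_port, wavelength_index, num_ports):
    """Cyclic routing rule of an NxN AWGR (one common convention).

    A packet entering 'input_port' on wavelength channel 'wavelength_index'
    exits on a port fixed by the cyclic relation below, so selecting the
    transmit wavelength selects the destination without any active switching.
    """
    return (input_port + wavelength_index) % num_ports

# Illustrative 16x16 AWGR: from input 3, wavelength channel 7 reaches output 10.
print(awgr_output_port(3, 7, 16))  # 10
```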
Neuromorphic computing has been identified as a highly promising alternative computing approach owing to its potential to rapidly increase computational efficiency, which is currently restricted by the end of Moore's law. The first electronic neuromorphic chips, like IBM's TrueNorth and Intel's Loihi, revealed a tremendous performance improvement in terms of computational speed and density; however, they still operate at MHz rates. To this end, neuromorphic photonic integrated circuits can further increase computational speed and density, drawing on a large portfolio of GHz-bandwidth, low-energy components. Herein, we present an all-optical sigmoid activation function as well as a single-λ linear neuron. The all-optical sigmoid activation function comprises a Semiconductor Optical Amplifier-Mach-Zehnder Interferometer (SOA-MZI) configured in a differentially biased scheme, followed by an SOA. Its thresholding capabilities have been experimentally demonstrated with 100 ps optical pulses. We then introduce an all-optical phase-encoded weighting scheme and experimentally demonstrate its linear algebra operational credentials by means of a typical IQ modulator operated at 10 Gbaud.
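For reference, the thresholding behaviour of an activation stage like the one above can be approximated in software by an idealized sigmoid power transfer function; the threshold and steepness values below are illustrative assumptions, not fitted to the reported measurements.

```python
import numpy as np

def optical_sigmoid(p_in, p_thresh=0.5, steepness=20.0):
    """Idealized sigmoid power transfer function (illustrative parameters).

    Models the thresholding behaviour of an SOA-MZI + SOA stage: input pulses
    below 'p_thresh' are suppressed, while those above are passed close to the
    saturated output level.
    """
    return 1.0 / (1.0 + np.exp(-steepness * (p_in - p_thresh)))

pulses = np.array([0.1, 0.4, 0.6, 0.9])      # normalized peak powers of short pulses
print(np.round(optical_sigmoid(pulses), 3))  # weak pulses suppressed, strong ones passed
```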
The urgent need for high-bandwidth and high-port connectivity in Data Centers has boosted the deployment of optoelectronic packet switches that bring high data-rate optics closer to the ASIC, realizing optical transceiver functions directly at the ASIC package for high-rate, low-energy and low-latency interconnects. Even though optics can offer a broad range of low-energy integrated switch fabrics for replacing electronic switches and seamlessly interfacing with the optical I/Os, the use of energy- and latency-consuming electronic SerDes remains a necessity, mainly dictated by the absence of integrated and reliable optical buffering solutions. The SerDes undertakes the role of optimally synergizing the lower-speed electronic buffers with the incoming and outgoing optical streams, suggesting that a SerDes-free chip-scale optical switch fabric can only be realized if all necessary functions, including contention resolution and switching, are implemented on a common photonic integration platform. In this paper, we experimentally demonstrate a hybrid Broadcast-and-Select (BS) / wavelength-routed optical switch that performs both the optical buffering and switching functions with μm-scale silicon-integrated building blocks. Optical buffering is carried out in a silicon-integrated variable delay line bank with a record-high on-chip delay/footprint efficiency of 2.6 ns/mm² and up to 17.2 ns delay capability, while switching is executed via a BS design and a silicon-integrated echelle grating, assisted by SOA-MZI wavelength conversion stages and controlled by an FPGA header-processing module. The switch has been experimentally validated in a 3×3 arrangement with 10 Gb/s NRZ optical data packets, demonstrating error-free switching operation with a power penalty of <5 dB.
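A back-of-envelope check of the buffering figures quoted above; the quantities below are derived by simple arithmetic from the stated numbers, not additional measured data.

```python
# Derived from the figures quoted in the abstract (illustrative arithmetic only).
max_delay_ns = 17.2           # maximum on-chip delay of the variable delay line bank
efficiency_ns_per_mm2 = 2.6   # reported delay/footprint efficiency
line_rate_gbps = 10           # NRZ data rate used in the experiment

footprint_mm2 = max_delay_ns / efficiency_ns_per_mm2  # ~6.6 mm^2 of silicon implied for the delay bank
bits_buffered = max_delay_ns * line_rate_gbps          # ~172 bits "in flight" at 10 Gb/s
print(f"~{footprint_mm2:.1f} mm^2, ~{bits_buffered:.0f} bits in flight")
```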
KEYWORDS: Switches, Optical switching, Field programmable gate arrays, Data centers, Switching, Computer architecture, Signal processing, Modulation, Device simulation, Data conversion
Disaggregated Data Centers (DCs) have emerged as a powerful architectural framework for increasing resource utilization and system power efficiency, requiring, however, a networking infrastructure that can ensure low-latency and high-bandwidth connectivity between a high number of interconnected nodes. This reality has been the driving force towards high-port-count and low-latency optical switching platforms, with recent efforts concluding that the use of distributed control architectures, as offered by Broadcast-and-Select (BS) layouts, can lead to sub-μs latencies. However, almost all high-port-count optical switch designs proposed so far rely either on electronic buffering and the associated SerDes circuitry for resolving contention, or on buffer-less designs with packet drop and retransmit procedures, unavoidably increasing latency or limiting throughput. In this article, we demonstrate a 256×256 optical switch architecture for disaggregated DCs that employs small-size optical delay-line buffering in a distributed control scheme, exploiting FPGA-based header processing over a hybrid BS/wavelength-routing topology implemented by a 16×16 BS design and a 16×16 AWGR. Simulation-based performance analysis reveals that even a 2-packet optical buffer can yield <620 ns latency with >85% throughput at loads up to 100%. The switch has been experimentally validated with 10 Gb/s optical data packets using 1:16 optical splitting and an SOA-MZI wavelength converter (WC) along with fiber delay lines for the 2-packet buffer implementation at every BS outgoing port, followed by an additional SOA-MZI tunable WC and the 16×16 AWGR. Error-free performance in all different switch input/output combinations has been obtained with a power penalty of <2.5 dB.
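One plausible way to picture how a flat 256-port destination address maps onto the two 16×16 stages is a simple group/member split, sketched below; the actual addressing scheme of the demonstrated switch may differ, and the function name is illustrative.

```python
def split_destination(dest_port, group_size=16):
    """Split a flat 0..255 destination address into (group, member) indices.

    One plausible decomposition for a 256x256 fabric built from 16x16 stages:
    the first index could drive the wavelength choice for the 16x16 AWGR stage,
    and the second the broadcast-and-select filter setting within a group.
    """
    return divmod(dest_port, group_size)   # (dest_port // 16, dest_port % 16)

print(split_destination(200))  # (12, 8): group 12, member 8
```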