Code Generation for Synchronous Control Asynchronous Dataflow Architectures

  • Scaling up conventional processor architectures cannot translate the ever-increasing number of transistors into comparable application performance. Although the trend is to shift from single-core to multi-core architectures, utilizing these multiple cores is not a trivial task for many applications due to thread synchronization and weak memory consistency issues. This is especially true for applications in real-time embedded systems since timing analysis becomes more complicated due to contention on shared resources. One inherent reason for the limited use of instruction-level parallelism (ILP) by conventional processors is the use of registers. Therefore, some recent processors bypass register usage by directly communicating values from producer processing units to consumer processing units. In widely used superscalar processors, this direct instruction communication is organized by hardware at runtime, adversely affecting its scalability. The exposed datapath architectures provide a scalable alternative by allowing compilers to move values directly from output ports to the input ports of processing units. Though exposed datapath architectures have already been studied in great detail, they still use registers for executing programs, thus limiting the amount of ILP they can exploit. This limitation stems from a drawback in their execution paradigm, code generator, or both. This thesis considers a novel exposed datapath architecture named Synchronous Control Asynchronous Dataflow (SCAD) that follows a hybrid control-flow dataflow execution paradigm. The SCAD architecture employs first-in-first-out (FIFO) buffers at the output and input ports of processing units. It is programmed by move instructions that transport values from the head of output buffers to the tail of input buffers. Thus, direct instruction communication is facilitated by the architecture. The processing unit triggers the execution of an operation when operand values are available at the heads of its input buffers. We propose a code generation technique for SCAD processors inspired by classical queue machines that completely eliminates the use of registers. On this basis, we first generate optimal code by using satisfiability (SAT) solvers after establishing that optimal code generation is hard. Heuristics based on a novel buffer interference analysis are then developed to compile larger programs. The experimental results demonstrate the efficacy of the execution paradigm of SCAD using our queue-oriented code generation technique.
Metadaten
Author:Anoop BhagyanathORCiD
URN:urn:nbn:de:hbz:386-kluedo-62425
DOI:https://doi.org/10.26204/KLUEDO/6242
Advisor:Klaus SchneiderORCiD
Document Type:Doctoral Thesis
Language of publication:English
Date of Publication (online):2021/01/29
Year of first Publication:2021
Publishing Institution:Technische Universität Kaiserslautern
Granting Institution:Technische Universität Kaiserslautern
Acceptance Date of the Thesis:2020/01/31
Date of the Publication (Server):2021/01/29
Tag:Code Generation; Exposed Datapath Architectures; Processor Architectures; SCAD; Synchronous Control Asynchronous Dataflow
Page Number:XVII, 126
Faculties / Organisational entities:Kaiserslautern - Fachbereich Informatik
CCS-Classification (computer science):C. Computer Systems Organization / C.1 PROCESSOR ARCHITECTURES
DDC-Cassification:0 Allgemeines, Informatik, Informationswissenschaft / 004 Informatik
Licence (German):Creative Commons 4.0 - Namensnennung, nicht kommerziell, keine Bearbeitung (CC BY-NC-ND 4.0)