Real-time systems are systems that have to react correctly to stimuli from the environment within given timing constraints.
Today, real-time systems are employed everywhere in industry, not only in safety-critical systems but also in, e.g., communication, entertainment, and multimedia systems.
With the advent of multicore platforms, new challenges on the efficient exploitation of real-time systems have arisen:
First, there is the need for effective scheduling algorithms that feature low overheads to improve the use of the computational resources of real-time systems.
The goal of these algorithms is to ensure timely execution of tasks, i.e., to provide runtime guarantees.
Additionally, many systems require their scheduling algorithm to flexibly react to unforeseen events.
Second, the inherent parallelism of multicore systems leads to contention for shared hardware resources and complicates system analysis.
At any time, multiple applications run with varying resource requirements and compete for the scarce resources of the system.
As a result, there is a need for an adaptive resource management.
Achieving and implementing an effective and efficient resource management is a challenging task.
The main goal of resource management is to guarantee a minimum resource availability to real-time applications.
A further goal is to fulfill global optimization objectives, e.g., maximization of the global system performance, or the user perceived quality of service.
In this thesis, we derive methods based on the slot shifting algorithm.
Slot shifting provides flexible scheduling of time-constrained applications and can react to unforeseen events in time-triggered systems.
For this reason, we aim at designing slot shifting based algorithms targeted for multicore systems to tackle the aforementioned challenges.
The main contribution of this thesis is to present two global slot shifting algorithms targeted for multicore systems.
Additionally, we extend slot shifting algorithms to improve their runtime behavior, or to handle non-preemptive firm aperiodic tasks.
In a variety of experiments, the effectiveness and efficiency of the algorithms are evaluated and confirmed.
Finally, the thesis presents an implementation of a slot-shifting-based logic into a resource management framework for multicore systems.
Thus, the thesis closes the circle and successfully bridges the gap between real-time scheduling theory and real-world implementations.
We prove applicability of the slot shifting algorithm to effectively and efficiently perform adaptive resource management on multicore systems.
Specification of asynchronous circuit behaviour becomes more complex as the
complexity of today’s System-On-a-Chip (SOC) design increases. This also causes
the Signal Transition Graphs (STGs) – interpreted Petri nets for the specification
of asynchronous circuit behaviour – to become bigger and more complex, which
makes it more difficult, sometimes even impossible, to synthesize an asynchronous
circuit from an STG with a tool like petrify [CKK+96] or CASCADE [BEW00].
It has, therefore, been suggested to decompose the STG as a first step; this
leads to a modular implementation [KWVB03] [KVWB05], which can reduce syn-
thesis effort by possibly avoiding state explosion or by allowing the use of library
elements. A decomposition approach for STGs was presented in [VW02] [KKT93]
[Chu87a]. The decomposition algorithm by Vogler and Wollowski [VW02] is based
on that of Chu [Chu87a] but is much more generally applicable than the one in
[KKT93] [Chu87a], and its correctness has been proved formally in [VW02].
This dissertation begins with Petri net background described in chapter 2.
It starts with a class of Petri nets called a place/transition (P/T) nets. Then
STGs, the subclass of P/T nets, is viewed. Background in net decomposition
is presented in chapter 3. It begins with the structural decomposition of P/T
nets for analysis purposes – liveness and boundedness of the net. Then STG
decomposition for synthesis from [VW02] is described.
The decomposition method from [VW02] still could be improved to deal with
STGs from real applications and to give better decomposition results. Some
improvements for [VW02] to improve decomposition result and increase algorithm
efficiency are discussed in chapter 4. These improvement ideas are suggested in
[KVWB04] and some of them are have been proved formally in [VK04].
The decomposition method from [VW02] is based on net reduction to find
an output block component. A large amount of work has to be done to reduce
an initial specification until the final component is found. This reduction is not
always possible, which causes input initially classified as irrelevant to become
relevant input for the component. But under certain conditions (e.g. if structural
auto-conflicts turn out to be non-dynamic) some of them could be reclassified as
irrelevant. If this is not done, the specifications become unnecessarily large, which
intern leads to unnecessarily large implemented circuits. Instead of reduction, a
new approach, presented in chapter 5, decomposes the original net into structural
components first. An initial output block component is found by composing the
structural components. Then, a final output block component is obtained by net
As we cope with the structure of a net most of the time, it would be useful
to have a structural abstraction of the net. A structural abstraction algorithm
[Kan03] is presented in chapter 6. It can improve the performance in finding an
output block component in most of the cases [War05] [Taw04]. Also, the structure
net is in most cases smaller than the net itself. This increases the efficiency of the
decomposition algorithm because it allows the transitions contained in a node of
the structure graph to be contracted at the same time if the structure graph is
used as internal representation of the net.
Chapter 7 discusses the application of STG decomposition in asynchronous
circuit design. Application to speed independent circuits is discussed first. Af-
ter that 3D circuits synthesized from extended burst mode (XBM) specifications
are discussed. An algorithm for translating STG specifications to XBM specifi-
cations was first suggested by [BEW99]. This algorithm first derives the state
machine from the STG specification, then translates the state machine to XBM
specification. An XBM specification, though it is a state machine, allows some
concurrency. These concurrencies can be translated directly, without deriving
all of the possible states. An algorithm which directly translates STG to XBM
specifications, is presented in chapter 7.3.1. Finally DESI, a tool to decompose
STGs and its decomposition results are presented.
Seit Aufkommen der Halbleiter-Technologie existiert ein Trend zur Miniaturisierung elektronischer Systeme. Dies, steigende Anforderungen sowie die zunehmende Integration verschiedener Sensoren zur Interaktion mit der Umgebung lassen solche eingebetteten Systeme, wie sie zum Beispiel in mobilen Geräten oder Fahrzeugen vorkommen, zunehmend komplexer werden. Die Folgen sind ein Anstieg der Entwicklungszeit und ein immer höherer Bauteileaufwand, bei gleichzeitig geforderter Reduktion von Größe und Energiebedarf. Insbesondere der Entwurf von Multi-Sensor-Systemen verlangt für jeden verwendeten Sensortyp jeweils gesondert nach einer spezifischen Sensorelektronik und steht damit den Forderungen nach Miniaturisierung und geringem Leistungsverbrauch entgegen.
In dieser Forschungsarbeit wird das oben beschriebene Problem aufgegriffen und die Entwicklung eines universellen Sensor-Interfaces für eben solche Multi-Sensor-Systeme erörtert. Als ein einzelner integrierter Baustein kann dieses Interface bis zu neun verschiedenen Sensoren unterschiedlichen Typs als Sensorelektronik dienen. Die aufnehmbaren Messgrößen umfassen: Spannung, Strom, Widerstand, Kapazität, Induktivität und Impedanz.
Durch dynamische Rekonfigurierbarkeit und applikationsspezifische Programmierung wird eine variable Konfiguration entsprechend der jeweiligen Anforderungen ermöglicht. Sowohl der Entwicklungs- als auch der Bauteileaufwand können dank dieser Schnittstelle, die zudem einen Energiesparmodus beinhaltet, erheblich reduziert werden.
Die flexible Struktur ermöglicht den Aufbau intelligenter Systeme mit sogenannten Self-x Charakteristiken. Diese betreffen Fähigkeiten zur eigenständigen Systemüberwachung, Kalibrierung oder Reparatur und tragen damit zu einer erhöhten Robustheit und Fehlertoleranz bei. Als weitere Innovation enthält das universelle Interface neuartige Schaltungs- und Sensorkonzepte, beispielsweise zur Messung der Chip-Temperatur oder Kompensation thermischer Einflüsse auf die Sensorik.
Zwei unterschiedliche Anwendungen demonstrieren die Funktionalität der hergestellten Prototypen. Die realisierten Applikationen haben die Lebensmittelanalyse sowie die dreidimensionale magnetische Lokalisierung zum Gegenstand.
The objective of this thesis consists in developing systematic event-triggered control designs for specified event generators, which is an important alternative to the traditional periodic sampling control. Sporadic sampling inherently arising in event-triggered control is determined by the event-triggering conditions. This feature invokes the desire of
finding new control theory as the traditional sampled-data theory in computer control.
Developing controller coupling with the applied event-triggering condition to maximize the control performance is the essence for event-triggered control design. In the design the stability of the control system needs to be ensured with the first priority. Concerning variant control aims they should be clearly incorporated in the design procedures. Considering applications in embedded control systems efficient implementation requires a low complexity of embedded software architectures. The thesis targets at offering such a design to further complete the theory of event-triggered control designs.
In this thesis we studied and investigated a very common but a long existing noise problem and we provided a solution to this problem. The task is to deal with different types of noise that occur simultaneously and which we call hybrid. Although there are individual solutions for specific types one cannot simply combine them because each solution affects the whole speech. We developed an automatic speech recognition system DANSR ( Dynamic Automatic Noisy Speech Recognition System) for hybrid noisy environmental noise. For this we had to study all of speech starting from the production of sounds until their recognition. Central elements are the feature vectors on which pay much attention. As an additional effect we worked on the production of quantities for psychoacoustic speech elements.
The thesis has four parts:
1) The first part we give an introduction. The chapter 2 and 3 give an overview over speech generation and recognition when machines are used. Also noise is considered.
2) In the second part we describe our general system for speech recognition in a noisy environment. This is contained in the chapters 4-10. In chapter 4 we deal with data preparation. Chapter 5 is concerned with very strong noise and its modeling using Poisson distribution. In the chapters 5-8 we deal with parameter based modeling. Chapter 7 is concerned with autoregressive methods in relation to the vocal tract. In the chapters 8 and 9 we discuss linear prediction and its parameters. Chapter 9 is also concerned with quadratic errors, the decomposition into sub-bands and the use of Kalman filters for non-stationary colored noise in chapter 10. There one finds classical approaches as long we have used and modified them. This includes covariance mehods, the method of Burg and others.
3) The third part deals firstly with psychoacoustic questions. We look at quantitative magnitudes that describe them. This has serious consequences for the perception models. For hearing we use different scales and filters. In the center of the chapters 12 and 13 one finds the features and their extraction. The fearures are the only elements that contain information for further use. We consider here Cepstrum features and Mel frequency cepstral coefficients(MFCC), shift invariant local trigonometric transformed (SILTT), linear predictive coefficients (LPC), linear predictive cepstral coefficients (LPCC), perceptual linear predictive (PLP) cepstral coefficients. In chapter 13 we present our extraction methods in DANSR and how they use window techniques And discrete cosine transform (DCT-IV) as well as their inverses.
4) The fourth part considers classification and the ultimate speech recognition. Here we use the hidden Markov model (HMM) for describing the speech process and the Gaussian mixture model (GMM) for the acoustic modelling. For the recognition we use forward algorithm, the Viterbi search and the Baum-Welch algorithm. We also draw the connection to dynamic time warping (DTW). In the rest we show experimental results and conclusions.
The work presented in this thesis discusses the thermal and power management of multi-core processors (MCPs) with both two dimensional (2D) package and there dimensional (3D) package chips. The power and thermal management/balancing is of increasing concern and is a technological challenge to the MCP development and will be a main performance bottleneck for the development of MCPs. This thesis develops optimal thermal and power management policies for MCPs. The system thermal behavior for both 2D package and 3D package chips is analyzed and mathematical models are developed. Thereafter, the optimal thermal and power management methods are introduced.
Nowadays, the chips are generally packed in 2D technique, which means that there is only one layer of dies in the chip. The chip thermal behavior can be described by a 3D heat conduction partial differential equation (PDE). As the target is to balance the thermal behavior and power consumption among the cores, a group of one dimensional (1D) PDEs, which is derived from the developed 3D PDE heat conduction equation, is proposed to describe the thermal behavior of each core. Therefore, the thermal behavior of the MCP is described by a group of 1D PDEs. An optimal controller is designed to manage the power consumption and balance the temperature among the cores based on the proposed 1D model.
3D package is an advanced package technology, which contains at least 2 layers of dies stacked in one chip. Different from 2D package, the cooling system should be installed among the layers to reduce the internal temperature of the chip. In this thesis, the micro-channel liquid cooling system is considered, and the heat transfer character of the micro-channel is analyzed and modeled as an ordinary differential equation (ODE). The dies are discretized to blocks based on the chip layout with each block modeled as a thermal resistance and capacitance (R-C) circuit. Thereafter, the micro-channels are discretized. The thermal behavior of the whole system is modeled as an ODE system. The micro-channel liquid velocity is set according to the workload and the temperature of the dies. Under each velocity, the system can be described as a linear ODE model system and the whole system is a switched linear system. An H-infinity observer is designed to estimate the states. The model predictive control (MPC) method is employed to design the thermal and power management/balancing controller for each submodel.
The models and controllers developed in this thesis are verified by simulation experiments via MATLAB. The IBM cell 8 cores processor and water micro-channel cooling system developed by IBM Research in collaboration with EPFL and ETHZ are employed as the experiment objects.
This paper presents a case study comparing the hardware description language „Constructing Hardware in a Scala Embedded Language“(Chisel) to VHDL. For a thorough comparison the Heston Model was implemented, a stochastic model used in financial mathematics to calculate option prices. Metrics like hardware utilization and maximum clock rate were extracted from both resulting designs and compared to each other. The results showed a 30% reduction in code size compared to VHDL, while the resulting circuits had about the same hardware utilization. Using Chisel however proofed to be difficult because of a few features that were not available for this case study.
Chisel (Constructing Hardware in a Scala embedded language) is a new programming language, which embedded in Scala, used for hardware synthesis. It aims to increase productivity when creating hardware by enabling designers to use features present in higher level programming languages to build complex hardware blocks. In this paper, the most advertised features of Chisel are investigated and compared to their VHDL counterparts, if present. Afterwards, the authors’ opinion if a switch to Chisel is worth considering is presented. Additionally, results from a related case study on Chisel are briefly summarized. The author concludes that, while Chisel has promising features, it is not yet ready for use in the industry.
This work shall provide a foundation for the cross-design of wireless networked control systems with limited resources. A cross-design methodology is devised, which includes principles for the modeling, analysis, design, and realization of low cost but high performance and intelligent wireless networked control systems. To this end, a framework is developed in which control algorithms and communication protocols are jointly designed, implemented, and optimized taking into consideration the limited communication, computing, memory, and energy resources of the low performance, low power, and low cost wireless nodes used. A special focus of the proposed methodology is on the prediction and minimization of the total energy consumption of the wireless network (i.e. maximization of the lifetime of wireless nodes) under control performance constraints (e.g. stability and robustness) in dynamic environments with uncertainty in resource availability, through the joint (offline/online) adaptation of communication protocol parameters and control algorithm parameters according to the traffic and channel conditions. Appropriate optimization approaches that exploit the structure of the optimization problems to be solved (e.g. linearity, affinity, convexity) and which are based on Linear Matrix Inequalities (LMIs), Dynamic Programming (DP), and Genetic Algorithms (GAs) are investigated. The proposed cross-design approach is evaluated on a testbed consisting of a real lab plant equipped with wireless nodes. Obtained results show the advantages of the proposed cross-design approach compared to standard approaches which are less flexible.
Hardware prototyping is an essential part in the hardware design flow. Furthermore, hardware prototyping usually relies on system-level design and hardware-in-the-loop simulations in order to develop, test and evaluate intellectual property cores. One common task in this process consist on interfacing cores with different port specifications. Data width conversion is used to overcome this issue. This work presents two open source hardware cores compliant with AXI4-Stream bus protocol, where each core performs upsizing/downsizing data width conversion.