Modern society relies on convenience services and mobile communication. Cloud computing is the current trend for making data and applications available at any time on every device. Data centers concentrate computation and storage at central locations, and their operators claim to be green because of optimized maintenance and increased energy efficiency. The key enabler of this evolution is the microelectronics industry. The trend towards power-efficient mobile devices has forced this industry to change its design dogma to: “keep data locally and reduce data communication whenever possible”. Therefore we ask: is cloud computing repeating the aberrations of its enabling industry?
3D integration of solid-state memories and logic, as demonstrated by the Hybrid Memory Cube (HMC), offers major opportunities for revisiting near-memory computation and gives new hope to mitigate the power and performance losses caused by the “memory wall”. In this paper we present the first exploration steps towards the design of the Smart Memory Cube (SMC), a new Processor-in-Memory (PIM) architecture that enhances the capabilities of the logic-base (LoB) in HMC. An accurate simulation environment has been developed, along with a full-featured software stack. All offloading and dynamic overheads caused by the operating system, cache coherence, and memory management are considered as well. Benchmarking results demonstrate up to 2X performance improvement in comparison with the host SoC, and around 1.5X against a similar host-side accelerator. Moreover, by scaling down the voltage and frequency of the PIM’s processor, it is possible to reduce energy by around 70% and 55% in comparison with the host and the accelerator, respectively.
The capacity of embedded memory on LSIs has kept increasing. It is important to reduce the leakage power of embedded memory for low-power LSIs. In fact, the ITRS predicts that the leakage power in embedded memory will account for 40% of all power consumption by 2024. A spin transfer torque magneto-resistance random access memory (STT-MRAM) is promising for use as non-volatile memory to reduce the leakage power. It is useful because it can function at low voltages and has a lifetime of over 10^16 write cycles. In addition, the STT-MRAM technology has a smaller bit cell than an SRAM, making the STT-MRAM suitable for use in high-density products [3–7]. The STT-MRAM uses a magnetic tunnel junction (MTJ). The MTJ has two states: a parallel state and an anti-parallel state. These states indicate whether the magnetization directions of the MTJ’s layers are the same or different. This pair of directions determines the MTJ’s magneto-resistance value: the resistance is low in the parallel state and high in the anti-parallel state. The state of the MTJ can be changed by the current flowing through it. The MTJ can potentially operate at less than 0.4 V. On the other hand, it is difficult to design peripheral circuitry for an STT-MRAM array at such a low voltage. In this paper, we propose a counter-based read circuit that functions at 0.4 V and is tolerant of process variation and temperature fluctuation.
For many years real-time task models have focused the timing constraints on execution windows defined by earliest start times and deadlines for feasibility.
However, the utility of some applications may vary among scenarios that yield correct behavior, and maximizing this utility improves resource utilization.
For example, target sensitive applications have a target point where execution results in maximized utility, and an execution window for feasibility.
Execution around this point and within the execution window is allowed, albeit at lower utility.
The intensity of the utility decay accounts for the importance of the application.
Examples of such applications include multimedia and control; multimedia applications are very popular nowadays and control applications are present in every automated system.
In this thesis, we present a novel real-time task model which provides for easy abstractions to express the timing constraints of target sensitive RT applications: the gravitational task model.
This model uses a simple gravity pendulum (or bob pendulum) system as a visualization model for trade-offs among target sensitive RT applications.
We consider jobs as objects in a pendulum system, and the target points as the central point.
Then, the equilibrium state of the physical problem is equivalent to the best compromise among jobs with conflicting targets.
Analogies with well-known systems are helpful to fill in the gap between application requirements and theoretical abstractions used in task models.
For instance, the so-called nature algorithms use key elements of physical processes to form the basis of an optimization algorithm.
Examples include the knapsack problem, traveling salesman problem, ant colony optimization, and simulated annealing.
We also present a few scheduling algorithms designed for the gravitational task model which fulfill the requirements for on-line adaptivity.
The scheduling of target sensitive RT applications must account for timing constraints, and the trade-off among tasks with conflicting targets.
Our proposed scheduling algorithms use the equilibrium state concept to order the execution sequence of jobs, and compute the deviation of jobs from their target points for increased system utility.
The execution sequence of jobs in the schedule has a significant impact on the equilibrium of jobs, and dominates the complexity of the problem --- the optimum solution is NP-hard.
We show the efficacy of our approach through simulation results and three target sensitive RT applications enhanced with the gravitational task model.
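The trade-off between jobs with conflicting targets described above can be illustrated with a minimal sketch. It is not the thesis's scheduling algorithm: the linear utility decay, the interpretation of the target point as a start time, the `Job` fields, and the grid search in `best_start` are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Job:
    earliest_start: float  # start of the execution window
    deadline: float        # end of the execution window
    target: float          # target point (assumed here to be the ideal start time)
    wcet: float            # execution time
    importance: float      # assumed slope of the linear utility decay

def utility(job, start):
    """Utility of starting `job` at `start`; 0 if it leaves the execution window."""
    if start < job.earliest_start or start + job.wcet > job.deadline:
        return 0.0
    return max(0.0, 1.0 - job.importance * abs(start - job.target))

def best_start(jobs, step):
    """Scan start times for a back-to-back job sequence; return (start, total utility)."""
    lo = min(j.earliest_start for j in jobs)
    hi = max(j.deadline for j in jobs)
    best = (lo, -1.0)
    for k in range(int((hi - lo) / step) + 1):
        s = lo + k * step
        t, total = s, 0.0
        for j in jobs:          # jobs run in the given order without gaps
            total += utility(j, t)
            t += j.wcet
        if total > best[1]:
            best = (s, total)
    return best
```

With two jobs sharing the same target, the job with the steeper decay pulls the sequence towards its own target, which is the kind of compromise the equilibrium analogy expresses.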
Advanced sensing systems, sophisticated algorithms, and increasing computational resources continuously enhance advanced driver assistance systems (ADAS). To date, although some vehicle-based approaches to driver fatigue/drowsiness detection have been realized and deployed, objectively and reliably detecting the driver's fatigue/drowsiness state without compromising the driving experience remains challenging. In general, the choice of input sensory information is limited in state-of-the-art work. On the other hand, smart and safe driving, as representative future trends in the automotive industry worldwide, increasingly demands new dimensions of human-vehicle interaction, as well as the associated perception of the driver's behavioral and bioinformatical data. Thus, the goal of this research work is to investigate the use of general and custom 3D-CMOS sensing concepts for driver status monitoring, and to explore the improvement gained by merging/fusing this information with other salient customized information sources for greater robustness/reliability. This thesis presents an effective multi-sensor approach with novel features to driver status monitoring and intention prediction aimed at drowsiness detection, based on a multi-sensor intelligent assistance system, DeCaDrive, which is implemented on an integrated soft-computing system with multi-sensing interfaces in a simulated driving environment. Utilizing active illumination, the IR depth camera of the realized system can provide rich facial and body features in 3D in a non-intrusive manner. In addition, a steering angle sensor, a pulse rate sensor, and an embedded impedance spectroscopy sensor are incorporated to aid in the detection/prediction of the driver's state and intention. A holistic design methodology for ADAS encompassing both driver- and vehicle-based approaches to driver assistance is discussed in the thesis as well.
Multi-sensor data fusion and hierarchical SVM techniques are used in DeCaDrive to facilitate the classification of driver drowsiness levels, based on which a warning can be issued in order to prevent possible traffic accidents. The realized DeCaDrive system achieves up to 99.66% classification accuracy on the defined drowsiness levels, and exhibits promising features such as head/eye tracking, blink detection, and gaze estimation that can be utilized in human-vehicle interactions. However, the driver's state of "microsleep" can hardly be reflected in the sensor features of the implemented system. General improvements in the sensitivity of the sensory components and in the system's computation power are required to address this issue. Possible new features and development considerations for DeCaDrive are also discussed in the thesis, with the aim of gaining market acceptance in the future.
This study presents an energy-efficient ultra-low voltage standard-cell based memory in 28 nm FD-SOI. The storage element (a standard-cell latch) is replaced with a full-custom designed latch with 50% less area. Error-free operation is demonstrated down to 450 mV at 9 MHz. By utilizing body bias (BB) at VDD = 0.5 V, performance spans from 20 MHz at BB = 0 V to 110 MHz at BB = 1 V.
At present, the standardization of third generation (3G) mobile radio systems is the subject of worldwide research activities. These systems will cope with the market demand for high data rate services and the system requirement for flexibility concerning the offered services and the transmission qualities. However, there will be deficiencies with respect to high capacity if 3G mobile radio systems exclusively use single antennas. A very promising technique for increasing the capacity of 3G mobile radio systems is the application of adaptive antennas. In this thesis, the benefits of using adaptive antennas are investigated for 3G mobile radio systems based on Time Division CDMA (TD-CDMA), which forms part of the European 3G mobile radio air interface standard adopted by the ETSI, and is intensively studied within the standardization activities towards a worldwide 3G air interface standard directed by the 3GPP (3rd Generation Partnership Project). One of the most important issues related to adaptive antennas is the analysis of the benefits of using adaptive antennas compared to single antennas. In this thesis, these benefits are explained theoretically and illustrated by computer simulation results for both data detection, which is performed according to the joint detection principle, and channel estimation, which is applied according to the Steiner estimator, in the TD-CDMA uplink. The theoretical explanations are based on well-known solved mathematical problems. The simulation results illustrating the benefits of adaptive antennas are produced by employing a novel simulation concept, which offers a considerable reduction of the simulation time and complexity, as well as increased flexibility concerning the use of different system parameters, compared to the existing simulation concepts for TD-CDMA.
Furthermore, three novel techniques are presented which can be used in systems with adaptive antennas for additionally improving the system performance compared to single antennas. These techniques concern the problems of code-channel mismatch, of user separation in the spatial domain, and of intercell interference, which, as shown in the thesis, play a critical role in the performance of TD-CDMA with adaptive antennas. Finally, a novel approach for illustrating the performance differences between the uplink and downlink of TD-CDMA based mobile radio systems in a straightforward manner is presented. Since a cellular mobile radio system with adaptive antennas is considered, the ultimate goal is the investigation of the overall system efficiency rather than the efficiency of a single link. In this thesis, the efficiency of TD-CDMA is evaluated through its spectrum efficiency and capacity, which are two closely related performance measures for cellular mobile radio systems. Compared to the use of single antennas, the use of adaptive antennas allows impressive improvements of both spectrum efficiency and capacity. Depending on the mobile radio channel model and the user velocity, improvement factors range from 6 to 10.7 for the spectrum efficiency, and from 6.7 to 12.6 for the capacity of TD-CDMA. Thus, adaptive antennas constitute a promising technique for increasing the capacity of future mobile communications systems.
Real-time systems are systems that have to react correctly to stimuli from the environment within given timing constraints.
Today, real-time systems are employed everywhere in industry, not only in safety-critical systems but also in, e.g., communication, entertainment, and multimedia systems.
With the advent of multicore platforms, new challenges on the efficient exploitation of real-time systems have arisen:
First, there is the need for effective scheduling algorithms that feature low overheads to improve the use of the computational resources of real-time systems.
The goal of these algorithms is to ensure timely execution of tasks, i.e., to provide runtime guarantees.
Additionally, many systems require their scheduling algorithm to flexibly react to unforeseen events.
Second, the inherent parallelism of multicore systems leads to contention for shared hardware resources and complicates system analysis.
At any time, multiple applications run with varying resource requirements and compete for the scarce resources of the system.
As a result, there is a need for adaptive resource management.
Achieving and implementing effective and efficient resource management is a challenging task.
The main goal of resource management is to guarantee a minimum resource availability to real-time applications.
A further goal is to fulfill global optimization objectives, e.g., maximization of the global system performance, or the user perceived quality of service.
In this thesis, we derive methods based on the slot shifting algorithm.
Slot shifting provides flexible scheduling of time-constrained applications and can react to unforeseen events in time-triggered systems.
For this reason, we aim at designing slot shifting based algorithms targeted for multicore systems to tackle the aforementioned challenges.
The main contribution of this thesis is to present two global slot shifting algorithms targeted for multicore systems.
Additionally, we extend slot shifting algorithms to improve their runtime behavior, or to handle non-preemptive firm aperiodic tasks.
In a variety of experiments, the effectiveness and efficiency of the algorithms are evaluated and confirmed.
Finally, the thesis presents an implementation of a slot-shifting-based logic into a resource management framework for multicore systems.
Thus, the thesis closes the circle and successfully bridges the gap between real-time scheduling theory and real-world implementations.
We demonstrate the applicability of the slot shifting algorithm to effectively and efficiently perform adaptive resource management on multicore systems.
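The abstract does not spell out how slot shifting reacts to unforeseen events, so the following is only a minimal single-core sketch under assumptions. It follows the commonly cited formulation in which the offline schedule is divided into capacity intervals and each interval's spare capacity is its length minus the demand assigned to it, reduced by whatever a later interval has to borrow. The function names and the interval-aligned acceptance test are hypothetical simplifications, not the thesis's multicore algorithms.

```python
def spare_capacities(intervals):
    """Spare capacity per interval, computed from the last interval backwards.

    intervals: list of (length_in_slots, demand_in_slots), ordered by time.
    A negative value means the interval borrows slots from its predecessor.
    """
    sc = [0] * len(intervals)
    following = 0  # spare capacity of the following interval (0 after the last one)
    for i in range(len(intervals) - 1, -1, -1):
        length, demand = intervals[i]
        sc[i] = length - demand + min(0, following)
        following = sc[i]
    return sc

def can_accept(sc, first, last, wcet):
    """Simplified acceptance test for a firm aperiodic task.

    Assumes the task arrives at the start of interval `first` and its deadline
    is the end of interval `last`; only positive spare capacity is usable.
    """
    available = sum(max(0, c) for c in sc[first:last + 1])
    return available >= wcet
```

For example, `spare_capacities([(4, 1), (3, 5), (5, 2)])` gives `[1, -2, 3]`: the overloaded middle interval borrows two slots from the first one, which leaves four free slots in total for aperiodic work.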
The recently established technologies in the areas of distributed measurement and intelligent
information processing systems, e.g., Cyber Physical Systems (CPS), Ambient
Intelligence/Ambient Assisted Living systems (AmI/AAL), the Internet of Things
(IoT), and Industry 4.0 have increased the demand for the development of intelligent
integrated multi-sensory systems to serve rapidly growing markets [1, 2]. These trends increase
the significance of complex measurement systems that incorporate numerous advanced
methodological implementations, including electronic circuits, signal processing,
and multi-sensory information fusion. In multi-sensory cognition applications in particular,
designing such systems involves skill-intensive tasks, e.g., method selection, parameterization,
model analysis, and processing chain construction, which require immense
effort and are conventionally carried out manually by the expert designer. Moreover,
strong technological competition imposes even more complicated design problems with
multiple constraints, e.g., cost, speed, power consumption, flexibility, and reliability.
Thus, the conventional human expert based design approach may not be able to cope
with the increasing demand in numbers, complexity, and diversity. To alleviate the issue,
the design automation approach has been the topic for numerous research works [3-14]
and has been commercialized to several products [15-18]. Additionally, the dynamic
adaptation of intelligent multi-sensor systems is the potential solution for developing
dependable and robust systems. The intrinsic evolution approach and self-x properties,
which include self-monitoring, -calibrating/trimming, and -healing/repairing, are among
the best candidates for addressing this issue. Motivated by the ongoing research trends and based
on the background of our research work [12, 13], which is among the pioneers in this topic, the
research work of the thesis contributes to the design automation of intelligent integrated
multi-sensory systems.
In this research work, the Design Automation for Intelligent COgnitive system with self-
X properties (DAICOX) architecture is presented with the aim of reducing the design
effort and providing high-quality, robust solutions for multi-sensor intelligent
systems. To this end, the DAICOX architecture is conceived with the following goals:
- Perform front-to-back complete processing chain design with automated method
  selection and parameterization,
- Provide a rich choice of pattern recognition methods to the design method pool,
- Associate design information via interactive user interface and visualization along
  with intuitive visual programming,
- Deliver high quality solutions outperforming conventional approaches by using
- Gain the adaptability, reliability and robustness of designed solutions with self-x
  properties.
Derived from the goals, several scientific methodological developments and implementations,
particularly in the areas of pattern recognition and computational intelligence,
will be pursued as part of the DAICOX architecture in the research work of this thesis.
The method pool is intended to contain a rich choice of methods and algorithms covering
data acquisition and sensor configuration, signal processing and feature computation,
dimensionality reduction, and classification. These methods will be selected and parameterized
automatically by the DAICOX design optimization to construct a multi-sensory
cognition processing chain. A collection of non-parametric feature quality assessment
functions for the Dimensionality Reduction (DR) process will be presented.
In addition to standard DR methods, variations of feature selection methods, in
particular feature weighting, will be proposed. Three different classification categories
shall be incorporated in the method pool. A hierarchical classification approach will be
proposed and developed to serve as a multi-sensor fusion architecture at the decision
level. Besides multi-class classification, one-class classification methods, e.g., One-Class
SVM and NOVCLASS, will be presented to extend the functionality of the solutions, in particular
for anomaly and novelty detection. DAICOX is conceived to effectively handle the
problem of method selection and parameter setting for a particular application yielding
high performance solutions. The processing chain construction tasks will be carried
out by meta-heuristic optimization methods, e.g., Genetic Algorithms (GA) and Particle
Swarm Optimization (PSO), with multi-objective optimization approach and model
analysis for robust solutions. In addition to the automated system design mechanisms,
DAICOX will facilitate the design tasks with intuitive visual programming and various
options of visualization. The design database concept of DAICOX aims to enable the
reuse and extension of designed solutions gained from previous knowledge.
Thus, the cooperative design of machine and knowledge from the design expert can also
be utilized for obtaining fully enhanced solutions. In particular, the integration of self-x
properties as well as intrinsic optimization into the system is proposed to gain enduring
reliability and robustness. Hence, DAICOX will allow the inclusion of dynamically
reconfigurable hardware instances to the designed solutions in order to realize intrinsic
optimization and self-x properties.
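The idea of meta-heuristic feature selection driven by a non-parametric quality measure can be sketched as follows. This is an illustrative stand-in rather than the DAICOX implementation: the leave-one-out 1-nearest-neighbour accuracy used as the quality measure, the scalarised fitness with a per-feature penalty, and all function names and GA parameters are assumptions.

```python
import random

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out 1-NN accuracy using only the features where mask is 1."""
    idx = [j for j, m in enumerate(mask) if m]
    if not idx:
        return 0.0  # no feature selected: treat as worst quality
    hits = 0
    for i, xi in enumerate(X):
        best_d, best_label = None, None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in idx)
            if best_d is None or d < best_d:
                best_d, best_label = d, y[k]
        hits += best_label == y[i]
    return hits / len(X)

def fitness(X, y, mask, penalty=0.01):
    # Two objectives folded into one score: accuracy minus a cost per selected feature.
    return loo_1nn_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=12, generations=15, p_mut=0.1, seed=0):
    """Tiny elitist GA over binary feature masks (needs at least two features)."""
    rng = random.Random(seed)
    n = len(X[0])
    # Start from the all-features mask plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        elite = pop[: pop_size // 2]           # survivors are kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(X, y, m))
```

Because the elite is carried over unchanged and the all-features mask is in the initial population, the returned mask never scores worse than using every feature. A true multi-objective variant would keep a Pareto front instead of a single scalar score.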
As a result of the research work in this thesis, a comprehensive intelligent multi-sensor
system design architecture with automated method selection, parameterization,
and model analysis is developed in compliance with open-source multi-platform software.
It is integrated with an intuitive design environment, which includes a visual programming
concept and design information visualizations. Thus, the design effort is minimized, as
investigated in three case studies with different application backgrounds, namely food analysis
(LoX), driving assistance (DeCaDrive), and magnetic localization. Moreover, DAICOX
achieved better quality of the solutions compared to the manual approach in all cases,
where the classification rate was increased by 5.4%, 0.06%, and 11.4% in the LoX,
DeCaDrive, and magnetic localization case, respectively. The design time was reduced
by 81.87% compared to the conventional approach by using DAICOX in the LoX case
study. The novel contributions of the thesis at the current state of development
are outlined below.
- Automated processing chain construction and parameterization for the design of
  signal processing and feature computation.
- Novel dimensionality reduction methods, e.g., GA and PSO based feature selection
  and feature weighting with multi-objective feature quality assessment.
- A modification of the non-parametric compactness measure for feature space quality
  assessment.
- Decision level sensor fusion architecture based on the proposed hierarchical classification
  approach, i.e., H-SVM.
- A collection of one-class classification methods and a novel variation, i.e.,
- Automated design toolboxes supporting front-to-back design with automated
  model selection and information visualization.
In this research work, due to the complexity of the task, not all of the identified goals
have been comprehensively reached yet, nor has the complete architecture definition been
fully implemented. Based on the currently implemented tools and frameworks, ongoing
development of DAICOX is progressing towards the complete architecture. The potential
future improvements are the extension of method pool with a richer choice of methods
and algorithms, processing chain breeding via graph based evolution approach, incorporation
of intrinsic optimization, and the integration of self-x properties. With
these features, DAICOX will be better suited to designing advanced systems that serve
the rapidly growing technologies of distributed intelligent measurement systems, in
particular, CPS and Industrie 4.0.