Kaiserslautern - Fachbereich Elektrotechnik und Informationstechnik
Refine
Year of publication
Document Type
- Doctoral Thesis (64)
- Article (21)
- Conference Proceeding (17)
- Preprint (6)
- Report (3)
- Other (2)
- Course Material (1)
- Master's Thesis (1)
Language
- English (115) (remove)
Has Fulltext
- yes (115)
Keywords
- Mobilfunk (5)
- Cache (3)
- DRAM (3)
- Formal Verification (3)
- MIMO (3)
- Model checking (3)
- OFDM (3)
- SRAM (3)
- System-on-Chip (3)
- Chisel (2)
Faculty / Organisational entity
Modern society relies on convenience services and mobile communication. Cloud computing is the current trend to make data and applications available at any time on every device. Data centers concentrate computation and storage at central locations, while they claim themselves green due to their optimized maintenance and increased energy efficiency. The key enabler for this evolution is the microelectronics industry. The trend to power efficient mobile devices has forced this industry to change its design dogma to: ”keep data locally and reduce data communication whenever possible”. Therefore we ask: is cloud computing repeating the aberrations of its enabling industry?
The fifth-generation (5G) of wireless networks promises to bring new advances, such as a huge increase in mobile data rates, a plunge in communications latency, and an increase in the quality of experience perceived by users that can cope with the ever-increasing demand in Internet traffic. However, the high cost of capital and operational expenditure (CAPEX/OPEX) of the new 5G network and the lack of a killer application hinder its rapid adoption. In this context, Mobile Network Operators (MNOs) have turned their attention to the following idea: opening up their infrastructure so that vertical businesses can leverage the new 5G network to improve their primary businesses and develop new ones. However, deploying multiple isolated vertical applications on top of the same infrastructure poses unique challenges that must be addressed. In this thesis, we provide critical contributions to developing 5G networks to accommodate different vertical applications in an isolated, flexible, and automated manner. This thesis contributions spawn on three main areas: (i) the development of an integrated fronthaul and backhaul network, (ii) the development of a network slicing overbooking algorithm, and (iii) the development of a method to mitigate the noisy neighbors' problem in a vRAN deployment.
Sensing location information in indoor scenes requires a high accuracy and is a challenging task, mainly because of multipath and NLoS (non-line-of-sight) propagation. GNSS signals cannot penetrate well in indoor environment. Satellite-based navigation and positioning systems cannot therefore be used for indoor positioning.. Other technologies have been suggested for indoor usage, among them, Wi-Fi (802.11) and 5G NR (New Radio). The primary aim of this study is to discuss the advantages and drawbacks of 5G and Wi-Fi positioning techniques for indoor localization.
3D integration of solid-state memories and logic, as demonstrated by the Hybrid Memory Cube (HMC), offers major opportunities for revisiting near-memory computation and gives new hope to mitigate the power and performance losses caused by the “memory wall”. In this paper we present the first exploration steps towards design of the Smart Memory Cube (SMC), a new Processor-in-Memory (PIM) architecture that enhances the capabilities of the logic-base (LoB) in HMC. An accurate simulation environment has been developed, along with a full featured software stack. All offloading and dynamic overheads caused by the operating system, cache coherence, and memory management are considered, as well. Benchmarking results demonstrate up to 2X performance improvement in comparison with the host SoC, and around 1.5X against a similar host-side accelerator. Moreover, by scaling down the voltage and frequency of PIM’s processor it is possible to reduce energy by around 70% and 55% in comparison with the host and the accelerator, respectively.
Beamforming performs spatial filtering to preserve the signal from given directions of interest while suppressing interfering signals and noise arriving from other directions.
For example, a microphone array equipped with beamforming algorithm could preserve the sound coming from a target speaker and suppress sounds coming from other speakers.
Beamformer has been widely used in many applications such as radar, sonar, communication, and acoustic systems.
A data-independent beamformer is the beamformer whose coefficients are independent on sensor signals, it normally uses less computation since the coefficients are computed once. Moreover, its coefficients are derived from the well-defined statistical models, then it produces less artifacts. The major drawback of this beamforming class is its limitation to the interference suppression.
On the other hand, an adaptive beamformer is a beamformer whose coefficients depend on or adapt to sensor signals. It is capable of suppressing the interference better than a data-independent beamforming but it suffers from either too much distortion of the signal of interest or less noise reduction when the updating rate of coefficients does not synchronize with the changing rate of the noise model. Besides, it is computationally intensive since the coefficients need to be updated frequently.
In acoustic applications, the bandwidth of signals of interest extends over several octaves, but we always expect that the characteristic of the beamformer is invariant with regard to the bandwidth of interest. This can be achieved by the so-called broadband beamforming.
Since the beam pattern of conventional beamformers depends on the frequency of the signal, it is common to use a dense and uniform array for the broadband beamforming to guarantee some essential performances together, such as frequency-independence, less sensitive to white noise, high directivity factor or high front-to-back ratio. In this dissertation, we mainly focus on the sparse array of which the aim is to use fewer sensors in the array,
while simultaneously assuring several important performances of the beamformer.
In the past few decades, many design methodologies for sparse arrays have been proposed and were applied in a variety of practical applications.
Although good results were presented, there are still some restrictions, such as the number of sensors is large, the designed beam pattern must be fixed, the steering ability is limited and the computational complexity is high.
In this work, two novel approaches for the sparse array design taking a hypothesized uniform array as a basis are proposed, that is, one for data-independent beamformers and the another for adaptive beamformers.
As an underlying component of the proposed methods, the dissertation introduces some new insights into the uniform array with broadband beamforming. In this context, a function formulating the relations between the sensor coefficients and its beam pattern over frequency is proposed. The function mainly contains the coordinate transform and inverse Fourier transform.
Furthermore, from the bijection of the function and broadband beamforming perspective, we propose the lower and upper bounds for the inter-distance of sensors. Within these bounds, the function is a bijective function that can be utilized to design the uniform array with broadband beamforming.
For data-independent beamforming, many studies have focused on optimization procedures to seek the sparse array deployment. This dissertation presents an alternative approach to determine the location of sensors.
Starting with a weight spectrum of a virtual dense and uniform array, some techniques are used, such as analyzing a weight spectrum to determine the critical sensors, applying the clustering technique to group the sensors into different groups and selecting representative sensors for each group.
After the sparse array deployment is specified, the optimization technique is applied to find the beamformer coefficients. The proposed method helps to save the computation time in the design phase and its beamformer performance outperforms other state-of-the-art methods in several aspects such as the higher white noise gain, higher directivity factor or more frequency-independence.
For adaptive beamforming, the dissertation attempts to design a versatile sparse microphone array that can be used for different beam patterns.
Furthermore, we aim to reduce the number of microphones in the sparse array while ensuring that its performance can continue to compete with a highly dense and uniform array in terms of broadband beamforming.
An irregular microphone array in a planar surface with the maximum number of distinct distances between the microphones is proposed.
It is demonstrated that the irregular microphone array is well-suited to sparse recovery algorithms that are used to solve underdetermined systems with subject to sparse solutions. Here, a sparse solution is the sound source's spatial spectrum that need to be reconstructed from microphone signals.
From the reconstructed sound sources, a method for array interpolation is presented to obtain an interpolated dense and uniform microphone array that performs well with broadband beamforming.
In addition, two alternative approaches for generalized sidelobe canceler (GSC) beamformer are proposed. One is the data-independent beamforming variant, the other is the adaptive beamforming variant. The GSC decomposes beamforming into two paths: The upper path is to preserve the desired signal, the lower path is to suppress the desired signal. From a beam pattern viewpoint, we propose an improvement for GSC, that is, instead of using the blocking matrix in the lower path to suppress the desired signal, we design a beamformer that contains the nulls at the look direction and at some other directions. Both approaches are simple beamforming design methods and they can be applied to either sparse array or uniform array.
Lastly, a new technique for direction-of-arrival (DOA) estimation based on the annihilating filter is also presented in this dissertation.
It is based on the idea of finite rate of innovation to reconstruct the stream of Diracs, that is, identifying an annihilating filter/locator filter for a few uniform samples and the position of the Diracs are then related to the roots of the filter. Here, an annihilating filter is the filter that suppresses the signal, since its coefficient vector is always orthogonal to every frame of signal.
In the DOA context, we regard an active source as a Dirac associated with the arrival direction, then the directions of active sources can be derived from the roots of the annihilating filter. However,
the DOA obtained by this method is sensitive to noise and the number of DOAs is limited.
To address these issues, the dissertation proposes a robust method to design the annihilating filter and to increase the degree-of-freedom of the measurement system (more active sources can be detected) via observing multiple data frames.
Furthermore, we also analyze the performance of DOA with diffuse noise and propose an extended multiple signal classification algorithm that takes diffuse noise into account. In the simulation,
it shows, that in the case of diffuse noise, only the extended multiple signal classification algorithm can estimate the DOAs properly.
A counter-based read circuit tolerant to process variation for low-voltage operating STT-MRAM
(2016)
The capacity of embedded memory on LSIs has kept increasing. It is important to reduce the leakage power of embedded memory for low-power LSIs. In fact, the ITRS predicts that the leakage power in embedded memory will account for 40% of all power consumption by 2024 [1]. A spin transfer torque magneto-resistance random access memory (STT-MRAM) is promising for use as non-volatile memory to reduce the leakage power. It is useful because it can function at low voltages and has a lifetime of over 1016 write cycles [2]. In addition, the STT-MRAM technology has a smaller bit cell than an SRAM. Making the STT-MRAM is suitable for use in high-density products [3–7]. The STT-MRAM uses magnetic tunnel junction (MTJ). The MTJ has two states: a parallel state and an anti-parallel state. These states mean that the magnetization direction of the MTJ’s layers are the same or different. The directions pair determines the MTJ’s magneto- resistance value. The states of MTJ can be changed by the current flowing. The MTJ resistance becomes low in the parallel state and high in the anti-parallel state. The MTJ potentially operates at less than 0.4 V [8]. In other hands, it is difficult to design peripheral circuitry for an STT-MRAM array at such a low voltage. In this paper, we propose a counter-based read circuit that functions at 0.4 V, which is tolerant of process variation and temperature fluctuation.
We study the sensor fault estimation and accommodation problems in a data-driven \(\mathcal{H}_\infty\) setting, leading to a data-driven sensor fault-tolerant control scheme. First, we formulate the fault estimation problem as a finite-horizon minimax \(\mathcal{H}_\infty\)-optimization problem in a data-driven setup, whose solution yields the fault estimate. The estimated fault is then used for output compensation. This compensated output and the experimental input are used to achieve certain control objectives in a data-driven \(\mathcal{H}_\infty\) setting. Next, the data-driven \(\mathcal{H}_\infty\) fault estimation and control problems are solved using a subspace predictor-based approach. Finally, the proposed algorithm is applied to the steering subsystem of the remotely operated underwater vehicle.
For many years real-time task models have focused the timing constraints on execution windows defined by earliest start times and deadlines for feasibility.
However, the utility of some application may vary among scenarios which yield correct behavior, and maximizing this utility improves the resource utilization.
For example, target sensitive applications have a target point where execution results in maximized utility, and an execution window for feasibility.
Execution around this point and within the execution window is allowed, albeit at lower utility.
The intensity of the utility decay accounts for the importance of the application.
Examples of such applications include multimedia and control; multimedia application are very popular nowadays and control applications are present in every automated system.
In this thesis, we present a novel real-time task model which provides for easy abstractions to express the timing constraints of target sensitive RT applications: the gravitational task model.
This model uses a simple gravity pendulum (or bob pendulum) system as a visualization model for trade-offs among target sensitive RT applications.
We consider jobs as objects in a pendulum system, and the target points as the central point.
Then, the equilibrium state of the physical problem is equivalent to the best compromise among jobs with conflicting targets.
Analogies with well-known systems are helpful to fill in the gap between application requirements and theoretical abstractions used in task models.
For instance, the so-called nature algorithms use key elements of physical processes to form the basis of an optimization algorithm.
Examples include the knapsack problem, traveling salesman problem, ant colony optimization, and simulated annealing.
We also present a few scheduling algorithms designed for the gravitational task model which fulfill the requirements for on-line adaptivity.
The scheduling of target sensitive RT applications must account for timing constraints, and the trade-off among tasks with conflicting targets.
Our proposed scheduling algorithms use the equilibrium state concept to order the execution sequence of jobs, and compute the deviation of jobs from their target points for increased system utility.
The execution sequence of jobs in the schedule has a significant impact on the equilibrium of jobs, and dominates the complexity of the problem --- the optimum solution is NP-hard.
We show the efficacy of our approach through simulations results and 3 target sensitive RT applications enhanced with the gravitational task model.
A Multi-Sensor Intelligent Assistance System for Driver Status Monitoring and Intention Prediction
(2017)
Advanced sensing systems, sophisticated algorithms, and increasing computational resources continuously enhance the advanced driver assistance systems (ADAS). To date, despite that some vehicle based approaches to driver fatigue/drowsiness detection have been realized and deployed, objectively and reliably detecting the fatigue/drowsiness state of driver without compromising driving experience still remains challenging. In general, the choice of input sensorial information is limited in the state-of-the-art work. On the other hand, smart and safe driving, as representative future trends in the automotive industry worldwide, increasingly demands the new dimensional human-vehicle interactions, as well as the associated behavioral and bioinformatical data perception of driver. Thus, the goal of this research work is to investigate the employment of general and custom 3D-CMOS sensing concepts for the driver status monitoring, and to explore the improvement by merging/fusing this information with other salient customized information sources for gaining robustness/reliability. This thesis presents an effective multi-sensor approach with novel features to driver status monitoring and intention prediction aimed at drowsiness detection based on a multi-sensor intelligent assistance system -- DeCaDrive, which is implemented on an integrated soft-computing system with multi-sensing interfaces in a simulated driving environment. Utilizing active illumination, the IR depth camera of the realized system can provide rich facial and body features in 3D in a non-intrusive manner. In addition, steering angle sensor, pulse rate sensor, and embedded impedance spectroscopy sensor are incorporated to aid in the detection/prediction of driver's state and intention. A holistic design methodology for ADAS encompassing both driver- and vehicle-based approaches to driver assistance is discussed in the thesis as well. Multi-sensor data fusion and hierarchical SVM techniques are used in DeCaDrive to facilitate the classification of driver drowsiness levels based on which a warning can be issued in order to prevent possible traffic accidents. The realized DeCaDrive system achieves up to 99.66% classification accuracy on the defined drowsiness levels, and exhibits promising features such as head/eye tracking, blink detection, gaze estimation that can be utilized in human-vehicle interactions. However, the driver's state of "microsleep" can hardly be reflected in the sensor features of the implemented system. General improvements on the sensitivity of sensory components and on the system computation power are required to address this issue. Possible new features and development considerations for DeCaDrive are discussed as well in the thesis aiming to gain market acceptance in the future.
The authors explore the intrinsic trade-off in a DRAM between the power consumption (due to refresh) and the reliability. Their unique measurement platform allows tailoring to the design constraints depending on whether power consumption, performance or reliability has the highest design priority. Furthermore, the authors show how this measurement platform can be used for reverse engineering the internal structure of DRAMs and how this knowledge can be used to improve DRAM’s reliability.
This study presents an energy-efficient ultra-low voltage standard-cell based memory in 28nm FD-SOI. The storage element (standard-cell latch) is replaced with a full- custom designed latch with 50 % less area. Error-free operation is demonstrated down to 450mV @ 9MHz. By utilizing body bias (BB) @ VDD = 0.5 V performance spans from 20 MHz @ BB=0V to 110MHz @ BB=1V.
At present the standardization of third generation (3G) mobile radio systems is the subject of worldwide research activities. These systems will cope with the market demand for high data rate services and the system requirement for exibility concerning the offered services and the transmission qualities. However, there will be de ciencies with respect to high capacity, if 3G mobile radio systems exclusively use single antennas. Very promising technique developed for increasing the capacity of 3G mobile radio systems the application is adaptive antennas. In this thesis, the benefits of using adaptive antennas are investigated for 3G mobile radio systems based on Time Division CDMA (TD-CDMA), which forms part of the European 3G mobile radio air interface standard adopted by the ETSI, and is intensively studied within the standardization activities towards a worldwide 3G air interface standard directed by the 3GPP (3rd Generation Partnership Project). One of the most important issues related to adaptive antennas is the analysis of the benefits of using adaptive antennas compared to single antennas. In this thesis, these bene ts are explained theoretically and illustrated by computer simulation results for both data detection, which is performed according to the joint detection principle, and channel estimation, which is applied according to the Steiner estimator, in the TD-CDMA uplink. The theoretical explanations are based on well-known solved mathematical problems. The simulation results illustrating the benefits of adaptive antennas are produced by employing a novel simulation concept, which offers a considerable reduction of the simulation time and complexity, as well as increased exibility concerning the use of different system parameters, compared to the existing simulation concepts for TD-CDMA. Furthermore, three novel techniques are presented which can be used in systems with adaptive antennas for additionally improving the system performance compared to single antennas. These techniques concern the problems of code-channel mismatch, of user separation in the spatial domain, and of intercell interference, which, as it is shown in the thesis, play a critical role on the performance of TD-CDMA with adaptive antennas. Finally, a novel approach for illustrating the performance differences between the uplink and downlink of TD-CDMA based mobile radio systems in a straightforward manner is presented. Since a cellular mobile radio system with adaptive antennas is considered, the ultimate goal is the investigation of the overall system efficiency rather than the efficiency of a single link. In this thesis, the efficiency of TD-CDMA is evaluated through its spectrum efficiency and capacity, which are two closely related performance measures for cellular mobile radio systems. Compared to the use of single antennas, the use of adaptive antennas allows impressive improvements of both spectrum efficiency and capacity. Depending on the mobile radio channel model and the user velocity, improvement factors range from six to 10.7 for the spectrum efficiency, and from 6.7 to 12.6 for the spectrum capacity of TD-CDMA. Thus, adaptive antennas constitute a promising technique for capacity increase of future mobile communications systems.
Real-time systems are systems that have to react correctly to stimuli from the environment within given timing constraints.
Today, real-time systems are employed everywhere in industry, not only in safety-critical systems but also in, e.g., communication, entertainment, and multimedia systems.
With the advent of multicore platforms, new challenges on the efficient exploitation of real-time systems have arisen:
First, there is the need for effective scheduling algorithms that feature low overheads to improve the use of the computational resources of real-time systems.
The goal of these algorithms is to ensure timely execution of tasks, i.e., to provide runtime guarantees.
Additionally, many systems require their scheduling algorithm to flexibly react to unforeseen events.
Second, the inherent parallelism of multicore systems leads to contention for shared hardware resources and complicates system analysis.
At any time, multiple applications run with varying resource requirements and compete for the scarce resources of the system.
As a result, there is a need for an adaptive resource management.
Achieving and implementing an effective and efficient resource management is a challenging task.
The main goal of resource management is to guarantee a minimum resource availability to real-time applications.
A further goal is to fulfill global optimization objectives, e.g., maximization of the global system performance, or the user perceived quality of service.
In this thesis, we derive methods based on the slot shifting algorithm.
Slot shifting provides flexible scheduling of time-constrained applications and can react to unforeseen events in time-triggered systems.
For this reason, we aim at designing slot shifting based algorithms targeted for multicore systems to tackle the aforementioned challenges.
The main contribution of this thesis is to present two global slot shifting algorithms targeted for multicore systems.
Additionally, we extend slot shifting algorithms to improve their runtime behavior, or to handle non-preemptive firm aperiodic tasks.
In a variety of experiments, the effectiveness and efficiency of the algorithms are evaluated and confirmed.
Finally, the thesis presents an implementation of a slot-shifting-based logic into a resource management framework for multicore systems.
Thus, the thesis closes the circle and successfully bridges the gap between real-time scheduling theory and real-world implementations.
We prove applicability of the slot shifting algorithm to effectively and efficiently perform adaptive resource management on multicore systems.
The recently established technologies in the areas of distributed measurement and intelligent
information processing systems, e.g., Cyber Physical Systems (CPS), Ambient
Intelligence/Ambient Assisted Living systems (AmI/AAL), the Internet of Things
(IoT), and Industry 4.0 have increased the demand for the development of intelligent
integrated multi-sensory systems as to serve rapid growing markets [1, 2]. These increase
the significance of complex measurement systems, that incorporate numerous advanced
methodological implementations including electronics circuit, signal processing,
and multi-sensory information fusion. In particular, in multi-sensory cognition applications,
to design such systems, the skill-required tasks, e.g., method selection, parameterization,
model analysis, and processing chain construction are elaborated with immense
effort, which conventionally are done manually by the expert designer. Moreover, the
strong technological competition imposes even more complicated design problems with
multiple constraints, e.g., cost, speed, power consumption,
exibility, and reliability.
Thus, the conventional human expert based design approach may not be able to cope
with the increasing demand in numbers, complexity, and diversity. To alleviate the issue,
the design automation approach has been the topic for numerous research works [3-14]
and has been commercialized to several products [15-18]. Additionally, the dynamic
adaptation of intelligent multi-sensor systems is the potential solution for developing
dependable and robust systems. Intrinsic evolution approach and self-x properties [19],
which include self-monitoring, -calibrating/trimming, and -healing/repairing, are among
the best candidates for the issue. Motivated from the ongoing research trends and based
on the background of our research work [12, 13] among the pioneers in this topic, the
research work of the thesis contributes to the design automation of intelligent integrated
multi-sensor systems.
In this research work, the Design Automation for Intelligent COgnitive system with self-
X properties, the DAICOX, architecture is presented with the aim of tackling the design
effort and to providing high quality and robust solutions for multi-sensor intelligent
systems. Therefore, the DAICOX architecture is conceived with the defined goals as
listed below.
Perform front to back complete processing chain design with automated method
selection and parameterization,
Provide a rich choice of pattern recognition methods to the design method pool,
Associate design information via interactive user interface and visualization along
with intuitive visual programming,
Deliver high quality solutions outperforming conventional approaches by using
multi-objective optimization,
Gain the adaptability, reliability and robustness of designed solutions with self-x
properties,
Derived from the goals, several scientific methodological developments and implementations,
particularly in the areas of pattern recognition and computational intelligence,
will be pursued as part of the DAICOX architecture in the research work of this thesis.
The method pool is aimed to contain a rich choice of methods and algorithms covering
data acquisition and sensor configuration, signal processing and feature computation,
dimensionality reduction, and classification. These methods will be selected and parameterized
automatically by the DAICOX design optimization to construct a multi-sensory
cognition processing chain. A collection of non-parametric feature quality assessment
functions for the purpose of Dimensionality Reduction (DR) process will be presented.
In addition, to standard DR methods, the variations of feature selection method, in
particular, feature weighting will be proposed. Three different classification categories
shall be incorporated in the method pool. Hierarchical classification approach will be
proposed and developed to serve as a multi-sensor fusion architecture at the decision
level. Beside multi-class classification, one-class classification methods, e.g., One-Class
SVM and NOVCLASS will be presented to extend functionality of the solutions, in particular,
anomaly and novelty detection. DAICOX is conceived to effectively handle the
problem of method selection and parameter setting for a particular application yielding
high performance solutions. The processing chain construction tasks will be carried
out by meta-heuristic optimization methods, e.g., Genetic Algorithms (GA) and Particle
Swarm Optimization (PSO), with multi-objective optimization approach and model
analysis for robust solutions. In addition, to the automated system design mechanisms,
DAICOX will facilitate the design tasks with intuitive visual programming and various
options of visualization. Design database concept of DAICOX is aimed to allow the
reusability and extensibility of the designed solutions gained from previous knowledge.
Thus, the cooperative design of machine and knowledge from the design expert can also
be utilized for obtaining fully enhanced solutions. In particular, the integration of self-x
properties as well as intrinsic optimization into the system is proposed to gain enduring
reliability and robustness. Hence, DAICOX will allow the inclusion of dynamically
reconfigurable hardware instances to the designed solutions in order to realize intrinsic
optimization and self-x properties.
As a result from the research work in this thesis, a comprehensive intelligent multisensor
system design architecture with automated method selection, parameterization,
and model analysis is developed with compliance to open-source multi-platform software.It is integrated with an intuitive design environment, which includes visual programming
concept and design information visualizations. Thus, the design effort is minimized as
investigated in three case studies of different application background, e.g., food analysis
(LoX), driving assistance (DeCaDrive), and magnetic localization. Moreover, DAICOX
achieved better quality of the solutions compared to the manual approach in all cases,
where the classification rate was increased by 5.4%, 0.06%, and 11.4% in the LoX,
DeCaDrive, and magnetic localization case, respectively. The design time was reduced
by 81.87% compared to the conventional approach by using DAICOX in the LoX case
study. At the current state of development, a number of novel contributions of the thesis
are outlined below.
Automated processing chain construction and parameterization for the design of
signal processing and feature computation.
Novel dimensionality reduction methods, e.g., GA and PSO based feature selection
and feature weighting with multi-objective feature quality assessment.
A modification of non-parametric compactness measure for feature space quality
assessment.
Decision level sensor fusion architecture based on proposed hierarchical classification
approach using, i.e., H-SVM.
A collection of one-class classification methods and a novel variation, i.e.,
NOVCLASS-R.
Automated design toolboxes supporting front to back design with automated
model selection and information visualization.
In this research work, due to the complexity of the task, neither all of the identified goals
have been comprehensively reached yet nor has the complete architecture definition been
fully implemented. Based on the currently implemented tools and frameworks, ongoing
development of DAICOX is pursuing towards the complete architecture. The potential
future improvements are the extension of method pool with a richer choice of methods
and algorithms, processing chain breeding via graph based evolution approach, incorporation
of intrinsic optimization, and the integration of self-x properties. According to
these features, DAICOX will improve its aptness in designing advanced systems to serve
the increasingly growing technologies of distributed intelligent measurement systems, in
particular, CPS and Industrie 4.0.
Wireless sensor networks are the driving force behind many popular and interdisciplinary research areas, such as environmental monitoring, building automation, healthcare and assisted living applications. Requirements like compactness, high integration of sensors, flexibility, and power efficiency are often very different and cannot be fulfilled by state-of-the-art node platforms at once. In this paper, we present and analyze AmICA: a flexible, compact, easy-to-program, and low-power node platform. Developed from scratch and including a node, a basic communication protocol, and a debugging toolkit, it assists in an user-friendly rapid application development. The general purpose nature of AmICA was evaluated in two practical applications with diametric requirements. Our analysis shows that AmICA nodes are 67% smaller than BTnodes, have five times more sensors than Mica2Dot and consume 72% less energy than the state-of-the-art TelosB mote in sleep mode.
Wireless Sensor Networks (WSN) are dynamically-arranged networks typically composed of a large number of arbitrarily-distributed sensor nodes with computing capabilities contributing to –at least– one common application. The main characteristic of these networks is that of being functionally constrained due to a scarce availability of resources and strong dependence on uncontrollable environmental factors. These conditions introduce severe restrictions on the applicability of classic real-time methods aiming at guaranteeing time-bounded communications. Existing real-time solutions tend to apply concepts that were originally not conceived for sensor networks, idealizing realistic application scenarios and overlooking at important design limitations. This results in a number of misleading practices contributing to approaches of restricted validity in real-world scenarios. Amending the confrontation between WSNs and real-time objectives starts with a review of the basic fundamentals of existing approaches. In doing so, this thesis presents an alternative approach based on a generalized timeliness notion suitable to the particularities of WSNs. The new conceptual notion allows the definition of feasible real-time objectives opening a new scope of possibilities not constrained to idealized systems. The core of this thesis is based on the definition and application of Quality of Service (QoS) trade-offs between timeliness and other significant QoS metrics. The analysis of local and global trade-offs provides a step-by-step methodology identifying the correlations between these quality metrics. This association enables the definition of alternative trade-off configurations (set points) influencing the quality performance of the network at selected instants of time. With the basic grounds established, the above concepts are embedded in a simple routing protocol constituting a proof of concept for the validity of the presented analysis. Extensive evaluations under realistic scenarios are driven on simulation environments as well as real testbeds, validating the consistency of this approach.
An interrupter for use in a daisy-chained VME bus interrupt system has beendesigned and implemented as an asynchronous sequential circuit. The concur-rency of the processes posed a design problem that was solved by means of asystematic design procedure that uses Petri nets for specifying system and in-terrupter behaviour, and for deriving a primitive flow table. Classical designand additional measures to cope with non-fundamental mode operation yieldeda coded state-machine representation. This was implemented on a GAL 22V10,chosen for its hazard-preventing structure and for rapid prototyping in studentlaboratories.
Phase-gradient metasurfaces can be designed to manipulate electromagnetic waves according to the generalized Snell’s law. Here, we show that a phased parallel-plate waveguide array (PPWA) can be devised to act in the same manner as a phase-gradient metasurface. We derive an analytic model that describes the wave propagation in the PPWA and calculate both the angle and amplitude distribution of the diffracted waves. The analytic model provides an intuitive understanding of the diffraction from the PPWA. We verify the (semi-)analytically calculated angle and amplitude distribution of the diffracted waves by numerical 3-D simulations and experimental measurements in a microwave goniometer.
Radar cross section reducing (RCSR) metasurfaces or coding metasurfaces were primarily designed for normally incident radiation in the past. It is evident that the performance of coding metasurfaces for RCSR can be significantly improved by additional backscattering reduction of obliquely incident radiation, which requires a valid analytic conception tool. Here, we derive an analytic current density distribution model for the calculation of the backscatter far-field of obliquely incident radiation on a coding metasurface for RCSR. For demonstration, we devise and fabricate a metasurface for a working frequency of 10.66GHz and obtain good agreement between the measured, simulated, and analytically calculated backscatter far-fields. The metasurface significantly reduces backscattering for incidence angles between −40∘ and 40∘ in a spectral working range of approximately 1GHz.
The energy efficiency of today’s microcontrollers is supported by the extensive usage of low-power mechanisms. A full power-down requires in many cases a complex, and maybe error prone, administration scheme, because data from the volatile memory have to be stored in a flash based back- up memory. New types of non-volatile memory, e.g. in RRAM technology, are faster and consumes a fraction of the energy compared to flash technology. This paper evaluates power gating for WSN with RRAM as back-up memory.
Hardware devices fabricated with recent process technology are intrinsically
more susceptible to faults than before. Resilience against hardware faults is,
therefore, a major concern for safety-critical embedded systems and has been
addressed in several standards. These standards demand a systematic and
thorough safety evaluation, especially for the highest safety levels. However,
any attempt to cover all faults for all theoretically possible scenarios that a sys-
tem might be used in can easily lead to excessive costs. Instead, an application-
dependent approach should be taken: strategies for test and fault resilience
must target only those faults that can actually have an effect in the situations
in which the hardware is being used.
In order to provide the data for such safety evaluations, we propose scalable
and formal methods to analyse the effects of hardware faults on hardware/soft-
ware systems across three abstraction levels where we:
(1) perform a fault effect analysis at instruction set architecture level by em-
ploying fault injection into a hardware-dependent software model called
program netlist,
(2) use the results from the program netlist analysis to perform a deductive
analysis to determine “application-redundant” faults at the gate level by
exploiting standard combinational test pattern generation,
(3) use the results from the program netlist analysis to perform an inductive
analysis to identify all faults of a given fault list that can have an effect
on selected objects of the high-level software, such as specified safety
functions, by employing Abstract Interpretation.
These methods aid in the certification process for the higher safety levels
by (a) providing formal guarantees that certain faults can be ignored and (b)
pointing to those faults which need to be detected in order to ensure product
safety.
We consider transient and permanent faults corrupting data in program-
visible hardware registers and model them using the single-event upset and
stuck-at fault models, respectively.
Scalability of our approaches results from combining an analysis at the ma-
chine and hardware level with separate analyses on gate level and C level
source code, as well as, exploiting certain properties that are characteristic for
embedded systems software. We demonstrate the effectiveness and scalability
of each method on industry-oriented software, including a software system
with about 138 k lines of C code.
Photonic crystals are inhomogeneous dielectric media with periodic variation of the refractive index. A photonic crystal gives us new tools for the manipulation of photons and thus has received great interests in a variety of fields. Photonic crystals are expected to be used in novel optical devices such as thresholdless laser diodes, single-mode light emitting diodes, small waveguides with low-loss sharp bends, small prisms, and small integrated optical circuits. They can be operated in some aspects as "left handed materials" which are capable of focusing transmitted waves into a sub-wavelength spot due to negative refraction. The thesis is focused on the applications of photonic crystals in communications and optical imaging: • Photonic crystal structures for potential dispersion management in optical telecommunication systems • 2D non-uniform photonic crystal waveguides with a square lattice for wide-angle beam refocusing using negative refraction • 2D non-uniform photonic crystal slabs with triangular lattice for all-angle beam refocusing • Compact phase-shifted band-pass transmission filter based on photonic crystals
The design of the fifth generation (5G) cellular network should take account of the emerging services with divergent quality of service requirements. For instance, a vehicle-to-everything (V2X) communication is required to facilitate the local data exchange and therefore improve the automation level in automated driving applications. In this work, we inspect the performance of two different air interfaces (i.e., LTE-Uu and PC5) which are proposed by the third generation partnership project (3GPP) to enable the V2X communication. With these two air interfaces, the V2X communication can be realized by transmitting data packets either over the network infrastructure or directly among traffic participants. In addition, the ultra-high reliability requirement in some V2X communication scenarios can not be fulfilled with any single transmission technology (i.e., either LTE-Uu or PC5). Therefore, we discuss how to efficiently apply multi-radio access technologies (multi-RAT) to improve the communication reliability. In order to exploit the multi-RAT in an efficient manner, both the independent and the coordinated transmission schemes are designed and inspected. Subsequently, the conventional uplink is also extended to the case where a base station can receive data packets through both the LTE-Uu and PC5 interfaces. Moreover, different multicast-broadcast single-frequency network (MBSFN) area mapping approaches are also proposed to improve the communication reliability in the LTE downlink. Last but not least, a system level simulator is implemented in this work. The simulation results do not only provide us insights on the performances of different technologies but also validate the effectiveness of the proposed multi-RAT scheme.
Three-dimensional (3D) integration using through- silicon via (TSV) has been used for memory designs. Content addressable memory (CAM) is an important component in digital systems. In this paper, we propose an evaluation tool for 3D CAMs, which can aid the designer to explore the delay and power of various partitioning strategies. Delay, power, and energy models of 3D CAM with respect to different architectures are built as well.
Hardware prototyping is an essential part in the hardware design flow. Furthermore, hardware prototyping usually relies on system-level design and hardware-in-the-loop simulations in order to develop, test and evaluate intellectual property cores. One common task in this process consist on interfacing cores with different port specifications. Data width conversion is used to overcome this issue. This work presents two open source hardware cores compliant with AXI4-Stream bus protocol, where each core performs upsizing/downsizing data width conversion.
Analog sensor electronics requires special care during design in order to increase the quality and precision of the signal, and the life time of the product. Nevertheless, it can experience static deviations due to the manufacturing tolerances, and dynamic deviations due to operating in non-ideal environment. Therefore, the advanced applications such as MEMS technology employs calibration loop to deal with the deviations, but unfortunately, it is considered only in the digital domain, which cannot cope with all the analog deviations such as saturation of the analog signal, etc. On the other hand, rapid-prototyping is essential to decrease the development time, and the cost of the products for small quantities. Recently, evolvable hardware has been developed with the motivation to cope with the mentioned sensor electronic problems. However the industrial specifications and requirements are not considered in the hardware learning loop. Indeed, it minimizes the error between the required output and the real output generated due to given test signal. The aim of this thesis is to synthesize the generic organic-computing sensor electronics and return hardware with predictable behavior for embedded system applications that gains the industrial acceptance; therefore, the hardware topology is constrained to the standard hardware topologies, the hardware standard specifications are included in the optimization, and hierarchical optimization are abstracted from the synthesis tools to evolve first the building blocks, then evolve the abstract level that employs these optimized blocks. On the other hand, measuring some of the industrial specifications needs expensive equipments and some others are time consuming which is not fortunate for embedded system applications. Therefore, the novel approach "mixtrinsic multi-objective optimization" is proposed that simulates/estimates the set of the specifications that is hard to be measured due to the cost or time requirements, while it measures intrinsically the set of the specifications that has high sensitivity to deviations. These approaches succeed to optimize the hardware to meet the industrial specifications with low cost measurement setup which is essential for embedded system applications.
For many years, most distributed real-time systems employed data communication systems specially tailored to address the specific requirements of individual domains: for instance, Controlled Area Network (CAN) and Flexray in the automotive domain, ARINC 429 [FW10] and TTP [Kop95] in the aerospace domain. Some of these solutions were expensive, and eventually not well understood.
Mostly driven by the ever decreasing costs, the application of such distributed real-time system have drastically increased in the last years in different domains. Consequently, cross-domain communication systems are advantageous. Not only the number of distributed real-time systems have been increasing but also the number of nodes per system, have drastically increased, which in turn increases their network bandwidth requirements. Further, the system architectures have been changing, allowing for applications to spread computations among different computer nodes. For example, modern avionics systems moved from federated to integrated modular architecture, also increasing the network bandwidth requirements.
Ethernet (IEEE 802.3) [iee12] is a well established network standard. Further, it is fast, easy to install, and the interface ICs are cheap [Dec05]. However, Ethernet does not offer any temporal guarantee. Research groups from academia and industry have presented a number of protocols merging the benefits of Ethernet and the temporal guarantees required by distributed real-time systems. Two of these protocols are: Avionics Full-Duplex Switched Ethernet (AFDX) [AFD09] and Time-Triggered Ethernet (TTEthernet) [tim16]. In this dissertation, we propose solutions for two problems faced during the design of AFDX and TTEthernet networks: avoiding data loss due to buffer overflow in AFDX networks with multiple priority traffic, and scheduling of TTEthernet networks.
AFDX guarantees bandwidth separation and bounded transmission latency for each communication channel. Communication channels in AFDX networks are not synchronized, and therefore frames might compete for the same output port, requiring buffering to avoid data loss. To avoid buffer overflow and the resulting data loss, the network designer must reserve a safe, but not too pessimistic amount of memory of each buffer. The current AFDX standard allows for the classification of the network traffic with two priorities. Nevertheless, some commercial solutions provide multiple priorities, increasing the complexity of the buffer backlog analysis. The state-of-the-art AFDX buffer backlog analysis does not provide a method to compute deterministic upper bounds
iiifor buffer backlog of AFDX networks with multiple priority traffic. Therefore, in this dissertation we propose a method to address this open problem. Our method is based on the analysis of the largest busy period encountered by frames stored in a buffer. We identify the ingress (and respective egress) order of frames in the largest busy period that leads to the largest buffer backlog, and then compute the respective buffer backlog upper bound. We present experiments to measure the computational costs of our method.
In TTEthernet, nodes are synchronized, allowing for message transmission at well defined points in time, computed off-line and stored in a conflict-free scheduling table. The computation of such scheduling tables is a NP-complete problem [Kor92], which should be solved in reasonable time for industrial size networks. We propose an approach to efficiently compute a schedule for the TT communication channels in TTEthernet networks, in which we model the scheduling problem as a search tree. As the scheduler traverses the search tree, it schedules the communication channels on a physical link. We presented two approaches to traverse the search tree while progressively creating the vertices of the search tree. A valid schedule is found once the scheduler reaches a valid leaf. If on the contrary, it reaches an invalid leaf, the scheduler backtracks searching for a path to a valid leaf. We present a set of experiments to demonstrate the impact of the input parameters on the time taken to compute a feasible schedule or to deem the set of virtual links infeasible.
In embedded systems, there is a trend of integrating several different functionalities on a common platform. This has been enabled by increasing processing power and the arise of integrated system-on-chips.
The composition of safety-critical and non-safety-critical applications results in mixed-criticality systems. Certification Authorities (CAs) demand the certification of safety-critical applications with strong confidence in the execution time bounds. As a consequence, CAs use conservative assumptions in the worst-case execution time (WCET) analysis which result in more pessimistic WCETs than the ones used by designers. The existence of certified safety-critical and non-safety-critical applications can be represented by dual-criticality systems, i.e., systems with two criticality levels.
In this thesis, we focus on the scheduling of mixed-criticality systems which are subject to certification. Scheduling policies cognizant of the mixed-criticality nature of the systems and the certification requirements are needed for efficient and effective scheduling. Furthermore, we aim at reducing the certification costs to allow faster modification and upgrading, and less error-prone certification. Besides certification aspects, requirements of different operational modes result in challenging problems for the scheduling process. Despite the mentioned problems, schedulers require a low runtime overhead for an efficient execution at runtime.
The presented solutions are centered around time-triggered systems which feature a low runtime overhead. We present a transformation to include event-triggered activities, represented by sporadic tasks, already into the offline scheduling process. Further, this transformation can also be applied on periodic tasks to shorten the length of schedule tables which reduces certification costs. These results can be used in our method to construct schedule tables which creates two schedule tables to fulfill the requirements of dual-criticality systems using mode changes at runtime. Finally, we present a scheduler based on the slot-shifting algorithm for mixed-criticality systems. In a first version, the method schedules dual-criticality jobs without the need for mode changes. An already certified schedule table can be used and at runtime, the scheduler reacts to the actual behavior of the jobs and thus, makes effective use of the available resources. Next, we extend this method to schedule mixed-criticality job sets with different operational modes. As a result, we can schedule jobs with varying parameters in different modes.
To continue reducing voltage in scaled technologies, both circuit and architecture-level resiliency techniques are needed to tolerate process-induced defects, variation, and aging in SRAM cells. Many different resiliency schemes have been proposed and evaluated, but most prior results focus on voltage reduction instead of energy reduction. At the circuit level, device cell architectures and assist techniques have been shown to lower Vmin for SRAM, while at the architecture level, redundancy and cache disable techniques have been used to improve resiliency at low voltages. This paper presents a unified study of error tolerance for both circuit and architecture techniques and estimates their area and energy overheads. Optimal techniques are selected by evaluating both the error-correcting abilities at low supplies and the overheads of each technique in a 28nm. The results can be applied to many of the emerging memory technologies.
The dissertation describes a practically proven, particularly efficient approach for the verification of digital circuit designs. The approach outperforms simulation based verification wrt. final circuit quality as well as wrt. required verification effort. In the dissertation, the paradigm of transaction based verification is ported from simulation to formal verification. One consequence is a particular format of formal properties, called operation properties. Circuit descriptions are verified by proof of operation properties with Interval Property Checking (IPC), a particularly strong SAT based formal verification algorithm. Furtheron, a completeness checker is presented that identifies all verification gaps in sets of operation properties. This completeness checker can handle the large operation properties that arise, if this approach is applied to realistic circuits. The methodology of operation properties, Interval Property Checking, and the completeness checker form a symbiosis that is of particular benefit to the verification of digital circuit designs. On top of this symbiosis an approach to completely verify the interaction of completely verified modules has been developed by adaptation of the modelling theories of digital systems. The approach presented in the dissertation has proven in multiple commercial application projects that it indeed completely verifies modules. After reaching a termination criterion that is well defined by completeness checking, no further bugs were found in the verified modules. The approach is marketed by OneSpin Solutions GmbH, Munich, under the names "Operation Based Verification" and "Gap Free Verification".
”In contemporary electronics 80% of a chip may perform digital functions but the 20%
of analog functions may take 80% of the development time.” [1]. Aggravating this, the
demands on analog design is increasing with rapid technology scaling. Most designs
have moved away from analog to digital domains, where possible, however, interacting
with the environment will always require analog to digital data conversion. Adding to
this problem, the number of sensors used in consumer and industry related products are
rapidly increasing. Designers of ADCs are dealing with this problem in several ways, the
most important is the migration towards digital designs and time domain techniques.
Time to Digital Converters (TDC) are becoming increasingly popular for robust signal
processing. Biological neurons make use of spikes, which carry spike timing information
and will not be affected by the problems related to technology scaling. Neuromorphic
ADCs still remain exotic with few implementations in sub-micron technologies Table 2.7.
Even among these few designs, the strengths of biological neurons are rarely exploited.
From a previous work [2], LUCOS, a high dynamic range image sensor, the efficiency
of spike processing has been validated. The ideas from this work can be generalized to
make a highly effective sensor signal conditioning system, which carries the promise to
be robust to technology scaling.
The goal of this work is to create a novel spiking neural ADC as a novel form of a
Multi-Sensor Signal Conditioning and Conversion system, which
• Will be able to interface with or be a part of a System on Chip with traditional
analog or advanced digital components.
• Will have a graceful degradation.
• Will be robust to noise and jitter related problems.
• Will be able to learn and adapt to static errors and dynamic errors.
• Will be capable of self-repair, self-monitoring and self-calibration
Sensory systems in humans and other animals analyze the environment using several
techniques. These techniques have been evolved and perfected to help the animal sur-
vive. Different animals specialize in different sense organs, however, the peripheral
neural network architectures remain similar among various animal species with few ex-
ceptions. While there are many biological sensing techniques present, most popularly
used engineering techniques are based on intensity detection, frequency detection, and
edge detection. These techniques are used with traditional analog processing (e.g., colorvi
sensors using filters), and with biological techniques (e.g. LUCOS chip [2]). The local-
ization capability of animals has never been fully utilized.
One of the most important capabilities for animals, vertebrates or invertebrates, is the
capability for localization. The object of localization can be predator, prey, sources of
water, or food. Since these are basic necessities for survival, they evolve much faster
due to the survival of the fittest. In fact, localization capabilities, even if the sensors
are different, have convergently evolved to have same processing methods (coincidence
detection) in their peripheral neurons (for e.g., forked tongue of a snake, antennae of
a cockroach, acoustic localization in fishes and mammals). This convergent evolution
increases the validity of the technique. In this work, localization concepts based on
acoustic localization and tropotaxis are investigated and employed for creation of novel
ADCs.
Unlike intensity and frequency detection, which are not linear (for e.g. eyes saturate in
bright light, loose color perception in low light), localization is inherently linear. This
is mainly because the accurate localization of predator or prey can be the difference
between life and death for an animal.
Figure 1 visually explains the ADC concept proposed in this work. This has two parts.
(1) Sensor to Spike(time) Conversion (SSC), (2) Spike(time) to Digital Conversion(SDC).
Both of the structures have been designed with models of biological neurons. The
combination of these two structures is called SSDC.
To efficiently implement the proposed concept, a comparison of several biological neural
models is made and two models are shortlisted. Various synapse structures are also
studied. From this study, Leaky Integrate and Fire neuron (LIF) is chosen since it
fulfills all the requirements of the proposed structure. The analog neuron and synapse
designs from Indiveri et. al. [3], [4] were taken, and simulations were conducted using
cadence and the behavioral equivalence with biological counterpart was checked. The
LIF neuron had features, that were not required for the proposed approach. A simple
LIF neuron stripped of these features and was designed to be as fast as allowed by the
technology.
The SDC was designed with the neural building blocks and the delays were designed
using buffer chains. This SDC converts incoming Time Interval Code (TIC) to sparse
place coding using coincidence detection. Coincidence detection is a property of spiking
neurons, which is a time domain equivalent of a Gaussian Kernel. The SDC is designed to
have an online reconfigurable Gaussian kernel width, weight, threshold, and refractory
period. The advantage of sparse place codes, which contain rank order coding wasvii
Figure 1: ADC as a localization problem (right), Jeffress model of sound localization
visualized (left). The values t 1 and t 2 indicate the time taken from the source to s1 and
s2 respectively.
described in our work [5]. A time based winner take all circuit with memory was created
based on a previous work [6] for reading out of sparse place codes asynchronously.
The SSC was also initially designed with the same building blocks. Additionally, a
differential synapse was designed for better SSC. The sensor element considered wasviii
a Wheatstone full bridge AMR sensor AFF755 from Sensitec GmbH. A reconfigurable
version of the synapse was also designed for a more generic sensor interface.
The first prototype chip SSDCα was designed with 257 modules of coincidence detectors
realizing the SDC and the SSC. Since the spike times are the most important information,
the spikes can be treated as digital pulses. This provides the capability for digital
communication between analog modules. This creates a lot of freedom for use of digital
processing between the discussed analog modules. This advantage is fully exploited
in the design of SSDCα. Three SSC modules are multiplexed to the SDC. These SSC
modules also provide outputs from the chip simultaneously. A rising edge detecting fixed
pulse width generation circuit is used to create pulses that are best suited for efficient
performance of the SDC. The delay lines are made reconfigurable to increase robustness
and modify the span of the SDC. The readout technique used in the first prototype is
a relatively slow but safe shift register. It is used to analyze the characteristics of the
core work. This will be replaced by faster alternatives discussed in the work. The area
of the chip is 8.5 mm 2 . It has a sampling rate from DC to 150 kHz. It has a resolution
from 8-bit to 13-bit. It has 28,200 transistors on the chip. It has been designed in 350
nm CMOS technology from ams. The chip has been manufactured and tested with a
sampling rate of 10 kHz and a theoretical resolution of 8 bits. However, due to the
limitations of our Time-Interval-Generator, we are able to confirm for only 4 bits of
resolution.
The key novel contributions of this work are
• Neuromorphic implementation of AD conversion as a localization problem based
on sound localization and tropotaxis concepts found in nature.
• Coincidence detection with sparse place coding to enhance resolution.
• Graceful degradation without redundant elements, inherent robustness to noise,
which helps in scaling of technologies
• Amenable to local adaptation and self-x features.
Conceptual goals have all been fulfilled, with the exception of adaptation. The feasibility
for local adaptation has been shown with promising results and further investigation is
required for future work. This thesis work acts as a baseline, paving the way for R&D
in a new direction. The chip design has used 350 nm ams hitkit as a vehicle to prove
the functionality of the core concept. The concept can be easily ported to present
aggressively-scaled-technologies and future technologies.
The heterogeneity of today's access possibilities to wireless networks imposes challenges for efficient mobility support and resource management across different Radio Access Technologies (RATs). The current situation is characterized by the coexistence of various wireless communication systems, such as GSM, HSPA, LTE, WiMAX, and WLAN. These RATs greatly differ with respect to coverage, spectrum, data rates, Quality of Service (QoS), and mobility support.
In real systems, mobility-related events, such as Handover (HO) procedures, directly affect resource efficiency and End-To-End (E2E) performance, in particular with respect to signaling efforts and users' QoS. In order to lay a basis for realistic multi-radio network evaluation, a novel evaluation methodology is introduced in this thesis.
A central hypothesis of this thesis is that the consideration and exploitation of additional information characterizing user, network, and environment context, is beneficial for enhancing Heterogeneous Access Management (HAM) and Self-Optimizing Networks (SONs). Further, Mobile Network Operator (MNO) revenues are maximized by tightly integrating bandwidth adaptation and admission control mechanisms as well as simultaneously accounting for user profiles and service characteristics. In addition, mobility robustness is optimized by enabling network nodes to tune HO parameters according to locally observed conditions.
For establishing all these facets of context awareness, various schemes and algorithms are developed and evaluated in this thesis. System-level simulation results demonstrate the potential of context information exploitation for enhancing resource utilization, mobility support, self-tuning network operations, and users' E2E performance.
In essence, the conducted research activities and presented results motivate and substantiate the consideration of context awareness as key enabler for cognitive and autonomous network management. Further, the performed investigations and aspects evaluated in the scope of this thesis are highly relevant for future 5G wireless systems and current discussions in the 5G infrastructure Public Private Partnership (PPP).
The fifth-generation mobile telecommunication network is expected to support multi-access edge computing (MEC), which intends to distribute computation tasks and services from the central cloud to the edge clouds. Toward ultra-responsive, ultra-reliable, and ultra-low-latency MEC services, the current mobile network security architecture should enable a more decentralized approach for authentication and authorization processes. This paper proposes a novel decentralized authentication architecture that supports flexible and low-cost local authentication with the awareness of context information of network elements such as user equipment and virtual network functions. Based on a Markov model for backhaul link quality as well as a random walk mobility model with mixed mobility classes and traffic scenarios, numerical simulations have demonstrated that the proposed approach is able to achieve a flexible balance between the network operating cost and the MEC reliability.
Context-Enabled Optimization of Energy-Autarkic Networks for Carrier-Grade Wireless Backhauling
(2015)
This work establishes the novel category of coordinated Wireless Backhaul Networks (WBNs) for energy-autarkic point-to-point radio backhauling. The networking concept is based on three major building blocks: cost-efficient radio transceiver hardware, a self-organizing network operations framework, and power supply from renewable energy sources. The aim of this novel backhauling approach is to combine carrier-grade network performance with reduced maintenance effort as well as independent and self-sufficient power supply. In order to facilitate the success prospects of this concept, the thesis comprises the following major contributions: Formal, multi-domain system model and evaluation methodology
First, adapted from the theory of cyber-physical systems, the author devises a multi-domain evaluation methodology and a system-level simulation framework for energy-autarkic coordinated WBNs, including a novel balanced scorecard concept. Second, the thesis specifically addresses the topic of Topology Control (TC) in point-to-point radio networks and how it can be exploited for network management purposes. Given a set of network nodes equipped with multiple radio transceivers and known locations, TC continuously optimizes the setup and configuration of radio links between network nodes, thus supporting initial network deployment, network operation, as well as topology re-configuration. In particular, the author shows that TC in WBNs belongs to the class of NP-hard quadratic assignment problems and that it has significant impact in operational practice, e.g., on routing efficiency, network redundancy levels, service reliability, and energy consumption. Two novel algorithms focusing on maximizing edge connectivity of network graphs are developed.
Finally, this work carries out an analytical benchmarking and a numerical performance analysis of the introduced concepts and algorithms. The author analytically derives minimum performance levels of the the developed TC algorithms. For the analyzed scenarios of remote Alpine communities and rural Tanzania, the evaluation shows that the algorithms improve energy efficiency and more evenly balance energy consumption across backhaul nodes, thus significantly increasing the number of available backhaul nodes compared to state-of-the-art TC algorithms.
This thesis focuses on the development and analysis of Stochastic Model Predictive Control (SMPC) strategies for both distributed stochastic systems and centralized stochastic systems with partially known distributional information. The first part deals with the development of distributed SMPC schemes that can be synthesized and operated in a fully distributed manner, establishing rigorous theoretical guarantees such as recursive feasibility, stability and closed-loop chance constraint satisfaction. We study several control problems of practical interest, such as the output-feedback regulation problem or the state-feedback tracking problem under additive stochastic noise, and the regulation problem under multiplicative noise. In the second part of this thesis, a novel research topic known as distributionally robust MPC (DR-MPC) is explored, which enhances the applicability of SMPC to real-world problems. DR-MPC is advantageous as it solely necessitates partial knowledge in the form of samples of the uncertainty, which is usually available in practical scenarios, while SMPC mandates exact knowledge of the (unknown) distributional information. We investigate different so-called ambiguity sets to immunize the DR-MPC optimization problem against sampling inaccuracies, leading to tractable optimization problems with strong theoretical guarantees. Altogether, both parts provide rigorous theoretical guarantees with practical design procedures demonstrated by numerical examples, which are the main contributions of this thesis.
This thesis has the goal to propose measures which allow an increase of the power efficiency of OFDM transmission systems. As compared to OFDM transmission over AWGN channels, OFDM transmission over frequency selective radio channels requires a significantly larger transmit power in order to achieve a certain transmission quality. It is well known that this detrimental impact of frequency selectivity can be combated by frequency diversity. We revisit and further investigate an approach to frequency diversity based on the spreading of subsets of the data elements over corresponding subsets of the OFDM subcarriers and term this approach Partial Data Spreading (PDS). The size of said subsets, which we designate as spreading factor, is a design parameter of PDS, and by properly choosing , depending on the system designer's requirements, an adequate compromise between a good system performance and a low complexity can be found. We show how PDS can be combined with ML, MMSE and ZF data detection, and it is recognized that MMSE data detection offers a good compromise between performance and complexity. After having presented the utilization of PDS in OFDM transmission without FEC encoding, we also show that PDS readily lends itself for FEC encoded OFDM transmission. We display that in this case the system performance can be significantly enhanced by specific schemes of interleaving and utilization of reliabiliy information developed in the thesis. A severe problem of OFDM transmission is the large Peak-to-Average-Power Ratio (PAPR) of the OFDM symbols, which hampers the application of power efficient transmit amplifiers. Our investigations reveal that PDS inherently reduces the PAPR. Another approch to PAPR reduction is the well known scheme Selective Data Mapping (SDM). In the thesis it is shown that PDS can be beneficially combined with SDM to the scheme PDS-SDM with a view to jointly exploit the PAPR reduction potentials of both schemes. However, even when such a PAPR reduction is achieved, the amplitude maximum of the resulting OFDM symbols is not constant, but depends on the data content. This entails the disadvantage that the power amplifier cannot be designed, with a view to achieve a high power efficiency, for a fixed amplitude maximum, what would be desirable. In order to overcome this problem, we propose the scheme Optimum Clipping (OC), in which we obtain the desired fixed amplitude maximum by a specific combination of the measures clipping, filtering and rescaling. In OFDM transmission a certain number of OFDM subcarriers have to be sacrificed for pilot transmission in order to enable channel estimation in the receiver. For a given energy of the OFDM symbols, the question arises in which way this energy should be subdivided among the pilots and the data carrying OFDM subcarriers. If a large portion of the available transmit energy goes to the pilots, then the quality of channel estimation is good, however, the data detection performs poor. Data detection also performs poor if the energy provided for the pilots is too small, because then the channel estimate indispensable for data detection is not accurate enough. We present a scheme how to assign the energy to pilot and data OFDM subcarriers in an optimum way which minimizes the symbol error probability as the ultimate quality measure of the transmission. The major part of the thesis is dedicated to point-to-point OFDM transmission systems. Towards the end of the thesis we show that the PDS can be also applied to multipoint-to-point OFDM transmission systems encountered for instance in the uplinks of mobile radio systems.
Contributions to the application of adaptive antennas and CDMA code pooling in the TD CDMA downlink
(2002)
TD (Time Division)-CDMA is one of the partial standards adopted by 3GPP (3rd Generation Partnership Project) for 3rd Generation (3G) mobile radio systems. An important issue when designing 3G mobile radio systems is the efficient use of the available frequency spectrum, that is the achievement of a spectrum efficiency as high as possible. It is well known that the spectrum efficiency can be enhanced by utilizing multi-element antennas instead of single-element antennas at the base station (BS). Concerning the uplink of TD- CDMA, the benefits achievable by multi-element BS antennas have been quantitatively studied to a satisfactory extent. However, corresponding studies for the downlink are still missing. This thesis has the goal to make contributions to fill this lack of information. For near-to-reality directional mobile radio scenarios TD-CDMA downlink utilizing multi-element antennas at the BS are investigated both on the system level and on the link level. The system level investigations show how the carrier-to-interference ratio can be improved by applying such antennas. As the result of the link level investigations, which rely on the detection scheme Joint Detection (JD), the improvement of the bit er- ror rate by utilizing multi-element antennas at the BS can be quantified. Concerning the link level of TD-CDMA, a number of improvements are proposed which allow considerable performance enhancement of TD-CDMA downlink in connection with multi-element BS antennas. These improvements include * the concept of partial joint detection (PJD), in which at each mobile station (MS) only a subset of the arriving CDMA signals including those being of interest to this MS are jointly detected, * a blind channel estimation algorithm, * CDMA code pooling, that is assigning more than one CDMA code to certain con- nections in order to offer these users higher data rates, * maximizing the Shannon transmission capacity by an interleaving concept termed CDMA code interleaving and by advantageously selecting the assignment of CDMA codes to mobile radio channels, * specific power control schemes, which tackle the problem of different transmission qualities of the CDMA codes. As a comprehensive illustration of the advantages achievable by multi-element BS anten- nas in the TD-CDMA downlink, quantitative results concerning the spectrum efficiency for different numbers of antenna elements at the BS conclude the thesis.
This paper presents a completely systematic design procedure for asynchronous controllers.The initial step is the construction of a signal transition graph (STG, an interpreted Petri net) ofthe dialog between data path and controller: a formal representation without reference to timeor internal states. To implement concurrently operating control structures, and also to reducedesign effort and circuit cost, this STG can be decomposed into overlapping subnets. A univer-sal initial solution is then obtained by algorithmically constructing a primitive flow table fromeach component net. This step links the procedure to classical asynchronous design, in particu-lar to its proven optimization methods, without restricting the set of solutions. In contrast toother approaches, there is no need to extend the original STG intuitively.
Users privacy is more and more relevant in today digital world. In this paper, we study how mobile network operators (MNOs) practices can lead to loss of privacy for mobile phone subscribers. This article focuses on the mobile phone service providers' implication in privacy violation. Network attacks from other agents, such as cyber-criminals, are not covered in this work.
We review the impact of the location tracking improvement from 2G to 5G networks on police investigations and users' privacy rights.
We also study the role of MNOs in users' sensitive data monetization and the legality behind this practice.
There are few existing publications aiming to enhance mobile phone users' privacy protection against mobile broadband internet providers. We have tried to list all of them in this article.
Divide-and-Conquer is a common strategy to manage the complexity of system design and verification. In the context of System-on-Chip (SoC) design verification, an SoC system is decomposed into several modules and every module is separately verified. Usually an SoC module is reactive: it interacts with its environmental modules. This interaction is normally modeled by environment constraints, which are applied to verify the SoC module. Environment constraints are assumed to be always true when verifying the individual modules of a system. Therefore the correctness of environment constraints is very important for module verification.
Environment constraints are also very important for coverage analysis. Coverage analysis in formal verification measures whether or not the property set fully describes the functional behavior of the design under verification (DuV). if a set of properties describes every functional behavior of a DuV, the set of properties is called complete. To verify the correctness of environment constraints, Assume-Guarantee Reasoning rules can be employed.
However, the state of the art assume-guarantee reasoning rules cannot be applied to the environment constraints specified by using an industrial standard property language such as SystemVerilog Assertions (SVA).
This thesis proposes a new assume-guarantee reasoning rule that can be applied to environment constraints specified by using a property language such as SVA. In addition, this thesis proposes two efficient plausibility checks for constraints that can be conducted without a concrete implementation of the considered environment.
Furthermore, this thesis provides a compositional reasoning framework determining that a system is completely verified if all modules are verified with Complete Interval Property Checking (C-IPC) under environment constraints.
At present, there is a trend that more of the functionality in SoCs is shifted from the hardware to the hardware-dependent software (HWDS), which is a crucial component in an SoC, since other software layers, such as the operating systems are built on it. Therefore there is an increasing need to apply formal verification to HWDS, especially for safety-critical systems.
The interactions between HW and HWDS are often reactive, and happen in a temporal order. This requires new property languages to specify the reactive behavior at the HW and SW interfaces.
This thesis introduces a new property language, called Reactive Software Property Language (RSPL), to specify the reactive interactions between the HW and the HWDS.
Furthermore, a method for checking the completeness of software properties, which are specified by using RSPL, is presented in this thesis. This method is motivated by the approach of checking the completeness of hardware properties.
Multiple-channel die-stacked DRAMs have been used for maximizing the performance and minimizing the power of memory access in 2.5D/3D system chips. Stacked DRAM dies can be used as a cache for the processor die in 2.5D/3D system chips. Typically, modern processor system-on-chips (SOCs) have three-level caches, L1, L2, and L3. Could the DRAM cache be used to replace which level of caches? In this paper, we derive an inequality which can aid the designer to check if the designed DRAM cache can provide better performance than the L3 cache. Also, design considerations of DRAM caches for meet the inequality are discussed. We find that a dilemma of the DRAM cache access time and associativity exists for providing better performance than the L3 cache. Organizing multiple channels into a DRAM cache is proposed to cope with the dilemma.
The number of sensors used in modern devices is rapidly increasing, and the interaction with sensors demands analog-to-digital data conversion (ADC). A conventional ADC in leading-edge technologies faces
many issues due to signal swings, manufacturing deviations, noise, etc. Designers of ADCs are moving to the
time domain and digital designs techniques to deal with these issues. This work pursues a novel self-adaptive
spiking neural ADC (SN-ADC) design with promising features, e.g., technology scaling issues, low-voltage
operation, low power, and noise-robust conditioning. The SN-ADC uses spike time to carry the information.
Therefore, it can be effectively translated to aggressive new technologies to implement reliable advanced sensory electronic systems. The SN-ADC supports self-x (self-calibration, self-optimization, and self-healing) and
machine learning required for the internet of things (IoT) and Industry 4.0. We have designed the main part of
SN-ADC, which is an adaptive spike-to-digital converter (ASDC). The ASDC is based on a self-adaptive complementary metal–oxide–semiconductor (CMOS) memristor. It mimics the functionality of biological synapses,
long-term plasticity, and short-term plasticity. The key advantage of our design is the entirely local unsupervised
adaptation scheme. The adaptation scheme consists of two hierarchical layers; the first layer is self-adapted, and
the second layer is manually treated in this work. In our previous work, the adaptation process is based on 96 variables. Therefore, it requires considerable adaptation time to correct the synapses’ weight. This paper proposes a
novel self-adaptive scheme to reduce the number of variables to only four and has better adaptation capability
with less delay time than our previous implementation. The maximum adaptation times of our previous work
and this work are 15 h and 27 min vs. 1 min and 47.3 s. The current winner-take-all (WTA) circuits have issues, a
high-cost design, and no identifying the close spikes. Therefore, a novel WTA circuit with memory is proposed.
It used 352 transistors for 16 inputs and can process spikes with a minimum time difference of 3 ns. The ASDC
has been tested under static and dynamic variations. The nominal values of the SN-ADC parameters’ number
of missing codes (NOMCs), integral non-linearity (INL), and differential non-linearity (DNL) are no missing
code, 0.4 and 0.22 LSB, respectively, where LSB stands for the least significant bit. However, these values are
degraded due to the dynamic and static deviation with maximum simulated change equal to 0.88 and 4 LSB and
6 codes for DNL, INL, and NOMC, respectively. The adaptation resets the SN-ADC parameters to the nominal
values. The proposed ASDC is designed using X-FAB 0.35 µm CMOS technology and Cadence tools.
The simulation of Dynamic Random Access Memories (DRAMs) on system level requires highly accurate models due to their complex timing and power behavior. However, conventional cycle-accurate DRAM subsystem models often become a bottleneck for the overall simulation speed. A promising alternative are simulators based on Transaction Level Modeling, which can be fast and accurate at the same time. In this paper we present DRAMSys4.0, which is, to the best of our knowledge, the fastest and most extensive open-source cycle-accurate DRAM simulation framework. DRAMSys4.0 includes a novel software architecture that enables a fast adaption to different hardware controller implementations and new JEDEC standards. In addition, it already supports the latest standards DDR5 and LPDDR5. We explain how to apply optimization techniques for an increased simulation speed while maintaining full temporal accuracy. Furthermore, we demonstrate the simulator’s accuracy and analysis tools with two application examples. Finally, we provide a detailed investigation and comparison of the most prominent cycle-accurate open-source DRAM simulators with regard to their supported features, analysis capabilities and simulation speed.
Autonomous driving is disrupting the conventional automotive development. In fact, autonomous driving kicks off the consolidation of control units, i.e. the transition from distributed Electronic Control Units (ECUs) to centralized domain controllers. Platforms like Audi’s zFAS demonstrate this very clearly, where GPUs, Custom SoCs, Microcontrollers, and FPGAs are integrated on a single domain controller in order to perform sensor fusion, processing and decision making on a single Printed Circuit Board (PCB). The communication between these heterogeneous components and the algorithms for Advanced Driving Assistant Systems (ADAS) itself requires a huge amount of memory bandwidth, which will bring the Memory Wall from High Performance Computing (HPC) and data-centers directly in our cars. In this paper we highlight the roles and issues of Dynamic Random Access Memories (DRAMs) for future autonomous driving architectures.
In this thesis we studied and investigated a very common but a long existing noise problem and we provided a solution to this problem. The task is to deal with different types of noise that occur simultaneously and which we call hybrid. Although there are individual solutions for specific types one cannot simply combine them because each solution affects the whole speech. We developed an automatic speech recognition system DANSR ( Dynamic Automatic Noisy Speech Recognition System) for hybrid noisy environmental noise. For this we had to study all of speech starting from the production of sounds until their recognition. Central elements are the feature vectors on which pay much attention. As an additional effect we worked on the production of quantities for psychoacoustic speech elements.
The thesis has four parts:
1) The first part we give an introduction. The chapter 2 and 3 give an overview over speech generation and recognition when machines are used. Also noise is considered.
2) In the second part we describe our general system for speech recognition in a noisy environment. This is contained in the chapters 4-10. In chapter 4 we deal with data preparation. Chapter 5 is concerned with very strong noise and its modeling using Poisson distribution. In the chapters 5-8 we deal with parameter based modeling. Chapter 7 is concerned with autoregressive methods in relation to the vocal tract. In the chapters 8 and 9 we discuss linear prediction and its parameters. Chapter 9 is also concerned with quadratic errors, the decomposition into sub-bands and the use of Kalman filters for non-stationary colored noise in chapter 10. There one finds classical approaches as long we have used and modified them. This includes covariance mehods, the method of Burg and others.
3) The third part deals firstly with psychoacoustic questions. We look at quantitative magnitudes that describe them. This has serious consequences for the perception models. For hearing we use different scales and filters. In the center of the chapters 12 and 13 one finds the features and their extraction. The fearures are the only elements that contain information for further use. We consider here Cepstrum features and Mel frequency cepstral coefficients(MFCC), shift invariant local trigonometric transformed (SILTT), linear predictive coefficients (LPC), linear predictive cepstral coefficients (LPCC), perceptual linear predictive (PLP) cepstral coefficients. In chapter 13 we present our extraction methods in DANSR and how they use window techniques And discrete cosine transform (DCT-IV) as well as their inverses.
4) The fourth part considers classification and the ultimate speech recognition. Here we use the hidden Markov model (HMM) for describing the speech process and the Gaussian mixture model (GMM) for the acoustic modelling. For the recognition we use forward algorithm, the Viterbi search and the Baum-Welch algorithm. We also draw the connection to dynamic time warping (DTW). In the rest we show experimental results and conclusions.
This article proposes a new clock-dependent gain-scheduled dynamic output feedback controller for delayed linear parameter varying systems with piecewise constant parameters. The proposed controller guarantees ℒ2-performance. By employing a clock-dependent Lyapunov–Krasovskii functional, a sufficient condition for the existence of the controller is provided in terms of clock- and parameter-dependent linear matrix inequalities. A case study on output feedback control of delayed switched systems is also provided. To illustrate the efficacy of the result, it is applied to a practical VTOL helicopter model.
Modern applications in the realms of wireless communication and mobile broadband Internet increase the demand for compact antennas with well defined directivity. Here, we present an approach for the design and implementation of hybrid antennas consisting of a classic feeding antenna that is near-field-coupled to a subwavelength resonator. In such a combined structure, the composite antenna always radiates at the resonance frequency of the subwavelength oscillator as well as at the resonance frequency of the feeding antenna. While the classic antenna serves as impedance-matched feeding element, the subwavelength resonator induces an additional resonance to the composite antenna. In general, these near-field coupled structures are known for decades and are lately published as near-field resonant parasitic antennas. We describe an antenna design consisting of a high-frequency electric dipole antenna at fd=25 GHz that couples to a low-frequency subwavelength split-ring resonator, which emits electromagnetic waves at fSRR=10.41 GHz. The radiating part of the antenna has a size of approximately 3.2mm×8mm×1mm and thus is electrically small at this frequency with a product k⋅a=0.5 . The input return loss of the antenna was moderate at −18 dB and it radiated at a spectral bandwidth of 120 MHz. The measured main lobe of the antenna was observed at 60∘ with a −3 dB angular width of 65∘ in the E-plane and at 130∘ with a −3 dB angular width of 145∘ in the H-plane
Recurrent Neural Networks, in particular One-dimensional and Multidimensional Long Short-Term Memory (1D-LSTM and MD-LSTM) have achieved state-of-the-art classification accuracy in many applications such as machine translation, image caption generation, handwritten text recognition, medical imaging and many more. However, high classification accuracy comes at high compute, storage, and memory bandwidth requirements, which make their deployment challenging, especially for energy-constrained platforms such as portable devices. In comparison to CNNs, not so many investigations exist on efficient hardware implementations for 1D-LSTM especially under energy constraints, and there is no research publication on hardware architecture for MD-LSTM. In this article, we present two novel architectures for LSTM inference: a hardware architecture for MD-LSTM, and a DRAM-based Processing-in-Memory (DRAM-PIM) hardware architecture for 1D-LSTM. We present for the first time a hardware architecture for MD-LSTM, and show a trade-off analysis for accuracy and hardware cost for various precisions. We implement the new architecture as an FPGA-based accelerator that outperforms NVIDIA K80 GPU implementation in terms of runtime by up to 84× and energy efficiency by up to 1238× for a challenging dataset for historical document image binarization from DIBCO 2017 contest, and a well known MNIST dataset for handwritten digits recognition. Our accelerator demonstrates highest accuracy and comparable throughput in comparison to state-of-the-art FPGA-based implementations of multilayer perceptron for MNIST dataset. Furthermore, we present a new DRAM-PIM architecture for 1D-LSTM targeting energy efficient compute platforms such as portable devices. The DRAM-PIM architecture integrates the computation units in a close proximity to the DRAM cells in order to maximize the data parallelism and energy efficiency. The proposed DRAM-PIM design is 16.19 × more energy efficient as compared to FPGA implementation. The total chip area overhead of this design is 18 % compared to a commodity 8 Gb DRAM chip. Our experiments show that the DRAM-PIM implementation delivers a throughput of 1309.16 GOp/s for an optical character recognition application.