### Refine

#### Year of publication

#### Document Type

- Doctoral Thesis (45)
- Conference Proceeding (16)
- Preprint (6)
- Article (5)
- Report (3)
- Other (2)
- Course Material (1)

#### Language

- English (78) (remove)

#### Keywords

- Mobilfunk (5)
- Cache (3)
- DRAM (3)
- MIMO (3)
- Model checking (3)
- OFDM (3)
- SRAM (3)
- System-on-Chip (3)
- Chisel (2)
- FPGA (2)

#### Faculty / Organisational entity

- Fachbereich Elektrotechnik und Informationstechnik (78) (remove)

Rapid growth in sensors and sensor technology introduces variety of products to the market. The increasing number of available sensor concepts and implementations demands more versatile sensor electronics and signal conditioning. Nowadays signal conditioning for the available spectrum of sensors is becoming more and more challenging. Moreover, developing a sensor signal conditioning ASIC is a function of cost, area, and robustness to maintain signal integrity. Field programmable analog approaches and the recent evolvable hardware approaches offer partial solution for advanced compensation as well as for rapid prototyping. The recent research field of evolutionary concepts focuses predominantly on digital and is at its advancement stage in analog domain. Thus, the main research goal is to combine the ever increasing industrial demand for sensor signal conditioning with evolutionary concepts and dynamically reconfigurable matched analog arrays implemented in main stream Complementary Metal Oxide Semiconductors (CMOS) technologies to yield an intelligent and smart sensor system with acceptable fault tolerance and the so called self-x features, such as self-monitoring, self-repairing and self-trimming. For this aim, the work suggests and progresses towards a novel, time continuous and dynamically reconfigurable signal conditioning hardware platform suitable to support variety of sensors. The state-of-the-art has been investigated with regard to existing programmable/reconfigurable analog devices and the common industrial application scenario and circuits, in particular including resource and sizing analysis for proper motivation of design decisions. The pursued intermediate granular level approach called as Field Programmable Medium-granular mixed signal Array (FPMA) offers flexibility, trimming and rapid prototyping capabilities. The proposed approach targets at the investigation of industrial applicability of evolvable hardware concepts and to merge it with reconfigurable or programmable analog concepts, and industrial electronics standards and needs for next generation robust and flexible sensor systems. The devised programmable sensor signal conditioning test chips, namely FPMA1/FPMA2, designed in 0.35 µm (C35B4) Austriamicrosystems, can be used as a single instance, off the shelf chip at the PCB level for conditioning or in the loop with dedicated software to inherit the aspired self-x features. The use of such self–x sensor system carries the promise of improved flexibility, better accuracy and reduced vulnerability to manufacturing deviations and drift. An embedded system, namely PHYTEC miniMODUL-515C was used to program and characterize the mixed-signal test chips in various feedback arrangements to answer some of the questions raised by the research goals. Wide range of established analog circuits, ranging from single output to fully differential amplifiers, was investigated at different hierarchical levels to realize circuits like instrumentation amplifier and filters. A more extensive design issues based on low-power like for e.g., sub-threshold design were investigated and a novel soft sleep mode idea was proposed. The bandwidth limitations observed in the state of the art fine granular approaches were enhanced by the proposed intermediate granular approach. The so designed sensor signal conditioning instrumentation amplifier was then compared to the commercially available products in the market like LT 1167, INA 125 and AD 8250. In an adaptive prototype, evolutionary approaches, in particular based on particle swarm optimization with multi-objectives, were just deployed to all the test samples of FPMA1/FMPA2 (15 each) to exhibit self-x properties and to recover from manufacturing variations and drift. The variations observed in the performance of the test samples were compensated through reconfiguration for the desired specification.

In recent years, formal property checking has become adopted successfully in industry and is used increasingly to solve the industrial verification tasks. This success results from property checking formulations that are well adapted to specific methodologies. In particular, assertion checking and property checking methodologies based on Bounded Model Checking or related techniques have matured tremendously during the last decade and are well supported by industrial methodologies. This is particularly true for formal property checking of computational System-on-Chip (SoC) modules. This work is based on a SAT-based formulation of property checking called Interval Property Checking (IPC). IPC originates in the Siemens company and is in industrial use since the mid 1990s. IPC handles a special type of safety properties, which specify operations in intervals between abstract starting and ending states. This paves the way for extremely efficient proving procedures. However, there are still two problems in the IPC-based verification methodology flow that reduce the productivity of the methodology and sometimes hamper adoption of IPC. First, IPC may return false counterexamples since its computational bounded circuit model only captures local reachability information, i.e., long-term dependencies may be missed. If this happens, the properties need to be strengthened with reachability invariants in order to rule out the spurious counterexamples. Identifying strong enough invariants is a laborious manual task. Second, a set of properties needs to be formulated manually for each individual design to be verified. This set, however, isn’t re-usable for different designs. This work exploits special features of communication modules in SoCs to solve these problems and to improve the productivity of the IPC methodology flow. First, the work proposes a decomposition-based reachability analysis to solve the problem of identifying reachability information automatically. Second, this work develops a generic, reusable set of properties for protocol compliance verification.

The high demanded data throughput of data communication between units in the system can be covered by short-haul optical communication and high speed serial data communication. In these data communication schemes, the receiver has to extract the corresponding clock from serial data stream by a clock and data recovery circuit (CDR). Data transceiver nodes have their own local reference clocks for their data transmission and data processing units. The reference clocks are normally slightly different even if they are specified to have the same frequency. Therefore, the data communication transceivers always work in a plesiochronous condition, an operation with slightly different reference frequencies. The difference of the data rates is covered by an elastic buffer. In a data readout system in the experiment in particle physics, such as a particle detector, the data of analog-to-digital converters (ADCs) in all detector nodes are transmitted over the networks. The plesiochronous condition in these networks are non-preferable because it causes the difficulty in the time stamping, which is used to indicate the relative time between events. The separated clock distribution network is normally required to overcome this problem. If the existing data communication networks can support the clock distribution function, the system complexity can be largely reduced. The CDRs on all detector nodes have to operate without a local reference clock and provide the recovered clocks, which have sufficiently good quality, for using as the reference timing for their local data processing units. In this thesis, a low jitter clock and data recovery circuit for large synchronous networks is presented. It possesses a 2-loop topology. They are clock and data recovery loop and clock jitter filter loop. In CDR loop, the CDR with rotational frequency detector is applied to increase its frequency capture range, therefore the operation without local reference clock is possible. Its loop bandwidth can be freely adjusted to meet the specified jitter tolerance. The 1/4-rate time-interleaving architecture is used to reduce the operation frequency and optimize the power consumption. The clock-jitter-filter loop is applied to improve the jitter of the recovered clock. It uses a low jitter LC voltage controlled oscillator (VCO). The loop bandwidth of the clock-jitter-filter is minimized to suppress the jitter of the recovered clock. The 1/4-rate CDR with frequency detector and clock-jitter-filter with LC-VCO were implemented in 0.18µm CMOS Technology. Both circuits occupy an area of 1.61mm2 and consume 170mW from 1.8V supply. The CDR can cover data rate from 1 to 2Gb/s. Its loop bandwidth is configurable from 700kHz to 4MHz. Its jitter tolerance can comply to SONET standard. The clock-jitter-filter has the configurable input/output frequencies from 9.191 to 78.125MHz. Its loop bandwidth is adjustable from 100kHz to 3MHz. The high frequency clock is also available for a serial data transmitter. The CDR with clock-jitter-filter can generate clock with jitter of 4.2ps rms from the incoming serial data with inter-symbol-interference jitter of 150ps peak-to-peak.

Photonic crystals are inhomogeneous dielectric media with periodic variation of the refractive index. A photonic crystal gives us new tools for the manipulation of photons and thus has received great interests in a variety of fields. Photonic crystals are expected to be used in novel optical devices such as thresholdless laser diodes, single-mode light emitting diodes, small waveguides with low-loss sharp bends, small prisms, and small integrated optical circuits. They can be operated in some aspects as "left handed materials" which are capable of focusing transmitted waves into a sub-wavelength spot due to negative refraction. The thesis is focused on the applications of photonic crystals in communications and optical imaging: • Photonic crystal structures for potential dispersion management in optical telecommunication systems • 2D non-uniform photonic crystal waveguides with a square lattice for wide-angle beam refocusing using negative refraction • 2D non-uniform photonic crystal slabs with triangular lattice for all-angle beam refocusing • Compact phase-shifted band-pass transmission filter based on photonic crystals

The dissertation describes a practically proven, particularly efficient approach for the verification of digital circuit designs. The approach outperforms simulation based verification wrt. final circuit quality as well as wrt. required verification effort. In the dissertation, the paradigm of transaction based verification is ported from simulation to formal verification. One consequence is a particular format of formal properties, called operation properties. Circuit descriptions are verified by proof of operation properties with Interval Property Checking (IPC), a particularly strong SAT based formal verification algorithm. Furtheron, a completeness checker is presented that identifies all verification gaps in sets of operation properties. This completeness checker can handle the large operation properties that arise, if this approach is applied to realistic circuits. The methodology of operation properties, Interval Property Checking, and the completeness checker form a symbiosis that is of particular benefit to the verification of digital circuit designs. On top of this symbiosis an approach to completely verify the interaction of completely verified modules has been developed by adaptation of the modelling theories of digital systems. The approach presented in the dissertation has proven in multiple commercial application projects that it indeed completely verifies modules. After reaching a termination criterion that is well defined by completeness checking, no further bugs were found in the verified modules. The approach is marketed by OneSpin Solutions GmbH, Munich, under the names "Operation Based Verification" and "Gap Free Verification".

Wireless sensor networks are the driving force behind many popular and interdisciplinary research areas, such as environmental monitoring, building automation, healthcare and assisted living applications. Requirements like compactness, high integration of sensors, flexibility, and power efficiency are often very different and cannot be fulfilled by state-of-the-art node platforms at once. In this paper, we present and analyze AmICA: a flexible, compact, easy-to-program, and low-power node platform. Developed from scratch and including a node, a basic communication protocol, and a debugging toolkit, it assists in an user-friendly rapid application development. The general purpose nature of AmICA was evaluated in two practical applications with diametric requirements. Our analysis shows that AmICA nodes are 67% smaller than BTnodes, have five times more sensors than Mica2Dot and consume 72% less energy than the state-of-the-art TelosB mote in sleep mode.

Wireless Sensor Networks (WSN) are dynamically-arranged networks typically composed of a large number of arbitrarily-distributed sensor nodes with computing capabilities contributing to –at least– one common application. The main characteristic of these networks is that of being functionally constrained due to a scarce availability of resources and strong dependence on uncontrollable environmental factors. These conditions introduce severe restrictions on the applicability of classic real-time methods aiming at guaranteeing time-bounded communications. Existing real-time solutions tend to apply concepts that were originally not conceived for sensor networks, idealizing realistic application scenarios and overlooking at important design limitations. This results in a number of misleading practices contributing to approaches of restricted validity in real-world scenarios. Amending the confrontation between WSNs and real-time objectives starts with a review of the basic fundamentals of existing approaches. In doing so, this thesis presents an alternative approach based on a generalized timeliness notion suitable to the particularities of WSNs. The new conceptual notion allows the definition of feasible real-time objectives opening a new scope of possibilities not constrained to idealized systems. The core of this thesis is based on the definition and application of Quality of Service (QoS) trade-offs between timeliness and other significant QoS metrics. The analysis of local and global trade-offs provides a step-by-step methodology identifying the correlations between these quality metrics. This association enables the definition of alternative trade-off configurations (set points) influencing the quality performance of the network at selected instants of time. With the basic grounds established, the above concepts are embedded in a simple routing protocol constituting a proof of concept for the validity of the presented analysis. Extensive evaluations under realistic scenarios are driven on simulation environments as well as real testbeds, validating the consistency of this approach.

Model-based fault diagnosis and fault-tolerant control for a nonlinear electro-hydraulic system
(2010)

The work presented in this thesis discusses the model-based fault diagnosis and fault-tolerant control with application to a nonlinear electro-hydraulic system. High performance control with guaranteed safety and reliability for electro-hydraulic systems is a challenging task due to the high nonlinearity and system uncertainties. This thesis developed a diagnosis integrated fault-tolerant control (FTC) strategy for the electro-hydraulic system. In fault free case the nominal controller is in operation for achieving the best performance. If the fault occurs, the controller will be automatically reconfigured based on the fault information provided by the diagnosis system. Fault diagnosis and reconfigurable controller are the key parts for the proposed methodology. The system and sensor faults both are studied in the thesis. Fault diagnosis consists of fault detection and isolation (FDI). A model-base residual generating is realized by calculating the redundant information from the system model and available signal. In this thesis differential-geometric approach is employed, which gives a general formulation of FDI problem and is more compact and transparent among various model-based approaches. The principle of residual construction with differential-geometric method is to find an unobservable distribution. It indicates the existence of a system transformation, with which the unknown system disturbance can be decoupled. With the observability codistribution algorithm the local weak observability of transformed system is ensured. A Fault detection observer for the transformed system can be constructed to generate the residual. This method cannot isolated sensor faults. In the thesis the special decision making logic (DML) is designed based on the individual signal analysis of the residuals to isolate the fault. The reconfigurable controller is designed with the backstepping technique. Backstepping method is a recursive Lyapunov-based approach and can deal with nonlinear systems. Some system variables are considered as ``virtual controls'' during the design procedure. Then the feedback control laws and the associate Lyapunov function can be constructed by following step-by-step routine. For the electro-hydraulic system adaptive backstepping controller is employed for compensate the impact of the unknown external load in the fault free case. As soon as the fault is identified, the controller can be reconfigured according to the new modeling of faulty system. The system fault is modeled as the uncertainty of system and can be tolerated by parameter adaption. The senor fault acts to the system via controller. It can be modeled as parameter uncertainty of controller. All parameters coupled with the faulty measurement are replaced by its approximation. After the reconfiguration the pre-specified control performance can be recovered. FDI integrated FTC based on backstepping technique is implemented successfully on the electro-hydraulic testbed. The on-line robust FDI and controller reconfiguration can be achieved. The tracking performance of the controlled system is guaranteed and the considered faults can be tolerated. But the problem of theoretical robustness analysis for the time delay caused by the fault diagnosis is still open.

This thesis has the goal to propose measures which allow an increase of the power efficiency of OFDM transmission systems. As compared to OFDM transmission over AWGN channels, OFDM transmission over frequency selective radio channels requires a significantly larger transmit power in order to achieve a certain transmission quality. It is well known that this detrimental impact of frequency selectivity can be combated by frequency diversity. We revisit and further investigate an approach to frequency diversity based on the spreading of subsets of the data elements over corresponding subsets of the OFDM subcarriers and term this approach Partial Data Spreading (PDS). The size of said subsets, which we designate as spreading factor, is a design parameter of PDS, and by properly choosing , depending on the system designer's requirements, an adequate compromise between a good system performance and a low complexity can be found. We show how PDS can be combined with ML, MMSE and ZF data detection, and it is recognized that MMSE data detection offers a good compromise between performance and complexity. After having presented the utilization of PDS in OFDM transmission without FEC encoding, we also show that PDS readily lends itself for FEC encoded OFDM transmission. We display that in this case the system performance can be significantly enhanced by specific schemes of interleaving and utilization of reliabiliy information developed in the thesis. A severe problem of OFDM transmission is the large Peak-to-Average-Power Ratio (PAPR) of the OFDM symbols, which hampers the application of power efficient transmit amplifiers. Our investigations reveal that PDS inherently reduces the PAPR. Another approch to PAPR reduction is the well known scheme Selective Data Mapping (SDM). In the thesis it is shown that PDS can be beneficially combined with SDM to the scheme PDS-SDM with a view to jointly exploit the PAPR reduction potentials of both schemes. However, even when such a PAPR reduction is achieved, the amplitude maximum of the resulting OFDM symbols is not constant, but depends on the data content. This entails the disadvantage that the power amplifier cannot be designed, with a view to achieve a high power efficiency, for a fixed amplitude maximum, what would be desirable. In order to overcome this problem, we propose the scheme Optimum Clipping (OC), in which we obtain the desired fixed amplitude maximum by a specific combination of the measures clipping, filtering and rescaling. In OFDM transmission a certain number of OFDM subcarriers have to be sacrificed for pilot transmission in order to enable channel estimation in the receiver. For a given energy of the OFDM symbols, the question arises in which way this energy should be subdivided among the pilots and the data carrying OFDM subcarriers. If a large portion of the available transmit energy goes to the pilots, then the quality of channel estimation is good, however, the data detection performs poor. Data detection also performs poor if the energy provided for the pilots is too small, because then the channel estimate indispensable for data detection is not accurate enough. We present a scheme how to assign the energy to pilot and data OFDM subcarriers in an optimum way which minimizes the symbol error probability as the ultimate quality measure of the transmission. The major part of the thesis is dedicated to point-to-point OFDM transmission systems. Towards the end of the thesis we show that the PDS can be also applied to multipoint-to-point OFDM transmission systems encountered for instance in the uplinks of mobile radio systems.

For many years real-time task models have focused the timing constraints on execution windows defined by earliest start times and deadlines for feasibility.
However, the utility of some application may vary among scenarios which yield correct behavior, and maximizing this utility improves the resource utilization.
For example, target sensitive applications have a target point where execution results in maximized utility, and an execution window for feasibility.
Execution around this point and within the execution window is allowed, albeit at lower utility.
The intensity of the utility decay accounts for the importance of the application.
Examples of such applications include multimedia and control; multimedia application are very popular nowadays and control applications are present in every automated system.
In this thesis, we present a novel real-time task model which provides for easy abstractions to express the timing constraints of target sensitive RT applications: the gravitational task model.
This model uses a simple gravity pendulum (or bob pendulum) system as a visualization model for trade-offs among target sensitive RT applications.
We consider jobs as objects in a pendulum system, and the target points as the central point.
Then, the equilibrium state of the physical problem is equivalent to the best compromise among jobs with conflicting targets.
Analogies with well-known systems are helpful to fill in the gap between application requirements and theoretical abstractions used in task models.
For instance, the so-called nature algorithms use key elements of physical processes to form the basis of an optimization algorithm.
Examples include the knapsack problem, traveling salesman problem, ant colony optimization, and simulated annealing.
We also present a few scheduling algorithms designed for the gravitational task model which fulfill the requirements for on-line adaptivity.
The scheduling of target sensitive RT applications must account for timing constraints, and the trade-off among tasks with conflicting targets.
Our proposed scheduling algorithms use the equilibrium state concept to order the execution sequence of jobs, and compute the deviation of jobs from their target points for increased system utility.
The execution sequence of jobs in the schedule has a significant impact on the equilibrium of jobs, and dominates the complexity of the problem --- the optimum solution is NP-hard.
We show the efficacy of our approach through simulations results and 3 target sensitive RT applications enhanced with the gravitational task model.

Modern society relies on convenience services and mobile communication. Cloud computing is the current trend to make data and applications available at any time on every device. Data centers concentrate computation and storage at central locations, while they claim themselves green due to their optimized maintenance and increased energy efﬁciency. The key enabler for this evolution is the microelectronics industry. The trend to power efﬁcient mobile devices has forced this industry to change its design dogma to: ”keep data locally and reduce data communication whenever possible”. Therefore we ask: is cloud computing repeating the aberrations of its enabling industry?

The increasing complexity of modern SoC designs makes tasks of SoC formal verification
a lot more complex and challenging. This motivates the research community to develop
more robust approaches that enable efficient formal verification for such designs.
It is a common scenario to apply a correctness by integration strategy while a SoC
design is being verified. This strategy assumes formal verification to be implemented in
two major steps. First of all, each module of a SoC is considered and verified separately
from the other blocks of the system. At the second step – when the functional correctness
is successfully proved for every individual module – the communicational behavior has
to be verified between all the modules of the SoC. In industrial applications, SAT/SMT-based interval property checking(IPC) has become widely adopted for SoC verification. Using IPC approaches, a verification engineer is able to afford solving a wide range of important verification problems and proving functional correctness of diverse complex components in a modern SoC design. However, there exist critical parts of a design where formal methods often lack their robustness. State-of-the-art property checkers fail in proving correctness for a data path of an industrial central processing unit (CPU). In particular, arithmetic circuits of a realistic size (32 bits or 64 bits) – especially implementing multiplication algorithms – are well-known examples when SAT/SMT-based
formal verification may reach its capacity very fast. In cases like this, formal verification
is replaced with simulation-based approaches in practice. Simulation is a good methodology that may assure a high rate of discovered bugs hidden in a SoC design. However, in contrast to formal methods, a simulation-based technique cannot guarantee the absence of errors in a design. Thus, simulation may still miss some so-called corner-case bugs in the design. This may potentially lead to additional and very expensive costs in terms of time, effort, and investments spent for redesigns, refabrications, and reshipments of new chips.
The work of this thesis concentrates on studying and developing robust algorithms
for solving hard arithmetic decision problems. Such decision problems often originate from a task of RTL property checking for data-path designs. Proving properties of those
designs can efficiently be performed by solving SMT decision problems formulated with
the quantifier-free logic over fixed-sized bit vectors (QF-BV).
This thesis, firstly, proposes an effective algebraic approach based on a Gröbner basis theory that allows to efficiently decide arithmetic problems. Secondly, for the case of custom-designed components, this thesis describes a sophisticated modeling technique which is required to restore all the necessary arithmetic description from these components. Further, this thesis, also, explains how methods from computer algebra and the modeling techniques can be integrated into a common SMT solver. Finally, a new QF-BV SMT solver is introduced.

This work shall provide a foundation for the cross-design of wireless networked control systems with limited resources. A cross-design methodology is devised, which includes principles for the modeling, analysis, design, and realization of low cost but high performance and intelligent wireless networked control systems. To this end, a framework is developed in which control algorithms and communication protocols are jointly designed, implemented, and optimized taking into consideration the limited communication, computing, memory, and energy resources of the low performance, low power, and low cost wireless nodes used. A special focus of the proposed methodology is on the prediction and minimization of the total energy consumption of the wireless network (i.e. maximization of the lifetime of wireless nodes) under control performance constraints (e.g. stability and robustness) in dynamic environments with uncertainty in resource availability, through the joint (offline/online) adaptation of communication protocol parameters and control algorithm parameters according to the traffic and channel conditions. Appropriate optimization approaches that exploit the structure of the optimization problems to be solved (e.g. linearity, affinity, convexity) and which are based on Linear Matrix Inequalities (LMIs), Dynamic Programming (DP), and Genetic Algorithms (GAs) are investigated. The proposed cross-design approach is evaluated on a testbed consisting of a real lab plant equipped with wireless nodes. Obtained results show the advantages of the proposed cross-design approach compared to standard approaches which are less flexible.

Hardware prototyping is an essential part in the hardware design flow. Furthermore, hardware prototyping usually relies on system-level design and hardware-in-the-loop simulations in order to develop, test and evaluate intellectual property cores. One common task in this process consist on interfacing cores with different port specifications. Data width conversion is used to overcome this issue. This work presents two open source hardware cores compliant with AXI4-Stream bus protocol, where each core performs upsizing/downsizing data width conversion.

Chisel (Constructing Hardware in a Scala embedded language) is a new programming language, which embedded in Scala, used for hardware synthesis. It aims to increase productivity when creating hardware by enabling designers to use features present in higher level programming languages to build complex hardware blocks. In this paper, the most advertised features of Chisel are investigated and compared to their VHDL counterparts, if present. Afterwards, the authors’ opinion if a switch to Chisel is worth considering is presented. Additionally, results from a related case study on Chisel are briefly summarized. The author concludes that, while Chisel has promising features, it is not yet ready for use in the industry.

Investigate the hardware description language Chisel - A case study implementing the Heston model
(2013)

This paper presents a case study comparing the hardware description language „Constructing Hardware in a Scala Embedded Language“(Chisel) to VHDL. For a thorough comparison the Heston Model was implemented, a stochastic model used in financial mathematics to calculate option prices. Metrics like hardware utilization and maximum clock rate were extracted from both resulting designs and compared to each other. The results showed a 30% reduction in code size compared to VHDL, while the resulting circuits had about the same hardware utilization. Using Chisel however proofed to be difficult because of a few features that were not available for this case study.

The objective of this thesis consists in developing systematic event-triggered control designs for specified event generators, which is an important alternative to the traditional periodic sampling control. Sporadic sampling inherently arising in event-triggered control is determined by the event-triggering conditions. This feature invokes the desire of
finding new control theory as the traditional sampled-data theory in computer control.
Developing controller coupling with the applied event-triggering condition to maximize the control performance is the essence for event-triggered control design. In the design the stability of the control system needs to be ensured with the first priority. Concerning variant control aims they should be clearly incorporated in the design procedures. Considering applications in embedded control systems efficient implementation requires a low complexity of embedded software architectures. The thesis targets at offering such a design to further complete the theory of event-triggered control designs.

The work presented in this thesis discusses the thermal and power management of multi-core processors (MCPs) with both two dimensional (2D) package and there dimensional (3D) package chips. The power and thermal management/balancing is of increasing concern and is a technological challenge to the MCP development and will be a main performance bottleneck for the development of MCPs. This thesis develops optimal thermal and power management policies for MCPs. The system thermal behavior for both 2D package and 3D package chips is analyzed and mathematical models are developed. Thereafter, the optimal thermal and power management methods are introduced.
Nowadays, the chips are generally packed in 2D technique, which means that there is only one layer of dies in the chip. The chip thermal behavior can be described by a 3D heat conduction partial differential equation (PDE). As the target is to balance the thermal behavior and power consumption among the cores, a group of one dimensional (1D) PDEs, which is derived from the developed 3D PDE heat conduction equation, is proposed to describe the thermal behavior of each core. Therefore, the thermal behavior of the MCP is described by a group of 1D PDEs. An optimal controller is designed to manage the power consumption and balance the temperature among the cores based on the proposed 1D model.
3D package is an advanced package technology, which contains at least 2 layers of dies stacked in one chip. Different from 2D package, the cooling system should be installed among the layers to reduce the internal temperature of the chip. In this thesis, the micro-channel liquid cooling system is considered, and the heat transfer character of the micro-channel is analyzed and modeled as an ordinary differential equation (ODE). The dies are discretized to blocks based on the chip layout with each block modeled as a thermal resistance and capacitance (R-C) circuit. Thereafter, the micro-channels are discretized. The thermal behavior of the whole system is modeled as an ODE system. The micro-channel liquid velocity is set according to the workload and the temperature of the dies. Under each velocity, the system can be described as a linear ODE model system and the whole system is a switched linear system. An H-infinity observer is designed to estimate the states. The model predictive control (MPC) method is employed to design the thermal and power management/balancing controller for each submodel.
The models and controllers developed in this thesis are verified by simulation experiments via MATLAB. The IBM cell 8 cores processor and water micro-channel cooling system developed by IBM Research in collaboration with EPFL and ETHZ are employed as the experiment objects.

In this thesis we studied and investigated a very common but a long existing noise problem and we provided a solution to this problem. The task is to deal with different types of noise that occur simultaneously and which we call hybrid. Although there are individual solutions for specific types one cannot simply combine them because each solution affects the whole speech. We developed an automatic speech recognition system DANSR ( Dynamic Automatic Noisy Speech Recognition System) for hybrid noisy environmental noise. For this we had to study all of speech starting from the production of sounds until their recognition. Central elements are the feature vectors on which pay much attention. As an additional effect we worked on the production of quantities for psychoacoustic speech elements.
The thesis has four parts:
1) The first part we give an introduction. The chapter 2 and 3 give an overview over speech generation and recognition when machines are used. Also noise is considered.
2) In the second part we describe our general system for speech recognition in a noisy environment. This is contained in the chapters 4-10. In chapter 4 we deal with data preparation. Chapter 5 is concerned with very strong noise and its modeling using Poisson distribution. In the chapters 5-8 we deal with parameter based modeling. Chapter 7 is concerned with autoregressive methods in relation to the vocal tract. In the chapters 8 and 9 we discuss linear prediction and its parameters. Chapter 9 is also concerned with quadratic errors, the decomposition into sub-bands and the use of Kalman filters for non-stationary colored noise in chapter 10. There one finds classical approaches as long we have used and modified them. This includes covariance mehods, the method of Burg and others.
3) The third part deals firstly with psychoacoustic questions. We look at quantitative magnitudes that describe them. This has serious consequences for the perception models. For hearing we use different scales and filters. In the center of the chapters 12 and 13 one finds the features and their extraction. The fearures are the only elements that contain information for further use. We consider here Cepstrum features and Mel frequency cepstral coefficients(MFCC), shift invariant local trigonometric transformed (SILTT), linear predictive coefficients (LPC), linear predictive cepstral coefficients (LPCC), perceptual linear predictive (PLP) cepstral coefficients. In chapter 13 we present our extraction methods in DANSR and how they use window techniques And discrete cosine transform (DCT-IV) as well as their inverses.
4) The fourth part considers classification and the ultimate speech recognition. Here we use the hidden Markov model (HMM) for describing the speech process and the Gaussian mixture model (GMM) for the acoustic modelling. For the recognition we use forward algorithm, the Viterbi search and the Baum-Welch algorithm. We also draw the connection to dynamic time warping (DTW). In the rest we show experimental results and conclusions.

Specification of asynchronous circuit behaviour becomes more complex as the
complexity of today’s System-On-a-Chip (SOC) design increases. This also causes
the Signal Transition Graphs (STGs) – interpreted Petri nets for the specification
of asynchronous circuit behaviour – to become bigger and more complex, which
makes it more difficult, sometimes even impossible, to synthesize an asynchronous
circuit from an STG with a tool like petrify [CKK+96] or CASCADE [BEW00].
It has, therefore, been suggested to decompose the STG as a first step; this
leads to a modular implementation [KWVB03] [KVWB05], which can reduce syn-
thesis effort by possibly avoiding state explosion or by allowing the use of library
elements. A decomposition approach for STGs was presented in [VW02] [KKT93]
[Chu87a]. The decomposition algorithm by Vogler and Wollowski [VW02] is based
on that of Chu [Chu87a] but is much more generally applicable than the one in
[KKT93] [Chu87a], and its correctness has been proved formally in [VW02].
This dissertation begins with Petri net background described in chapter 2.
It starts with a class of Petri nets called a place/transition (P/T) nets. Then
STGs, the subclass of P/T nets, is viewed. Background in net decomposition
is presented in chapter 3. It begins with the structural decomposition of P/T
nets for analysis purposes – liveness and boundedness of the net. Then STG
decomposition for synthesis from [VW02] is described.
The decomposition method from [VW02] still could be improved to deal with
STGs from real applications and to give better decomposition results. Some
improvements for [VW02] to improve decomposition result and increase algorithm
efficiency are discussed in chapter 4. These improvement ideas are suggested in
[KVWB04] and some of them are have been proved formally in [VK04].
The decomposition method from [VW02] is based on net reduction to find
an output block component. A large amount of work has to be done to reduce
an initial specification until the final component is found. This reduction is not
always possible, which causes input initially classified as irrelevant to become
relevant input for the component. But under certain conditions (e.g. if structural
auto-conflicts turn out to be non-dynamic) some of them could be reclassified as
irrelevant. If this is not done, the specifications become unnecessarily large, which
intern leads to unnecessarily large implemented circuits. Instead of reduction, a
new approach, presented in chapter 5, decomposes the original net into structural
components first. An initial output block component is found by composing the
structural components. Then, a final output block component is obtained by net
reduction.
As we cope with the structure of a net most of the time, it would be useful
to have a structural abstraction of the net. A structural abstraction algorithm
[Kan03] is presented in chapter 6. It can improve the performance in finding an
output block component in most of the cases [War05] [Taw04]. Also, the structure
net is in most cases smaller than the net itself. This increases the efficiency of the
decomposition algorithm because it allows the transitions contained in a node of
the structure graph to be contracted at the same time if the structure graph is
used as internal representation of the net.
Chapter 7 discusses the application of STG decomposition in asynchronous
circuit design. Application to speed independent circuits is discussed first. Af-
ter that 3D circuits synthesized from extended burst mode (XBM) specifications
are discussed. An algorithm for translating STG specifications to XBM specifi-
cations was first suggested by [BEW99]. This algorithm first derives the state
machine from the STG specification, then translates the state machine to XBM
specification. An XBM specification, though it is a state machine, allows some
concurrency. These concurrencies can be translated directly, without deriving
all of the possible states. An algorithm which directly translates STG to XBM
specifications, is presented in chapter 7.3.1. Finally DESI, a tool to decompose
STGs and its decomposition results are presented.