The work presented in this thesis discusses the thermal and power management of multi-core processors (MCPs) with both two dimensional (2D) package and there dimensional (3D) package chips. The power and thermal management/balancing is of increasing concern and is a technological challenge to the MCP development and will be a main performance bottleneck for the development of MCPs. This thesis develops optimal thermal and power management policies for MCPs. The system thermal behavior for both 2D package and 3D package chips is analyzed and mathematical models are developed. Thereafter, the optimal thermal and power management methods are introduced.
Nowadays, the chips are generally packed in 2D technique, which means that there is only one layer of dies in the chip. The chip thermal behavior can be described by a 3D heat conduction partial differential equation (PDE). As the target is to balance the thermal behavior and power consumption among the cores, a group of one dimensional (1D) PDEs, which is derived from the developed 3D PDE heat conduction equation, is proposed to describe the thermal behavior of each core. Therefore, the thermal behavior of the MCP is described by a group of 1D PDEs. An optimal controller is designed to manage the power consumption and balance the temperature among the cores based on the proposed 1D model.
3D package is an advanced package technology, which contains at least 2 layers of dies stacked in one chip. Different from 2D package, the cooling system should be installed among the layers to reduce the internal temperature of the chip. In this thesis, the micro-channel liquid cooling system is considered, and the heat transfer character of the micro-channel is analyzed and modeled as an ordinary differential equation (ODE). The dies are discretized to blocks based on the chip layout with each block modeled as a thermal resistance and capacitance (R-C) circuit. Thereafter, the micro-channels are discretized. The thermal behavior of the whole system is modeled as an ODE system. The micro-channel liquid velocity is set according to the workload and the temperature of the dies. Under each velocity, the system can be described as a linear ODE model system and the whole system is a switched linear system. An H-infinity observer is designed to estimate the states. The model predictive control (MPC) method is employed to design the thermal and power management/balancing controller for each submodel.
The models and controllers developed in this thesis are verified by simulation experiments via MATLAB. The IBM cell 8 cores processor and water micro-channel cooling system developed by IBM Research in collaboration with EPFL and ETHZ are employed as the experiment objects.
This paper presents a case study comparing the hardware description language „Constructing Hardware in a Scala Embedded Language“(Chisel) to VHDL. For a thorough comparison the Heston Model was implemented, a stochastic model used in financial mathematics to calculate option prices. Metrics like hardware utilization and maximum clock rate were extracted from both resulting designs and compared to each other. The results showed a 30% reduction in code size compared to VHDL, while the resulting circuits had about the same hardware utilization. Using Chisel however proofed to be difficult because of a few features that were not available for this case study.
Chisel (Constructing Hardware in a Scala embedded language) is a new programming language, which embedded in Scala, used for hardware synthesis. It aims to increase productivity when creating hardware by enabling designers to use features present in higher level programming languages to build complex hardware blocks. In this paper, the most advertised features of Chisel are investigated and compared to their VHDL counterparts, if present. Afterwards, the authors’ opinion if a switch to Chisel is worth considering is presented. Additionally, results from a related case study on Chisel are briefly summarized. The author concludes that, while Chisel has promising features, it is not yet ready for use in the industry.
This work shall provide a foundation for the cross-design of wireless networked control systems with limited resources. A cross-design methodology is devised, which includes principles for the modeling, analysis, design, and realization of low cost but high performance and intelligent wireless networked control systems. To this end, a framework is developed in which control algorithms and communication protocols are jointly designed, implemented, and optimized taking into consideration the limited communication, computing, memory, and energy resources of the low performance, low power, and low cost wireless nodes used. A special focus of the proposed methodology is on the prediction and minimization of the total energy consumption of the wireless network (i.e. maximization of the lifetime of wireless nodes) under control performance constraints (e.g. stability and robustness) in dynamic environments with uncertainty in resource availability, through the joint (offline/online) adaptation of communication protocol parameters and control algorithm parameters according to the traffic and channel conditions. Appropriate optimization approaches that exploit the structure of the optimization problems to be solved (e.g. linearity, affinity, convexity) and which are based on Linear Matrix Inequalities (LMIs), Dynamic Programming (DP), and Genetic Algorithms (GAs) are investigated. The proposed cross-design approach is evaluated on a testbed consisting of a real lab plant equipped with wireless nodes. Obtained results show the advantages of the proposed cross-design approach compared to standard approaches which are less flexible.
Hardware prototyping is an essential part in the hardware design flow. Furthermore, hardware prototyping usually relies on system-level design and hardware-in-the-loop simulations in order to develop, test and evaluate intellectual property cores. One common task in this process consist on interfacing cores with different port specifications. Data width conversion is used to overcome this issue. This work presents two open source hardware cores compliant with AXI4-Stream bus protocol, where each core performs upsizing/downsizing data width conversion.
Modern society relies on convenience services and mobile communication. Cloud computing is the current trend to make data and applications available at any time on every device. Data centers concentrate computation and storage at central locations, while they claim themselves green due to their optimized maintenance and increased energy efﬁciency. The key enabler for this evolution is the microelectronics industry. The trend to power efﬁcient mobile devices has forced this industry to change its design dogma to: ”keep data locally and reduce data communication whenever possible”. Therefore we ask: is cloud computing repeating the aberrations of its enabling industry?
The increasing complexity of modern SoC designs makes tasks of SoC formal verification
a lot more complex and challenging. This motivates the research community to develop
more robust approaches that enable efficient formal verification for such designs.
It is a common scenario to apply a correctness by integration strategy while a SoC
design is being verified. This strategy assumes formal verification to be implemented in
two major steps. First of all, each module of a SoC is considered and verified separately
from the other blocks of the system. At the second step – when the functional correctness
is successfully proved for every individual module – the communicational behavior has
to be verified between all the modules of the SoC. In industrial applications, SAT/SMT-based interval property checking(IPC) has become widely adopted for SoC verification. Using IPC approaches, a verification engineer is able to afford solving a wide range of important verification problems and proving functional correctness of diverse complex components in a modern SoC design. However, there exist critical parts of a design where formal methods often lack their robustness. State-of-the-art property checkers fail in proving correctness for a data path of an industrial central processing unit (CPU). In particular, arithmetic circuits of a realistic size (32 bits or 64 bits) – especially implementing multiplication algorithms – are well-known examples when SAT/SMT-based
formal verification may reach its capacity very fast. In cases like this, formal verification
is replaced with simulation-based approaches in practice. Simulation is a good methodology that may assure a high rate of discovered bugs hidden in a SoC design. However, in contrast to formal methods, a simulation-based technique cannot guarantee the absence of errors in a design. Thus, simulation may still miss some so-called corner-case bugs in the design. This may potentially lead to additional and very expensive costs in terms of time, effort, and investments spent for redesigns, refabrications, and reshipments of new chips.
The work of this thesis concentrates on studying and developing robust algorithms
for solving hard arithmetic decision problems. Such decision problems often originate from a task of RTL property checking for data-path designs. Proving properties of those
designs can efficiently be performed by solving SMT decision problems formulated with
the quantifier-free logic over fixed-sized bit vectors (QF-BV).
This thesis, firstly, proposes an effective algebraic approach based on a Gröbner basis theory that allows to efficiently decide arithmetic problems. Secondly, for the case of custom-designed components, this thesis describes a sophisticated modeling technique which is required to restore all the necessary arithmetic description from these components. Further, this thesis, also, explains how methods from computer algebra and the modeling techniques can be integrated into a common SMT solver. Finally, a new QF-BV SMT solver is introduced.
Wireless sensor networks are the driving force behind many popular and interdisciplinary research areas, such as environmental monitoring, building automation, healthcare and assisted living applications. Requirements like compactness, high integration of sensors, flexibility, and power efficiency are often very different and cannot be fulfilled by state-of-the-art node platforms at once. In this paper, we present and analyze AmICA: a flexible, compact, easy-to-program, and low-power node platform. Developed from scratch and including a node, a basic communication protocol, and a debugging toolkit, it assists in an user-friendly rapid application development. The general purpose nature of AmICA was evaluated in two practical applications with diametric requirements. Our analysis shows that AmICA nodes are 67% smaller than BTnodes, have five times more sensors than Mica2Dot and consume 72% less energy than the state-of-the-art TelosB mote in sleep mode.
For many years real-time task models have focused the timing constraints on execution windows defined by earliest start times and deadlines for feasibility.
However, the utility of some application may vary among scenarios which yield correct behavior, and maximizing this utility improves the resource utilization.
For example, target sensitive applications have a target point where execution results in maximized utility, and an execution window for feasibility.
Execution around this point and within the execution window is allowed, albeit at lower utility.
The intensity of the utility decay accounts for the importance of the application.
Examples of such applications include multimedia and control; multimedia application are very popular nowadays and control applications are present in every automated system.
In this thesis, we present a novel real-time task model which provides for easy abstractions to express the timing constraints of target sensitive RT applications: the gravitational task model.
This model uses a simple gravity pendulum (or bob pendulum) system as a visualization model for trade-offs among target sensitive RT applications.
We consider jobs as objects in a pendulum system, and the target points as the central point.
Then, the equilibrium state of the physical problem is equivalent to the best compromise among jobs with conflicting targets.
Analogies with well-known systems are helpful to fill in the gap between application requirements and theoretical abstractions used in task models.
For instance, the so-called nature algorithms use key elements of physical processes to form the basis of an optimization algorithm.
Examples include the knapsack problem, traveling salesman problem, ant colony optimization, and simulated annealing.
We also present a few scheduling algorithms designed for the gravitational task model which fulfill the requirements for on-line adaptivity.
The scheduling of target sensitive RT applications must account for timing constraints, and the trade-off among tasks with conflicting targets.
Our proposed scheduling algorithms use the equilibrium state concept to order the execution sequence of jobs, and compute the deviation of jobs from their target points for increased system utility.
The execution sequence of jobs in the schedule has a significant impact on the equilibrium of jobs, and dominates the complexity of the problem --- the optimum solution is NP-hard.
We show the efficacy of our approach through simulations results and 3 target sensitive RT applications enhanced with the gravitational task model.
This thesis has the goal to propose measures which allow an increase of the power efficiency of OFDM transmission systems. As compared to OFDM transmission over AWGN channels, OFDM transmission over frequency selective radio channels requires a significantly larger transmit power in order to achieve a certain transmission quality. It is well known that this detrimental impact of frequency selectivity can be combated by frequency diversity. We revisit and further investigate an approach to frequency diversity based on the spreading of subsets of the data elements over corresponding subsets of the OFDM subcarriers and term this approach Partial Data Spreading (PDS). The size of said subsets, which we designate as spreading factor, is a design parameter of PDS, and by properly choosing , depending on the system designer's requirements, an adequate compromise between a good system performance and a low complexity can be found. We show how PDS can be combined with ML, MMSE and ZF data detection, and it is recognized that MMSE data detection offers a good compromise between performance and complexity. After having presented the utilization of PDS in OFDM transmission without FEC encoding, we also show that PDS readily lends itself for FEC encoded OFDM transmission. We display that in this case the system performance can be significantly enhanced by specific schemes of interleaving and utilization of reliabiliy information developed in the thesis. A severe problem of OFDM transmission is the large Peak-to-Average-Power Ratio (PAPR) of the OFDM symbols, which hampers the application of power efficient transmit amplifiers. Our investigations reveal that PDS inherently reduces the PAPR. Another approch to PAPR reduction is the well known scheme Selective Data Mapping (SDM). In the thesis it is shown that PDS can be beneficially combined with SDM to the scheme PDS-SDM with a view to jointly exploit the PAPR reduction potentials of both schemes. However, even when such a PAPR reduction is achieved, the amplitude maximum of the resulting OFDM symbols is not constant, but depends on the data content. This entails the disadvantage that the power amplifier cannot be designed, with a view to achieve a high power efficiency, for a fixed amplitude maximum, what would be desirable. In order to overcome this problem, we propose the scheme Optimum Clipping (OC), in which we obtain the desired fixed amplitude maximum by a specific combination of the measures clipping, filtering and rescaling. In OFDM transmission a certain number of OFDM subcarriers have to be sacrificed for pilot transmission in order to enable channel estimation in the receiver. For a given energy of the OFDM symbols, the question arises in which way this energy should be subdivided among the pilots and the data carrying OFDM subcarriers. If a large portion of the available transmit energy goes to the pilots, then the quality of channel estimation is good, however, the data detection performs poor. Data detection also performs poor if the energy provided for the pilots is too small, because then the channel estimate indispensable for data detection is not accurate enough. We present a scheme how to assign the energy to pilot and data OFDM subcarriers in an optimum way which minimizes the symbol error probability as the ultimate quality measure of the transmission. The major part of the thesis is dedicated to point-to-point OFDM transmission systems. Towards the end of the thesis we show that the PDS can be also applied to multipoint-to-point OFDM transmission systems encountered for instance in the uplinks of mobile radio systems.