A prime motivation for using XML to directly represent pieces of information is the ability of supporting ad-hoc or 'schema-later' settings. In such scenarios, modeling data under loose data constraints is essential. Of course, the flexibility of XML comes at a price: the absence of a rigid, regular, and homogeneous structure makes many aspects of data management more challenging. Such malleable data formats can also lead to severe information quality problems, because the risk of storing inconsistent and incorrect data is greatly increased. A prominent example of such problems is the appearance of the so-called fuzzy duplicates, i.e., multiple and non-identical representations of a real-world entity. Similarity joins correlating XML document fragments that are similar can be used as core operators to support the identification of fuzzy duplicates. However, similarity assessment is especially difficult on XML datasets because structure, besides textual information, may exhibit variations in document fragments representing the same real-world entity. Moreover, similarity computation is substantially more expensive for tree-structured objects and, thus, is a serious performance concern. This thesis describes the design and implementation of an effective, flexible, and high-performance XML-based similarity join framework. As main contributions, we present novel structure-conscious similarity functions for XML trees - either considering XML structure in isolation or combined with textual information -, mechanisms to support the selection of relevant information from XML trees and organization of this information into a suitable format for similarity calculation, and efficient algorithms for large-scale identification of similar, set-represented objects. Finally, we validate the applicability of our techniques by integrating our framework into a native XML database management system; in this context we address several issues around the integration of similarity operations into traditional database architectures.
Embedded systems have become ubiquitous in everyday life, and especially in the automotive industry. New applications challenge their design by introducing a new class of problems that are based on a detailed analysis of the environmental situation. Situation analysis systems rely on models and algorithms of the domain of computational geometry. The basic model is usually an Euclidean plane, which contains polygons to represent the objects of the environment. Usual implementations of computational geometry algorithms cannot be directly used for safety-critical systems. First, a strict analysis of their correctness is indispensable and second, nonfunctional requirements with respect to the limited resources must be considered. This thesis proposes a layered approach to a polygon-processing system. On top of rational numbers, a geometry kernel is formalised at first. Subsequently, geometric primitives form a second layer of abstraction that is used for plane sweep and polygon algorithms. These layers do not only divide the whole system into manageable parts but make it possible to model problems and reason about them at the appropriate level of abstraction. This structure is used for the verification as well as the implementation of the developed polygon-processing library.
Nowadays, accounting, charging and billing users' network resource consumption are commonly used for the purpose of facilitating reasonable network usage, controlling congestion, allocating cost, gaining revenue, etc. In traditional IP traffic accounting systems, IP addresses are used to identify the corresponding consumers of the network resources. However, there are some situations in which IP addresses cannot be used to identify users uniquely, for example, in multi-user systems. In these cases, network resource consumption can only be ascribed to the owners of these hosts instead of corresponding real users who have consumed the network resources. Therefore, accurate accountability in these systems is practically impossible. This is a flaw of the traditional IP address based IP traffic accounting technique. This dissertation proposes a user based IP traffic accounting model which can facilitate collecting network resource usage information on the basis of users. With user based IP traffic accounting, IP traffic can be distinguished not only by IP addresses but also by users. In this dissertation, three different schemes, which can achieve the user based IP traffic accounting mechanism, are discussed in detail. The inband scheme utilizes the IP header to convey the user information of the corresponding IP packet. The Accounting Agent residing in the measured host intercepts IP packets passing through it. Then it identifies the users of these IP packets and inserts user information into the IP packets. With this mechanism, a meter located in a key position of the network can intercept the IP packets tagged with user information, extract not only statistic information, but also IP addresses and user information from the IP packets to generate accounting records with user information. The out-of-band scheme is a contrast scheme to the in-band scheme. It also uses an Accounting Agent to intercept IP packets and identify the users of IP traffic. However, the user information is transferred through a separated channel, which is different from the corresponding IP packets' transmission. The Multi-IP scheme provides a different solution for identifying users of IP traffic. It assigns each user in a measured host a unique IP address. Through that, an IP address can be used to identify a user uniquely without ambiguity. This way, traditional IP address based accounting techniques can be applied to achieve the goal of user based IP traffic accounting. In this dissertation, a user based IP traffic accounting prototype system developed according to the out-of-band scheme is also introduced. The application of user based IP traffic accounting model in the distributed computing environment is also discussed.
Computer-based simulation and visualization of acoustics of a virtual scene can aid during the design process of concert halls, lecture rooms, theaters, or living rooms. Because, not only the visual aspect of the room is important, but also its acoustics. In factory floors noise reduction is important since noise is hazardous to health. Despite the obvious dissimilarity between our aural and visual senses, many techniques required for the visualization of photo-realistic images and for the auralization of acoustic environments are quite similar. Both applications can be served by geometric methods such as particle- and ray tracing if we neglect a number of less important effects. By means of the simulation of room acoustics we want to predict the acoustic properties of a virtual model. For auralization, a pulse response filter needs to be assembled for each pair of source and listener positions. The convolution of this filter with an anechoic source signal provides the signal received at the listener position. Hence, the pulse response filter must contain all reverberations (echos) of a unit pulse, including their frequency decompositions due to absorption at different surface materials. For the room acoustic simulation a method named phonon tracing, since it is based on particles, is developed. The approach computes the energy or pressure decomposition for each particle (phonon) sent out from a sound source and uses this in a second pass (phonon collection) to construct the response filters for different listeners. This step can be performed in different precision levels. During the tracing step particle paths and additional information are stored in a so called phonon map. Using this map several sound visualization approaches were developed. From the visualization, the effect of different materials on the spectral energy / pressure distribution can be observed. The first few reflections already show whether certain frequency bands are rapidly absorbed. The absorbing materials can be identified and replaced in the virtual model, improving the overall acoustic quality of the simulated room. Furthermore an insight into the pressure / energy received at the listener position is possible. The phonon tracing algorithm as well as several sound visualization approaches are integrated into a common system utilizing Virtual Reality technologies in order to facilitate the immersion into the virtual scene. The system is a prototype developed within a project at the University of Kaiserslautern and is still a subject of further improvements. It consists of a stereoscopic back-projection system for visual rendering as well as professional audio equipment for auralization purposes.
Three dimensional (3d) point data is used in industry for measurement and reverse engineering. Precise point data is usually acquired with triangulating laser scanners or high precision structured light scanners. Lower precision point data is acquired by real-time structured light devices or by stereo matching with multiple cameras. The basic principle of all these methods is the so-called triangulation of 3d coordinates from two dimensional (2d) camera images.
This dissertation contributes a method for multi-camera stereo matching that uses a system of four synchronized cameras. A GPU based stereo matching method is presented to achieve a high quality reconstruction at interactive frame rates. Good depth resolution is achieved by allowing large disparities between the images. A multi level approach on the GPU allows a fast processing of these large disparities. In reverse engineering, hand-held laser scanners are used for the scanning of complex shaped objects. The operator of the scanner can scan complex regions slower, multiple times, or from multiple angles to achieve a higher point density. Traditionally, computer aided design (CAD) geometry is reconstructed in a separate step after the scanning. Errors or missing parts in the scan prevent a successful reconstruction. The contribution of this dissertation is an on-line algorithm that allows the reconstruction during the scanning of an object. Scanned points are added to the reconstruction and improve it on-line. The operator can detect the areas in the scan where the reconstruction needs additional data.
First, the point data is thinned out using an octree based data structure. Local normals and principal curvatures are estimated for the reduced set of points. These local geometric values are used for segmentation using a region growing approach. Implicit quadrics are fitted to these segments. The canonical form of the quadrics provides the parameters of basic geometric primitives.
An improved approach uses so called accumulated means of local geometric properties to perform segmentation and primitive reconstruction in a single step. Local geometric values can be added and removed on-line to these means to get a stable estimate over a complete segment. By estimating the shape of the segment it is decided which local areas are added to a segment. An accumulated score estimates the probability for a segment to belong to a certain type of geometric primitive. A boundary around the segment is reconstructed using a growing algorithm that ensures that the boundary is closed and avoids self intersections.
The primary emphasis of this thesis concerns the extraction and representation of intrinsic properties of three-dimensional (3D) unorganized point clouds. The points establishing a point cloud as it mainly emerges from LiDaR (Light Detection and Ranging) scan devices or by reconstruction from two-dimensional (2D) image series represent discrete samples of real world objects. Depending on the type of scenery the data is generated from the resulting point cloud may exhibit a variety of different structures. Especially, in the case of environmental LiDaR scans the complexity of the corresponding point clouds is relatively high. Hence, finding new techniques allowing the efficient extraction and representation of the underlying structural entities becomes an important research issue of recent interest. This thesis introduces new methods regarding the extraction and visualization of structural features like surfaces and curves (e.g. ridge-lines, creases) from 3D (environmental) point clouds. One main part concerns the extraction of curve-like features from environmental point data sets. It provides a new method supporting a stable feature extraction by incorporating a probability-based point classification scheme that characterizes individual points regarding their affiliation to surface-, curve- and volume-like structures. Another part is concerned with the surface reconstruction from (environmental) point clouds exhibiting objects that are more or less complex. A new method providing multi-resolutional surface representations from regular point clouds is discussed. Following the applied principles of this approach a volumetric surface reconstruction method based on the proposed classification scheme is introduced. It allows the reconstruction of surfaces from highly unstructured and noisy point data sets. Furthermore, contributions in the field of reconstructing 3D point clouds from 2D image series are provided. In addition, a discussion concerning the most important properties of (environmental) point clouds with respect to feature extraction is presented.
Automated theorem proving is a search problem and, by its undecidability, a very difficult one. The challenge in the development of a practically successful prover is the mapping of the extensively developed theory into a program that runs efficiently on a computer. Starting from a level-based system model for automated theorem provers, in this work we present different techniques that are important for the development of powerful equational theorem provers. The contributions can be divided into three areas: Architecture. We present a novel prover architecture that is based on a set-based compression scheme. With moderate additional computational costs we achieve a substantial reduction of the memory requirements. Further wins are architectural clarity, the easy provision of proof objects, and a new way to parallelize a prover which shows respectable speed-ups in practice. The compact representation paves the way to new applications of automated equational provers in the area of verification systems. Algorithms. To improve the speed of a prover we need efficient solutions for the most time-consuming sub-tasks. We demonstrate improvements of several orders of magnitude for two of the most widely used term orderings, LPO and KBO. Other important contributions are a novel generic unsatisfiability test for ordering constraints and, based on that, a sufficient ground reducibility criterion with an excellent cost-benefit ratio. Redundancy avoidance. The notion of redundancy is of central importance to justify simplifying inferences which are used to prune the search space. In our experience with unfailing completion, the usual notion of redundancy is not strong enough. In the presence of associativity and commutativity, the provers often get stuck enumerating equations that are permutations of each other. By extending and refining the proof ordering, many more equations can be shown redundant. Furthermore, our refinement of the unfailing completion approach allows us to use redundant equations for simplification without the need to consider them for generating inferences. We describe the efficient implementation of several redundancy criteria and experimentally investigate their influence on the proof search. The combination of these techniques results in a considerable improvement of the practical performance of a prover, which we demonstrate with extensive experiments for the automated theorem prover Waldmeister. The progress achieved allows the prover to solve problems that were previously out of reach. This considerably enhances the potential of the prover and opens up the way for new applications.
Ultraschall ist eines der am häufigsten genutzen, bildgebenden Verfahren in der Kardiologie. Dies ist durch die günstige Erzeugung, die Nicht-Invasivität und die Unschädlichkeit für die Patienten begründet. Nachteilig an den existierenden Geräten ist der Umstand, daß lediglich zwei-dimensionale Bilder generiert werden können. Zusätzlich können diese Bilder aufgrund anatomischer Gegebenheiten nicht aus einer wahlfreien Position akquiriert werden. Dies erschwert die Analyse der Daten und folglich die Diagnose. Mit dieser Arbeit wurden neue, algorithmische Aspekte des vier-dimensionalen, kardiologischen Ultraschalls ausgehend von der Akquisition der Rohdaten, deren Synchronisation und Rekonstruktion bis hin zur Visualisierung bearbeitet. In einem zusätzlichen Kapitel wurde eine neue Technik zur weiteren Aufwertung der Visualisierung, sowie zur visuellen Bearbeitung der Ultraschalldaten entwickelt. Durch die hier entwickelten Verfahren ist es möglich bestimmte Einschränkungen des kardiologischen Ultraschalls aufzuheben oder zumindest zu mildern. Hierunter zählen vor allem die Einschränkung auf zwei-dimensionale Schnittbilder, sowie die eingeschränkte Sichtwahl.
Asynchronous programs are challenging to implement correctly: the loose coupling between asynchronously executed tasks makes the control and data dependencies difficult to follow. Even subtle design and programming mistakes on the programs have the capability to introduce erroneous or divergent behaviors. As asynchronous programs are typically written to provide a reliable, high-performance infrastructure, there is a critical need for analysis techniques to guarantee their correctness.
In this dissertation, I provide scalable verification and testing tools to make asyn- chronous programs more reliable. I show that the combination of counter abstraction and partial order reduction is an effective approach for the verification of asynchronous systems by presenting PROVKEEPER and KUAI, two scalable verifiers for two types of asynchronous systems. I also provide a theoretical result that proves a counter-abstraction based algorithm called expand-enlarge-check, is an asymptotically optimal algorithm for the coverability problem of branching vector addition systems as which many asynchronous programs can be modeled. In addition, I present BBS and LLSPLAT, two testing tools for asynchronous programs that efficiently uncover many subtle memory violation bugs.
Open distributed systems are a class of distributed systems where (i) only partial information about the environment, in which they are running, is present, (ii) new resources may become available at runtime, and (iii) a subsystem may become aware of other subsystems after some interaction. Modeling and implementing such systems correctly is a complex task due to the openness and the dynamicity aspects. One way to ensure that the resulting systems behave correctly is to utilize formal verification.
Formal verification requires an adequate semantic model of the implementation, a specification of the desired behavior, and a reasoning technique. The actor model is a semantic model that captures the challenging aspects of open distributed systems by utilizing actors as universal primitives to represent system entities and allowing them to create new actors and to communicate by sending directed messages as reply to received messages. To enable compositional reasoning, where the reasoning task is reduced to independent verification of the system parts, semantic entities at a higher level of abstraction than actors are needed.
This thesis proposes an automaton model and combines sound reasoning techniques to compositionally verify implementations of open actor systems. Based on I/O automata, the model allows automata to be created dynamically and captures dynamic changes in communication patterns. Each automaton represents either an actor or a group of actors. The specification of the desired behavior is given constructively as an automaton. As the basis for compositionality, we formalize a component notion based on the static structure of the implementation instead of the dynamic entities (the actors) occurring in the system execution. The reasoning proceeds in two stages. The first stage establishes the connection between the automata representing single actors and their implementation description by means of weakest liberal preconditions. The second stage employs this result as the basis for verifying whether a component specification is satisfied. The verification is done by building a simulation relation from the automaton representing the implementation to the component's automaton. Finally, we validate the compositional verification approach through a number of examples by proving correctness of their actor implementations with respect to system specifications.