## Fachbereich Informatik

We study the extension of techniques from Inductive Logic Programming (ILP) to temporal logic programming languages. To this end, we present two temporal logic programming languages and analyse the learnability of programs in these languages from finite sets of examples.

In first-order temporal logic, the following topics are analysed:

- How can we characterize the denotational semantics of programs?
- Which proof techniques are best suited?
- How complex is the learning task?

In propositional temporal logic, we analyse the following topics:

- How can we use well-known techniques from model checking to refine programs?
- How complex is the learning task?

In both cases we present estimates for the VC-dimension of selected classes of programs.

Most software systems are described in high-level model or programming languages. Their runtime behavior, however, is determined by the compiled code. For uncritical software, it may be sufficient to test the runtime behavior of the code. For safety-critical software, there is an additional aggravating factor resulting from the fact that the code must satisfy the formal specification which reflects the safety policy of the software consumer and that the software producer is obliged to demonstrate that the code is correct with respect to the specification using formal verification techniques. In this scenario, it is of great importance that static analyses and formal methods can be applied on the source code level, because this level is more abstract and better suited for such techniques. However, the results of the analyses and the verification can only be carried over to the machine code level, if we can establish the correctness of the translation. Thus, compilation is a crucial step in the development of software systems and formally verified translation correctness is essential to close the formalization chain from high-level formal methods to the machine-code level. In this thesis, I propose an approach to certifying compilers which achieves the aim of closing the formalization chain from high-level formal methods to the machine-code level by applying techniques from mathematical logic and programming language semantics. I propose an approach called foundational translation validation (FTV) in which the software producer implements an FTV system comprising a compiler and a specification and verification framework (SVF) which is implemented in higher-order logic (HOL). The most important part of the SVF is an explicit translation contract which comprises the formalizations of the source and the target languages of the compiler and the formalization of a binary translation correctness predicate corrTrans(S,T) for source programs S and target programs T. 
The formalizations of the languages are realized as deep embeddings in HOL. This enables one to declare the whole program in a formalized language as a HOL constant. The predicate formally specifies when T is considered to be a correct translation of S. Its definition is explicitly based on the program semantics definitions provided by the translation contract. Subsequent to the translation, the compiler translates the source and the target programs into their syntactic representations as HOL constants, S and T, and generates a proof of corrTrans(S,T). We call a compiler which follows the FTV approach a proof generating compiler. Our approach borrows the idea of representing programs in correctness proofs as logic constants from the foundational proof-carrying code (FPCC) approach. Novel features that distinguish our approach from other approaches to certifying compilers, such as proof-carrying code (PCC) and translation validation (TV), are the following: Firstly, the presence of an explicit translation contract formalized in HOL: The PCC and TV approaches do not formalize a translation contract explicitly. Instead, they incorporate the operational semantics and the translation correctness criterion in translation validation tools on the programming language level. Secondly, the representation of programs in correctness proofs as logic constants: The PCC and TV approaches translate programs into their representations as semantic abstractions that serve as inputs for translation validation tools. Thirdly, the certification of program transformation chains: Unlike the TV approach, which certifies single program transformations, the FTV approach achieves the aim of certifying whole chains of program transformations.
This is possible due to the fact that the translation contract provides, for all programming languages involved in the program transformation chain, definitions of program semantics functions which map programs to mathematical objects that are elements of a set with an (at least) partial order "<=". Then, the proof makes use of the fact that the relation "<=" is transitive. In this thesis, the feasibility of the FTV approach is exemplified by the implementation of an FTV system. The system comprises a compiler front-end that certifies its optimization phase and an accompanying SVF that is implemented in the theorem prover Isabelle/HOL. The compiler front-end translates programs in a small C-like programming language, performs three optimizations: constant folding, dead assignment elimination, and loop invariant hoisting, and generates translation certificates in the form of Isabelle/HOL theories. The main focus of the thesis is on the description of the SVF and its translation verification techniques.
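To make the first of the three optimizations concrete, the following is a minimal sketch of constant folding on a toy expression tree. The `Const`, `Var`, and `BinOp` types and the `fold` and `evaluate` functions are hypothetical names chosen for illustration; they are not taken from the thesis, whose compiler front-end and Isabelle/HOL certificates are not reproduced here. Comparing the evaluation results for one environment is only a spot check and is much weaker than a generated proof of corrTrans(S,T).

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # one of "+", "-", "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Var, BinOp]

# Hypothetical operator table for the toy language.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def fold(e: Expr) -> Expr:
    """Replace every operation on two constants by its constant result."""
    if isinstance(e, BinOp):
        left, right = fold(e.left), fold(e.right)
        if isinstance(left, Const) and isinstance(right, Const):
            return Const(OPS[e.op](left.value, right.value))
        return BinOp(e.op, left, right)
    return e


def evaluate(e: Expr, env: Dict[str, int]) -> int:
    """Reference semantics of the toy language for a variable environment."""
    if isinstance(e, Const):
        return e.value
    if isinstance(e, Var):
        return env[e.name]
    return OPS[e.op](evaluate(e.left, env), evaluate(e.right, env))


source = BinOp("+", Var("x"), BinOp("*", Const(2), Const(3)))
target = fold(source)  # x + 6
```

For a sample environment such as `{"x": 4}`, `evaluate(source, env)` and `evaluate(target, env)` agree, which illustrates (but does not prove) that the transformation preserves the program's meaning.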

This PhD thesis aims at finding a global robot navigation strategy for rugged off-road terrain which is robust against inaccurate self-localization, scalable to large environments, and also cost-efficient, i.e. able to generate navigation paths which optimize a cost measure closely related to terrain traversability. In order to meet this goal, aspects of both metrical and topological navigation techniques are combined. A primarily topological map is extended with the previously lacking capability of cost-efficient path planning and map extension. Further innovations include a multi-dimensional cost measure for topological edges, a method to learn these costs based on live feedback from the robot, and a set of extrapolation methods to predict the traversability costs for untraversed edges. The thesis presents two new image analysis techniques to optimize cost prediction based on the shape and appearance of the surrounding terrain. Experimental results indicate that the proposed global navigation system is indeed able to perform cost-efficient, large-scale path planning. At the same time, it avoids the need to maintain a fine-grained, global world model, which would reduce the scalability of the approach.
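The idea of cost-efficient planning on a topological map with multi-dimensional edge costs can be sketched as follows. This is an illustration under stated assumptions, not the method of the thesis: the cost vector is assumed to have two hypothetical components (for example time and energy), it is reduced to a scalar by a weighted sum, all costs are assumed non-negative, and the search is a plain Dijkstra search. The names `edge_cost` and `plan_path` are invented for this example.

```python
import heapq
from typing import Dict, List, Optional, Sequence, Tuple

# Topological map: node -> list of (neighbor, cost vector of that edge).
Graph = Dict[str, List[Tuple[str, Sequence[float]]]]


def edge_cost(costs: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a multi-dimensional edge cost into one number (assumed weighted sum)."""
    return sum(w * c for w, c in zip(weights, costs))


def plan_path(
    graph: Graph, start: str, goal: str, weights: Sequence[float]
) -> Optional[Tuple[float, List[str]]]:
    """Dijkstra search; returns (total cost, node sequence) or None if unreachable."""
    frontier: List[Tuple[float, str, List[str]]] = [(0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        total, node, path = heapq.heappop(frontier)
        if node == goal:
            return total, path
        if total > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, costs in graph.get(node, []):
            new_total = total + edge_cost(costs, weights)
            if new_total < best.get(neighbor, float("inf")):
                best[neighbor] = new_total
                heapq.heappush(frontier, (new_total, neighbor, path + [neighbor]))
    return None


# Hypothetical map: the direct edge A->C is slow but cheap in energy.
terrain: Graph = {
    "A": [("B", (1.0, 5.0)), ("C", (3.0, 1.0))],
    "B": [("C", (1.0, 5.0))],
    "C": [],
}

fastest = plan_path(terrain, "A", "C", weights=(1.0, 0.0))
thriftiest = plan_path(terrain, "A", "C", weights=(0.0, 1.0))
```

Changing the weights changes the preferred route: with only the first component weighted the planner goes via `B`, with only the second it takes the direct edge.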

Adaptive Extraction and Representation of Geometric Structures from Unorganized 3D Point Sets
(2009)

This thesis focuses on the extraction and representation of intrinsic properties of three-dimensional (3D) unorganized point clouds. The points that make up a point cloud, which mainly originates from LiDAR (Light Detection and Ranging) scan devices or from reconstruction from two-dimensional (2D) image series, represent discrete samples of real-world objects. Depending on the type of scenery the data is generated from, the resulting point cloud may exhibit a variety of different structures. In the case of environmental LiDAR scans in particular, the complexity of the corresponding point clouds is relatively high. Hence, finding new techniques that allow the efficient extraction and representation of the underlying structural entities has become an important research issue. This thesis introduces new methods for the extraction and visualization of structural features like surfaces and curves (e.g. ridge-lines, creases) from 3D (environmental) point clouds. One main part concerns the extraction of curve-like features from environmental point data sets. It provides a new method supporting stable feature extraction by incorporating a probability-based point classification scheme that characterizes individual points regarding their affiliation to surface-, curve- and volume-like structures. Another part is concerned with surface reconstruction from (environmental) point clouds exhibiting objects that are more or less complex. A new method providing multi-resolution surface representations from regular point clouds is discussed. Following the principles of this approach, a volumetric surface reconstruction method based on the proposed classification scheme is introduced. It allows the reconstruction of surfaces from highly unstructured and noisy point data sets. Furthermore, contributions in the field of reconstructing 3D point clouds from 2D image series are provided.
In addition, a discussion concerning the most important properties of (environmental) point clouds with respect to feature extraction is presented.

Knowledge discovery from large and complex collections of today’s scientific datasets is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the increasing number of data dimensions and data objects is presenting tremendous challenges for data analysis and effective data exploration methods and tools. Researchers are overwhelmed with data, and standard tools are often insufficient to enable effective data analysis and knowledge discovery. The main objective of this thesis is to provide important new capabilities to accelerate scientific knowledge discovery from large, complex, and multivariate scientific data. The research covered in this thesis addresses these scientific challenges using a combination of scientific visualization, information visualization, automated data analysis, and other enabling technologies, such as efficient data management. The effectiveness of the proposed analysis methods is demonstrated via applications in two distinct scientific research fields, namely developmental biology and high-energy physics. Advances in microscopy, image analysis, and embryo registration enable, for the first time, measurement of gene expression at cellular resolution for entire organisms. Analysis of high-dimensional spatial gene expression datasets is a challenging task. By integrating data clustering and visualization, analysis of complex, time-varying, spatial gene expression patterns and their formation becomes possible. The analysis framework MATLAB and the visualization have been integrated, making advanced analysis tools accessible to biologists and enabling bioinformatics researchers to directly integrate their analysis with the visualization. Laser wakefield particle accelerators (LWFAs) promise to be a new compact source of high-energy particles and radiation, with wide applications ranging from medicine to physics.
To gain insight into the complex physical processes of particle acceleration, physicists model LWFAs computationally. The datasets produced by LWFA simulations are (i) extremely large, (ii) of varying spatial and temporal resolution, (iii) heterogeneous, and (iv) high-dimensional, making analysis and knowledge discovery from complex LWFA simulation data a challenging task. To address these challenges this thesis describes the integration of the visualization system VisIt and the state-of-the-art index/query system FastBit, enabling interactive visual exploration of extremely large three-dimensional particle datasets. Researchers are especially interested in beams of high-energy particles formed during the course of a simulation. This thesis describes novel methods for automatic detection and analysis of particle beams enabling a more accurate and efficient data analysis process. By integrating these automated analysis methods with visualization, this research enables more accurate, efficient, and effective analysis of LWFA simulation data than previously possible.

Interactive visualization of large structured and unstructured data sets is a permanent challenge for scientific visualization. Large data sets are, for example, created by magnetic resonance imaging (MRI), computed tomography (CT), computational fluid dynamics (CFD), the finite element method (FEM), and computer aided design (CAD). For visualizing those data sets, not only accelerated rasterization by means of specialized hardware (i.e., graphics cards) is of interest, but also ray casting, as it is well suited for scientific visualization. Ray casting not only supports many rendering modes (e.g., opaque rendering, semi-transparent rendering, iso surface rendering, maximum intensity projection, x-ray, absorption emitter model, ...), for which it allows the creation of high quality images, but it also supports many primitives (e.g., not only triangles but also spheres, curved iso surfaces, NURBS, implicit functions, ...). Furthermore, it scales essentially linearly with the number of processor cores used and, which makes it highly interesting for the visualization of large data sets, it scales sublinearly with data size for static scenes. Interactive ray casting is currently not widely used within the scientific visualization community. This is mainly for historical reasons: just a few years ago, no practical interactive ray casters for commodity hardware existed. Interactive scientific visualization has only been possible by using graphics cards or specialized and/or expensive hardware. The goal of this work is to broaden the possibilities for interactive scientific visualization by showing that interactive CPU-based ray casting is feasible today on commodity hardware and that it may efficiently be used together with GPU-based rasterization. In this thesis it is first shown that interactive CPU-based ray casters may efficiently be integrated into already existing OpenGL frameworks.
This is achieved through an OpenGL-friendly interface that supports multiple threads and single instruction multiple data (SIMD) operations. For the visualization of rectilinear (and not necessarily Cartesian) grids, new implicit kd-trees are introduced. They have fast construction times and low memory requirements, and they allow interactive iso surface ray tracing and maximum intensity projection of large scalar fields on today's commodity desktop machines. A new interactive SIMD ray tracing technique for large tetrahedral meshes is introduced. It is very portable and general and is therefore suited for porting to different (future) hardware and for use in a range of applications. The thesis ends with a real-life commercial application which shows that CPU-based ray casting has already reached the state where it may outperform GPU-based rasterization for scientific visualization.

In engineering and science, a multitude of problems exhibit an inherently geometric nature. The computational assessment of such problems requires an adequate representation by means of data structures and processing algorithms. One of the most widely adopted and recognized spatial data structures is the Delaunay triangulation which has its canonical dual in the Voronoi diagram. While the Voronoi diagram provides a simple and elegant framework to model spatial proximity, the core of which is the concept of natural neighbors, the Delaunay triangulation provides robust and efficient access to it. This combination explains the immense popularity of Voronoi- and Delaunay-based methods in all areas of science and engineering. This thesis addresses aspects from a variety of applications that share their affinity to the Voronoi diagram and the natural neighbor concept. First, an idea for the generalization of B-spline surfaces to unstructured knot sets over Voronoi diagrams is investigated. Then, a previously proposed method for \(C^2\) smooth natural neighbor interpolation is backed with concrete guidelines for its implementation. Smooth natural neighbor interpolation is also one of many applications requiring derivatives of the input data. The generation of derivative information in scattered data with the help of natural neighbors is described in detail. In a different setting, the computation of a discrete harmonic function in a point cloud is considered, and an observation is presented that relates natural neighbor coordinates to a continuous dependency between discrete harmonic functions and the coordinates of the point cloud. Attention is then turned to integrating the flexibility and meritable properties of natural neighbor interpolation into a framework that allows the algorithmically transparent and smooth extrapolation of any known natural neighbor interpolant. 
Finally, essential properties are proved for a recently introduced novel finite element tessellation technique in which a Delaunay triangulation is transformed into a unique polygonal tessellation.

Modern digital imaging technologies, such as digital microscopy or micro-computed tomography, deliver such large amounts of 2D and 3D-image data that manual processing becomes infeasible. This leads to a need for robust, flexible and automatic image analysis tools in areas such as histology or materials science, where microstructures are being investigated (e.g. cells, fiber systems). General-purpose image processing methods can be used to analyze such microstructures. These methods usually rely on segmentation, i.e., a separation of areas of interest in digital images. As image segmentation algorithms rarely adapt well to changes in the imaging system or to different analysis problems, there is a demand for solutions that can easily be modified to analyze different microstructures, and that are more accurate than existing ones. To address these challenges, this thesis contributes a novel statistical model for objects in images and novel algorithms for the image-based analysis of microstructures. The first contribution is a novel statistical model for the locations of objects (e.g. tumor cells) in images. This model is fully trainable and can therefore be easily adapted to many different image analysis tasks, which is demonstrated by examples from histology and materials science. Using algorithms for fitting this statistical model to images results in a method for locating multiple objects in images that is more accurate and more robust to noise and background clutter than standard methods. On simulated data at high noise levels (peak signal-to-noise ratio below 10 dB), this method achieves detection rates up to 10% above those of a watershed-based alternative algorithm. While objects like tumor cells can be described well by their coordinates in the plane, the analysis of fiber systems in composite materials, for instance, requires a fully three dimensional treatment. 
Therefore, the second contribution of this thesis is a novel algorithm to determine the local fiber orientation in micro-tomographic reconstructions of fiber-reinforced polymers and other fibrous materials. Using simulated data, it will be demonstrated that the local orientations obtained from this novel method are more robust to noise and fiber overlap than those computed using an established alternative gradient-based algorithm, both in 2D and 3D. The property of robustness to noise of the proposed algorithm can be explained by the fact that a low-pass filter is used to detect local orientations. But even in the absence of noise, depending on fiber curvature and density, the average local 3D-orientation estimate can be about 9° more accurate compared to that alternative gradient-based method. Implementations of that novel orientation estimation method require repeated image filtering using anisotropic Gaussian convolution filters. These filter operations, which other authors have used for adaptive image smoothing, are computationally expensive when using standard implementations. Therefore, the third contribution of this thesis is a novel optimal non-orthogonal separation of the anisotropic Gaussian convolution kernel. This result generalizes a previous one reported elsewhere, and allows for efficient implementations of the corresponding convolution operation in any dimension. In 2D and 3D, these implementations achieve an average performance gain by factors of 3.8 and 3.5, respectively, compared to a fast Fourier transform-based implementation. The contributions made by this thesis represent improvements over state-of-the-art methods, especially in the 2D-analysis of cells in histological resections, and in the 2D and 3D-analysis of fibrous materials.