Today’s high-resolution digital images and videos require large amounts of storage space and transmission bandwidth. Coping with this requires compression methods that reduce the required space while minimizing visual artifacts. We propose a compression method based on a piecewise linear color interpolation induced by a triangulation of the image domain. We present methods that significantly speed up the optimization process for finding the triangulation. Furthermore, we extend the method to digital videos.
Laser scanners that capture the surface of three-dimensional objects are now widely used in industry, e.g., for reverse engineering or quality measurement. Hand-held scanning devices have the advantage that the laser device can be moved to any position, permitting scans of complex objects. Operating a hand-held laser scanner is challenging, however: the operator has to keep track of the scanned regions mentally and gets no feedback on the sample density until the surface reconstruction is started after the scan is finished. We present a system that supports the operator by computing and rendering high-quality surface meshes of the captured data online, i.e., while the scan is still in progress, and in real time. Furthermore, it color-codes the rendered surface to reflect the surface quality. This provides instant feedback, resulting in better scans in less time.
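The piecewise linear color interpolation named in the compression abstract above can be illustrated for a single triangle. The following is a minimal sketch, assuming colors are stored at the triangle vertices and blended with barycentric weights; the function names and the example triangle are hypothetical and not taken from the described method, and the optimization of the triangulation itself is not shown.

```python
import numpy as np

def barycentric_weights(p, a, b, c):
    """Return the barycentric coordinates of point p in triangle (a, b, c)."""
    a, b, c, p = (np.asarray(v, dtype=float) for v in (a, b, c, p))
    # Solve p - a = u * (b - a) + v * (c - a) for (u, v).
    m = np.column_stack((b - a, c - a))
    u, v = np.linalg.solve(m, p - a)
    return np.array([1.0 - u - v, u, v])

def interpolate_color(p, vertices, colors):
    """Linearly interpolate per-vertex RGB colors at point p of one triangle."""
    w = barycentric_weights(p, *vertices)
    return w @ np.asarray(colors, dtype=float)

# Hypothetical triangle with one pure color per vertex.
vertices = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
centroid_color = interpolate_color((4.0 / 3.0, 4.0 / 3.0), vertices, colors)
```

Under this assumption, only the vertex positions and vertex colors would need to be stored; every other pixel is reconstructed from the triangle that contains it.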
Sound surrounds us all the time and everywhere in our daily life, be it pleasant music in a concert hall or disturbing noise emanating from a busy street in front of our home. The basic properties are the same for both kinds of sound, namely sound waves propagating from a source, but we perceive them differently depending on our current mood and on whether the sound is wanted. In this thesis, both pleasant sound and disturbing noise are examined by simulating the sound and visualizing the results. Although the basic properties of music and traffic noise are the same, different features are of interest. In a concert hall, for example, the reverberation time is an important quality measure, whereas for noise only the resulting sound level, for example on one's balcony, is of interest. Such differences are reflected in different simulation methods and required visualizations; this thesis is therefore divided into two parts. The first part, on room acoustics, deals with the simulation and novel visualizations of indoor sound and acoustic quality measures such as definition (original "Deutlichkeit") and clarity index (original "Klarheitsmaß"). For the simulation, two different methods, a geometric approach (phonon tracing) and a wave-based approach (FEM), are applied and compared. The visualization techniques give insight into the sound behaviour and the acoustic quality of a room from a global as well as a listener-based viewpoint. Furthermore, an acoustic rendering equation is presented, which is used to render interference effects for different frequencies. Finally, a novel visualization approach for low-frequency sound is presented, which enables the topological analysis of pressure fields based on room eigenfrequencies. The second part, on environmental noise, is concerned with the simulation and visualization of outdoor sound with a focus on traffic noise.
The simulation procedure prescribed by national regulations is discussed in detail, and an approach for computing noise volumes is presented, together with an extension of the simulation that allows interactive noise calculation. Novel visualization and interaction techniques for the calculated noise data are presented; they are incorporated in an interactive three-dimensional environment and make noise problems easy to comprehend. Furthermore, additional information can be integrated into the framework to enhance the visualization of noise and the usability of the framework for different purposes.
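The two room-acoustic quality measures named in this abstract, definition and clarity index, can be computed from a room impulse response. The sketch below assumes the usual textbook forms: definition as the energy arriving within the first 50 ms divided by the total energy, and clarity index as the ratio of the energy before and after 80 ms, expressed in decibels. The abstract does not state which time limits the thesis uses, so the 50 ms and 80 ms limits, the function names, and the convention that the response starts at the arrival of the direct sound are all assumptions.

```python
import numpy as np

def definition_d50(h, fs):
    """Early-to-total energy ratio with a 50 ms limit (assumed definition)."""
    energy = np.asarray(h, dtype=float) ** 2
    n50 = int(round(0.050 * fs))  # number of samples in the first 50 ms
    return energy[:n50].sum() / energy.sum()

def clarity_c80(h, fs):
    """Early-to-late energy ratio in dB with an 80 ms limit (assumed definition)."""
    energy = np.asarray(h, dtype=float) ** 2
    n80 = int(round(0.080 * fs))  # number of samples in the first 80 ms
    # Note: the late energy must be non-zero for the ratio to be defined.
    return 10.0 * np.log10(energy[:n80].sum() / energy[n80:].sum())

# Toy impulse response: direct sound at 0 ms and one reflection at 100 ms.
fs = 1000  # samples per second
h = np.zeros(200)
h[0] = 2.0    # direct sound
h[100] = 1.0  # late reflection
d50 = definition_d50(h, fs)
c80 = clarity_c80(h, fs)
```

In this toy response the direct sound carries four times the energy of the reflection, so the early part holds 80% of the total energy.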
Layout analysis--the division of page images into text blocks and lines, and the determination of their reading order--is a major performance-limiting step in large-scale document digitization projects. This thesis addresses this problem in several ways: it presents new performance measures to identify important classes of layout errors, evaluates the performance of state-of-the-art layout analysis algorithms, presents a number of methods to reduce the error rate and the catastrophic failures occurring during layout analysis, and develops a statistically motivated, trainable layout analysis system that addresses the needs of large-scale document analysis applications. An overview of the key contributions of this thesis is as follows. First, this thesis presents an efficient local adaptive thresholding algorithm that yields the same binarization quality as state-of-the-art local binarization methods, but runs in time close to that of global thresholding methods, independent of the local window size. Tests on the UW-1 dataset demonstrate a 20-fold speedup compared to traditional local thresholding techniques. Then, this thesis presents a new perspective on document image cleanup. Instead of trying to explicitly detect and remove marginal noise, the approach focuses on locating the page frame, i.e., the actual page contents area. A geometric matching algorithm is presented to extract the page frame of a structured document. It is demonstrated that incorporating a page frame detection step into the document processing chain reduces OCR error rates from 4.3% to 1.7% (n=4,831,618 characters) on the UW-III dataset and layout-based retrieval error rates from 7.5% to 5.3% (n=815 documents) on the MARG dataset. The performance of six widely used page segmentation algorithms (x-y cut, smearing, whitespace analysis, constrained text-line finding, docstrum, and Voronoi) on the UW-III database is evaluated in this work using a state-of-the-art evaluation methodology.
It is shown that current evaluation scores are insufficient for diagnosing specific errors in page segmentation and fail to identify some classes of serious segmentation errors altogether. Thus, a vectorial score is introduced that is sensitive to, and identifies, the most important classes of segmentation errors (over-, under-, and mis-segmentation) and which page components (lines, blocks, etc.) are affected. Unlike previous schemes, this evaluation method has a canonical representation of ground-truth data and guarantees pixel-accurate evaluation results for arbitrary region shapes. Based on a detailed analysis of the errors made by different page segmentation algorithms, this thesis presents a novel combination of the line-based approach of Breuel with the area-based approach of Baird, which solves the over-segmentation problem of area-based approaches. This new approach achieves a mean text-line extraction error rate of 4.4% (n=878 documents) on the UW-III dataset, which is the lowest among the analyzed algorithms. This thesis also describes a simple, fast, and accurate system for document image zone classification that results from a detailed comparative analysis of the performance of features widely used in document analysis and content-based image retrieval. Using a novel combination of known algorithms, an error rate of 1.46% (n=13,811 zones) is achieved on the UW-III dataset, compared to a state-of-the-art system that reports an error rate of 1.55% (n=24,177 zones) using more complicated techniques. In addition to layout analysis of Roman script documents, this work also presents the first high-performance layout analysis method for Urdu script. For that purpose, a geometric text-line model for Urdu script is presented. It is shown that the method can accurately extract Urdu text-lines from documents with different layouts, such as prose books, poetry books, magazines, and newspapers.
Finally, this thesis presents a novel algorithm for probabilistic layout analysis that specifically addresses the needs of large-scale digitization projects. The presented approach models known page layouts as a structural mixture model. A probabilistic matching algorithm is presented that gives multiple interpretations of the input layout with associated probabilities. An algorithm based on A* search is presented for finding the most likely layout of a page, given its structural layout model. For training layout models, an EM-like algorithm is presented that is capable of learning the geometric variability of layout structures from data, without the need for page segmentation ground truth. Evaluation of the algorithm on documents from the MARG dataset shows an accuracy above 95% for geometric layout analysis.
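The first contribution listed in this abstract is a local adaptive thresholding algorithm whose run time does not depend on the window size. The abstract does not name the technique, so the sketch below is only one plausible way to obtain that property: it assumes a Sauvola-style threshold, T = m * (1 + k * (s / R - 1)), where the local mean m and standard deviation s are read from integral images (summed-area tables) in constant time per pixel. The function names and the parameter values `half`, `k`, and `r` are hypothetical.

```python
import numpy as np

def window_sums(img, half):
    """Sum of img over the (2*half+1)^2 window around each pixel, clipped at
    the image borders, together with the number of pixels in each window."""
    h, w = img.shape
    # Integral image with a zero row and column in front.
    ii = np.zeros((h + 1, w + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    ys, xs = np.arange(h), np.arange(w)
    y0, y1 = np.clip(ys - half, 0, h), np.clip(ys + half + 1, 0, h)
    x0, x1 = np.clip(xs - half, 0, w), np.clip(xs + half + 1, 0, w)
    # Four lookups per pixel, regardless of the window size.
    sums = (ii[np.ix_(y1, x1)] - ii[np.ix_(y0, x1)]
            - ii[np.ix_(y1, x0)] + ii[np.ix_(y0, x0)])
    area = (y1 - y0)[:, None] * (x1 - x0)[None, :]
    return sums, area

def sauvola_binarize(gray, half=7, k=0.2, r=128.0):
    """Return a boolean mask that is True for dark (foreground) pixels."""
    g = np.asarray(gray, dtype=np.float64)
    s1, area = window_sums(g, half)
    s2, _ = window_sums(g * g, half)
    mean = s1 / area
    # Clamp tiny negative values caused by floating-point rounding.
    std = np.sqrt(np.maximum(s2 / area - mean ** 2, 0.0))
    threshold = mean * (1.0 + k * (std / r - 1.0))
    return g <= threshold

# Toy page: light background with a single dark pixel.
page = np.full((20, 20), 200.0)
page[10, 10] = 0.0
mask = sauvola_binarize(page)
```

Because each window sum needs only four table lookups, enlarging `half` changes the result but not the amount of work per pixel, which is the property the abstract describes.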
This thesis deals with the visual control of spatial planning drafts. The starting point is the current procedure, the planning process, that leads to the creation of the drafts. The decision path to the final result currently still proceeds without computer support. Those involved in the planning process base their decisions on, for example, plans, their own experience, and statistics, and produce various drafts over the course of discussion rounds. This procedure is complex, because of the incoming data and the associated discussions, and lengthy, because a result is available only after several iterations. The thesis aims to lead the actors to a decision faster and in a more targeted way by means of computer support. My investigation of the application environment showed that this is possible only if, on the one hand, the resulting system is able to process the large, heterogeneous volumes of data and, on the other hand, the results are visualized in a form that the actors know from the existing planning process. The visualization must not make any evaluative statement; it must present the information from the analysis neutrally in a format familiar to the user. The informal part of decision-making presents itself as the starting point. Two approaches from the field of clustering algorithms are pursued to process and analyze the large volumes of data. As a result, the Voronoi diagram directly gives the actors a draft that reflects the assessments of all actors and that, when overlaid on the map of the planning area, corresponds to the classical format used in the planning process. This increases the acceptance of the computer support among the participants in the planning process.
If this draft does not meet with immediate approval, the developed information visualization can be used to display and subsequently adjust the input parameters, so that a new draft can be developed very quickly. The visualization takes over the role that plans produced on paper have played in the decision process so far, and thus also offers participants from outside the field a way to visually check the quality of the draft. Overall, the tool IKone supports the actors through a computer-based system that follows the standard procedures and visual representations.
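The abstract above mentions a Voronoi diagram that is overlaid on the map of the planning area. As a rough illustration of that idea only, the sketch below labels every cell of a raster map with the index of its nearest site, which yields a discrete Voronoi partition. How the thesis derives the sites from the actors' assessments is not stated, so the site coordinates, the raster size, and the function name are hypothetical.

```python
import numpy as np

def voronoi_labels(sites, width, height):
    """Label each raster cell (x, y) with the index of its nearest site."""
    sites = np.asarray(sites, dtype=float)          # shape (n_sites, 2) as (x, y)
    ys, xs = np.mgrid[0:height, 0:width]
    cells = np.stack((xs, ys), axis=-1).astype(float)  # shape (height, width, 2)
    # Squared Euclidean distance from every cell to every site.
    d2 = ((cells[:, :, None, :] - sites[None, None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=-1)                       # shape (height, width)

# Hypothetical planning area of 10 x 3 cells with two sites.
labels = voronoi_labels([(1, 1), (8, 1)], width=10, height=3)
```

The resulting label raster can be drawn as colored regions on top of a map image, which matches the overlay format the abstract describes.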
Computer-based simulation and visualization of the acoustics of a virtual scene can aid the design process of concert halls, lecture rooms, theaters, or living rooms, because not only the visual aspect of a room is important but also its acoustics. On factory floors, noise reduction is important because noise is hazardous to health. Despite the obvious dissimilarity between our aural and visual senses, many techniques required for the visualization of photo-realistic images and for the auralization of acoustic environments are quite similar. Both applications can be served by geometric methods such as particle and ray tracing if we neglect a number of less important effects. By simulating room acoustics, we want to predict the acoustic properties of a virtual model. For auralization, a pulse response filter needs to be assembled for each pair of source and listener positions. The convolution of this filter with an anechoic source signal provides the signal received at the listener position. Hence, the pulse response filter must contain all reverberations (echoes) of a unit pulse, including their frequency decompositions due to absorption at different surface materials. For the room acoustic simulation, a method is developed that is named phonon tracing because it is based on particles. The approach computes the energy or pressure decomposition for each particle (phonon) sent out from a sound source and uses this in a second pass (phonon collection) to construct the response filters for different listeners. This step can be performed at different precision levels. During the tracing step, particle paths and additional information are stored in a so-called phonon map. Using this map, several sound visualization approaches were developed. From the visualization, the effect of different materials on the spectral energy / pressure distribution can be observed. The first few reflections already show whether certain frequency bands are rapidly absorbed.
The absorbing materials can be identified and replaced in the virtual model, improving the overall acoustic quality of the simulated room. Furthermore, the visualization gives insight into the pressure / energy received at the listener position. The phonon tracing algorithm and several sound visualization approaches are integrated into a common system that uses Virtual Reality technologies to facilitate immersion into the virtual scene. The system is a prototype developed within a project at the University of Kaiserslautern and is still subject to further improvement. It consists of a stereoscopic back-projection system for visual rendering and professional audio equipment for auralization.
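The auralization step described in this abstract, convolving a pulse response filter with an anechoic source signal, can be sketched in a few lines. The example assumes a heavily simplified filter that is assembled from a list of arrival times and gains; the frequency-dependent absorption and the phonon collection pass of the described method are not modeled, and the function names and sample values are hypothetical.

```python
import numpy as np

def build_pulse_response(arrivals, fs, duration):
    """Assemble a filter from (delay in seconds, gain) pairs (simplified)."""
    h = np.zeros(int(round(duration * fs)))
    for delay_s, gain in arrivals:
        idx = int(round(delay_s * fs))
        if idx < h.size:        # ignore arrivals beyond the filter length
            h[idx] += gain
    return h

def auralize(anechoic, pulse_response):
    """Signal received at the listener: source convolved with the filter."""
    return np.convolve(anechoic, pulse_response, mode="full")

fs = 1000  # samples per second
# Direct sound plus one reflection arriving 3 ms later at half amplitude.
h = build_pulse_response([(0.0, 1.0), (0.003, 0.5)], fs, duration=0.005)
dry = np.array([1.0, 2.0, 3.0])  # toy anechoic source signal
wet = auralize(dry, h)
```

The output contains the dry signal followed by its delayed, attenuated copy, which is the echo contributed by the single reflection in this toy filter.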