The automatic analysis and retrieval of technical line drawings is hindered by many challenges such as: the large amount of contextual clutter around the symbols within the drawings, degradation, transformations on the symbols in drawings, large databases of drawings
and large alphabets of symbols. The core tasks required for the analysis of technical line
drawings are: symbol recognition, spotting and retrieval. The current systems for performing these tasks have poor performance due to the mentioned challenges. This dissertation
presents a number of methods that address these challenges. These methods achieve both
accurate and efficient symbol spotting and retrieval in technical line drawings, and perform
significantly better than state-of-the-art methods on the same problems. An overview of
the key contributions of this dissertation is given in the following.
First, this dissertation presents a geometric matching-based method for symbol recognition
and spotting. The method performs recognition in the presence of large amounts of contextual clutter, and provides precise localization of the recognized symbols. On standard
databases such as GREC-2005 and GREC-2011, the method achieves up to 10% higher
recall and up to 28% higher precision than state-of-the-art methods on the spotting task,
and achieves up to 7% higher recognition accuracy on the isolated recognition task. The
method is based on a geometric matching approach, which is flexible enough to incorporate
improvements on the matching strategy, feature types and information on the features. The
method also includes an adaptive preprocessing algorithm that deals with a wide variety
of noise types.
In order to improve the performance of the spotting method when dealing with degraded
drawings, two novel methods are presented in this dissertation. Both methods are based on
combining geometric matching with machine learning techniques. The geometric matching
is used to automatically generate training data that contain information on how well the
features of the queries are matched in both the true and the false matches found by the
spotting method. The first method learns the feature weights of the different query symbols
by linear discriminant analysis (LDA). The weighted query features are used in the spotting
method and result in 27% higher average precision than the original method, with a speedup
factor of 2. The second method uses SVM classification as a post-spotting step to distinguish
the true from the false matches in the spotting method. The use of the classification step
further improves the average precision of the spotting method by 20.6%.
This dissertation also presents methods for content analysis of line drawings. First, a
method for accurate and consistent detection (95.8%) of regions of interest (ROIs) is presented. The method is based on statistical feature grouping. The ROI-finding method is
identified as an important part of a symbol retrieval system: the better the detected ROIs, the higher the performance of a retrieval system. The ROI-finding method is also used to
improve the performance of the geometric-based spotting system.
Second, a symbol clustering method for building a compact and accurate representation of
a large database of technical drawings is presented. This method uses the output from the
ROI-finding method as input, and uses geometric matching as a similarity measure. The
method achieves high accuracy (90.1% recall, 94.3% precision) in forming clusters of symbols. The representatives of the clusters (34 symbols) are used as key entries to a symbol
index, which is identified as the outcome of an off-line stage of a symbol retrieval system.
Finally, an efficient and high performing large scale symbol retrieval system is presented
in this dissertation. The system follows the bag of visual words (BoVW) model, but uses
methods that are suited to line drawings. The system uses the symbol index to
represent a database of drawings. During the on-line query retrieval stage, the query is
analyzed by the ROI-finding method, matched with the key entries of the symbol index via
geometric matching, and finally, a spatial verification step is performed on the retrieved
matches. The system achieves a query lookup time that is independent of the size of the
database, and is instead dependent on the size of the symbol index. The system achieves up
to 10% higher recall and up to 28% higher precision than state-of-the-art spotting systems
on similar databases.
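The lookup behaviour described above can be illustrated with a minimal sketch of an inverted symbol index. This is not the dissertation's implementation: the names `build_symbol_index`, `retrieve`, and `similarity` are hypothetical, the similarity function stands in for geometric matching, and the spatial verification step is omitted.

```python
from collections import defaultdict


def build_symbol_index(drawings):
    """Off-line stage: map each key symbol to the drawings that contain it."""
    index = defaultdict(set)
    for drawing_id, symbols in drawings.items():
        for symbol in symbols:
            index[symbol].add(drawing_id)
    return index


def retrieve(index, query_rois, similarity, threshold=0.8):
    """On-line stage: compare each query ROI with the key entries only."""
    hits = set()
    for roi in query_rois:
        for key in index:  # loop length is the index size, not the database size
            if similarity(roi, key) >= threshold:
                hits |= index[key]
    return hits


def exact_match(a, b):
    """Hypothetical stand-in for geometric matching (assumption)."""
    return 1.0 if a == b else 0.0


drawings = {"d1": ["resistor", "valve"], "d2": ["valve"], "d3": ["door"]}
index = build_symbol_index(drawings)
print(sorted(retrieve(index, ["valve"], exact_match)))
```

Because the query is compared only with the key entries, adding more drawings that reuse the same symbols enlarges the posting sets but not the number of matching operations.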
Overall, these contributions are major advancements in the research of graphics recognition.
The hope is that such contributions provide the basis for the development of reliable and
accurate applications for browsing, querying, or classification of line drawings
for the benefit of end users.
Generic layout analysis, the process of decomposing a document image into homogeneous regions for a collection of diverse document images, has many important applications in document image analysis and understanding, such as preprocessing of degraded, warped, camera-captured document images, high performance layout analysis of document images containing complex cursive scripts, and word spotting in historical document images at page level. Many goals in this field, such as a generic text line extraction method, have so far remained elusive and are still beyond the reach of state-of-the-art methods [NJ07, LSZT07, KB06]. This thesis addresses this problem by presenting generic, domain-independent methods for text line extraction and for text and non-text segmentation, and then describes some important applications that were developed based on these methods. An overview of the key contributions of this thesis is as follows.
The first part of this thesis presents a generic text line extraction method using a combination of matched filtering and ridge detection techniques, which are commonly used in computer vision. Unlike the state-of-the-art text line extraction methods in the literature, the generic text line extraction method can be equally and robustly applied to a large variety of document image classes including scanned and camera-captured documents, binary and grayscale documents, typed-text and handwritten documents, historical and contemporary documents, and documents containing different scripts. Different standard datasets are selected for performance evaluation that belong to different categories of document images such as the UW-III [GHHP97] dataset of scanned documents, the ICDAR 2007 [GAS07] and the UMD [LZDJ08] datasets of handwritten documents, the DFKI-I [SB07] dataset of camera-captured documents, Arabic/Urdu script documents dataset, and German calligraphic (Fraktur) script historical documents dataset. The generic text line extraction method achieves 86% (n = 23,763 text lines in 650 documents) text line detection accuracy which is better than the aggregate accuracy of 73% of the best performing domain-specific state-of-the-art methods. To the best of the author's knowledge, it is the first general-purpose text line extraction method that can be equally used for a diverse collection of documents.
This thesis also presents an active contour (snake) based curled text line extraction method for warped, camera-captured document images. The presented approach is applied to the DFKI-I [SB07] dataset of camera-captured, Latin script document images for curled text line extraction. It achieves above 95% (n = 3,091 text lines in 102 documents) text line detection accuracy, which is significantly better than the competing state-of-the-art curled text line extraction methods. With small modifications, the presented text line extraction method can also be applied to document images containing different scripts such as Chinese, Devanagari, and Arabic.
The second part of this thesis presents an improved version of the state-of-the-art multiresolution morphology (Leptonica) based text and non-text segmentation method [Blo91], which is a domain-independent page segmentation approach and can be equally applied to a diverse collection of binarized document images. It is demonstrated that the presented improvements result in an increase in segmentation accuracy from 93% to 99% (n = 113 documents).
This thesis also introduces a discriminative learning based approach for page segmentation, where a self-tunable multi-layer perceptron (MLP) classifier [BS10] is trained for distinguishing between text and non-text connected components. Unlike other classification based page segmentation approaches in the literature, the presented connected component based approach is faster than pixel based classification methods and does not require a block segmentation method beforehand. A segmentation accuracy of 96% (n = 113 documents) is achieved, compared to the state-of-the-art multiresolution morphology (Leptonica) based page segmentation method [Blo91], which achieves a segmentation accuracy of 93%. In addition to text and non-text segmentation of Latin script documents, the presented approach can also be adapted for document images containing other scripts as well as for other specialized layout analysis tasks such as digit and non-digit segmentation [HBSB12], orientation detection [RBSB09], and body-text and side-note segmentation [BAESB12].
Finally, this thesis presents important applications of the two generic layout analysis techniques discussed above: the ridge-based text line extraction method and the multi-resolution morphology based text and non-text segmentation method. First, a complete preprocessing pipeline is described for removing different types of degradations from grayscale warped, camera-captured document images. It includes removal of grayscale degradations such as non-uniform shadows and blurring through binarization, noise cleanup by applying page frame detection, and document rectification using monocular dewarping. Each of these preprocessing steps shows significant improvement in comparison to the analyzed state-of-the-art methods in the literature. Second, a high performance layout analysis method is described for complex Arabic script document images written in different languages, such as Arabic, Urdu, and Persian, and in different styles, for example Naskh and Nastaliq. The presented layout analysis system is robust against different types of document image degradations and shows better performance for text and non-text segmentation, text line extraction, and reading order determination on a variety of Arabic and Urdu document images as compared to the state-of-the-art methods. It can be used for large scale digitization of Arabic and Urdu documents. These applications demonstrate that the layout analysis methods, ridge-based text line extraction and the multi-resolution morphology based text and non-text segmentation, are generic and can be applied easily to a large collection of diverse document images.
The safety of embedded systems is becoming more and more important nowadays. Fault Tree Analysis (FTA) is a widely used technique for analyzing the safety of embedded systems. A standardized tree-like structure called a Fault Tree (FT) models the failures of the systems. The Component Fault Tree (CFT) provides an advanced modeling concept for adapting the traditional FTs to the hierarchical architecture model in system design. Minimal Cut Set (MCS) analysis is a method that works for qualitative analysis based on the FTs. Each MCS represents a minimal combination of component failures of a system called basic events, which may together cause the top-level system failure. The ordinary representations of MCSs consist of plain text and data tables with little additional supporting visual and interactive information. Importance analysis based on FTs or CFTs estimates the contribution of each potential basic event to a top-level system failure. The resulting importance values of basic events are typically represented in summary views, e.g., data tables and histograms. There is little visual integration between these forms and the FT (or CFT) structure. The safety of a system can be improved using an iterative process, called the safety improvement process, based on FTs taking relevant constraints into account, e.g., cost. Typically, relevant data regarding the safety improvement process are presented across multiple views with few interactive associations. In short, the ordinary representation concepts cannot effectively facilitate these analyses.
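To make the notion of a minimal cut set concrete, the following is a minimal sketch of how MCSs can be derived from a small fault tree by top-down expansion. The tree encoding (nested `("AND" | "OR", [children])` tuples with basic events as strings) and the function names are assumptions made for illustration, and the brute-force expansion is only suitable for small trees.

```python
from itertools import product


def cut_sets(node):
    """Return all cut sets (not yet minimal) of a fault tree node."""
    if isinstance(node, str):  # basic event
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":  # any child's cut set triggers the gate
        return set().union(*child_sets)
    if gate == "AND":  # one cut set from every child is needed
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate type: {gate}")


def minimal_cut_sets(node):
    """Drop every cut set that strictly contains another cut set."""
    sets = cut_sets(node)
    return {s for s in sets if not any(other < s for other in sets)}


# Hypothetical top event: (a OR b) AND (a OR c)
tree = ("AND", [("OR", ["a", "b"]), ("OR", ["a", "c"])])
for mcs in sorted(minimal_cut_sets(tree), key=sorted):
    print(sorted(mcs))
```

For this tree, the expansion yields the cut sets {a}, {a, b}, {a, c}, and {b, c}; the two sets that contain {a} are not minimal and are discarded.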
We propose a set of visualization approaches that address the issues mentioned above and thereby facilitate these analyses.
1. To support the MCS analysis, we propose a matrix-based visualization that allows detailed data of the MCSs of interest to be viewed while maintaining a satisfactory overview of a large number of MCSs for effective navigation and pattern analysis. Engineers can also intuitively analyze the influence of MCSs of a CFT.
2. To facilitate the importance analysis based on the CFT, we propose a hybrid visualization approach that combines the icicle-layout-style architectural views with the CFT structure. This approach helps to identify the vulnerable components, taking the hierarchies of the system architecture into account, and to investigate the logical failure propagation of the important basic events.
3. We propose a visual safety improvement process that integrates an enhanced decision tree with a scatter plot. This approach allows one to visually investigate the detailed data related to individual steps of the process while maintaining the overview of the process. The approach helps to construct and analyze solutions for improving the safety of a system.
Using our visualization approaches, the MCS analysis, the importance analysis, and the safety improvement process based on the CFT can be facilitated.
Predicting secondary structures of RNA molecules is one of the fundamental problems of, and thus a challenging task in, computational structural biology. Existing prediction methods basically use the dynamic programming principle and are based either on a general thermodynamic model or on a specific probabilistic model, traditionally realized by a stochastic context-free grammar. To date, the applied grammars have been rather simple and small, and although statistical approaches have become increasingly appreciated over the past years, a corresponding sampling algorithm based on a stochastic RNA structure model has not yet been devised. In addition, basically all popular state-of-the-art tools for computational structure prediction have the same worst-case time and space requirements of O(n^3) and O(n^2) for sequence length n, limiting their applicability for practical purposes due to the often quite large sizes of native RNA molecules. Accordingly, the prime demand imposed by biologists on computational prediction procedures is a reduced waiting time for results that are not significantly less accurate.
We here deal with all of these issues, by describing algorithms and performing comprehensive studies that are based on sophisticated stochastic context-free grammars of similar complexity as those underlying thermodynamic prediction approaches, where all of our methods indeed make use of the concept of sampling. We also employ the approximation technique known from theoretical computer science in order to reach a heuristic worst-case speedup for RNA folding.
Particularly, we start by describing a way for deriving a sequence-independent random sampler for an arbitrary class of RNAs by means of (weighted) unranking. The resulting algorithm may generate any secondary structure of a given fixed size n in only O(n·log(n)) time, where the results are observed to be accurate, validating its practical applicability.
With respect to RNA folding, we present a novel probabilistic sampling algorithm that generates statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method actually samples the possible foldings from a distribution implied by a suitable (traditional or length-dependent) grammar. Notably, we also propose several (new) ways for obtaining predictions from generated samples. Both variants have the same worst-case time and space complexities of O(n^3) and O(n^2) for sequence length n. Nevertheless, evaluations of our sampling methods show that they are actually capable of producing accurate (prediction) results.
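The abstract does not spell out how predictions are obtained from a sample, so the following is only an illustrative sketch of two generic options: picking the most frequently sampled structure, and estimating base pair frequencies. The dot-bracket encoding, the toy sample, and all function names are assumptions, not the thesis's methods.

```python
from collections import Counter


def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def most_frequent_structure(sample):
    """Predict the structure that occurs most often in the sample."""
    return Counter(sample).most_common(1)[0][0]


def pair_frequencies(sample):
    """Estimate, for each base pair, the fraction of samples containing it."""
    counts = Counter(p for s in sample for p in base_pairs(s))
    return {pair: count / len(sample) for pair, count in counts.items()}


# Toy sample of four structures for a sequence of length 6 (made-up data)
sample = ["((..))", "((..))", "(....)", "......"]
print(most_frequent_structure(sample))
print(pair_frequencies(sample))
```

Both estimates are computed in time linear in the total length of the sampled structures, so the cost of generating the sample dominates.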
In an attempt to resolve the long-standing problem of reducing the time complexity of RNA folding algorithms without sacrificing much of the accuracy of the results, we invented an innovative heuristic statistical sampling method that can be implemented to require only O(n^2) time for generating a fixed-size sample of candidate structures for a given sequence of length n. Since a reasonable prediction can still be obtained efficiently from the generated sample set, this approach reduces the worst-case time complexity by a linear factor compared to all existing precise methods. Notably, we also propose a novel (heuristic) sampling strategy as an alternative to the one commonly applied for statistical sampling, which may produce more accurate results for particular settings. A validation of our heuristic sampling approach by comparison to several leading RNA secondary structure prediction tools indicates that it is capable of producing competitive predictions, but may require the consideration of large sample sizes.
Dealing with information in modern times requires users to cope with hundreds of thousands of documents, such as articles, emails, Web pages, or news feeds.
Above all information sources, the World Wide Web presents information seekers with great challenges.
It offers more text in natural language than anyone is capable of reading.
The key idea of this research is to provide users with adaptable filtering techniques, supporting them in filtering out the specific information items they need.
Its realization focuses on developing an Information Extraction system,
which adapts to a domain of concern, by interpreting the contained formalized knowledge.
Utilizing the Resource Description Framework (RDF), which is the Semantic Web's formal language for exchanging information,
allows extending information extractors to incorporate the given domain knowledge.
Because of this, formal information items from the RDF source can be recognized in the text.
The application of RDF allows a further investigation of operations on recognized information items, such as disambiguating and rating the relevance of these.
Switching between different RDF sources allows changing the application scope of the Information Extraction system from one domain of concern to another.
An RDF-based Information Extraction system can be triggered to extract specific kinds of information entities by providing it with formal RDF queries in terms of the SPARQL query language.
Representing extracted information in RDF extends the coverage of the Semantic Web's information degree and provides a formal view on a text from the perspective of the RDF source.
In detail, this work presents the extension of existing Information Extraction approaches by incorporating the graph-based nature of RDF.
Hereby, the pre-processing of RDF sources allows extracting statistical information models dedicated to support specific information extractors.
These information extractors refine standard extraction tasks, such as the Named Entity Recognition, by using the information provided by the pre-processed models.
The post-processing of extracted information items enables representing these results in RDF format or lists, which can now be ranked or filtered by relevance.
Post-processing also comprises the enrichment of originating natural language text sources with extracted information items by using annotations in RDFa format.
The results of this research extend the state-of-the-art of the Semantic Web.
This work contributes approaches for computing customizable and adaptable RDF views on the natural language content of Web pages.
Finally, due to the formal nature of RDF, machines can interpret these views allowing developers to process the contained information in a variety of applications.
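As a rough illustration of how an RDF source can drive the recognition of information items in text, the sketch below builds a label gazetteer from `rdfs:label` triples and looks the labels up in a sentence. It is a simplified assumption, not the system described here: triples are plain Python tuples rather than a parsed RDF graph, the `ex:` identifiers are made up, and disambiguation and relevance rating are left out.

```python
import re

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def build_gazetteer(triples):
    """Pre-processing: map each lower-cased label to the resources it names."""
    gazetteer = {}
    for subject, predicate, obj in triples:
        if predicate == RDFS_LABEL:
            gazetteer.setdefault(obj.lower(), set()).add(subject)
    return gazetteer


def recognize(text, gazetteer):
    """Return (start, end, resources) for every label occurrence in the text."""
    found = []
    lowered = text.lower()
    for label, resources in gazetteer.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, lowered):
            found.append((match.start(), match.end(), sorted(resources)))
    return sorted(found)


# Hypothetical domain knowledge; swapping this list changes the domain.
triples = [
    ("ex:Berlin", RDFS_LABEL, "Berlin"),
    ("ex:Germany", RDFS_LABEL, "Germany"),
    ("ex:Berlin", "ex:capitalOf", "ex:Germany"),
]
print(recognize("Berlin is the capital of Germany.", build_gazetteer(triples)))
```

A label shared by several resources yields several candidates for the same span, which is where a disambiguation step would come in.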
In automotive electronics, there is a growing number of driver assistance systems that, beyond providing warnings, support the driver in the driving task through autonomous active intervention. This places high demands on the functional safety of these systems in order to guarantee flawless behavior in all driving situations and to avoid or mitigate safety-critical situations. The functional safety of such driver assistance systems must be increased and ensured, among other things, through adequate test methods and their efficient use within the established industrial development processes.
This thesis provides an overview of existing scientific and industrial approaches to testing automotive electronics, as well as of active driver assistance systems. The focus is on those systems that obtain information about their environment from camera sensors. Specific requirements are derived from the challenge of ensuring the functional validation of such safety-critical systems. The "delta" between these requirements and the state of the art reveals a need for action: to research new methods, together with the procedures and tools needed to apply them, or to extend existing ones.
The "Visual Loop Test" method is presented for this purpose. Through the use of so-called graphics engines, it can be employed as a new component of the test technologies used for validation. Photorealistic graphics are generated to stimulate the assistance systems. The new procedures required for the efficient use of these technologies, for describing and generating test cases in a visually representable format, are developed.
As a result, modern assistance functions can be developed more efficiently, more reliably, more safely, and more cost-effectively at the same time, and road safety is increased. A first empirical evaluation within the prototype implementation supports this assessment.
In urban planning, both measuring and communicating sustainability are among the most recent concerns. Therefore, the primary emphasis of this thesis concerns establishing metrics and visualization techniques in order to deal with indicators of sustainability.
First, this thesis provides a novel approach for measuring and monitoring two indicators of sustainability - urban sprawl and carbon footprints – at the urban neighborhood scale. By designating different sectors of relevant carbon emissions as well as different household categories, this thesis provides detailed information about carbon emissions in order to estimate impacts of daily consumption decisions and travel behavior by household type. Regarding urban sprawl, a novel gridcell-based indicator model is established, based on different dimensions of urban sprawl.
Second, this thesis presents a three-step-based visualization method, addressing predefined requirements for geovisualizations and visualizing those indicator results, introduced above. This surface-visualization combines advantages from both common GIS representation and three-dimensional representation techniques within the field of urban planning, and is assisted by a web-based graphical user interface which allows for accessing the results by the public.
In addition, by focusing on local neighborhoods, this thesis provides an alternative approach in measuring and visualizing both indicators by utilizing a Neighborhood Relation Diagram (NRD), based on weighted Voronoi diagrams. Thus, the user is able to a) utilize original census data, b) compare direct impacts of indicator results on the neighboring cells, and c) compare both indicators of sustainability visually.
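The abstract does not state which weighting scheme underlies the Neighborhood Relation Diagram. As a hedged illustration of the general idea of a weighted Voronoi diagram, the sketch below assumes a multiplicatively weighted variant, in which a location belongs to the site with the smallest distance divided by weight. The site names, coordinates, and weights are made up.

```python
import math


def weighted_voronoi_cell(point, sites):
    """Return the site whose weighted distance to the point is smallest.

    sites maps a site name to ((x, y), weight); a larger weight enlarges
    the cell (multiplicative weighting, an assumption of this sketch).
    """
    def weighted_distance(item):
        (x, y), weight = item[1]
        return math.hypot(point[0] - x, point[1] - y) / weight

    return min(sites.items(), key=weighted_distance)[0]


# Two hypothetical neighborhoods; B carries three times the weight of A.
sites = {"A": ((0.0, 0.0), 1.0), "B": ((10.0, 0.0), 3.0)}
print(weighted_voronoi_cell((4.0, 0.0), sites))
print(weighted_voronoi_cell((1.0, 0.0), sites))
```

With equal weights, the location (4, 0) would fall into the cell of A; the higher weight of B pulls the cell boundary toward A.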
Today, polygonal models occur everywhere in graphical applications, since they are easy
to render and to compute, and a very large set of tools exists for generating and
manipulating polygonal data. But modern scanning devices that allow high-quality,
large-scale acquisition of complex real-world models often deliver a large set of
points as the resulting data structure of the scanned surface. A direct triangulation of those
point clouds does not always result in good models. They often contain problems like
holes, self-intersections and non-manifold structures. Also, one often loses important
surface structures like sharp corners and edges during a usual surface reconstruction.
So it is reasonable to stay a little longer in the point-based world to analyze the point cloud
data with respect to such features and apply a surface reconstruction method afterwards
that is known to construct continuous and smooth surfaces and extend it to reconstruct
The recognition of patterns and structures has gained importance for dealing with the growing amount of data being generated by sensors and simulations. Most existing methods for pattern recognition are tailored for scalar data and non-correlated data of higher dimensions. The recognition of general patterns in flow structures is possible, but not yet practically usable, due to the high computation effort. The main goal of this work is to present methods for comparative visualization of flow data, amongst others, based on a new method for efficient pattern recognition on flow data. This work is structured in three parts: At first, a known feature-based approach for pattern recognition on flow data, the Clifford convolution, has been applied to color edge detection, and been extended to non-uniform grids. However, this method is still computationally expensive for a general pattern recognition, since the recognition algorithm has to be applied for numerous different scales and orientations of the query pattern. A more efficient and accurate method for pattern recognition on flow data is presented in the second part. It is based upon a novel mathematical formulation of moment invariants for flow data. The common moment invariants for pattern recognition are not applicable on flow data, since they are only invariant on non-correlated data. Because of the spatial correlation of flow data, the moment invariants had to be redefined with different basis functions to satisfy the demands for an invariant mapping of flow data. The computation of the moment invariants is done by a multi-scale convolution of the complete flow field with the basis functions. This pre-processing computation time almost equals the time for the pattern recognition of one single general pattern with the former algorithms. However, after having computed the moments once, they can be indexed and used as a look-up-table to recognize any desired pattern quickly and interactively. 
This results in a flexible and easy-to-use tool for the analysis of patterns in 2D flow data. For an improved rendering of the recognized features, an importance-driven streamline algorithm has been developed. The density of the streamlines can be adjusted by using importance maps. The result of a pattern recognition can be used as such a map, for example. Finally, new comparative flow visualization approaches utilizing the streamline approach, the flow pattern matching, and the moment invariants are presented.
In function-oriented testing of electronic control units in the automotive domain, expert knowledge is indispensable because of the high complexity of the test cases. With basic test techniques such as boundary value analysis, the intent of a test case is given implicitly by the technique. For test cases based on expert knowledge, however, a prose text is currently written for each test case to state the test intent. This prose description is prone to ambiguity, differs from one test developer to another, and is only loosely tied to the content of the test case. The goal of this thesis is to design a specification language for test case description that minimizes the disadvantages of natural language and defines language elements specific to the test sequence, so that it can serve as a skeleton for a test case. To this end, language elements for the description are derived from the deployment environment (system specification, test implementation, and test process topics), and approaches for transferring the description into the test implementation are considered. The result is a test case specification language that rests on a formal foundation and can, among other things, be transformed into a graphical view. As with UML, the added value only becomes apparent with tool-supported input: test developers are then able to define uniform, formal, reusable, and comprehensible test cases.