Predicting the secondary structure of RNA molecules is a fundamental and challenging problem in computational structural biology. Existing prediction methods are essentially based on dynamic programming and rely either on a general thermodynamic model or on a specific probabilistic model, traditionally realized by a stochastic context-free grammar. To date, the applied grammars have been rather simple and small, and although statistical approaches have become increasingly appreciated in recent years, a corresponding sampling algorithm based on a stochastic RNA structure model has not yet been devised. In addition, essentially all popular state-of-the-art tools for computational structure prediction have the same worst-case requirements of O(n^3) time and O(n^2) space for sequence length n, which limits their practical applicability because native RNA molecules are often quite large. Accordingly, the prime demand that biologists place on computational prediction procedures is a shorter waiting time for results that are not significantly less accurate.
Here we address all of these issues by describing algorithms and performing comprehensive studies based on sophisticated stochastic context-free grammars whose complexity is similar to that of the models underlying thermodynamic prediction approaches; all of our methods make use of the concept of sampling. We also employ the approximation technique known from theoretical computer science to obtain a heuristic worst-case speedup for RNA folding.
In particular, we start by describing a way to derive a sequence-independent random sampler for an arbitrary class of RNAs by means of (weighted) unranking. The resulting algorithm can generate any secondary structure of a given fixed size n in only O(n·log(n)) time, and the results are observed to be accurate, which validates its practical applicability.
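To illustrate the general idea of unranking, the following Python sketch counts all secondary structures of size n and maps a rank to a dot-bracket string, so that drawing a random rank yields a uniformly random structure. It is a minimal sketch under stated assumptions, not the sampler developed in this work: it is unweighted, it uses the simple combinatorial model in which a hairpin loop contains at least `min_loop` unpaired positions (an assumed parameter), and it needs quadratically many arithmetic operations rather than O(n·log(n)) time. All function names are hypothetical.

```python
import random


def count_table(n, min_loop=1):
    """s[m] = number of secondary structures on m positions (assumed model)."""
    s = [1] * (n + 1)
    for m in range(1, n + 1):
        total = s[m - 1]  # first position unpaired
        # first position paired with position j (1-based), enclosing j - 2 positions
        for j in range(min_loop + 2, m + 1):
            total += s[j - 2] * s[m - j]
        s[m] = total
    return s


def unrank(rank, n, s, min_loop=1):
    """Return structure number `rank` (0-based) of size n in dot-bracket notation.

    Recursion depth grows with n, so this sketch is meant for moderate sizes.
    """
    if n == 0:
        return ""
    if rank < s[n - 1]:
        return "." + unrank(rank, n - 1, s, min_loop)
    rank -= s[n - 1]
    for j in range(min_loop + 2, n + 1):
        block = s[j - 2] * s[n - j]
        if rank < block:
            inner, outer = divmod(rank, s[n - j])
            return ("(" + unrank(inner, j - 2, s, min_loop) + ")"
                    + unrank(outer, n - j, s, min_loop))
        rank -= block
    raise ValueError("rank out of range")


def sample_uniform_structure(n, rng, min_loop=1):
    """Draw one structure of size n uniformly at random via a random rank."""
    s = count_table(n, min_loop)
    return unrank(rng.randrange(s[n]), n, s, min_loop)
```

A weighted variant would replace the counts by accumulated weights, so that structures are drawn according to a learned distribution instead of uniformly.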
With respect to RNA folding, we present a novel probabilistic sampling algorithm that generates statistically representative and reproducible samples of the entire ensemble of feasible structures for a particular input sequence. This method samples the possible foldings from a distribution implied by a suitable (traditional or length-dependent) grammar. Notably, we also propose several new ways of obtaining predictions from generated samples. Both variants have the same worst-case complexities of O(n^3) time and O(n^2) space for sequence length n. Nevertheless, evaluations of our sampling methods show that they are capable of producing accurate prediction results.
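The general pattern behind such a sampler can be sketched in Python as follows: a table of accumulated weights is filled in O(n^3) time and O(n^2) space, and structures are then drawn by a stochastic traceback through that table. This is only an illustrative sketch, not the grammar or algorithm of this work. It assumes a toy model in which every position is either unpaired (weight 1) or paired with a later position, and the pair weights in `PAIR_WEIGHT` as well as the minimum hairpin size `min_loop` are invented for the example.

```python
import random

# Assumed toy weights for canonical base pairs (not trained parameters).
PAIR_WEIGHT = {
    ("G", "C"): 3.0, ("C", "G"): 3.0,
    ("A", "U"): 2.0, ("U", "A"): 2.0,
    ("G", "U"): 1.0, ("U", "G"): 1.0,
}


def inside_table(seq, min_loop=3):
    """Z[i][j] = total weight of all structures on the half-open interval seq[i:j]."""
    n = len(seq)
    Z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            total = Z[i + 1][j]  # position i unpaired
            for k in range(i + min_loop + 1, j):  # position i paired with k
                w = PAIR_WEIGHT.get((seq[i], seq[k]), 0.0)
                if w:
                    total += w * Z[i + 1][k] * Z[k + 1][j]
            Z[i][j] = total
    return Z


def sample_structure(seq, Z, rng, min_loop=3):
    """Draw one dot-bracket structure with probability proportional to its weight."""
    out = ["."] * len(seq)
    stack = [(0, len(seq))]
    while stack:
        i, j = stack.pop()
        while i < j:
            r = rng.random() * Z[i][j] - Z[i + 1][j]
            if r < 0:  # position i stays unpaired
                i += 1
                continue
            chosen = None
            for k in range(i + min_loop + 1, j):
                w = PAIR_WEIGHT.get((seq[i], seq[k]), 0.0)
                if not w:
                    continue
                chosen = k  # last feasible partner doubles as rounding fallback
                r -= w * Z[i + 1][k] * Z[k + 1][j]
                if r < 0:
                    break
            if chosen is None:  # only reachable through floating-point rounding
                i += 1
                continue
            out[i], out[chosen] = "(", ")"
            stack.append((i + 1, chosen))  # enclosed region is sampled later
            i = chosen + 1                 # continue to the right of the pair
    return "".join(out)
```

Passing a seeded `random.Random` instance makes the generated sample reproducible; a prediction could then be derived from the sample, for example by taking the most frequently sampled structure.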
In an attempt to resolve the long-standing problem of reducing the time complexity of RNA folding algorithms without sacrificing much accuracy, we devised a heuristic statistical sampling method that can be implemented to require only O(n^2) time for generating a fixed-size sample of candidate structures for a given sequence of length n. Since a reasonable prediction can still be obtained efficiently from the generated sample set, this approach reduces the worst-case time complexity by a linear factor compared to all existing exact methods. Notably, we also propose a novel (heuristic) sampling strategy, as opposed to the common one typically applied for statistical sampling, which may produce more accurate results for particular settings. A validation of our heuristic sampling approach by comparison to several leading RNA secondary structure prediction tools indicates that it is capable of producing competitive predictions, but may require large sample sizes.
Dealing with information in modern times requires users to cope with hundreds of thousands of documents, such as articles, emails, Web pages, or news feeds.
Among all information sources, the World Wide Web in particular presents information seekers with great challenges.
It offers more natural language text than anyone is capable of reading.
The key idea of this research is to provide users with adaptable filtering techniques that support them in filtering out the specific information items they need.
Its realization focuses on developing an Information Extraction system that adapts to a domain of concern by interpreting the formalized knowledge available for that domain.
Utilizing the Resource Description Framework (RDF), the Semantic Web's formal language for exchanging information, allows information extractors to be extended to incorporate the given domain knowledge.
Because of this, formal information items from the RDF source can be recognized in the text.
The application of RDF allows further investigation of operations on recognized information items, such as disambiguating them and rating their relevance.
Switching between different RDF sources allows changing the application scope of the Information Extraction system from one domain of concern to another.
An RDF-based Information Extraction system can be triggered to extract specific kinds of information entities by providing it with formal RDF queries expressed in the SPARQL query language.
Representing extracted information in RDF extends the information coverage of the Semantic Web and provides a formal view on a text from the perspective of the RDF source.
In detail, this work presents the extension of existing Information Extraction approaches by incorporating the graph-based nature of RDF.
Here, the pre-processing of RDF sources allows extracting statistical information models dedicated to supporting specific information extractors.
These information extractors refine standard extraction tasks, such as Named Entity Recognition, by using the information provided by the pre-processed models.
The post-processing of extracted information items enables representing these results in RDF format or as lists, which can then be ranked or filtered by relevance.
Post-processing also comprises the enrichment of the original natural language text sources with extracted information items by using annotations in RDFa format.
The results of this research extend the state of the art of the Semantic Web.
This work contributes approaches for computing customizable and adaptable RDF views on the natural language content of Web pages.
Finally, due to the formal nature of RDF, machines can interpret these views, allowing developers to process the contained information in a variety of applications.
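The idea of recognizing formal information items from an RDF source in natural language text can be illustrated with a small Python sketch. It is a deliberately simplified stand-in, not the system described in this work: the triples, the `ex:` resource names, and both helper functions are hypothetical, the knowledge base is a plain list of tuples rather than a parsed RDF graph, and the `wanted_type` filter merely plays the role that a SPARQL query would play in a real setting.

```python
import re

# Hypothetical toy knowledge base as (subject, predicate, object) triples.
TRIPLES = [
    ("ex:Berlin", "rdf:type", "ex:City"),
    ("ex:Berlin", "rdfs:label", "Berlin"),
    ("ex:Paris", "rdf:type", "ex:City"),
    ("ex:Paris", "rdfs:label", "Paris"),
    ("ex:TimBL", "rdf:type", "ex:Person"),
    ("ex:TimBL", "rdfs:label", "Tim Berners-Lee"),
]


def build_gazetteer(triples, wanted_type=None):
    """Map each label to its resource, optionally restricted to one type."""
    types = {s: o for s, p, o in triples if p == "rdf:type"}
    gazetteer = {}
    for s, p, o in triples:
        if p == "rdfs:label" and (wanted_type is None or types.get(s) == wanted_type):
            gazetteer[o] = s
    return gazetteer


def recognize(text, gazetteer):
    """Return sorted (start, end, resource) spans where a label occurs in the text."""
    hits = []
    for label, resource in gazetteer.items():
        for match in re.finditer(r"\b" + re.escape(label) + r"\b", text):
            hits.append((match.start(), match.end(), resource))
    return sorted(hits)
```

Swapping `TRIPLES` for a different list changes the domain the recognizer covers, which mirrors the switching between RDF sources described above; disambiguation and relevance rating are not covered by this sketch.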
In automotive electronics, there is a noticeable increase in driver assistance systems that, beyond issuing warnings, support the driver in the driving task through autonomous active intervention. This places high demands on the functional safety of these systems, in order to guarantee correct behavior in all driving situations and to avoid or defuse safety-critical situations. The functional safety of such driver assistance systems must be increased and ensured, among other things, by adequate test methods and by handling them efficiently within established industrial development processes.
This thesis gives an overview of existing scientific and industrial approaches to testing automotive electronics, as well as of active driver assistance systems. The focus is on systems that obtain information about their surroundings from camera sensors. Specific requirements are derived from the challenge of ensuring the functional validation of such safety-critical systems. The "delta" between these requirements and the state of the art reveals a need for action: researching new methods, together with the procedures and tools needed to apply them, or extending existing ones.
The "Visual Loop Test" method is presented for this purpose. By using so-called graphics engines, it can be employed as a new component of the test technologies used for validation. Photorealistic graphics are generated to stimulate the assistance systems. The new procedures required to apply these technologies efficiently, for describing and generating test cases in a visually representable format, are developed.
As a result, modern assistance functions can be developed more efficiently, more reliably, more safely, and more cost-effectively at the same time, and road safety is increased. A first empirical evaluation carried out as part of the prototype implementation supports this assessment.
In urban planning, both measuring and communicating sustainability are among the most recent concerns. The primary emphasis of this thesis is therefore on establishing metrics and visualization techniques for dealing with indicators of sustainability.
First, this thesis provides a novel approach for measuring and monitoring two indicators of sustainability, urban sprawl and carbon footprints, at the urban neighborhood scale. By designating different sectors of relevant carbon emissions as well as different household categories, this thesis provides detailed information about carbon emissions in order to estimate the impacts of daily consumption decisions and travel behavior by household type. Regarding urban sprawl, a novel gridcell-based indicator model is established, based on different dimensions of urban sprawl.
Second, this thesis presents a three-step visualization method that addresses predefined requirements for geovisualizations and visualizes the indicator results introduced above. This surface visualization combines the advantages of common GIS representations and three-dimensional representation techniques within the field of urban planning, and is supported by a web-based graphical user interface that gives the public access to the results.
In addition, by focusing on local neighborhoods, this thesis provides an alternative approach to measuring and visualizing both indicators by utilizing a Neighborhood Relation Diagram (NRD), based on weighted Voronoi diagrams. Thus, the user is able to a) utilize original census data, b) compare direct impacts of indicator results on the neighboring cells, and c) compare both indicators of sustainability visually.
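As a rough illustration of the weighted Voronoi diagrams mentioned above, the following Python sketch assigns every cell of a regular grid to the site with the smallest weighted distance. It is only a sketch under assumptions: the abstract does not state which weighting scheme the NRD uses, so a multiplicatively weighted distance (Euclidean distance divided by the site weight) is assumed here, and the function name and the site format are hypothetical.

```python
import math


def weighted_voronoi(sites, width, height):
    """Assign each grid cell to a site.

    sites: list of (x, y, weight) tuples with weight > 0.
    Returns grid[y][x] = index of the site with the smallest distance / weight.
    """
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            owner = min(
                range(len(sites)),
                key=lambda i: math.hypot(x - sites[i][0], y - sites[i][1]) / sites[i][2],
            )
            row.append(owner)
        grid.append(row)
    return grid
```

With equal weights this reduces to an ordinary Voronoi diagram; a larger weight, for instance one derived from an indicator value, enlarges the region of the corresponding site.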
Today, polygonal models occur everywhere in graphical applications, since they are easy
to render and to compute, and a very large set of tools exists for the generation and
manipulation of polygonal data. But modern scanning devices that allow a high-quality
and large-scale acquisition of complex real-world models often deliver a large set of
points as the resulting data structure of the scanned surface. A direct triangulation of those
point clouds does not always result in good models. They often contain problems like
holes, self-intersections, and non-manifold structures. One also often loses important
surface structures like sharp corners and edges during a usual surface reconstruction.
It is therefore worthwhile to stay a little longer in the point-based world, to analyze the point cloud
data with respect to such features, and to apply a surface reconstruction method afterwards
that is known to construct continuous and smooth surfaces and extend it to reconstruct
The recognition of patterns and structures has gained importance for dealing with the growing amount of data generated by sensors and simulations. Most existing methods for pattern recognition are tailored to scalar data and non-correlated data of higher dimensions. The recognition of general patterns in flow structures is possible, but not yet practically usable because of the high computational effort. The main goal of this work is to present methods for the comparative visualization of flow data, based among other things on a new method for efficient pattern recognition on flow data. This work is structured in three parts. First, a known feature-based approach for pattern recognition on flow data, the Clifford convolution, is applied to color edge detection and extended to non-uniform grids. However, this method is still computationally expensive for general pattern recognition, since the recognition algorithm has to be applied for numerous different scales and orientations of the query pattern. A more efficient and accurate method for pattern recognition on flow data is presented in the second part. It is based on a novel mathematical formulation of moment invariants for flow data. The common moment invariants for pattern recognition are not applicable to flow data, since they are only invariant on non-correlated data. Because of the spatial correlation of flow data, the moment invariants had to be redefined with different basis functions to satisfy the demands of an invariant mapping of flow data. The moment invariants are computed by a multi-scale convolution of the complete flow field with the basis functions. The time for this pre-processing step almost equals the time needed to recognize one single general pattern with the former algorithms. However, once the moments have been computed, they can be indexed and used as a look-up table to recognize any desired pattern quickly and interactively.
This results in a flexible and easy-to-use tool for the analysis of patterns in 2D flow data. For an improved rendering of the recognized features, an importance-driven streamline algorithm has been developed. The density of the streamlines can be adjusted by using importance maps; the result of a pattern recognition, for example, can be used as such a map. Finally, new comparative flow visualization approaches utilizing the streamline approach, the flow pattern matching, and the moment invariants are presented.
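Why moments of flow data need a different treatment than moments of scalar data can be seen in a small Python sketch. It does not reproduce the formulation of this work; it assumes one common construction in which a 2D vector (u, v) at position (x, y) is encoded as the complex number u + iv at z = x + iy, and the moments are sums of z^p · conj(z)^q · f(z) over a small patch. Rotating a flow pattern rotates both the positions and the vectors, so each moment picks up a phase factor e^{iα(p−q+1)} and its magnitude is rotation invariant. All function names are hypothetical.

```python
import cmath


def sample_patch(flow, radius=1):
    """Sample a flow function f(z) on a small square grid centred at the origin."""
    return {complex(x, y): flow(complex(x, y))
            for x in range(-radius, radius + 1)
            for y in range(-radius, radius + 1)}


def complex_moment(patch, p, q):
    """Discrete complex moment: sum over z of z^p * conj(z)^q * f(z)."""
    return sum((z ** p) * (z.conjugate() ** q) * f for z, f in patch.items())


def rotate_patch(patch, alpha):
    """Rotate a flow pattern by alpha: positions and vectors rotate together."""
    rot = cmath.exp(1j * alpha)
    return {rot * z: rot * f for z, f in patch.items()}


def invariants(patch, max_order=2):
    """Rotation-invariant descriptor: magnitudes of all moments with p + q <= max_order."""
    return [abs(complex_moment(patch, p, q))
            for p in range(max_order + 1)
            for q in range(max_order + 1 - p)]
```

Descriptors of this kind, computed once for every position and scale, are the sort of values that could be indexed in a look-up table and compared against the descriptor of a query pattern.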
In function-oriented testing of electronic control units in the automotive domain, expert knowledge is indispensable because of the high complexity of the test cases. With basic test techniques such as boundary value analysis, the intent of a test case is given implicitly by the technique. For test cases based on expert knowledge, however, a prose text is currently written in addition to each test case in order to state the test intent. This prose description is prone to ambiguity, differs from one test developer to another, and is only loosely tied to the content of the test case. The goal of this thesis is to design a specification language for test case descriptions that minimizes the drawbacks of natural language, and to define language elements specific to the test sequence so that the language can serve as a skeleton for a test case. To this end, language elements for the description are derived from the deployment environment (system specification, test implementation, and test process topics), and approaches for transferring the description into the test implementation are considered. The result is a test case specification language that rests on a formal foundation and can, among other things, be transformed into a graphical view. As with UML, the added value only becomes apparent with tool-supported input: it enables test developers to define uniform, formal, reusable, and comprehensible test cases.
In robotics, information is often regarded as a means to an end. The question of how to structure information and how to bridge the semantic gap between different levels of abstraction in a uniform way is still widely regarded as a technical issue. Ignoring these challenges appears to lead robotics into a stasis similar to the one experienced in the software industry of the late 1960s. From the beginning of the software crisis until today, numerous methods, techniques, and tools for managing the increasing complexity of software systems have evolved. The attempt to transfer several of these ideas to applications in robotics has yielded various control architectures, frameworks, and process models. These attempts mainly provide modularisation schemata which suggest how to decompose a complex system into less complex subsystems. The schematisation of representation and information flow, however, is mostly ignored. In this work, a set of design schemata is proposed which is embedded into an action/perception-oriented design methodology to promote thorough abstractions between distinct levels of control. Action-oriented design decomposes control systems top-down, and sensor data is extracted from the environment as required. This comes with the problem that information is often condensed prematurely. As a result, sensor processing depends on the control system design, which leads to a monolithic system structure with limited options for reuse. In contrast, perception-oriented design constructs control systems bottom-up, starting with the extraction of environment information from sensor data. The extracted entities are placed into structures which evolve with the development of the sensor processing algorithms. In consequence, the control system is strictly dependent on the sensor processing algorithms, which again results in a monolithic system. In their particular domains, both design approaches have great advantages but fail to create inherently modular systems.
The design approach proposed in this work combines the strengths of action orientation and perception orientation into one coherent methodology without inheriting their weaknesses. More precisely, design schemata for representation, translation, and fusion of environmental information are developed which establish thorough abstraction mechanisms between components. The explicit introduction of abstractions particularly supports extensibility and scalability of robot control systems by design.
Modern science utilizes advanced measurement and simulation techniques to analyze phenomena from fields such as medicine, physics, or mechanics. The data produced by applying these techniques takes the form of multi-dimensional functions or fields, which have to be processed in order to provide meaningful parts of the data to domain experts. The definition and implementation of such processing techniques, with the goal of producing visual representations of portions of the data, are the topic of research in scientific visualization, or in multi-field visualization in the case of multiple fields. In this thesis, we contribute novel feature extraction and visualization techniques that are able to convey data from multiple fields created by scientific simulations or measurements. Furthermore, our scalar, vector, and tensor field processing techniques contribute to scattered field processing in general and introduce novel ways of analyzing and processing tensorial quantities such as strain and displacement in flow fields, providing insights into field topology. We introduce novel mesh-free extraction techniques for the visualization of complex-valued scalar fields in acoustics that aid in understanding wave topology in low-frequency sound simulations. The resulting structures represent regions with locally minimal sound amplitude and convey wave node evolution and sound cancellation in time-varying sound pressure fields, which is considered an important feature in acoustics design. Furthermore, methods for flow field feature extraction are presented that facilitate the analysis of velocity and strain field properties by visualizing the deformation of infinitesimal Lagrangian particles and the macroscopic deformation of surfaces and volumes in flow. The resulting adaptive manifolds are used to perform flow field segmentation, which supports multi-field visualization by selective visualization of scalar flow quantities.
The effects of continuum displacement in scattered moment tensor fields can be studied by a novel method for multi-field visualization presented in this thesis. The visualization method demonstrates the benefit of clustering and separate views for the visualization of multiple fields.
Due to remarkable technological advances in the last three decades, the capacity of computer systems has improved tremendously. In line with Moore's law, the number of transistors on integrated circuits has doubled approximately every two years, and the trend is continuing. Likewise, developments in storage density, network bandwidth, and compute capacity show similar patterns. As a consequence, the amount of data that can be processed by today's systems has increased by orders of magnitude. At the same time, however, the resolution of screens has increased by hardly a factor of ten. Thus, there is a gap between the amount of data that can be processed and the amount of data that can be visualized. Large high-resolution displays offer a way to deal with this gap and provide a significantly increased screen area by combining the images of multiple smaller display devices. The main objective of this dissertation is the development of new visualization and interaction techniques for large high-resolution displays.