## Fachbereich Informatik

### Refine

#### Year of publication

#### Document Type

- Doctoral Thesis (145) (remove)

#### Keywords

- Visualisierung (17)
- Computergraphik (5)
- Visualization (5)
- Robotik (4)
- Bildverarbeitung (3)
- Evaluation (3)
- Geoinformationssystem (3)
- Mustererkennung (3)
- Semantic Web (3)
- computer graphics (3)
- document analysis (3)
- optical character recognition (3)
- verification (3)
- visualization (3)
- AG-RESY (2)
- Algorithmus (2)
- Computational Fluid Dynamics (2)
- Datenanalyse (2)
- Datenbank (2)
- Dreidimensionale Bildverarbeitung (2)
- Effizienter Algorithmus (2)
- Endlicher Automat (2)
- Implementierung (2)
- Layout (2)
- Mensch-Maschine-Kommunikation (2)
- Merkmalsextraktion (2)
- Model Checking (2)
- Modellierung (2)
- Optische Zeichenerkennung (2)
- Raumakustik (2)
- Ray casting (2)
- Recommender Systems (2)
- Room acoustics (2)
- SDL (2)
- Scattered-Data-Interpolation (2)
- Scientific Visualization (2)
- Simulation (2)
- Softwareentwicklung (2)
- UML (2)
- Uncertainty Visualization (2)
- Voronoi-Diagramm (2)
- image processing (2)
- layout analysis (2)
- machine learning (2)
- virtual acoustics (2)
- 3D Gene Expression (1)
- 3D Point Data (1)
- ASM (1)
- AUTOSAR (1)
- Ableitungsschätzung (1)
- Abrechnungsmanagement (1)
- Abstraction (1)
- Abstraktion (1)
- Accounting Agent (1)
- Activity Recognition, Wearable Computing, Minimal Training, Unobtrusive Instrumentations (1)
- Activity recognition (1)
- Adaptive Data Structure (1)
- Affine Arithmetic (1)
- Akquisition (1)
- Algorithm (1)
- Algorithmic Differentiation (1)
- Artificial Intelligence (1)
- Association (1)
- Ausdrucksfähig (1)
- Ausdrucksfähigkeit (1)
- Automat <Automatentheorie> (1)
- Automation (1)
- Automatische Differentiation (1)
- Automatisches Beweisverfahren (1)
- Autonomer Agent (1)
- Autonomer Roboter (1)
- Bahnplanung (1)
- Befahrbarkeitsanalyse (1)
- Benutzer (1)
- Beschränkte Arithmetik (1)
- Bioinformatik (1)
- Bluetooth (1)
- Boosting (1)
- Caching (1)
- Carbon footprint (1)
- Classification (1)
- Cluster-Analyse (1)
- Clustering (1)
- Code Generation (1)
- Collaboration (1)
- Combinatorial Testing (1)
- Computer Graphic (1)
- Computer Supported Cooperative Work (1)
- Computer graphics (1)
- Computerphysik (1)
- Computersimulation (1)
- Computertomographie (1)
- Computervision (1)
- Containertypen (1)
- Containertypes (1)
- Controller Synthesis (1)
- Cycle Accuracy (1)
- DCE <Programm> (1)
- DFG (1)
- DPN (1)
- Data Modeling (1)
- Dataset (1)
- Datenbanken (1)
- Decision Support Systems (1)
- Delaunay-Triangulierung (1)
- Derivative Estimation (1)
- Dienstgüte (1)
- Dienstschnittstellen (1)
- Direct Numerical Simulation (1)
- Discrete Event Simulation (DES) (1)
- Distributed Rendering (1)
- Distributed system (1)
- Dreidimensionale Rekonstruktion (1)
- Dreidimensionale Strömung (1)
- Duplicate Identification (1)
- Duplikaterkennung (1)
- Dynamischer Test (1)
- ESTELLE (1)
- Effizienz (1)
- Effizienzsteigerung (1)
- Empfehlungssysteme (1)
- Energie (1)
- Ensemble Visualization (1)
- Entwurf (1)
- Evolutionary Algorithm (1)
- Experiment (1)
- Experimentation (1)
- Expressiveness (1)
- Extrapolation (1)
- FERAL (1)
- Fahrtkostenmodelle (1)
- Fast Mode-Signaling (1)
- Fault Tree Analysis (1)
- Feasibility study (1)
- Feature (1)
- Feature Detection (1)
- Feature Extraction (1)
- Fehlerbaumanalyse (1)
- Fließanalyse (1)
- Flow Visualization (1)
- Formale Beschreibungstechnik (1)
- Formale Grammatik (1)
- Formale Sprache (1)
- Formaler Beweis (1)
- Functional Safety (1)
- Funktionale Sicherheit (1)
- Fusion (1)
- Future Internet (1)
- GPU (1)
- Gateway (1)
- Gauß-Filter (1)
- Gefahren- und Risikoanalyse (1)
- Gene expression programming (1)
- Generierung (1)
- Geovisualization (1)
- Google Earth (1)
- Hazard Analysis (1)
- Hitting families (1)
- Human-Robot-Coexistence (1)
- Human-Robot-Cooperation (1)
- Hypergraph (1)
- IEEE 802.15.4 (1)
- IP Address (1)
- IP Traffic Accounting (1)
- IP-XACT (1)
- ISO 26262 (1)
- Image Processing (1)
- Imote2 (1)
- Implementation (1)
- Incremental recomputation (1)
- Induktive logische Programmierung (1)
- Information Extraction (1)
- Information Visualization (1)
- Intensity estimation (1)
- Interaction (1)
- Interaktion (1)
- Intermediate Composition (1)
- Interpolation (1)
- Invariante (1)
- Invariante Momente (1)
- Kellerautomat (1)
- Klassifikation (1)
- Knowledge Management (1)
- Knuth-Bendix completion (1)
- Knuth-Bendix-Vervollständigung (1)
- Kommunikation (1)
- Komponentenmodell (1)
- Kompression (1)
- Kontextbezogenes System (1)
- Korrelationsanalyse (1)
- Kurve (1)
- LIDAR (1)
- LIR-Tree (1)
- LOADBAL (1)
- Large High-Resolution Displays (1)
- Laser Wakefield Particle Accelerator (1)
- Leistungsmessung (1)
- Linked Data (1)
- Linking Data Analysis and Visualization (1)
- Lokalisierung (1)
- Lärmbelastung (1)
- Lärmimmission (1)
- MAC protocols (1)
- MIP-Emissionsspektroskopie (1)
- MIP-Massenspektrometrie (1)
- Machine Learning (1)
- Machine learning (1)
- Magnetfeldbasierter Lokalisierung (1)
- Magnetfelder (1)
- Manufacturing (1)
- MapReduce (1)
- Maschinelle Übersetzung (1)
- Maximum Intensity Projection (1)
- Mehragentensystem (1)
- Memory Architecture (1)
- Memory Consistency (1)
- Mensch-Roboter-Koexistenz (1)
- Mensch-Roboter-Kooperation (1)
- Menschenmenge (1)
- Merkmalsraum (1)
- Mesh-Free (1)
- Meter (1)
- Mikroskopie (1)
- Minimal Cut Set Visualization (1)
- Mobile Computing (1)
- Mobile system (1)
- Mobiler Roboter (1)
- Mode-Based Scheduling with Fast Mode-Signaling (1)
- Modellgetriebene Entwicklung (1)
- Modelling (1)
- Modularisierung (1)
- Modusbasierte Signalisierung (1)
- Molekulare Bioinformatik (1)
- Moment Invariants (1)
- Mood-based Music Recommendations (1)
- Multi-Edge Graph (1)
- Multi-Field (1)
- Multi-Variate Data (1)
- Multifield Data (1)
- Nachhaltigkeit (1)
- Natural Language Processing (1)
- Natural Neighbor (1)
- Natural Neighbor Interpolation (1)
- Natürliche Nachbarn (1)
- Navigation (1)
- Network Architecture (1)
- Network Calculus (1)
- Netzwerk (1)
- NoSQL (1)
- Node-Link Diagram (1)
- Numerische Strömungssimulation (1)
- OCR (1)
- Object-orientation (1)
- Objektorientierung (1)
- Off-road Robotics (1)
- Off-road Robotik (1)
- Online chain partitioning (1)
- Open Estelle (1)
- Optimierender Compiler (1)
- Optimierung (1)
- Paralleler Hybrid (1)
- Pareto Optimality (1)
- Partial Differential Equations (1)
- Partially ordered sets (1)
- Participatory Sensing (1)
- Pattern Recognition (1)
- Performance (1)
- Personalisation (1)
- Pervasive health (1)
- Physical activity monitoring (1)
- Process Data (1)
- Processor Architecture (1)
- Protocol Composition (1)
- Prototyp (1)
- Prototype (1)
- Prozessvisualisierung (1)
- Quality (1)
- Qualität (1)
- Quantitative Bildanalyse (1)
- Quartz (1)
- Quicksort (1)
- Random testing (1)
- Raumordnung (1)
- Ray tracing (1)
- Rechtecksgitter (1)
- Rectilinear Grid (1)
- Redundanzvermeidung (1)
- Regelung (1)
- Rekonstruktion (1)
- Representation (1)
- Risikobewertung (1)
- Risk Assessment (1)
- Roboter (1)
- Räumliche Statistik (1)
- SAHARA (1)
- SDL extensions (1)
- SIMERO (1)
- SPARQL (1)
- SPARQL query learning (1)
- Safety Analysis (1)
- Scalar (1)
- Scheduling (1)
- Schema <Informatik> (1)
- Schematisation (1)
- Schematisierung (1)
- Semantic Wikis (1)
- Semantische Modellierung (1)
- Semantisches Datenmodell (1)
- Service Access Points (1)
- Shared Resource Modeling (1)
- Sicherheitsanalyse (1)
- Similarity Join (1)
- Similarity Joins (1)
- Simulationen (1)
- Skalar (1)
- Smart City (1)
- Smartphone (1)
- Smartwatch (1)
- Socio-Semantic Web (1)
- Software (1)
- Software Comprehension (1)
- Software Dependencies (1)
- Software Engineering (1)
- Software Evolution (1)
- Software Maintenance (1)
- Software Measurement (1)
- Software Testing (1)
- Software Visualization (1)
- Software engineering (1)
- Software-Architektur (1)
- Softwaremetrie (1)
- Softwarespezifikation (1)
- Softwarewartung (1)
- Sprachdefinition (1)
- Sprache (1)
- Sprachprofile (1)
- Stimmungsbasierte Musikempfehlungen (1)
- Stokes Equations (1)
- Strukturiertes Gitter (1)
- Strömung (1)
- Surface Reconstruction (1)
- Symbolic Methods (1)
- SystemC (1)
- Systemarchitektur (1)
- Systemdesign (1)
- Temporal Decoupling (1)
- Temporal Logic (1)
- Temporal data processing (1)
- Tensor (1)
- Tensorfeld (1)
- Tesselation (1)
- Tetraeder (1)
- Tetraedergi (1)
- Tetrahedral Grid (1)
- Tetrahedral Mesh (1)
- Themenbasierte Empfehlungen von Ressourcen (1)
- Time-motion-Ultraschallkardiographie (1)
- Topic-based Resource Recommendations (1)
- Topologie (1)
- Topology (1)
- Topology visualization (1)
- Transaction Level Modeling (TLM) (1)
- Transport Protocol (1)
- Traversability Analysis (1)
- Ubiquitous system (1)
- Ultraschallkardiographie (1)
- Umweltinformatik (1)
- Unorganized Data (1)
- Unstrukturiertes Gitter (1)
- Urban sprawl (1)
- UrbanSim (1)
- User Model (1)
- VIACOBI (1)
- Validierung (1)
- Vector (1)
- Vector Field (1)
- Vektor (1)
- Vektorfelder (1)
- Verifikation (1)
- Verzerrungstensor (1)
- Virtual Prototyping (1)
- Visual Analytics (1)
- Visual Queries (1)
- Volume rendering (1)
- Volumen-Rendering (1)
- Voronoi diagram (1)
- Weak Memory Model (1)
- Wearable computing (1)
- WiFi (1)
- Wide-column stores (1)
- XDBMS (1)
- XEC (1)
- XML (1)
- XML query estimation (1)
- XML summary (1)
- Yaroslavskiy-Bentley-Bloch Quicksort (1)
- acoustic modeling (1)
- affective user interface (1)
- affine arithmetic (1)
- analysis of algorithms (1)
- artificial intelligence (1)
- artificial neural network (1)
- associations (1)
- automated theorem proving (1)
- average-case analysis (1)
- behaviour-based system (1)
- binary countdown protocol (1)
- black bursts (1)
- classification (1)
- collaborative information visualization (1)
- collaborative mobile sensing (1)
- collective intelligence (1)
- computational biology (1)
- computational fluid dynamics (1)
- computer-supported cooperative work (1)
- concurrent (1)
- content-and-structure summary (1)
- continuous master theorem (1)
- crowd condition estimation (1)
- crowd density estimation (1)
- crowd scanning (1)
- crowd sensing (1)
- crowdsourcing (1)
- curves and surfaces (1)
- data sets (1)
- data-flow (1)
- dataset (1)
- decidability (1)
- design (1)
- deterministic arbitration (1)
- directed graphs (1)
- distributed (1)
- distributed real-time systems (1)
- distributed tasks (1)
- embedded (1)
- embedding (1)
- emotion visualization (1)
- end-to-end learning (1)
- environmental noise (1)
- evolutionary algorithm (1)
- firewall (1)
- flow visualization (1)
- foundational translation validation (1)
- gaussian filter (1)
- geographic information systems (1)
- geology (1)
- graph drawing algorithm (1)
- graph embedding (1)
- graph layout (1)
- historical documents (1)
- hypergraph (1)
- iB2C (1)
- image analysis (1)
- implementation (1)
- information systems (1)
- interference (1)
- interval arithmetic (1)
- invariant (1)
- inverses Pendel (1)
- language definition (1)
- language modeling (1)
- language profiles (1)
- linked data (1)
- long short-term memory (1)
- long tail (1)
- machine-checkable proof (1)
- magnetic field based localization (1)
- matrix visualization (1)
- message-passing (1)
- metadata (1)
- mixed-signal (1)
- modularisation (1)
- moment (1)
- multicore (1)
- multidimensional datasets (1)
- multinomial regression (1)
- multithreading (1)
- multitype code coupling (1)
- multiway partitioning (1)
- nestable tangibles (1)
- optimization correctness (1)
- oscillating magnetic fields (1)
- out-of-order (1)
- parallel (1)
- participatory sensing (1)
- path cost models (1)
- pattern recognition (1)
- pivot sampling (1)
- point cloud (1)
- proof generating optimizer (1)
- ray casting (1)
- ray tracing (1)
- real-time tasks (1)
- relaxed memory models (1)
- robustness (1)
- secondary structure prediction (1)
- semantic web (1)
- small-multiples node-link visualization (1)
- social media (1)
- software comprehension (1)
- software engineering task (1)
- spatial statistics (1)
- static software structure (1)
- stationary sensing (1)
- statistics (1)
- structural summary (1)
- subjective evaluation (1)
- symbolic simulation (1)
- synchronous (1)
- system architecture (1)
- tabletop (1)
- task sequence (1)
- tensor (1)
- tensorfield (1)
- terrain rendering (1)
- time-varying flow fields (1)
- touch surfaces (1)
- translation contract (1)
- urban planning (1)
- user-centered design (1)
- vector field visualization (1)
- vectorfield (1)
- virtual reality (1)
- visual structure (1)
- weighted finite-state transducers (1)
- wireless signal (1)
- zeitabhängige Strömungen (1)
- Ähnlichkeit (1)
- Übersetzung (1)

Most modern multiprocessors offer weak memory behavior to improve their performance in terms of throughput. They allow the order of memory operations to be observed differently by each processor. This is opposite to the concept of sequential consistency (SC) which enforces a unique sequential view on all operations for all processors. Because most software has been and still is developed with SC in mind, we face a gap between the expected behavior and the actual behavior on modern architectures. The issues described only affect multithreaded software and therefore most programmers might never face them. However, multi-threaded bare metal software like operating systems, embedded software, and real-time software have to consider memory consistency and ensure that the order of memory operations does not yield unexpected results. This software is more critical as general consumer software in terms of consequences, and therefore new methods are needed to ensure their correct behavior.
In general, a memory system is considered weak if it allows behavior that is not possible in a sequential system. For example, in the SPARC processor with total store ordering (TSO) consistency, all writes might be delayed by store buffers before they eventually are processed by the main memory. This allows the issuing process to work with its own written values before other processes observed them (i.e., reading its own value before it leaves the store buffer). Because this behavior is not possible with sequential consistency, TSO is considered to be weaker than SC. Programming in the context of weak memory architectures requires a proper comprehension of how the model deviates from expected sequential behavior. For verification of these programs formal representations are required that cover the weak behavior in order to utilize formal verification tools.
This thesis explores different verification approaches and respectively fitting representations of a multitude of memory models. In a joint effort, we started with the concept of testing memory operation traces in regard of their consistency with different memory consistency models. A memory operation trace is directly derived from a program trace and consists of a sequence of read and write operations for each process. Analyzing the testing problem, we are able to prove that the problem is NP-complete for most memory models. In that process, a satisfiability (SAT) encoding for given problem instances was developed, that can be used in reachability and robustness analysis.
In order to cover all program executions instead of just a single program trace, additional representations are introduced and explored throughout this thesis. One of the representations introduced is a novel approach to specify a weak memory system using temporal logics. A set of linear temporal logic (LTL) formulas is developed that describes all properties required to restrict possible traces to those consistent to the given memory model. The resulting LTL specifications can directly be used in model checking, e.g., to check safety conditions. Unfortunately, the derived LTL specifications suffer from the state explosion problem: Even small examples, like the Peterson mutual exclusion algorithm, tend to generate huge formulas and require vast amounts of memory for verification. For this reason, it is concluded that using the proposed verification approach these specifications are not well suited for verification of real world software. Nonetheless, they provide comprehensive and formally correct descriptions that might be used elsewhere, e.g., programming or teaching.
Another approach to represent these models are operational semantics. In this thesis, operational semantics of weak memory models are provided in the form of reference machines that are both correct and complete regarding the memory model specification. Operational semantics allow to simulate systems with weak memory models step by step. This provides an elegant way to study the effects that lead to weak consistent behavior, while still providing a basis for formal verification. The operational models are then incorporated in verification tools for multithreaded software. These state space exploration tools proved suitable for verification of multithreaded software in a weak consistent memory environment. However, because not only the memory system but also the processor are expressed as operational semantics, some verification approach will not be feasible due to the large size of the state space.
Finally, to tackle the beforementioned issue, a state transition system for parallel programs is proposed. The transition system is defined by a set of structural operational semantics (SOS) rules and a suitable memory structure that can cover multiple memory models. This allows to influence the state space by use of smart representations and approximation approaches in future work.

Large-scale distributed systems consist of a number of components, take a number of parameter values as input, and behave differently based on a number of non-deterministic events. All these features—components, parameter values, and events—interact in complicated ways, and unanticipated interactions may lead to bugs. Empirically, many bugs in these systems are caused by interactions of only a small number of features. In certain cases, it may be possible to test all interactions of \(k\) features for a small constant \(k\) by executing a family of tests that is exponentially or even doubly-exponentially smaller than the family of all tests. Thus, in such cases we can effectively uncover all bugs that require up to \(k\)-wise interactions of features.
In this thesis we study two occurrences of this phenomenon. First, many bugs in distributed systems are caused by network partition faults. In most cases these bugs occur due to two or three key nodes, such as leaders or replicas, not being able to communicate, or because the leading node finds itself in a block of the partition without quorum. Second, bugs may occur due to unexpected schedules (interleavings) of concurrent events—concurrent exchange of messages and concurrent access to shared resources. Again, many bugs depend only on the relative ordering of a small number of events. We call the smallest number of events whose ordering causes a bug the depth of the bug. We show that in both testing scenarios we can effectively uncover bugs involving small number of nodes or bugs of small depth by executing small families of tests.
We phrase both testing scenarios in terms of an abstract framework of tests, testing goals, and goal coverage. Sets of tests that cover all testing goals are called covering families. We give a general construction that shows that whenever a random test covers a fixed goal with sufficiently high probability, a small randomly chosen set of tests is a covering family with high probability. We then introduce concrete coverage notions relating to network partition faults and bugs of small depth. In case of network partition faults, we show that for the introduced coverage notions we can find a lower bound on the probability that a random test covers a given goal. Our general construction then yields a randomized testing procedure that achieves full coverage—and hence, find bugs—quickly.
In case of coverage notions related to bugs of small depth, if the events in the program form a non-trivial partial order, our general construction may give a suboptimal bound. Thus, we study other ways of constructing covering families. We show that if the events in a concurrent program are partially ordered as a tree, we can explicitly construct a covering family of small size: for balanced trees, our construction is polylogarithmic in the number of events. For the case when the partial order of events does not have a "nice" structure, and the events and their relation to previous events are revealed while the program is running, we give an online construction of covering families. Based on the construction, we develop a randomized scheduler called PCTCP that uniformly samples schedules from a covering family and has a rigorous guarantee of finding bugs of small depth. We experiment with an implementation of PCTCP on two real-world distributed systems—Zookeeper and Cassandra—and show that it can effectively find bugs.

Ranking lists are an essential methodology to succinctly summarize outstanding items, computed over database tables or crowdsourced in dedicated websites. In this thesis, we propose the usage of automatically generated, entity-centric rankings to discover insights in data. We present PALEO, a framework for data exploration through reverse engineering top-k database queries, that is, given a database and a sample top-k input list, our approach, aims at determining an SQL query that returns results similar to the provided input when executed over the database. The core problem consist of finding selection predicates that return the given items, determining the correct ranking criteria, and evaluating the most promising candidate queries first. PALEO operates on subset of the base data, uses data samples, histograms, descriptive statistics, and further proposes models that assess the suitability of candidate queries which facilitate limitation of false positives. Furthermore, this thesis presents COMPETE, a novel approach that models and computes dominance over user-provided input entities, given a database of top-k rankings. The resulting entities are found superior or inferior with tunable degree of dominance over the input set---a very intuitive, yet insightful way to explore pros and cons of entities of interest. Several notions of dominance are defined which differ in computational complexity and strictness of the dominance concept---yet, interdependent through containment relations. COMPETE is able to pick the most promising approach to satisfy a user request at minimal runtime latency, using a probabilistic model that is estimating the result sizes. The individual flavors of dominance are cast into a stack of algorithms over inverted indices and auxiliary structures, enabling pruning techniques to avoid significant data access over large datasets of rankings.

Wearable activity recognition aims to identify and assess human activities with the help
of computer systems by evaluating signals of sensors which can be attached to the human
body. This provides us with valuable information in several areas: in health care, e.g. fluid
and food intake monitoring; in sports, e.g. training support and monitoring; in entertainment,
e.g. human-computer interface using body movements; in industrial scenarios, e.g.
computer support for detected work tasks. Several challenges exist for wearable activity
recognition: a large number of nonrelevant activities (null class), the evaluation of large
numbers of sensor signals (curse of dimensionality), ambiguity of sensor signals compared
to the activities and finally the high variability of human activity in general.
This thesis develops a new activity recognition strategy, called invariants classification,
which addresses these challenges, especially the variability in human activities. The
core idea is that often even highly variable actions include short, more or less invariant
sub-actions which are due to hard physical constraints. If someone opens a door, the
movement of the hand to the door handle is not fixed. However the door handle has to
be pushed to open the door. The invariants classification algorithm is structured in four
phases: segmentation, invariant identification, classification, and spotting. The segmentation
divides the continuous sensor data stream into meaningful parts, which are related
to sub-activities. Our segmentation strategy uses the zero crossings of the central difference
quotient of the sensor signals, as segment borders. The invariant identification finds
the invariant sub-activities by means of clustering and a selection strategy dependent on
certain features. The classification identifies the segments of a specific activity class, using
models generated from the invariant sub-activities. The models include the invariant
sub-activity signal and features calculated on sensor signals related to the sub-activity. In
the spotting, the classified segments are used to find the entire activity class instances in
the continuous sensor data stream. For this purpose, we use the position of the invariant
sub-activity in the related activity class instance for the estimation of the borders of the
activity instances.
In this thesis, we show that our new activity recognition strategy, built on invariant
sub-activities, is beneficial. We tested it on three human activity datasets with wearable
inertial measurement units (IMU). Compared to previous publications on the same
datasets we got improvement in the activity recognition in several classes, some with a
large margin. Our segmentation achieves a sensible method to separate the sensor data in
relation to the underlying activities. Relying on sub-activities makes us independent from
imprecise labels on the training data. After the identification of invariant sub-activities,
we calculate a value called cluster precision for each sensor signal and each class activity.
This tells us which classes can be easily classified and which sensor channels support
the classification best. Finally, in the training for each activity class, our algorithm selects
suitable signal channels with invariant sub-activities on different points in time and
with different length. This makes our strategy a multi-dimensional asynchronous motif
detection with variable motif length.

Graphs and flow networks are important mathematical concepts that enable the modeling and analysis of a large variety of real world problems in different domains such as engineering, medicine or computer science. The number, sizes and complexities of those problems permanently increased during the last decades. This led to an increased demand of techniques that help domain experts in understanding their data and its underlying structure to enable an efficient analysis and decision making process.
To tackle this challenge, this work presents several new techniques that utilize concepts of visual analysis to provide domain scientists with new visualization methodologies and tools. Therefore, this work provides novel concepts and approaches for diverse aspects of the visual analysis such as data transformation, visual mapping, parameter refinement and analysis, model building and visualization as well as user interaction.
The presented techniques form a framework that enriches domain scientists with new visual analysis tools and help them analyze their data and gain insight from the underlying structures. To show the applicability and effectiveness of the presented approaches, this work tackles different applications such as networking, product flow management and vascular systems, while preserving the generality to be applicable to further domains.

The simulation of physical phenomena involving the dynamic behavior of fluids and gases
has numerous applications in various fields of science and engineering. Of particular interest
is the material transport behavior, the tendency of a flow field to displace parts of the
medium. Therefore, many visualization techniques rely on particle trajectories.
Lagrangian Flow Field Representation. In typical Eulerian settings, trajectories are
computed from the simulation output using numerical integration schemes. Accuracy concerns
arise because, due to limitations of storage space and bandwidth, often only a fraction
of the computed simulation time steps are available. Prior work has shown empirically that
a Lagrangian, trajectory-based representation can improve accuracy [Agr+14]. Determining
the parameters of such a representation in advance is difficult; a relationship between the
temporal and spatial resolution and the accuracy of resulting trajectories needs to be established.
We provide an error measure for upper bounds of the error of individual trajectories.
We show how areas at risk for high errors can be identified, thereby making it possible to
prioritize areas in time and space to allocate scarce storage resources.
Comparative Visual Analysis of Flow Field Ensembles. Independent of the representation,
errors of the simulation itself are often caused by inaccurate initial conditions,
limitations of the chosen simulation model, and numerical errors. To gain a better understanding
of the possible outcomes, multiple simulation runs can be calculated, resulting in
sets of simulation output referred to as ensembles. Of particular interest when studying the
material transport behavior of ensembles is the identification of areas where the simulation
runs agree or disagree. We introduce and evaluate an interactive method that enables application
scientists to reliably identify and examine regions of agreement and disagreement,
while taking into account the local transport behavior within individual simulation runs.
Particle-Based Representation and Visualization of Uncertain Flow Data Sets. Unlike
simulation ensembles, where uncertainty of the solution appears in the form of different
simulation runs, moment-based Eulerian multi-phase fluid simulations are probabilistic in
nature. These simulations, used in process engineering to simulate the behavior of bubbles in
liquid media, are aimed toward reducing the need for real-world experiments. The locations
of individual bubbles are not modeled explicitly, but stochastically through the properties of
locally defined bubble populations. Comparisons between simulation results and physical
experiments are difficult. We describe and analyze an approach that generates representative
sets of bubbles for moment-based simulation data. Using our approach, application scientists
can directly, visually compare simulation results and physical experiments.

The focus of this work is to provide and evaluate a novel method for multifield topology-based analysis and visualization. Through this concept, called Pareto sets, one is capable to identify critical regions in a multifield with arbitrary many individual fields. It uses ideas found in graph optimization to find common behavior and areas of divergence between multiple optimization objectives. The connections between the latter areas can be reduced into a graph structure allowing for an abstract visualization of the multifield to support data exploration and understanding.
The research question that is answered in this dissertation is about the general capability and expandability of the Pareto set concept in context of visualization and application. Furthermore, the study of its relations, drawbacks and advantages towards other topological-based approaches. This questions is answered in several steps, including consideration and comparison with related work, a thorough introduction of the Pareto set itself as well as a framework for efficient implementation and an attached discussion regarding limitations of the concept and their implications for run time, suitable data, and possible improvements.
Furthermore, this work considers possible simplification approaches like integrated single-field simplification methods but also using common structures identified through the Pareto set concept to smooth all individual fields at once. These considerations are especially important for real-world scenarios to visualize highly complex data by removing small local structures without destroying information about larger, global trends.
To further emphasize possible improvements and expandability of the Pareto set concept, the thesis studies a variety of different real world applications. For each scenario, this work shows how the definition and visualization of the Pareto set is used and improved for data exploration and analysis based on the scenarios.
In summary, this dissertation provides a complete and sound summary of the Pareto set concept as ground work for future application of multifield data analysis. The possible scenarios include those presented in the application section, but are found in a wide range of research and industrial areas relying on uncertainty analysis, time-varying data, and ensembles of data sets in general.

Novel image processing techniques have been in development for decades, but most
of these techniques are barely used in real world applications. This results in a gap
between image processing research and real-world applications; this thesis aims to
close this gap. In an initial study, the quantification, propagation, and communication
of uncertainty were determined to be key features in gaining acceptance for
new image processing techniques in applications.
This thesis presents a holistic approach based on a novel image processing pipeline,
capable of quantifying, propagating, and communicating image uncertainty. This
work provides an improved image data transformation paradigm, extending image
data using a flexible, high-dimensional uncertainty model. Based on this, a completely
redesigned image processing pipeline is presented. In this pipeline, each
step respects and preserves the underlying image uncertainty, allowing image uncertainty
quantification, image pre-processing, image segmentation, and geometry
extraction. This is communicated by utilizing meaningful visualization methodologies
throughout each computational step.
The presented methods are examined qualitatively by comparing to the Stateof-
the-Art, in addition to user evaluation in different domains. To show the applicability
of the presented approach to real world scenarios, this thesis demonstrates
domain-specific problems and the successful implementation of the presented techniques
in these domains.

The Symbol Grounding Problem (SGP) is one of the first attempts to proposed a hypothesis about mapping abstract concepts and the real world. For example, the concept "ball" can be represented by an object with a round shape (visual modality) and phonemes /b/ /a/ /l/ (audio modality).
This thesis is inspired by the association learning presented in infant development.
Newborns can associate visual and audio modalities of the same concept that are presented at the same time for vocabulary acquisition task.
The goal of this thesis is to develop a novel framework that combines the constraints of the Symbol Grounding Problem and Neural Networks in a simplified scenario of association learning in infants. The first motivation is that the network output can be considered as numerical symbolic features because the attributes of input samples are already embedded. The second motivation is the association between two samples is predefined before training via the same vectorial representation. This thesis proposes to associate two samples and the vectorial representation during training. Two scenarios are considered: sample pair association and sequence pair association.
Three main contributions are presented in this work.
The first contribution is a novel Symbolic Association Model based on two parallel MLPs.
The association task is defined by learning that two instances that represent one concept.
Moreover, a novel training algorithm is defined by matching the output vectors of the MLPs with a statistical distribution for obtaining the relationship between concepts and vectorial representations.
The second contribution is a novel Symbolic Association Model based on two parallel LSTM networks that are trained on weakly labeled sequences.
The definition of association task is extended to learn that two sequences represent the same series of concepts.
This model uses a training algorithm that is similar to MLP-based approach.
The last contribution is a Classless Association.
The association task is defined by learning based on the relationship of two samples that represents the same unknown concept.
In summary, the contributions of this thesis are to extend Artificial Intelligence and Cognitive Computation research with a new constraint that is cognitive motivated. Moreover, two training algorithms with a new constraint are proposed for two cases: single and sequence associations. Besides, a new training rule with no-labels with promising results is proposed.

In recent years, enormous progress has been made in the field of Artificial Intelligence (AI). Especially the introduction of Deep Learning and end-to-end learning, the availability of large datasets and the necessary computational power in form of specialised hardware allowed researchers to build systems with previously unseen performance in areas such as computer vision, machine translation and machine gaming. In parallel, the Semantic Web and its Linked Data movement have published many interlinked RDF datasets, forming the world’s largest, decentralised and publicly available knowledge base.
Despite these scientific successes, all current systems are still narrow AI systems. Each of them is specialised to a specific task and cannot easily be adapted to all other human intelligence tasks, as would be necessary for Artificial General Intelligence (AGI). Furthermore, most of the currently developed systems are not able to learn by making use of freely available knowledge such as provided by the Semantic Web. Autonomous incorporation of new knowledge is however one of the pre-conditions for human-like problem solving.
This work provides a small step towards teaching machines such human-like reasoning on freely available knowledge from the Semantic Web. We investigate how human associations, one of the building blocks of our thinking, can be simulated with Linked Data. The two main results of these investigations are a ground truth dataset of semantic associations and a machine learning algorithm that is able to identify patterns for them in huge knowledge bases.
The ground truth dataset of semantic associations consists of DBpedia entities that are known to be strongly associated by humans. The dataset is published as RDF and can be used for future research.
The developed machine learning algorithm is an evolutionary algorithm that can learn SPARQL queries from a given SPARQL endpoint based on a given list of exemplary source-target entity pairs. The algorithm operates in an end-to-end learning fashion, extracting features in form of graph patterns without the need for human intervention. The learned patterns form a feature space adapted to the given list of examples and can be used to predict target candidates from the SPARQL endpoint for new source nodes. On our semantic association ground truth dataset, our evolutionary graph pattern learner reaches a Recall@10 of > 63 % and an MRR (& MAP) > 43 %, outperforming all baselines. With an achieved Recall@1 of > 34% it even reaches average human top response prediction performance. We also demonstrate how the graph pattern learner can be applied to other interesting areas without modification.