## Fachbereich Informatik

### Refine

#### Year of publication

- 1999 (228)
- 1994 (33)
- 1996 (32)
- 1998 (32)
- 1995 (31)
- 1997 (26)
- 2016 (23)
- 2015 (22)
- 2019 (22)
- 2001 (20)
- 2007 (18)
- 2018 (18)
- 2013 (17)
- 2014 (15)
- 1993 (14)
- 2009 (11)
- 2012 (11)
- 2000 (10)
- 2002 (10)
- 2003 (7)
- 2004 (7)
- 2006 (7)
- 2008 (7)
- 2017 (7)
- 2010 (6)
- 2005 (5)
- 1991 (4)
- 1992 (4)
- 2011 (4)
- 1979 (2)
- 1980 (1)
- 1990 (1)
- 2020 (1)

#### Document Type

- Preprint (290)
- Doctoral Thesis (141)
- Report (111)
- Article (90)
- Conference Proceeding (8)
- Master's Thesis (7)
- Study Thesis (5)
- Bachelor Thesis (2)
- Habilitation (2)

#### Language

- English (656) (remove)

#### Keywords

- AG-RESY (47)
- PARO (25)
- SKALP (15)
- Visualisierung (13)
- Case-Based Reasoning (11)
- RODEO (11)
- HANDFLEX (9)
- Abstraction (6)
- Case Based Reasoning (6)
- Robotics (6)

Activity recognition has continued to be a large field in computer science over the last two decades. Research questions from 15 years ago have led to solutions that today support our daily lives. Specifically, the success of smartphones or more recent developments of other smart devices (e.g., smart-watches) is rooted in applications that leverage on activity analysis and location tracking (fitness applications and maps). Today we can track our physical health and fitness and support our physical needs by merely owning (and using) a smart-phone. Still, the quality of our lives does not solely rely on fitness and physical health but also more increasingly on our mental well-being. Since we have learned how practical and easy it is to have a lot of functions, including health support on just one device, it would be specifically helpful if we could also use the smart-phone to support our mental and cognitive health if need be.
The ultimate goal of this work is to use sensor-assisted location and motion analysis to support various aspects of medically valid cognitive assessments.
In this regard, this thesis builds on Hypothesis 3: Sensors in our ubiquitous environment can collect information about our cognitive state, and it is possible to extract that information. In addition, these data can be used to derive complex cognitive states and to predict possible pathological changes in humans. After all, not only is it possible to determine the cognitive state through sensors but also to assist people in difficult situations through these sensors.
Thus, in the first part, this thesis focuses on the detection of mental state and state changes.
The primary purpose is to evaluate possible starting points for sensor systems in order to enable a clinically accurate assessment of mental states. These assessments must work on the condition that a developed system must be able to function within the given limits of a real clinical environment.
Despite the limitations and challenges of real-life deployments, it was possible to develop methods for determining the cognitive state and well-being of the residents. The analysis of the location data provides a correct classification of cognitive state with an average accuracy of 70% to 90%.
Methods to determine the state of bipolar patients provide an accuracy of 70-80\% for the detection of different cognitive states (total seven classes) using single sensors and 76% for merging data from different sensors. Methods for detecting the occurrence of state changes, a highlight of this work, even achieved a precision and recall of 95%.
The comparison of these results with currently used standard methods in psychiatric care even shows a clear advantage of the sensor-based method. The accuracy of the sensor-based analysis is 60% higher than the accuracy of the currently used methods.
The second part of this thesis introduces methods to support people’s actions in stressful situations on the one hand and analyzes the interaction between people during high-pressure activities on the other.
A simple, acceleration based, smartwatch instant feedback application was used to help laypeople to learn to perform CPR (cardiopulmonary resuscitation) in an emergency on the fly.
The evaluation of this application in a study with 43 laypersons showed an instant improvement in the CPR performance of 50%. An investigation of whether training with such an instant feedback device can support improved learning and lead to more permanent effects for gaining skills was able to confirm this theory.
Last but not least, with the main interest shifting from the individual to a group of people at the end of this work, the question: how can we determine the interaction between individuals within a group of people? was answered by developing a methodology to detect un-voiced collaboration in random ad-hoc groups. An evaluation with data retrieved from video footage provides an accuracy of up to more than 95%, and even with artificially introduced errors rates of 20%, still an accuracy of 70% precision, and 90% recall can be achieved.
All scenarios in this thesis address different practical issues of today’s health care. The methods developed are based on real-life datasets and real-world studies.

Private data analytics systems preferably provide required analytic accuracy to analysts and specified privacy to individuals whose data is analyzed. Devising a general system that works for a broad range of datasets and analytic scenarios has proven to be difficult.
Despite the advent of differentially private systems with proven formal privacy guarantees, industry still uses inferior ad-hoc mechanisms that provide better analytic accuracy. Differentially private mechanisms often need to add large amounts of noise to statistical results, which impairs their usability.
In my thesis I follow two approaches to improve the usability of private data analytics systems in general and differentially private systems in particular. First, I revisit ad-hoc mechanisms and explore the possibilities of systems that do not provide Differential Privacy or only a weak version thereof. Based on an attack analysis I devise a set of new protection mechanisms including Query Based Bookkeeping (QBB). In contrast to previous systems QBB only requires the history of analysts’ queries in order to provide privacy protection. In particular, QBB does not require knowledge about the protected individuals’ data.
In my second approach I use the insights gained with QBB to propose UniTraX, the first differentially private analytics system that allows to analyze part of a protected dataset without affecting the other parts and without giving up on accuracy. I show UniTraX’s usability by way of multiple case studies on real-world datasets across different domains. UniTraX allows more queries than previous differentially private data analytics systems at moderate runtime overheads.

Patients after total hip arthroplasty (THA) suffer from lingering musculoskeletal restrictions. Three-dimensional (3D) gait analysis in combination with machine-learning approaches is used to detect these impairments. In this work, features from the 3D gait kinematics, spatio temporal parameters (Set 1) and joint angles (Set 2), of an inertial sensor (IMU) system are proposed as an input for a support vector machine (SVM) model, to differentiate impaired and non-impaired gait. The features were divided into two subsets. The IMU-based features were validated against an optical motion capture (OMC) system by means of 20 patients after THA and a healthy control group of 24 subjects. Then the SVM model was trained on both subsets. The validation of the IMU system-based kinematic features revealed root mean squared errors in the joint kinematics from 0.24° to 1.25°. The validity of the spatio-temporal gait parameters (STP) revealed a similarly high accuracy. The SVM models based on IMU data showed an accuracy of 87.2% (Set 1) and 97.0% (Set 2). The current work presents valid IMU-based features, employed in an SVM model for the classification of the gait of patients after THA and a healthy control. The study reveals that the features of Set 2 are more significant concerning the classification problem. The present IMU system proves its potential to provide accurate features for the incorporation in a mobile gait-feedback system for patients after THA.

IoT systems consist of Hardware/Software systems (e.g., sensors) that are embedded in a physical world, networked and that interact with complex software platforms. The validation of such systems is a challenge and currently mostly done by prototypes. This paper presents the virtual environment for simulation, emulation and validation of an IoT platform and its semantic model in real life scenarios. It is based on a decentralized, bottom up approach that offers interoperability of IoT devices and the value-added services they want to use across different domains. The framework is demonstrated by a comprehensive case study. The example consists of the complete IoT “Smart Energy” use case with focus on data privacy by homomorphic encryption. The performance of the network is compared while using partially homomorphic encryption, fully homomorphic encryption and no encryption at all.As a major result, we found that our framework is capable of simulating big IoT networks and the overhead introduced by homomorphic encryption is feasible for VICINITY.

3D joint kinematics can provide important information about the quality of movements. Optical motion capture systems (OMC) are considered the gold standard in motion analysis. However, in recent years, inertial measurement units (IMU) have become a promising alternative. The aim of this study was to validate IMU-based 3D joint kinematics of the lower extremities during different movements. Twenty-eight healthy subjects participated in this study. They performed bilateral squats (SQ), single-leg squats (SLS) and countermovement jumps (CMJ). The IMU kinematics was calculated using a recently-described sensor-fusion algorithm. A marker based OMC system served as a reference. Only the technical error based on algorithm performance was considered, incorporating OMC data for the calibration, initialization, and a biomechanical model. To evaluate the validity of IMU-based 3D joint kinematics, root mean squared error (RMSE), range of motion error (ROME), Bland-Altman (BA) analysis as well as the coefficient of multiple correlation (CMC) were calculated. The evaluation was twofold. First, the IMU data was compared to OMC data based on marker clusters; and, second based on skin markers attached to anatomical landmarks. The first evaluation revealed means for RMSE and ROME for all joints and tasks below 3°. The more dynamic task, CMJ, revealed error measures approximately 1° higher than the remaining tasks. Mean CMC values ranged from 0.77 to 1 over all joint angles and all tasks. The second evaluation showed an increase in the RMSE of 2.28°– 2.58° on average for all joints and tasks. Hip flexion revealed the highest average RMSE in all tasks (4.87°– 8.27°). The present study revealed a valid IMU-based approach for the measurement of 3D joint kinematics in functional movements of varying demands. The high validity of the results encourages further development and the extension of the present approach into clinical settings.

Planar force or pressure is a fundamental physical aspect during any people-vs-people and people-vs-environment activities and interactions. It is as significant as the more established linear and angular acceleration (usually acquired by inertial measurement units). There have been several studies involving planar pressure in the discipline of activity recognition, as reviewed in the first chapter. These studies have shown that planar pressure is a promising sensing modality for activity recognition. However, they still take a niche part in the entire discipline, using ad hoc systems and data analysis methods. Mostly these studies were not followed by further elaborative works. The situation calls for a general framework that can help push planar pressure sensing into the mainstream.
This dissertation systematically investigates using planar pressure distribution sensing technology for ubiquitous and wearable activity recognition purposes. We propose a generic Textile Pressure Mapping (TPM) Framework, which encapsulates (1) design knowledge and guidelines, (2) a multi-layered tool including hardware, software and algorithms, and (3) an ensemble of empirical study examples. Through validation with various empirical studies, the unified TPM framework covers the full scope of application recognition, including the ambient, object, and wearable subspaces.
The hardware part constructs a general architecture and implementations in the large-scale and mobile directions separately. The software toolkit consists of four heterogeneous tiers: driver, data processing, machine learning, visualization/feedback. The algorithm chapter describes generic data processing techniques and a unified TPM feature set. The TPM framework offers a universal solution for other researchers and developers to evaluate TPM sensing modality in their application scenarios.
The significant findings from the empirical studies have shown that TPM is a versatile sensing modality. Specifically, in the ambient subspace, a sports mat or carpet with TPM sensors embedded underneath can distinguish different sports activities or different people's gait based on the dynamic change of body-print; a pressure sensitive tablecloth can detect various dining actions by the force propagated from the cutlery through the plates to the tabletop. In the object subspace, swirl office chairs with TPM sensors under the cover can be used to detect the seater's real-time posture; TPM can be used to detect emotion-related touch interactions for smart objects, toys or robots. In the wearable subspace, TPM sensors can be used to perform pressure-based mechanomyography to detect muscle and body movement; it can also be tailored to cover the surface of a soccer shoe to distinguish different kicking angles and intensities.
All the empirical evaluations have resulted in accuracies well-above the chance level of the corresponding number of classes, e.g., the `swirl chair' study has classification accuracy of 79.5% out of 10 posture classes and in the `soccer shoe' study the accuracy is 98.8% among 17 combinations of angle and intensity.

Under the notion of Cyber-Physical Systems an increasingly important research area has
evolved with the aim of improving the connectivity and interoperability of previously
separate system functions. Today, the advanced networking and processing capabilities
of embedded systems make it possible to establish strongly distributed, heterogeneous
systems of systems. In such configurations, the system boundary does not necessarily
end with the hardware, but can also take into account the wider context such as people
and environmental factors. In addition to being open and adaptive to other networked
systems at integration time, such systems need to be able to adapt themselves in accordance
with dynamic changes in their application environments. Considering that many
of the potential application domains are inherently safety-critical, it has to be ensured
that the necessary modifications in the individual system behavior are safe. However,
currently available state-of-the-practice and state-of-the-art approaches for safety assurance
and certification are not applicable to this context.
To provide a feasible solution approach, this thesis introduces a framework that allows
“just-in-time” safety certification for the dynamic adaptation behavior of networked
systems. Dynamic safety contracts (DSCs) are presented as the core solution concept
for monitoring and synthesis of decentralized safety knowledge. Ultimately, this opens
up a path towards standardized service provision concepts as a set of safety-related runtime
evidences. DSCs enable the modular specification of relevant safety features in
networked applications as a series of formalized demand-guarantee dependencies. The
specified safety features can be hierarchically integrated and linked to an interpretation
level for accessing the scope of possible safe behavioral adaptations. In this way, the networked
adaptation behavior can be conditionally certified with respect to the fulfilled
DSC safety features during operation. As long as the continuous evaluation process
provides safe adaptation behavior for a networked application context, safety can be
guaranteed for a networked system mode at runtime. Significant safety-related changes
in the application context, however, can lead to situations in which no safe adaptation
behavior is available for the current system state. In such cases, the remaining DSC
guarantees can be utilized to determine optimal degradation concepts for the dynamic
applications.
For the operationalization of the DSCs approach, suitable specification elements and
mechanisms have been defined. Based on a dedicated GUI-engineering framework it is
shown how DSCs can be systematically developed and transformed into appropriate runtime
representations. Furthermore, a safety engineering backbone is outlined to support
the DSC modeling process in concrete application scenarios. The conducted validation
activities show the feasibility and adequacy of the proposed DSCs approach. In parallel,
limitations and areas of future improvement are pointed out.

In computer graphics, realistic rendering of virtual scenes is a computationally complex problem. State-of-the-art rendering technology must become more scalable to
meet the performance requirements for demanding real-time applications.
This dissertation is concerned with core algorithms for rendering, focusing on the
ray tracing method in particular, to support and saturate recent massively parallel computer systems, i.e., to distribute the complex computations very efficiently
among a large number of processing elements. More specifically, the three targeted
main contributions are:
1. Collaboration framework for large-scale distributed memory computers
The purpose of the collaboration framework is to enable scalable rendering
in real-time on a distributed memory computer. As an infrastructure layer it
manages the explicit communication within a network of distributed memory
nodes transparently for the rendering application. The research is focused on
designing a communication protocol resilient against delays and negligible in
overhead, relying exclusively on one-sided and asynchronous data transfers.
The hypothesis is that a loosely coupled system like this is able to scale linearly
with the number of nodes, which is tested by directly measuring all possible
communication-induced delays as well as the overall rendering throughput.
2. Ray tracing algorithms designed for vector processing
Vector processors are to be efficiently utilized for improved ray tracing performance. This requires the basic, scalar traversal algorithm to be reformulated
in order to expose a high degree of fine-grained data parallelism. Two approaches are investigated: traversing multiple rays simultaneously, and performing
multiple traversal steps at once. Efficiently establishing coherence in a group
of rays as well as avoiding sorting of the nodes in a multi-traversal step are the
defining research goals.
3. Multi-threaded schedule and memory management for the ray tracing acceleration structure
Construction times of high-quality acceleration structures are to be reduced by
improvements to multi-threaded scalability and utilization of vector processors. Research is directed at eliminating the following scalability bottlenecks:
dynamic memory growth caused by the primitive splits required for high-
quality structures, and top-level hierarchy construction where simple task par-
allelism is not readily available. Additional research addresses how to expose
scatter/gather-free data-parallelism for efficient vector processing.
Together, these contributions form a scalable, high-performance basis for real-time,
ray tracing-based rendering, and a prototype path tracing application implemented
on top of this basis serves as a demonstration.
The key insight driving this dissertation is that the computational power necessary
for realistic light transport for real-time rendering applications demands massively
parallel computers, which in turn require highly scalable algorithms. Therefore this
dissertation provides important research along the path towards virtual reality.

While the design step should be free from computational related constraints and operations due to its artistic aspect, the modeling phase has to prepare the model for the later stages of the pipeline.
This dissertation is concerned with the design and implementation of a framework for local remeshing and optimization. Based on the experience gathered, a full study about mesh quality criteria is also part of this work.
The contributions can be highlighted as: (1) a local meshing technique based on a completely novel approach constrained to the preservation of the mesh of non interesting areas. With this concept, designers can work on the design details of specific regions of the model without introducing more polygons elsewhere; (2) a tool capable of recovering the shape of a refined area to its decimated version, enabling details on optimized meshes of detailed models; (3) the integration of novel techniques into a single framework for meshing and smoothing which is constrained to surface structure; (4) the development of a mesh quality criteria priority structure, being able to classify and prioritize according to the application of the mesh.
Although efficient meshing techniques have been proposed along the years, most of them lack the possibility to mesh smaller regions of the base mesh, preserving the mesh quality and density of outer areas.
Considering this limitation, this dissertation seeks answers to the following research questions:
1. Given that mesh quality is relative to the application it is intended for, is it possible to design a general mesh evaluation plan?
2. How to prioritize specific mesh criteria over others?
3. Given an optimized mesh and its original design, how to improve the representation of single regions of the first, without degrading the mesh quality elsewhere?
Four main achievements came from the respective answers:
1. The Application Driven Mesh Quality Criteria Structure: Due to high variation in mesh standards because of various computer aided operations performed for different applications, e.g. animation or stress simulation, a structure for better visualization of mesh quality criteria is proposed. The criteria can be used to guide the mesh optimization, making the task consistent and reliable. This dissertation also proposes a methodology to optimize the criteria values, which is adaptable to the needs of a specific application.
2. Curvature Driven Meshing Algorithm: A novel approach, a local meshing technique, which works on a desired area of the mesh while preserving its boundaries as well as the rest of the topology. It causes a slow growth in the overall amount of polygons by making only small regions denser. The method can also be used to recover the details of a reference mesh to its decimated version while refining it. Moreover, it employs a geometric fast and easy to implement approach representing surface features as simple circles, being used to guide the meshing. It also generates quad-dominant meshes, with triangle count directly dependent on the size of the boundary.
3. Curvature-based Method for Anisotropic Mesh Smoothing: A geometric-based method is extended to 3D space to be able to produce anisotropic elements where needed. It is made possible by mapping the original space to another which embeds the surface curvature. This methodology is used to enhance the smoothing algorithm by making the nearly regularized elements follow the surface features, preserving the original design. The mesh optimization method also preserves mesh topology, while resizing elements according to the local mesh resolution, effectively enhancing the design aspects intended.
4. Framework for Local Restructure of Meshed Surfaces: The combination of both methods creates a complete tool for recovering surface details through mesh refinement and curvature aware mesh smoothing.

In this thesis, we consider the problem of processing similarity queries over a dataset of top-k rankings and class constrained objects. Top-k rankings are the most natural and widely used technique to compress a large amount of information into a concise form. Spearman’s Footrule distance is used to compute the similarity between rankings, considering how well rankings agree on the positions (ranks) of ranked items. This setup allows the application of metric distance-based pruning strategies, and, alternatively, enables the use of traditional inverted indices for retrieving rankings that overlap in items. Although both techniques can be individually applied, we hypothesize that blending these two would lead to better performance. First, we formulate theoretical bounds over the rankings, based on Spearman's Footrule distance, which are essential for adapting existing, inverted index based techniques to the setting of top-k rankings. Further, we propose a hybrid indexing strategy, designed for efficiently processing similarity range queries, which incorporates inverted indices and metric space indices, such as M- or BK-trees, resulting in a structure that resembles both indexing methods with tunable emphasis on one or the other. Moreover, optimizations to the inverted index component are presented, for early termination and minimizing bookkeeping. As vast amounts of data are being generated on a daily bases, we further present a distributed, highly tunable, approach, implemented in Apache Spark, for efficiently processing similarity join queries over top-k rankings. To combine distance-based filtering with inverted indices, the algorithm works in several phases. The partial results are joined for the computation of the final result set. As the last contribution of the thesis, we consider processing k-nearest-neighbor (k-NN) queries over class-constrained objects, with the additional requirement that the result objects are of a specific type. We introduce the MISP index, which first indexes the objects by their (combination of) class belonging, followed by a similarity search sub index for each subset of objects. The number of such subsets can combinatorially explode, thus, we provide a cost model that analyzes the performance of the MISP index structure under different configurations, with the aim of finding the most efficient one for the dataset being searched.

Visualization is vital to the scientific discovery process.
An interactive high-fidelity rendering provides accelerated insight into complex structures, models and relationships.
However, the efficient mapping of visualization tasks to high performance architectures is often difficult, being subject to a challenging mixture of hardware and software architectural complexities in combination with domain-specific hurdles.
These difficulties are often exacerbated on heterogeneous architectures.
In this thesis, a variety of ray casting-based techniques are developed and investigated with respect to a more efficient usage of heterogeneous HPC systems for distributed visualization, addressing challenges in mesh-free rendering, in-situ compression, task-based workload formulation, and remote visualization at large scale.
A novel direct raytracing scheme for on-the-fly free surface reconstruction of particle-based simulations using an extended anisoptropic kernel model is investigated on different state-of-the-art cluster setups.
The versatile system renders up to 170 million particles on 32 distributed compute nodes at close to interactive frame rates at 4K resolution with ambient occlusion.
To address the widening gap between high computational throughput and prohibitively slow I/O subsystems, in situ topological contour tree analysis is combined with a compact image-based data representation to provide an effective and easy-to-control trade-off between storage overhead and visualization fidelity.
Experiments show significant reductions in storage requirements, while preserving flexibility for exploration and analysis.
Driven by an increasingly heterogeneous system landscape, a flexible distributed direct volume rendering and hybrid compositing framework is presented.
Based on a task-based dynamic runtime environment, it enables adaptable performance-oriented deployment on various platform configurations.
Comprehensive benchmarks with respect to task granularity and scaling are conducted to verify the characteristics and potential of the novel task-based system design.
A core challenge of HPC visualization is the physical separation of visualization resources and end-users.
Using more tiles than previously thought reasonable, a distributed, low-latency multi-tile streaming system is demonstrated, being able to sustain a stable 80 Hz when streaming up to 256 synchronized 3840x2160 tiles and achieve 365 Hz at 3840x2160 for sort-first compositing over the internet, thereby enabling lightweight visualization clients and leaving all the heavy lifting to the remote supercomputer.

Topology-Based Characterization and Visual Analysis of Feature Evolution in Large-Scale Simulations
(2019)

This manuscript presents a topology-based analysis and visualization framework that enables the effective exploration of feature evolution in large-scale simulations. Such simulations pose additional challenges to the already complex task of feature tracking and visualization, since the vast number of features and the size of the simulation data make it infeasible to naively identify, track, analyze, render, store, and interact with data. The presented methodology addresses these issues via three core contributions. First, the manuscript defines a novel topological abstraction, called the Nested Tracking Graph (NTG), that records the temporal evolution of features that exhibit a nesting hierarchy, such as superlevel set components for multiple levels, or filtered features across multiple thresholds. In contrast to common tracking graphs that are only capable of describing feature evolution at one hierarchy level, NTGs effectively summarize their evolution across all hierarchy levels in one compact visualization. The second core contribution is a view-approximation oriented image database generation approach (VOIDGA) that stores, at simulation runtime, a reduced set of feature images. Instead of storing the features themselves---which is often infeasable due to bandwidth constraints---the images of these databases can be used to approximate the depicted features from any view angle within an acceptable visual error, which requires far less disk space and only introduces a neglectable overhead. The final core contribution combines these approaches into a methodology that stores in situ the least amount of information necessary to support flexible post hoc analysis utilizing NTGs and view approximation techniques.

Shared memory concurrency is the pervasive programming model for multicore architectures
such as x86, Power, and ARM. Depending on the memory organization, each architecture follows
a somewhat different shared memory model. All these models, however, have one common
feature: they allow certain outcomes for concurrent programs that cannot be explained
by interleaving execution. In addition to the complexity due to architectures, compilers like
GCC and LLVM perform various program transformations, which also affect the outcomes of
concurrent programs.
To be able to program these systems correctly and effectively, it is important to define a
formal language-level concurrency model. For efficiency, it is important that the model is
weak enough to allow various compiler optimizations on shared memory accesses as well
as efficient mappings to the architectures. For programmability, the model should be strong
enough to disallow bogus “out-of-thin-air” executions and provide strong guarantees for well-synchronized
programs. Because of these conflicting requirements, defining such a formal
model is very difficult. This is why, despite years of research, major programming languages
such as C/C++ and Java do not yet have completely adequate formal models defining their
concurrency semantics.
In this thesis, we address this challenge and develop a formal concurrency model that is very
good both in terms of compilation efficiency and of programmability. Unlike most previous
approaches, which were defined either operationally or axiomatically on single executions,
our formal model is based on event structures, which represents multiple program executions,
and thus gives us more structure to define the semantics of concurrency.
In more detail, our formalization has two variants: the weaker version, WEAKEST, and the
stronger version, WEAKESTMO. The WEAKEST model simulates the promising semantics proposed
by Kang et al., while WEAKESTMO is incomparable to the promising semantics. Moreover,
WEAKESTMO discards certain questionable behaviors allowed by the promising semantics.
We show that the proposed WEAKESTMO model resolve out-of-thin-air problem, provide
standard data-race-freedom (DRF) guarantees, allow the desirable optimizations, and can be
mapped to the architectures like x86, PowerPC, and ARMv7. Additionally, our models are
flexible enough to leverage existing results from the literature to establish data-race-freedom
(DRF) guarantees and correctness of compilation.
In addition, in order to ensure the correctness of compilation by a major compiler, we developed
a translation validator targeting LLVM’s “opt” transformations of concurrent C/C++
programs. Using the validator, we identified a few subtle compilation bugs, which were reported
and were fixed. Additionally, we observe that LLVM concurrency semantics differs
from that of C11; there are transformations which are justified in C11 but not in LLVM and
vice versa. Considering the subtle aspects of LLVM concurrency, we formalized a fragment
of LLVM’s concurrency semantics and integrated it into our WEAKESTMO model.

The systems in industrial automation management (IAM) are information systems. The management parts of such systems are software components that support the manufacturing processes. The operational parts control highly plug-compatible devices, such as controllers, sensors and motors. Process variability and topology variability are the two main characteristics of software families in this domain. Furthermore, three roles of stakeholders -- requirement engineers, hardware-oriented engineers, and software developers -- participate in different derivation stages and have different variability concerns. In current practice, the development and reuse of such systems is costly and time-consuming, due to the complexity of topology and process variability. To overcome these challenges, the goal of this thesis is to develop an approach to improve the software product derivation process for systems in industrial automation management, where different variability types are concerned in different derivation stages. Current state-of-the-art approaches commonly use general-purpose variability modeling languages to represent variability, which is not sufficient for IAM systems. The process and topology variability requires more user-centered modeling and representation. The insufficiency of variability modeling leads to low efficiency during the staged derivation process involving different stakeholders. Up to now, product line approaches for systematic variability modeling and realization have not been well established for such complex domains. The model-based derivation approach presented in this thesis integrates feature modeling with domain-specific models for expressing processes and topology. The multi-variability modeling framework includes the meta-models of the three variability types and their associations. The realization and implementation of the multi-variability involves the mapping and the tracing of variants to their corresponding software product line assets. Based on the foundation of multi-variability modeling and realization, a derivation infrastructure is developed, which enables a semi-automated software derivation approach. It supports the configuration of different variability types to be integrated into the staged derivation process of the involved stakeholders. The derivation approach is evaluated in an industry-grade case study of a complex software system. The feasibility is demonstrated by applying the approach in the case study. By using the approach, both the size of the reusable core assets and the automation level of derivation are significantly improved. Furthermore, semi-structured interviews with engineers in practice have evaluated the usefulness and ease-of-use of the proposed approach. The results show a positive attitude towards applying the approach in practice, and high potential to generalize it to other related domains.

Postmortem Analysis of Decayed Online Social Communities: Cascade Pattern Analysis and Prediction
(2018)

Recently, many online social networks, such as MySpace, Orkut, and Friendster, have faced inactivity decay of their members, which contributed to the collapse of these networks. The reasons, mechanics, and prevention mechanisms of such inactivity decay are not fully understood. In this work, we analyze decayed and alive subwebsites from the Stack Exchange platform. The analysis mainly focuses on the inactivity cascades that occur among the members of these communities. We provide measures to understand the decay process and statistical analysis to extract the patterns that accompany the inactivity decay. Additionally, we predict cascade size and cascade virality using machine learning. The results of this work include a statistically significant difference of the decay patterns between the decayed and the alive subwebsites. These patterns are mainly cascade size, cascade virality, cascade duration, and cascade similarity. Additionally, the contributed prediction framework showed satisfactorily prediction results compared to a baseline predictor. Supported by empirical evidence, the main findings of this work are (1) there are significantly different decay patterns in the alive and the decayed subwebsites of the Stack Exchange; (2) the cascade’s node degrees contribute more to the decay process than the cascade’s virality, which indicates that the expert members of the Stack Exchange subwebsites were mainly responsible for the activity or inactivity of the Stack Exchange subwebsites; (3) the Statistics subwebsite is going through decay dynamics that may lead to it becoming fully-decayed; (4) the decay process is not governed by only one network measure, it is better described using multiple measures; (5) decayed subwebsites were originally less resilient to inactivity decay, unlike the alive subwebsites; and (6) network’s structure in the early stages of its evolution dictates the activity/inactivity characteristics of the network.

The usage of sensors in modern technical systems and consumer products is in a rapid increase. This advancement can be characterized by two major factors, namely, the mass introduction of consumer oriented sensing devices to the market and the sheer amount of sensor data being generated. These characteristics raise subsequent challenges regarding both the consumer sensing devices' reliability and the management and utilization of the generated sensor data. This thesis addresses these challenges through two main contributions. It presents a novel framework that leverages sentiment analysis techniques in order to assess the quality of consumer sensing devices. It also couples semantic technologies with big data technologies to present a new optimized approach for realization and management of semantic sensor data, hence providing a robust means of integration, analysis, and reuse of the generated data. The thesis also presents several applications that show the potential of the contributions in real-life scenarios.
Due to the broad range, growing feature set and fast release pace of new sensor-based products, evaluating these products is very challenging as standard product testing is not practical. As an alternative, an end-to-end aspect-based sentiment summarizer pipeline for evaluation of consumer sensing devices is presented. The pipeline uses product reviews to extract the sentiment at the aspect level and includes several components namely, product name extractor, aspects extractor and a lexicon-based sentiment extractor which handles multiple sentiment analysis challenges such as sentiment shifters, negations, and comparative sentences among others. The proposed summarizer's components generally outperform the state-of-the-art approaches. As a use case, features of the market leading fitness trackers are evaluated and a dynamic visual summarizer is presented to display the evaluation results and to provide personalized product recommendations for potential customers.
The increased usage of sensing devices in the consumer market is accompanied with increased deployment of sensors in various other fields such as industry, agriculture, and energy production systems. This necessitates using efficient and scalable methods for storing and processing of sensor data. Coupling big data technologies with semantic techniques not only helps to achieve the desired storage and processing goals, but also facilitates data integration, data analysis, and the utilization of data in unforeseen future applications through preserving the data generation context. This thesis proposes an efficient and scalable solution for semantification, storage and processing of raw sensor data through ontological modelling of sensor data and a novel encoding scheme that harnesses the split between the statements of the conceptual model of an ontology (TBox) and the individual facts (ABox) along with in-memory processing capabilities of modern big data systems. A sample use case is further introduced where a smartphone is deployed in a transportation bus to collect various sensor data which is then utilized in detecting street anomalies.
In addition to the aforementioned contributions, and to highlight the potential use cases of sensor data publicly available, a recommender system is developed using running route data, used for proximity-based retrieval, to provide personalized suggestions for new routes considering the runner's performance, visual and nature of route preferences.
This thesis aims at enhancing the integration of sensing devices in daily life applications through facilitating the public acquisition of consumer sensing devices. It also aims at achieving better integration and processing of sensor data in order to enable new potential usage scenarios of the raw generated data.

Most modern multiprocessors offer weak memory behavior to improve their performance in terms of throughput. They allow the order of memory operations to be observed differently by each processor. This is opposite to the concept of sequential consistency (SC) which enforces a unique sequential view on all operations for all processors. Because most software has been and still is developed with SC in mind, we face a gap between the expected behavior and the actual behavior on modern architectures. The issues described only affect multithreaded software and therefore most programmers might never face them. However, multi-threaded bare metal software like operating systems, embedded software, and real-time software have to consider memory consistency and ensure that the order of memory operations does not yield unexpected results. This software is more critical as general consumer software in terms of consequences, and therefore new methods are needed to ensure their correct behavior.
In general, a memory system is considered weak if it allows behavior that is not possible in a sequential system. For example, in the SPARC processor with total store ordering (TSO) consistency, all writes might be delayed by store buffers before they eventually are processed by the main memory. This allows the issuing process to work with its own written values before other processes observed them (i.e., reading its own value before it leaves the store buffer). Because this behavior is not possible with sequential consistency, TSO is considered to be weaker than SC. Programming in the context of weak memory architectures requires a proper comprehension of how the model deviates from expected sequential behavior. For verification of these programs formal representations are required that cover the weak behavior in order to utilize formal verification tools.
This thesis explores different verification approaches and respectively fitting representations of a multitude of memory models. In a joint effort, we started with the concept of testing memory operation traces in regard of their consistency with different memory consistency models. A memory operation trace is directly derived from a program trace and consists of a sequence of read and write operations for each process. Analyzing the testing problem, we are able to prove that the problem is NP-complete for most memory models. In that process, a satisfiability (SAT) encoding for given problem instances was developed, that can be used in reachability and robustness analysis.
In order to cover all program executions instead of just a single program trace, additional representations are introduced and explored throughout this thesis. One of the representations introduced is a novel approach to specify a weak memory system using temporal logics. A set of linear temporal logic (LTL) formulas is developed that describes all properties required to restrict possible traces to those consistent to the given memory model. The resulting LTL specifications can directly be used in model checking, e.g., to check safety conditions. Unfortunately, the derived LTL specifications suffer from the state explosion problem: Even small examples, like the Peterson mutual exclusion algorithm, tend to generate huge formulas and require vast amounts of memory for verification. For this reason, it is concluded that using the proposed verification approach these specifications are not well suited for verification of real world software. Nonetheless, they provide comprehensive and formally correct descriptions that might be used elsewhere, e.g., programming or teaching.
Another approach to represent these models are operational semantics. In this thesis, operational semantics of weak memory models are provided in the form of reference machines that are both correct and complete regarding the memory model specification. Operational semantics allow to simulate systems with weak memory models step by step. This provides an elegant way to study the effects that lead to weak consistent behavior, while still providing a basis for formal verification. The operational models are then incorporated in verification tools for multithreaded software. These state space exploration tools proved suitable for verification of multithreaded software in a weak consistent memory environment. However, because not only the memory system but also the processor are expressed as operational semantics, some verification approach will not be feasible due to the large size of the state space.
Finally, to tackle the beforementioned issue, a state transition system for parallel programs is proposed. The transition system is defined by a set of structural operational semantics (SOS) rules and a suitable memory structure that can cover multiple memory models. This allows to influence the state space by use of smart representations and approximation approaches in future work.

Large-scale distributed systems consist of a number of components, take a number of parameter values as input, and behave differently based on a number of non-deterministic events. All these features—components, parameter values, and events—interact in complicated ways, and unanticipated interactions may lead to bugs. Empirically, many bugs in these systems are caused by interactions of only a small number of features. In certain cases, it may be possible to test all interactions of \(k\) features for a small constant \(k\) by executing a family of tests that is exponentially or even doubly-exponentially smaller than the family of all tests. Thus, in such cases we can effectively uncover all bugs that require up to \(k\)-wise interactions of features.
In this thesis we study two occurrences of this phenomenon. First, many bugs in distributed systems are caused by network partition faults. In most cases these bugs occur due to two or three key nodes, such as leaders or replicas, not being able to communicate, or because the leading node finds itself in a block of the partition without quorum. Second, bugs may occur due to unexpected schedules (interleavings) of concurrent events—concurrent exchange of messages and concurrent access to shared resources. Again, many bugs depend only on the relative ordering of a small number of events. We call the smallest number of events whose ordering causes a bug the depth of the bug. We show that in both testing scenarios we can effectively uncover bugs involving small number of nodes or bugs of small depth by executing small families of tests.
We phrase both testing scenarios in terms of an abstract framework of tests, testing goals, and goal coverage. Sets of tests that cover all testing goals are called covering families. We give a general construction that shows that whenever a random test covers a fixed goal with sufficiently high probability, a small randomly chosen set of tests is a covering family with high probability. We then introduce concrete coverage notions relating to network partition faults and bugs of small depth. In case of network partition faults, we show that for the introduced coverage notions we can find a lower bound on the probability that a random test covers a given goal. Our general construction then yields a randomized testing procedure that achieves full coverage—and hence, find bugs—quickly.
In case of coverage notions related to bugs of small depth, if the events in the program form a non-trivial partial order, our general construction may give a suboptimal bound. Thus, we study other ways of constructing covering families. We show that if the events in a concurrent program are partially ordered as a tree, we can explicitly construct a covering family of small size: for balanced trees, our construction is polylogarithmic in the number of events. For the case when the partial order of events does not have a "nice" structure, and the events and their relation to previous events are revealed while the program is running, we give an online construction of covering families. Based on the construction, we develop a randomized scheduler called PCTCP that uniformly samples schedules from a covering family and has a rigorous guarantee of finding bugs of small depth. We experiment with an implementation of PCTCP on two real-world distributed systems—Zookeeper and Cassandra—and show that it can effectively find bugs.

Ranking lists are an essential methodology to succinctly summarize outstanding items, computed over database tables or crowdsourced in dedicated websites. In this thesis, we propose the usage of automatically generated, entity-centric rankings to discover insights in data. We present PALEO, a framework for data exploration through reverse engineering top-k database queries, that is, given a database and a sample top-k input list, our approach, aims at determining an SQL query that returns results similar to the provided input when executed over the database. The core problem consist of finding selection predicates that return the given items, determining the correct ranking criteria, and evaluating the most promising candidate queries first. PALEO operates on subset of the base data, uses data samples, histograms, descriptive statistics, and further proposes models that assess the suitability of candidate queries which facilitate limitation of false positives. Furthermore, this thesis presents COMPETE, a novel approach that models and computes dominance over user-provided input entities, given a database of top-k rankings. The resulting entities are found superior or inferior with tunable degree of dominance over the input set---a very intuitive, yet insightful way to explore pros and cons of entities of interest. Several notions of dominance are defined which differ in computational complexity and strictness of the dominance concept---yet, interdependent through containment relations. COMPETE is able to pick the most promising approach to satisfy a user request at minimal runtime latency, using a probabilistic model that is estimating the result sizes. The individual flavors of dominance are cast into a stack of algorithms over inverted indices and auxiliary structures, enabling pruning techniques to avoid significant data access over large datasets of rankings.

Wearable activity recognition aims to identify and assess human activities with the help
of computer systems by evaluating signals of sensors which can be attached to the human
body. This provides us with valuable information in several areas: in health care, e.g. fluid
and food intake monitoring; in sports, e.g. training support and monitoring; in entertainment,
e.g. human-computer interface using body movements; in industrial scenarios, e.g.
computer support for detected work tasks. Several challenges exist for wearable activity
recognition: a large number of nonrelevant activities (null class), the evaluation of large
numbers of sensor signals (curse of dimensionality), ambiguity of sensor signals compared
to the activities and finally the high variability of human activity in general.
This thesis develops a new activity recognition strategy, called invariants classification,
which addresses these challenges, especially the variability in human activities. The
core idea is that often even highly variable actions include short, more or less invariant
sub-actions which are due to hard physical constraints. If someone opens a door, the
movement of the hand to the door handle is not fixed. However the door handle has to
be pushed to open the door. The invariants classification algorithm is structured in four
phases: segmentation, invariant identification, classification, and spotting. The segmentation
divides the continuous sensor data stream into meaningful parts, which are related
to sub-activities. Our segmentation strategy uses the zero crossings of the central difference
quotient of the sensor signals, as segment borders. The invariant identification finds
the invariant sub-activities by means of clustering and a selection strategy dependent on
certain features. The classification identifies the segments of a specific activity class, using
models generated from the invariant sub-activities. The models include the invariant
sub-activity signal and features calculated on sensor signals related to the sub-activity. In
the spotting, the classified segments are used to find the entire activity class instances in
the continuous sensor data stream. For this purpose, we use the position of the invariant
sub-activity in the related activity class instance for the estimation of the borders of the
activity instances.
In this thesis, we show that our new activity recognition strategy, built on invariant
sub-activities, is beneficial. We tested it on three human activity datasets with wearable
inertial measurement units (IMU). Compared to previous publications on the same
datasets we got improvement in the activity recognition in several classes, some with a
large margin. Our segmentation achieves a sensible method to separate the sensor data in
relation to the underlying activities. Relying on sub-activities makes us independent from
imprecise labels on the training data. After the identification of invariant sub-activities,
we calculate a value called cluster precision for each sensor signal and each class activity.
This tells us which classes can be easily classified and which sensor channels support
the classification best. Finally, in the training for each activity class, our algorithm selects
suitable signal channels with invariant sub-activities on different points in time and
with different length. This makes our strategy a multi-dimensional asynchronous motif
detection with variable motif length.

Graphs and flow networks are important mathematical concepts that enable the modeling and analysis of a large variety of real world problems in different domains such as engineering, medicine or computer science. The number, sizes and complexities of those problems permanently increased during the last decades. This led to an increased demand of techniques that help domain experts in understanding their data and its underlying structure to enable an efficient analysis and decision making process.
To tackle this challenge, this work presents several new techniques that utilize concepts of visual analysis to provide domain scientists with new visualization methodologies and tools. Therefore, this work provides novel concepts and approaches for diverse aspects of the visual analysis such as data transformation, visual mapping, parameter refinement and analysis, model building and visualization as well as user interaction.
The presented techniques form a framework that enriches domain scientists with new visual analysis tools and help them analyze their data and gain insight from the underlying structures. To show the applicability and effectiveness of the presented approaches, this work tackles different applications such as networking, product flow management and vascular systems, while preserving the generality to be applicable to further domains.

The simulation of physical phenomena involving the dynamic behavior of fluids and gases
has numerous applications in various fields of science and engineering. Of particular interest
is the material transport behavior, the tendency of a flow field to displace parts of the
medium. Therefore, many visualization techniques rely on particle trajectories.
Lagrangian Flow Field Representation. In typical Eulerian settings, trajectories are
computed from the simulation output using numerical integration schemes. Accuracy concerns
arise because, due to limitations of storage space and bandwidth, often only a fraction
of the computed simulation time steps are available. Prior work has shown empirically that
a Lagrangian, trajectory-based representation can improve accuracy [Agr+14]. Determining
the parameters of such a representation in advance is difficult; a relationship between the
temporal and spatial resolution and the accuracy of resulting trajectories needs to be established.
We provide an error measure for upper bounds of the error of individual trajectories.
We show how areas at risk for high errors can be identified, thereby making it possible to
prioritize areas in time and space to allocate scarce storage resources.
Comparative Visual Analysis of Flow Field Ensembles. Independent of the representation,
errors of the simulation itself are often caused by inaccurate initial conditions,
limitations of the chosen simulation model, and numerical errors. To gain a better understanding
of the possible outcomes, multiple simulation runs can be calculated, resulting in
sets of simulation output referred to as ensembles. Of particular interest when studying the
material transport behavior of ensembles is the identification of areas where the simulation
runs agree or disagree. We introduce and evaluate an interactive method that enables application
scientists to reliably identify and examine regions of agreement and disagreement,
while taking into account the local transport behavior within individual simulation runs.
Particle-Based Representation and Visualization of Uncertain Flow Data Sets. Unlike
simulation ensembles, where uncertainty of the solution appears in the form of different
simulation runs, moment-based Eulerian multi-phase fluid simulations are probabilistic in
nature. These simulations, used in process engineering to simulate the behavior of bubbles in
liquid media, are aimed toward reducing the need for real-world experiments. The locations
of individual bubbles are not modeled explicitly, but stochastically through the properties of
locally defined bubble populations. Comparisons between simulation results and physical
experiments are difficult. We describe and analyze an approach that generates representative
sets of bubbles for moment-based simulation data. Using our approach, application scientists
can directly, visually compare simulation results and physical experiments.

Education is the Achilles heel of successful resuscitation in cardiac arrest. Therefore, we aim to contribute to the educational efficiency by providing a novel augmented-reality (AR) guided interactive cardiopulmonary resuscitation (CPR) "trainer". For this trainer, a mixed reality smart glass, Microsoft HoloLens, and a CPR manikin covered with pressure sensors were used. To introduce the CPR procedure to a learner, an application with an intractable virtual teacher model was designed. The teaching scenario consists of the two main parts, theory and practice. In the theoretical part, the virtual teacher provides all information about the CPR procedure. Afterward, the user will be asked to perform the CPR cycles in three different stages. In the first two stages, it is aimed to gain the muscle memory with audio and optical feedback system. In the end, the performance of the participant is evaluated by the virtual teacher.

We present a study comparing the effect of real-time wearable feedback with traditional training methods for cardiopulmonary resuscitation (CPR). The aim is to ensure that the students can deliver CPR with the right compression speed and depth. On the wearable side, we test two systems: one based on a combination of visual feedback and tactile information on a smart-watch and one based on visual feedback and audio information on a Google Glass. In a trial with 50 subjects (23 trainee nurses and 27 novices,) we compare those modalities to standard human teaching that is used in nurse training. While a single traditional teaching session tends to improve only the percentage of correct depth, it has less effect on the percentage of effective CPR (depth and speed correct at the same time). By contrast, in a training session with the wearable feedback device, the average percentage of time when CPR is effective improves by up to almost 25%.

The focus of this work is to provide and evaluate a novel method for multifield topology-based analysis and visualization. Through this concept, called Pareto sets, one is capable to identify critical regions in a multifield with arbitrary many individual fields. It uses ideas found in graph optimization to find common behavior and areas of divergence between multiple optimization objectives. The connections between the latter areas can be reduced into a graph structure allowing for an abstract visualization of the multifield to support data exploration and understanding.
The research question that is answered in this dissertation is about the general capability and expandability of the Pareto set concept in context of visualization and application. Furthermore, the study of its relations, drawbacks and advantages towards other topological-based approaches. This questions is answered in several steps, including consideration and comparison with related work, a thorough introduction of the Pareto set itself as well as a framework for efficient implementation and an attached discussion regarding limitations of the concept and their implications for run time, suitable data, and possible improvements.
Furthermore, this work considers possible simplification approaches like integrated single-field simplification methods but also using common structures identified through the Pareto set concept to smooth all individual fields at once. These considerations are especially important for real-world scenarios to visualize highly complex data by removing small local structures without destroying information about larger, global trends.
To further emphasize possible improvements and expandability of the Pareto set concept, the thesis studies a variety of different real world applications. For each scenario, this work shows how the definition and visualization of the Pareto set is used and improved for data exploration and analysis based on the scenarios.
In summary, this dissertation provides a complete and sound summary of the Pareto set concept as ground work for future application of multifield data analysis. The possible scenarios include those presented in the application section, but are found in a wide range of research and industrial areas relying on uncertainty analysis, time-varying data, and ensembles of data sets in general.

Novel image processing techniques have been in development for decades, but most
of these techniques are barely used in real world applications. This results in a gap
between image processing research and real-world applications; this thesis aims to
close this gap. In an initial study, the quantification, propagation, and communication
of uncertainty were determined to be key features in gaining acceptance for
new image processing techniques in applications.
This thesis presents a holistic approach based on a novel image processing pipeline,
capable of quantifying, propagating, and communicating image uncertainty. This
work provides an improved image data transformation paradigm, extending image
data using a flexible, high-dimensional uncertainty model. Based on this, a completely
redesigned image processing pipeline is presented. In this pipeline, each
step respects and preserves the underlying image uncertainty, allowing image uncertainty
quantification, image pre-processing, image segmentation, and geometry
extraction. This is communicated by utilizing meaningful visualization methodologies
throughout each computational step.
The presented methods are examined qualitatively by comparing to the Stateof-
the-Art, in addition to user evaluation in different domains. To show the applicability
of the presented approach to real world scenarios, this thesis demonstrates
domain-specific problems and the successful implementation of the presented techniques
in these domains.

The Symbol Grounding Problem (SGP) is one of the first attempts to proposed a hypothesis about mapping abstract concepts and the real world. For example, the concept "ball" can be represented by an object with a round shape (visual modality) and phonemes /b/ /a/ /l/ (audio modality).
This thesis is inspired by the association learning presented in infant development.
Newborns can associate visual and audio modalities of the same concept that are presented at the same time for vocabulary acquisition task.
The goal of this thesis is to develop a novel framework that combines the constraints of the Symbol Grounding Problem and Neural Networks in a simplified scenario of association learning in infants. The first motivation is that the network output can be considered as numerical symbolic features because the attributes of input samples are already embedded. The second motivation is the association between two samples is predefined before training via the same vectorial representation. This thesis proposes to associate two samples and the vectorial representation during training. Two scenarios are considered: sample pair association and sequence pair association.
Three main contributions are presented in this work.
The first contribution is a novel Symbolic Association Model based on two parallel MLPs.
The association task is defined by learning that two instances that represent one concept.
Moreover, a novel training algorithm is defined by matching the output vectors of the MLPs with a statistical distribution for obtaining the relationship between concepts and vectorial representations.
The second contribution is a novel Symbolic Association Model based on two parallel LSTM networks that are trained on weakly labeled sequences.
The definition of association task is extended to learn that two sequences represent the same series of concepts.
This model uses a training algorithm that is similar to MLP-based approach.
The last contribution is a Classless Association.
The association task is defined by learning based on the relationship of two samples that represents the same unknown concept.
In summary, the contributions of this thesis are to extend Artificial Intelligence and Cognitive Computation research with a new constraint that is cognitive motivated. Moreover, two training algorithms with a new constraint are proposed for two cases: single and sequence associations. Besides, a new training rule with no-labels with promising results is proposed.

In recent years, enormous progress has been made in the field of Artificial Intelligence (AI). Especially the introduction of Deep Learning and end-to-end learning, the availability of large datasets and the necessary computational power in form of specialised hardware allowed researchers to build systems with previously unseen performance in areas such as computer vision, machine translation and machine gaming. In parallel, the Semantic Web and its Linked Data movement have published many interlinked RDF datasets, forming the world’s largest, decentralised and publicly available knowledge base.
Despite these scientific successes, all current systems are still narrow AI systems. Each of them is specialised to a specific task and cannot easily be adapted to all other human intelligence tasks, as would be necessary for Artificial General Intelligence (AGI). Furthermore, most of the currently developed systems are not able to learn by making use of freely available knowledge such as provided by the Semantic Web. Autonomous incorporation of new knowledge is however one of the pre-conditions for human-like problem solving.
This work provides a small step towards teaching machines such human-like reasoning on freely available knowledge from the Semantic Web. We investigate how human associations, one of the building blocks of our thinking, can be simulated with Linked Data. The two main results of these investigations are a ground truth dataset of semantic associations and a machine learning algorithm that is able to identify patterns for them in huge knowledge bases.
The ground truth dataset of semantic associations consists of DBpedia entities that are known to be strongly associated by humans. The dataset is published as RDF and can be used for future research.
The developed machine learning algorithm is an evolutionary algorithm that can learn SPARQL queries from a given SPARQL endpoint based on a given list of exemplary source-target entity pairs. The algorithm operates in an end-to-end learning fashion, extracting features in form of graph patterns without the need for human intervention. The learned patterns form a feature space adapted to the given list of examples and can be used to predict target candidates from the SPARQL endpoint for new source nodes. On our semantic association ground truth dataset, our evolutionary graph pattern learner reaches a Recall@10 of > 63 % and an MRR (& MAP) > 43 %, outperforming all baselines. With an achieved Recall@1 of > 34% it even reaches average human top response prediction performance. We also demonstrate how the graph pattern learner can be applied to other interesting areas without modification.

Tables or ranked lists summarize facts about a group of entities in a concise and structured fashion. They are found in all kind of domains and easily comprehensible by humans. Some globally prominent examples of such rankings are the tallest buildings in the World, the richest people in Germany, or most powerful cars. The availability of vast amounts of tables or rankings from open domain allows different ways to explore data. Computing similarity between ranked lists, in order to find those lists where entities are presented in a similar order, carries important analytical insights. This thesis presents a novel query-driven Locality Sensitive Hashing (LSH) method, in order to efficiently find similar top-k rankings for a given input ranking. Experiments show that the proposed method provides a far better performance than inverted-index--based approaches, in particular, it is able to outperform the popular prefix-filtering method. Additionally, an LSH-based probabilistic pruning approach is proposed that optimizes the space utilization of inverted indices, while still maintaining a user-provided recall requirement for the results of the similarity search. Further, this thesis addresses the problem of automatically identifying interesting categorical attributes, in order to explore the entity-centric data by organizing them into meaningful categories. Our approach proposes novel statistical measures, beyond known concepts, like information entropy, in order to capture the distribution of data to train a classifier that can predict which categorical attribute will be perceived suitable by humans for data categorization. We further discuss how the information of useful categories can be applied in PANTHEON and PALEO, two data exploration frameworks developed in our group.

Computational problems that involve dynamic data, such as physics simulations and program development environments, have been an important
subject of study in programming languages. Recent advances in self-adjusting
computation made progress towards achieving efficient incremental computation by providing algorithmic language abstractions to express computations that respond automatically to dynamic changes in their inputs. Selfadjusting programs have been shown to be efficient for a broad range of problems via an explicit programming style, where the programmer uses specific
primitives to identify, create and operate on data that can change over time.
This dissertation presents implicit self-adjusting computation, a type directed technique for translating purely functional programs into self-adjusting
programs. In this implicit approach, the programmer annotates the (toplevel) input types of the programs to be translated. Type inference finds
all other types, and a type-directed translation rewrites the source program
into an explicitly self-adjusting target program. The type system is related to
information-flow type systems and enjoys decidable type inference via constraint solving. We prove that the translation outputs well-typed self-adjusting
programs and preserves the source program’s input-output behavior, guaranteeing that translated programs respond correctly to all changes to their
data. Using a cost semantics, we also prove that the translation preserves the
asymptotic complexity of the source program.
As a second contribution, we present two techniques to facilitate the processing of large and dynamic data in self-adjusting computation. First, we
present a type system for precise dependency tracking that minimizes the
time and space for storing dependency metadata. The type system improves
the scalability of self-adjusting computation by eliminating an important assumption of prior work that can lead to recording spurious dependencies.
We present a type-directed translation algorithm that generates correct selfadjusting programs without relying on this assumption. Second, we show a
probabilistic-chunking technique to further decrease space usage by controlling the fundamental space-time tradeoff in self-adjusting computation.
We implement implicit self-adjusting computation as an extension to Standard ML with compiler and runtime support. Using the compiler, we are able
to incrementalize an interesting set of applications, including standard list
and matrix benchmarks, ray tracer, PageRank, sparse graph connectivity, and
social circle counts. Our experiments show that our compiler incrementalizes existing code with only trivial amounts of annotation, and the resulting
programs bring asymptotic improvements to large datasets from real-world
applications, leading to orders of magnitude speedups in practice.

Mobility has become an integral feature of many wireless networks. Along with this mobility comes the need for location awareness. A prime example for this development are today’s and future transportation systems. They increasingly rely on wireless communications to exchange location and velocity information for a multitude of functions and applications. At the same time, the technological progress facilitates the widespread availability of sophisticated radio technology such as software-defined radios. The result is a variety of new attack vectors threatening the integrity of location information in mobile networks.
Although such attacks can have severe consequences in safety-critical environments such as transportation, the combination of mobility and integrity of spatial information has not received much attention in security research in the past. In this thesis we aim to fill this gap by providing adequate methods to protect the integrity of location and velocity information in the presence of mobility. Based on physical effects of mobility on wireless communications, we develop new methods to securely verify locations, sequences of locations, and velocity information provided by untrusted nodes. The results of our analyses show that mobility can in fact be exploited to provide robust security at low cost.
To further investigate the applicability of our schemes to real-world transportation systems, we have built the OpenSky Network, a sensor network which collects air traffic control communication data for scientific applications. The network uses crowdsourcing and has already achieved coverage in most parts of the world with more than 1000 sensors.
Based on the data provided by the network and measurements with commercial off-the-shelf hardware, we demonstrate the technical feasibility and security of our schemes in the air traffic scenario. Moreover, the experience and data provided by the OpenSky Network allows us to investigate the challenges for our schemes in the real-world air traffic communication environment. We show that our verification methods match all
requirements to help secure the next generation air traffic system.

If gradient based derivative algorithms are used to improve industrial products by reducing their target functions, the derivatives need to be exact.
The last percent of possible improvement, like the efficiency of a turbine, can only be gained if the derivatives are consistent with the solution process that is used in the simulation software.
It is problematic that the development of the simulation software is an ongoing process which leads to the use of approximated derivatives.
If a derivative computation is implemented manually, it will be inconsistent after some time if it is not updated.
This thesis presents a generalized approach which differentiates the whole simulation software with Algorithmic Differentiation (AD), and guarantees a correct and consistent derivative computation after each change to the software.
For this purpose, the variable tagging technique is developed.
The technique checks at run-time if all dependencies, which are used by the derivative algorithms, are correct.
Since it is also necessary to check the correctness of the implementation, a theorem is developed which describes how AD derivatives can be compared.
This theorem is used to develop further methods that can detect and correct errors.
All methods are designed such that they can be applied in real world applications and are used within industrial configurations.
The process described above yields consistent and correct derivatives but the efficiency can still be improved.
This is done by deriving new derivative algorithms.
A fixed-point iterator approach, with a consistent derivation, yields all state of the art algorithms and produces two new algorithms.
These two new algorithms include all implementation details and therefore they produce consistent derivative results.
For detecting hot spots in the application, the state of the art techniques are presented and extended.
The data management is changed such that the performance of the software is affected only marginally when quantities, like the number of input and output variables or the memory consumption, are computed for the detection.
The hot spots can be treated with techniques like checkpointing or preaccumulation.
How these techniques change the time and memory consumption is analyzed and it is shown how they need to be used in selected AD tools.
As a last step, the used AD tools are analyzed in more detail.
The major implementation strategies for operator overloading AD tools are presented and implementation improvements for existing AD tools are discussed.\
The discussion focuses on a minimal memory consumption and makes it possible to compare AD tools on a theoretical level.
The new AD tool CoDiPack is based on these findings and its design and concepts are presented.
The improvements and findings in this thesis make it possible, that an automatic, consistent and correct derivative is generated in an efficient way for industrial applications.

Fast Internet content delivery relies on two layers of caches on the request path. Firstly, content delivery networks (CDNs) seek to answer user requests before they traverse slow Internet paths. Secondly, aggregation caches in data centers seek to answer user requests before they traverse slow backend systems. The key challenge in managing these caches is the high variability of object sizes, request patterns, and retrieval latencies. Unfortunately, most existing literature focuses on caching with low (or no) variability in object sizes and ignores the intricacies of data center subsystems.
This thesis seeks to fill this gap with three contributions. First, we design a new caching system, called AdaptSize, that is robust under high object size variability. Second, we derive a method (called Flow-Offline Optimum or FOO) to predict the optimal cache hit ratio under variable object sizes. Third, we design a new caching system, called RobinHood, that exploits variances in retrieval latencies to deliver faster responses to user requests in data centers.
The techniques proposed in this thesis significantly improve the performance of CDN and data center caches. On two production traces from one of the world's largest CDN AdaptSize achieves 30-91% higher hit ratios than widely-used production systems, and 33-46% higher hit ratios than state-of-the-art research systems. Further, AdaptSize reduces the latency by more than 30% at the median, 90-percentile and 99-percentile.
We evaluate the accuracy of our FOO analysis technique on eight different production traces spanning four major Internet companies.
We find that FOO's error is at most 0.3%. Further, FOO reveals that the gap between online policies and OPT is much larger than previously thought: 27% on average, and up to 43% on web application traces.
We evaluate RobinHood with production traces from a major Internet company on a 50-server cluster. We find that RobinHood improves the 99-percentile latency by more than 50% over existing caching systems.
As load imbalances grow, RobinHood's latency improvement can be more than 2x. Further, we show that RobinHood is robust against server failures and adapts to automatic scaling of backend systems.
The results of this thesis demonstrate the power of guiding the design of practical caching policies using mathematical performance models and analysis. These models are general enough to find application in other areas of caching design and future challenges in Internet content delivery.

Analyzing Centrality Indices in Complex Networks: an Approach Using Fuzzy Aggregation Operators
(2018)

The identification of entities that play an important role in a system is one of the fundamental analyses being performed in network studies. This topic is mainly related to centrality indices, which quantify node centrality with respect to several properties in the represented network. The nodes identified in such an analysis are called central nodes. Although centrality indices are very useful for these analyses, there exist several challenges regarding which one fits best
for a network. In addition, if the usage of only one index for determining central
nodes leads to under- or overestimation of the importance of nodes and is
insufficient for finding important nodes, then the question is how multiple indices
can be used in conjunction in such an evaluation. Thus, in this thesis an approach is proposed that includes multiple indices of nodes, each indicating
an aspect of importance, in the respective evaluation and where all the aspects of a node’s centrality are analyzed in an explorative manner. To achieve this
aim, the proposed idea uses fuzzy operators, including a parameter for generating different types of aggregations over multiple indices. In addition, several preprocessing methods for normalization of those values are proposed and discussed. We investigate whether the choice of different decisions regarding the
aggregation of the values changes the ranking of the nodes or not. It is revealed that (1) there are nodes that remain stable among the top-ranking nodes, which
makes them the most central nodes, and there are nodes that remain stable
among the bottom-ranking nodes, which makes them the least central nodes; and (2) there are nodes that show high sensitivity to the choice of normalization
methods and/or aggregations. We explain both cases and the reasons why the nodes’ rankings are stable or sensitive to the corresponding choices in various networks, such as social networks, communication networks, and air transportation networks.

Asynchronous concurrency is a wide-spread way of writing programs that
deal with many short tasks. It is the programming model behind
event-driven concurrency, as exemplified by GUI applications, where the
tasks correspond to event handlers, web applications based around
JavaScript, the implementation of web browsers, but also of server-side
software or operating systems.
This model is widely used because it provides the performance benefits of
concurrency together with easier programming than multi-threading. While
there is ample work on how to implement asynchronous programs, and
significant work on testing and model checking, little research has been
done on handling asynchronous programs that involve heap manipulation, nor
on how to automatically optimize code for asynchronous concurrency.
This thesis addresses the question of how we can reason about asynchronous
programs while considering the heap, and how to use this this to optimize
programs. The work is organized along the main questions: (i) How can we
reason about asynchronous programs, without ignoring the heap? (ii) How
can we use such reasoning techniques to optimize programs involving
asynchronous behavior? (iii) How can we transfer these reasoning and
optimization techniques to other settings?
The unifying idea behind all the results in the thesis is the use of an
appropriate model encompassing global state and a promise-based model of
asynchronous concurrency. For the first question, We start from refinement
type systems for sequential programs and extend them to perform precise
resource-based reasoning in terms of heap contents, known outstanding
tasks and promises. This extended type system is known as Asynchronous
Liquid Separation Types, or ALST for short. We implement ALST in for OCaml
programs using the Lwt library.
For the second question, we consider a family of possible program
optimizations, described by a set of rewriting rules, the DWFM rules. The
rewriting rules are type-driven: We only guarantee soundness for programs
that are well-typed under ALST. We give a soundness proof based on a
semantic interpretation of ALST that allows us to show behavior inclusion
of pairs of programs.
For the third question, we address an optimization problem from industrial
practice: Normally, JavaScript files that are referenced in an HTML file
are be loaded synchronously, i.e., when a script tag is encountered, the
browser must suspend parsing, then load and execute the script, and only
after will it continue parsing HTML. But in practice, there are numerous
JavaScript files for which asynchronous loading would be perfectly sound.
First, we sketch a hypothetical optimization using the DWFM rules and a
static analysis.
To actually implement the analysis, we modify the approach to use a
dynamic analysis. This analysis, known as JSDefer, enables us to analyze
real-world web pages, and provide experimental evidence for the efficiency
of this transformation.

Optical Character Recognition (OCR) system plays an important role in digitization of data acquired as images from a variety of sources. Although the area is very well explored for Latin languages, some of the languages based on Arabic cursive script are not yet explored. It is due to many factors: Most importantly are the unavailability of proper data sets and complexities posed by cursive scripts. The Pashto language is one of such languages which needs considerable exploration towards OCR. In order to develop such an OCR system, this thesis provides a pioneering study that explores deep learning for the Pashto language in the field of OCR.
The Pashto language is spoken by more than $50$ million people across the world, and it is an active medium both for oral as well as written communication. It is associated with rich literary heritage and contains huge written collection. These written materials present contents of simple to complex nature, and layouts from hand-scribed to printed text. The Pashto language presents mainly two types of complexities (i) generic w.r.t. cursive script, (ii) specific w.r.t. Pashto language. Generic complexities are cursiveness, context dependency, and breaker character anomalies, as well as space anomalies. Pashto specific complexities are variations in shape for a single character and shape similarity for some of the additional Pashto characters. Existing research in the area of Arabic OCR did not lead to an end-to-end solution for the mentioned complexities and therefore could not be generalized to build a sophisticated OCR system for Pashto.
The contribution of this thesis spans in three levels, conceptual level, data level, and practical level. In the conceptual level, we have deeply explored the Pashto language and identified those characters, which are responsible for the challenges mentioned above. In the data level, a comprehensive dataset is introduced containing real images of hand-scribed contents. The dataset is manually transcribed and has the most frequent layout patterns associated with the Pashto language. The practical level contribution provides a bridge, in the form of a complete Pashto OCR system, and connects the outcomes of the conceptual and data levels contributions. The practical contribution comprises of skew detection, text-line segmentation, feature extraction, classification, and post-processing. The OCR module is more strengthened by using deep learning paradigm to recognize Pashto cursive script by the framework of Recursive Neural Networks (RNN). Proposed Pashto text recognition is based on Long Short-Term Memory Network (LSTM) and realizes a character recognition rate of $90.78\%$ on Pashto real hand-scribed images. All these contributions are integrated into an application to provide a flexible and generic End-to-End Pashto OCR system.
The impact of this thesis is not only specific to the Pashto language, but it is also beneficial to other cursive languages like Arabic, Urdu, and Persian e.t.c. The main reason is the Pashto character set, which is a superset of Arabic, Persian, and Urdu languages. Therefore, the conceptual contribution of this thesis provides insight and proposes solutions to almost all generic complexities associated with Arabic, Persian, and Urdu languages. For example, an anomaly caused by breaker characters is deeply analyzed, which is shared among 70 languages, mainly use Arabic script. This thesis presents a solution to this issue and is equally beneficial to almost all Arabic like languages.
The scope of this thesis has two important aspects. First, a social impact, i.e., how a society may benefit from it. The main advantages are to bring the historical and almost vanished document to life and to ensure the opportunities to explore, analyze, translate, share, and understand the contents of Pashto language globally. Second, the advancement and exploration of the technical aspects. Because, this thesis empirically explores the recognition and challenges which are solely related to the Pashto language, both regarding character-set and the materials which present such complexities. Furthermore, the conceptual and practical background of this thesis regarding complexities of Pashto language is very beneficial regarding OCR for other cursive languages.

Nowadays, the increasing demand for ever more customizable products has emphasized the need for more flexible and fast-changing manufacturing systems. In this environment, simulation has become a strategic tool for the design, development, and implementation of such systems. Simulation represents a relatively low-cost and risk-free alternative for testing the impact and effectiveness of changes in different aspects of manufacturing systems.
Systems that deal with this kind of data for its use in decision making processes are known as Simulation-Based Decision Support Systems (SB-DSS). Although most SB-DSS provide a powerful variety of tools for the automatic and semi-automatic analysis of simulations, visual and interactive alternatives for the manual exploration of the results are still open to further development.
The work in this dissertation is focused on enhancing decision makers’ analysis capabilities by making simulation data more accessible through the incorporation of visualization and analysis techniques. To demonstrate how this goal can be achieved, two systems were developed. The first system, viPhos – standing for visualization of Phos: Greek for light –, is a system that supports lighting design in factory layout planning. viPhos combines simulation, analysis, and visualization tools and techniques to facilitate the global and local (overall factory or single workstations, respectively) interactive exploration and comparison of lighting design alternatives.
The second system, STRAD - standing for Spatio-Temporal Radar -, is a web-based systems that considers the spatio/attribute-temporal analysis of event data. Since decision making processes in manufacturing also involve the monitoring of the systems over time, STRAD enables the multilevel exploration of event data (e.g., simulated or historical registers of the status of machines or results of quality control processes).
A set of four case studies and one proof of concept prepared for both systems demonstrate the suitability of the visualization and analysis strategies adopted for supporting decision making processes in diverse application domains. The results of these case studies indicate that both, the systems as well as the techniques included in the systems can be generalized and extended to support the analysis of different tasks and scenarios.

Collaboration aims to increase the efficiency of problem solving and decision making by bringing diverse areas of expertise together, i.e., teams of experts from various disciplines, all necessary to come up with acceptable concepts. This dissertation is concerned with the design of highly efficient computer-supported collaborative work involving active participation of user groups with diverse expertise. Three main contributions can be highlighted: (1) the definition and design of a framework facilitating collaborative decision making; (2) the deployment and evaluation of more natural and intuitive interaction and visualization techniques in order to support multiple decision makers in virtual reality environments; and (3) the integration of novel techniques into a single proof-of-concept system.
Decision making processes are time-consuming, typically involving several iterations of different options before a generally acceptable solution is obtained. Although, collaboration is an often-applied method, the execution of collaborative sessions is often inefficient, does not involve all participants, and decisions are often finalized with- out the agreement of all participants. An increasing number of computer-supported cooperative work systems (CSCW) facilitate collaborative work by providing shared viewpoints and tools to solve joint tasks. However, most of these software systems are designed from a feature-oriented perspective, rather than a human-centered perspective and without the consideration of user groups with diverse experience and joint goals instead of joint tasks. The aim of this dissertation is to bring insights to the following research question: How can computer-supported cooperative work be designed to be more efficient? This question opens up more specific questions like: How can collaborative work be designed to be more efficient? How can all participants be involved in the collaboration process? And how can interaction interfaces that support collaborative work be designed to be more efficient? As such, this dissertation makes contributions in:
1. Definition and design of a framework facilitating decision making and collaborative work. Based on examinations of collaborative work and decision making processes requirements of a collaboration framework are assorted and formulated. Following, an approach to define and rate software/frameworks is introduced. This approach is used to translate the assorted requirements into a software’s architecture design. Next, an approach to evaluate alternatives based on Multi Criteria Decision Making (MCDM) and Multi Attribute Utility Theory (MAUT) is presented. Two case studies demonstrate the usability of this approach for (1) benchmarking between systems and evaluates the value of the desired collaboration framework, and (2) ranking a set of alternatives resulting from a decision-making process incorporating the points of view of multiple stake- holders.
2. Deployment and evaluation of natural and intuitive interaction and visualization techniques in order to support multiple diverse decision makers. A user taxonomy of industrial corporations serves to create a petri network of users in order to identify dependencies and information flows between each other. An explicit characterization and design of task models was developed to define interfaces and further components of the collaboration framework. In order to involve and support user groups with diverse experiences, smart de- vices and virtual reality are used within the presented collaboration framework. Natural and intuitive interaction techniques as well as advanced visualizations of user centered views of the collaboratively processed data are developed in order to support and increase the efficiency of decision making processes. The smartwatch as one of the latest technologies of smart devices, offers new possibilities of interaction techniques. A multi-modal interaction interface is provided, realized with smartwatch and smartphone in full immersive environments, including touch-input, in-air gestures, and speech.
3. Integration of novel techniques into a single proof-of-concept system. Finally, all findings and designed components are combined into the new collaboration framework called IN2CO, for distributed or co-located participants to efficiently collaborate using diverse mobile devices. In a prototypical implementation, all described components are integrated and evaluated. Examples where next-generation network-enabled collaborative environments, connected by visual and mobile interaction devices, can have significant impact are: design and simulation of automobiles and aircrafts; urban planning and simulation of urban infrastructure; or the design of complex and large buildings, including efficiency- and cost-optimized manufacturing buildings as task in factory planning. To demonstrate the functionality and usability of the framework, case studies referring to factory planning are demonstrated. Considering that factory planning is a process that involves the interaction of multiple aspects as well as the participation of experts from different domains (i.e., mechanical engineering, electrical engineering, computer engineering, ergonomics, material science, and even more), this application is suitable to demonstrate the utilization and usability of the collaboration framework. The various software modules and the integrated system resulting from the research will all be subjected to evaluations. Thus, collaborative decision making for co-located and distributed participants is enhanced by the use of natural and intuitive multi-modal interaction interfaces and techniques.

Due to the steadily growing flood of data, the appropriate use of visualizations for efficient data analysis is as important today as it has never been before. In many application domains, the data flood is based on processes that can be represented by node-link diagrams. Within such a diagram, nodes may represent intermediate results (or products), system states (or snapshots), milestones or real (and possibly georeferenced) objects, while links (edges) can embody transition conditions, transformation processes or real physical connections. Inspired by the engineering sciences application domain and the research project “SinOptiKom: Cross-sectoral optimization of transformation processes in municipal infrastructures in rural areas”, a platform for the analysis of transformation processes has been researched and developed based on a geographic information system (GIS). Caused by the increased amount of available and interesting data, a particular challenge is the simultaneous visualization of several visible attributes within one single diagram instead of using multiple ones. Therefore, two approaches have been developed, which utilize the available space between nodes in a diagram to display additional information.
Motivated by the necessity of appropriate result communication with various stakeholders, a concept for a universal, dashboard-based analysis platform has been developed. This web-based approach is conceptually capable of displaying data from various data sources and has been supplemented by collaboration possibilities such as sharing, annotating and presenting features.
In order to demonstrate the applicability and usability of newly developed applications, visualizations or user interfaces, extensive evaluations with human users are often inevitable. To reduce the complexity and the effort for conducting an evaluation, the browser-based evaluation framework (BREF) has been designed and implemented. Through its universal and flexible character, virtually any visualization or interaction running in the browser can be evaluated with BREF without any additional application (except for a modern web browser) on the target device. BREF has already proved itself in a wide range of application areas during the development and has since grown into a comprehensive evaluation tool.

Crowd condition monitoring concerns the crowd safety and concerns business performance metrics. The research problem to be solved is a crowd condition estimation approach to enable and support the supervision of mass events by first-responders and marketing experts, but is also targeted towards supporting social scientists, journalists, historians, public relations experts, community leaders, and political researchers. Real-time insights of the crowd condition is desired for quick reactions and historic crowd conditions measurements are desired for profound post-event crowd condition analysis.
This thesis aims to provide a systematic understanding of different approaches for crowd condition estimation by relying on 2.4 GHz signals and its variation in crowds of people, proposes and categorizes possible sensing approaches, applies supervised machine learning algorithms, and demonstrates experimental evaluation results. I categorize four sensing approaches. Firstly, stationary sensors which are sensing crowd centric signals sources. Secondly, stationary sensors which are sensing other stationary signals sources (either opportunistic or special purpose signal sources). Thirdly, a few volunteers within the crowd equipped with sensors which are sensing other surrounding crowd centric device signals (either individually, in a single group or collaboratively) within a small region. Fourthly, a small subset of participants within the crowd equipped with sensors and roaming throughout a whole city to sense wireless crowd centric signals.
I present and evaluate an approach with meshed stationary sensors which were sensing crowd centric devices. This was demonstrated and empirically evaluated within an industrial project during three of the world-wide largest automotive exhibitions. With over 30 meshed stationary sensors in an optimized setup across 6400m2 I achieved a mean absolute error of the crowd density of just 0.0115
people per square meter which equals to an average of below 6% mean relative error from the ground truth. I validate the contextual crowd condition anomaly detection method during the visit of chancellor Mrs. Merkel and during a large press conference during the exhibition. I present the approach of opportunistically sensing stationary based wireless signal variations and validate this during the Hannover CeBIT exhibition with 80 opportunistic sources with a crowd condition estimation relative error of below 12% relying only on surrounding signals in influenced by humans. Pursuing this approach I present an approach with dedicated signal sources and sensors to estimate the condition of shared office environments. I demonstrate methods being viable to even detect low density static crowds, such as people sitting at their desks, and evaluate this on an eight person office scenario. I present the approach of mobile crowd density estimation by a group of sensors detecting other crowd centric devices in the proximity with a classification accuracy of the crowd density of 66 % (improvement of over 22% over a individual sensor) during the crowded Oktoberfest event. I propose a collaborative mobile sensing approach which makes the system more robust against variations that may result from the background of the people rather than the crowd condition with differential features taking information about the link structure between actively scanning devices, the ratio between values observed by different devices, ratio of discovered crowd devices over time, team-wise diversity of discovered devices, number of semi- continuous device visibility periods, and device visibility durations into account. I validate the approach on multiple experiments including the Kaiserslautern European soccer championship public viewing event and evaluated the collaborative mobile sensing approach with a crowd condition estimation accuracy of 77 % while outperforming previous methods by 21%. I present the feasibility of deploying the wireless crowd condition sensing approach to a citywide scale during an event in Zurich with 971 actively sensing participants and outperformed the reference method by 24% in average.

Embedded reactive systems underpin various safety-critical applications wherein they interact with other systems and the environment with limited or even no human supervision. Therefore, design errors that violate essential system specifications can lead to severe unacceptable damages. For this reason, formal verification of such systems in their physical environment is of high interest. Synchronous programs are typically used to represent embedded reactive systems while hybrid systems serve to model discrete reactive system in a continuous environment. As such, both synchronous programs and hybrid systems play important roles in the model-based design of embedded reactive systems. This thesis develops induction-based techniques for safety property verification of synchronous and hybrid programs. The imperative synchronous language Quartz and its hybrid systems’ extensions are used to sustain the findings.
Deductive techniques for software verification typically use Hoare calculus. In this context, Verification Condition Generation (VCG) is used to apply Hoare calculus rules to a program whose statements are annotated with pre- and postconditions so that the validity of an obtained Verification Condition (VC) implies correctness of a given proof goal. Due to the abstraction of macro steps, Hoare calculus cannot directly generate VCs of synchronous programs unless it handles additional label variables or goto statements. As a first contribution, Floyd’s induction-based approach is employed to generate VCs for synchronous and hybrid programs. Five VCG methods are introduced that use inductive assertions to decompose the overall proof goal. Given the right assertions, the procedure can automatically generate a set of VCs that can then be checked by SMT solvers or automated theorem provers. The methods are proved sound and relatively complete, provided that the underlying assertion language is expressive enough. They can be applied to any program with a state-based semantics.
Property Directed Reachability (PDR) is an efficient method for synchronous hardware circuit verification based on induction rather than fixpoint computation. Crucial steps of the PDR method consist of deciding about the reachability of Counterexamples to Induction (CTIs) and generalizing them to clauses that cover as many unreachable states as possible. The thesis demonstrates that PDR becomes more efficient for imperative synchronous programs when using the distinction between the control- and dataflow. Before calling the PDR method, it is possible to derive additional program control-flow information that can be added to the transition relation such that less CTIs will be generated. Two methods to compute additional control-flow information are presented that differ in how precisely they approximate the reachable control-flow states and, consequently, in their required runtime. After calling the PDR method, the CTI identification work is reduced to its control-flow part and to checking whether the obtained control-flow states are unreachable in the corresponding extended finite state machine of the program. If so, all states of the transition system that refer to the same program locations can be excluded, which significantly increases the performance of PDR.

Computational simulations run on large supercomputers balance their outputs with the need of the scientist and the capability of the machine. Persistent storage is typically expensive and slow, its peformance grows at a slower rate than the processing power of the machine. This forces scientists to be practical about the size and frequency of the simulation outputs that can be later analyzed to understand the simulation states. Flexibility in the trade-offs of flexibilty and accessibility of the outputs of the simulations are critical the success of scientists using the supercomputers to understand their science. In situ transformations of the simulation state to be persistently stored is the focus of this dissertation.
The extreme size and parallelism of simulations can cause challenges for visualization and data analysis. This is coupled with the need to accept pre partitioned data into the analysis algorithms, which is not always well oriented toward existing software infrastructures. The work in this dissertation is focused on improving current work flows and software to accept data as it is, and efficiently produce smaller, more information rich data, for persistent storage that is easily consumed by end-user scientists. I attack this problem from both a theoretical and practical basis, by managing completely raw data to quantities of information dense visualizations and study methods for managing both the creation and persistence of data products from large scale simulations.

The proliferation of sensors in everyday devices – especially in smartphones – has led to crowd sensing becoming an important technique in many urban applications ranging from noise pollution mapping or road condition monitoring to tracking the spreading of diseases. However, in order to establish integrated crowd sensing environments on a large scale, some open issues need to be tackled first. On a high level, this thesis concentrates on dealing with two of those key issues: (1) efficiently collecting and processing large amounts of sensor data from smartphones in a scalable manner and (2) extracting abstract data models from those collected data sets thereby enabling the development of complex smart city services based on the extracted knowledge.
Going more into detail, the first main contribution of this thesis is the development of methods and architectures to facilitate simple and efficient deployments, scalability and adaptability of crowd sensing applications in a broad range of scenarios while at the same time enabling the integration of incentivation mechanisms for the participating general public. During an evaluation within a complex, large-scale environment it is shown that real-world deployments of the proposed data recording architecture are in fact feasible. The second major contribution of this thesis is the development of a novel methodology for using the recorded data to extract abstract data models which are representing the inherent core characteristics of the source data correctly. Finally – and in order to bring together the results of the thesis – it is demonstrated how the proposed architecture and the modeling method can be used to implement a complex smart city service by employing a data driven development approach.

We study high dimensional integration in the quantum model of computation. We develop quantum algorithms for integration of functions from Sobolev classes \(W^r_p [0,1]^d\) and analyze their convergence rates. We also prove lower bounds which show that the proposed algorithms are, in many cases, optimal within the setting of quantum computing. This extends recent results of Novak on integration of functions from Hölder classes.

In this paper, the complexity of full solution of Fredholm integral equations of the second kind with data from the Sobolev class \(W^r_2\) is studied. The exact order of information complexity is derived. The lower bound is proved using a Gelfand number technique. The upper bound is shown by providing a concrete algorithm of optimal order, based on a specific hyperbolic cross approximation of the kernel function. Numerical experiments are included, comparing the optimal algorithm with the standard Galerkin method.

We survey old and new results about optimal algorithms for summation of finite sequences and for integration of functions from Hölder or Sobolev spaces. First we discuss optimal deterministic and randornized algorithms. Then we add a new aspect, which has not been covered before on conferences
about (quasi-) Monte Carlo methods: quantum computation. We give a short introduction into this setting and present recent results of the authors on optimal quantum algorithms for summation and integration. We discuss comparisons between the three settings. The most interesting case for Monte
Carlo and quantum integration is that of moderate smoothness \(k\) and large dimension \(d\) which, in fact, occurs in a number of important applied problems. In that case the deterministic exponent is negligible, so the \(n^{-1/2}\) Monte Carlo and the \(n^{-1}\) quantum speedup essentially constitute the entire convergence rate.

Free Form Volumes
(1994)

Software development organizations measure their real-world processes, products, and resources to achieve the goal of improving their practices. Accurate and useful measurement relies on explicit models of the real-world processes, products, and resources. These explicit models assist with planning measurement, interpreting data, and assisting developers with their work. However, little work has been done on the joint use of measurem(int and process technologies. We hypothesize that it is possible to integrate measurement and process technologies in a way that supports automation of measurement-based feedback. Automated support for measurementbased feedback means that software developers and maintainers are provided with on-line, detailed information about their work. This type of automated support is expected to help software professionals gain intellectual control over their software projects. The dissertation offers three major contributions. First, an integrated measurement and
process modeling framework was constructed. This framework establishes the necessary foundation for integrating measurement and process technologies in a way that will permit automation. Second, a process-centered software engineering environment was developed to support measurement-based feedback. This system provides personnel with information about the tasks expected of them based on an integrated set of measurement and process views. Third, a set of assumptions and requirements about that system were examined in a controlled experiment. The experiment compared the use of different levels of automation to evaluate the acceptance and effectiveness of measurement-based feedback.

Wireless LANs operating within unlicensed frequency bands require random access schemes such as CSMA/ CA, so that wireless networks from different administrative domains (for example wireless community networks) may co-exist without central coordination, even when they happen to operate on the same radio channel. Yet, it is evident that this Jack of coordination leads to an inevitable loss in efficiency due to contention on the MAC layer. The interesting question is, which efficiency may be gained by adding coordination to existing, unrelated wireless networks, for example by self-organization. In this paper, we present a methodology based on a mathematical programming formulation to determine the
parameters (assignment of stations to access points, signal strengths and channel assignment of both access points and stations) for a scenario of co-existing CSMA/ CA-based wireless networks, such that the contention between these networks is minimized. We demonstrate how it is possible to solve this discrete, non-linear optimization problem exactly for small
problems. For larger scenarios, we present a genetic algorithm specifically tuned for finding near-optimal solutions, and compare its results to theoretical lower bounds. Overall, we provide a benchmark on the minimum contention problem for coordination mechanisms in CSMA/CA-based wireless networks.

This report presents a generalization of tensor-product B-spline surfaces. The new scheme permits knots whose endpoints lie in the interior of the domain rectangle of a surface. This allows local refinement of the knot structure for approximation purposes as well as modeling surfaces with local tangent or curvature discontinuities. The surfaces are represented in terms of B-spline basis functions, ensuring affine invariance, local control, the convex hull property, and evaluation by de Boor's algorithm. A dimension formula for a class of generalized tensor-product spline spaces is developed.

We present a methodology to augment system safety step-by-step and illustrate the approach by the definition of reusable solutions for the detection of fail-silent nodes - a watchdog and a heartbeat. These solutions can be added to real-time system designs, to protect against certain types of system failures. We use SDL as a system design language for the development of distributed systems, including real-time systems.

Interactive graphics has been limited to simple direct illumination that commonly results in an artificial appearance. A more realistic appearance by simulating global illumination effects has been too costly to compute at interactive rates. In this paper we describe a new Monte Carlo-based global illumination algorithm. It achieves performance of up to 10 frames per second while arbitrary changes to the scene may be applied interactively. The performance is obtained through the effective use of a fast, distributed ray-tracing engine as well as a new interleaved sampling technique for parallel Monte Carlo simulation. A new filtering step in combination with correlated sampling avoids the disturbing noise artifacts common to Monte Carlo methods.

Estelle is an internationally standardized formal description technique (FDT) designed for the specification of distributed systems, in particular communication protocols. An Estelle specification describes a system of communicating components (module instances). The specified system is closed in a topological sense, i.e. it has no ability to interact with some environment. Because of this restriction, open systems can only be specified together with and incorporated with an environment. To overcome this restriction, we introduce a compatible extension of Estelle, called "Open Estelle". It allows the specification of (topologically) open systems, i.e. systems that have the ability to communicate with any environment through a well-defined external interface. We define aformal syntax and a formal semantics for Open Estelle, both based on and extending the syntax and semantics of Estelle. The extension is compatible syntactically and semantically, i.e. Estelle is a subset of Open Estelle. In particular, the formal semantics of Open Estelle reduces to the Estelle semantics in the special case of a closed system. Furthermore, we present a tool for the textual integration of open systems into environments specified in Open Estelle, and a compiler for the automatic generation of implementations directly from Open Estelle specifications.

This paper describes some new algorithms for the accurate calculation of surface properties. In the first part an arithmetic on Bézier surfaces is introduced. Formulas are given, which determine the Bézier points and weights of the resulting surface from the points and weights of the operand surfaces. An application of the arithmetic operations to the surface interrogation methods are described in the second part. It turns out, that the quality analysis can be reduced to a few numerical stable operations. Finally the advantages and disadvantages of this method are discussed.

Partitioned chain grammars
(1979)

This paper introduces a new class of grammars, the partitioned chain grammars, for which efficient parsers can be automatically generated. Besides being efficiently parsable these grammars possess a number of other properties, which make them very attractive for the use in parser-generators. They for instance form a large grammarclass and describe all deterministic context-free languages. Main advantage of the partitioned chain grammars however is, that given a language it is usually easier to describe it by a partitioned chain grammar than to construct a grammar of some other type commonly used in parser-generators for it.

The intuitionistic calculus mj for sequents, in which no other logical symbols than those for implication and universal quantification occur, is introduced and analysed. It allows a simple backward application, called mj-reduction here, for searching for derivation trees. Terms needed in mj-reduction can be found with the unification algorithm. mj-Reduction with unification can be seen as a natural extension of SLD-resolution. mj-Derivability of the sequents considered here coincides with derivability in Johansson's minimal intuitionistic calculus LHM in [6]. Intuitionistic derivability of formulae with negation and classical derivability of formulae with all usual logical symbols can be expressed with mj-derivability and hence be verified by mj-reduction. mj-Derivations can be easily translated into LJ-derivations without
"Schnitt", or into NJ-derivations in a slightly sharpened form of Prawitz' normal form. In the first three sections, the systematic use of mj-reduction for proving in predicate logic is emphasized. Although the fourth section, the last and largest, is exclusively devoted to the mathematical analysis of the calculus mj, the first three sections may be of interest to a wider readership, including readers looking for applications of symbolic logic. Unfortunately, the mathematical analysis of the calculus mj, as the study of Gentzen's calculi, demands a large amount of technical work that obscures the natural unfolding of the argumentation. To alleviate this, definitions and theorems are completely embedded in the text to provide a fluent and balanced mathematical discourse: new concepts are indicated with bold-face, proofs of assertions are outlined, or omitted when it is assumed that the reader can provide them.

A natural extension of SLD-resolution is introduced as a goal directed proof procedure
for the full first order implicational fragment of intuitionistic logic. Its intuitionistic semantic fits a procedural interpretation of logic programming. By allowing arbitrary nested implications it can be used for implementing modularity in logic programs. With adequate negation axioms it gives an alternative to negation as failure and leads to a proof procedure for full first order predicate logic.

The use of non-volatile semiconductor memory within an extended storage hierarchy promises significant performance improvements for transaction processing. Although page-addressable semiconductor memories like extended memory, solid-state disks and disk caches are commercially available since several years, no detailed investigation of their use for transaction processing has been performed so far. We present a comprehensive simulation study that compares the performance of these storage types and of different usage forms. The following usage forms are considered: allocation of entire log and database files in non-volatile semiconductor memory, using a so-called write buffer to perform disk writes asynchronously, and caching of database pages at intermediate storage levels (in addition to main memory caching). Our simulations are conducted with both synthetically generated workloads and traces from real-life database applications. In particular, simulation results will be presented for the debit-credit workload frequently used in transaction processing benchmarks. As expected, the greatest performance improvements (but at the highest cost) can be achieved by storing log and database files completely in non-volatile semiconductor memory. For update-intensive
workloads, a limited amount of non-volatile memory used as a write buffer also proved to be very effective. To reduce the number of disk reads; caching of database pages in addition to main memory is best supported by an extended memory buffer. In this respect, disk caches are found to be less effective as they are designed for one-level caching. Different storage costs suggest that it may be cost-effective to use two or even three of the intermediate storage types together. The performance improvements obtainable by the use of non-volatile semiconductor memory is also found to reduce the need for sophisticated DBMS buffer management in order to achieve high transaction processing performance.

The rapid development of any field of knowledge brings with it unavoidable fragmentation and proliferation of new disciplines. The development of computer science is no exception. Software engineering (SE) and human-computer interaction (HCI) are both relatively new disciplines of computer science. Furthermore, as both names suggest, they each have strong connections with other subjects. SE is concerned with methods and tools for general software development based on engineering principles. This discipline has its roots not only in computer science but also in a number of traditional engineering disciplines. HCI is concerned with methods and tools for the development of human-computer interfaces, assessing the usability of computer systems and with broader issues about how people interact with computers. It is based on theories about how humans process information and interact with computers, other objects and other people in the organizational and social contexts in
which computers are used. HCI draws on knowledge and skills from psychology, anthropology and sociology in addition to computer science. Both disciplines need ways of measuring how well their products and development processes fulfil their intended requirements. Traditionally SE has been concerned with 'how software is constructed' and HCI with 'how people use software'. Given the
different histories of the disciplines and their different objectives, it is not surprising that they take different approaches to measurement. Thus, each has its own distinct 'measurement culture.' In this paper we analyse the differences and the commonalties of the two cultures by examining the measurement approaches used by each. We then argue the need for a common measurement taxonomy and framework, which is derived from our analyses of the two disciplines. Next we demonstrate the usefulness of the taxonomy and framework via specific example studies drawn from our own work and that of others and show that, in fact, the two disciplines have many important similarities as well as differences and that there is some evidence to suggest that they are growing closer. Finally, we discuss the role of the taxonomy as a framework to support: reuse, planning future studies, guiding practice and facilitating communication between the two disciplines.

Optimization of Projection Methods for Solving ill-posed Problems. In this paper we propose a modification of the projection scheme for solving ill-posed problems. We show that this modification allows to obtain the best possible order of accuracy of Tikhonov Regularization using an amount of information which is far less than for the standard projection technique.

In this paper we show how Metropolis Light Transport can be extended both in the underlying theoretical framework and the algorithmic implementation to incorporate volumetric scattering.
We present a generalization of the path integral formulation thathandles anisotropic scattering in non-homogeneous media. Based on this framework we introduce a new mutation strategy that is
specifically designed for participating media. It exploits the locality of light propagation by perturbing certain interaction points within the medium. To efficiently sample inhomogeneous media a new ray marching method has been developed that avoids aliasing artefacts and is significantly faster than stratified sampling. The resulting global illumination algorithm provides a physically correct simulation of light transport in the presence of participating media that includes effects such as volume caustics and multiple volume scattering. It is not restricted to certain classes of geometry and scattering models and has minimal memory requirements. Furthermore, it is unbiased and robust, in the sense that it produces satisfactory results for a wide range of input scenes and lighting situations within acceptable time bounds. In particular, we found that it is weil suited for complex scenes with many light sources.

For most applications the used transport service providers are predetermined during the development of the application. This makes it difficult to consider the application communication requirements and to exploit specific features of the network technology. Specialized protocols that are more efficient and offer a qualitative improved service are typically not supported by most applications because they are not commonly available. In this paper we propose a concept for the realization of protocol independent transport services. Only a transport service is predetermined during the development of the application and an appropriate transport service provider is dynamically selected at run time. This enables to exploit specialized protocols if possible, but standard protocols could still be used if necessary. The main focus of this paper is how a transport service could provide a new transport service provider transparently to existing applications. A prototype is presented that maps TCP/IP based applications to an ATM specific transport service provider which offers a reliable and unreliable transport service like TCP/IP.

The Analytic Blossom
(2001)

Blossoming is a powerful tool for studying and computing with Bézier and B-spline curves and surfaces - that is, for the investigation and analysis of polynomials and piecewise polynomials in geometric modeling. In this paper, we define a notion of the blossom for Poisson curves. Poisson curves are to analytic functions what Bézier curves are to polynomials - a representation adapted to geometric design. As in the polynomial setting, the blossom provides a simple, powerful, elegant and computationally meaningful way to analyze Poisson curves. Here, we
define the analytic blossom and interpret all the known algorithms for Poisson curves - subdivision, trimming, evaluation of the function and its derivatives, and conversion between the Taylor and the Poisson basis - in terms of this analytic blossom.

In this work we propose a set of term-rewriting techniques for modelling object-oriented computation. Based on symbolic variants of explicit substitutions calculi, we show how to deal with imperative statements like assignment and sequence in specifications in a pure declarative style. Under our model, computation with classes and objects becomes simply normal form calculation, exactly as it is the case in term-rewriting based languages (for instance the functional languages). We believe this kind of unification between functions and
objects is important because it provides plausible alternatives for using the term-rewriting theory as an engine for supporting the formal and mechanical reasoning about object-oriented specifications.

Visualization of large data sets, especially on small machines, requires advanced techniques in image processing and image generation. Our hybrid raytracer is capable of rendering volumetric and geometric data simultaneously, without loss of accuracy due to data conversion. Compound data sets, consisting of several types of data, are called "hybrid data sets". There is only one rendering pipeline to obtain loss-less and efficient visualization of hybrid data. Algorithms apply to both types of data. Optical material properties are stored in the same data base for both volumetric and geometric objects, and anti-aliasing methods appeal to both data types. Stereoscopic display routines have been added to obtain true three-dimensional visualization on various media, and animation features allow generation of recordable 3-D sequences.

Trimming of surfaces and volumes, curve and surface modeling via Bézier's idea of destortion, segmentation, reparametrization, geometric continuity are examples of applications of functional composition. This paper shows how to
compose polynomial and rational tensor product Bézier representations. The problem of composing Bezier splines and B-spline representations will also be addressed in this paper.

The composition of Bézier curves and tensor product Bézier surfaces, polynomial as well as rational, is applied to exactly and explicitely represent trim curves of tensor product Bézier surfaces. Trimming curves are assumed to be defined as Bézier curves in surface parameter domain. A Bézier spline approximation of lower polynomial degree is built up as weil which is based on the exact trim curve representation in coordinate space.

We propose a framework for the synthesis of temporal logic programs which are formulated in a simple temporal logic programming language from both positive and negative examples. First we will prove that results from the theory of first order inductive logic programming carry over to the domain of temporal logic. After this we will show how programs formulated in the presented language can be generalized or specialized in order to satisfy the specification induced by the sets of examples.

We propose several algorithms for efficient Testing of logical Implication in the case of ground objects. Because the problem of Testing a set of propositional formulas for (un)satisfiability is \(NP\)-complete there's strong evidence that there exist examples for which every algorithm which solves the problem of testing for (un)satisfiability has a runtime that is exponential in the length of the input. So will have our algorithms. We will therefore point out classes of logic programs for which our algorithms have a lower runtime. At the end of this paper we will give an outline of an algorithm for theory refinement which is based on the algorithms described above.

Approximating illumination by point light sources, as done in many professional applications, suffers from the problem of the weak singularity: Numerical exceptions caused by the division by the squared distance between the point light source and the point to be illuminated must be avoided. Multiple importance sampling overcomes these problems by combining multiple sampling techniques by weights. Such a set of weights is called a heuristic. So far the estimators resulting from a heuristic only have been analyzed for variance. Since the cost of sampling is not at all constant for different sampling techniques, it is possible to find more efficient heuristics, even though they may hove higher variance. Based on our new stratification heuristic, we present a robust and unbiased global illumination algorithm. By numerical examples, we show that it is more efficient than previous heuristics. The algorithm is as simple as a path tracer, but elegantly avoids the problem of the weak singularity.

We present an algorithm for determining quadrature rules for computing the direct illumination of predominantly diffuse objects by high dynamic range images. The new method precisely reproduces fine shadow detail, is much more efficient as compared to Monte Carlo integration, and does not require any manual intervention.

As opposed to Monte Carlo integration the quasi-Monte Carlo method does not allow for an (consistent) error estimate from the samples used for the integral approximation. In addition the deterministic error bound of quasi-Monte Carlo integration is not accessible in the setting of computer graphics, since usually the integrands are of unbounded variation. The structure of the high dimensional functionals to be computed for photorealistic image synthesis implies the application of the randomized quasi-Monte Carlo method. Thus we can exploit low discrepancy sampling and at the same time we can estimate the variance. The resulting technique is much more efficient than previous bidirectional path tracing algorithms.

Temporal Data Management and Incremental Data Recomputation with Wide-column Stores and MapReduce
(2017)

In recent years, ”Big Data” has become an important topic in academia
and industry. To handle the challenges and problems caused by Big Data,
new types of data storage systems called ”NoSQL stores” (means ”Not-only-
SQL”) have emerged.
”Wide-column stores” are one kind of NoSQL stores. Compared to relational database systems, wide-column stores introduce a new data model,
new IRUD (Insert, Retrieve, Update and Delete) semantics with support for
schema-flexibility, single-row transactions and data expiration constraints.
Moreover, each column stores multiple data versions with associated time-
stamps. Well-known examples are Google’s ”Big-table” and its open sourced
counterpart ”HBase”. Recently, such systems are increasingly used in business intelligence and data warehouse environments to provide decision support, controlling and revision capabilities.
Besides managing the current values, data warehouses also require management and processing of historical, time-related data. Data warehouses
frequently employ techniques for processing changes in various data sources
and incrementally applying such changes to the warehouse to keep it up-to-
date. Although both incremental data warehousing maintenance and temporal data management have been the subject of intensive research in the
relational database and finally commercial database products have picked up
the ability for temporal data processing and management, such capabilities
have not been explored systematically for today’s wide-column stores.
This thesis helps to address the shortcomings mentioned above. It care-
fully analyzes the properties of wide-column stores and the applicability
of mechanisms for temporal data management and incremental data ware-
house maintenance known from relational databases, extends well-known approaches and develops new capabilities for providing equivalent support in
wide-column stores.

Virtual Reality (VR) is to be seen as the superset of simulation and animation. Visualization is done by rendering. The fundamental model of VR accounts for all phenomenons to be modelled with help of a computer. Examples range from simple dragging actions with a mouse device to the complex simulation of physically based animation.

The calculation of form factors is an important problem in computing the global illumination in the radiosity setting. Closed form solutions often are only available for objects without obstruction and are very hard to calculate. Using Monte Carlo integration and ray tracing provides a fast and elegant tool for the estimation of the form factors. In this paper we show, that using deterministic low discrepancy sample points is superior to random sampling, resulting in an acceleration of more than half an order of magnitude.

Many rendering problems can only be solved using Monte Carlo integration. The noise and variance inherent with the statistical method efficiently can be reduced by stratification. So far only uncorrelated stratification methods were used that in addition depend on the dimension of the integration domain. Based on rank-1-lattices we present a new stratification technique that removes this dependency on dimension, is much more efficient by correlation, is trivial to implement, and robust to use. The superiority of the new scheme is demonstrated for standard rendering algorithms.

The simulation of random fields has many applications in computer graphics such as e.g. ocean wave or turbulent wind field modeling. We present a new and strikingly simple synthesis algorithm for random fields on rank-1 lattices that requires only one Fourier transform independent of the dimension of the support of the random field. The underlying mathematical principle of discrete Fourier transforms on rank-1 lattices breaks the curse of dimension of the standard tensor product Fourier transform, i.e. the number of function values does not exponentially depend on the dimension, but can be chosen linearly.

Quasi-Monte Carlo Radiosity
(1996)

The problem of global illumination in computer graphics is described by a second kind Fredholm integral equation. Due to the complexity of this equation, Monte Carlo methods provide an interesting tool for approximating
solutions to this transport equation. For the case of the radiosity equation, we present the deterministic method of quasi-rondom walks. This method very efficiently uses low discrepancy sequences for integrating the Neumann series and consistently outperforms stochastic techniques. The method of quasi-random walks also is applicable to transport problems in settings other
than computer graphics.

Monte Carlo & Beyond
(2002)

Interleaved Sampling
(2001)

The sampling of functions is one of the most fundamental tasks in computer graphics, and occurs in a variety of different forms. The known sampling methods can roughly be grouped in two categories. Sampling on regular grids is simple and efficient, and the algorithms are often easy to built into graphics hardware. On the down side, regular sampling is prone to aliasing artifacts that are expensive to overcome. Monte Carlo methods, on the other hand,
mask the aliasing artifacts by noise. However due to the lack of coherence, these methods are more expensive and not weil suited for hardware implementations. In this paper, we introduce a novel sampling scheme where samples from several regular grids are a combined into a single irregular sampling pattern. The relative positions of the regular grids are themselves determined by Monte Carlo methods. This generalization obtained by interleaving yields,significantly improved quality compared to traditional approaches while at the same time preserving much of the advantageous coherency of regular sampling. We demonstrate the quality of the new sampling scheme with a number of applications ranging from supersampling over motion blur simulation to volume rendering. Due to the coherence in the interleaved samples, the method is optimally suited for implementations in graphics hardware.

Instant Radiosity
(1997)

We present a fundamental procedure for instant rendering from the radiance equation. Operating directly on the textured scene description, the very efficient and simple algorithm produces photorealistic images without any kernel or solution discretization of the underlying integral equation. Rendering rates of a few seconds are obtained by exploiting graphics hardware, the deterministic
technique of the quasi-random walk for the solution of the global illumination problem, and the new method of jittered low discrepancy sampling.

A fundamental variance reduction technique for Monte Carlo integration in the framework of integro-approximation problems is
presented. Using the method of dependent tests a successive hierarchical function approximation algorithm is developed, which
captures discontinuities and exploits smoothness in the target function. The general mathematical scheme and its highly efficient
implementation are illustrated for image generation by ray tracing,
yielding new and much faster image synthesis algorithms.

The photon map provides a powerful tool for approximating the irradiance in global illumination computations independent from geometry. By presenting new importance sampling techniques, we dramatically improve the memory footprint of the photon map, simplify the caustic generation, and allow for a much faster sampling of direct illumination in complicated models as they arise in a production environment.

The main problem in computer graphics is to solve the global illumination problem,
which is given by a Fredholm integral equation of the second kind, called the radiance equation (REQ). In order to achieve realistic images, a very complex kernel
of the integral equation, modelling all physical effects of light, must be considered. Due to this complexity Monte Carlo methods seem to be an appropriate approach to solve the REQ approximately. We show that replacing Monte Carlo by quasi-Monte Carlo in some steps of the algorithm results in a faster convergence.

Shadow-Mapping
(1993)

Most radiosity techniques store radiosities in certain sample points, typically the vertices of polyhedral scenes. As diffuse radiosities are view independent they can be used for an interactive 'walk-through'. This paper presents an algorithm for storing radiosities independent of the representation of the object. A distributed rendering system, which uses this shadow-mapping technique is described. The basic thermophysical definitions, needed to derive a sum formula for a form factor calculation of polygons, are explained.

In this paper an analytic hidden surface removal algorithm is presented which uses a combination
of 2D and 3D BSP trees without involving point sampling or scan conversion. Errors like aliasing
which result from sampling do not occur while using this technique. An application of this
algorithm is outlined which computes the energy locally reflected from a surface having an
arbitrary BRDF. A simplification for diffuse reflectors is described, which has been implemented
to compute analytic form factors from diffuse light sources to differential receivers as they are needed for shading and radiosity algorithms.

The problem of constructing a geometric model of an existing object from a set of boundary points arises in many areas of industry. In this paper we present a new solution to this problem which is an extension of Boissonnat's method [2]. Our approach uses the well known Delaunay triangulation of the data points as an intermediate step. Starting with this structure, we eliminate tetrahedra until we get an appropriate approximation of the desired shape. The method proposed in this paper is capable of reconstructing objects with arbitrary genus and can cope with different point densities in different regions of the object. The
problems which arise during the elimination process, i.e. which tetrahedra can be eliminated, which order has to be used to control the process and finally, how to stop the elimination procedure at the right time, are discussed in detail. Several examples are given to show the validity of the method.

We study the global solution of Fredholm integral equations of the second kind by the help of Monte Carlo methods. Global solution means that we seek to approximate the full solution function. This is opposed to the usual applications of Monte Carlo, were one only wants to approximate a functional of the solution. In recent years several researchers developed Monte Carlo methods also for the global problem. In this paper we present a new Monte Carlo algorithm for the global solution of integral equations. We use multiwavelet expansions to approximate the solution. We study the behaviour of variance on increasing levels, and based on this, develop a new variance reduction technique. For classes of smooth kernels and right hand sides we determine the convergence rate of this algorithm and show that it is higher
than those of previously developed algorithms for the global problem. Moreover, an information-based complexity analysis shows that our algorithm is optimal among all stochastic algorithms of the same computational
cost and that no deterministic algorithm of the same cost can reach its convergence rate.

A new variance reduction technique for the Monte Carlo solution of integral
equations is introduced. It is based on separation of the main part. A neighboring equation with exactly known solution is constructed by the help of a deterministic Galerkin scheme. The variance of the method is analyzed, and an application to the radiosity equation of computer graphics, together with numerical test results is given.

Approximation properties of the underlying estimator are used to improve the efficiency of the method of dependent tests. A multilevel approximation procedure is developed such that in each level the number of samples is balanced with the level-dependent variance, resulting in a considerable reduction of the overall computational cost. The new technique is applied to the Monte Carlo estimation of integrals depending on a parameter.

The radiance equation, which describes the global illumination problem in computer graphics, is a high dimensional integral equation. Estimates of the solution are usually computed on the basis of Monte Carlo methods. In this paper we propose and investigate quasi-Monte Carlo methods, which means that we replace (pseudo-) random samples by low discrepancy sequences, yielding deterministic algorithms. We carry out a comparative numerical study between Monte Carlo and quasi-Monte Carlo methods. Our results show that quasi-Monte Carlo converges considerably faster.

Monte Carlo integration is often used for antialiasing in rendering processes.
Due to low sampling rates only expected error estimates can be stated, and the variance can be high. In this article quasi-Monte Carlo methods are presented, achieving a guaranteed upper error bound and a convergence rate essentially as fast as usual Monte Carlo.