Kaiserslautern - Fachbereich Informatik
Distributed message-passing systems have become ubiquitous and essential for our daily lives. Hence, designing and implementing them correctly is of utmost importance. At the same time, however, this is very challenging. In fact, it is well-known that verifying such systems is algorithmically undecidable in general due to the interplay of asynchronous communication (messages are buffered) and concurrency. When designing communication in a system, it is natural to start with a global protocol specification of the desired communication behaviour. In such a top-down approach, the implementability problem asks, given such a global protocol, if the specified behaviour can be implemented in a distributed setting without additional synchronisation. This problem has been studied from two perspectives in the literature. On the one hand, there are Multiparty Session Types (MSTs) from process algebra, with global types to specify protocols. Key to the MST approach is a so-called projection operator, which takes a global type and tries to project it onto every participant: if successful, the local specifications are safe to use. This approach is efficient but brittle. On the other hand, High-level Message Sequence Charts (HMSCs) study the implementability problem from an automata-theoretic perspective. They impose very few restrictions on protocol specifications, making the implementability problem for HMSCs undecidable in general. The work in this thesis is the first to formally build a bridge between the world of MSTs and HMSCs. To start, we present a generalised projection operator for sender-driven choice. This allows a sender to send to different receivers when branching, which is crucial for handling common communication patterns from distributed computing. Despite this first step, we also show that the classical MST projection approach is inherently incomplete. We present the first formal encoding from global types to HMSCs. With this, we prove decidability of the implementability problem for global types with sender-driven choice. Furthermore, we develop the first direct and complete projection operator for global types with sender-driven choice, using automata-theoretic techniques, and show its effectiveness with a prototype implementation. We are the first to provide an upper bound for the implementability problem for global types with sender-driven (or directed) choice and show it to be in PSPACE. We also provide a session type system that uses the results from our projection operator. Last, we introduce protocol state machines (PSMs), an automata-based protocol specification formalism that subsumes both global types from MSTs and HMSCs with regard to expressivity. We use transformations on PSMs to show that many of the syntactic restrictions of global types do not restrict protocol expressivity. We prove that the implementability problem for PSMs with mixed choice, which requires no dedicated sender for a branch but only that all labels be distinct, is undecidable in general. With our results on expressivity, this answers an open question: the implementability problem for mixed-choice global types is undecidable in general.
Machine learning algorithms are widely applied to create powerful prediction models. With increasingly complex models, humans' ability to understand the decision function (that maps from a high-dimensional input space) is quickly exceeded. To explain a model's decisions, black-box methods have been proposed that provide either non-linear maps of the global topology of the decision boundary, or samples that allow approximating it locally. The former loses information about distances in input space, while the latter only provides statements about given samples, but lacks a focus on the underlying model for precise ‘What-If'-reasoning. In this paper, we integrate both approaches and propose an interactive exploration method using local linear maps of the decision space. We create the maps on high-dimensional hyperplanes—2D-slices of the high-dimensional parameter space—based on statistical and personal feature mutability and guided by feature importance. We complement the proposed workflow with established model inspection techniques to provide orientation and guidance. We demonstrate our approach on real-world datasets and illustrate that it allows identification of instance-based decision boundary structures and can answer multi-dimensional ‘What-If'-questions, thereby identifying counterfactual scenarios visually.
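As a rough illustration of probing a model on such a 2D slice of the input space, the following minimal sketch assumes a scikit-learn classifier and two hypothetical "mutable" features; it is not the authors' tool, only the basic idea of evaluating the decision function on a slice:

```python
# Hedged sketch: evaluate a model's decision function on a 2D slice of the
# high-dimensional input space around a chosen instance. Dataset, model and
# the two varied features are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

x0 = X[0].copy()          # instance whose neighbourhood we explore
i, j = 0, 1               # two hypothetical "mutable" features spanning the slice
f_i = np.linspace(X[:, i].min(), X[:, i].max(), 50)
f_j = np.linspace(X[:, j].min(), X[:, j].max(), 50)

# Evaluate the model on the 2D slice; all other features stay fixed at x0.
grid = np.tile(x0, (50 * 50, 1))
ii, jj = np.meshgrid(f_i, f_j)
grid[:, i] = ii.ravel()
grid[:, j] = jj.ravel()
probs = model.predict_proba(grid)[:, 1].reshape(50, 50)
# 'probs' can be rendered as a heat map; its 0.5 iso-line approximates the
# decision boundary restricted to this slice, supporting "What-If" questions.
```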
Edit distances between merge trees of scalar fields have many applications in scientific visualization, such as ensemble analysis, feature tracking or symmetry detection. In this paper, we propose branch mappings, a novel approach to the construction of edit mappings for merge trees. Classic edit mappings match nodes or edges of two trees onto each other, and therefore have to either rely on branch decompositions of both trees or have to use auxiliary node properties to determine a matching. In contrast, branch mappings employ branch properties instead of node similarity information, and are independent of predetermined branch decompositions. Especially for topological features, which are typically based on branch properties, this allows a more intuitive distance measure which is also less susceptible to instabilities from small-scale perturbations. For trees with 𝒪(n) nodes, we describe an 𝒪(n⁴) algorithm for computing optimal branch mappings, which is faster than the only other branch decomposition-independent method in the literature by more than a linear factor. Furthermore, we compare the results of our method on synthetic and real-world examples to demonstrate its practicality and utility.
The development of algorithmic differentiation (AD) tools focuses mostly on handling floating point types in the target language. Taping optimizations in these tools mostly focus on specific operations like matrix-vector products. Aggregated types like std::complex are usually handled by specifying the AD type as a template argument. This approach provides exact results, but prevents the use of expression templates. If AD tools are extended and specialized such that aggregated types can be added to the expression framework, this reduces memory utilization and improves timings for applications where aggregated types such as complex numbers or matrix-vector operations are used. Such an integration requires a reformulation of the stored data per expression and a rework of the tape evaluation process. We demonstrate the overheads on a synthetic benchmark and show the improvement achieved when aggregated types are handled properly by the expression framework of the AD tool.
To increase the situational awareness of the crane operator, the aim of this thesis is to develop vision-based deep learning object detection from the crane's load view, using adaptive perception in the construction area. Conventional worker detection methods are based on simple shape or color features of the workers' appearance. However, these methods can fail to recognize workers who do not wear protective gear. Hand-crafting an image representation of the object as seen from the top view is difficult; we therefore employ deep learning methods to learn those features automatically.
To yield optimal results, deep learning methods require massive amounts of data. Due to the data deficit, especially in the construction domain, we developed a photorealistic virtual world to generate data in addition to the samples collected from a real construction site. The simulation platform benefits not only from diverse data types but also from concurrent research developments, which speeds up the pipeline at low cost. Our research findings indicate that the combination of synthetic and real training samples improves the state-of-the-art detector. In line with previous studies on bridging the gap between synthetic and real data, preprocessed synthetic images yield results that are better than raw synthetic data by approximately 10%.
Finding the right deep learning model for load-view detection is challenging. Investigating our training data makes it evident that the majority of bounding boxes are very small and appear against complex backgrounds. In addition, we prioritized speed over accuracy based on construction safety criteria. RetinaNet was ultimately chosen from the three primary object detection models considered. Nevertheless, data-driven detection algorithms can fail to handle scale variation, especially when the apparent object size varies over an extremely wide range.
The adaptive zoom feature can enhance the quality of worker detection. To avoid further data gathering and extensive retraining, the proposed automatic zoom method for the load-view crane camera supports the deep learning algorithm, specifically for the problem of high scale variation. A finite state machine is employed as the control strategy to adapt the zoom level, coping not only with inconsistent detections but also with abrupt camera movement during lifting operations. Consequently, the detector is able to detect small objects through smooth, continuous zoom control without additional training. The adaptive zoom control not only enhances the performance of top-view object detection but also reduces the crane operator's interaction with the camera system, lowering the risk of fatalities during load lifting operations.
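A minimal sketch of such a finite-state zoom controller follows; the states, thresholds and step sizes are illustrative assumptions, not the thesis' actual control parameters:

```python
# Hedged sketch of a finite-state-machine zoom controller in the spirit of the
# adaptive zoom described above. Thresholds and step sizes are made up.
from enum import Enum, auto

class ZoomState(Enum):
    HOLD = auto()       # detections are stable, keep current zoom
    ZOOM_IN = auto()    # objects too small / detections inconsistent
    ZOOM_OUT = auto()   # abrupt camera movement, widen the view

def next_state(detection_confidence, mean_box_area, camera_motion):
    """Transition function: derive the next state from simple observations."""
    if camera_motion > 0.5:                              # hypothetical motion threshold
        return ZoomState.ZOOM_OUT
    if detection_confidence < 0.4 or mean_box_area < 0.01:
        return ZoomState.ZOOM_IN
    return ZoomState.HOLD

def apply_zoom(zoom_level, state, step=0.1, z_min=1.0, z_max=4.0):
    """Smooth, continuous zoom update bounded by the camera's zoom range."""
    if state is ZoomState.ZOOM_IN:
        zoom_level = min(z_max, zoom_level + step)
    elif state is ZoomState.ZOOM_OUT:
        zoom_level = max(z_min, zoom_level - step)
    return zoom_level

# One step of the control loop: low confidence and tiny boxes -> zoom in slightly.
state = next_state(detection_confidence=0.3, mean_box_area=0.005, camera_motion=0.1)
print(apply_zoom(1.0, state))   # 1.1
```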
Turbulence models, which are a means to fix the closure problem arising from Reynolds averaging of the Navier-Stokes equations, are economical stop-gaps but suffer from accuracy issues. Modifying turbulence models by incorporating corrections in their functional form is one approach to improving their accuracy. We estimate correction functionals for the Spalart-Allmaras turbulence model based on a PDE-constrained inverse problem, emphasizing the issue of regularization.
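In such field-inversion settings, the correction field is commonly estimated from a regularized, PDE-constrained least-squares problem of roughly the following form (an illustrative formulation; the paper's exact functional may differ):

```latex
\min_{\beta}\; J(\beta) \;=\; \sum_{k}\bigl(d_k(\beta) - d_k^{\mathrm{obs}}\bigr)^2 \;+\; \lambda\,\lVert \beta - 1 \rVert^2
\quad \text{subject to} \quad R\bigl(u(\beta),\beta\bigr) = 0,
```

where R denotes the RANS equations closed with the Spalart-Allmaras model carrying a multiplicative correction field β, d_k(β) are model predictions of the observed quantities d_k^obs, and λ controls the strength of the regularization.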
Algorithmic decision-making (ADM) systems have come to support, pre-empt or substitute for human decisions in manifold areas, with potentially significant impacts on individuals' lives. Achieving transparency and accountability has been formulated as a general goal regarding the use of these systems. However, concrete applications differ widely in the degree of risk and the accountability problems they entail for data subjects. The present paper addresses this variation and presents a framework that differentiates regulatory requirements for a range of ADM system uses. It draws on agency theory to conceptualize accountability challenges from the point of view of data subjects, with the purpose of systematizing instruments for safeguarding algorithmic accountability. The paper furthermore shows how such instruments can be matched to applications of ADM based on a risk matrix. The resulting comprehensive framework can guide the evaluation of ADM systems and the choice of suitable regulatory provisions.
We describe a novel technique for the simultaneous visualization of multiple scalar fields, e.g. representing the members of an ensemble, based on their contour trees. Using tree alignments, a graph-theoretic concept similar to edit distance mappings, we identify commonalities across multiple contour trees and leverage these to obtain a layout that can represent all trees simultaneously in an easy-to-interpret, minimally-cluttered manner. We describe a heuristic algorithm to compute tree alignments for a given similarity metric, and give an algorithm to compute a joint layout of the resulting aligned contour trees. We apply our approach to the visualization of scalar field ensembles, discuss basic visualization and interaction possibilities, and demonstrate results on several analytic and real-world examples.
Editorial
(2021)
In recent years, there has been a growing need for accurate 3D scene reconstruction. Recent developments in the automotive industry have led to the increased use of ADAS where 3D reconstruction techniques are used, for example, as part of a collision detection system. For such applications, scene geometry reconstruction is usually performed in the form of depth estimation, where distances to scene objects are obtained.
In general, depth estimation systems can be divided into active and passive. Both systems have their advantages and disadvantages, but passive systems are usually cheaper to produce and easier to assemble and integrate than active systems. Passive systems can be stereo- or multiple-view based. Up to a certain limit, increasing the number of views in multi-view systems usually results in improved depth estimation accuracy.
One potential problem for ensuring the reliability of multi-view systems is the need to accurately estimate the orientation of their optical sensors. One way to ensure sensor placement for multi-view systems is to rigidly fix the sensors at the manufacturing stage. Unlike arbitrary sensor placement, using a simplified and known sensor placement geometry further simplifies depth estimation.
Here we arrive at the concept of the light field, which parameterizes all visible light passing through all viewpoints by their intersections with angular and spatial planes. Applied to computer vision, this gives us a 2D set of 2D images, where the physical distances between the images are fixed and proportional to each other.
Existing light field depth estimation methods provide good accuracy, which is suitable for industrial applications. However, the main problems of these methods are their running time and resource requirements. Most of the algorithms presented in the literature are tuned for accuracy, can only be run on high-performance machines, and often require a significant amount of time to process the data and obtain results.
Real-world applications often have running time requirements. Often, there is also a power-consumption limitation. In this dissertation, we investigate the problem of building a depth estimation system with a light field camera that satisfies the operating time and power consumption constraints without significant loss of estimation accuracy.
First, an algorithm for calibrating light field cameras is proposed, together with an algorithm for automatic calibration refinement that works on arbitrary captured scenes. An algorithm for classical geometric depth estimation using light field cameras is proposed. Ways to optimize the algorithm for real-time use without significant loss of accuracy are presented. Finally, it is shown how the presented depth estimation methods can be extended using modern deep learning paradigms under the two previously mentioned constraints.
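For orientation, a minimal sketch of the classical geometric relation underlying such multi-view depth estimation follows; the camera parameters are made-up placeholders, not values from the thesis:

```python
# Hedged sketch: depth from disparity for rectified views. In a light-field
# camera the sub-aperture views lie on a regular 2D grid, so the baseline
# between any two views is a known multiple of the grid spacing.
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for rectified views with baseline B and focal length f."""
    disparity_px = np.asarray(disparity_px, dtype=float)
    with np.errstate(divide="ignore"):
        return np.where(disparity_px > 0, focal_px * baseline_m / disparity_px, np.inf)

grid_spacing_m = 0.001                 # hypothetical 1 mm between adjacent views
focal_px = 800.0                       # hypothetical focal length in pixels
d = np.array([1.5, 3.0, 6.0])          # matched disparities (pixels) to an adjacent view
print(depth_from_disparity(d, focal_px, grid_spacing_m))  # farther objects -> smaller disparity
```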
In order to discuss the kinds of reasoning a visualization supports and the conclusions that can be drawn within the analysis context, a theoretical framework is needed that enables a formal treatment of the reasoning process. Such a model needs to encompass three stages of the visualization pipeline: encoding, decoding and interpretation. The encoding details how data are transformed into a visualization and what can be seen in the visualization. The decoding explains how humans construct graphical contexts inside the depicted visualization and how they interpret them, assigning meaning to displayed structures according to a formal reasoning strategy. In the presented model, we adapt and combine theories for the different steps into a unified formal framework such that the analysis process is modelled as an assignment of meaning to displayed structures according to a formal reasoning strategy. Additionally, we propose the ConceptGraph, a combined graph-based representation of the finite-state transducers resulting from the three stages, that can be used to formalize and understand the reasoning process. We apply the new model to several visualization types and investigate reasoning strategies for various tasks.
Knowledge workers face an ever increasing flood of information in their daily work. They live in a “multi-tasking craziness”, involving activities like creating, finding, processing, assessing or organizing information while constantly switching from one context to another, each being associated with different tasks, documents, mails, etc. Hence, their personal information sphere consisting of file, mail and bookmark folders as well as their content, calendar entries, etc. is cluttered with information that has become irrelevant. Finding important information thus gets harder and much of previously gained knowledge is practically lost.
This thesis explores new ways of solving this problem by investigating the potential of self-(re)organizing and especially forgetting-enabled personal knowledge assistants in the given scenario. It utilizes so-called Managed Forgetting, which is an escalating set of measures to overcome the binary keep-or-delete paradigm, ranging from temporal hiding, to condensation, to adaptive reorganization, synchronization, archiving and deletion. Managed Forgetting is combined with two other major ideas: First, it uses the Semantic Desktop as an ecosystem, which brings Semantic Web and thus knowledge graph technologies to a user’s desktop, making it possible to capture and represent major parts of a user’s personal mental model in a machine-understandable way and exploit it in many different applications. Second, the system uses explicated context information – so-called Context Spaces: context is seen as an explicit interaction element users can work with (i.e. a “tangible” object similar to a folder) and in (immersion). The thesis is structured according to the basic interaction cycle with such a system, ranging from evidence collection to information extraction and context elicitation, followed by information value assessment and the actual support measures consisting of self-(re)organization decisions (back-end) and user interface updates (front-end). The system’s data foundation are personal or group knowledge graphs as well as native data. This work makes contributions to all of these aspects, whereas several of them have been investigated and developed in interdisciplinary research with cognitive scientists. On a more general level, searching and trust in such highly autonomous assistants have also been investigated.
In summary, a self-(re)organizing and especially forgetting-enabled support system for information management and knowledge work has been realized. Its different features vary in maturity: the most mature ones are already in practical use (also in industry), while the latest are just well elaborated (position papers) or rough ideas. Different evaluation strategies have been applied ranging from mere data-driven experiments to various user studies. Some of them were rather short-term with controlled laboratory conditions, others less controlled but spanning several months. Different benefits of working with such a system could be quantified, e.g. cognitive offloading effects and reduced task switching/resumption time. Other benefits were gathered qualitatively, e.g. tidiness of the information sphere and its better alignment with the user’s mental model. The presented approach has been shown to hold a lot of potential. In some aspects, however, only first steps have been taken towards tapping it, e.g. several support measures can be further refined and automation further increased.
Editorial
(2020)
Editorial
(2020)
Several governmental organizations all over the world aim for algorithmic accountability of artificial intelligence systems. However, there are few specific proposals on how exactly to achieve it. This article provides an extensive overview of possible transparency and inspectability mechanisms that contribute to accountability for the technical components of an algorithmic decision-making system. Following the different phases of a generic software development process, we identify and discuss several such mechanisms. For each of them, we give an estimate of the cost with respect to time and money that might be associated with that measure.
Editorial
(2020)
This thesis focuses on the operation of reliability-constrained routes in wireless ad-hoc networks. A complete communication protocol that is capable of guaranteeing a statistical minimum reliability level would have to support several functionalities: first, routes that are capable of supporting the specified Quality of Service requirement have to be discovered. During operation of discovered routes, the current Quality of Service level has to be monitored continuously. Whenever significant deviations are detected and the required level of Quality of Service is endangered, route maintenance has to ensure continuous operation. All four functionalities, route discovery, route operation, route maintenance and collection and distribution of network status information, will be addressed in this thesis.
In the first part of the thesis, we propose a new approach for Quality-of-Service routing in wireless ad-hoc networks called rmin-routing, with the provision of statistical minimum route reliability as main route selection criterion. To achieve specified minimum route reliabilities, we improve the reliability of individual links by well-directed retransmissions, to be applied during the operation of routes. To select among a set of candidate routes, we define and apply route quality criteria concerning network load.
High-quality information about the network status is essential for the discovery and operation of routes and clusters in wireless ad-hoc networks. This requires permanent observation and assessment of nodes, links, and link metrics, and the exchange of gathered status data. In the second part of the thesis, we present cTEx, a configurable topology explorer for wireless ad-hoc networks that efficiently detects and exchanges high-quality network status information during operation.
In the third part, we propose a decentralized algorithm for the discovery and operation of reliability-constrained routes in wireless ad-hoc networks called dRmin-routing. The algorithm uses locally available network status information about network topology and link properties that is collected proactively in order to discover a preliminary route candidate. This is followed by a distributed, reactive search along this preselected route to remove imprecisions of the locally recorded network status before making a final route selection. During route operation, dRmin-routing monitors routes and performs different kinds of route repair actions to maintain route reliability in order to overcome varying link reliabilities.
Modeling and Simulation of Internet of Things Infrastructures for Cyber-Physical Energy Systems
(2024)
This dissertation presents a novel approach to the model-based development and simulation-based validation of Internet of Things (IoT) infrastructures within the context of Cyber-Physical Energy Systems (CPES). CPES represents an evolution in energy management, seamlessly blending physical and cyber components for efficient, secure, and dependable energy distribution. However, the intricate interplay of these components demands innovative modeling and simulation strategies.
The work begins by establishing a robust foundation, exploring essential background elements such as requirements engineering, model-based systems engineering, digitalization approaches, and the intricacies of IoT platforms. It introduces the novel concept of homomorphic encryption, a critical enabler for securing IoT data within CPES.
In the exploration of the state of the art, the dissertation delves into the multifaceted landscape of IoT simulation, emphasizing the significance of versatility, community support, scalability, and synchronization.
The core contribution emerges in the chapter on simulating IoT networks. It introduces a sophisticated framework that encompasses hardware-in-the-loop, software-in-the-loop, and human-in-the-loop simulation. This innovative framework extends the boundaries of conventional simulation, enabling holistic evaluations of IoT systems.
A practical case study on smart energy usage showcases the application of the framework. Detailed SysML models, including requirements, package diagrams, block definition diagrams, internal block diagrams, state machine diagrams, and activity diagrams, are meticulously examined. The performance evaluation encompasses diverse aspects, from hardware and software validation to human interaction.
In conclusion, this dissertation represents a significant leap forward in the integration of IoT infrastructures within CPES. Its contributions extend from a comprehensive understanding of foundational elements to the practical implementation of a holistic simulation framework. This work not only addresses the current challenges but also outlines a path for future research, shaping the landscape of IoT integration within the dynamic realm of CPES. It offers invaluable insights for researchers, engineers, and stakeholders working towards resilient, secure, and energy-efficient infrastructures.
In many applications, visual analytics (VA) has developed into a standard tool to ease data access and knowledge generation. VA describes a holistic cycle transforming data into hypothesis and visualization to generate insights that enhance the data. Unfortunately, many data sources used in the VA process are affected by uncertainty. In addition, the VA cycle itself can introduce uncertainty to the knowledge generation process but does not provide a mechanism to handle these sources of uncertainty. In this manuscript, we aim to provide an extended VA cycle that is capable of handling uncertainty by quantification, propagation, and visualization, defined as uncertainty-aware visual analytics (UAVA). Here, a recap of uncertainty definition and description is used as a starting point to insert novel components in the visual analytics cycle. These components assist in capturing uncertainty throughout the VA cycle. Further, different data types, hypothesis generation approaches, and uncertainty-aware visualization approaches are discussed that fit in the defined UAVA cycle. In addition, application scenarios that can be handled by such a cycle, examples, and a list of open challenges in the area of UAVA are provided.
Dataflow process networks (DPNs) are intrinsically data-driven, i.e., node actions are not synchronized among each other and may fire whenever sufficient input operands have arrived at a node. While the general model of computation (MoC) of DPNs does not impose further restrictions, many different subclasses of DPNs representing different dataflow MoCs have been considered over time. These classes mainly differ in the kinds of behaviors of the processes. A DPN may be heterogeneous in that different processes in the network belong to different classes of DPNs. A heterogeneous DPN can therefore be effectively used to model and to implement different components of a system with different kinds of processes and, therefore, different dataflow MoCs. This paper presents a model-based design based on different dataflow MoCs including their heterogeneous combinations. In particular, it covers the automatic software synthesis of systems from DPN models. The main objective is to validate, evaluate and compare the artifacts exhibited by different dataflow MoCs at the implementation level of systems under the supervision of a common design tool. Moreover, this work also offers an efficient synthesis method that targets and exploits heterogeneity in DPNs by generating implementations based on the kinds of behaviors of the processes. The proposed synthesis method provides a tool chain including different specialized code generators for specific dataflow MoCs, and a runtime system that finally maps models using a combination of different dataflow MoCs on cross-vendor target hardware.
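A minimal sketch of the firing rule underlying DPNs follows; the actor names and token rates are illustrative and not part of the presented tool chain:

```python
# Hedged sketch: a dataflow actor may fire whenever enough tokens are available
# on its input channels. Different process classes (e.g. static-rate vs. dynamic
# actors) differ mainly in how these consumption rates are determined.
from collections import deque

class Actor:
    def __init__(self, name, rates, action):
        self.name = name
        self.rates = rates          # tokens consumed per input channel per firing
        self.action = action        # function applied to the consumed tokens
        self.inputs = {ch: deque() for ch in rates}

    def can_fire(self):
        return all(len(self.inputs[ch]) >= n for ch, n in self.rates.items())

    def fire(self):
        consumed = {ch: [self.inputs[ch].popleft() for _ in range(n)]
                    for ch, n in self.rates.items()}
        return self.action(consumed)

# A synchronous-dataflow-style adder: consumes one token per input, fires when both arrived.
add = Actor("add", {"a": 1, "b": 1}, lambda t: t["a"][0] + t["b"][0])
add.inputs["a"].append(2)
print(add.can_fire())        # False: input "b" is still empty
add.inputs["b"].append(3)
if add.can_fire():
    print(add.fire())        # 5
```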
This paper presents an iterative finite element (FE)-based method to calculate the gravity-free shape of nonrigid parts from an optical measurement performed on a non-over-constrained fixture. Measuring these kinds of parts in a stress-free state is almost impossible because deflections caused by their weight occur. To solve this problem, a simulation model of the measurement is created using available methods of reverse engineering. Then, an iterative algorithm calculates the gravity-free shape. The approach does not require a CAD model of the measured part, implying the whole part can be fully scanned. The application of this method mainly addresses thin, unstable sheet metal parts, like those commonly used in the automotive or aerospace industry. To show the performance of the proposed method, validations with simulation and experimental data are presented. The shown results meet the predefined quality goal to predict shapes within a tolerance of ±0.05 mm measured in the surface normal direction.
We propose a universal method for the evaluation of generalized standard materials that greatly simplifies the material law implementation process. By means of automatic differentiation and a numerical integration scheme, AutoMat reduces the implementation effort to two potential functions. By moving AutoMat to the GPU, we close the performance gap to conventional evaluation routines and demonstrate in detail that the expression level reverse mode of automatic differentiation as well as its extension to second order derivatives can be applied inside CUDA kernels. We underline the effectiveness and the applicability of AutoMat by integrating it into the FFT-based homogenization scheme of Moulinec and Suquet and discuss the benefits of using AutoMat with respect to runtime and solution accuracy for an elasto-viscoplastic example.
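As a rough illustration of deriving a material response from a potential via automatic differentiation, the following pure-Python forward-mode sketch uses an assumed toy potential; AutoMat itself uses expression-level reverse mode (and its second-order extension) inside CUDA kernels:

```python
# Hedged sketch: forward-mode AD with dual numbers, differentiating a simple
# 1D elastic potential. This only illustrates the "two potential functions"
# idea, not AutoMat's actual GPU implementation.
from dataclasses import dataclass

@dataclass
class Dual:
    val: float   # function value
    dot: float   # derivative w.r.t. the chosen input

    def __add__(self, o): return Dual(self.val + o.val, self.dot + o.dot)
    def __mul__(self, o): return Dual(self.val * o.val,
                                      self.val * o.dot + self.dot * o.val)

def potential(eps):
    """Toy elastic energy psi(eps) = 0.5 * E * eps^2 with an assumed E = 210e3."""
    E = Dual(210e3, 0.0)
    half = Dual(0.5, 0.0)
    return half * E * eps * eps

eps = Dual(0.002, 1.0)          # seed the derivative w.r.t. the strain
psi = potential(eps)
print(psi.val)                  # stored energy: 0.42
print(psi.dot)                  # stress d(psi)/d(eps) = E * eps = 420.0
```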
When considering complex systems, identifying the most important actors is often of relevance. When the system is modeled as a network, centrality measures are used which assign each node a value due to its position in the network. It is often disregarded that they implicitly assume a network process flowing through the network, and also make assumptions about how the network process flows through the network. A node is then central with respect to this network process (Borgatti in Soc Netw 27(1):55–71, 2005, https://doi.org/10.1016/j.socnet.2004.11.008). It has been shown that real-world processes often do not fulfill these assumptions (Bockholt and Zweig, in Complex networks and their applications VIII, Springer, Cham, 2019, https://doi.org/10.1007/978-3-030-36683-4_7). In this work, we systematically investigate the impact of the measures' assumptions by using four datasets of real-world processes. In order to do so, we introduce several variants of the betweenness and closeness centrality which, for each assumption, use either the assumed process model or the behavior of the real-world process. The results are twofold: on the one hand, for all measure variants and almost all datasets, we find that, in general, the standard centrality measures are quite robust against deviations in their process model. On the other hand, we observe a large variation of ranking positions of single nodes, even among the nodes ranked high by the standard measures. This has implications for the interpretability of results of those centrality measures. Since a mismatch between the behaviour of the real network process and the assumed process model affects even the highly-ranked nodes, resulting rankings need to be interpreted with care.
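For reference, the two standard measures whose process assumptions are varied can be computed as follows (a minimal sketch using networkx on a stand-in graph; the paper's measure variants and real-world datasets are not reproduced here):

```python
# Hedged sketch: baseline betweenness and closeness rankings of the kind the
# study compares against process-aware variants.
import networkx as nx

G = nx.karate_club_graph()                      # stand-in network, not a dataset from the paper
betweenness = nx.betweenness_centrality(G)      # assumes transfer along shortest paths
closeness = nx.closeness_centrality(G)          # assumes reachability along shortest paths

top_b = sorted(betweenness, key=betweenness.get, reverse=True)[:5]
top_c = sorted(closeness, key=closeness.get, reverse=True)[:5]
print("top-5 by betweenness:", top_b)
print("top-5 by closeness:  ", top_c)
# Comparing such rankings against variants that plug in the observed process
# behaviour is the node-level comparison performed in the study.
```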
Since the h-index was invented, it has been the most frequently discussed bibliometric value and one of the most commonly used metrics to quantify a researcher's scientific output. The more popular it becomes to use the metric as an indication of the quality of a job applicant or an employee, the more important it is to ensure its correctness. Many platforms offer the h-index of a scientist as a service, sometimes without the explicit knowledge of the respective person. In this article we show that looking up the h-index of a researcher on the five most commonly used platforms, namely AMiner, Google Scholar, ResearchGate, Scopus and Web of Science, results in a variance that is in many cases as large as the average value. This is due to the varying definitions of what a scientific article is, the underlying data basis, and the varying quality of entity recognition. To perform our study, we crawled the h-index of the world's top researchers according to two different rankings, of all Nobel Prize laureates except those in Literature and Peace, and of the teaching staff of the computer science department of the TU Kaiserslautern, Germany, for whom we additionally computed the h-index manually. We show that the individual h-indices differ to an alarming extent between the platforms. We observed that researchers with an extraordinarily high h-index and researchers with an index appropriate to the scientific career path and the respective scientific field are affected alike by these problems.
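For reference, the underlying definition used when recomputing the h-index manually (a minimal sketch; the platform-specific publication and citation lists fed into it are exactly what differs in practice):

```python
# Hedged sketch: a researcher has index h if h of their papers have at least
# h citations each.
def h_index(citations):
    """Compute the h-index from a list of per-paper citation counts."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))   # 4
print(h_index([25, 8, 5, 3, 3]))   # 3
# Platforms differ in which publications and citations enter this list,
# which is precisely why the reported values diverge.
```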
Weak memory consistency models capture the outcomes of concurrent programs that appear in practice and yet cannot be explained by thread interleavings. Such outcomes pose two major challenges to formal methods. First, establishing that a memory model satisfies its intended properties (e.g., supports a certain compilation scheme) is extremely error-prone: most proposed language models were initially broken and required multiple iterations to achieve soundness. Second, weak memory models make verification of concurrent programs much harder, as a result of which there are no scalable verification techniques beyond a few that target very simple models.
This thesis presents solutions to both of these problems. First, it shows that the relevant metatheory of weak memory models can be effectively decided (sparing years of manual proof efforts), and presents Kater, a tool that can answer metatheoretic queries in a matter of seconds. Second, it presents GenMC, the first (and only) scalable stateless model checker that is parametric in the choice of the memory model, often improving the prior state of the art by orders of magnitude.
In one-dimensional (1-D) Ultrasound (US) measurements, signals are acquired that form the basis of more sophisticated two-dimensional (2-D) or three-dimensional (3-D) US imaging. These 1-D signals contain a lot of raw information about the US wave propagation and its interaction with the medium that is only partly processed during image generation. While image representations are easy for humans to interpret, the analysis of US wave signals is hard to perform without applying algorithms to extract the desired features.
This work investigates reliable and fast 1-D US signal classifications to distinguish between different stages or states in biomedical US scenarios, and shows how the new field of Machine Learning (ML) on raw US wave data provides advantages and enables different applications. To achieve good results, the input signals are treated as time series, which requires the deployment of comparatively complex Time Series Classification (TSC) algorithms.
The literature shows that previous research efforts have mostly tackled the classification and segmentation of US Brightness mode (B-Mode) images, while largely neglecting approaches to classify 1-D signals.
This research contributes by developing, deploying and evaluating classification approaches for three distinct biomedical US classification tasks, and finds that the respective signal classifications for different scenarios are possible with varying degrees of accuracy. It entails the comparison of several combinations of data types (e.g. temporal, spectral and statistical features or raw signals), ML models and pre-processing steps to provide a strong foundation for robust, binary classifications of 1-D US signals in scenarios based on low-cost wearable, mobile and stationary devices. This research addresses scientific questions not answered before by providing detailed descriptions of beneficial domain-specific knowledge (DSK), the achieved accuracies, and the times needed for training and evaluation of the examined ML models.
The resulting ML pipelines include solutions based on data acquired from custom experimental setups or clinical trials. Possible real-world applications might include muscle contraction trackers, muscle fatigue detectors, epiphyseal radius bone closure detectors, or devices providing information about advanced liver disease stages.
Automated machine-assisted classifications requiring as little DSK as possible from the end user enable application scenarios ranging from fitness or rehabilitation trackers as consumer devices to solutions providing diagnostic support without requiring extensive knowledge from professional medical practitioners, for example decision support systems for bone age assessment in clinical use or liver health assessment systems for gastroenterologists.
This work shows that reliable, robust and fast classifications based on 1-D US signals are possible with high degrees of accuracy depending on the examined scenario, with achieved F1-scores ranging from ≈70% to ≈87%. These results demonstrate that real-life applications for recreational purposes are already possible and that critical applications for clinical use are highly likely to be achieved once the presented approaches are further optimized in the future.
The field of 3D reconstruction is one of the most important areas in computer vision. It is not only of theoretical importance, but is also increasingly used in practical applications, be it in reverse engineering, quality control or robotics. In practical applications, where high-precision reconstructions are required for a large variety of different objects, structured light reconstruction is often the method of choice. It allows accurate and dense point correspondences to be obtained over the entire scene, regardless of object texture or features. Techniques that project phase-shifted sinusoidals are widely used because, based on the harmonic addition theorem, they theoretically allow surface encoding in full camera resolution invariant to the object's coloring.
In this thesis, a fully-automatic reconstruction pipeline based on the sinusoidal structured light technique is presented. From the projection of the fringe patterns for encoding the object's surface, the robust matching of the point correspondences with sub-pixel accuracy, the auto-calibration of the setup including the active device, up to the fully-automatic alignment of the partial reconstructions, all steps are described and examined in detail. Along the way, improvements are achieved in the area of matching, obtaining highly accurate and topologically consistent correspondences with sub-pixel precision between all the devices used. Furthermore, the auto-calibration from point correspondences, based on the epipolar geometry of the structured light system, is improved. Weaknesses of previous methods in the extraction of focal lengths from the fundamental matrices are discovered and addressed. The partial point clouds, reconstructed from the auto-calibrated devices, are finally pre-aligned using a neural network approach based on light-resistant optical flow estimation and subsequently refined using a global approach.
The weaknesses of the structured light method itself are also addressed and partially fixed during the course of this work. Since it is an active reconstruction method, certain surface properties can affect the quality of the reconstruction. It is shown how these problems can be eliminated, or at least reduced, using an iterative approach that combines fringe patterns with an inverse texture. Another weakness of the method is its time-consuming acquisition procedure. Typically, a large number of horizontal and vertical fringe patterns are projected onto the scene to achieve high-precision encoding despite the limited dynamic range and resolution of the projector. Therefore, a method is presented which allows the horizontal and vertical patterns to be combined for a simultaneous two-dimensional surface encoding.
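A minimal sketch of the N-step phase-shift decoding at the core of such sinusoidal encoding follows (a textbook formulation, not the thesis' full pipeline with auto-calibration and two-dimensional encoding):

```python
# Hedged sketch: recover the wrapped phase per pixel from N phase-shifted
# fringe images. Thanks to the harmonic addition theorem, the result is
# independent of per-pixel albedo and ambient light.
import numpy as np

def decode_phase(images):
    """images: array (N, H, W); the k-th image has an added shift of 2*pi*k/N.
    Returns the wrapped phase in [-pi, pi)."""
    imgs = np.asarray(images, dtype=float)
    n = imgs.shape[0]
    shifts = 2.0 * np.pi * np.arange(n) / n
    num = np.tensordot(np.sin(shifts), imgs, axes=(0, 0))
    den = np.tensordot(np.cos(shifts), imgs, axes=(0, 0))
    return -np.arctan2(num, den)

# Synthetic check: a known phase ramp is recovered regardless of offset/amplitude.
true_phase = np.linspace(-np.pi, np.pi, 64, endpoint=False).reshape(1, 64)
patterns = [5 + 3 * np.cos(true_phase + 2 * np.pi * k / 4) for k in range(4)]
rec = decode_phase(np.stack(patterns))
print(np.allclose(rec, true_phase, atol=1e-6))   # True
```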
During our daily lives, we are confronted with vast amounts of data, the processing of which can dramatically influence our lives, both positively and negatively. The enormous amount of data (images, texts, tables, and time series), its variety and its possible applications are not always obvious. Due to advancements in the internet of things (IoT), there exist billions of sensors that produce time series which can be found everywhere, whether in medicine, the financial sector or the agricultural economy. This incredible amount of time series data has many hidden features which are useful for industry as well as for daily use; for example, improving cancer prediction can save human lives. Recently, several deep learning methods have been proposed for analyzing this time series data. However, due to their black box nature, their applicability is limited in critical sectors like medicine, finance, and communication. In addition, it is now mandatory under the Artificial Intelligence (AI) Act and the General Data Protection Regulation (GDPR) to protect sensitive data and provide explanations in safety-critical domains. To enable the use of DNNs in a broader domain scope, this thesis presents TimeFrame, a framework for privacy-preserving and interpretable time series analysis. TimeFrame consists of four main components, namely post-hoc interpretability, intrinsic interpretability, direct privacy, and indirect privacy. Interpretability is indispensable to avoid damaging people or the infrastructure. In the past years, development mostly focused on image data, which prevented the full potential of DNNs in time series processing from being exploited. To overcome this limitation, TimeFrame introduces five novel post-hoc (Time to Focus, TSViz, TimeREISE, TSInsight, Data Lens) and two novel intrinsic (PatchX, P2ExNet) interpretability components. TimeFrame addresses multiple perspectives such as attribution, compression, visualization, influence, prototyping, and hierarchical splitting. Compared to existing methods, the components show better explanations, robustness, and scalability. Another crucial factor is privacy when dealing with sensitive data and deep learning. In this context, TimeFrame introduces two components (PPML, PPML x XAI) for direct and one component (From Private to Public) for indirect privacy. These components benchmark privacy approaches, their effect on interpretability, and the synthetic generation of data to overcome privacy concerns. TimeFrame offers a large set of interpretability and privacy components that can be combined and that consider numerous different aspects. Furthermore, the novel approaches have been shown to consistently outperform twenty existing state-of-the-art methods across up to 20 different datasets. To guarantee fairness, various metrics were used, including performance change, Sensitivity, Infidelity, Continuity, runtime, model dependency, compression rate, and others. This broad set of metrics makes it possible to provide guidelines for a more appropriate use of existing state-of-the-art approaches as well as the novel components included in TimeFrame.
Highly Automated Driving (HAD) vehicles represent complex and safety-critical systems. They are deployed in an open context, i.e., an intricate environment which undergoes continual changes. The complexity of these systems and insufficiencies in sensing and understanding the open context may result in unsafe and uncertain behaviour. The safety-critical nature of HAD vehicles requires modelling the root causes of unsafe behaviour and their mitigation to argue a sufficient reduction of residual risk.
Standardization activities such as ISO 21448 provide guidelines on the Safety Of The Intended Functionality (SOTIF) and focus on the analysis of performance limitations under the influence of triggering conditions that can lead to hazardous behaviour. SOTIF references traditional safety analysis methods, e.g., Failure Mode and Effect Analysis (FMEA) and Fault Tree Analysis (FTA), to perform safety analysis. These analysis methods are based on certain assumptions, e.g., single-point failure in FMEA and independence of basic events in FTA. Moreover, these analyses are generally based on expert knowledge, i.e., data-based models or hybrid approaches (expert and data) are seldom practised. The resulting safety model is fixed, i.e., it is generally seen as a one-time artefact. The open context environment may contain triggering conditions which are not evident to the expert. The open context also evolves over time, and new phenomena may emerge.
This thesis explores the applicability of traditional safety analysis techniques to provide safety models for HAD vehicles operating in the open context, in light of the modelling assumptions made by these techniques. Moreover, incorporating uncertainties into safety analysis models is also explored. An explicit distinction between the inherent uncertainty of a probabilistic event (aleatory) and uncertainty due to lack of knowledge (epistemic) is made to formalize models for performing SOTIF analysis. A further distinction is made for conditions of complete ignorance, termed ontological uncertainty. The distinction is important since, for HAD vehicles operating in the open context, ontological uncertainty can never be completely disregarded.
This thesis proposes a novel SOTIF framework to model, estimate and discover triggering conditions relevant to performance limitations. The framework provides the ability to model uncertainties while also providing a hybrid approach, i.e., supporting the inclusion of expert knowledge as well as data-driven engineering processes. Two representative algorithms are provided to support the framework; Bayesian Networks (BN) and p-value hypothesis testing are utilised in this regard. The framework is applied to a real-world case study in which LiDAR-based perception systems are used as the vehicle detection system.
This doctoral dissertation comprises nine published articles covering different methods for 'Fast, Robust Rigid and Non-Rigid Registration for Globally Consistent 3D Scene and Shape Reconstruction'. Overall, the contributing articles are separated and discussed in three stages. The first part of the thesis, i.e., chapter 2, explains three novel method classes of rigid point set registration, namely the Gravitational Approach (GA), the Fast Gravitational Approach (FGA), and RPSRNet. GA was introduced as the first physics-based rigid point set registration method. It includes an elegant modeling of rigid-body dynamics using Newtonian mechanics. The method opened many new avenues for other types of pattern matching tasks beyond point set registration. The FGA method, published four years after GA, is presented as an extension that reduces the algorithmic complexity of GA from O(MN) to O(M log N) using a Barnes-Hut tree representation of the point cloud. It also eliminates GA's requirement for heuristic optimization parameter settings and achieves state-of-the-art alignment accuracy on LiDAR odometry. Finally, RPSRNet presents a deep learning version of FGA, with custom convolution layers for hierarchical point feature embedding. RPSRNet is robust and the fastest among state-of-the-art methods for LiDAR data registration.
The second part of the thesis, i.e., chapter 3, introduces NRGA as the first physics-based non-rigid point set registration method, which is computationally slow but robust against noisy and partial inputs. NRGA preserves structural consistency as it coherently regularizes the motion of deformable vertices. For articulated hand shape reconstruction, a tailored version of NRGA, Articulated-NRGA, is effective in refining the final hand shape. Collision and penetration avoidance between source and target surfaces is tackled by constrained optimization in NRGA. This setting has improved hand-object interaction reconstruction. The next contribution, the FoldMatch method, remodels shape deformation by introducing a wrinkle vector field (WVF) to capture complex clothing and garment details while fitting body models onto 3D scans. Quantitative evaluation of FoldMatch and NRGA shows their effectiveness in geometrically consistent surface modeling and reconstruction tasks.
Finally, the third part of the thesis explains globally consistent outdoor scene reconstruction, odometry estimation, and uncertainty-guided pose-graph optimization in a novel LiDAR-based localization and map building method called Deep Evidential LiDAR Odometry (DELO). This is the first odometry method to use predictive uncertainty modeling for the sensor pose prediction network.
From industrial fault detection to medical image analysis or financial fraud prevention: Anomaly detection—the task of identifying data points that show significant deviations from the majority of data—is critical in industrial and technological applications. For efficient and effective anomaly detection, a rich set of semantic features are required to be automatically extracted from the complex data. For example, many recent advances in image anomaly detection are based on self-supervised learning, which learns rich features from a large amount of unlabeled complex image data by exploiting data augmentations. For image data, predefined transformations such as rotations are used to generate varying views of the data. Unfortunately, for data other than images, such as time series, tabular data, graphs, or text, it is unclear what are suitable transformations. This becomes an obstacle to successful self-supervised anomaly detection on other data types.
This thesis proposes Neural Transformation Learning, a self-supervised anomaly detection method that is applicable to general data types. In contrast to previous methods relying on hand-crafted transformations, neural transformation learning learns the transformations from data and uses them for detection. The key ingredient is a novel objective that encourages learning diverse transformations while preserving the relevant semantic content of the data. We prove theoretically and empirically that it is more suited than existing objectives for transformation learning.
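A rough sketch of a contrastive-style objective in this spirit is given below; the network sizes, similarity measure and exact loss are illustrative assumptions, and the thesis' precise formulation may differ:

```python
# Hedged sketch (PyTorch): each learned transformation should keep its embedding
# close to the original sample (preserving semantics) while staying away from
# the other transformed views (encouraging diversity).
import torch
import torch.nn as nn
import torch.nn.functional as F

feat_dim, emb_dim, K = 32, 16, 4                      # K learned transformations

transforms = nn.ModuleList(nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                          nn.Linear(feat_dim, feat_dim))
                           for _ in range(K))
encoder = nn.Sequential(nn.Linear(feat_dim, emb_dim))

def transformation_learning_loss(x, tau=0.1):
    z = F.normalize(encoder(x), dim=-1)                       # original embedding
    zs = [F.normalize(encoder(t(x)), dim=-1) for t in transforms]
    loss = 0.0
    for k, zk in enumerate(zs):
        pos = torch.exp((zk * z).sum(-1) / tau)               # stay close to the original
        neg = sum(torch.exp((zk * zl).sum(-1) / tau)          # differ from other views
                  for l, zl in enumerate(zs) if l != k)
        loss = loss - torch.log(pos / (pos + neg)).mean()
    return loss / K

x = torch.randn(8, feat_dim)                 # a batch of (e.g. tabular) samples
print(transformation_learning_loss(x))       # scalar loss; high values flag anomalies at test time
```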
We also introduce extensions of neural transformation learning for anomaly detection within time series and for graph-level anomaly detection. The extensions combine transformation learning with other learning paradigms to incorporate vital prior knowledge about time series and graph data. Moreover, we propose a general training strategy for deep anomaly detection with contaminated data: the idea is to alternate between inferring the unlabeled anomalies and using them to update the parameters. In setups where expert feedback is available, we present a diverse querying strategy based on the seeding algorithm of K-means++ for active anomaly detection.
Our extensive experiments and analysis demonstrate that neural transformation learning achieves remarkable and robust anomaly detection performance on various data types. Finally, we outline specific paths for future research.
Semi-structured data is a common data format in many domains.
It is characterized by a hierarchical structure and a schema that is not fixed.
Efficient and scalable processing of this data is therefore challenging, as many existing indexing and processing techniques are not well-suited for this data format.
This dissertation presents a novel approach to processing large JSON datasets.
We describe a new data processor, JODA, that is designed to process semi-structured data by using all available computing resources and state-of-the-art techniques.
Using a custom query language and a vertically-scaling pipeline query execution engine, JODA can process large datasets with high throughput.
We optimize JODA by using a novel optimization for iterative query workloads called delta trees, which succinctly represent the changes between two documents.
This allows us to process iterative and exploratory queries efficiently.
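A minimal sketch of the delta idea follows (storing only changed paths and materializing documents lazily); JODA's actual delta-tree data structure is more involved, and deletions are omitted here for brevity:

```python
# Hedged sketch: represent the difference between a base JSON document and its
# transformed version as a small set of changed paths.
import copy

def diff(old, new, path=""):
    """Return {json-path: new value} for the parts that differ between two documents."""
    if isinstance(old, dict) and isinstance(new, dict):
        changes = {}
        for key in new:
            sub = f"{path}/{key}"
            if key not in old:
                changes[sub] = new[key]
            else:
                changes.update(diff(old[key], new[key], sub))
        return changes
    return {} if old == new else {path: new}

def apply_delta(doc, delta):
    """Materialize a document from a base document plus a delta."""
    out = copy.deepcopy(doc)
    for p, value in delta.items():
        keys = p.strip("/").split("/")
        node = out
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return out

base = {"user": {"name": "alice", "visits": 3}, "active": True}
step = {"user": {"name": "alice", "visits": 4}, "active": True, "score": 0.7}
delta = diff(base, step)
print(delta)                             # {'/user/visits': 4, '/score': 0.7}
print(apply_delta(base, delta) == step)  # True
```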
We improve the filtering performance of JODA by implementing a holistic adaptive indexing approach that creates and improves structural and content indices on the fly, depending on the query load.
No prior knowledge about the data is required, and the indices are automatically improved over time.
JODA is also modularized and can be extended with new user-defined predicates, functions, indices, import, and export functionalities.
These modules can be written in an external programming language and integrated into the query execution pipeline at runtime.
To evaluate this system against competitors, we introduce a benchmark generator, coined BETZE, which aims to simulate data scientists exploring unknown JSON datasets.
The generator can be tweaked to generate query workloads with different characteristics, or predefined presets can be used to quickly generate a benchmark.
We see that JODA outperforms competitors in most tasks over a wide range of datasets and use-cases.
A 3D joint-angle based human pose is needed for applications like activity recognition, musculoskeletal health, sports biomechanics and ergonomics. Microelectromechanical systems (MEMS) based magnetic-inertial measurement units (MIMUs) can estimate 3D orientation. Due to their small size, MIMUs can be attached to the body as wearable sensors for obtaining the full 3D human pose; this system is termed inertial motion capture (i-Mocap). However, MIMUs suffer from sensor errors and disturbances, due to which the orientation estimated from individual MIMUs can be erroneous. Accurate sensor calibration is essential, and subsequently the alignment of these sensors to the body segments must also be precisely known, which is called sensor-to-segment calibration. Sensor fusion is employed to address the disturbances and noise in MIMUs. Many state-of-the-art inertial motion capture approaches ignore the magnetometer and only use IMUs to reduce the error arising from inhomogeneous magnetic fields. These algorithms rely on kinematic constraints and assumptions regarding joints and are based on IMUs located on adjacent body segments. Full-body coverage requires 13-17 such units and can be quite obtrusive. Setting up and calibrating so many wearable sensors also takes time.
This thesis focuses on 3D human pose estimation from a reduced number of MIMUs and deals with this problem systematically. First, we propose an accurate simultaneous calibration of multiple MIMUs which also learns the uncertainty of individual sensors. We then describe a novel sensor fusion algorithm for robust orientation estimation from an MIMU and for updating the sensor calibration online. The residual errors in both sensor calibration and fusion can result in drift error in the joint angles. Therefore, we present an anatomical (sensor-to-segment) calibration in which an orientation offset correction term is updated and used for online correction of the residual drift in individual joint angles. Subsequently, we demonstrate that 3D human joint angle constraints can be learned using a data-driven approach in a high-dimensional latent space. Owing to temporal and joint angle constraints, it is possible to use only a reduced set of sensors (as opposed to one sensor per segment) and still obtain the 3D human pose. However, the spatial and temporal priors learned from data are often limited due to the finite set of movement patterns in most datasets. This introduces uncertainty when estimating 3D human pose from sparse MIMU sensors. We propose a magnetometer-robust orientation parameterization and a data-driven deep learning framework to predict 3D human pose with associated uncertainty from sparse MIMUs. The model is evaluated on real MIMU data, and we show that the uncertainty predicted by the trained model is well correlated with the actual error and ambiguity.
Though Computer Aided Design (CAD) and simulation software are mature, well established, and in wide professional use, modern design and prototyping pipelines are challenging the limits of these tools. Advances in 3D printing have brought manufacturing capability to the general public. Moreover, advances in machine learning and sensor technology are enabling enthusiasts and small companies to develop their own autonomous vehicles and machines. This means that many more users are designing (or customizing) 3D objects in CAD, and many are testing machine autonomy in simulation. Though Graphical User Interfaces (GUIs) are the de facto standard for these tools, we find that these interfaces are neither robust nor flexible. For example, designs made using a GUI often break when customized, and setting up large simulations in a GUI can be quite tedious. Though programmatic interfaces do not suffer from these limitations, they are generally quite difficult to use and often do not provide appropriate abstractions and language constructs.
In this thesis, we present our work on combining the ease of use of GUIs with the robustness and flexibility of programming. For CAD, we propose an interactive framework that automatically synthesizes robust programs from GUI-based design operations. Additionally, we apply program analysis to ensure customizations do not lead to invalid objects. Finally, for simulation, we propose a novel programmatic framework that simplifies the building of complex test environments, and a test generation mechanism that guarantees good coverage over test parameters. Our contributions help bring some of the advantages of programming to traditionally GUI-dominant workflows. Through novel programmatic interfaces, and without sacrificing ease of use, we show that the design and customization of 3D objects can be made more robust and that the creation of parameterized simulations can be simplified.
Faces deliver invaluable information about people. Machine-based perception can be of great benefit in extracting the underlying information in face images if the problem is properly modeled. Classical image processing algorithms may fail to handle the diverse data available today due to challenges related to varying capturing locations and conditions. The rapid development of powerful hardware has made advanced machine learning methods feasible, enabling solutions that learn from data and summarize it into powerful models. In this thesis, novel solutions are provided to the problems of head orientation estimation and gender prediction. Initially, classical machine learning algorithms were used to address head orientation estimation but were limited by their inability to handle large datasets and by poor generalization. To overcome these challenges, a new, highly accurate head pose dataset was acquired, and novel deep neural network architectures were trained on it. The information about head pose is then encoded in the network weights, allowing the head orientation angles to be predicted for a new, unseen face. The acquired dataset, named AutoPOSE, opens the door for further studies in computer vision and especially face analysis. The problem of gender prediction has also been explored; unlike humans, who can easily identify gender from a face, computers face difficulties due to facial similarities, and hand-crafted features do not generalize well. To address this, a new deep learning method was developed and evaluated on multiple public datasets, addressing the identified challenges in both still images and videos. Finally, the effect of facial appearance changes due to head orientation variation on gender prediction accuracy has been investigated. A novel orientation-guided feature-map recalibration method is presented that significantly increases the accuracy of gender prediction.
In conclusion, two problems have been addressed in this thesis, both independently and jointly. Existing methods have been enhanced with intelligent pre-processing, and new approaches have been introduced to tackle challenges that arise from pose, illumination, and occlusion variations. The proposed methods have been extensively evaluated, showing that head orientation and gender can be estimated with high accuracy using machine learning-based methods. The evaluations also showed that the use of head orientation information consistently improves gender prediction accuracy. Scientific contributions have been presented, and the newly acquired, highly accurate dataset motivates the research community to push the state of the art forward.
Undocumented enterprise data can easily pile up in companies in the form of datasets and personal information. In the absence of a data management strategy, such data becomes rather messy and may not be fit for its intended use. Since there is often no documentation available, only a limited number of domain experts are aware of its contents. It therefore becomes increasingly difficult for companies to use such data to its full potential. To provide a solution, this PhD thesis investigates the construction of enterprise and personal knowledge graphs by semantically enriching messy data with meaning using semantic technologies. Since real-world entities and their interrelations are organized in a graph, knowledge graphs serve as a semantic bridge between domain conceptualization and raw data. Spreadsheets are a prominent example of such enterprise data, since they are widely used by knowledge workers in the industrial sector. Two distinct approaches are investigated to construct knowledge graphs from them: a global extraction and annotation method and a local mapping technique. The latter is further complemented with a predictor of mapping rules on messy data. Different human-in-the-loop strategies are considered to include experts depending on their user group. Since non-technical users usually lack an understanding of semantic technologies, they need appropriate tools to be able to give feedback. In the case of developers, approaches are proposed to close the technology gap between industry and Semantic Web concepts, while Semantic Web practitioners participate through ontology modeling and linked data applications. Enterprise and personal data is typically confidential, which is why it cannot be shared with a research community to discuss its challenges. For evaluation and reproducibility, however, publicly available datasets are mandatory; the thesis therefore proposes ways to generate synthetic datasets that are as authentic as possible. In addition, a crawler of personal data on desktops is implemented for internal evaluations. There are further contributions related to this thesis in diverse domains: one concerns supporting users in their daily work with personal knowledge assistants, and others lie in the agricultural field and the data science domain, which also benefit from knowledge graph approaches. In conclusion, this PhD thesis contributes to the construction of knowledge graphs from especially messy enterprise data, while users from different groups take part in this process in various ways.
This thesis focuses on novel methods to establish the utility of wearable devices, together with machine learning and pattern recognition methods, for formal education, and addresses the open research questions posed by existing methods. Firstly, state-of-the-art methods are proposed to analyse the cognitive activities in the learning process, i.e., reading, writing, and their correlation. Furthermore, this thesis presents real-time applications in the wearable space: an experimental tool for Physics education and an air-writing system.
There are two critical components in analysing reading behaviour: WHERE a person looks (gaze analysis) and WHAT a person looks at (content analysis). This thesis proposes novel methods to classify the reading content, addressing the WHAT component. The proposed methods are based on a hybrid approach that fuses traditional computer vision methods with deep neural networks. When evaluated on publicly available datasets, these methods yield state-of-the-art results in determining the structure of document images. Moreover, extensive efforts were made to refine and correct the ICDAR2017-POD dataset and to create the completely new FFD dataset.
Traditionally, handwriting research has focused on character and number recognition without looking into the type of writing, i.e., text, math, or drawing. This thesis reports multiple contributions to on-line handwriting classification. First, it presents OnTabWriter, a public dataset for on-line handwriting classification collected using iPen and an iPad. In addition, a new feature set is introduced for on-line handwriting classification to establish a benchmark on the proposed dataset for classifying handwriting as plain text, mathematical expression, or plot/graph. An ablation study evaluates the performance of the proposed feature set in comparison to existing feature sets. Lastly, this thesis evaluates the importance of context for on-line handwriting classification.
Analysing reading and writing activities individually is not enough to identify a student's expertise unless their correlations are analysed. This thesis presents a study in which reading data from wearable eye-trackers and writing data from a sensor pen are analysed together to correlate users' expertise in Physics education with their actual knowledge. Initial results show a strong correlation between an individual's expertise and their understanding of the subject.
Augmented and virtual reality applications can play a vital role in making classroom environments more interactive and engaging for both teachers and learners. To validate this hypothesis, different applications are developed and evaluated. First, smart glasses are used as an experimental tool in Physics education, helping learners perform experiments by providing assistance and feedback on a head-mounted display while they explore acoustics concepts. Second, FAirWrite, a real-time air-writing system that uses a single IMU to write with the finger on an imaginary canvas, is presented; FAirWrite is further equipped with deep learning methods to classify the air-written characters.
Due to its performance, the field of deep learning has gained a lot of attention, with neural networks succeeding in areas like Computer Vision (CV), Natural Language Processing (NLP), and Reinforcement Learning (RL). However, high accuracy comes at a computational cost, as larger networks require longer training time and no longer fit onto a single GPU. To reduce training costs, researchers are looking into the dynamics of different optimizers in order to find ways to make training more efficient. Resource requirements can be limited by reducing model size during training or by designing more efficient models that improve accuracy without increasing network size.
This thesis combines eigenvalue computation and high-dimensional loss surface visualization to study different optimizers and deep neural network models. Eigenvectors of different eigenvalues are computed, and the loss landscape and optimizer trajectory are projected onto the plane spanned by those eigenvectors. A new parallelization method for the stochastic Lanczos method is introduced, resulting in faster computation and thus enabling high-resolution videos of the trajectory and second-order information during neural network training. Additionally, the thesis presents the loss landscape between two minima along with the eigenvalue density spectrum at intermediate points for the first time.
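As a minimal illustration of the second-order quantities discussed here, the following PyTorch sketch estimates the largest Hessian eigenvalue and its eigenvector via power iteration on Hessian-vector products. The thesis itself relies on a parallelized stochastic Lanczos method, so this is only a simplified stand-in under assumed function and parameter names.

import torch

def top_hessian_eigenpair(loss, params, iters=50):
    # Power iteration on the Hessian of `loss` w.r.t. `params`,
    # using Hessian-vector products obtained from autograd.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])
    v = torch.randn_like(flat_grad)
    v = v / v.norm()
    eigval = torch.tensor(0.0)
    for _ in range(iters):
        hv = torch.autograd.grad(flat_grad @ v, params, retain_graph=True)
        hv = torch.cat([h.reshape(-1) for h in hv])
        eigval = v @ hv                  # Rayleigh quotient (v has unit norm)
        v = hv / (hv.norm() + 1e-12)
    return eigval.item(), v.detach()

Two such eigenvectors (e.g., for the two largest eigenvalues, obtained with deflation) could then span the plane onto which the loss landscape and optimizer trajectory are projected.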
Secondly, this thesis presents a regularization method for Generative Adversarial Networks (GANs) that uses second-order information. The gradient during training is modified by subtracting the eigenvector direction of the largest eigenvalue, preventing the network from falling into the steepest minima and avoiding mode collapse. The thesis also shows the full eigenvalue density spectra of GANs during training.
Thirdly, this thesis introduces ProxSGD, a proximal algorithm for neural network training that guarantees convergence to a stationary point and unifies multiple popular optimizers. Proximal gradients are used to find a closed-form solution to the problem of training neural networks with smooth and non-smooth regularizations, resulting in better sparsity and more efficient optimization. Experiments show that ProxSGD can find sparser networks while reaching the same accuracy as popular optimizers.
Lastly, this thesis unifies sparsity and neural architecture search (NAS) through the framework of group sparsity. Group sparsity is achieved through ℓ2,1-regularization during training, allowing for filter and operation pruning to reduce model size with minimal sacrifice in accuracy. By grouping multiple operations together, group sparsity can be used for NAS as well. This approach is shown to be more robust while still achieving competitive accuracy compared to state-of-the-art methods.
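To make the group-sparsity idea concrete, the following sketch shows a filter-wise ℓ2,1 penalty and its closed-form proximal operator (group-wise soft thresholding), which is the kind of update a proximal method such as ProxSGD can exploit. The grouping, step sizes, and usage shown here are illustrative assumptions rather than the exact formulation used in the thesis.

import torch

def group_l21_penalty(weight):
    # Treat each output filter (first dimension) as one group and
    # sum the L2 norms of the groups: sum_g ||w_g||_2.
    groups = weight.reshape(weight.size(0), -1)
    return groups.norm(dim=1).sum()

def prox_group_l21(weight, step):
    # Closed-form proximal operator of step * ||.||_{2,1}:
    # group-wise soft thresholding, shrinking whole filters to zero.
    groups = weight.reshape(weight.size(0), -1)
    norms = groups.norm(dim=1, keepdim=True)
    scale = torch.clamp(1.0 - step / (norms + 1e-12), min=0.0)
    return (groups * scale).reshape_as(weight)

# Illustrative proximal-SGD step for one convolution weight tensor w:
#   w = prox_group_l21(w - lr * grad, lr * lam)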
In recent years, the formal methods community has made significant progress towards the development of industrial-strength static analysis tools that can check properties of real-world production code. Such tools can help developers detect potential bugs and security vulnerabilities in critical software before deployment. While the potential benefits of static analysis tools are clear, their usability and effectiveness in mainstream software development workflows often come into question, which can prevent software developers from using these tools to their full potential. In this dissertation, we focus on two major challenges that can limit their incorporation into software development workflows.
The first challenge is unintentional unsoundness. Static program analyzers are complicated tools, implementing sophisticated algorithms and performance heuristics. This makes them highly susceptible to undetected, unintentional soundness issues. Such issues can cause false negatives and have disastrous consequences, e.g., when analyzing safety-critical software. In this dissertation, we present novel techniques to detect unintentional unsoundness bugs in two foundational program analysis tools, namely SMT solvers and Datalog engines. These tools are used extensively by the formal methods community, for instance in software verification, systematic testing, and program synthesis. We implemented these techniques as easy-to-use open source tools that are publicly available on GitHub. With the proposed techniques, we were able to detect more than 55 unique and confirmed critical soundness bugs in popular and widely used SMT solvers and Datalog engines in only a few months of testing.
The second challenge is finding the right balance between soundness, precision, and performance. In an ideal world, a static analyzer should be as precise as possible while maintaining soundness and being sufficiently fast. However, to overcome undecidability issues, these tools have to employ a variety of techniques to be practical, for example compromising on the soundness of the analysis or approximating code behavior. Static analyzers are therefore not trivial to integrate into arbitrary usage scenarios with different program sizes, resource constraints, and SLAs. Most of the time, these tools also do not scale to large industrial code bases containing millions of lines of code. This makes it extremely challenging to get the most out of these analyzers and integrate them into everyday development activities, especially for average software development teams with little to no knowledge or understanding of advanced static analysis techniques. In this dissertation, we present an approach to automatically tailor an abstract interpreter to the code under analysis and any given resource constraints. We implemented our technique as an open source framework, which is publicly available on GitHub. The second contribution of this dissertation in this challenge area is a technique to horizontally scale analysis tools in cloud-based static analysis platforms by splitting the input to the analyzer into partitions and analyzing the partitions independently. The technique was developed in collaboration with Amazon Web Services and is now being used in production in their CodeGuru service.
The rising demand for machine learning (ML) models has become a growing concern for stakeholders who depend on automatic decisions. In today's world, black-box solutions (in particular deep neural networks) are continuously being deployed in more and more high-stakes scenarios like medical diagnosis or autonomous vehicles. Unfortunately, when these opaque models make predictions that do not align with our expectations, finding a valid justification is simply not possible.
Explainable Artificial Intelligence (XAI) has emerged in response to our need to find reasons that justify what a machine sees but we do not. However, contributions in this field are mostly centered around local structures such as individual neurons or single input samples. Global characteristics that govern the behavior of a model are still poorly understood or have not been explored yet. An aggravating factor is the lack of a standard terminology with which to contextualize and compare contributions in this field. This lack of consensus is preventing the ML community from ultimately moving away from black boxes and from starting to create systematic methods for designing models that are interpretable by design.
So, what are the global patterns that govern the behavior of modern neural networks, and what can we do to make these models more interpretable from the start?
This thesis delves into both issues, unveiling patterns about existing models, and establishing strategies that lead to more interpretable architectures. These include biases coming from imbalanced datasets, quantification of model capacity, and robustness against adversarial attacks. When looking for new models that are interpretable by design, this work proposes a strategy to add more structure to neural networks, based on auxiliary tasks that are semantically related to the main objective. This strategy is the result of applying a novel theoretical framework proposed as part of this work. The XAI framework is meant to contextualize and compare contributions in XAI by providing actionable definitions for terms like "explanation" and "interpretation."
Altogether, these contributions address a dire need to understand more about the global behavior of modern deep neural networks. More importantly, they can be used as a blueprint for designing novel and more interpretable architectures. By tackling issues from the present and the future of XAI, the results of this work are a firm step towards more interpretable models for computer vision.
With the ever-increasing amount of satellite-backed communication, constellations covering the entire world, and the rise of Software Defined Radios (SDRs), satellite signals have become prime targets for scientific research all over the globe. However, due to logistical challenges such as capture time and location, peripheral and system management for the sensors, and the wide variety of protocols and encoding schemes used, no one-size-fits-all sniffing solution exists for capturing this wide variety of signals. This thesis therefore aims to analyze, design, and implement a system that makes it possible to study LEO (Low Earth Orbit) L-band satellite signals with readily available Single Board Computers (SBCs) in a widely distributed, location- and time-aware way. The key design factors were usability, maintainability, adaptability, and security in a centrally managed client-server architecture. The research yielded a satellite probe operating system called SATOS, which aims to implement on-sensor data decoding driven by GNU Radio and secure Over The Air (OTA) updates inside the Buildroot build environment. Its intended use case is the future deployment of DISCOSAT at a university working-group scale.
Processing data streams is a classical and ubiquitous problem.
A query is registered against a potentially endless data stream and continuously delivers results as tuples stream in.
Modern stream processing systems allow users to express queries in different ways.
However, when a query involves joins between multiple input streams, the order of these joins is not transparently optimized.
In this thesis, we explore ways to optimize multi-way theta joins, where the join predicates are not limited to equality and multiple inputs are referenced.
We put forward a novel operator, MultiStream, which joins multiple input streams using iterative probing while requiring minimal materialization effort.
The order in which tuples are sent inside a MultiStream operator is optimized using a cost-based model.
Further, a query can be answered using a multi-way tree comprising multiple MultiStream operators, where each inner operator represents a materialized intermediate result.
We integrate equi-joins in MultiStream to reduce communication, such that mixed queries of theta and equality predicates are supported.
Streaming queries are long-running, and thus multiple queries might be registered with the system at the same time.
Hence, we investigate the joint answering of multiple multi-way join queries and optimize the global ordering using integer linear programming.
All these approaches are implemented in CLASH, a system for generating Apache Storm topologies, including runtime components, that enables users to pose queries in a declarative way and lets the system craft a suitable topology.
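The following toy sketch illustrates the iterative-probing idea behind a multi-way theta join: every input stream is buffered, and each arriving tuple is probed against the buffered tuples of the other streams under a user-supplied predicate. It is a deliberately simplified, hypothetical illustration and omits the cost-based routing, windowing, and distribution that MultiStream and CLASH actually provide.

from itertools import product

class ThetaJoinOperator:
    # Toy multi-way theta join over n streams via iterative probing.
    # Buffers every input; a new tuple on one stream is combined with all
    # buffered tuples of the other streams and emitted if the predicate holds.
    def __init__(self, num_streams, predicate):
        self.buffers = [[] for _ in range(num_streams)]
        self.predicate = predicate  # callable taking one tuple per stream

    def insert(self, stream_id, tup):
        self.buffers[stream_id].append(tup)
        others = [buf if i != stream_id else [tup]
                  for i, buf in enumerate(self.buffers)]
        for combo in product(*others):
            if self.predicate(*combo):
                yield combo

# Example: readings from three streams that must be strictly increasing.
op = ThetaJoinOperator(3, lambda a, b, c: a < b < c)
list(op.insert(0, 1)); list(op.insert(1, 5))
print(list(op.insert(2, 9)))   # [(1, 5, 9)]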
An Efficient Automated Machine Learning Framework for Genomics and Proteomics Sequence Analysis
(2023)
Genomics and Proteomics sequence analyses are the scientific studies of understanding the language of Deoxyribonucleic Acid (DNA), Ribonucleic Acid (RNA), and protein biomolecules, with the objective of controlling the production of proteins and understanding their core functionalities. They help to detect chronic diseases at early stages, identify root causes of clinical changes and key genetic targets for pharmaceutical development, and optimize therapeutics for various age groups. Most Genomics and Proteomics sequence analysis work is performed using typical wet-lab experimental approaches that make use of different genetic diagnostic technologies. However, these approaches are costly, time-consuming, and skill- and labor-intensive. Hence, they slow down the development of an efficient and economical sequence analysis landscape essential to demystify a variety of cellular processes and the functioning of biomolecules in living organisms. To empower manual, wet-lab-experiment-driven research, many machine learning based approaches have been developed in recent years. However, these approaches cannot be used in practical environments due to their limited performance. Considering the sensitive and inherently demanding nature of Genomics and Proteomics sequence analysis, which can have far-reaching and serious repercussions in the case of misdiagnosis, the main objective of this research is to develop an efficient automated computational framework for Genomics and Proteomics sequence analysis using the predictive and prescriptive analytical powers of Artificial Intelligence (AI) to significantly improve healthcare operations.
The proposed framework comprises three main components, namely sequence encoding, feature engineering, and a discrete or continuous value predictor. The sequence encoding module is equipped with a variety of existing and newly developed sequence encoding algorithms that are capable of generating a rich statistical representation of raw DNA, RNA, and protein sequences. The feature engineering module offers diverse types of feature selection and dimensionality reduction approaches which can be used to generate the most effective feature space. Furthermore, the discrete and/or continuous value predictor module of the proposed framework contains a wide range of existing machine learning and newly developed deep learning regressors and classifiers. To evaluate the integrity and generalizability of the proposed framework, we have performed large-scale experiments over diverse types of Genomics and Proteomics sequence analysis tasks (i.e., DNA, RNA, and proteins).
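As a rough, hypothetical illustration of how the three components can be composed, the following scikit-learn sketch encodes DNA sequences as k-mer frequencies, applies feature selection, and trains a classifier on randomly generated placeholder data. The actual framework contains many more encoders, feature engineering methods, and deep learning predictors.

from itertools import product
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

def kmer_encode(sequences, k=3, alphabet="ACGT"):
    # Sequence encoding: represent each DNA sequence by its k-mer frequencies.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {km: i for i, km in enumerate(kmers)}
    X = np.zeros((len(sequences), len(kmers)))
    for row, seq in enumerate(sequences):
        for i in range(len(seq) - k + 1):
            j = index.get(seq[i:i + k])
            if j is not None:
                X[row, j] += 1
    return X / np.maximum(X.sum(axis=1, keepdims=True), 1)

# Placeholder data: random sequences and binary labels, for illustration only.
rng = np.random.default_rng(0)
seqs = ["".join(rng.choice(list("ACGT"), size=50)) for _ in range(40)]
y = rng.integers(0, 2, size=40)

pipeline = Pipeline([
    ("encode", FunctionTransformer(kmer_encode)),             # sequence encoding
    ("select", SelectKBest(chi2, k=10)),                      # feature engineering
    ("classify", RandomForestClassifier(n_estimators=100)),   # discrete value predictor
])
pipeline.fit(seqs, y)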
In Genomics analysis, epigenetic modification detection is one of the key components. It helps clinical researchers and practitioners to distinguish normal cellular activities from malfunctioning ones, which can lead to diverse genetic disorders such as metabolic disorders and cancers. To support this analysis, the proposed framework is used to solve the problem of DNA and histone modification prediction, where it achieves state-of-the-art performance on 27 publicly available benchmark datasets of 17 different species, with a best accuracy of 97%. RNA sequence analysis is another vital component of Genomics sequence analysis, where the identification of different coding and non-coding RNAs as well as their subcellular localization patterns helps to demystify the functions of diverse RNAs and the root causes of clinical changes, develop precision medicine, and optimize therapeutics. To support this analysis, the proposed framework is utilized for non-coding RNA classification and multi-compartment RNA subcellular localization prediction, where it achieves state-of-the-art performance on 10 publicly available benchmark datasets of the Homo sapiens and Mus musculus species, with a best accuracy of 98%.
Proteomics sequence analysis is essential to demystify virus pathogenesis, host immunity responses, the way proteins affect or are affected by cellular processes, and their structure and core functionalities. To support this analysis, the proposed framework is used for host protein-protein and virus-host protein-protein interaction prediction. It achieves state-of-the-art performance on 2 publicly available protein-protein interaction datasets of the Homo sapiens and Mus musculus species, with a best accuracy of 96%, and on 7 virus-host protein-protein interaction datasets of multiple hosts and viruses, with a best accuracy of 94%. Considering the performance and practical significance of the proposed framework, we believe it will help researchers develop cutting-edge practical applications for diverse Genomics and Proteomics sequence analysis tasks (i.e., DNA, RNA, and proteins).
Scientific research plays a crucial role in the development of a society. The ever-increasing volume of scientific publications is making it extremely challenging to analyze and maintain insights into scientific communities, such as collaboration or citation trends and the evolution of interests. This thesis is an effort towards using scientific publications to provide detailed insights into a scientific community from a range of aspects. The contribution of this thesis is five-fold.
Firstly, this thesis proposes approaches for automatic information extraction from scientific publications. The proposed layout-based approach is inspired by how human beings perceive individual references relying only on visual cues. It significantly outperforms existing text-based techniques and is independent of any domain or language.
Secondly, this thesis tackles the problem of identifying meaningful topics for a given publication, as the keywords provided in the publication are not always accurate representatives of its topic. To rectify this problem, this thesis proposes a state-of-the-art keyword extraction approach that employs a domain ontology along with the detected keywords to perform topic modeling for a given set of publications.
Thirdly, this thesis analyses the disposition of each citation to understand its true essence. For this purpose, we propose a transformer-based approach for analyzing the impact of each citation appearing in a scientific publication. The impact of a citation can be determined by its inherent sentiment and intent, which refer to the author's assessment of and motive for citing a scientific publication.
Furthermore, this thesis quantifies the influence of a research contributor in a scientific community by introducing a new semantic index for researchers that takes both quantitative and qualitative aspects of a citation into account to better represent the prestige of a researcher. The semantic index is also evaluated for conformity to the guidelines and recommendations of various research funding organizations for assessing the impact of a researcher.
In this thesis, all of the aforementioned aspects are packaged together in a single framework called Academic Community Explorer (ACE) 2.0, which automatically extracts and analyzes information from scientific publications and visualizes the insights using several interactive visualizations. These visualizations provide an instant glimpse into scientific communities from a wide range of aspects at different granularity levels.
The generally unsupervised nature of autoencoder models implies that the main training metric is formulated as the error between input images and their corresponding reconstructions. Different reconstruction loss variations and latent space regularizations have been shown to improve model performance, depending on the task to be solved, and to induce new desirable properties like disentanglement. Nevertheless, measuring success in, or enforcing properties via, the input pixel space is a challenging endeavor. In this work, we want to make more efficient use of the available data and provide design choices to be considered in the recording or generation of future datasets to implicitly induce desirable properties during training. To this end, we propose a new sampling technique which matches semantically important parts of the image while randomizing the other parts, leading to salient feature extraction and the neglect of unimportant details. Further, we propose to recursively apply a previously trained autoencoder model, which can then be interpreted as a dynamical system with desirable properties for generalization and uncertainty estimation.
The proposed methods can be combined with any existing reconstruction loss. We give a detailed analysis of the resulting properties on various datasets and show improvements on several computer vision tasks: image and illumination normalization, invariances, synthetic-to-real generalization, uncertainty estimation, and improved classification accuracy by means of simple classifiers in the latent space.
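A minimal sketch of the recursive application idea, assuming some already trained autoencoder callable, is shown below: the model is applied repeatedly to its own reconstruction, and the per-step change serves as a simple convergence and uncertainty signal. This is an illustrative interpretation, not the exact procedure or uncertainty measure used in this work.

import torch

@torch.no_grad()
def recursive_reconstruct(autoencoder, x, steps=10):
    # Iterate x_{t+1} = autoencoder(x_t) on a batched image tensor and record
    # how much each step still changes the input; small residual changes
    # suggest the input lies close to the model's learned manifold.
    deltas = []
    for _ in range(steps):
        x_next = autoencoder(x)
        deltas.append((x_next - x).flatten(1).norm(dim=1))  # per-sample change
        x = x_next
    return x, torch.stack(deltas, dim=1)  # fixed-point estimate, change per step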
These investigations are applied to the automotive use case of vehicle interior rear seat occupant classification. For the latter, we release a synthetic dataset with several fine-grained extensions such that all the aforementioned topics can be investigated in isolation, or together, in a single application environment. We provide quantitative evidence that machine learning, and in particular deep learning, methods cannot readily be used in industrial applications when only a limited amount of variation is available for training. The latter is, however, often the case because of constraints imposed by the application under consideration and financial limitations.
In recent years, deep learning has made substantial improvements in various fields such as image understanding and Natural Language Processing (NLP). These advancements have led to the release of many commercial applications that aim to help users carry out their daily tasks. Personal digital assistants are one such successful application of NLP, with a diverse user base from all age groups. NLP tasks like Natural Language Understanding (NLU) and Natural Language Generation (NLG) are core components for building these assistants. However, like any other deep learning model, the growth of NLU and NLG models is directly coupled with the availability of tremendous amounts of training examples, which are expensive to collect due to annotator costs. Therefore, this work investigates methodologies to build NLU and NLG systems in a data-constrained setting.
We evaluate the problem of limited training data in multiple scenarios like limited or no data available when building a new system, availability of a few labeled examples when adding a new feature to an existing system, and changes in the distribution of test data during the lifetime of a deployed system.
Motivated by standard methods for handling data-constrained settings, we propose novel approaches to generate data and exploit latent representations to overcome performance drops arising from limited training data. We propose a framework to generate high-quality synthetic data when few training examples are available for a newly added feature of a dialogue agent. Our interpretation-to-text model uses existing training data for bootstrapping new features and improves the accuracy of the downstream tasks of intent classification and slot labeling. Next, we study a few-shot setting and observe that generation systems suffer from a low semantic coverage problem. Hence, we present an unsupervised NLG algorithm that ensures that all relevant semantic information is present in the generated text.
We also study whether all training examples are really needed for learning a generalized model. We propose a data selection method that selects the most informative training examples to train Visual Question Answering (VQA) models without erosion of accuracy. We make use of the already available inter-annotator agreement and design a diagnostic tool, called EaSe, that leverages the entropy and semantic similarity of answer patterns.
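The following sketch illustrates only the entropy part of such an agreement-based difficulty score over annotator answers; the actual EaSe tool additionally incorporates the semantic similarity of answer patterns, which is omitted here, and the example answers are made up.

import math
from collections import Counter

def answer_entropy(answers):
    # Shannon entropy of the annotator answer distribution for one VQA
    # question; low entropy means high agreement, i.e., an "easy" example.
    counts = Counter(a.strip().lower() for a in answers)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(answer_entropy(["red", "red", "red", "Red"]))           # 0.0 (high agreement)
print(answer_entropy(["red", "maroon", "dark red", "pink"]))  # 2.0 (low agreement)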
Finally, we discuss two empirical studies to understand the feature space of VQA models and show how language model pre-training and exploiting a multimodal embedding space allow for building data-constrained models with minimal or no accuracy loss.
In recent years, Augmented Reality (AR) has made its way into everyday devices. Most smartphones are AR-enabled, providing applications like pedestrian navigation, Point of Interest highlighting, gaming, and retail. The high-tech industry has been focused on developing smartglasses to present virtual elements directly in front of the viewer's eyes, allowing more immersive AR experiences. Smartglasses can also be deployed while driving for an enhanced and safer experience. A 3D-registered augmentation of the real world with navigation arrows, lane highlighting, or warnings can decrease the time drivers are inattentive due to glancing at other screens. Enabling the use of HMDs inside cars requires knowing their exact position and orientation (6-DoF pose) in the car. This necessitates sensors built either into the AR glasses or into the car. In a car, the latter option, called outside-in tracking, is more attractive for two reasons. First, AR glasses with different sensor sets exist, which hampers finding a single solution for different HMDs. Second, the view from the driver's perspective combines static interior and dynamic exterior features, complicating the search for a reliable set of features. Nowadays, tracking methods utilize Deep Learning for a more generalizable and accurate derivation of the 6-DoF pose, achieving outstanding results for head and object pose estimation. In this thesis, we present Deep Learning-based in-car 6-DoF AR glasses pose estimation approaches. The goal of the work is an exploration of accurate HMD pose estimation with the help of neural networks, which the thesis achieves by investigating numerous pose estimation techniques. Evaluations on the recorded HMDPose dataset, consisting of infrared images of drivers wearing different HMD models, constitute the foundation for this. First, algorithms based on images are derived and evaluated on the dataset. For comparison, we carry out an evaluation of image-based methods that take time information into account. Further, pose estimation based on point clouds, generated from the infrared images, is analyzed. An investigation of various head pose estimation methods is conducted to assess their potential use. In conclusion, we introduce several highly accurate AR glasses pose estimators. The HMD pose alone achieves better results than the head pose or the combination of head and HMD pose. In particular, our image-based methods with optional use of time information can efficiently and accurately regress the AR glasses pose. Our algorithms show excellent estimation results on live data when deployed inside a car, making seamless in-car HMD usage possible in the future.
The wireless spectrum is already a scarce resource, shared by multiple competing technologies such as Bluetooth, ZigBee, and Wi-Fi, and the hunger for traffic is only increasing. Due to the heterogeneity of the existing wireless technologies and the real threat that interference poses to network performance, sophisticated techniques must be developed to ensure acceptable levels of quality of service.
In this thesis, we present a passive channel sensing scheme based on both energy and signal detection that primarily considers the spectrum occupation of foreign traffic while allowing for additional complementary information such as the signal-to-noise ratio. The resulting channel quality metric is first corrected for the spectrum occupation of internal transmissions and then aggregated with the help of a moving average followed by an exponentially weighted moving average. This aggregation keeps the metric both sufficiently stable and adaptive to significant changes in channel usage. Moreover, the channel quality metric is made volatility-aware by penalizing qualities proportionally to their downward volatility. This yields a conservative metric and allows channels with similar aggregated qualities but different volatility behavior to be differentiated.
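The following toy sketch illustrates this style of aggregation: a short moving average, an exponentially weighted moving average, and a penalty proportional to the downward volatility. The window size, smoothing factor, penalty weight, and exact penalty form are illustrative assumptions, not the values or formulas used in the thesis.

from collections import deque

class ChannelQuality:
    # Toy aggregation of raw per-interval channel quality samples in [0, 1]:
    # moving average -> exponentially weighted moving average -> penalty
    # proportional to the observed downward volatility.
    def __init__(self, window=8, alpha=0.2, penalty=0.5):
        self.samples = deque(maxlen=window)
        self.alpha, self.penalty = alpha, penalty
        self.ewma, self.down_vol = None, 0.0

    def update(self, quality):
        self.samples.append(quality)
        avg = sum(self.samples) / len(self.samples)           # moving average
        if self.ewma is None:
            self.ewma = avg
        else:
            drop = max(0.0, self.ewma - avg)                  # downward moves only
            self.down_vol = (1 - self.alpha) * self.down_vol + self.alpha * drop
            self.ewma = (1 - self.alpha) * self.ewma + self.alpha * avg
        return max(0.0, self.ewma - self.penalty * self.down_vol)

cq = ChannelQuality()
for q in [0.9, 0.85, 0.2, 0.8, 0.9]:
    print(round(cq.update(q), 3))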
Our second main contribution is a schedule-based channel sensing protocol in which nodes possess two network interfaces, one for communication and one for channel sensing. Channel sensing schedules are derived from communication schedules, i.e., the channel hopping sequences used for communication, with the help of a stochastic local-search heuristic that attempts to minimize the channel sensing bias and the channel overlap between both schedules, and to maximize overlap fairness. This minimizes the effect of internal transmissions on the resulting channel quality metric, allowing nodes to derive channel quality primarily from foreign traffic in an unbiased manner.
Finally, we propose and implement a stabilization protocol for keeping nodes in an ad-hoc network tick-synchronized and schedule-consistent w.r.t. a communication schedule. This stabilization protocol makes use of special messages, namely tick frames for synchronization, channel quality reports for sharing local views of channel conditions and schedule reports for disseminating the global communication hopping sequence. The communication schedules are computed by a master node based on an aggregation of local channel quality views and the re-computation of these schedules is triggered by significant changes in channel conditions. The resulting protocol is robust against changes in topology and channel conditions.
Several applications have emerged and benefited from recent advancements in wireless communication technologies. In industrial automation, wireless networks have substituted wired networks to control and monitor production systems and the factory environment. In such use cases, a common requirement is communication reliability. Technologies based on IEEE 802.15.4, such as WirelessHART and ZigBee, developed for industrial applications, offer deterministic guarantees using reservation-based medium access. However, it is becoming more challenging for these technologies to guarantee sufficiently predictable behavior, as the number of consumer electronics devices equipped with wireless communication technologies operating in the 2.4 GHz ISM band shared by IEEE 802.15.4 is increasing day by day.
Meanwhile, developments in WiFi technology have opened the opportunity to use WiFi for industrial applications. Compared to technologies based on IEEE 802.15.4, WiFi offers significantly higher transmission rates, and off-the-shelf commodity WiFi hardware is available at low cost. However, when using a contention-based technology such as WiFi for industrial applications, additional measures are required to guarantee the specified statistical reliability.
This thesis lays the foundations for developing a multi-hop wireless control network using off-the-shelf IEEE 802.11 (WiFi) hardware operating in contention mode that can satisfy the specified reliability requirements of the applications. In a multi-hop wireless network, communication reliability between nodes depends on the routes determined by the routing protocol and on how these routes are managed. We introduce a novel Quality-of-Service (QoS) routing protocol for contention-based wireless technologies such as WiFi that prioritizes reliability as the QoS requirement for route selection. The proposed routing protocol relies on different aspects of the network to determine and manage the routes; for instance, it requires algorithms and protocols to monitor and measure link quality, available bandwidth, or medium overload. Further, the determined routes require certain statistical link properties for their successful operation. In this thesis, we develop and evaluate different protocols, algorithms, and metrics to monitor and measure these aspects of the network.
This dissertation describes the implementation, validation, and troubleshooting of "Digital Twins" in assembly processes of thin structures such as parts from the automotive and aerospace industries. As requirements in terms of cost, weight, and human (pedestrian) safety are increasing for modern vehicles, thinner materials are used for exterior components. As a consequence, components become softer and less stable, which is challenging for the assembly processes and impacts the resulting quality. The most critical quality measures are gap and flushness, as these affect aesthetics, wind noise, and fuel consumption of the final vehicle. To compensate for geometrical deviations, parts have adjustable mechanical interfaces which are used to tune in gaps and flushness for each individual assembly. For the components being assembled, individual process parameters depending on the geometry of the actual physical part must be defined. This is a challenging task that cannot be solved in a straightforward manner. However, assembly quality can be predicted by setting up individual Finite Element Method (FEM) simulation models for each part being assembled. These simulation models are called Digital Twins (DTs), as they are enriched with measured properties of the actual physical part. By that, precise predictions can be made and optimal assembly parameters for automated processes can be derived. The demonstration use case in this dissertation is the assembly process of exterior car components made from sheet metal. For this kind of process, the geometrical deviations of individual components are crucial and have to be considered by the DT. To capture geometrical deviations, 3D scanning is employed, which provides a high-resolution point cloud representation of the actual physical part. This point cloud is processed further to obtain the DT that preserves the measured geometry. This dissertation tackles the following challenges: (a) setting up DTs at different levels of detail, (b) correctly post-processing 3D-scanned data to remove systematic measurement errors, (c) automatically morphing meshes to derive simulation models from measured point clouds, and (d) troubleshooting DTs with human-in-the-loop approaches. For all approaches, validations are provided that underline their applicability and benefits. All methods and results are discussed from a high-level perspective, and connections as well as the interplay between methods are elaborated. Each method either improves or extends existing approaches or provides benefits, e.g. higher precision, compared to existing solutions.