Kaiserslautern - Fachbereich Informatik
Document Type
- Doctoral Thesis (235)
Highly Automated Driving (HAD) vehicles are complex, safety-critical systems. They are deployed in an open context, i.e., an intricate environment that undergoes continual change. The complexity of these systems, together with insufficiencies in sensing and understanding the open context, may result in unsafe and uncertain behaviour. The safety-critical nature of HAD vehicles requires modelling the root causes of unsafe behaviour and their mitigation in order to argue a sufficient reduction of residual risk.
Standardization activities such as ISO 21448 provide guidelines on the Safety Of The Intended Functionality (SOTIF) and focus on the analysis of performance limitations under the influence of triggering conditions that can lead to hazardous behaviour. SOTIF references traditional safety analysis methods, e.g., Failure Mode and Effect Analysis (FMEA) and Fault Tree Analysis (FTA). These methods rest on certain assumptions, e.g., single-point failure in FMEA and independence of basic events in FTA. Moreover, these analyses are generally based on expert knowledge, i.e., data-based models or hybrid approaches (expert and data) are seldom practised. The resulting safety model is fixed, i.e., it is generally seen as a one-time artefact. An open-context environment may contain triggering conditions that are not evident to the expert. The open context also evolves over time, and new phenomena may emerge.
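To make the independence assumption in FTA concrete, the following minimal Python sketch evaluates a tiny fault tree. It is an illustration only, not the analysis from the thesis: the event names and probabilities are hypothetical, and the gate formulas hold only if the basic events are statistically independent, which is exactly the assumption questioned above.

```python
def and_gate(probs):
    """Probability that all input events occur, assuming independence."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def or_gate(probs):
    """Probability that at least one input event occurs, assuming independence."""
    none_occur = 1.0
    for p in probs:
        none_occur *= (1.0 - p)
    return 1.0 - none_occur


# Hypothetical basic-event probabilities (illustrative values only).
p_sensor_blinded = 0.01
p_object_missed = 0.02
p_fallback_fails = 0.1

# Perception fails if either basic event occurs (OR gate).
p_perception_failure = or_gate([p_sensor_blinded, p_object_missed])

# The top event needs a perception failure AND a failed fallback (AND gate).
p_top_event = and_gate([p_perception_failure, p_fallback_fails])

print(p_perception_failure, p_top_event)
```

If the two perception events share a common cause, such as heavy rain, the OR-gate result no longer holds, which is one reason a fixed, expert-only fault tree can misjudge risk in an open context.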
This thesis explores the applicability of traditional safety analysis techniques for providing safety models of HAD vehicles operating in the open context, in light of the modelling assumptions these techniques make. Moreover, incorporating uncertainties into safety analysis models is explored. An explicit distinction between the inherent uncertainty of a probabilistic event (aleatory) and uncertainty due to lack of knowledge (epistemic) is made to formalize models for SOTIF analysis. A further distinction is made for conditions of complete ignorance, termed ontological uncertainty. The distinction is important because, for HAD vehicles operating in the open context, the ontological uncertainty can never be completely disregarded.
This thesis proposes a novel SOTIF framework to model, estimate and discover triggering conditions relevant to performance limitations. The framework provides the ability to model uncertainties while also providing a hybrid approach, i.e., supporting the inclusion of expert knowledge as well as data-driven engineering processes. Two representative algorithms are provided to support the framework: Bayesian Networks (BN) and p-value hypothesis testing. The framework is implemented in a real-world case study in which LiDAR-based perception systems are used for vehicle detection.
This doctoral dissertation comprises nine published articles covering different methods for ‘Fast, Robust Rigid and Non-Rigid Registration for Globally Consistent 3D Scene and Shape Reconstruction’. The contributing articles are discussed in three parts. The first part of the thesis, chapter 2, explains three novel method classes of rigid point set registration: the Gravitational Approach (GA), the Fast Gravitational Approach (FGA), and RPSRNet. GA was introduced as the first physics-based rigid point set registration method. It includes an elegant modeling of rigid-body dynamics using Newtonian mechanics. The method opened new avenues for pattern matching tasks other than point set registration. Next, the FGA method, published four years after GA, is presented as an extension that reduces the algorithmic complexity of GA from O(MN) to O(M log N) using a Barnes-Hut tree representation of the point cloud. It also eliminates the heuristic optimization parameter settings required by GA and achieves state-of-the-art alignment accuracy on LiDAR odometry. Finally, RPSRNet presents a deep learning version of FGA, with custom convolution layers for hierarchical point feature embedding. RPSRNet is robust and the fastest among state-of-the-art methods for LiDAR data registration. The second part of the thesis, chapter 3, introduces NRGA as the first physics-based non-rigid point set registration method, which is computationally slow but robust against noisy and partial inputs. NRGA preserves structural consistency because it coherently regularizes the motion of deformable vertices. For articulated hand shape reconstruction, a tailored version of NRGA, Articulated-NRGA, is effective in refining the final hand shape. Collision and penetration avoidance between source and target surfaces is handled by constrained optimization in NRGA. This setting has improved the reconstruction of hand-object interaction. The next contribution, the FoldMatch method, remodels shape deformation by introducing a wrinkle vector field (WVF) to capture complex clothing and garment details while fitting body models onto 3D scans. Quantitative evaluation of FoldMatch and NRGA shows their effectiveness in geometrically consistent surface modeling and reconstruction tasks. Finally, the third part of the thesis explains globally consistent outdoor scene reconstruction, odometry estimation, and uncertainty-guided pose-graph optimization in a novel LiDAR-based localization and map building method called Deep Evidential LiDAR Odometry (DELO). This is the first odometry method to use predictive uncertainty modeling for a sensor pose prediction network.
From industrial fault detection to medical image analysis or financial fraud prevention: anomaly detection, the task of identifying data points that deviate significantly from the majority of the data, is critical in industrial and technological applications. For efficient and effective anomaly detection, a rich set of semantic features must be extracted automatically from complex data. For example, many recent advances in image anomaly detection are based on self-supervised learning, which learns rich features from a large amount of unlabeled complex image data by exploiting data augmentations. For image data, predefined transformations such as rotations are used to generate varying views of the data. Unfortunately, for data other than images, such as time series, tabular data, graphs, or text, it is unclear which transformations are suitable. This becomes an obstacle to successful self-supervised anomaly detection on other data types.
This thesis proposes Neural Transformation Learning, a self-supervised anomaly detection method that is applicable to general data types. In contrast to previous methods relying on hand-crafted transformations, neural transformation learning learns the transformations from data and uses them for detection. The key ingredient is a novel objective that encourages learning diverse transformations while preserving the relevant semantic content of the data. We show theoretically and empirically that it is better suited to transformation learning than existing objectives.
We also introduce extensions of neural transformation learning for anomaly detection within time series and for graph-level anomaly detection. The extensions combine transformation learning with other learning paradigms to incorporate vital prior knowledge about time series and graph data. Moreover, we propose a general training strategy for deep anomaly detection with contaminated data. The idea is to alternate between inferring the unlabeled anomalies and using them to update the parameters. In setups where expert feedback is available, we present a diverse querying strategy based on the seeding algorithm of K-means++ for active anomaly detection.
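As background for the querying strategy, the sketch below shows the standard K-means++ seeding step (D² sampling) in plain Python. It is a generic illustration, not the thesis's implementation: the function names, the toy `embeddings`, and the idea of treating the selected indices as samples to send to an expert are assumptions made here for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seed(points, k, rng):
    """Select up to k indices with K-means++ seeding (D^2 sampling)."""
    chosen = [rng.randrange(len(points))]
    # Squared distance from every point to its nearest chosen point.
    d2 = [squared_distance(p, points[chosen[0]]) for p in points]
    while len(chosen) < k:
        if sum(d2) == 0.0:
            break  # every remaining point coincides with a chosen one
        # Sample the next index with probability proportional to d2.
        new_index = rng.choices(range(len(points)), weights=d2, k=1)[0]
        chosen.append(new_index)
        d2 = [min(old, squared_distance(p, points[new_index]))
              for old, p in zip(d2, points)]
    return chosen


# Hypothetical 2D embeddings of unlabeled samples.
embeddings = [(0.0, 0.0), (0.0, 0.1), (10.0, 10.0), (10.0, 10.1), (-10.0, 5.0)]
query_indices = kmeanspp_seed(embeddings, k=3, rng=random.Random(0))
print(query_indices)
```

Because each new index is drawn with probability proportional to its squared distance from the points already chosen, the selected set tends to spread across the data rather than cluster in one region, which is the sense in which such a query set is diverse.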
Our extensive experiments and analysis demonstrate that neural transformation learning achieves remarkable and robust anomaly detection performance on various data types. Finally, we outline specific paths for future research.
Semi-structured data is a common data format in many domains.
It is characterized by a hierarchical structure and a schema that is not fixed.
Efficient and scalable processing of this data is therefore challenging, as many existing indexing and processing techniques are not well-suited for this data format.
This dissertation presents a novel approach to processing large JSON datasets.
We describe a new data processor, JODA, that is designed to process semi-structured data by using all available computing resources and state-of-the-art techniques.
Using a custom query language and a vertically-scaling pipeline query execution engine, JODA can process large datasets with high throughput.
We optimize JODA by using a novel optimization for iterative query workloads called delta trees, which succinctly represent the changes between two documents.
This allows us to process iterative and exploratory queries efficiently.
We improve the filtering performance of JODA by implementing a holistic adaptive indexing approach that creates and improves structural and content indices on the fly, depending on the query load.
No prior knowledge about the data is required, and the indices are automatically improved over time.
JODA is also modularized and can be extended with new user-defined predicates, functions, indices, import, and export functionalities.
These modules can be written in an external programming language and integrated into the query execution pipeline at runtime.
To evaluate this system against competitors, we introduce a benchmark generator, coined BETZE, which aims to simulate data scientists exploring unknown JSON datasets.
The generator can be tweaked to generate query workload with different characteristics, or predefined presets can be used to quickly generate a benchmark.
We see that JODA outperforms competitors in most tasks over a wide range of datasets and use-cases.
Human pose based on 3D joint angles is needed for applications such as activity recognition, musculoskeletal health, sports biomechanics and ergonomics. Microelectromechanical systems (MEMS) based magnetic-inertial measurement units (MIMUs) can estimate 3D orientation. Owing to their small size, MIMUs can be attached to the body as wearable sensors for obtaining the full 3D human pose; such a system is termed inertial motion capture (i-Mocap). However, MIMUs suffer from sensor errors and disturbances, so the orientation estimated from individual MIMUs can be erroneous. Accurate sensor calibration is essential, and the alignment of these sensors to body segments must also be precisely known, which is called sensor-to-segment calibration. Sensor fusion is employed to address the disturbances and noise in MIMUs. Many state-of-the-art inertial motion capture approaches ignore the magnetometer and use only IMUs to reduce the error arising from an inhomogeneous magnetic field. These algorithms rely on kinematic constraints and assumptions regarding joints and are based on IMUs located on adjacent body segments. Full body coverage requires 13-17 such units and can be quite obtrusive. Setting up and calibrating so many wearable sensors also takes time.
This thesis focuses on 3D human pose estimation from a reduced number of MIMUs and deals with this problem systematically. First, we propose an accurate simultaneous calibration of multiple MIMUs, which also learns the uncertainty of individual sensors. We then describe a novel sensor fusion algorithm for robust orientation estimation from a MIMU and for updating the sensor calibration online. The residual errors in both sensor calibration and fusion can result in drift error in the joint angles. Therefore, we present an anatomical (sensor-to-segment) calibration in which an orientation offset correction term is updated and used for online correction of residual drift in individual joint angles. Subsequently, we demonstrate that 3D human joint angle constraints can be learned using a data-driven approach in a high-dimensional latent space. Owing to temporal and joint angle constraints, it is possible to use only a reduced set of sensors (as opposed to one sensor per segment) and still obtain the 3D human pose. However, learning spatial and temporal priors from data is often limited by the finite set of movement patterns in most datasets. This introduces uncertainty when estimating 3D human pose from sparse MIMU sensors. We propose a magnetometer-robust orientation parameterization and a data-driven deep learning framework to predict 3D human pose with associated uncertainty from sparse MIMUs. The model is evaluated on real MIMU data, and we show that the uncertainty predicted by the trained model is well correlated with actual error and ambiguity.
Though Computer Aided Design (CAD) and Simulation software are mature, well established, and in wide professional use, modern design and prototyping pipelines are challenging the limits of these tools. Advances in 3D printing have brought manufacturing capability to the general public. Moreover, advancements in Machine Learning and sensor technology are enabling enthusiasts and small companies to develop their own autonomous vehicles and machines. This means that many more users are designing (or customizing) 3D objects in CAD, and many are testing machine autonomy in Simulation. Though Graphical User Interfaces (GUIs) are the de-facto standard for these tools, we find that these interfaces are not robust and flexible. For example, designs made using a GUI often break when customized, and setting up large simulations in a GUI can be quite tedious. Though programmatic interfaces do not suffer from these limitations, they are generally quite difficult to use, and often do not provide appropriate abstractions and language constructs.
In this thesis, we present our work on bridging the ease of use of GUIs with the robustness and flexibility of programming. For CAD, we propose an interactive framework that automatically synthesizes robust programs from GUI-based design operations. Additionally, we apply program analysis to ensure customizations do not lead to invalid objects. Finally, for simulation, we propose a novel programmatic framework that simplifies the building of complex test environments, and a test generation mechanism that guarantees good coverage over test parameters. Our contributions help bring some of the advantages of programming to traditionally GUI-dominant workflows. Through novel programmatic interfaces, and without sacrificing ease of use, we show that the design and customization of 3D objects can be made more robust, and that the creation of parameterized simulations can be simplified.
Faces deliver invaluable information about people. Machine-based perception can be of great benefit in extracting that underlying information from face images if the problem is properly modeled. Classical image processing algorithms may fail to handle the diverse data available today due to several challenges related to varying capturing locations and conditions. Advanced machine learning methods and algorithms are now highly beneficial due to the rapid development of powerful hardware, enabling feasible advanced solutions based on data learning and summarization into powerful models. In this thesis, novel solutions are provided to the problems of head orientation estimation and gender prediction. Initially, classical machine learning algorithms were used to address head orientation estimation but were limited by their inability to handle large datasets and by poor generalization. To overcome these challenges, a new highly accurate head pose dataset was acquired. Deep neural networks with novel architectures were then trained on the acquired data. The information about head pose is thus represented in the network weights, allowing the head orientation angles to be predicted for a new, unseen face. The acquired dataset, named AutoPOSE, opens the door for further studies in the field of computer vision and especially face analysis. The problem of gender prediction has also been explored, but unlike humans, who can easily identify gender from a face, computers face difficulties due to facial similarities. Therefore, hand-crafted features are not effective for generalization. To address this, a new deep learning method was developed and evaluated on multiple public datasets, with identified challenges in both still images and videos addressed. Finally, the effect of facial appearance changes due to head orientation variation on gender prediction accuracy has been investigated. A novel orientation-guided feature map recalibration method is presented that significantly increased the accuracy of gender prediction.
In conclusion, two problems have been addressed in this thesis, both independently and jointly. Existing methods have been enhanced with intelligent pre-processing methods, and new approaches have been introduced to tackle existing challenges that arise from pose, illumination, and occlusion variations. The proposed methods have been extensively evaluated, showing that head orientation and gender can be predicted with high accuracy using machine learning-based methods. The evaluations also showed that the use of head orientation information consistently improved the gender prediction accuracy. Scientific contributions have been presented, and the newly acquired, highly accurate dataset motivates the research community to push the state of the art forward.
Undocumented enterprise data can easily pile up in companies in the form of datasets and personal information. In the absence of a data management strategy, such data becomes rather messy and may not be fit for its intended use. Since there is often no documentation available, only a limited number of domain experts are aware of its contents. Therefore, it becomes increasingly difficult for companies to use such data to its full potential. To provide a solution, this PhD thesis investigates the construction of enterprise and personal knowledge graphs by semantically enriching messy data with meaning using semantic technologies. Since real-world entities and their interrelations are organized in a graph, knowledge graphs serve as a semantic bridge between domain conceptualization and raw data. Spreadsheets are a prominent example of such enterprise data, since they are widely used by knowledge workers in the industrial sector. Two distinct approaches are investigated to construct knowledge graphs from them: a global extraction & annotation method and a local mapping technique. The latter is further complemented with a predictor of mapping rules on messy data. Different human-in-the-loop strategies are considered to include experts depending on their user group. Since non-technical users usually lack understanding of semantic technologies, they need appropriate tools to be able to give feedback. For developers, approaches are proposed to close the technology gap between industry and Semantic Web related concepts. Semantic Web practitioners participate with ontology modeling and linked data applications. Enterprise and personal data is typically confidential, which is why it cannot be shared with a research community to discuss its challenges. However, for evaluation and reproducibility reasons, publicly available datasets are mandatory. The thesis proposes ways to generate synthetic datasets with the goal of being as authentic as possible. Besides that, a crawler of personal data on desktops is implemented for internal evaluations. There are further contributions related to this thesis in diverse domains. One concerns the motivation to support users in their daily work using personal knowledge assistants. Others concern the agricultural field and the data science domain, which also benefit from knowledge graph approaches. In conclusion, this PhD thesis contributes to the construction of knowledge graphs from especially messy enterprise data, while users from different groups take part in this process in various ways.
This thesis focuses on novel methods to establish the utility of wearable devices along with machine learning and pattern recognition methods for formal education and address the open research questions posed by existing methods. Firstly, state-of-the-art methods are proposed to analyse the cognitive activities in the learning process, i.e., reading, writing, and their correlation. Furthermore, this thesis presents real-time applications in wearable space as an experimental tool in Physics education, and an air-writing system.
There are two critical components in analysing reading behaviour, i.e., WHERE a person looks (gaze analysis) and WHAT a person looks at (content analysis). This thesis proposes novel methods to classify the reading content to address the WHAT component. The proposed methods are based on a hybrid approach, which fuses traditional computer vision methods with deep neural networks. These methods, when evaluated on publicly available datasets, yield state-of-the-art results in defining the structure of document images. Moreover, extensive efforts were made to refine and correct the ICDAR2017-POD dataset, along with creating a completely new FFD dataset.
Traditionally, handwriting research focuses on character and number recognition without looking into the type of writing, i.e. text, math, and drawing. This thesis reports multiple contributions for on-line handwriting classification. First, it presents a public dataset for on-line handwriting classification OnTabWriter, collected using iPen and an iPad. In addition, a new feature set is introduced for on-line handwriting classification to establish the benchmark on the proposed dataset to classify handwriting as plain text, mathematical expression, and plot/graph. An ablation study is made to evaluate the performance of the proposed feature set in comparison to existing feature sets. Lastly, this thesis evaluates the importance of context for on-line handwriting classification.
Analysing reading and writing activities individually is not enough to provide insights into a student's expertise unless their correlations are analysed as well. This thesis presents a study in which reading data from wearable eye-trackers and writing data from a sensor pen are analysed together to relate the expertise of users in Physics education to their actual knowledge. Initial results show a strong correlation between an individual's expertise and their understanding of the subject.
Augmented reality and virtual applications can play a vital role in making classroom environments more interactive and engaging for both teachers and learners. To validate this hypothesis, different applications are developed and evaluated. First, smart glasses are used as an experimental tool in Physics education to help learners perform experiments by providing assistance and feedback on a head-mounted display for understanding acoustics concepts. Second, FAirWrite, a real-time system for air-writing with the finger on an imaginary canvas using a single IMU, is presented. The FAirWrite system is further equipped with DL methods to classify the air-written characters.
Due to its performance, the field of deep learning has gained a lot of attention, with neural networks succeeding in areas like \( \textit{Computer Vision} \) (CV), \( \textit{Natural Language Processing} \) (NLP), and \( \textit{Reinforcement Learning} \) (RL). However, high accuracy comes at a computational cost, as larger networks require longer training time and no longer fit onto a single GPU. To reduce training costs, researchers are looking into the dynamics of different optimizers in order to find ways to make training more efficient. Resource requirements can be limited by reducing model size during training or by designing more efficient models that improve accuracy without increasing network size.
This thesis combines eigenvalue computation and high-dimensional loss surface visualization to study different optimizers and deep neural network models. Eigenvectors of different eigenvalues are computed, and the loss landscape and optimizer trajectory are projected onto the plane spanned by those eigenvectors. A new parallelization method for the stochastic Lanczos method is introduced, resulting in faster computation and thus enabling high-resolution videos of the trajectory and second-order information during neural network training. Additionally, the thesis presents the loss landscape between two minima along with the eigenvalue density spectrum at intermediate points for the first time.
Secondly, this thesis presents a regularization method for \( \textit{Generative Adversarial Networks} \) (GANs) that uses second-order information. The gradient during training is modified by subtracting the eigenvector direction of the biggest eigenvalue, preventing the network from falling into the steepest minima and avoiding mode collapse. The thesis also shows the full eigenvalue density spectra of GANs during training.
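The gradient modification described above can be illustrated on a toy problem. The following sketch is not the thesis's GAN regularizer: it uses a small, fixed symmetric matrix `H` as a stand-in for the Hessian (an assumption; in real training one would use Hessian-vector products, for instance within a Lanczos method), finds the eigenvector of the dominant eigenvalue by power iteration, and removes that direction from a gradient `g`. All names and values are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]


def normalize(vector):
    norm = math.sqrt(dot(vector, vector))
    return [x / norm for x in vector]


def top_eigenvector(matrix, iterations=200):
    """Power iteration: converges to the eigenvector of the dominant
    (largest-magnitude) eigenvalue of a symmetric matrix."""
    v = normalize([1.0] * len(matrix))
    for _ in range(iterations):
        v = normalize(matvec(matrix, v))
    return v


def remove_direction(gradient, direction):
    """Subtract the component of the gradient along a unit direction."""
    coefficient = dot(gradient, direction)
    return [g - coefficient * d for g, d in zip(gradient, direction)]


# Toy stand-in for a Hessian (symmetric, positive eigenvalues) and a gradient.
H = [[4.0, 1.0], [1.0, 3.0]]
g = [1.0, 2.0]

v = top_eigenvector(H)
top_eigenvalue = dot(v, matvec(H, v))  # Rayleigh quotient
g_modified = remove_direction(g, v)

print(top_eigenvalue, g_modified)
```

After the projection, `g_modified` is orthogonal to `v`, so a step along it does not move the parameters in the direction of highest curvature.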
Thirdly, this thesis introduces ProxSGD, a proximal algorithm for neural network training that guarantees convergence to a stationary point and unifies multiple popular optimizers. Proximal gradients are used to find a closed-form solution to the problem of training neural networks with smooth and non-smooth regularizations, resulting in better sparsity and more efficient optimization. Experiments show that ProxSGD can find sparser networks while reaching the same accuracy as popular optimizers.
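For readers unfamiliar with proximal methods, the sketch below shows a textbook proximal gradient step for an \( \ell_1 \) regularizer, whose closed-form proximal operator is soft-thresholding. This is not ProxSGD itself, only the basic building block such methods rely on; the function names, learning rate, and regularization strength are assumptions chosen for illustration.

```python
def soft_threshold(value, threshold):
    """Closed-form proximal operator of threshold * |value|."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def proximal_gradient_step(weights, gradients, learning_rate, l1_strength):
    """One gradient step on the smooth loss, followed by the proximal
    operator of the non-smooth l1 penalty."""
    return [
        soft_threshold(w - learning_rate * g, learning_rate * l1_strength)
        for w, g in zip(weights, gradients)
    ]


# Hypothetical weights and gradients of the smooth part of the loss.
weights = [0.5, -0.02, 0.3]
gradients = [0.1, 0.1, -0.2]

updated = proximal_gradient_step(weights, gradients,
                                 learning_rate=0.1, l1_strength=0.5)
print(updated)
```

The small weight is set exactly to zero rather than merely shrunk, which is how proximal updates produce sparse networks, whereas plain gradient descent on an \( \ell_1 \) penalty typically leaves small non-zero values.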
Lastly, this thesis unifies sparsity and \( \textit{neural architecture search} \) (NAS) through the framework of group sparsity. Group sparsity is achieved through \( \ell_{2,1} \)-regularization during training, allowing for filter and operation pruning to reduce model size with minimal sacrifice in accuracy. By grouping multiple operations together, group sparsity can be used for NAS as well. This approach is shown to be more robust while still achieving competitive accuracies compared to state-of-the-art methods.
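The effect of \( \ell_{2,1} \)-regularization on whole groups can be sketched as follows. The example computes the \( \ell_{2,1} \) norm (the sum of the Euclidean norms of the groups) and applies the standard group soft-thresholding operator. It is a generic illustration under assumed values: each inner list stands for a hypothetical group such as the weights of one filter or one candidate operation, and the threshold is arbitrary.

```python
import math


def l21_norm(groups):
    """Sum of the Euclidean norms of all groups."""
    return sum(math.sqrt(sum(w * w for w in group)) for group in groups)


def group_soft_threshold(group, threshold):
    """Proximal operator of threshold * ||group||_2: shrink the whole
    group towards zero, or remove it entirely if its norm is small."""
    norm = math.sqrt(sum(w * w for w in group))
    if norm <= threshold:
        return [0.0] * len(group)
    scale = 1.0 - threshold / norm
    return [scale * w for w in group]


# Hypothetical groups: one strong filter and one weak filter.
groups = [[3.0, 4.0], [0.1, -0.1]]

penalty = l21_norm(groups)
pruned = [group_soft_threshold(group, 0.5) for group in groups]
print(penalty, pruned)
```

The weak group is zeroed as a unit, so the corresponding filter or operation can be removed from the network; applied to groups of candidate operations, the same mechanism selects an architecture.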