Kaiserslautern - Fachbereich Informatik
Refine
Document Type
- Doctoral Thesis (3)
Language
- English (3)
Has Fulltext
- yes (3)
Keywords
- Machine Learning (3) (remove)
Faculty / Organisational entity
From industrial fault detection to medical image analysis or financial fraud prevention: Anomaly detection—the task of identifying data points that show significant deviations from the majority of data—is critical in industrial and technological applications. For efficient and effective anomaly detection, a rich set of semantic features are required to be automatically extracted from the complex data. For example, many recent advances in image anomaly detection are based on self-supervised learning, which learns rich features from a large amount of unlabeled complex image data by exploiting data augmentations. For image data, predefined transformations such as rotations are used to generate varying views of the data. Unfortunately, for data other than images, such as time series, tabular data, graphs, or text, it is unclear what are suitable transformations. This becomes an obstacle to successful self-supervised anomaly detection on other data types.
This thesis proposes Neural Transformation Learning, a self-supervised anomaly detection method that is applicable to general data types. In contrast to previous methods relying on hand-crafted transformations, neural transformation learning learns the transformations from data and uses them for detection. The key ingredient is a novel objective that encourages learning diverse transformations while preserving the relevant semantic content of the data. We prove theoretically and empirically that it is more suited than existing objectives for transformation learning.
We also introduce the extensions of neural transformation learning for anomaly detection within time series and graph-level anomaly detection. The extensions combine transformation learning and other learning paradigms to incorporate vital prior knowledge about time series and graph data. Moreover, we propose a general training strategy for deep anomaly detection with contaminated data. The idea is to infer the unlabeled anomalies and utilize them for updating parameters alternatively. In setups where expert feedback is available, we present a diverse querying strategy based on the seeding algorithm of K-means++ for active anomaly detection.
Our extensive experiments and analysis demonstrate that neural transformation learning achieves remarkable and robust anomaly detection performance on various data types. Finally, we outline specific paths for future research.
In recent years, enormous progress has been made in the field of Artificial Intelligence (AI). Especially the introduction of Deep Learning and end-to-end learning, the availability of large datasets and the necessary computational power in form of specialised hardware allowed researchers to build systems with previously unseen performance in areas such as computer vision, machine translation and machine gaming. In parallel, the Semantic Web and its Linked Data movement have published many interlinked RDF datasets, forming the world’s largest, decentralised and publicly available knowledge base.
Despite these scientific successes, all current systems are still narrow AI systems. Each of them is specialised to a specific task and cannot easily be adapted to all other human intelligence tasks, as would be necessary for Artificial General Intelligence (AGI). Furthermore, most of the currently developed systems are not able to learn by making use of freely available knowledge such as provided by the Semantic Web. Autonomous incorporation of new knowledge is however one of the pre-conditions for human-like problem solving.
This work provides a small step towards teaching machines such human-like reasoning on freely available knowledge from the Semantic Web. We investigate how human associations, one of the building blocks of our thinking, can be simulated with Linked Data. The two main results of these investigations are a ground truth dataset of semantic associations and a machine learning algorithm that is able to identify patterns for them in huge knowledge bases.
The ground truth dataset of semantic associations consists of DBpedia entities that are known to be strongly associated by humans. The dataset is published as RDF and can be used for future research.
The developed machine learning algorithm is an evolutionary algorithm that can learn SPARQL queries from a given SPARQL endpoint based on a given list of exemplary source-target entity pairs. The algorithm operates in an end-to-end learning fashion, extracting features in form of graph patterns without the need for human intervention. The learned patterns form a feature space adapted to the given list of examples and can be used to predict target candidates from the SPARQL endpoint for new source nodes. On our semantic association ground truth dataset, our evolutionary graph pattern learner reaches a Recall@10 of > 63 % and an MRR (& MAP) > 43 %, outperforming all baselines. With an achieved Recall@1 of > 34% it even reaches average human top response prediction performance. We also demonstrate how the graph pattern learner can be applied to other interesting areas without modification.
Recommender systems recommend items (e.g., movies, products, books) to users. In this thesis, we proposed two comprehensive and cluster-induced recommendation-based methods: Orthogonal Inductive Matrix Completion (OMIC) and Burst-induced Multi-armed Bandit (BMAB). Given the presence of side information, the first method is categorized as context-aware. OMIC is the first matrix completion method to approach the problem of incorporating biases, side information terms and a pure low-rank term into a single flexible framework with a well-principled optimization procedure. The second method, BMAB, is context-free. That is, it does not require any side data about users or items. Unlike previous context-free multi-armed bandit approaches, our method considers the temporal dynamics of human communication on the web and treats the problem in a continuous time setting. We built our models' assumptions under solid theoretical foundations. For OMIC, we provided theoretical guarantees in the form of generalization bounds by considering the distribution-free case: no assumptions about the sampling distribution are made. Additionally, we conducted a theoretical analysis of community side information when the sampling distribution is known and an adjusted nuclear norm regularization is applied. We showed that our method requires just a few entries to accurately recover the ratings matrix if the structure of the ground truth closely matches the cluster side information. For BMAB, we provided regret guarantees under mild conditions that demonstrate how the system's stability affects the expected reward. Furthermore, we conducted extensive experiments to validate our proposed methodologies. In a controlled environment, we implemented synthetic data generation techniques capable of replicating the domains for which OMIC and BMAB were designed. As a result, we were able to analyze our algorithms' performance across a broad spectrum of ground truth regimes. Finally, we replicated a real-world scenario by utilizing well-established recommender datasets. After comparing our approaches to several baselines, we observe that they achieved state-of-the-art results in terms of accuracy. Apart from being highly accurate, these methods improve interpretability by describing and quantifying features of the datasets they characterize.