Recommender systems recommend items (e.g., movies, products, books) to users. In this thesis, we propose two comprehensive, cluster-induced recommendation methods: Orthogonal Inductive Matrix Completion (OMIC) and the Burst-induced Multi-armed Bandit (BMAB). The first method is context-aware, as it assumes the presence of side information: OMIC is the first matrix completion method to incorporate biases, side-information terms, and a pure low-rank term into a single flexible framework with a well-principled optimization procedure. The second method, BMAB, is context-free; that is, it does not require any side data about users or items. Unlike previous context-free multi-armed bandit approaches, our method accounts for the temporal dynamics of human communication on the web and treats the problem in a continuous-time setting. We built our models' assumptions on solid theoretical foundations. For OMIC, we provide theoretical guarantees in the form of generalization bounds for the distribution-free case, in which no assumptions about the sampling distribution are made. Additionally, we conduct a theoretical analysis of community side information when the sampling distribution is known and an adjusted nuclear-norm regularization is applied, and show that our method requires only a few entries to accurately recover the ratings matrix when the structure of the ground truth closely matches the cluster side information. For BMAB, we provide regret guarantees under mild conditions that show how the system's stability affects the expected reward. Furthermore, we conducted extensive experiments to validate the proposed methodologies. In a controlled environment, we implemented synthetic data generation techniques that replicate the domains for which OMIC and BMAB were designed, which allowed us to analyze the algorithms' performance across a broad spectrum of ground-truth regimes. Finally, we replicated a real-world scenario using well-established recommender datasets. Compared with several baselines, both approaches achieve state-of-the-art accuracy; beyond being highly accurate, they also improve interpretability by describing and quantifying features of the datasets they characterize.
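The abstract describes the OMIC decomposition only at a high level. As an illustration, the following is a minimal, hypothetical sketch of a model of that shape (global bias plus side-information term plus pure low-rank residual), fitted with a simple soft-impute style loop rather than the thesis's own optimization procedure; all names and hyperparameters are assumptions.

# Hypothetical sketch: bias + side-information term + low-rank residual,
# fitted by imputation and singular-value soft-thresholding (not the
# authors' optimizer).
import numpy as np

def svd_soft_threshold(M, tau):
    """Prox of the nuclear norm: shrink singular values by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def fit(R, mask, X, Y, lam=1.0, n_iter=100):
    """R: observed ratings (n x m, zeros where missing), mask: observation
    indicator, X: user side features (n x dx), Y: item side features (m x dy)."""
    mu, A = 0.0, np.zeros((X.shape[1], Y.shape[1]))
    Z = np.zeros_like(R)                      # pure low-rank term
    Xp, Yp = np.linalg.pinv(X), np.linalg.pinv(Y)
    for _ in range(n_iter):
        pred = mu + X @ A @ Y.T + Z
        full = mask * R + (1 - mask) * pred   # impute unobserved entries
        mu = full.mean()                      # global bias
        A = Xp @ (full - mu - Z) @ Yp.T       # side-information coefficients
        Z = svd_soft_threshold(full - mu - X @ A @ Y.T, lam)
    return mu, A, Z

# Usage: mu + X @ A @ Y.T + Z then approximates the full ratings matrix.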
Thermodynamic Modeling of Poorly Specified Mixtures using NMR Fingerprinting and Machine Learning
(2023)
Poorly specified mixtures, i.e., mixtures of unknown or incompletely known composition, are common in many fields of process engineering. Dealing with such mixtures in process design is challenging as their properties cannot be described with classical thermodynamic models, which require a full specification. As a workaround, pseudo-components can be introduced, which are generally defined using ad-hoc assumptions. In the present thesis, a new framework is developed for the thermodynamic modeling of such mixtures using nuclear magnetic resonance (NMR) experiments in combination with machine-learning (ML) methods. In the framework, a characterization of a mixture in terms of structural groups (“NMR fingerprint”) is obtained by using the ML concept of support vector classification. Based on the group-specific fingerprint, quantum-chemical descriptors of the unknown part of the mixture as well as activity coefficients can already be predicted. Furthermore, a meaningful definition of pseudo-components is achieved by clustering the structural groups into pseudo-components with the K-medians algorithm based on their self-diffusion coefficients measured by pulsed-field gradient (PFG) NMR. It is demonstrated that the characterization of poorly specified mixtures in terms of pseudo-components can be combined with several thermodynamic group-contribution methods. The resulting thermodynamic models were applied to various poorly specified mixtures and used for solving two typical tasks from conceptual fluid separation process design: the solvent screening for liquid-liquid extraction processes and the simulation of open evaporation processes. The predictions with the methods developed here show very good agreement with the results obtained for the fully specified mixtures.
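As an illustration of the clustering step described above, the following sketch groups structural groups into pseudo-components by their self-diffusion coefficients with a simple one-dimensional K-medians; the group names and diffusion values are made up for the example and are not data from the thesis.

# Illustrative 1-D K-medians on hypothetical PFG-NMR self-diffusion coefficients.
import numpy as np

def k_medians(values, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([np.median(values[labels == j]) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Self-diffusion coefficients (1e-9 m^2/s) of structural groups: made-up numbers.
groups = ["CH3", "CH2", "OH", "ArCH", "COOH"]
D = np.array([1.10, 1.08, 0.45, 0.80, 0.43])
labels, centers = k_medians(D, k=3)
for g, l in zip(groups, labels):
    print(g, "-> pseudo-component", l)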
In recent years, enormous progress has been made in the field of Artificial Intelligence (AI). In particular, the introduction of Deep Learning and end-to-end learning, the availability of large datasets, and the necessary computational power in the form of specialised hardware allowed researchers to build systems with previously unseen performance in areas such as computer vision, machine translation and machine gaming. In parallel, the Semantic Web and its Linked Data movement have published many interlinked RDF datasets, forming the world’s largest, decentralised and publicly available knowledge base.
Despite these scientific successes, all current systems are still narrow AI systems. Each of them is specialised to a specific task and cannot easily be adapted to other human intelligence tasks, as would be necessary for Artificial General Intelligence (AGI). Furthermore, most currently developed systems are not able to learn by making use of freely available knowledge such as that provided by the Semantic Web. The autonomous incorporation of new knowledge is, however, one of the preconditions for human-like problem solving.
This work provides a small step towards teaching machines such human-like reasoning on freely available knowledge from the Semantic Web. We investigate how human associations, one of the building blocks of our thinking, can be simulated with Linked Data. The two main results of these investigations are a ground truth dataset of semantic associations and a machine learning algorithm that is able to identify patterns for them in huge knowledge bases.
The ground truth dataset of semantic associations consists of DBpedia entities that are known to be strongly associated by humans. The dataset is published as RDF and can be used for future research.
The developed machine learning algorithm is an evolutionary algorithm that can learn SPARQL queries from a given SPARQL endpoint based on a given list of exemplary source-target entity pairs. The algorithm operates in an end-to-end learning fashion, extracting features in the form of graph patterns without the need for human intervention. The learned patterns form a feature space adapted to the given list of examples and can be used to predict target candidates from the SPARQL endpoint for new source nodes. On our semantic association ground truth dataset, our evolutionary graph pattern learner reaches a Recall@10 of > 63 % and an MRR (and MAP) of > 43 %, outperforming all baselines. With a Recall@1 of > 34 %, it even reaches average human top-response prediction performance. We also demonstrate how the graph pattern learner can be applied to other interesting areas without modification.
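The following toy sketch is a heavily simplified stand-in for the evolutionary graph pattern learner described above: individuals are predicate paths rather than full SPARQL graph patterns, the knowledge graph is a handful of in-memory triples rather than a live SPARQL endpoint, and fitness is simply the fraction of example source-target pairs a path connects. All triples and entity names are illustrative.

# Toy evolutionary loop: evolve a predicate path connecting example pairs.
import random

triples = {  # tiny in-memory knowledge graph: (subject, predicate, object)
    ("Berlin", "capitalOf", "Germany"), ("Paris", "capitalOf", "France"),
    ("Germany", "language", "German"), ("France", "language", "French"),
    ("Berlin", "locatedIn", "Germany"), ("Paris", "locatedIn", "France"),
}
predicates = sorted({p for _, p, _ in triples})
examples = [("Berlin", "German"), ("Paris", "French")]  # source-target pairs

def follow(node, pred):
    return {o for s, p, o in triples if s == node and p == pred}

def reaches(path, source, target):
    frontier = {source}
    for pred in path:
        frontier = set().union(*(follow(n, pred) for n in frontier)) if frontier else set()
    return target in frontier

def fitness(path):
    return sum(reaches(path, s, t) for s, t in examples) / len(examples)

def mutate(path):
    path = list(path)
    path[random.randrange(len(path))] = random.choice(predicates)
    return path

population = [[random.choice(predicates) for _ in range(2)] for _ in range(20)]
for _ in range(50):  # keep the best half, refill with mutants of the best half
    population.sort(key=fitness, reverse=True)
    population = population[:10] + [mutate(p) for p in population[:10]]
best = max(population, key=fitness)
print(best, fitness(best))  # e.g. ['capitalOf', 'language'] with fitness 1.0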
From industrial fault detection to medical image analysis or financial fraud prevention: anomaly detection, the task of identifying data points that deviate significantly from the majority of the data, is critical in industrial and technological applications. For efficient and effective anomaly detection, a rich set of semantic features must be extracted automatically from the complex data. For example, many recent advances in image anomaly detection are based on self-supervised learning, which learns rich features from a large amount of unlabeled complex image data by exploiting data augmentations. For image data, predefined transformations such as rotations are used to generate varying views of the data. Unfortunately, for data other than images, such as time series, tabular data, graphs, or text, it is unclear which transformations are suitable. This is an obstacle to successful self-supervised anomaly detection on other data types.
This thesis proposes Neural Transformation Learning, a self-supervised anomaly detection method that is applicable to general data types. In contrast to previous methods relying on hand-crafted transformations, neural transformation learning learns the transformations from the data and uses them for detection. The key ingredient is a novel objective that encourages learning diverse transformations while preserving the relevant semantic content of the data. We show theoretically and empirically that it is better suited for transformation learning than existing objectives.
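As a rough illustration of such an objective (not the exact loss used in the thesis), the following PyTorch sketch pulls the embedding of each learned transformation of a sample towards the embedding of the original sample while pushing it away from the other transformed views, which rewards diversity without discarding semantic content. The encoder, transformations, and shapes are assumptions.

# Hedged sketch of a contrastive transformation-learning objective.
import torch
import torch.nn.functional as F

def transformation_learning_loss(encoder, transforms, x, temperature=0.1):
    """x: (batch, dim); transforms: list of learnable nn.Modules; encoder: nn.Module."""
    z = F.normalize(encoder(x), dim=-1)                       # embedding of the original
    views = [F.normalize(encoder(t(x)), dim=-1) for t in transforms]
    loss = 0.0
    for k, zk in enumerate(views):
        pos = torch.exp(torch.sum(zk * z, dim=-1) / temperature)       # similarity to original
        neg = sum(torch.exp(torch.sum(zk * zj, dim=-1) / temperature)  # similarity to other views
                  for j, zj in enumerate(views) if j != k)
        loss = loss + (-torch.log(pos / (pos + neg))).mean()
    return loss / len(views)

# Toy usage: linear encoder and linear "transformations" on 16-dim vectors.
encoder = torch.nn.Linear(16, 8)
transforms = [torch.nn.Linear(16, 16) for _ in range(4)]
loss = transformation_learning_loss(encoder, transforms, torch.randn(32, 16))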
We also introduce extensions of neural transformation learning for anomaly detection within time series and for graph-level anomaly detection. The extensions combine transformation learning with other learning paradigms to incorporate vital prior knowledge about time series and graph data. Moreover, we propose a general training strategy for deep anomaly detection with contaminated data: the idea is to alternate between inferring the unlabeled anomalies and using them to update the model parameters. In setups where expert feedback is available, we present a diverse querying strategy for active anomaly detection based on the seeding algorithm of K-means++.
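As an illustration of the querying idea (with assumed inputs, namely an embedding matrix and per-point anomaly scores, rather than the exact criterion used in the thesis), the following sketch picks the first query by anomaly score and then, in K-means++ seeding fashion, samples further queries with probability proportional to their squared distance from everything already selected.

# Illustrative K-means++-style diverse query selection for active anomaly detection.
import numpy as np

def diverse_queries(Z, scores, budget):
    """Z: (n, d) embeddings, scores: (n,) anomaly scores, budget: number of queries."""
    chosen = [int(np.argmax(scores))]                  # start with the top-scoring point
    for _ in range(budget - 1):
        d2 = np.min(((Z[:, None, :] - Z[chosen][None, :, :]) ** 2).sum(-1), axis=1)
        d2[chosen] = 0.0                               # never re-select a queried point
        probs = d2 / d2.sum()                          # assumes distinct embeddings
        chosen.append(int(np.random.choice(len(Z), p=probs)))
    return chosen

# Toy usage on random embeddings and scores.
queries = diverse_queries(np.random.randn(200, 8), np.random.rand(200), budget=5)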
Our extensive experiments and analysis demonstrate that neural transformation learning achieves remarkable and robust anomaly detection performance on various data types. Finally, we outline specific paths for future research.
Machine Learning (ML) is expected to become an integral part of future mobile networks due to its capacity for solving complex problems. During inference, ML algorithms extract the hidden knowledge of their input data, which in many scenarios is delivered to them through wireless links. Transmitting such massive amounts of input data can impose a huge burden on the mobile network. On the other hand, ML algorithms are known to tolerate different levels of distortion on their input components while the quality of their predictions remains unaffected. Conventional approaches therefore waste radio resources, since they target an exact reconstruction of the transmitted data, i.e., the input of the ML algorithms. In this thesis, we propose a novel relevance-based framework that focuses on the quality of the final ML outputs instead of such a syntax-based reconstruction of the transmitted inputs. To this end, we quantify the semantics, or relevancy, of input components through the bit-allocation aspect of data compression, where a higher tolerance for distortion implies lower relevancy. A lower relevance level translates into the allocation of fewer radio resources, e.g., bandwidth. The introduced formulation provides the foundation for supplying ML models with their required data in the inference phase while employing wireless resources efficiently.
In this dissertation, a generic relevance-based framework utilizing the Kullback-Leibler Divergence (KLD) is developed that is applicable to many realistic scenarios. The system model under study contains multiple sources transmitting correlated multivariate input components of an ML algorithm. The ML model is treated as a black box that is trained and has fixed parameters while operating in the inference phase. The proposed bit allocation accounts for the rate-distortion tradeoff and can therefore be readily adapted to other problems. An extended version of the proposed bit-allocation strategy is introduced to reduce signaling overhead in settings where the relevancy level of each input attribute changes instantaneously. In a further extension, a resource allocation approach for ML-based centralized control systems is proposed to take the effect of dynamic channel states into account. The novel quality-of-service metric takes the outputs of ML algorithms into consideration and, in combination with the designed greedy algorithm, provides significantly improved end-to-end performance for a network of cart inverted pendulums.
The introduced relevance-based framework is investigated comprehensively across various case studies: real and synthetic data, regression and classification, different estimators for the KLD, and various ML models and codebook designs. Furthermore, the reliability of the proposed solution is explored in the presence of packet drops, indicating the robustness of the relevance-based compression. In all of the simulations, the relevance-based solutions deliver the best outcome in terms of the carefully chosen key performance indicators, and in most of them significant gains over conventional techniques are achieved, motivating further research on the subject.
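A minimal sketch of the relevance idea follows, under the assumption of a scikit-learn style predict_proba interface and simple uniform scalar quantization: each feature is quantized at a candidate bit depth, the shift of the model's output distribution is measured with the KLD against the unquantized input, and a total bit budget is spent greedily where it reduces that shift the most. Everything here (model, quantizer, bit depths) is illustrative rather than the scheme developed in the dissertation.

# Hypothetical greedy KLD-driven bit allocation over input features.
import numpy as np

def quantize(x, bits):
    levels = 2 ** bits
    lo, hi = x.min(), x.max()
    return lo + np.round((x - lo) / (hi - lo + 1e-12) * (levels - 1)) / (levels - 1) * (hi - lo)

def kld(p, q, eps=1e-12):
    p, q = np.clip(p, eps, 1), np.clip(q, eps, 1)
    return float(np.sum(p * np.log(p / q), axis=-1).mean())

def allocate_bits(predict_proba, X, total_bits, max_bits=8):
    """Greedily give one more bit to the feature whose refinement most reduces
    the output KLD relative to the unquantized input."""
    n_feat = X.shape[1]
    bits = np.ones(n_feat, dtype=int)                  # start with 1 bit per feature
    p_ref = predict_proba(X)
    def cost(b):
        Xq = np.column_stack([quantize(X[:, j], b[j]) for j in range(n_feat)])
        return kld(p_ref, predict_proba(Xq))
    while bits.sum() < total_bits and np.any(bits < max_bits):
        gains = []
        for j in range(n_feat):
            if bits[j] >= max_bits:
                gains.append(-np.inf)
                continue
            trial = bits.copy()
            trial[j] += 1
            gains.append(cost(bits) - cost(trial))
        bits[int(np.argmax(gains))] += 1
    return bits

# Toy usage with a made-up linear "model" producing class probabilities.
rng = np.random.default_rng(0)
X, W = rng.normal(size=(200, 4)), rng.normal(size=(4, 3))
def predict_proba(X):
    logits = X @ W
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
print(allocate_bits(predict_proba, X, total_bits=16))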
In recent years, the automotive industry has shifted from purely combustion engine-driven vehicles towards hybridization due to the introduction of CO2 emission legislation. Hybrid powertrains also represent an important pillar and starting point in the journey towards zero emissions and full electrification. Fulfilling the most recent emission standards requires efficient control strategies for the engine that are capable of real-time operation. Model accuracy is one of the main parameters that directly influence the performance of such control strategies. Specific methodologies developed in the past, such as physically or phenomenologically based approaches, have already facilitated the modeling of the combustion engine. Even though these models can accurately predict emissions under steady-state conditions, applying them to transient engine operation is time-consuming and still not sufficiently reliable. The major contribution of the current work is to clarify and apply recent advancements in data-driven modeling techniques, especially in time series forecasting with feedforward neural networks (FFNNs) and long short-term memory networks (LSTMs), to address the limitations mentioned above and to compare the different approaches. The quantity and quality of data are significant challenges for data-driven modeling. This work studies the modeling of gasoline engine emissions using FFNNs and LSTMs. The data quantity and quality requirements are studied based on a portable emission measurement system (PEMS) measuring at 1 Hz and on additional analyses on an engine test bench with a HiL setup, where more sophisticated devices allow the measurement frequency to be increased by a factor of five. Subsequently, the training and validation of the FFNNs and LSTMs are outlined, and finally, the model accuracy is discussed.
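As a small illustration of the sequence models discussed above (with hypothetical signal names, window lengths, and shapes, not the thesis's actual architecture), the following PyTorch sketch maps windows of engine signals to a single emission value at the end of each window.

# Minimal LSTM regressor for emission prediction from engine-signal windows.
import torch
from torch import nn

class EmissionLSTM(nn.Module):
    def __init__(self, n_signals=8, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_signals, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)
    def forward(self, x):              # x: (batch, time, n_signals)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # predict the emission at the last time step

model = EmissionLSTM()
x = torch.randn(16, 100, 8)            # 16 windows of 100 samples (e.g. 1 Hz PEMS data)
y = torch.randn(16, 1)                 # measured emission target, e.g. NOx
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
opt.zero_grad()
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()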
Platform work is gaining importance as a new form of work and offers advantages in reconciling working and private life. However, control mechanisms such as algorithms and rating systems can also pose risks. Current research on the discrimination of women in online labor markets points to possible unequal treatment. Patterns known from the traditional labor market in hiring and pricing also appear on the platforms, which suggests that gender stereotypes carry over into the platform economy. The role played by platform-specific control mechanisms in this has rarely been the focus of previous studies.
This dissertation examines the role of gender stereotypes and algorithms in hiring and pricing on one of the world's largest freelancing platforms, freelancer.com. A unique dataset is created via web scraping and prepared using machine learning methods. The research question is investigated with econometric models that account for job-specific effects.
The results suggest that gender stereotypes play no role in hiring decisions on the platform. However, the platform's ranking algorithm is highly influential. Moreover, the freelancers' ranking affects the probability of being hired differently depending on gender: for women, rank in a female-dominated field of work is less relevant than for men.
Gender stereotypes thus appear to have no relevance on the freelancing platform, offering women a more gender-equitable form of employment. However, platform-specific control mechanisms such as the ranking algorithm create new potential for gender discrimination. These findings contribute to a better understanding of the challenges and opportunities of platform work in the context of gender equality.