A prime motivation for using XML to directly represent pieces of information is its ability to support ad-hoc or 'schema-later' settings. In such scenarios, modeling data under loose data constraints is essential. Of course, the flexibility of XML comes at a price: the absence of a rigid, regular, and homogeneous structure makes many aspects of data management more challenging. Such malleable data formats can also lead to severe information quality problems, because the risk of storing inconsistent and incorrect data is greatly increased. A prominent example of such problems is the appearance of so-called fuzzy duplicates, i.e., multiple, non-identical representations of a real-world entity. Similarity joins, which correlate similar XML document fragments, can be used as core operators to support the identification of fuzzy duplicates. However, similarity assessment is especially difficult on XML datasets because structure, in addition to textual information, may vary across document fragments representing the same real-world entity. Moreover, similarity computation is substantially more expensive for tree-structured objects and is thus a serious performance concern. This thesis describes the design and implementation of an effective, flexible, and high-performance XML-based similarity join framework. As main contributions, we present novel structure-conscious similarity functions for XML trees (considering XML structure either in isolation or combined with textual information), mechanisms to support the selection of relevant information from XML trees and the organization of this information into a suitable format for similarity calculation, and efficient algorithms for large-scale identification of similar, set-represented objects.
Finally, we validate the applicability of our techniques by integrating our framework into a native XML database management system; in this context we address several issues around the integration of similarity operations into traditional database architectures.
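To illustrate the idea of comparing XML fragments as set-represented objects, here is a minimal Python sketch. It is not the framework described in the thesis: the token scheme (root-to-node paths, optionally combined with words of text), the Jaccard measure, the naive pairwise join, and all function names are assumptions chosen for illustration only.

```python
import xml.etree.ElementTree as ET
from itertools import combinations


def path_tokens(xml_text):
    """Turn an XML fragment into a set of tokens (hypothetical scheme).

    Each element contributes its root-to-node path (structure only) and
    one token per word of its text, prefixed with that path (structure
    combined with text).
    """
    tokens = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        tokens.add(path)
        for word in (node.text or "").lower().split():
            tokens.add(path + "=" + word)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return frozenset(tokens)


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_join(fragments, threshold):
    """Naive quadratic self-join: all pairs with similarity >= threshold."""
    sets = [path_tokens(f) for f in fragments]
    result = []
    for i, j in combinations(range(len(sets)), 2):
        sim = jaccard(sets[i], sets[j])
        if sim >= threshold:
            result.append((i, j, sim))
    return result


fragments = [
    "<person><name>John Smith</name><city>Berlin</city></person>",
    "<person><name>Jon Smith</name><city>Berlin</city></person>",
    "<book><title>XML Joins</title></book>",
]
pairs = similarity_join(fragments, 0.6)
print(pairs)
```

In this example the first two fragments share five of seven distinct tokens, so they are reported as a candidate fuzzy duplicate, while the third fragment shares none. The pairwise loop compares every pair and is only suitable for small inputs; a large-scale join would need filtering or indexing techniques to avoid that cost.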
Automated theorem proving is a search problem and, because of its undecidability, a very difficult one. The challenge in developing a practically successful prover is mapping the extensively developed theory into a program that runs efficiently on a computer. Starting from a level-based system model for automated theorem provers, in this work we present different techniques that are important for the development of powerful equational theorem provers. The contributions can be divided into three areas:
Architecture. We present a novel prover architecture that is based on a set-based compression scheme. With moderate additional computational costs we achieve a substantial reduction of the memory requirements. Further benefits are architectural clarity, the easy provision of proof objects, and a new way to parallelize a prover which shows respectable speed-ups in practice. The compact representation paves the way to new applications of automated equational provers in the area of verification systems.
Algorithms. To improve the speed of a prover we need efficient solutions for the most time-consuming sub-tasks. We demonstrate improvements of several orders of magnitude for two of the most widely used term orderings, LPO and KBO. Other important contributions are a novel generic unsatisfiability test for ordering constraints and, based on that, a sufficient ground reducibility criterion with an excellent cost-benefit ratio.
Redundancy avoidance. The notion of redundancy is of central importance to justify simplifying inferences which are used to prune the search space. In our experience with unfailing completion, the usual notion of redundancy is not strong enough. In the presence of associativity and commutativity, provers often get stuck enumerating equations that are permutations of each other. By extending and refining the proof ordering, many more equations can be shown to be redundant. Furthermore, our refinement of the unfailing completion approach allows us to use redundant equations for simplification without the need to consider them for generating inferences. We describe the efficient implementation of several redundancy criteria and experimentally investigate their influence on the proof search.
The combination of these techniques results in a considerable improvement of the practical performance of a prover, which we demonstrate with extensive experiments for the automated theorem prover Waldmeister. The progress achieved allows the prover to solve problems that were previously out of reach. This considerably enhances the potential of the prover and opens the way for new applications.
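For readers unfamiliar with the term orderings named above, the following Python sketch spells out the textbook definition of the lexicographic path ordering (LPO). It is a direct, unoptimized transcription of that definition and not the efficient algorithm developed in the thesis. The term encoding (variables as strings, function applications as tuples), the precedence dictionary, and all names are assumptions made for illustration, and every symbol is assumed to have a fixed arity.

```python
def is_var(term):
    """Variables are plain strings; applications are tuples (symbol, *args)."""
    return isinstance(term, str)


def occurs(var, term):
    """True if the variable occurs anywhere in the term."""
    if is_var(term):
        return var == term
    return any(occurs(var, arg) for arg in term[1:])


def lpo_greater(s, t, prec):
    """Return True if s > t in the LPO induced by the precedence `prec`.

    `prec` maps function symbols to integers; a larger value means a
    larger symbol. Naive recursion that can take exponential time.
    """
    if is_var(s):
        return False  # a variable is never greater than anything
    if is_var(t):
        return occurs(t, s)  # s is not a variable, so s != t here
    f, s_args = s[0], s[1:]
    g, t_args = t[0], t[1:]
    # Case 1: some argument of s is equal to or greater than t.
    if any(a == t or lpo_greater(a, t, prec) for a in s_args):
        return True
    # The remaining cases both require s to dominate every argument of t.
    if not all(lpo_greater(s, b, prec) for b in t_args):
        return False
    # Case 2: the head symbol of s is larger in the precedence.
    if prec[f] > prec[g]:
        return True
    # Case 3: same head symbol, arguments compared lexicographically.
    if f == g:
        for a, b in zip(s_args, t_args):
            if a != b:
                return lpo_greater(a, b, prec)
    return False


# Hypothetical precedence: add > s.
prec = {"add": 2, "s": 1}
lhs = ("add", ("s", "x"), "y")   # add(s(x), y)
rhs = ("s", ("add", "x", "y"))   # s(add(x, y))
print(lpo_greater(lhs, rhs, prec), lpo_greater(rhs, lhs, prec))
```

With the precedence `add > s`, the equation `add(s(x), y) = s(add(x, y))` is oriented from left to right, which is the usual direction for the recursive definition of addition.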
Platform work is becoming increasingly important as a new form of work and offers advantages for reconciling working and private life. However, governance elements such as algorithms and rating systems can also carry risks. Current research on discrimination against women in online labor markets points to possible unequal treatment. Familiar patterns from the traditional labor market in hiring and price setting also appear on the platforms. This suggests that gender stereotypes carry over to the platform economy. The role played by platform-specific governance elements has rarely been the focus of previous studies.
This dissertation examines the role of gender stereotypes and algorithms in hiring and price setting on one of the world's largest freelancing platforms, freelancer.com. A unique dataset is created through web scraping and prepared using machine learning methods. Econometric models are used to investigate the research question while accounting for job-specific effects.
The results indicate that gender stereotypes play no role in hiring decisions on the platform. However, the platform's ranking algorithm is highly important. Furthermore, the ranking of freelancers affects the probability of being hired differently depending on gender: for women, rank is less relevant in a female-dominated field of activity than it is for men.
Gender stereotypes therefore appear to have no relevance on the freelancing platform. Women are thus offered a more gender-equitable form of employment. However, platform-specific governance elements such as the ranking algorithm harbor new potential for gender discrimination. The findings contribute to a better understanding of the challenges and opportunities of platform work in the context of gender equality.