DeepKAF: A Knowledge Intensive Framework for Heterogeneous Case-Based Reasoning in Textual Domains

Business-relevant domain knowledge is often found in plain text: customer support tickets, message exchanges among employees, and other business transactions. Decoding this text-based domain knowledge can be very demanding, since traditional methods rely on a comprehensive representation of the business and its relevant paths. Building such a representation can be highly complex, time-consuming and costly to maintain, especially in environments that change dynamically. This thesis presents a novel approach for developing hybrid case-based reasoning (CBR) systems that combine the benefits of deep learning with the advantages of CBR. The Deep Knowledge Acquisition Framework (DeepKAF) is a domain-independent framework that uses deep neural networks and big data technologies to decode domain knowledge with minimal involvement from domain experts. While this thesis focuses mainly on textual data because of the availability of datasets, CBR systems built on DeepKAF can handle heterogeneous data, where a case may be represented by different attribute types, and can automatically extract the necessary domain knowledge while retaining an adequate level of explainability. The thesis focuses on automatic knowledge acquisition, the construction of similarity measures, and case retrieval. Over the course of this research, several sets of experiments were conducted and validated by domain experts. The experiments used historical textual data produced over around 15 years. The text is a mixture of English and German, describes specific domain problems, and contains many abbreviations. Based on these data, the necessary knowledge repositories were built and then used to evaluate the proposed approach for effective monitoring and diagnosis of business workflows.
In addition, a public dataset, the CaseLaw dataset, was used to validate DeepKAF on longer texts and on cases with more attributes. The CaseLaw dataset comprises around 22 million cases from different US states. Future work motivated by this thesis could investigate how different deep learning models can be used within the CBR paradigm to address some of the long-standing CBR challenges and to benefit large-scale, multi-dimensional enterprises.

Metadata
Author: Kareem Amin
URN: urn:nbn:de:hbz:386-kluedo-65744
DOI: https://doi.org/10.26204/KLUEDO/6574
Advisor: Andreas Dengel
Document type: Doctoral thesis
Language of publication: English
Date of publication (online): 2021-09-16
Year of first publication: 2021
Publishing institution: Technische Universität Kaiserslautern
Degree-granting institution: Technische Universität Kaiserslautern
Date of thesis acceptance: 2021-09-07
Date of publication (server): 2021-09-20
GND keywords: CBR; Deep Learning; Textual CBR; Hybrid CBR
Number of pages: XIV, 187
Departments / Organizational units: Kaiserslautern - Department of Computer Science
DDC classes: 0 General works, computer science, information science / 004 Computer science
Collections: Open Access Publication Fund
License (German): Creative Commons 4.0 - Attribution (CC BY 4.0)