DeepKAF: A Knowledge Intensive Framework for Heterogeneous Case-Based Reasoning in Textual Domains

  • Business-relevant domain knowledge can be found in plain text across customer support tickets, employee message exchanges and other business transactions. Decoding text-based domain knowledge can be a very demanding task, since traditional methods focus on a comprehensive representation of the business and its relevant paths. Such a process can be highly complex, time-consuming and costly to maintain, especially in environments that change dynamically. In this thesis, a novel approach is presented for developing hybrid case-based reasoning (CBR) systems that combine the benefits of deep learning with the advantages of CBR. The Deep Knowledge Acquisition Framework (DeepKAF) is a domain-independent framework that uses deep neural networks and big data technologies to decode domain knowledge with minimal involvement from domain experts. While this thesis focuses mainly on textual data because of the availability of datasets, CBR systems based on DeepKAF can handle heterogeneous data, where a case can be represented by different attribute types, and can automatically extract the necessary domain knowledge while retaining an adequate level of explainability. The main focus areas of this thesis are automatic knowledge acquisition, building similarity measures and case retrieval. Over the course of this research, several sets of experiments were conducted and validated by domain experts. The experiments used historical textual data produced over around 15 years. The text is a mixture of English and German, describes specific domain problems and contains many abbreviations. Based on these data, the necessary knowledge repositories were built and then used to evaluate the proposed approach to effective monitoring and diagnosis of business workflows.
An additional dataset, the public CaseLaw dataset, was used to validate DeepKAF on longer texts and on cases with more attributes. The CaseLaw dataset represents around 22 million cases from different US states. Further work motivated by this thesis could investigate how different deep learning models can be used within the CBR paradigm to solve some of the chronic CBR challenges and benefit large-scale multi-dimensional enterprises.

Author:Kareem Amin
Advisor:Andreas Dengel
Document Type:Doctoral Thesis
Language of publication:English
Publication Date:2021/09/16
Year of Publication:2021
Publishing Institute:Technische Universität Kaiserslautern
Granting Institute:Technische Universität Kaiserslautern
Acceptance Date of the Thesis:2021/09/07
Date of the Publication (Server):2021/09/20
GND-Keyword:CBR; Deep Learning; Textual CBR; Hybrid CBR
Number of pages:XIV, 187
Faculties / Organisational entities:Fachbereich Informatik
DDC-Classification:0 Allgemeines, Informatik, Informationswissenschaft / 004 Informatik
Licence (German):Creative Commons 4.0 - Namensnennung (CC BY 4.0)