Year of publication: 2021
Document Type: Doctoral Thesis
Language: English
Has Fulltext: yes
Keywords: CBR, Deep Learning, Hybrid CBR, Textual CBR
DeepKAF: A Knowledge Intensive Framework for Heterogeneous Case-Based Reasoning in Textual Domains
(2021)
Business-relevant domain knowledge can be found in plain text across customer support
tickets, employee message exchanges and other business transactions.
Decoding text-based domain knowledge can be a very demanding task, since traditional
methods focus on a comprehensive representation of the business and its relevant paths. Such
a process can be highly complex, time-consuming and costly to maintain, especially in
environments that change dynamically.
In this thesis, a novel approach is presented for developing hybrid case-based reasoning
(CBR) systems that combine the benefits of deep learning with the advantages of CBR.
The Deep Knowledge Acquisition Framework (DeepKAF) is a domain-independent
framework that uses deep neural networks and big data technologies to decode
domain knowledge with minimal involvement from domain experts. While
this thesis focuses on textual data because of the availability of datasets, the
target CBR systems based on DeepKAF are able to deal with heterogeneous data, where a
case can be represented by different attribute types, and to automatically extract the necessary
domain knowledge while keeping the ability to provide an adequate level of explainability.
The main focuses of this thesis are automatic knowledge acquisition, building similarity
measures and case retrieval.
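To make the idea of retrieval over heterogeneous cases concrete, the following is a minimal sketch of the classic CBR local-global similarity principle: one local similarity per attribute type, combined into a weighted global score that ranks the case base. It is not DeepKAF's implementation. The `Case` fields, the weights, the assumed priority range of 0 to 10 and all function names are hypothetical, and the token-overlap text measure is only a simple stand-in for a similarity learned by a deep neural network.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """A hypothetical heterogeneous case: text, symbolic and numeric attributes."""
    text: str        # free-text problem description
    category: str    # symbolic attribute
    priority: float  # numeric attribute, assumed to lie in [0, 10]
    solution: str


def text_sim(a: str, b: str) -> float:
    """Jaccard overlap of lowercase tokens (placeholder for a learned text similarity)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def category_sim(a: str, b: str) -> float:
    """Exact-match similarity for symbolic values."""
    return 1.0 if a == b else 0.0


def numeric_sim(a: float, b: float, value_range: float = 10.0) -> float:
    """Linear similarity that decreases with distance, clipped at 0."""
    return max(0.0, 1.0 - abs(a - b) / value_range)


# Assumed attribute weights; in practice these would be tuned or learned.
WEIGHTS = {"text": 0.6, "category": 0.2, "priority": 0.2}


def global_sim(query: Case, case: Case) -> float:
    """Weighted sum of the per-attribute (local) similarities."""
    return (
        WEIGHTS["text"] * text_sim(query.text, case.text)
        + WEIGHTS["category"] * category_sim(query.category, case.category)
        + WEIGHTS["priority"] * numeric_sim(query.priority, case.priority)
    )


def retrieve(query: Case, case_base: list[Case], k: int = 1) -> list[Case]:
    """Return the k cases most similar to the query."""
    ranked = sorted(case_base, key=lambda c: global_sim(query, c), reverse=True)
    return ranked[:k]


# Toy case base and query, invented for illustration only.
case_base = [
    Case("printer does not print after driver update", "hardware", 3.0,
         "reinstall driver"),
    Case("invoice export fails with timeout", "billing", 8.0,
         "increase export timeout"),
]
query = Case("invoice export timeout error", "billing", 7.0, "")
best = retrieve(query, case_base, k=1)[0]
print(best.solution)
```

Because the global score is a sum of named per-attribute terms, each retrieval can be explained by reporting the individual local similarities, which is one simple way to retain a degree of explainability in such a design.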
Throughout this research, several sets of experiments were conducted
and validated by domain experts. The experiments used historical textual data produced
over roughly 15 years. The text is a mixture of English and German, describes specific
domain problems and contains many abbreviations. Based on these data, the necessary
knowledge repositories were built and then used to evaluate the suggested approach for
effective monitoring and diagnosis of business workflows. A public dataset, the CaseLaw
dataset, was also used to validate DeepKAF on longer texts and on cases with more attributes.
The CaseLaw dataset represents around 22 million cases from different US states.
Further work motivated by this thesis could investigate how different deep learning models
can be used within the CBR paradigm to solve some of the chronic CBR challenges and be
of benefit to large-scale multi-dimensional enterprises.