Refine
Document Type
- Doctoral Thesis (4) (remove)
Language
- English (4) (remove)
Has Fulltext
- yes (4)
Keywords
- Artificial Intelligence (4) (remove)
Faculty / Organisational entity
Towards PACE-CAD Systems
(2022)
Despite phenomenal advancements in the availability of medical image datasets and the development of modern classification algorithms, Computer-Aided Diagnosis (CAD) has had limited practical exposure in the real-world clinical workflow. This is primarily because of the inherently demanding and sensitive nature of medical diagnosis that can have far-reaching and serious repercussions in case of misdiagnosis. In this work, a paradigm called PACE (Pragmatic, Accurate, Confident, & Explainable) is presented as a set of some of must-have features for any CAD. Diagnosis of glaucoma using Retinal Fundus Images (RFIs) is taken as the primary use case for development of various methods that may enrich an ordinary CAD system with PACE. However, depending on specific requirements for different methods, other application areas in ophthalmology and dermatology have also been explored.
Pragmatic CAD systems refer to a solution that can perform reliably in day-to-day clinical setup. In this research two, of possibly many, aspects of a pragmatic CAD are addressed. Firstly, observing that the existing medical image datasets are small and not representative of images taken in the real-world, a large RFI dataset for glaucoma detection is curated and published. Secondly, realising that a salient attribute of a reliable and pragmatic CAD is its ability to perform in a range of clinically relevant scenarios, classification of 622 unique cutaneous diseases in one of the largest publicly available datasets of skin lesions is successfully performed.
Accuracy is one of the most essential metrics of any CAD system's performance. Domain knowledge relevant to three types of diseases, namely glaucoma, Diabetic Retinopathy (DR), and skin lesions, is industriously utilised in an attempt to improve the accuracy. For glaucoma, a two-stage framework for automatic Optic Disc (OD) localisation and glaucoma detection is developed, which marked new state-of-the-art for glaucoma detection and OD localisation. To identify DR, a model is proposed that combines coarse-grained classifiers with fine-grained classifiers and grades the disease in four stages with respect to severity. Lastly, different methods of modelling and incorporating metadata are also examined and their effect on a model's classification performance is studied.
Confidence in diagnosing a disease is equally important as the diagnosis itself. One of the biggest reasons hampering the successful deployment of CAD in the real-world is that medical diagnosis cannot be readily decided based on an algorithm's output. Therefore, a hybrid CNN architecture is proposed with the convolutional feature extractor trained using point estimates and a dense classifier trained using Bayesian estimates. Evaluation on 13 publicly available datasets shows the superiority of this method in terms of classification accuracy and also provides an estimate of uncertainty for every prediction.
Explainability of AI-driven algorithms has become a legal requirement after Europe’s General Data Protection Regulations came into effect. This research presents a framework for easy-to-understand textual explanations of skin lesion diagnosis. The framework is called ExAID (Explainable AI for Dermatology) and relies upon two fundamental modules. The first module uses any deep skin lesion classifier and performs detailed analysis on its latent space to map human-understandable disease-related concepts to the latent representation learnt by the deep model. The second module proposes Concept Localisation Maps, which extend Concept Activation Vectors by locating significant regions corresponding to a learned concept in the latent space of a trained image classifier.
This thesis probes many viable solutions to equip a CAD system with PACE. However, it is noted that some of these methods require specific attributes in datasets and, therefore, not all methods may be applied on a single dataset. Regardless, this work anticipates that consolidating PACE into a CAD system can not only increase the confidence of medical practitioners in such tools but also serve as a stepping stone for the further development of AI-driven technologies in healthcare.
During our daily lives, we are confronted with vast amounts of data, the processing of which can dramatically influence our lives, both positively and negatively. The enormous amount of data (images, texts, tables, and time series), its variety and possible applications are not always obvious. Due to advancements in the internet of things (IoT), there exist billions of sensors that produce time series which can be found everywhere, whether in medicine, the financial sector or the agricultural economy. This incredible amount of time series data has many hidden features which are useful for industry as well as for daily use, e.g. improving the cancer prediction can save real human lives. Recently, several deep learning methods have been proposed for analyzing this time series data. However, due to their black box nature, their applicability is limited in critical sectors like medicine, finance, and communication. In addition, it is now a compulsion as per artificial intelligence (AI) Act and per General Data Protection Regulation (GDPR) to protect any sensitive data and provide explanations in safety-critical domains. To enable use of DNNs in a broader domain scope, this thesis presents a framework for privacy-preserved and interpretable time series analysis. TimeFrame consists of four main components, namely, post-hoc interpretability, intrinsic interpretability, direct privacy, and indirect privacy. Interpretability is indispensable to avoid damaging people or the infrastructure. In the past years, the development mostly focused on image data, which prevented the full potential of DNNs in time series processing from being exploited. To overcome this limitation, TimeFrame introduces five (Time to Focus, TSViz, TimeREISE, TSInsight, Data Lens) novel post-hoc and two (PatchX, P2ExNet) novel intrinsic interpretability components. TimeFrame addresses multiple perspectives such as attribution, compression, visualization, influence, prototyping, and hierarchical splitting. Compared to existing methods, the components show better explanations, robustness, and scalability. Another crucial factor is the privacy when dealing with sensitive data and deep learning. In this context, TimeFrame introduces two (PPML, PPML x XAI) components for direct and one (From Private to Public) component for indirect privacy. These components benchmark privacy approaches, their effect on interpretability, and the synthetic generation of data to overcome privacy concerns. TimeFrame offers a large set of interpretability and privacy components that can be combined and consider numerous different aspects. Furthermore, the novel approaches have shown to consistently outperform twenty existing state-of-the-art methods across up to 20 different datasets. To guarantee the fairness, various metrics were used including performance change, Sensitivity, Infidelity, Continuity, runtime, model dependency, compression rate, and others. This broad set of metrics makes it possible to provide guidelines for a more appropriate use of existing state-of-the-art approaches as well as the novel components included in TimeFrame.
In recent years, enormous progress has been made in the field of Artificial Intelligence (AI). Especially the introduction of Deep Learning and end-to-end learning, the availability of large datasets and the necessary computational power in form of specialised hardware allowed researchers to build systems with previously unseen performance in areas such as computer vision, machine translation and machine gaming. In parallel, the Semantic Web and its Linked Data movement have published many interlinked RDF datasets, forming the world’s largest, decentralised and publicly available knowledge base.
Despite these scientific successes, all current systems are still narrow AI systems. Each of them is specialised to a specific task and cannot easily be adapted to all other human intelligence tasks, as would be necessary for Artificial General Intelligence (AGI). Furthermore, most of the currently developed systems are not able to learn by making use of freely available knowledge such as provided by the Semantic Web. Autonomous incorporation of new knowledge is however one of the pre-conditions for human-like problem solving.
This work provides a small step towards teaching machines such human-like reasoning on freely available knowledge from the Semantic Web. We investigate how human associations, one of the building blocks of our thinking, can be simulated with Linked Data. The two main results of these investigations are a ground truth dataset of semantic associations and a machine learning algorithm that is able to identify patterns for them in huge knowledge bases.
The ground truth dataset of semantic associations consists of DBpedia entities that are known to be strongly associated by humans. The dataset is published as RDF and can be used for future research.
The developed machine learning algorithm is an evolutionary algorithm that can learn SPARQL queries from a given SPARQL endpoint based on a given list of exemplary source-target entity pairs. The algorithm operates in an end-to-end learning fashion, extracting features in form of graph patterns without the need for human intervention. The learned patterns form a feature space adapted to the given list of examples and can be used to predict target candidates from the SPARQL endpoint for new source nodes. On our semantic association ground truth dataset, our evolutionary graph pattern learner reaches a Recall@10 of > 63 % and an MRR (& MAP) > 43 %, outperforming all baselines. With an achieved Recall@1 of > 34% it even reaches average human top response prediction performance. We also demonstrate how the graph pattern learner can be applied to other interesting areas without modification.
Machine Learning (ML) is expected to become an integrated part of future mobile networks due to its capacity for solving complex problems. During inference, ML algorithms extract the hidden knowledge of their input data which is delivered to them through wireless links in many scenarios. Transmission of a massive amount of such input data can impose a huge burden on the mobile network. On the other hand, it is known that ML algorithms can tolerate different levels of distortion on their input components, while the quality of their predictions remains unaffected. Therefore, utilization of the conventional approaches
implies a waste of radio resources, since they target an exact reconstruction of transmitted data, i.e., the input of ML algorithms. In this thesis, we propose a novel relevance based framework that focuses on the quality of final ML outputs instead of such syntax based reconstruction of transmitted inputs. To this end, we quantify the semantics or relevancy of input components in terms of the bit allocation aspect of data compression, where a higher tolerance for distortion implies less relevancy. A lower relevance level is translated into the allocation of less radio resources, e.g., bandwidth. The introduced formulation provides the foundations for the efficient support of ML models with their required data in the inference phase, while wireless resources are employed efficiently.
In this dissertation, a generic relevance based framework utilizing the Kullback-Leibler Divergence (KLD) is developed that is applicable to many realistic scenarios. The system model under study contains multiple sources transmitting correlated multivariate input components of a ML algorithm. The ML model is seen as a black box, which is trained and has fixed parameters while operating in the inference phase. Our proposed bit allocation accounts for the rate-distortion tradeoff. Hence, it is simply adjustable for application to
other problems. Here, an extended version of the proposed bit allocation strategy is introduced for signaling overhead reduction, in which the relevancy level of each input attribute changes instantaneously. In another expansion, to take the effect of dynamic channel states into account, a resource allocation approach for ML based centralized control systems is proposed. The novel quality of service metric takes outputs of ML algorithms into consideration,
and in combination with the designed greedy algorithm, provides significantly
improved end-to-end performance for a network of cart inverted pendulums.
The introduced relevance based framework is comprehensively investigated by considering various case studies, real and synthetic data, regression and classification, different estimators for the KLD, various ML models and codebook designs. Furthermore, the reliability of this proposed solution is explored in presence of packet drops, indicating robustness of the relevance based compression. In all of the simulations, the relevance based solutions deliver the best outcome in terms of the carefully chosen key performance indicators. In most of them, significantly high gains are also achieved compared to the conventional techniques, motivating further research on the subject.