H. Information Systems
The rapid growth of systems, both in size and complexity, combined with their distributed
nature, is posing challenges for their efficient integration and functioning. Moreover,
in order to achieve sustainability objectives and future goals, systems are increasingly
collaborating with each other, resulting in the emergence of Systems of Systems (SoS)
that are large-scale and independent. In such scenarios, multiple stakeholders and systems
from different disciplines with diverse interests need to interoperate. In various domains,
this trend of growing systems creates a greater need for interfaces that ensure seamless
interoperability both between and within these systems and SoS.
To address these challenges, an effective method for integrating systems and SoS is required.
A key to ease this integration can be the use of interface specifications to describe and
specify interfaces. However, there is currently no comprehensive understanding of how
to write high-quality interface specifications, nor is there a common overview of interface
specification approaches.
This thesis aims to fill these gaps in documented knowledge by reviewing recent developments
and best practices for interface specifications in the context of systems engineering
and SoS engineering. The review combines a literature survey focused on
interface specifications with an analysis of existing interface specification
approaches and expert interviews. The goal is to provide an overview of current interface
specification characteristics and their common use cases. Based on this analysis, a
usage-driven approach in the form of customised interface specification mappings was
developed, which can assist in identifying an appropriate approach for specifying interfaces.
In light of the increasing connectivity in our lives, this work provides a framework for
better classifying and approaching interface specifications, moving away from
viewing interfaces as neglected elements of systems engineering and towards a more
deliberate and productive treatment.
Semi-structured data is a common data format in many domains.
It is characterized by a hierarchical structure and a schema that is not fixed.
Efficient and scalable processing of this data is therefore challenging, as many existing indexing and processing techniques are not well-suited for this data format.
This dissertation presents a novel approach to processing large JSON datasets.
We describe a new data processor, JODA, that is designed to process semi-structured data by using all available computing resources and state-of-the-art techniques.
Using a custom query language and a vertically-scaling pipeline query execution engine, JODA can process large datasets with high throughput.
We optimize JODA by using a novel optimization for iterative query workloads called delta trees, which succinctly represent the changes between two documents.
This allows us to process iterative and exploratory queries efficiently.
We improve the filtering performance of JODA by implementing a holistic adaptive indexing approach that creates and improves structural and content indices on the fly, depending on the query load.
No prior knowledge about the data is required, and the indices are automatically improved over time.
JODA is also modularized and can be extended with new user-defined predicates, functions, indices, import, and export functionalities.
These modules can be written in an external programming language and integrated into the query execution pipeline at runtime.
To evaluate this system against competitors, we introduce a benchmark generator, coined BETZE, which aims to simulate data scientists exploring unknown JSON datasets.
The generator can be tweaked to generate query workload with different characteristics, or predefined presets can be used to quickly generate a benchmark.
Our evaluation shows that JODA outperforms its competitors in most tasks over a wide range of datasets and use cases.
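As a rough illustration of the delta idea, though not of JODA's actual delta-tree data structure or query language, the following Python sketch computes a minimal nested delta between two JSON-like documents; the function name and the `__removed__` sentinel are hypothetical:

```python
def json_delta(old, new):
    """Recursively compute a minimal delta between two JSON-like dicts.

    Returns a nested dict containing only the paths whose values changed
    in `new`; unchanged subtrees are omitted entirely, which keeps the
    delta small when two documents differ in only a few fields.
    """
    REMOVED = "__removed__"  # hypothetical sentinel marking deleted keys
    delta = {}
    for key in old.keys() | new.keys():
        if key not in new:
            delta[key] = REMOVED
        elif key not in old:
            delta[key] = new[key]                 # newly added field
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            sub = json_delta(old[key], new[key])  # recurse into subtree
            if sub:
                delta[key] = sub
        elif old[key] != new[key]:
            delta[key] = new[key]                 # changed leaf value
    return delta
```

For example, changing one nested field yields only that path: `json_delta({"a": {"b": 1, "c": 2}}, {"a": {"b": 1, "c": 3}})` returns `{"a": {"c": 3}}`.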
Human pose in terms of 3D joint angles is needed for applications such as activity recognition, musculoskeletal health, sports biomechanics, and ergonomics. Microelectromechanical systems (MEMS) based magnetic-inertial measurement units (MIMUs) can estimate 3D orientation. Due to their small size, MIMUs can be attached to the body as wearable sensors for obtaining the full 3D human pose; this setup is termed inertial motion capture (i-Mocap). However, MIMUs suffer from sensor errors and disturbances, so the orientation estimated from an individual MIMU can be erroneous. Accurate sensor calibration is essential, and subsequently the alignment of these sensors to the body segments must also be precisely known, which is called sensor-to-segment calibration. Sensor fusion is employed to address the disturbances and noise in MIMUs. Many state-of-the-art inertial motion capture approaches ignore the magnetometer and use only IMUs to reduce the error arising from an inhomogeneous magnetic field. These algorithms rely on kinematic constraints and assumptions regarding joints and are based on IMUs located on adjacent body segments. Full-body coverage requires 13-17 such units and can be quite obtrusive; setting up and calibrating so many wearable sensors also takes time.
This thesis focuses on 3D human pose estimation from a reduced number of MIMUs and deals with this problem systematically. First, we propose an accurate simultaneous calibration of multiple MIMUs, which also learns the uncertainty of individual sensors. We then describe a novel sensor fusion algorithm for robust orientation estimation from an MIMU and for updating the sensor calibration online. The residual errors in both sensor calibration and fusion can result in drift error in the joint angles. Therefore, we present an anatomical (sensor-to-segment) calibration in which an orientation offset correction term is updated and used for online correction of residual drift in individual joint angles. Subsequently, we demonstrate that 3D human joint angle constraints can be learned using a data-driven approach in a high-dimensional latent space. Owing to temporal and joint angle constraints, it is possible to use only a reduced set of sensors (as opposed to one sensor per segment) and still obtain the 3D human pose. However, the spatial and temporal priors learned from data are often limited due to the finite set of movement patterns in most datasets. This introduces uncertainty when estimating 3D human pose from sparse MIMU sensors. We propose a magnetometer-robust orientation parameterization and a data-driven deep learning framework to predict 3D human pose with associated uncertainty from sparse MIMUs. The model is evaluated on real MIMU data, and we show that the uncertainty predicted by the trained model is well correlated with the actual error and ambiguity.
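To illustrate the general idea behind inertial sensor fusion, here is a minimal complementary filter for a single tilt angle, blending a drift-prone gyroscope integral with a noisy but drift-free accelerometer reference. This is a far simpler scheme than the thesis's MIMU fusion algorithm; the function name and blend factor are illustrative:

```python
import math

def complementary_filter(angle, gyro_rate, accel, dt, alpha=0.98):
    """One update step of a complementary filter for a single tilt angle.

    angle:     previous angle estimate (rad)
    gyro_rate: angular rate about the tilt axis (rad/s)
    accel:     (a_y, a_z) accelerometer components (m/s^2)
    alpha:     blend factor; values near 1 trust the smooth but drifting
               gyro, lower values trust the noisy but drift-free
               gravity reference from the accelerometer
    """
    gyro_angle = angle + gyro_rate * dt            # integrate gyro (drifts)
    accel_angle = math.atan2(accel[0], accel[1])   # gravity-referenced angle
    return alpha * gyro_angle + (1 - alpha) * accel_angle
```

Run in a loop over sensor samples, the accelerometer term slowly pulls the integrated gyro estimate back towards the gravity reference, bounding the drift that pure integration would accumulate.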
The fifth generation (5G) of wireless networks promises to bring new advances, such as a huge increase in mobile data rates, a plunge in communication latency, and an increase in the quality of experience perceived by users, which can cope with the ever-increasing demand in Internet traffic. However, the high capital and operational expenditure (CAPEX/OPEX) of the new 5G network and the lack of a killer application hinder its rapid adoption. In this context, Mobile Network Operators (MNOs) have turned their attention to the following idea: opening up their infrastructure so that vertical businesses can leverage the new 5G network to improve their primary businesses and develop new ones. However, deploying multiple isolated vertical applications on top of the same infrastructure poses unique challenges that must be addressed. In this thesis, we provide critical contributions to developing 5G networks that accommodate different vertical applications in an isolated, flexible, and automated manner. This thesis's contributions span three main areas: (i) the development of an integrated fronthaul and backhaul network, (ii) the development of a network slicing overbooking algorithm, and (iii) the development of a method to mitigate the noisy-neighbor problem in a vRAN deployment.
The fifth generation mobile networks (5G) will incorporate novel technologies such as network programmability and virtualization enabled by Software-Defined Networking (SDN) and Network Function Virtualization (NFV) paradigms, which have recently attracted major
interest from both academic and industrial stakeholders.
Building on these concepts, Network Slicing has emerged as the main driver of a novel business model in which mobile operators may open up, i.e., "slice", their infrastructure to new business players and offer independent, isolated and self-contained sets of network functions
and physical/virtual resources tailored to specific services requirements. While Network Slicing has the potential to increase the revenue sources of service providers, it involves a number of technical challenges that must be carefully addressed.
End-to-end (E2E) network slices encompass time and spectrum resources in the radio access network (RAN), transport resources on the fronthauling/backhauling links, and computing and storage resources at core and edge data centers. Additionally, the vertical service requirements’ heterogeneity (e.g., high throughput, low latency, high reliability) exacerbates the need for novel orchestration solutions able to manage end-to-end network slice resources across different domains, while satisfying stringent service level agreements and specific traffic requirements. An end-to-end network slicing orchestration solution shall i) admit network slice requests
such that the overall system revenues are maximized, ii) provide the required resources across different network domains to fulfill the Service Level Agreements (SLAs), and iii) dynamically adapt the resource allocation based on the real-time traffic load, end-users' mobility, and instantaneous wireless channel statistics. Indeed, a mobile network represents a fast-changing scenario characterized by complex
spatio-temporal relationships connecting end-users' traffic demand with social activity and the economy. Legacy models that aim to provide dynamic resource allocation based on traditional traffic-demand forecasting techniques fail to capture these important aspects.
To close this gap, machine learning-aided solutions are quickly emerging as promising technologies to sustain, in a scalable manner, the set of operations required in the network slicing context. How to implement such resource allocation schemes among slices, while making the most efficient use of the networking resources composing the mobile infrastructure, is a key problem underlying the network slicing paradigm and is addressed in this thesis.
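As a toy illustration of revenue-driven slice admission, and not of the orchestration or overbooking algorithms developed in these theses, the following sketch greedily admits slice requests by revenue per unit of demanded resource under a single capacity budget; all identifiers and the single-resource model are hypothetical simplifications:

```python
def admit_slices(requests, capacity):
    """Greedy slice admission control.

    requests: list of (slice_id, resource_demand, revenue) tuples
    capacity: total resource budget of the shared infrastructure

    Sorts requests by revenue per unit of demanded resource and admits
    them while capacity remains; returns the admitted slice ids.
    """
    admitted = []
    for slice_id, demand, revenue in sorted(
            requests, key=lambda r: r[2] / r[1], reverse=True):
        if demand <= capacity:
            admitted.append(slice_id)
            capacity -= demand      # reserve resources for this slice
    return admitted
```

A real orchestrator must additionally handle multiple resource domains (radio, transport, compute), SLA fulfilment over time, and statistical multiplexing across slices; the greedy knapsack heuristic only conveys the revenue-maximizing admission decision in miniature.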
Towards PACE-CAD Systems
(2022)
Despite phenomenal advancements in the availability of medical image datasets and the development of modern classification algorithms, Computer-Aided Diagnosis (CAD) has had limited practical exposure in the real-world clinical workflow. This is primarily because of the inherently demanding and sensitive nature of medical diagnosis, which can have far-reaching and serious repercussions in case of misdiagnosis. In this work, a paradigm called PACE (Pragmatic, Accurate, Confident, & Explainable) is presented as a set of must-have features for any CAD system. Diagnosis of glaucoma using Retinal Fundus Images (RFIs) is taken as the primary use case for the development of various methods that may enrich an ordinary CAD system with PACE. However, depending on the specific requirements of different methods, other application areas in ophthalmology and dermatology have also been explored.
A pragmatic CAD system is one that can perform reliably in a day-to-day clinical setup. In this research, two of the possibly many aspects of a pragmatic CAD are addressed. Firstly, observing that the existing medical image datasets are small and not representative of images taken in the real world, a large RFI dataset for glaucoma detection is curated and published. Secondly, realising that a salient attribute of a reliable and pragmatic CAD is its ability to perform in a range of clinically relevant scenarios, classification of 622 unique cutaneous diseases in one of the largest publicly available datasets of skin lesions is successfully performed.
Accuracy is one of the most essential metrics of any CAD system's performance. Domain knowledge relevant to three types of diseases, namely glaucoma, Diabetic Retinopathy (DR), and skin lesions, is utilised in an attempt to improve the accuracy. For glaucoma, a two-stage framework for automatic Optic Disc (OD) localisation and glaucoma detection is developed, which marked a new state of the art for both glaucoma detection and OD localisation. To identify DR, a model is proposed that combines coarse-grained with fine-grained classifiers and grades the disease in four stages with respect to severity. Lastly, different methods of modelling and incorporating metadata are examined, and their effect on a model's classification performance is studied.
Confidence in diagnosing a disease is as important as the diagnosis itself. One of the biggest obstacles to the successful deployment of CAD in the real world is that a medical diagnosis cannot be readily decided based on an algorithm's output alone. Therefore, a hybrid CNN architecture is proposed with the convolutional feature extractor trained using point estimates and a dense classifier trained using Bayesian estimates. Evaluation on 13 publicly available datasets shows the superiority of this method in terms of classification accuracy; in addition, the method provides an estimate of uncertainty for every prediction.
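The hybrid idea of a deterministic (point-estimate) feature extractor followed by a stochastic classifier head can be sketched with Monte Carlo dropout, a common stand-in for Bayesian inference in neural networks. This NumPy toy is illustrative only and is not the thesis's architecture; the weights, layer sizes, and dropout rate are all made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed ("point estimate") feature extractor: one linear layer.
W_feat = rng.normal(size=(8, 4))

# Hypothetical stochastic classifier head: dropout stays active at test
# time (Monte Carlo dropout), so repeated forward passes differ.
W_clf = rng.normal(size=(4, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict_with_uncertainty(x, n_samples=100, p_drop=0.5):
    feats = np.tanh(W_feat.T @ x)                   # deterministic features
    probs = []
    for _ in range(n_samples):
        mask = rng.random(feats.shape) > p_drop
        dropped = feats * mask / (1 - p_drop)       # inverted dropout
        probs.append(softmax(W_clf.T @ dropped))
    probs = np.array(probs)
    mean = probs.mean(axis=0)                       # predictive distribution
    entropy = -(mean * np.log(mean + 1e-12)).sum()  # uncertainty estimate
    return mean, entropy
```

The spread across the sampled forward passes, summarised here by the predictive entropy, serves as a per-prediction uncertainty signal of the kind the abstract describes.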
Explainability of AI-driven algorithms has become a legal requirement after Europe’s General Data Protection Regulations came into effect. This research presents a framework for easy-to-understand textual explanations of skin lesion diagnosis. The framework is called ExAID (Explainable AI for Dermatology) and relies upon two fundamental modules. The first module uses any deep skin lesion classifier and performs detailed analysis on its latent space to map human-understandable disease-related concepts to the latent representation learnt by the deep model. The second module proposes Concept Localisation Maps, which extend Concept Activation Vectors by locating significant regions corresponding to a learned concept in the latent space of a trained image classifier.
This thesis probes many viable solutions to equip a CAD system with PACE. However, it is noted that some of these methods require specific attributes in datasets and, therefore, not all methods may be applied on a single dataset. Regardless, this work anticipates that consolidating PACE into a CAD system can not only increase the confidence of medical practitioners in such tools but also serve as a stepping stone for the further development of AI-driven technologies in healthcare.
Indoor positioning systems (IPS) have become increasingly popular in recent years in industrial, scientific, and medical areas. The rapidly growing demand for accurate position information has attracted much attention and effort towards developing various kinds of positioning systems that are characterized by parameters like accuracy, robustness, latency, cost, etc. These systems have been successfully used in many applications such as automation in manufacturing, patient tracking in hospitals, action detection for human-machine interaction, and so on.
The different performance requirements of various applications have led to the existence of greatly diverse technologies, which can be categorized into two groups: inertial positioning (involving inertial sensors embedded in the device to be located) and external sensing (geometry estimation based on signal measurement). In positioning systems based on external sensing, the input signal used for localization can come from many sources, such as visual or infrared signals in optical methods, sound or ultrasound in acoustic methods, and radio-frequency-based methods. This dissertation gives a recapitulative survey of a number of existing popular solutions for indoor positioning systems. Basic principles of the individual technologies are demonstrated and discussed. By comparing performance characteristics like accuracy, robustness, and cost, a comprehensive review of the properties of each technology is presented, which serves as guidance for designing a location-sensing system for indoor applications.
This thesis then focuses on presenting the development of a high-precision IPS prototype system based on RF signals, from the concept through the implementation up to the evaluation. Development phases related to this work include the positioning scenario, involved technologies, hardware development, algorithm development, firmware generation, and prototype evaluation. The developed prototype is a narrow-band RF system suitable for flexible frequency selection in the UHF (300 MHz-3 GHz) and SHF (3 GHz-30 GHz) bands, enabling this technology to meet broad service preferences. The proposed system is fundamentally a hyperbolic position-fix system, which estimates a location by solving non-linear equations derived from time difference of arrival (TDoA) measurements. As the positioning accuracy largely depends on the temporal resolution of the signal acquisition, a dedicated RF front-end system is developed to achieve a time resolution ranging from multiple picoseconds down to less than one picosecond. On the algorithmic side, two processing units, a TDoA estimator and a hyperbolic-equation solver, constitute the digital signal processing system. In order to implement a real-time positioning system, the processing system is implemented on an FPGA platform. The corresponding firmware is generated from the algorithms modeled in MATLAB/Simulink using the high-level synthesis (HLS) tool HDL Coder. The prototype system is evaluated, and an accuracy of better than 1 cm is achieved. Better performance is potentially feasible by tuning controlling conditions such as the ADC sampling rate, ADC resolution, the interpolation process, a higher carrier frequency, or a more stable antenna. Although the proposed system is initially dedicated to indoor applications, it could also be a competitive candidate for outdoor positioning services.
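A minimal sketch of the hyperbolic position fix described above: given range differences (speed of light times the TDoA values) relative to a reference anchor, a Gauss-Newton iteration refines a position estimate by linearizing the non-linear range-difference equations. This NumPy example is illustrative only and omits the real-time FPGA pipeline; all names are assumptions:

```python
import numpy as np

def solve_tdoa(anchors, tdoa_ranges, x0, iters=50):
    """Gauss-Newton solver for a 2D hyperbolic position fix.

    anchors:     (n, 2) receiver coordinates; anchor 0 is the reference
    tdoa_ranges: (n-1,) measured range differences c * TDoA, each
                 relative to anchor 0
    x0:          initial position guess
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        d = np.linalg.norm(anchors - x, axis=1)     # distances to anchors
        r = (d[1:] - d[0]) - tdoa_ranges            # residuals
        # Jacobian of each residual with respect to the position x
        J = (x - anchors[1:]) / d[1:, None] - (x - anchors[0]) / d[0]
        x -= np.linalg.lstsq(J, r, rcond=None)[0]   # Gauss-Newton step
    return x
```

With noiseless measurements and a reasonable initial guess the iteration converges to the true position; with real picosecond-resolution TDoA measurements the same least-squares structure simply absorbs the measurement noise.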
Private data analytics systems preferably provide required analytic accuracy to analysts and specified privacy to individuals whose data is analyzed. Devising a general system that works for a broad range of datasets and analytic scenarios has proven to be difficult.
Despite the advent of differentially private systems with proven formal privacy guarantees, industry still uses inferior ad-hoc mechanisms that provide better analytic accuracy. Differentially private mechanisms often need to add large amounts of noise to statistical results, which impairs their usability.
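The noise addition referred to above is typically realised with the Laplace mechanism, which calibrates the noise scale to the query's sensitivity divided by the privacy parameter epsilon. A minimal sketch (illustrative, not one of the mechanisms developed in this thesis):

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release `true_value` with epsilon-differential privacy.

    Adds Laplace noise with scale sensitivity / epsilon: a smaller
    epsilon means stronger privacy but a noisier, less accurate answer,
    which is exactly the accuracy-privacy tension described above.
    """
    scale = sensitivity / epsilon
    return true_value + rng.laplace(0.0, scale)
```

For a counting query the sensitivity is 1 (one individual changes the count by at most one), so releasing a count with epsilon = 0.1 adds noise with expected absolute error 10, while epsilon = 10 adds expected absolute error of only 0.1.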
In my thesis I follow two approaches to improve the usability of private data analytics systems in general and differentially private systems in particular. First, I revisit ad-hoc mechanisms and explore the possibilities of systems that do not provide Differential Privacy or provide only a weak version thereof. Based on an attack analysis, I devise a set of new protection mechanisms, including Query Based Bookkeeping (QBB). In contrast to previous systems, QBB requires only the history of analysts' queries in order to provide privacy protection. In particular, QBB does not require knowledge about the protected individuals' data.
In my second approach I use the insights gained with QBB to propose UniTraX, the first differentially private analytics system that makes it possible to analyze part of a protected dataset without affecting the other parts and without giving up accuracy. I show UniTraX's usability by way of multiple case studies on real-world datasets across different domains. UniTraX allows more queries than previous differentially private data analytics systems at moderate runtime overheads.
Temporal Data Management and Incremental Data Recomputation with Wide-column Stores and MapReduce
(2017)
In recent years, "Big Data" has become an important topic in academia and industry. To handle the challenges and problems caused by Big Data, new types of data storage systems called "NoSQL stores" (meaning "Not only SQL") have emerged.
"Wide-column stores" are one kind of NoSQL store. Compared to relational database systems, wide-column stores introduce a new data model and new IRUD (Insert, Retrieve, Update and Delete) semantics with support for schema flexibility, single-row transactions, and data expiration constraints. Moreover, each column stores multiple data versions with associated timestamps. Well-known examples are Google's "Bigtable" and its open-source counterpart "HBase". Recently, such systems have increasingly been used in business intelligence and data warehouse environments to provide decision support, controlling, and revision capabilities.
Besides managing the current values, data warehouses also require management and processing of historical, time-related data. Data warehouses frequently employ techniques for processing changes in various data sources and incrementally applying such changes to the warehouse to keep it up to date. Although both incremental data warehouse maintenance and temporal data management have been the subject of intensive research in the relational database field, and commercial database products have finally picked up the ability for temporal data processing and management, such capabilities have not been explored systematically for today's wide-column stores.
This thesis helps to address the shortcomings mentioned above. It carefully analyzes the properties of wide-column stores and the applicability of mechanisms for temporal data management and incremental data warehouse maintenance known from relational databases, extends well-known approaches, and develops new capabilities for providing equivalent support in wide-column stores.
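The multi-version column model described above can be sketched in a few lines: each (column, timestamp) cell keeps its own value, and reads may ask for the latest version or for the version valid at a given time, loosely mimicking HBase-style cell versions. The class and method names are illustrative, not an actual wide-column store API:

```python
import bisect

class VersionedRow:
    """Toy sketch of a wide-column row with timestamped cell versions."""

    def __init__(self):
        self._cells = {}   # column -> sorted list of (timestamp, value)

    def put(self, column, timestamp, value):
        """Insert a new version of a cell, keeping versions time-ordered."""
        versions = self._cells.setdefault(column, [])
        bisect.insort(versions, (timestamp, value))

    def get(self, column, as_of=None):
        """Read the latest version, or the newest version at or before
        the `as_of` timestamp (a temporal 'as of' query)."""
        versions = self._cells.get(column, [])
        if not versions:
            return None
        if as_of is None:
            return versions[-1][1]
        i = bisect.bisect_right([t for t, _ in versions], as_of)
        return versions[i - 1][1] if i else None
```

Storing every version rather than overwriting in place is what makes both temporal "as of" queries and incremental recomputation of derived data possible: a maintenance job can read exactly the versions that changed since its last run.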
Towards A Non-tracking Web
(2016)
Today, many publishers (e.g., websites, mobile application developers) commonly use third-party analytics services and social widgets. Unfortunately, this scheme allows these third parties to track individual users across the web, creating privacy concerns and leading to reactions to prevent tracking via blocking, legislation and standards. While improving user privacy, these efforts do not consider the functionality third-party tracking enables publishers to use: to obtain aggregate statistics about their users and increase their exposure to other users via online social networks. Simply preventing third-party tracking without replacing the functionality it provides cannot be a viable solution; leaving publishers without essential services will hurt the sustainability of the entire ecosystem.
In this thesis, we present alternative approaches to bridge this gap between privacy for users and functionality for publishers and other entities. We first propose a general and interaction-based third-party cookie policy that prevents third-party tracking via cookies, yet enables social networking features for users when wanted, and does not interfere with non-tracking services for analytics and advertisements. We then present a system that enables publishers to obtain rich web analytics information (e.g., user demographics, other sites visited) without tracking the users across the web. While this system requires no new organizational players and is practical to deploy, it requires publishers to pre-define answer values for the queries, which may not be feasible for many analytics scenarios (e.g., search phrases used, free-text photo labels). Our second system complements the first by enabling publishers to discover previously unknown string values to be used as potential answers in a privacy-preserving fashion and with low computation overhead for clients as well as servers. These systems suggest that it is possible to provide non-tracking services with (at least) the same functionality as today's tracking services.