Crowd condition monitoring concerns both crowd safety and business performance metrics. The research problem to be solved is a crowd condition estimation approach that enables and supports the supervision of mass events by first responders and marketing experts, but that is also targeted towards supporting social scientists, journalists, historians, public relations experts, community leaders, and political researchers. Real-time insights into the crowd condition are desired for quick reactions, and historic crowd condition measurements are desired for profound post-event analysis.
This thesis aims to provide a systematic understanding of different approaches for crowd condition estimation by relying on 2.4 GHz signals and their variation in crowds of people, proposes and categorizes possible sensing approaches, applies supervised machine learning algorithms, and demonstrates experimental evaluation results. I categorize four sensing approaches: first, stationary sensors sensing crowd-centric signal sources; second, stationary sensors sensing other stationary signal sources (either opportunistic or special-purpose); third, a few volunteers within the crowd equipped with sensors sensing surrounding crowd-centric device signals (individually, in a single group, or collaboratively) within a small region; and fourth, a small subset of participants within the crowd equipped with sensors and roaming throughout a whole city to sense wireless crowd-centric signals.
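To make the supervised pipeline shared by these four approaches concrete, here is a minimal illustrative sketch, not the thesis's actual implementation: unique device counts and signal-strength statistics from 2.4 GHz scans serve as features for a regressor predicting crowd density. The feature set, the synthetic data generator, and the random forest model are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def extract_features(scan):
    """Summarize one scan window; scan is a list of (device_id, rssi_dbm)."""
    rssi = np.array([r for _, r in scan], dtype=float)
    n_unique = len({d for d, _ in scan})
    return np.array([n_unique, rssi.mean(), rssi.std(), len(rssi)])

def synth_scan(density, area=100.0):
    """Synthetic scan window: denser crowds yield more observed devices."""
    n_dev = rng.poisson(density * area)
    return [(i, rng.normal(-70, 8)) for i in range(max(n_dev, 1))]

# Training set: scan-window features with ground-truth densities
# (people per square meter), as would otherwise come from camera counts.
densities = rng.uniform(0.05, 1.5, size=300)
X = np.array([extract_features(synth_scan(d)) for d in densities])

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, densities)

# Real-time use: estimate density for a freshly observed window.
est = model.predict(extract_features(synth_scan(0.5))[None, :])
print(f"estimated density: {est[0]:.3f} people/m^2")
```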
I present and evaluate an approach with meshed stationary sensors sensing crowd-centric devices. This was demonstrated and empirically evaluated within an industrial project during three of the world's largest automotive exhibitions. With over 30 meshed stationary sensors in an optimized setup across 6,400 m², I achieved a mean absolute error of the crowd density of just 0.0115 people per square meter, which corresponds to an average of below 6% mean relative error from the ground truth. I validate the contextual crowd condition anomaly detection method during the visit of Chancellor Angela Merkel and during a large press conference at the exhibition. I present the approach of opportunistically sensing stationary wireless signal variations and validate it during the Hannover CeBIT exhibition with 80 opportunistic sources, achieving a crowd condition estimation relative error of below 12% while relying only on surrounding signals influenced by humans. Pursuing this direction further, I present an approach with dedicated signal sources and sensors to estimate the condition of shared office environments. I demonstrate methods that are viable even for detecting low-density static crowds, such as people sitting at their desks, and evaluate them in an eight-person office scenario. I present the approach of mobile crowd density estimation by a group of sensors detecting other crowd-centric devices in their proximity, with a crowd density classification accuracy of 66% (an improvement of over 22% over an individual sensor) during the crowded Oktoberfest event. I propose a collaborative mobile sensing approach which makes the system more robust against variations that result from the background of the people rather than the crowd condition. It relies on differential features that take into account the link structure between actively scanning devices, the ratio between values observed by different devices, the ratio of discovered crowd devices over time, the team-wise diversity of discovered devices, the number of semi-continuous device visibility periods, and device visibility durations. I validate the approach in multiple experiments, including the Kaiserslautern European soccer championship public viewing event, and evaluate the collaborative mobile sensing approach with a crowd condition estimation accuracy of 77%, outperforming previous methods by 21%. I demonstrate the feasibility of deploying the wireless crowd condition sensing approach at citywide scale during an event in Zurich with 971 actively sensing participants, outperforming the reference method by 24% on average.
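As a rough illustration of such differential features, the sketch below computes three of them from a hypothetical team scan log; the log format, feature definitions, and function names are assumptions for illustration, not the thesis's actual code.

```python
from itertools import combinations

# Hypothetical team scan log: for each actively scanning team member, the
# set of crowd device IDs discovered in each of three time slots.
logs = {
    "A": [{"d1", "d2"}, {"d1", "d3"}, {"d3", "d4"}],
    "B": [{"d2"}, {"d2", "d3"}, {"d3"}],
    "C": [{"d1", "d2", "d5"}, {"d5"}, {"d4", "d5"}],
}
N_SLOTS = 3

def link_structure(logs, slot):
    """Fraction of scanner pairs that discovered at least one common device."""
    pairs = list(combinations(logs, 2))
    linked = sum(1 for a, b in pairs if logs[a][slot] & logs[b][slot])
    return linked / len(pairs)

def teamwise_diversity(logs, slot):
    """Distinct devices seen by the whole team relative to individual sightings."""
    union = set().union(*(logs[m][slot] for m in logs))
    total = sum(len(logs[m][slot]) for m in logs)
    return len(union) / total

def visibility_periods(logs):
    """Per device: number of semi-continuous visibility periods, i.e. maximal
    runs of consecutive slots in which anyone on the team saw the device."""
    slots = [set().union(*(logs[m][t] for m in logs)) for t in range(N_SLOTS)]
    periods = {}
    for d in set().union(*slots):
        runs, prev = 0, False
        for seen in slots:
            cur = d in seen
            if cur and not prev:
                runs += 1
            prev = cur
        periods[d] = runs
    return periods

print(link_structure(logs, 0), teamwise_diversity(logs, 0))
print(visibility_periods(logs))
```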
The rising demand for machine learning (ML) models has become a growing concern for stakeholders who depend on automatic decisions. In today's world, black-box solutions (in particular deep neural networks) are continuously being deployed in more and more high-stakes scenarios such as medical diagnosis or autonomous vehicles. Unfortunately, when these opaque models make predictions that do not align with our expectations, finding a valid justification is simply not possible.
Explainable Artificial Intelligence (XAI) has emerged in response to our need for finding the reasons that justify what a machine sees but we do not. However, contributions in this field are mostly centered around local structures such as individual neurons or single input samples. Global characteristics that govern the behavior of a model are still poorly understood or have not been explored yet. An aggravating factor is the lack of a standard terminology to contextualize and compare contributions in this field. This lack of consensus is keeping the ML community from ultimately moving away from black boxes and from starting to create systematic methods to design models that are interpretable by design.
So, what are the global patterns that govern the behavior of modern neural networks, and what can we do to make these models more interpretable from the start?
This thesis delves into both issues, unveiling patterns in existing models and establishing strategies that lead to more interpretable architectures. The investigated aspects include biases stemming from imbalanced datasets, the quantification of model capacity, and robustness against adversarial attacks. For new models that are interpretable by design, this work proposes a strategy to add more structure to neural networks, based on auxiliary tasks that are semantically related to the main objective. This strategy results from applying a novel theoretical framework proposed as part of this work. The framework is meant to contextualize and compare contributions in XAI by providing actionable definitions for terms like "explanation" and "interpretation."
Altogether, these contributions address the pressing demand for understanding more about the global behavior of modern deep neural networks. More importantly, they can be used as a blueprint for designing novel and more interpretable architectures. By tackling issues from both the present and the future of XAI, the results of this work are a firm step towards more interpretable models for computer vision.
In recent years, enormous progress has been made in the field of Artificial Intelligence (AI). In particular, the introduction of Deep Learning and end-to-end learning, the availability of large datasets, and the necessary computational power in the form of specialised hardware have allowed researchers to build systems with previously unseen performance in areas such as computer vision, machine translation, and game playing. In parallel, the Semantic Web and its Linked Data movement have published many interlinked RDF datasets, forming the world's largest decentralised, publicly available knowledge base.
Despite these scientific successes, all current systems are still narrow AI systems. Each of them is specialised for a specific task and cannot easily be adapted to other human intelligence tasks, as would be necessary for Artificial General Intelligence (AGI). Furthermore, most of the currently developed systems are not able to learn by making use of freely available knowledge such as that provided by the Semantic Web. The autonomous incorporation of new knowledge is, however, one of the preconditions for human-like problem solving.
This work provides a small step towards teaching machines such human-like reasoning on freely available knowledge from the Semantic Web. We investigate how human associations, one of the building blocks of our thinking, can be simulated with Linked Data. The two main results of these investigations are a ground truth dataset of semantic associations and a machine learning algorithm that is able to identify patterns for them in huge knowledge bases.
The ground truth dataset of semantic associations consists of DBpedia entities that are known to be strongly associated by humans. The dataset is published as RDF and can be used for future research.
The developed machine learning algorithm is an evolutionary algorithm that can learn SPARQL queries from a given SPARQL endpoint based on a given list of exemplary source-target entity pairs. The algorithm operates in an end-to-end learning fashion, extracting features in the form of graph patterns without the need for human intervention. The learned patterns form a feature space adapted to the given list of examples and can be used to predict target candidates from the SPARQL endpoint for new source nodes. On our semantic association ground truth dataset, our evolutionary graph pattern learner reaches a Recall@10 of over 63% and an MRR (and MAP) of over 43%, outperforming all baselines. With a Recall@1 of over 34%, it even reaches average human top-response prediction performance. We also demonstrate how the graph pattern learner can be applied to other interesting areas without modification.
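To give a feel for how such an evolutionary graph pattern learner might operate, here is a heavily simplified sketch; the fixed pattern pool, the fitness definition, and the re-drawing "mutation" are assumptions for illustration and do not reproduce the thesis's algorithm, which mutates and recombines the triple patterns themselves. Candidate patterns are scored by how many training pairs they connect, via ASK queries against the public DBpedia endpoint.

```python
import random
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = SPARQLWrapper("https://dbpedia.org/sparql")
ENDPOINT.setReturnFormat(JSON)

PREFIXES = """PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dct: <http://purl.org/dc/terms/>"""

# Tiny hypothetical pool of candidate graph patterns connecting ?s and ?t.
PATTERN_POOL = [
    "?s dbo:wikiPageWikiLink ?t .",
    "?t dbo:wikiPageWikiLink ?s .",
    "?s dct:subject ?c . ?t dct:subject ?c .",
    "?s dbo:wikiPageWikiLink ?m . ?m dbo:wikiPageWikiLink ?t .",
]

def holds(pattern, source, target):
    """ASK whether the pattern connects a single source-target pair."""
    ENDPOINT.setQuery(f"""{PREFIXES}
        ASK {{ VALUES ?s {{ <{source}> }} VALUES ?t {{ <{target}> }}
               {pattern} }}""")
    return ENDPOINT.query().convert()["boolean"]

def fitness(pattern, pairs):
    """Fraction of exemplary pairs the pattern connects."""
    return sum(holds(pattern, s, t) for s, t in pairs) / len(pairs)

def evolve(pairs, pop_size=4, generations=3):
    population = random.sample(PATTERN_POOL, pop_size)
    for _ in range(generations):
        scored = sorted(population, key=lambda p: fitness(p, pairs), reverse=True)
        survivors = scored[: pop_size // 2]
        # Simplified "mutation": refill the population from the pool.
        population = survivors + random.sample(PATTERN_POOL, pop_size - len(survivors))
    return max(population, key=lambda p: fitness(p, pairs))

pairs = [("http://dbpedia.org/resource/Berlin",
          "http://dbpedia.org/resource/Germany")]
print(evolve(pairs))
```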
Life insurance companies are required by the Solvency II regime to hold capital against economically adverse developments. This ensures that they are continuously able to meet their payment obligations towards the policyholders. When relying on an internal model approach, an insurer's solvency capital requirement is defined as the 99.5% value-at-risk of its full loss probability distribution over the coming year. In the introductory part of this thesis, we provide the actuarial modeling tools and risk aggregation methods by which companies can derive these forecasts. Since the industry still lacks the computational capacity to fully simulate these distributions, insurers have to resort to suitable approximation techniques such as the least-squares Monte Carlo (LSMC) method. The key idea of LSMC is to run only a few wisely selected simulations and to process their output further to obtain a risk-factor-dependent proxy function of the loss.

We dedicate the first part of this thesis to establishing a theoretical framework for the LSMC method. We start with how LSMC for calculating capital requirements relates to its original use in American option pricing. Then we decompose LSMC into four steps: in the first, the Monte Carlo simulation setting is defined; the second and third steps serve the calibration and validation of the proxy function; and the fourth step yields the loss distribution forecast by evaluating the proxy model. While guiding through the steps, we address practical challenges and propose an adaptive calibration algorithm. We conclude with a slightly disguised real-world application.

The second part builds upon the first by taking up the LSMC framework and diving deeper into its calibration step. After a literature review and a basic recapitulation, various adaptive machine learning approaches relying on least-squares regression and model selection criteria are presented as solutions to the proxy modeling task. The studied approaches range from ordinary and generalized least-squares regression variants over GLM and GAM methods to MARS and kernel regression routines. We justify the combinability of the regression ingredients mathematically and compare their approximation quality in slightly altered real-world experiments. Thereby, we perform sensitivity analyses, discuss numerical stability, and run comprehensive out-of-sample tests. The scope of the analyzed regression variants extends to other high-dimensional variable selection applications.

Life insurance contracts with early exercise features can be priced by LSMC as well, due to their analogies to American options. In the third part of this thesis, equity-linked contracts with American-style surrender options and minimum interest rate guarantees payable upon contract termination are valued. We allow randomness and jumps in the movements of the interest rate, stochastic volatility, the stock market, and mortality. For the simultaneous valuation of numerous insurance contracts, a hybrid probability measure and an additional regression function are introduced. Furthermore, an efficient seed-related simulation procedure accounting for the forward discretization bias and a validation concept are proposed. An extensive numerical example rounds off the last part.
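A minimal numerical sketch of the LSMC idea, under entirely made-up dynamics: a handful of inner simulations per outer risk scenario yield noisy loss estimates, a polynomial proxy is fitted by least squares and validated out-of-sample, and the 99.5% value-at-risk is then read off cheap proxy evaluations. The loss function, polynomial degree, and scenario counts below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical one-dimensional risk factor (e.g. an equity shock) and a
# "true" loss that in practice would require heavy inner simulation.
def inner_simulation(x, n_inner=10):
    true_loss = 100 * x + 40 * x**2            # unknown in practice
    noise = rng.normal(0, 25, size=n_inner)    # inner Monte Carlo noise
    return (true_loss + noise).mean()

# Step 1: a few wisely selected outer (fitting) scenarios.
x_fit = np.linspace(-2, 2, 25)
y_fit = np.array([inner_simulation(x) for x in x_fit])

# Steps 2-3: calibrate a polynomial proxy by least squares and validate it
# on independently simulated scenarios.
coeffs = np.polyfit(x_fit, y_fit, deg=2)
x_val = rng.uniform(-2, 2, 10)
val_err = np.abs(np.polyval(coeffs, x_val)
                 - np.array([inner_simulation(x) for x in x_val])).mean()

# Step 4: evaluate the cheap proxy on many real-world scenarios and take
# the 99.5% quantile of the loss distribution as the capital requirement.
x_real = rng.normal(0, 1, 100_000)
losses = np.polyval(coeffs, x_real)
scr = np.quantile(losses, 0.995)
print(f"validation MAE: {val_err:.1f}, 99.5% VaR (SCR): {scr:.1f}")
```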
Automation, Industry 4.0, and artificial intelligence are playing an increasingly central role for companies. Artificial intelligence in particular is currently enabling new methods to achieve a higher level of automation. However, machine learning methods are usually most rewarding when a lot of data can be collected easily and patterns can be learned from these data. In the field of metrology, this can prove difficult depending on the area of work. Particularly for micrometer-scale measurements, acquiring measurement data often requires a lot of time, effort, patience, and money, so measurement data are not readily available. This raises the question of how meaningfully machine learning approaches can be applied to different domains of measurement tasks, especially in comparison to current solution approaches that use model-based methods. This thesis addresses this question by taking a closer look at two research areas in metrology: micro lead determination and reconstruction. Methods for micro lead determination are presented that determine the texture and tool axis with high accuracy. The methods are based on signal processing, classical optimization, and machine learning. In the second research area, reconstructions of cutting edges are considered in detail. The reconstruction methods here are based on the robust Gaussian filter and on deep neural networks, more specifically autoencoders. All results on micro lead and reconstruction are compared and contrasted in this thesis, and the applicability of the different approaches is evaluated.
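As a rough illustration of the robust Gaussian filter mentioned above (in the spirit of ISO 16610-31, though the weighting scheme and parameters here are simplified assumptions), the sketch below smooths a 1D profile with a Gaussian kernel while iteratively down-weighting outliers via Tukey's biweight, so that isolated spikes such as burrs do not distort the reconstructed edge.

```python
import numpy as np

def robust_gaussian_filter(z, cutoff, dx, n_iter=5):
    """Robustly smooth profile z (heights sampled every dx) with a Gaussian
    kernel of the given cutoff wavelength, iteratively down-weighting outliers."""
    x = np.arange(len(z)) * dx
    alpha = np.sqrt(np.log(2) / np.pi)     # standard Gaussian filter constant
    robust_w = np.ones_like(z)             # start with uniform robust weights
    for _ in range(n_iter):
        smooth = np.empty_like(z)
        for i in range(len(z)):
            k = np.exp(-np.pi * ((x - x[i]) / (alpha * cutoff)) ** 2)
            w = k * robust_w
            smooth[i] = np.sum(w * z) / np.sum(w)
        resid = z - smooth
        c = 4.4 * np.median(np.abs(resid))  # robust scale estimate
        robust_w = np.where(np.abs(resid) < c,
                            (1 - (resid / c) ** 2) ** 2, 0.0)  # Tukey biweight
    return smooth

# Synthetic cutting-edge profile with an artificial outlier spike (a "burr").
dx = 0.5                                    # sampling step in micrometers
z = np.sin(np.arange(200) * dx / 10.0)
z[80] += 5.0
reconstructed = robust_gaussian_filter(z, cutoff=25.0, dx=dx)
```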