Kaiserslautern - Fachbereich Informatik
Refine
Year of publication
- 2014 (15) (remove)
Document Type
- Doctoral Thesis (14)
- Preprint (1)
Language
- English (15)
Has Fulltext
- yes (15)
Keywords
- Activity recognition (2)
- Wearable computing (2)
- Adaptive Data Structure (1)
- Algorithm (1)
- Boosting (1)
- Classification (1)
- Code Generation (1)
- Computer graphics (1)
- Cycle Accuracy (1)
- Dataset (1)
Faculty / Organisational entity
Three dimensional (3d) point data is used in industry for measurement and reverse engineering. Precise point data is usually acquired with triangulating laser scanners or high precision structured light scanners. Lower precision point data is acquired by real-time structured light devices or by stereo matching with multiple cameras. The basic principle of all these methods is the so-called triangulation of 3d coordinates from two dimensional (2d) camera images.
This dissertation contributes a method for multi-camera stereo matching that uses a system of four synchronized cameras. A GPU based stereo matching method is presented to achieve a high quality reconstruction at interactive frame rates. Good depth resolution is achieved by allowing large disparities between the images. A multi level approach on the GPU allows a fast processing of these large disparities. In reverse engineering, hand-held laser scanners are used for the scanning of complex shaped objects. The operator of the scanner can scan complex regions slower, multiple times, or from multiple angles to achieve a higher point density. Traditionally, computer aided design (CAD) geometry is reconstructed in a separate step after the scanning. Errors or missing parts in the scan prevent a successful reconstruction. The contribution of this dissertation is an on-line algorithm that allows the reconstruction during the scanning of an object. Scanned points are added to the reconstruction and improve it on-line. The operator can detect the areas in the scan where the reconstruction needs additional data.
First, the point data is thinned out using an octree based data structure. Local normals and principal curvatures are estimated for the reduced set of points. These local geometric values are used for segmentation using a region growing approach. Implicit quadrics are fitted to these segments. The canonical form of the quadrics provides the parameters of basic geometric primitives.
An improved approach uses so called accumulated means of local geometric properties to perform segmentation and primitive reconstruction in a single step. Local geometric values can be added and removed on-line to these means to get a stable estimate over a complete segment. By estimating the shape of the segment it is decided which local areas are added to a segment. An accumulated score estimates the probability for a segment to belong to a certain type of geometric primitive. A boundary around the segment is reconstructed using a growing algorithm that ensures that the boundary is closed and avoids self intersections.
This thesis discusses several applications of computational topology to the visualization
of scalar fields. Scalar field data come from different measurements and simulations. The
intrinsic properties of this kind of data, which make the visualization of it to a complicated
task, are the large size and presence of noise. Computational topology is a powerful tool
for automatic feature extraction, which allows the user to interpret the information contained
in the dataset in a more efficient way. Utilizing it one can make the main purpose of
scientific visualization, namely extracting knowledge from data, a more convenient task.
Volume rendering is a class of methods designed for realistic visual representation of 3D
scalar fields. It is used in a wide range of applications with different data size, noise
rate and requirements on interactivity and flexibility. At the moment there is no known
technique which can meet the needs of every application domain, therefore development
of methods solving specific problems is required. One of such algorithms, designed for
rendering of noisy data with high frequencies is presented in the first part of this thesis.
The method works with multidimensional transfer functions and is especially suited for
functions exhibiting sharp features. Compared with known methods the presented algorithm
achieves better visual quality with a faster performance in presence of mentioned
features. An improvement on the method utilizing a topological theory, Morse theory, and
a topological construct, Morse-Smale complex, is also presented in this part of the thesis.
The improvement allows for performance speedup at a little precomputation and memory
cost.
The usage of topological methods for feature extraction on a real world dataset often
results in a very large feature space which easily leads to information overflow. Topology
simplification is designed to reduce the number of features and allow a domain expert
to concentrate on the most important ones. In the terms of Morse theory features are
represented by critical points. An importance measure which is usually used for removing
critical points is called homological persistence. Critical points are cancelled pairwise
according to their homological persistence value. In the presence of outlier-like noise
homological persistence has a clear drawback: the outliers get a high importance value
assigned and therefore are not being removed. In the second part of this thesis a new
importance measure is presented which is especially suited for data with outliers. This
importance measure is called scale space persistence. The algorithm for the computation
of this measure is based on the scale space theory known from the area of computer
vision. The development of a critical point in scale space gives information about its
spacial extent, therefore outliers can be distinguished from other critical points. The usage
of the presented importance measure is demonstrated on a real world application, crater
identification on a surface of Mars.
The third part of this work presents a system for general interactive topology analysis
and exploration. The development of such a system is motivated by the fact that topological
methods are often considered to be complicated and hard to understand, because
application of topology for visualization requires deep understanding of the mathematical
background behind it. A domain expert exploring the data using topology for feature
extraction needs an intuitive way to manipulate the exploration process. The presented
system is based on an intuitive notion of a scene graph, where the user can choose and
place the component blocks to achieve an individual result. This way the domain expert
can extract more knowledge from given data independent on the application domain. The
tool gives the possibility for calculation and simplification of the underlying topological
structure, Morse-Smale complex, and also the visualization of parts of it. The system also
includes a simple generic query language to acquire different structures of the topological
structure at different levels of hierarchy.
The fourth part of this dissertation is concentrated on an application of computational
geometry for quality assessment of a triangulated surface. Quality assessment of a triangulation
is called surface interrogation and is aimed for revealing intrinsic irregularities
of a surface. Curvature and continuity are the properties required to design a visually
pleasing geometric object. For example, a surface of a manufactured body usually should
be convex without bumps of wiggles. Conventional rendering methods hide the regions
of interest because of smoothing or interpolation. Two new methods which are presented
here: curvature estimation using local fitting with B´ezier patches and computation of reflection
lines for visual representation of continuity, are specially designed for assessment
problems. The examples and comparisons presented in this part of the thesis prove the
benefits of the introduced algorithms. The methods are also well suited for concurrent visualization
of the results from simulation and surface interrogation to reveal the possible
intrinsic relationship between them.
Building interoperation among separately developed software units requires checking their conceptual assumptions and constraints. However, eliciting such assumptions and constraints is time consuming and is a challenging task as it requires analyzing each of the interoperating software units. To address this issue we proposed a new conceptual interoperability analysis approach which aims at decreasing the analysis cost and the conceptual mismatches between the interoperating software units. In this report we present the design of a planned controlled experiment for evaluating the effectiveness, efficiency, and acceptance of our proposed conceptual interoperability analysis approach. The design includes the study objectives, research questions, statistical hypotheses, and experimental design. It also provides the materials that will be used in the execution phase of the planned experiment.
Researchers and analysts in modern industrial and academic environments are faced with a daunting amount of multivariate data. While there has been significant development in the areas of data mining and knowledge
discovery, there is still the need for improved visualizations and generic solutions. The state-of-the-art in visual analytics and exploratory data visualization is to incorporate more profound analysis methods while focusing on improving interactive abilities, in order to support data analysts in gaining new insights through visual exploration and hypothesis building.
In the research field of exploratory data visualization, this thesis contributes new approaches in dimension reduction that tackle a number of shortcomings in state-of-the-art methods, such as interpretability and ambiguity. By combining methods from several disciplines, we describe how ambiguity can be countered effectively by visualizing coordinate values within a lower-dimensional embedding, thereby focusing on the display of the structural composition of high-dimensional data and on an intuitive depiction of inherent global relationships. We also describe how properties and alignment of high-dimensional manifolds can be analyzed in different levels of detail by means of a self-embedding hierarchy of local projections, each using full degree of freedom, while keeping the global context.
To the application field of air quality research, the thesis provides novel means for the research of aerosol source contributions. Triggered by this particularly challenging application problem, we instigate a new research direction in the area of visual analytics by describing a methodology to model-based visual analysis that (i) allows the scientist to be “in the loop” of computations and (ii) enables him to verify and control the analysis process, in order to steer computations towards physical meaning. Careful reflection of our work in this application has led us to derive key design choices that underlie and transcend beyond application-specific solutions. As a result, we describe a general design methodology to computing parameters of a pre-defined analytical model that map to multivariate data. Core applications areas that can benefit from our approach are within engineering disciplines, such as civil, chemical, electrical, and mechanical engineering, as well as in geology, physics, and biology.
If an automated system is tasked to provide services such as search or clustering of information on an information repository, the quality of the output depends a lot on the information that is available to the system in machine-readable form. Simple text, for example, is machine-readable only in a very limited sense. Advanced services typically need to derive other representations of the text (e.g., sets of keywords) as input for their core algorithms. Some services might need information that cannot be derived from the resource in question alone, but is available as separate metadata only, such as usage information. Annotations can be used to carry this information.
This thesis focuses on so-called ontology-based annotations. In contrast to other forms of annotations such as Tags (arbitrary strings that users can assign to resources), ontology-based annotations conform to a predefined data structure and class hierarchy. An advantage of this approach is that rich information can be stored in a well-structured way in the annotations; a drawback is that users need to be familiar with the hierarchy and other design decisions of the underlying ontology used for annotations.
Two scenarios are considered in this thesis:
First, a document-based scenario in which text annotations are used to represent both information about the text content and usage and user context information in a multi-user setting with mostly objective annotation criteria; second, a resource-based scenario whose annotation model focuses on multi-user settings with subjective annotation criteria, using (dis-)similarities in user annotations to derive user similarity metrics, and building personalized views from this information.
Finally, the prototypical systems that have been developed throughout this thesis get evaluated, proving the concepts presented in this thesis.
This PhD-Thesis deals with the calculation and application of a new class of invariants, that can be used to recognize patterns in tensor fields (i.e. scalar fields, vector fields und matrix fields), and by the composition of scalar fields with delta-functions also to point-clouds.
In the first chapter an overview over already existing invariants is given.
In the second chapter the general definition of the new invariants is given:
starting with a tensor field a set of moment tensor is created via folding in tensor-product manner with different orders of the tensor product of the positional vector. From these, rotational invariant values are calculated via contraction of tensor products. An algorithm to get a complete and independent set of invariants from a given moment tensor set is described. Furthermore methods to make these sets of invariants invariant against translation, rotation, scaling, and affine transformation.
In the third chapter, a method to optimize the calculation of these sets of invariants is described: every invariant can be modeled as undirected graph comprising multiple sub-graphs representing partially contracted tensor products of the moment tensors.
The composition of the sets of invariants is optimized by a clever choice of the decomposition into sub-graphs, all paths creating a hyper-graph of sub-graphs where each node describes a composition step. Finally, C++-source-code is created, which optimized using the symmetry of the different tensors and tensor-products, and a comparison of the effort to other calculation methods of invariants is given.
The fourth chapter describes the application of the invariants to object recognition in point-clouds from 3D-scans. To do this, the invariants of sub-sets of point-clouds are stored for every known object. Afterwards, invariants are calculated from an unknown point-cloud and tried to find them in the database to assign it to one of the known objects. Benchmarks using three 3D-object databases are made testing time and recognition rate.
Embedded systems, ranging from very simple systems up to complex controllers, may
nowadays have quite challenging real-time requirements. Many embedded systems are reactive
systems that have to respond to environmental events and have to guarantee certain real-time
constrain. Their execution is usually divided into reaction steps, where in each step, the
system reads inputs from the environment and reacts to these by computing corresponding
outputs.
The synchronous Model of Computation (MoC) has proven to be well-suited for the
development of reactive real-time embedded systems whose paradigm directly reflects the
reactive nature of the systems it describes. Another advantage is the availability of formal
verification by model checking as a result of the deterministic execution based on a formal
semantics. Nevertheless, the increasing complexity of embedded systems requires to compensate
the natural disadvantages of model checking that suffers from the well-known state-space
explosion problem. It is therefore natural to try to integrate other verification methods with
the already established techniques. Hence, improvements to encounter these problems are
required, e.g., appropriate decomposition techniques, which encounter the disadvantages
of the model checking approach naturally. But defining decomposition techniques for synchronous
language is a difficult task, as a result of the inherent parallelism emerging from
the synchronous broadcast communication.
Inspired by the progress in the field of desynchronization of synchronous systems by
representing them in other MoCs, this work will investigate the possibility of adapting and use
methods and tools designed for other MoC for the verification of systems represented in the
synchronous MoC. Therefore, this work introduces the interactive verification of synchronous
systems based on the basic foundation of formal verification for sequential programs – the
Hoare calculus. Due to the different models of computation several problems have to be
solved. In particular due to the large amount of concurrency, several parts of the program
are active at the same point of time. In contrast to sequential programs, a decomposition
in the Hoare-logic style that is in some sense a symbolic execution from one control flow
location to another one requires the consideration of several flows here. Therefore, different
approaches for the interactive verification of synchronous systems are presented.
Additionally, the representation of synchronous systems by other MoCs and the influence
of the representation on the verification task by differently embedding synchronous system
in a single verification tool are elaborated.
The feasibility is shown by integration of the presented approach with the established
model checking methods by implementing the AIFProver on top of the Averest system.
The recognition of day-to-day activities is still a very challenging and important research topic. During recent years, a lot of research has gone into designing and realizing smart environ- ments in different application areas such as health care, maintenance, sports or smart homes. As a result, a large amount of sensor modalities were developed, different types of activity and context recognition services were implemented and the resulting systems were benchmarked using state-of-the-art evaluation techniques. However, so far hardly any of these approaches have found their way into the market and consequently into the homes of real end-users on a large scale. The reason for this is, that almost all systems have one or more of the following characteristics in common: expensive high-end or prototype sensors are used which are not af- fordable or reliable enough for mainstream applications; many systems are deployed in highly instrumented environments or so-called "living labs", which are far from real-life scenarios and are often evaluated only in research labs; almost all systems are based on complex system con- figurations and/or extensive training data sets, which means that a large amount of data must be collected in order to install the system. Furthermore, many systems rely on a user and/or environment dependent training, which makes it even more difficult to install them on a large scale. Besides, a standardized integration procedure for the deployment of services in existing environments and smart homes has still not been defined. As a matter of fact, service providers use their own closed systems, which are not compatible with other systems, services or sensors. It is clear, that these points make it nearly impossible to deploy activity recognition systems in a real daily-life environment, to make them affordable for real users and to deploy them in hundreds or thousands of different homes.
This thesis works towards the solution of the above mentioned problems. Activity and context recognition systems designed for large-scale deployment and real-life scenarios are intro- duced. Systems are based on low-cost, reliable sensors and can be set up, configured and trained with little effort, even by technical laymen. It is because of these characteristics that we call our approach "minimally invasive". As a consequence, large amounts of training data, that are usu- ally required by many state-of-the-art approaches, are not necessary. Furthermore, all systems were integrated unobtrusively in real-world/similar to real-world environments and were evalu- ated under real-life, as well as similar to real-life conditions. The thesis addresses the following topics: First, a sub-room level indoor positioning system is introduced. The system is based on low-cost ceiling cameras and a simple computer vision tracking approach. The problem of user identification is solved by correlating modes of locomotion patterns derived from the trajectory of unidentified objects and on-body motion sensors. Afterwards, the issue of recognizing how and what mainstream household devices have been used for is considered. Based on a low-cost microphone, the water consumption of water-taps can be approximated by analyzing plumbing noise. Besides that, operating modes of mainstream electronic devices were recognized by using rule-based classifiers, electric current features and power measurement sensors. As a next step, the difficulty of spotting subtle, barely distinguishable hand activities and the resulting object interactions, within a data set containing a large amount of background data, is addressed. The problem is solved by introducing an on-body core system which is configured by simple, one-time physical measurements and minimal data collections. The lack of large training sets is compensated by fusing the system with activity and context recognition systems, that are able to reduce the search space observed. Amongst other systems, previously introduced approaches and ideas are revisited in this section. An in-depth evaluation shows the impact of each fusion procedure on the performance and run-time of the system. The approaches introduced are able to provide significantly better results than a state-of-the-art inertial system using large amounts of training data. The idea of using unobtrusive sensors has also been applied to the field of behavior analysis. Integrated smartphone sensors are used to detect behavioral changes of in- dividuals due to medium-term stress periods. Behavioral parameters related to location traces, social interactions and phone usage were analyzed to detect significant behavioral changes of individuals during stressless and stressful time periods. Finally, as a closing part of the the- sis, a standardization approach related to the integration of ambient intelligence systems (as introduced in this thesis) in real-life and large-scale scenarios is shown.
Optical character recognition (OCR) of machine printed text is ubiquitously considered as a solved problem. However, error free OCR of degraded (broken and merged) and noisy text is still challenging for modern OCR systems. OCR of degraded text with high accuracy is very important due to many applications in business, industry and large scale document digitization projects. This thesis presents a new OCR method for degraded
text recognition by introducing a combined ANN/HMM OCR approach. The approach
provides significantly better performance in comparison with state-of-the-art HMM based OCR methods and existing open source OCR systems. In addition, the thesis introduces novel applications of ANNs and HMMs for document image preprocessing and recognition of low resolution text. Furthermore, the thesis provides psychophysical experiments to determine the effect of letter permutation in visual word recognition of Latin and Cursive
script languages.
HMMs and ANNs are widely employed pattern recognition paradigms and have been
used in numerous pattern classification problems. This work presents a simple and novel method for combining the HMMs and ANNs in application to segmentation free OCR of degraded text. HMMs and ANNs are powerful pattern recognition strategies and their combination is interesting to improve current state-of-the-art research in OCR. Mostly, previous attempts in combining the HMMs and ANNs were focused on applying ANNs
as approximation of the probability density function or as a neural vector quantizer for HMMs. These methods either require combined NN/HMM training criteria [ECBG-MZM11] or they use complex neural network architecture like time delay or space displacement neural networks [BLNB95]. However, in this work neural networks are used as discriminative feature extractor, in combination with novel text line scanning mechanism, to extract discriminative features from unsegmented text lines. The features are
processed by HMMs to provide segmentation free text line recognition. The ANN/HMM modules are trained separately on a common dataset by using standard machine learning procedures. The proposed ANN/HMM OCR system also realizes to some extent several cognitive reading based strategies during the OCR. On a dataset of 1,060 degraded text lines extracted from the widely used UNLV-ISRI benchmark database [TNBC99], the presented system achieves a 30% reduction in error rate as compared to Google’s Tesseract OCR system [Smi13] and 43% reduction in error as compared to OCRopus OCR system [Bre08], which are the best open source OCR systems available today.
In addition, this thesis introduces new applications of HMMs and ANNs in OCR and document images preprocessing. First, an HMMs-based segmentation free OCR approach is presented for recognition of low resolution text. OCR of low resolution text is quite important due to presence of low resolution text in screen-shots, web images and video captions. OCR of low resolution text is challenging because of antialiased rendering and use of very small font size. The characters in low resolution text are usually joined to each other and they may appear differently at different locations on computer screen. This
work presents the use of HMMs in optical recognition of low resolution isolated characters and text lines. The evaluation of the proposed method shows that HMMs-based OCR techniques works quite well and reaches the performance of specialized approaches for OCR of low resolution text.
Then, this thesis presents novel applications of ANNs for automatic script recognition and orientation detection. Script recognition determines the written script on the page for the application of an appropriate character recognition algorithm. Orientation detection detects and corrects the deviation of the document’s orientation angle from the horizontal direction. Both, script recognition and orientation detection, are important preprocessing steps in developing robust OCR systems. In this work, instead of extracting handcrafted features, convolutional neural networks are used to extract relevant discriminative features for each classification task. The proposed method resulted in more than 95% script recognition accuracy on various multi-script documents at connected component level
and 100% page orientation detection accuracy for Urdu documents.
Human reading is a nearly analogous cognitive process to OCR that involves decoding of printed symbols into meanings. Studying the cognitive reading behavior may help in building a robust machine reading strategy. This thesis presents a behavioral study that deals on how cognitive system works in visual recognition of words and permuted non-words. The objective of this study is to determine the impact of overall word shape
in visual word recognition process. The permutation is considered as a source of shape degradation and visual appearance of actual words can be distorted by changing the constituent letter positions inside the words. The study proposes a hypothesis that reading of words and permuted non-words are two distinct mental level processes, and people use
different strategies in handling permuted non-words as compared to normal words. The hypothesis is tested by conducting psychophysical experiments in visual recognition of words from orthographically different languages i.e. Urdu, German and English. Experimental data is analyzed using analysis of variance (ANOVA) and distribution free rank tests to determine significance differences in response time latencies for two classes of data. The results support the presented hypothesis and the findings are consistent with
the dual route theories of reading.
Regular physical activity is essential to maintain or even improve an individual’s health. There exist various guidelines on how much individuals should do. Therefore, it is important to monitor performed physical activities during people’s daily routine in order to tell how far they meet professional recommendations. This thesis follows the goal to develop a mobile, personalized physical activity monitoring system applicable for everyday life scenarios. From the mentioned recommendations, this thesis concentrates on monitoring aerobic physical activity. Two main objectives are defined in this context. On the one hand, the goal is to estimate the intensity of performed activities: To distinguish activities of light, moderate or vigorous effort. On the other hand, to give a more detailed description of an individual’s daily routine, the goal is to recognize basic aerobic activities (such as walk, run or cycle) and basic postures (lie, sit and stand).
With recent progress in wearable sensing and computing the technological tools largely exist nowadays to create the envisioned physical activity monitoring system. Therefore, the focus of this thesis is on the development of new approaches for physical activity recognition and intensity estimation, which extend the applicability of such systems. In order to make physical activity monitoring feasible in everyday life scenarios, the thesis deals with questions such as 1) how to handle a wide range of e.g.
everyday, household or sport activities and 2) how to handle various potential users. Moreover, this thesis deals with the realistic scenario where either the currently performed activity or the current user is unknown during the development and training
phase of activity monitoring applications. To answer these questions, this thesis proposes and developes novel algorithms, models and evaluation techniques, and performs thorough experiments to prove their validity.
The contributions of this thesis are both of theoretical and of practical value. Addressing the challenge of creating robust activity monitoring systems for everyday life the concept of other activities is introduced, various models are proposed and validated. Another key challenge is that complex activity recognition tasks exceed the potential of existing classification algorithms. Therefore, this thesis introduces a confidence-based extension of the well known AdaBoost.M1 algorithm, called ConfAdaBoost.M1. Thorough experiments show its significant performance improvement compared to commonly used boosting methods. A further major theoretical contribution is the introduction and validation of a new general concept for the personalization of physical activity recognition applications, and the development of a novel algorithm (called Dependent Experts) based on this concept. A major contribution of practical value is the introduction of a new evaluation technique (called leave-one-activity-out) to simulate when performing previously unknown activities in a physical activity monitoring system. Furthermore, the creation and benchmarking of publicly available physical activity monitoring datasets within this thesis are directly benefiting the research community. Finally, the thesis deals with issues related to the implementation of the proposed methods, in order to realize the envisioned mobile system and integrate it into a full healthcare application for aerobic activity monitoring and support in daily life.