Search

941 search hits

571 to 580

Visual Learning of Socio-Video Semantics (2015)

Borth, Damian

Today's ubiquity of visual content, driven by the availability of broadband Internet, low-priced storage, and the omnipresence of camera-equipped mobile devices, conveys much of our thinking and feeling as individuals and as a society. As a result, video repositories are growing at enormous rates, with content now being embedded and shared through social media. To make use of this new form of social multimedia, concept detection, the automatic mapping between semantic concepts and video content, has to be extended such that concept vocabularies are synchronized with current real-world events, systems can perform scalable concept learning with thousands of concepts, and high-level information such as sentiment can be extracted from visual content. To meet these demands, the following three contributions are made in this thesis: (i) concept detection is linked to trending topics, (ii) visual learning from web videos is presented, including the proper treatment of tags as concept labels, and (iii) the extension of concept detection with adjective noun pairs for sentiment analysis is proposed. In order for concept detection to satisfy users' current information needs, the notion of fixed concept vocabularies has to be reconsidered. This thesis presents a novel concept learning approach built upon dynamic vocabularies, which are automatically augmented with trending topics mined from social media. Once discovered, trending topics are evaluated by forecasting their future progression to predict high-impact topics, which are then either mapped to an available static concept vocabulary or trained as individual concept detectors on demand. Experiments on YouTube video clips demonstrate that visual learning of trending topics improves concept detection accuracy by over 100% compared to static vocabularies (n=78,000).
To remove the manual effort of retrieving training data from YouTube and the noise caused by tags being coarse, subjective, and context-dependent, this thesis suggests an automatic concept-to-query mapping for the retrieval of relevant training video material, and active relevance filtering to generate reliable annotations from web video tags. Here, the relevance of web tags is modeled as a latent variable, which is combined with an active learning label refinement. In experiments on YouTube, active relevance filtering is found to outperform both automatic filtering and active learning approaches, reducing the number of required label inspections by 75% compared to an expert-annotated training dataset (n=100,000). Finally, it is demonstrated that concept detection can serve as a key component to infer the sentiment reflected in visual content. To extend concept detection for sentiment analysis, adjective noun pairs (ANPs) are proposed in this thesis as novel entities for concept learning. First, a large-scale visual sentiment ontology consisting of 3,000 ANPs is automatically constructed by mining the web. From this ontology, a mid-level representation of visual content, SentiBank, is trained to encode the visual presence of 1,200 ANPs. This novel approach to visual learning is validated in three independent experiments on sentiment prediction (n=2,000), emotion detection (n=807) and pornography filtering (n=40,000). SentiBank is shown to outperform known low-level feature representations (sentiment prediction, pornography detection) or perform comparably to state-of-the-art methods (emotion detection). Altogether, these contributions extend state-of-the-art concept detection approaches such that concept learning can be done autonomously from web videos on a large scale, and can cope with novel semantic structures such as trending topics or adjective noun pairs, adding a new dimension to the understanding of video content.

Statistical Language Modeling for Historical Documents using Weighted Finite-State Transducers and Long Short-Term Memory (2015)

Al Azawi, Mayce

The goal of this work is to develop statistical natural language models and processing techniques based on Recurrent Neural Networks (RNN), especially the recently introduced Long Short-Term Memory (LSTM). Owing to their ability to adapt and predict, these methods are more robust and easier to train than traditional methods, i.e., word-list and rule-based models. They improve the output of recognition systems and make it more accessible to users for browsing and reading. These techniques are needed especially for historical books, whose manual transcription can take years of effort and incur huge costs. The contributions of this thesis are several new methods that combine high computational performance with high accuracy. First, an error model for improving recognition results is designed. As a second contribution, a hyphenation model for aligning difficult transcriptions is suggested. Third, a dehyphenation model is used to classify the hyphens in noisy transcriptions. The fourth contribution is the use of LSTM networks for normalizing historical orthography. A size-normalization alignment is implemented to equalize string lengths before the training phase. Using LSTM networks as a language model to improve the recognition results is the fifth contribution. Finally, the sixth contribution is a combination of Weighted Finite-State Transducers (WFSTs) and LSTM applied to multiple recognition systems. These contributions are elaborated in more detail below. Context-dependent confusion rules are a new technique for building an error model for Optical Character Recognition (OCR) correction. The rules are extracted from the OCR confusions that appear in the recognition outputs and are translated into edit operations (insertions, deletions, and substitutions) using the Levenshtein edit distance algorithm. The edit operations are extracted in the form of rules that take the context of the incorrect string into account, and are used to build an error model with WFSTs.
The context-dependent rules help the language model find the best candidate corrections. They avoid the computations incurred when searching the language model, and they also enable the language model to correct erroneous words. The context-dependent error model is applied to the University of Washington (UWIII) dataset and the Urdu Nastaleeq script dataset. It improves the OCR results from an error rate of 1.14% to an error rate of 0.68%. It performs better than the state-of-the-art single rule-based approach, which returns an error rate of 1.0%. This thesis describes a new, simple, fast, and accurate system for generating correspondences between real scanned historical books and their transcriptions. The alignment poses several challenges. First, the transcriptions may contain modifications and layout variations relative to the original book. Second, the recognition output for historical books contains misrecognition and segmentation errors, which make the alignment more difficult, especially because line breaks and pages will not correspond. Adapted WFSTs are designed to represent the transcription. The WFSTs process Fraktur ligatures and adapt the transcription with a hyphenation model that allows alignment despite the varying hyphenation of words at line breaks in the OCR documents. In this work, several alignment approaches are implemented: text-segment, page-wise, and book-wise. The approaches are evaluated on a dataset of historical documents in German calligraphic (Fraktur) script from the volumes of “Wanderungen durch die Mark Brandenburg” (1862-1889). The text-segment approach returns an error rate of 2.33% without a hyphenation model and an error rate of 2.0% with a hyphenation model. Dehyphenation methods are presented to remove hyphens from the transcription.
They provide the transcription in a readable and reflowable format to be used for alignment purposes. We treat the task as a classification problem and classify each hyphen, based on the given patterns, as a line-break hyphen, a compound-word hyphen, or noise. The methods are applied to clean and noisy transcriptions in different languages. The decision tree classifier performs better on the UWIII dataset, with an accuracy of 98%, and reaches 97% on Fraktur script. A new method for normalizing historical OCRed text using LSTM is implemented for different texts, ranging from Early New High German (14th-16th centuries) to modern New High German forms, and applied to the Luther Bible. It performs better than the rule-based word-list approaches. It provides a transcription for various purposes such as part-of-speech tagging and n-grams. In addition, two new techniques are presented for aligning the OCR results and normalizing string sizes: adding Character-Epsilons or Appending-Epsilons. They allow deletion and insertion at the appropriate position in the string. In normalizing historical wordforms to modern wordforms, the accuracy of LSTM on seen data is around 94%, while the state-of-the-art combined rule-based method returns 93%. On unseen data, LSTM returns 88% and the combined rule-based method returns 76%. In normalizing modern wordforms to historical wordforms, the LSTM delivers the best performance and returns 93.4% on seen data and 89.17% on unseen data. This thesis also investigates in depth the construction of high-performance language models for improving recognition systems. A new method to construct a language model using LSTM is designed to correct OCR results. The method is applied to UWIII and Urdu script. The LSTM approach outperforms the state of the art, especially for tokens unseen during training. On the UWIII dataset, the LSTM reduces the OCR error rate from 1.14% to 0.48%.
On the Urdu Nastaleeq script dataset, the LSTM reduces the error rate from 6.9% to 1.58%. Finally, the integration of multiple recognition outputs can give higher performance than a single recognition system. Therefore, a new method for combining the results of OCR systems is explored using WFSTs and LSTM. It uses multiple OCR outputs and votes for the best output to improve the OCR results. It performs better than the ISRI tool and Pairwise of Multiple Sequence. The purpose is to provide correct transcriptions that can be used for digitizing books, linguistic purposes, n-grams, and part-of-speech tagging. The method consists of two alignment steps. First, two recognition systems are aligned using WFSTs. The transducers are designed to be more flexible and compatible with the different symbols at line and page breaks in order to avoid segmentation and misrecognition errors. The LSTM model is then used to vote for the best candidate correction from the two systems and to correct the erroneous tokens produced during the first alignment. The approaches are evaluated on OCR output for the English UWIII and historical German Fraktur datasets, obtained from state-of-the-art OCR systems. The experiments show that the error rate of ISRI-Voting is 1.45%, the error rate of the Pairwise of Multiple Sequence is 1.32%, the error rate of the Line-to-Page alignment is 1.26%, and the LSTM approach performs best with an error rate of 0.40%. The purpose of this thesis is to contribute methods providing correct transcriptions corresponding to the original book. This is considered to be the first step towards an accurate and more effective use of the documents in digital libraries.
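The abstract above describes translating OCR confusions into edit operations with the Levenshtein algorithm. The following is a minimal sketch of that single step only, not the thesis's implementation: the function name `edit_operations` and the tuple layout are hypothetical, and context windows, rule weighting, and the WFST error model are not modeled here.

```python
def edit_operations(ocr: str, truth: str):
    """Return edit operations that turn `ocr` into `truth` (illustrative only).

    Each operation is a tuple (kind, position_in_ocr, wrong_char, right_char).
    """
    n, m = len(ocr), len(truth)
    # dist[i][j] = Levenshtein distance between ocr[:i] and truth[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ocr[i - 1] == truth[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete ocr[i-1]
                dist[i][j - 1] + 1,         # insert truth[j-1]
                dist[i - 1][j - 1] + cost,  # match or substitute
            )

    # Walk back from the bottom-right cell to recover one optimal edit script.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if ocr[i - 1] == truth[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                if cost:
                    ops.append(("substitute", i - 1, ocr[i - 1], truth[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", i - 1, ocr[i - 1], ""))
            i -= 1
        else:
            ops.append(("insert", i, "", truth[j - 1]))
            j -= 1
    ops.reverse()
    return ops


if __name__ == "__main__":
    # A typical OCR confusion: the digit "1" recognized instead of the letter "l".
    print(edit_operations("c1ean", "clean"))
```

A context-dependent rule in the sense of the abstract would additionally record the characters surrounding each operation; that extension is left out of this sketch.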

Context Awareness for Enhancing Heterogeneous Access Management and Self-Optimizing Networks (2015)

Klein, Andreas

The heterogeneity of today's access possibilities to wireless networks imposes challenges for efficient mobility support and resource management across different Radio Access Technologies (RATs). The current situation is characterized by the coexistence of various wireless communication systems, such as GSM, HSPA, LTE, WiMAX, and WLAN. These RATs greatly differ with respect to coverage, spectrum, data rates, Quality of Service (QoS), and mobility support. In real systems, mobility-related events, such as Handover (HO) procedures, directly affect resource efficiency and End-To-End (E2E) performance, in particular with respect to signaling efforts and users' QoS. In order to lay a basis for realistic multi-radio network evaluation, a novel evaluation methodology is introduced in this thesis. A central hypothesis of this thesis is that the consideration and exploitation of additional information characterizing user, network, and environment context, is beneficial for enhancing Heterogeneous Access Management (HAM) and Self-Optimizing Networks (SONs). Further, Mobile Network Operator (MNO) revenues are maximized by tightly integrating bandwidth adaptation and admission control mechanisms as well as simultaneously accounting for user profiles and service characteristics. In addition, mobility robustness is optimized by enabling network nodes to tune HO parameters according to locally observed conditions. For establishing all these facets of context awareness, various schemes and algorithms are developed and evaluated in this thesis. System-level simulation results demonstrate the potential of context information exploitation for enhancing resource utilization, mobility support, self-tuning network operations, and users' E2E performance. In essence, the conducted research activities and presented results motivate and substantiate the consideration of context awareness as key enabler for cognitive and autonomous network management. 
Further, the performed investigations and aspects evaluated in the scope of this thesis are highly relevant for future 5G wireless systems and current discussions in the 5G infrastructure Public Private Partnership (PPP).

Context-Enabled Optimization of Energy-Autarkic Networks for Carrier-Grade Wireless Backhauling (2015)

Mannweiler, Christian

This work establishes the novel category of coordinated Wireless Backhaul Networks (WBNs) for energy-autarkic point-to-point radio backhauling. The networking concept is based on three major building blocks: cost-efficient radio transceiver hardware, a self-organizing network operations framework, and power supply from renewable energy sources. The aim of this novel backhauling approach is to combine carrier-grade network performance with reduced maintenance effort as well as independent and self-sufficient power supply. To improve the prospects of success of this concept, the thesis makes the following major contributions. First, adapted from the theory of cyber-physical systems, the author devises a formal multi-domain system model, an evaluation methodology, and a system-level simulation framework for energy-autarkic coordinated WBNs, including a novel balanced scorecard concept. Second, the thesis specifically addresses the topic of Topology Control (TC) in point-to-point radio networks and how it can be exploited for network management purposes. Given a set of network nodes equipped with multiple radio transceivers and known locations, TC continuously optimizes the setup and configuration of radio links between network nodes, thus supporting initial network deployment, network operation, as well as topology re-configuration. In particular, the author shows that TC in WBNs belongs to the class of NP-hard quadratic assignment problems and that it has significant impact in operational practice, e.g., on routing efficiency, network redundancy levels, service reliability, and energy consumption. Two novel algorithms focusing on maximizing edge connectivity of network graphs are developed. Finally, this work carries out an analytical benchmarking and a numerical performance analysis of the introduced concepts and algorithms. The author analytically derives minimum performance levels of the developed TC algorithms.
For the analyzed scenarios of remote Alpine communities and rural Tanzania, the evaluation shows that the algorithms improve energy efficiency and more evenly balance energy consumption across backhaul nodes, thus significantly increasing the number of available backhaul nodes compared to state-of-the-art TC algorithms.

Modeling and design optimization of textile-like materials via homogenization and one-dimensional models of elasticity (2015)

Shiryaev, Vladimir

The work consists of two parts. In the first part, an optimization problem for structures made of linear elastic material, with contact modeled by Robin-type boundary conditions, is considered. The structures model textile-like materials and possess certain quasiperiodicity properties. The homogenization method is used to represent the structures by homogeneous elastic bodies and is essential for formulating the effective stress and Poisson's ratio optimization problems. At the micro-level, the classical one-dimensional Euler-Bernoulli beam model extended with jump conditions at contact interfaces is used. The stress optimization problem is of PDE-constrained optimization type, and the adjoint approach is exploited. Several numerical results are provided. In the second part, a non-linear model for the simulation of textiles is proposed. The yarns are modeled by a hyperelastic law and have no bending stiffness. The friction is modeled by the Capstan equation. The model is formulated as a problem with rate-independent dissipation, and the basic continuity and convexity properties are investigated. The part ends with numerical experiments and a comparison of the results to a real measurement.
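For orientation, the Capstan (Euler-Eytelwein) equation mentioned in the abstract is commonly stated as below. The symbols are conventional choices assumed here and are not taken from the thesis, whose formulation may differ.

```latex
T_{\text{load}} = T_{\text{hold}} \, e^{\mu \varphi}
```

Here $T_{\text{hold}}$ is the tension on the holding side, $T_{\text{load}}$ is the largest tension on the loaded side that can be sustained without slipping, $\mu$ is the coefficient of friction between the yarn and the surface it wraps around, and $\varphi$ is the wrap angle in radians.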

An Automata-Theoretic Approach to Open Actor System Verification (2015)

Kurnia, Ilham W.

Open distributed systems are a class of distributed systems where (i) only partial information about the environment, in which they are running, is present, (ii) new resources may become available at runtime, and (iii) a subsystem may become aware of other subsystems after some interaction. Modeling and implementing such systems correctly is a complex task due to the openness and the dynamicity aspects. One way to ensure that the resulting systems behave correctly is to utilize formal verification. Formal verification requires an adequate semantic model of the implementation, a specification of the desired behavior, and a reasoning technique. The actor model is a semantic model that captures the challenging aspects of open distributed systems by utilizing actors as universal primitives to represent system entities and allowing them to create new actors and to communicate by sending directed messages as reply to received messages. To enable compositional reasoning, where the reasoning task is reduced to independent verification of the system parts, semantic entities at a higher level of abstraction than actors are needed. This thesis proposes an automaton model and combines sound reasoning techniques to compositionally verify implementations of open actor systems. Based on I/O automata, the model allows automata to be created dynamically and captures dynamic changes in communication patterns. Each automaton represents either an actor or a group of actors. The specification of the desired behavior is given constructively as an automaton. As the basis for compositionality, we formalize a component notion based on the static structure of the implementation instead of the dynamic entities (the actors) occurring in the system execution. The reasoning proceeds in two stages. The first stage establishes the connection between the automata representing single actors and their implementation description by means of weakest liberal preconditions. 
The second stage employs this result as the basis for verifying whether a component specification is satisfied. The verification is done by building a simulation relation from the automaton representing the implementation to the component's automaton. Finally, we validate the compositional verification approach through a number of examples by proving correctness of their actor implementations with respect to system specifications.

Modeling and Simulation of a Moving Rigid Body in a Rarefied Gas (2015)

Shrestha, Samir

We present a numerical scheme to simulate a moving rigid body of arbitrary shape suspended in a rarefied gas micro flow, with a view to applications in complex computations of moving structures in micro or vacuum systems. The rarefied gas is simulated by solving the Boltzmann equation using a DSMC particle method. The motion of the rigid body is governed by the Newton-Euler equations, where the force and the torque on the rigid body are computed from the momentum transfer of the gas molecules colliding with the body. The resulting motion of the rigid body in turn affects the surrounding gas flow; that is, a two-way coupling is modeled. We validate the scheme by performing various numerical experiments in 1-, 2- and 3-dimensional computational domains: a 1-dimensional actuator problem, a 2-dimensional cavity-driven flow problem, Brownian diffusion of a spherical particle with both translational and rotational motion, and finally thermophoresis on a spherical particle. For each test example, we compare the numerical results with the existing theory.
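As a reading aid, the coupling described in the abstract can be written in a generic textbook form. This is a sketch under assumed notation, not the thesis's exact formulation: $m$ and $\mathbf{I}$ are the mass and inertia tensor of the body, $\mathbf{v}$ and $\boldsymbol{\omega}$ its translational and angular velocity, $\mathbf{x}_c$ its center of mass, $m_g$ the molecular mass, and $\mathbf{c}_k$, $\mathbf{c}_k'$ the pre- and post-collision velocities of the $k$-th molecule hitting the body at point $\mathbf{x}_k$ during a time step $\Delta t$.

```latex
\begin{aligned}
m \, \frac{d\mathbf{v}}{dt} &= \mathbf{F}, &
\frac{d}{dt}\bigl(\mathbf{I}\,\boldsymbol{\omega}\bigr) &= \mathbf{T}, \\
\mathbf{F} &\approx \frac{1}{\Delta t} \sum_{k} m_g \,(\mathbf{c}_k - \mathbf{c}_k'), &
\mathbf{T} &\approx \frac{1}{\Delta t} \sum_{k} (\mathbf{x}_k - \mathbf{x}_c) \times m_g \,(\mathbf{c}_k - \mathbf{c}_k').
\end{aligned}
```

The momentum lost by each colliding molecule is gained by the body, which gives the force and torque; the updated body position and velocity then serve as the moving boundary for the gas in the next step, which is the two-way coupling.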

Visual Processing in Reading and Dyslexia (2015)

Khera, Gunjan

The present research combines different paradigms in the area of visual perception of letters and words. The experiments aimed to understand the deficit underlying the faulty visual processing of letters and words. The present work summarizes the findings from two different populations: (1) dyslexics (reading-disabled children) and (2) illiterates (adults who cannot read). Comparisons were made between a literate and an illiterate group, and between dyslexics and a control group (normally reading children). Differences in event-related potentials (ERPs) between dyslexic and control children were examined using a mental rotation task for letters. According to the ERPs, mental rotation of letters produced a delayed positive component, which becomes less positive as the task becomes more difficult (rotation-related negativity, RRN). The component was absent in dyslexics and present in controls. Dyslexics also showed some late effects in comparison to control children, which could be interpreted as problems at the decision stage, where they are uncertain whether the letter is normal or mirrored. Dyslexics also have problems in responding to letters with visual or phonological similarities (e.g. b vs d, p vs q). Visually similar letters were used to compare dyslexics and controls on a symmetry generalization task in two contrast conditions (low and high). Dyslexics showed a similar pattern of response, and were overall slower in responding to the task compared to controls. The results were interpreted within the framework of the Functional Coordination Deficit (Lachmann, 2002). Dyslexics also showed delayed responses in a word recognition task during motion.
Using a red background decreases magnocellular pathway (M-pathway) activity, making it more difficult to identify letters; this effect was stronger for dyslexics because their M-pathway is weaker. Accordingly, in the lexical task in motion, dyslexics responded worse with a red background than with a green one, and reaction times were longer with red than with green. Further, illiterates showed an analytic approach in responding to letters as well as to shapes. The analytic approach does not result from an individual's ability to read, but is a primary basis of visual organization or perception.

Adaptive Real-Time Scheduling and Resource Management on Multicore Architectures (2015)

Schorr, Stefan

Real-time systems are systems that have to react correctly to stimuli from the environment within given timing constraints. Today, real-time systems are employed everywhere in industry, not only in safety-critical systems but also in, e.g., communication, entertainment, and multimedia systems. With the advent of multicore platforms, new challenges on the efficient exploitation of real-time systems have arisen: First, there is the need for effective scheduling algorithms that feature low overheads to improve the use of the computational resources of real-time systems. The goal of these algorithms is to ensure timely execution of tasks, i.e., to provide runtime guarantees. Additionally, many systems require their scheduling algorithm to flexibly react to unforeseen events. Second, the inherent parallelism of multicore systems leads to contention for shared hardware resources and complicates system analysis. At any time, multiple applications run with varying resource requirements and compete for the scarce resources of the system. As a result, there is a need for an adaptive resource management. Achieving and implementing an effective and efficient resource management is a challenging task. The main goal of resource management is to guarantee a minimum resource availability to real-time applications. A further goal is to fulfill global optimization objectives, e.g., maximization of the global system performance, or the user perceived quality of service. In this thesis, we derive methods based on the slot shifting algorithm. Slot shifting provides flexible scheduling of time-constrained applications and can react to unforeseen events in time-triggered systems. For this reason, we aim at designing slot shifting based algorithms targeted for multicore systems to tackle the aforementioned challenges. The main contribution of this thesis is to present two global slot shifting algorithms targeted for multicore systems. 
Additionally, we extend slot shifting algorithms to improve their runtime behavior, or to handle non-preemptive firm aperiodic tasks. In a variety of experiments, the effectiveness and efficiency of the algorithms are evaluated and confirmed. Finally, the thesis presents an implementation of a slot-shifting-based logic into a resource management framework for multicore systems. Thus, the thesis closes the circle and successfully bridges the gap between real-time scheduling theory and real-world implementations. We prove applicability of the slot shifting algorithm to effectively and efficiently perform adaptive resource management on multicore systems.

Structural Decomposition of STGs (2015)

Benyamin Kangsah, Benedictus

Specification of asynchronous circuit behaviour becomes more complex as the complexity of today’s System-On-a-Chip (SOC) design increases. This also causes the Signal Transition Graphs (STGs), interpreted Petri nets for the specification of asynchronous circuit behaviour, to become bigger and more complex, which makes it more difficult, sometimes even impossible, to synthesize an asynchronous circuit from an STG with a tool like petrify [CKK+96] or CASCADE [BEW00]. It has, therefore, been suggested to decompose the STG as a first step; this leads to a modular implementation [KWVB03] [KVWB05], which can reduce synthesis effort by possibly avoiding state explosion or by allowing the use of library elements. A decomposition approach for STGs was presented in [VW02] [KKT93] [Chu87a]. The decomposition algorithm by Vogler and Wollowski [VW02] is based on that of Chu [Chu87a] but is much more generally applicable than the one in [KKT93] [Chu87a], and its correctness has been proved formally in [VW02]. This dissertation begins with Petri net background described in chapter 2. It starts with a class of Petri nets called place/transition (P/T) nets. Then STGs, a subclass of P/T nets, are considered. Background on net decomposition is presented in chapter 3. It begins with the structural decomposition of P/T nets for analysis purposes (liveness and boundedness of the net). Then STG decomposition for synthesis from [VW02] is described. The decomposition method from [VW02] could still be improved to deal with STGs from real applications and to give better decomposition results. Some improvements to [VW02] that yield better decomposition results and increase algorithm efficiency are discussed in chapter 4. These improvement ideas are suggested in [KVWB04], and some of them have been proved formally in [VK04]. The decomposition method from [VW02] is based on net reduction to find an output block component.
A large amount of work has to be done to reduce an initial specification until the final component is found. This reduction is not always possible, which causes input initially classified as irrelevant to become relevant input for the component. But under certain conditions (e.g. if structural auto-conflicts turn out to be non-dynamic) some of them could be reclassified as irrelevant. If this is not done, the specifications become unnecessarily large, which in turn leads to unnecessarily large implemented circuits. Instead of reduction, a new approach, presented in chapter 5, decomposes the original net into structural components first. An initial output block component is found by composing the structural components. Then, a final output block component is obtained by net reduction. As we cope with the structure of a net most of the time, it would be useful to have a structural abstraction of the net. A structural abstraction algorithm [Kan03] is presented in chapter 6. It can improve the performance in finding an output block component in most cases [War05] [Taw04]. Also, the structure net is in most cases smaller than the net itself. This increases the efficiency of the decomposition algorithm because it allows the transitions contained in a node of the structure graph to be contracted at the same time if the structure graph is used as the internal representation of the net. Chapter 7 discusses the application of STG decomposition in asynchronous circuit design. Application to speed-independent circuits is discussed first. After that, 3D circuits synthesized from extended burst mode (XBM) specifications are discussed. An algorithm for translating STG specifications to XBM specifications was first suggested in [BEW99]. This algorithm first derives the state machine from the STG specification, then translates the state machine to an XBM specification. An XBM specification, though it is a state machine, allows some concurrency.
These concurrencies can be translated directly, without deriving all of the possible states. An algorithm that directly translates STG specifications to XBM specifications is presented in chapter 7.3.1. Finally, DESI, a tool to decompose STGs, and its decomposition results are presented.
