Nowadays, accounting, charging and billing users' network resource consumption are commonly used for the purpose of facilitating reasonable network usage, controlling congestion, allocating cost, gaining revenue, etc. In traditional IP traffic accounting systems, IP addresses are used to identify the corresponding consumers of the network resources. However, there are some situations in which IP addresses cannot be used to identify users uniquely, for example, in multi-user systems. In these cases, network resource consumption can only be ascribed to the owners of these hosts instead of corresponding real users who have consumed the network resources. Therefore, accurate accountability in these systems is practically impossible. This is a flaw of the traditional IP address based IP traffic accounting technique. This dissertation proposes a user based IP traffic accounting model which can facilitate collecting network resource usage information on the basis of users. With user based IP traffic accounting, IP traffic can be distinguished not only by IP addresses but also by users. In this dissertation, three different schemes, which can achieve the user based IP traffic accounting mechanism, are discussed in detail. The inband scheme utilizes the IP header to convey the user information of the corresponding IP packet. The Accounting Agent residing in the measured host intercepts IP packets passing through it. Then it identifies the users of these IP packets and inserts user information into the IP packets. With this mechanism, a meter located in a key position of the network can intercept the IP packets tagged with user information, extract not only statistic information, but also IP addresses and user information from the IP packets to generate accounting records with user information. The out-of-band scheme is a contrast scheme to the in-band scheme. It also uses an Accounting Agent to intercept IP packets and identify the users of IP traffic. However, the user information is transferred through a separated channel, which is different from the corresponding IP packets' transmission. The Multi-IP scheme provides a different solution for identifying users of IP traffic. It assigns each user in a measured host a unique IP address. Through that, an IP address can be used to identify a user uniquely without ambiguity. This way, traditional IP address based accounting techniques can be applied to achieve the goal of user based IP traffic accounting. In this dissertation, a user based IP traffic accounting prototype system developed according to the out-of-band scheme is also introduced. The application of user based IP traffic accounting model in the distributed computing environment is also discussed.
The safety of embedded systems is becoming more and more important nowadays. Fault Tree Analysis (FTA) is a widely used technique for analyzing the safety of embedded systems. A standardized tree-like structure called a Fault Tree (FT) models the failures of the systems. The Component Fault Tree (CFT) provides an advanced modeling concept for adapting the traditional FTs to the hierarchical architecture model in system design. Minimal Cut Set (MCS) analysis is a method that works for qualitative analysis based on the FTs. Each MCS represents a minimal combination of component failures of a system called basic events, which may together cause the top-level system failure. The ordinary representations of MCSs consist of plain text and data tables with little additional supporting visual and interactive information. Importance analysis based on FTs or CFTs estimates the contribution of each potential basic event to a top-level system failure. The resulting importance values of basic events are typically represented in summary views, e.g., data tables and histograms. There is little visual integration between these forms and the FT (or CFT) structure. The safety of a system can be improved using an iterative process, called the safety improvement process, based on FTs taking relevant constraints into account, e.g., cost. Typically, relevant data regarding the safety improvement process are presented across multiple views with few interactive associations. In short, the ordinary representation concepts cannot effectively facilitate these analyses.
We propose a set of visualization approaches for addressing the issues above mentioned in order to facilitate those analyses in terms of the representations.
1. To support the MCS analysis, we propose a matrix-based visualization that allows detailed data of the MCSs of interest to be viewed while maintaining a satisfactory overview of a large number of MCSs for effective navigation and pattern analysis. Engineers can also intuitively analyze the influence of MCSs of a CFT.
2. To facilitate the importance analysis based on the CFT, we propose a hybrid visualization approach that combines the icicle-layout-style architectural views with the CFT structure. This approach facilitates to identify the vulnerable components taking the hierarchies of system architecture into account and investigate the logical failure propagation of the important basic events.
3. We propose a visual safety improvement process that integrates an enhanced decision tree with a scatter plot. This approach allows one to visually investigate the detailed data related to individual steps of the process while maintaining the overview of the process. The approach facilitates to construct and analyze improvement solutions of the safety of a system.
Using our visualization approaches, the MCS analysis, the importance analysis, and the safety improvement process based on the CFT can be facilitated.
Modern digital imaging technologies, such as digital microscopy or micro-computed tomography, deliver such large amounts of 2D and 3D-image data that manual processing becomes infeasible. This leads to a need for robust, flexible and automatic image analysis tools in areas such as histology or materials science, where microstructures are being investigated (e.g. cells, fiber systems). General-purpose image processing methods can be used to analyze such microstructures. These methods usually rely on segmentation, i.e., a separation of areas of interest in digital images. As image segmentation algorithms rarely adapt well to changes in the imaging system or to different analysis problems, there is a demand for solutions that can easily be modified to analyze different microstructures, and that are more accurate than existing ones. To address these challenges, this thesis contributes a novel statistical model for objects in images and novel algorithms for the image-based analysis of microstructures. The first contribution is a novel statistical model for the locations of objects (e.g. tumor cells) in images. This model is fully trainable and can therefore be easily adapted to many different image analysis tasks, which is demonstrated by examples from histology and materials science. Using algorithms for fitting this statistical model to images results in a method for locating multiple objects in images that is more accurate and more robust to noise and background clutter than standard methods. On simulated data at high noise levels (peak signal-to-noise ratio below 10 dB), this method achieves detection rates up to 10% above those of a watershed-based alternative algorithm. While objects like tumor cells can be described well by their coordinates in the plane, the analysis of fiber systems in composite materials, for instance, requires a fully three dimensional treatment. Therefore, the second contribution of this thesis is a novel algorithm to determine the local fiber orientation in micro-tomographic reconstructions of fiber-reinforced polymers and other fibrous materials. Using simulated data, it will be demonstrated that the local orientations obtained from this novel method are more robust to noise and fiber overlap than those computed using an established alternative gradient-based algorithm, both in 2D and 3D. The property of robustness to noise of the proposed algorithm can be explained by the fact that a low-pass filter is used to detect local orientations. But even in the absence of noise, depending on fiber curvature and density, the average local 3D-orientation estimate can be about 9° more accurate compared to that alternative gradient-based method. Implementations of that novel orientation estimation method require repeated image filtering using anisotropic Gaussian convolution filters. These filter operations, which other authors have used for adaptive image smoothing, are computationally expensive when using standard implementations. Therefore, the third contribution of this thesis is a novel optimal non-orthogonal separation of the anisotropic Gaussian convolution kernel. This result generalizes a previous one reported elsewhere, and allows for efficient implementations of the corresponding convolution operation in any dimension. In 2D and 3D, these implementations achieve an average performance gain by factors of 3.8 and 3.5, respectively, compared to a fast Fourier transform-based implementation. The contributions made by this thesis represent improvements over state-of-the-art methods, especially in the 2D-analysis of cells in histological resections, and in the 2D and 3D-analysis of fibrous materials.
Backward compatibility of class libraries ensures that an old implementation of a library can safely be replaced by a new implementation without breaking existing clients.
Formal reasoning about backward compatibility requires an adequate semantic model to compare the behavior of two library implementations.
In the object-oriented setting with inheritance and callbacks, finding such models is difficult as the interface between library implementations and clients are complex.
Furthermore, handling these models in a way to support practical reasoning requires appropriate verification tools.
This thesis proposes a formal model for library implementations and a reasoning approach for backward compatibility that is implemented using an automatic verifier. The first part of the thesis develops a fully abstract trace-based semantics for class libraries of a core sequential object-oriented language. Traces abstract from the control flow (stack) and data representation (heap) of the library implementations. The construction of a most general context is given that abstracts exactly from all possible clients of the library implementation.
Soundness and completeness of the trace semantics as well as the most general context are proven using specialized simulation relations on the operational semantics. The simulation relations also provide a proof method for reasoning about backward compatibility.
The second part of the thesis presents the implementation of the simulation-based proof method for an automatic verifier to check backward compatibility of class libraries written in Java. The approach works for complex library implementations, with recursion and loops, in the setting of unknown program contexts. The verification process relies on a coupling invariant that describes a relation between programs that use the old library implementation and programs that use the new library implementation. The thesis presents a specification language to formulate such coupling invariants. Finally, an application of the developed theory and tool to typical examples from the literature validates the reasoning and verification approach.
Today, polygonal models occur everywhere in graphical applications, since they are easy
to render and to compute and a very huge set of tools are existing for generation and
manipulation of polygonal data. But modern scanning devices that allow a high quality
and large scale acquisition of complex real world models often deliver a large set of
points as resulting data structure of the scanned surface. A direct triangulation of those
point clouds does not always result in good models. They often contain problems like
holes, self-intersections and non manifold structures. Also one often looses important
surface structures like sharp corners and edges during a usual surface reconstruction.
So it is suitable to stay a little longer in the point based world to analyze the point cloud
data with respect to such features and apply a surface reconstruction method afterwards
that is known to construct continuous and smooth surfaces and extend it to reconstruct
Due to tremendous improvements of high-performance computing resources as well
as numerical advances computational simulations became a common tool for modern
engineers. Nowadays, simulation of complex physics is more and more substituting a
large amount of physical experiments. While the vast compute power of large-scale
high-performance systems enabled for simulating more complex numerical equations,
handling the ever increasing amount of data with spatial and temporal resolution
burdens new challenges to scientists. Huge hardware and energy costs desire for
e� cient utilization of high-performance systems. However, increasing complexity of
simulations raises the risk of failing simulations resulting in a single simulation to be
restarted multiple times. Computational Steering is a promising approach to interact
with running simulations which could prevent simulation crashes. The large amount
of data expands gaps in the amount of data that can be calculated and the amount of
data that can be processed. Extreme-scale simulations produce more data that can
even be stored. In this thesis, I propose several methods that enhance the process
of steering, exploring, visualizing, and analyzing ongoing numerical simulations.
Due to remarkable technological advances in the last three decades the capacity of computer systems has improved tremendously. Considering Moore's law, the number of transistors on integrated circuits has doubled approximately every two years and the trend is continuing. Likewise, developments in storage density, network bandwidth, and compute capacity show similar patterns. As a consequence, the amount of data that can be processed by today's systems has increased by orders of magnitude. At the same time, however, the resolution of screens has hardly increased by a factor of ten. Thus, there is a gap between the amount of data that can be processed and the amount of data that can be visualized. Large high-resolution displays offer a way to deal with this gap and provide a significantly increased screen area by combining the images of multiple smaller display devices. The main objective of this dissertation is the development of new visualization and interaction techniques for large high-resolution displays.
Die formale Spezifikation von Kommunikationssystemen stellt durch die mit ihr verbundene Abstraktion und Präzision eine wichtige Grundlage für die formale Verifikation von Systemeigenschaften dar. Diese Abstraktion begrenzt jedoch auch die Ausdrucksfähigkeit der formalen Beschreibungstechnik und kann somit zu problemunangemessenen Spezifikationen führen. Wir untersuchen anhand der formalen Beschreibungstechnik Estelle zunächst zwei solche Aspekte. Beide führen speziell in Hinsicht auf die Domäne von Estelle, der Spezifikation von Kommunikationsprotokollen, zu schwerwiegenden Beeinträchtigungen der Ausdrucksfähigkeit. Eines dieser Defizite zeigt sich bei dem Versuch, in Estelle ein offenes System wie z. B. eine Protokollmaschine oder einen Kommunikationsdienst zu spezifizieren. Da Estelle-Spezifikationen nur geschlossene Systeme beschreiben können, werden solche Komponenten immer nur als Teil einer fest vorgegebenen Umgebung spezifiziert und besitzen auch nur in dieser eine formale Syntax und Semantik. Als Lösung für dieses Problem führen wir die kompatible syntaktische und semantische Estelle-Erweiterung Open-Estelle ein, die eine formale Spezifikation solcher offener Systeme und ihres Imports in verschiedene Umgebungen ermöglicht. Ein anderes Defizit in der Ausdrucksfähigkeit von Estelle ergibt sich aus der strengen Typprüfung. Wir werden zeigen, dass es in heterogenen, hierarchisch strukturierten Kommunikationssystemen im Zusammenhang mit den dort auftretenden horizontalen und vertikalen Typkompositionen zu einer unangemessenen Modellierung von Nutzdatentypen an den Dienstschnittstellen kommt. Dieses Problem erweist sich beim Versuch einer generischen und nutzdatentypunabhängigen Spezifikation eines offenen Systems (z. B. mit Open-Estelle) sogar als fatal. Deshalb führen wir die kompatible Containertyp-Erweiterung ein, durch die eine formale Spezifikation nutzdatentypunabhängiger und somit generischer Schnittstellen von Diensten und Protokollmaschinen ermöglicht wird. Als Grundlage für unsere Implementierungs- und Optimierungsexperimente führen wir den „eXperimental Estelle Compiler“ (XEC) ein. Er ermöglicht aufgrund seines Implementierungskonzeptes eine sehr flexible Modellierung des Systemmanagements und ist insbesondere für die Realisierung verschiedener Auswahloptimierungen geeignet. XEC ist zudem mit verschiedenen Statistik- und Monitoring-Funktionalitäten ausgestattet, durch die eine effiziente quantitative Analyse der durchgeführten Implementierungsexperimente möglich ist. Neben dem vollständigen Sprachumfang von Estelle unterstützt XEC auch die meisten der hier eingeführten Estelle-Erweiterungen. Neben der Korrektheit ist die Effizienz automatisch generierter Implementierungen eine wichtige Anforderung im praktischen Einsatz. Hier zeigt sich jedoch, dass viele der in formalen Protokollspezifikationen verwendeten Konstrukte nur schwer semantikkonform und zugleich effizient implementiert werden können. Entsprechend untersuchen wir anhand des Kontrollflusses und der Handhabung von Nutzdaten, wie die spezifizierten Operationen effizient implementiert werden können, ohne das Abstraktionsniveau senken zu müssen. Die Optimierung des Kontrollflusses geschieht dabei ausgehend von der effizienten Realisierung der Basisoperationen der von XEC erzeugten Implementierungen primär anhand der Transitionsauswahl, da diese speziell bei komplexen Spezifikationen einen erheblichen Teil der Ausführungszeit bansprucht. Wir entwickeln dazu verschiedene heuristische Optimierungen der globalen Auswahl und der modullokalen Auswahl und werten diese sowohl analytisch wie auch experimentell aus. Wesentliche Ansatzpunkte sind dabei verschiedene ereignisgesteuerte Auswahlverfahren auf globaler Ebene und die Reduktion der zu untersuchenden Transitionen auf lokaler Ebene. Die Überprüfung der Ergebnisse anhand der ausführungszeitbezogenen Leistungsbewertung bestätigt diese Ergebnisse. Hinsichtlich der effizienten Handhabung von Daten untersuchen wir unterschiedliche Ansätze auf verschiedenen Ebenen, die jedoch in den meisten Fällen eine problemunangemessene Ausrichtung der Spezifikation auf die effiziente Datenübertragung erfordern. Eine überraschend elegante, problemorientierte und effiziente Lösung ergibt sich jedoch auf Basis der Containertyp-Erweiterung, die ursprünglich zur Steigerung des Abstraktionsniveaus eingeführt wurde. Dieses Ergebnis widerlegt die Vorstellung, dass Maßnahmen zur Steigerung der effizienten Implementierbarkeit auch immer durch eine Senkung des Abstraktionsniveaus erkauft werden müssen.
More and more, term rewriting systems are applied in computer science aswell as in mathematics. They are based on directed equations which may be used as non-deterministic functional programs. Termination is a key property for computing with termrewriting systems.In this thesis, we deal with different classes of so-called simplification orderings which areable to prove the termination of term rewriting systems. Above all, we focus on the problemof applying these termination methods to examples occurring in practice. We introduce aformalism that allows clear representations of orderings. The power of classical simplifica-tion orderings - namely recursive path orderings, path and decomposition orderings, Knuth-Bendix orderings and polynomial orderings - is improved. Further, we restrict these orderingssuch that they are compatible with underlying AC-theories by extending well-known methodsas well as by developing new techniques. For automatically generating all these orderings,heuristic-based algorithms are given. A comparison of these orderings with respect to theirpowers and their time complexities concludes the theoretical part of this thesis. Finally, notonly a detailed statistical evaluation of examples but also a brief introduction into the designof a software tool representing the integration of the specified approaches is given.
Layout analysis--the division of page images into text blocks, lines, and determination of their reading order--is a major performance limiting step in large scale document digitization projects. This thesis addresses this problem in several ways: it presents new performance measures to identify important classes of layout errors, evaluates the performance of state-of-the-art layout analysis algorithms, presents a number of methods to reduce the error rate and catastrophic failures occurring during layout analysis, and develops a statistically motivated, trainable layout analysis system that addresses the needs of large-scale document analysis applications. An overview of the key contributions of this thesis is as follows. First, this thesis presents an efficient local adaptive thresholding algorithm that yields the same quality of binarization as that of state-of-the-art local binarization methods, but runs in time close to that of global thresholding methods, independent of the local window size. Tests on the UW-1 dataset demonstrate a 20-fold speedup compared to traditional local thresholding techniques. Then, this thesis presents a new perspective for document image cleanup. Instead of trying to explicitly detect and remove marginal noise, the approach focuses on locating the page frame, i.e. the actual page contents area. A geometric matching algorithm is presented to extract the page frame of a structured document. It is demonstrated that incorporating page frame detection step into document processing chain results in a reduction in OCR error rates from 4.3% to 1.7% (n=4,831,618 characters) on the UW-III dataset and layout-based retrieval error rates from 7.5% to 5.3% (n=815 documents) on the MARG dataset. The performance of six widely used page segmentation algorithms (x-y cut, smearing, whitespace analysis, constrained text-line finding, docstrum, and Voronoi) on the UW-III database is evaluated in this work using a state-of-the-art evaluation methodology. It is shown that current evaluation scores are insufficient for diagnosing specific errors in page segmentation and fail to identify some classes of serious segmentation errors altogether. Thus, a vectorial score is introduced that is sensitive to, and identifies, the most important classes of segmentation errors (over-, under-, and mis-segmentation) and what page components (lines, blocks, etc.) are affected. Unlike previous schemes, this evaluation method has a canonical representation of ground truth data and guarantees pixel-accurate evaluation results for arbitrary region shapes. Based on a detailed analysis of the errors made by different page segmentation algorithms, this thesis presents a novel combination of the line-based approach by Breuel with the area-based approach of Baird which solves the over-segmentation problem in area-based approaches. This new approach achieves a mean text-line extraction error rate of 4.4% (n=878 documents) on the UW-III dataset, which is the lowest among the analyzed algorithms. This thesis also describes a simple, fast, and accurate system for document image zone classification that results from a detailed comparative analysis of performance of widely used features in document analysis and content-based image retrieval. Using a novel combination of known algorithms, an error rate of 1.46% (n=13,811 zones) is achieved on the UW-III dataset in comparison to a state-of-the-art system that reports an error rate of 1.55% (n=24,177 zones) using more complicated techniques. In addition to layout analysis of Roman script documents, this work also presents the first high-performance layout analysis method for Urdu script. For that purpose a geometric text-line model for Urdu script is presented. It is shown that the method can accurately extract Urdu text-lines from documents of different layouts like prose books, poetry books, magazines, and newspapers. Finally, this thesis presents a novel algorithm for probabilistic layout analysis that specifically addresses the needs of large-scale digitization projects. The presented approach models known page layouts as a structural mixture model. A probabilistic matching algorithm is presented that gives multiple interpretations of input layout with associated probabilities. An algorithm based on A* search is presented for finding the most likely layout of a page, given its structural layout model. For training layout models, an EM-like algorithm is presented that is capable of learning the geometric variability of layout structures from data, without the need for a page segmentation ground-truth. Evaluation of the algorithm on documents from the MARG dataset shows an accuracy of above 95% for geometric layout analysis.