Dealing with information in modern times requires users to cope with hundreds of thousands of documents, such as articles, emails, Web pages, or news feeds.
Above all information sources, the World Wide Web presents information seekers with great challenges.
It offers more natural-language text than anyone is capable of reading.
The key idea of this research is to provide users with adaptable filtering techniques that support them in filtering out the specific information items they need.
Its realization focuses on developing an Information Extraction system,
which adapts to a domain of concern, by interpreting the contained formalized knowledge.
Utilizing the Resource Description Framework (RDF), which is the Semantic Web's formal language for exchanging information,
allows extending information extractors to incorporate the given domain knowledge.
Because of this, formal information items from the RDF source can be recognized in the text.
The application of RDF allows a further investigation of operations on recognized information items, such as disambiguating and rating the relevance of these.
Switching between different RDF sources allows changing the application scope of the Information Extraction system from one domain of concern to another.
An RDF-based Information Extraction system can be triggered to extract specific kinds of information entities by providing it with formal RDF queries in terms of the SPARQL query language.
Representing extracted information in RDF extends the information coverage of the Semantic Web and provides a formal view on a text from the perspective of the RDF source.
In detail, this work presents the extension of existing Information Extraction approaches by incorporating the graph-based nature of RDF.
Here, pre-processing the RDF sources yields statistical information models dedicated to supporting specific information extractors.
These information extractors refine standard extraction tasks, such as Named Entity Recognition, by using the information provided by the pre-processed models.
The post-processing of extracted information items enables representing these results in RDF format or lists, which can now be ranked or filtered by relevance.
Post-processing also comprises the enrichment of originating natural language text sources with extracted information items by using annotations in RDFa format.
The results of this research extend the state-of-the-art of the Semantic Web.
This work contributes approaches for computing customizable and adaptable RDF views on the natural language content of Web pages.
Finally, due to the formal nature of RDF, machines can interpret these views allowing developers to process the contained information in a variety of applications.
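The idea of recognizing formal information items from an RDF source in text can be illustrated with a minimal sketch. It is not the system described in the abstract: the triples, the `ex:` resource names, and the functions `build_gazetteer` and `annotate` are hypothetical, and a plain label lookup stands in for the statistical models and SPARQL queries mentioned above.

```python
import re

# Hypothetical RDF source, written as (subject, predicate, object) triples.
TRIPLES = [
    ("ex:Kaiserslautern", "rdf:type", "ex:City"),
    ("ex:Kaiserslautern", "rdfs:label", "Kaiserslautern"),
    ("ex:DFKI", "rdf:type", "ex:Organisation"),
    ("ex:DFKI", "rdfs:label", "DFKI"),
    ("ex:DFKI", "ex:locatedIn", "ex:Kaiserslautern"),
]

def build_gazetteer(triples, wanted_type=None):
    """Map each label to its resource, optionally restricted to one rdf:type."""
    types = {s: o for s, p, o in triples if p == "rdf:type"}
    return {
        o: s
        for s, p, o in triples
        if p == "rdfs:label" and (wanted_type is None or types.get(s) == wanted_type)
    }

def annotate(text, gazetteer):
    """Return (start, end, resource) for every whole-word label match."""
    hits = []
    for label, resource in gazetteer.items():
        for match in re.finditer(r"\b" + re.escape(label) + r"\b", text):
            hits.append((match.start(), match.end(), resource))
    return sorted(hits)

text = "DFKI is a research centre in Kaiserslautern."
all_hits = annotate(text, build_gazetteer(TRIPLES))
# Restricting the type plays the role a formal query would play in a real system.
city_hits = annotate(text, build_gazetteer(TRIPLES, wanted_type="ex:City"))
```

Swapping `TRIPLES` for a different source changes the domain without touching the matching code, which mirrors the claim that switching RDF sources changes the application scope.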
Due to their N-glycosidase activity, ribosome-inactivating proteins (RIPs) are attractive candidates as antitumor and antiviral agents in medical and biological research. In the present study, we have successfully cloned two different truncated gelonins into pET-28a(+) vectors and expressed intact recombinant gelonin (rGel), recombinant C-terminally truncated gelonin (rC3-gelonin) and recombinant N- and C-terminally truncated gelonin (rN34C3-gelonin). Biological experiments showed that none of these recombinant gelonins has an inhibiting effect on the MCF-7 cell line. These data suggest that the truncated gelonins still have a specific structure that does not allow for internalization into cells. Further, truncation of gelonin leads to partial or complete loss of N-glycosidase as well as DNase activity compared to intact rGel. Our data suggest that C- and N-terminal amino acid residues are involved in the catalytic and cytotoxic activities of rGel. In addition, intact gelonin rather than truncated gelonin should be selected as the toxin in an immunoconjugate.
In the second part, an immunotoxin composed of gelonin, a basic protein of 30 kDa isolated from the Indian plant Gelonium multiflorum, and the cytotoxic drug MTX has been studied as a potential tool for delivering gelonin into the cytoplasm of cells. The results of many experiments showed that, on average, about 5 molecules of MTX were coupled to one molecule of gelonin. The MTX-gelonin conjugate is able to reduce the viability of MCF-7 cells in a dose-dependent manner (ID50 of 10 nM), as shown by the MTT assay, and to significantly induce direct and oxidative DNA damage, as shown by the alkaline comet assay. However, in the in vitro translation assay, the MTX-gelonin conjugate has an IC50 of 50.5 ng/ml and is thus less toxic than gelonin alone (IC50 of 4.6 ng/ml). It can be concluded that the positive charge plays an important role in the N-glycosidase activity of gelonin. Furthermore, conjugation of MTX with gelonin through α- and γ-carboxyl groups leads to the partial loss of its anti-folate activity compared to free MTX. These results, taken together, indicate that conjugation of MTX to gelonin permits delivery of the gelonin into the cytoplasm of cancer cells and exerts a measurable toxic effect.
In the third part, we have isolated and characterized two type I ribosome-inactivating proteins (RIPs), gelonin and GAP31, from seeds of Gelonium multiflorum. Both proteins exhibit RNA-N-glycosidase activity. The amino acid sequences of gelonin and GAP31 were identified by MALDI and ESI mass spectrometry. Gelonin and GAP31 peptides, obtained by proteolytic digestion (trypsin and Arg-C), are consistent with the amino acid sequences published by Rosenblum and Huang, respectively. Further structural characterization of gelonin and GAP31 (tryptic and Arg-C peptide mapping) showed that the two RIPs have 96% similarity in their sequence. Thus, these two proteins are most probably isoforms that arose from the same gene by alternative splicing. The ESI-MS analysis of gelonin and GAP31 exhibited at least three different post-translationally modified forms. A standard plant paucimannosidic N-glycosylation pattern (GlcNAc2Man2-5Xyl0-1 and GlcNAc2Man6-12Fuc1-2Xyl0-2) was identified using electrospray ionization MS for gelonin on N196 and GAP31 on N189, respectively. Based on these results, both proteins are located in the vacuoles of Gelonium multiflorum seeds.
Thermoplastic polymer-polymer composites consist of a polymeric matrix and a
polymeric reinforcement. The combination of these materials offers outstanding
mechanical properties at lower weight than standard fiber reinforced materials.
Furthermore, when both polymeric components originate from the same family or,
ideally, from the same polymer, their sustainability is higher than that of standard
fiber reinforced composites.
A challenge of polymer-polymer composites is the subsequent processing of their
semi-finished materials by heating techniques. Since the fibers are made of meltable
thermoplastic, the reinforcing fiber structure might be lost during the heating process.
Hence, the mechanical properties of an overheated polymer-polymer composite
would decline and would finally be even lower than those of the neat matrix. Lowering
the process temperature to manage the heating challenge is not reasonable, since the
cycle time would increase at the same time. Therefore, this work pursues the
adaptation of a fast and selective heating method for use with polymer-polymer
composites. Inductively activatable particles, so-called susceptors, were distributed in
the matrix to evoke local heating in the matrix when exposed to an
alternating magnetic field. In this way, the energy input to the fibers is limited.
The experimental series revealed the induction particle heating effect to be mainly
related to susceptor material, susceptor fraction, susceptor distribution as well as
magnetic field strength, coupling distance, and heating time. A proper heating was
achieved with ferromagnetic particles at a filler content of only 5 wt-% in HDPE as
well as with its respective polymer fiber reinforced composites. The study included
the analysis of susceptor impact on mechanical and thermal matrix properties as well
as a degradation evaluation. The susceptors were identified to have only a marginal
impact on matrix properties. Furthermore, a semi-empiric simulation of the particle
induction heating was applied, which served for the investigation of intrinsic melting
processes.
The results of both the experimental and the analytic study were
successfully transferred to a thermoforming process with a polymer-polymer material,
which had been preheated by means of particle induction.
This thesis generalizes the Cohen-Lenstra heuristic for the class groups of real quadratic
number fields to higher class groups. A "good part" of the second class group is defined.
In general, this is a non-abelian proper factor group of the second class group. Properties
of those groups are described, and a probability distribution on the set of those groups is
introduced and proposed as a generalization of the Cohen-Lenstra heuristic for real quadratic
number fields. The calculation of number field tables which contain information about
higher class groups is explained, and the tables are compared to the heuristic. The
agreement is close. A program which can create an internet database for number field tables is
presented.
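For orientation, the classical Cohen-Lenstra heuristic that the thesis generalizes can be evaluated numerically. The sketch below covers only the ordinary class group of real quadratic fields and an odd prime p, where the predicted probability that the p-part is isomorphic to a finite abelian p-group A is (1 / (|A| |Aut A|)) times the product of (1 - p^(-k)) over k >= 2. The function names and the truncation of the infinite product are assumptions of this sketch, and nothing here reflects the thesis's distribution on second class groups.

```python
def prob_p_part_trivial(p, terms=200):
    """Classical prediction that p does not divide the class number
    of a real quadratic field: product over k >= 2 of (1 - p**-k),
    truncated after `terms` factors."""
    prod = 1.0
    for k in range(2, terms + 2):
        prod *= 1.0 - p ** (-k)
    return prod

def prob_p_part_cyclic_of_order_p(p):
    """Prediction for p-part isomorphic to Z/pZ:
    |A| = p and |Aut A| = p - 1, so the weight is 1 / (p * (p - 1))."""
    return prob_p_part_trivial(p) / (p * (p - 1))

p3_trivial = prob_p_part_trivial(3)
p3_cyclic = prob_p_part_cyclic_of_order_p(3)
```

For p = 3 the first value is roughly 0.84, so the heuristic predicts that 3 divides the class number for about 16% of real quadratic fields.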
SHIM is a concurrent deterministic programming language for embedded systems built on rendezvous communication. It abstracts away many details to give the developer a high-level view that includes virtual shared variables, threads as orthogonal statements, and deterministic concurrent exceptions.
In this paper, we present a new way to compile a SHIM-like language into a set of asynchronous guarded actions, a well-established intermediate representation for concurrent systems. By doing so, we build a bridge to many other tools, including hardware synthesis and formal verification. We present our translation in detail, illustrate it through examples, and show how the result can be used by various other tools.
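To make the notion of guarded actions concrete, here is a toy sketch of a rendezvous between one sender and one receiver. It is not the translation scheme of the paper: the state layout, the program-counter names, and the scheduler `run` are all hypothetical. Each action is a pair of a guard and an update, and any enabled action may fire.

```python
def initial_state():
    # Program counters for the two threads, the sent value v, and the target x.
    return {"pc_send": "start", "pc_recv": "start", "v": None, "x": None}

# Each entry is (guard, action); both are functions of the shared state.
ACTIONS = [
    # Sender computes a value and then waits at its send statement.
    (lambda s: s["pc_send"] == "start",
     lambda s: s.update(v=42, pc_send="send")),
    # Receiver reaches its receive statement and waits.
    (lambda s: s["pc_recv"] == "start",
     lambda s: s.update(pc_recv="recv")),
    # Rendezvous: enabled only when both sides are waiting.
    (lambda s: s["pc_send"] == "send" and s["pc_recv"] == "recv",
     lambda s: s.update(x=s["v"], pc_send="done", pc_recv="done")),
]

def run(state, actions):
    """Fire the first enabled action until no guard holds."""
    while True:
        enabled = [action for guard, action in actions if guard(state)]
        if not enabled:
            return state
        enabled[0](state)

result_a = run(initial_state(), ACTIONS)
result_b = run(initial_state(), list(reversed(ACTIONS)))
```

Both schedules end in the same state, which is a small-scale picture of the determinism that rendezvous communication is meant to provide.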
In this work we extend the multiscale finite element method (MsFEM)
as formulated by Hou and Wu in [14] to the PDE system of linear elasticity.
The application, motivated from the multiscale analysis of highly heterogeneous
composite materials, is twofold. Resolving the heterogeneities on
the finest scale, we utilize the linear MsFEM basis for the construction of
robust coarse spaces in the context of two-level overlapping Domain Decomposition
preconditioners. We motivate and explain the construction
and present numerical results validating the approach. Under the assumption
that the material jumps are isolated, that is they occur only in the
interior of the coarse grid elements, our experiments show uniform convergence
rates independent of the contrast in the Young's modulus within the
heterogeneous material. Otherwise, if no restrictions on the position of the
high-coefficient inclusions are imposed, robustness cannot be guaranteed
any more. These results justify expectations to obtain coefficient-explicit
condition number bounds for the PDE system of linear elasticity similar to
existing ones for scalar elliptic PDEs as given in the work of Graham, Lechner
and Scheichl [12]. Furthermore, we numerically observe the properties
of the MsFEM coarse space for linear elasticity in an upscaling framework.
To this end, we present experimental results showing the approximation errors
of the multiscale coarse space w.r.t. the fine-scale solution.
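The idea behind a multiscale basis function is easiest to see in a one-dimensional scalar analogue, sketched below. This is a simplification and not the linear elasticity construction of the paper: on a single coarse element [0, 1], the basis function solves -(a(x) u')' = 0 with u(0) = 0 and u(1) = 1, which gives u(x) as the integral of 1/a from 0 to x divided by the integral of 1/a over the whole element. The function names, the coefficient, and the midpoint quadrature are assumptions of this sketch.

```python
def msfem_basis_1d(coeff, n_fine=1000):
    """Nodal values of the 1D multiscale basis function on [0, 1],
    computed with midpoint quadrature of 1/a on n_fine fine cells."""
    h = 1.0 / n_fine
    weights = [h / coeff((i + 0.5) * h) for i in range(n_fine)]
    total = sum(weights)
    phi = [0.0]
    for w in weights:
        phi.append(phi[-1] + w / total)
    return phi

def high_contrast(x):
    """Hypothetical coefficient: a stiff inclusion inside the element."""
    return 1000.0 if 0.4 <= x <= 0.6 else 1.0

phi = msfem_basis_1d(high_contrast)
jump_across_inclusion = phi[600] - phi[400]
```

Unlike a linear hat function, this basis function is almost flat across the stiff inclusion, so the coarse space carries fine-scale information about the coefficient.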
Generic layout analysis, the process of decomposing a document image into homogeneous regions for a collection of diverse document images, has many important applications in document image analysis and understanding, such as preprocessing of degraded, warped, camera-captured document images, high performance layout analysis of document images containing complex cursive scripts, and word spotting in historical document images at page level. Many goals in this field, such as a generic text line extraction method, have so far been considered elusive and remain beyond the reach of state-of-the-art methods [NJ07, LSZT07, KB06]. This thesis addresses this problem by presenting generic, domain-independent text line extraction and text and non-text segmentation methods, and then describes some important applications that were developed based on these methods. An overview of the key contributions of this thesis is as follows.
The first part of this thesis presents a generic text line extraction method using a combination of matched filtering and ridge detection techniques, which are commonly used in computer vision. Unlike the state-of-the-art text line extraction methods in the literature, the generic text line extraction method can be equally and robustly applied to a large variety of document image classes including scanned and camera-captured documents, binary and grayscale documents, typed-text and handwritten documents, historical and contemporary documents, and documents containing different scripts. Different standard datasets are selected for performance evaluation that belong to different categories of document images such as the UW-III [GHHP97] dataset of scanned documents, the ICDAR 2007 [GAS07] and the UMD [LZDJ08] datasets of handwritten documents, the DFKI-I [SB07] dataset of camera-captured documents, Arabic/Urdu script documents dataset, and German calligraphic (Fraktur) script historical documents dataset. The generic text line extraction method achieves 86% (n = 23,763 text lines in 650 documents) text line detection accuracy which is better than the aggregate accuracy of 73% of the best performing domain-specific state-of-the-art methods. To the best of the author's knowledge, it is the first general-purpose text line extraction method that can be equally used for a diverse collection of documents.
This thesis also presents an active contour (snake) based curled text line extraction method for warped, camera-captured document images. The presented approach is applied to DFKI-I [SB07] dataset of camera-captured, Latin script document images for curled text line extraction. It achieves above 95% (n = 3,091 text lines in 102 documents) text line detection accuracy, which is significantly better than the competing state-of-the-art curled text line extraction methods. The presented text line extraction method can also be applied to document images containing different scripts like Chinese, Devanagari, and Arabic after small modifications.
The second part of this thesis presents an improved version of the state-of-the-art multiresolution morphology (Leptonica) based text and non-text segmentation method [Blo91], which is a domain-independent page segmentation approach and can be equally applied to a diverse collection of binarized document images. It is demonstrated that the presented improvements result in an increase in segmentation accuracy from 93% to 99% (n = 113 documents).
This thesis also introduces a discriminative learning based approach for page segmentation, where a self-tunable multi-layer perceptron (MLP) classifier [BS10] is trained to distinguish between text and non-text connected components. Unlike other classification based page segmentation approaches in the literature, the presented connected-component based approach is faster than pixel based classification methods and does not require a block segmentation method beforehand. It achieves a segmentation accuracy of 96% (n = 113 documents), compared with 93% for the state-of-the-art multiresolution morphology (Leptonica) based page segmentation method [Blo91]. In addition to text and non-text segmentation of Latin script documents, the presented approach can also be adapted for document images containing other scripts as well as for other specialized layout analysis tasks such as digit and non-digit segmentation [HBSB12], orientation detection [RBSB09], and body-text and side-note segmentation [BAESB12].
Finally, this thesis presents important applications of the two generic layout analysis techniques discussed above, the ridge-based text line extraction method and the multi-resolution morphology based text and non-text segmentation method. First, a complete preprocessing pipeline is described for removing different types of degradations from grayscale, warped, camera-captured document images. It includes removal of grayscale degradations such as non-uniform shadows and blurring through binarization, noise cleanup using page frame detection, and document rectification using monocular dewarping. Each of these preprocessing steps shows significant improvement in comparison to the analyzed state-of-the-art methods in the literature. Second, a high performance layout analysis method is described for complex Arabic script document images written in different languages, such as Arabic, Urdu, and Persian, and in different styles, for example Naskh and Nastaliq. The presented layout analysis system is robust against different types of document image degradations and shows better performance for text and non-text segmentation, text line extraction, and reading order determination on a variety of Arabic and Urdu document images as compared to the state-of-the-art methods. It can be used for large-scale digitization of Arabic and Urdu documents. These applications demonstrate that the layout analysis methods, ridge-based text line extraction and multi-resolution morphology based text and non-text segmentation, are generic and can be applied easily to a large collection of diverse document images.
Recently, convex optimization models have been successfully applied to various problems in image analysis and restoration. In this paper, we are interested in relations between convex constrained optimization problems of the form \(\min\{\Phi(x)\) subject to \(\Psi(x)\le\tau\}\) and their non-constrained, penalized counterparts \(\min\{\Phi(x)+\lambda\Psi(x)\}\). We start with general considerations of the topic and provide a novel proof which ensures that a solution of the constrained problem with given \(\tau\) is also a solution of the non-constrained problem for a certain \(\lambda\). Then we deal with the special setting that \(\Psi\) is a semi-norm and \(\Phi=\phi(Hx)\), where \(H\) is a linear, not necessarily invertible operator and \(\phi\) is essentially smooth and strictly convex. In this case we can prove via the dual problems that there exists a bijective function which maps \(\tau\) from a certain interval to \(\lambda\) such that the solutions of the constrained problem coincide with those of the non-constrained problem if and only if \(\tau\) and \(\lambda\) are in the graph of this function. We illustrate the relation between \(\tau\) and \(\lambda\) by various problems arising in image processing. In particular, we demonstrate the performance of the constrained model in restoration tasks of images corrupted by Poisson noise and in inpainting models with constrained nuclear norm. Such models can be useful if we have a priori knowledge on the image rather than on the noise level.
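A scalar toy case makes the relation between \(\tau\) and \(\lambda\) tangible. The sketch below is an illustration chosen for this listing, not an example from the paper: it takes \(\Phi(x)=\frac12(x-b)^2\) (so \(H\) is the identity) and \(\Psi(x)=|x|\), for which the penalized solution is soft-thresholding and the constrained solution is the projection onto \([-\tau,\tau]\). The function names are hypothetical.

```python
def penalized_solution(b, lam):
    """argmin over x of 0.5*(x - b)**2 + lam*|x| (soft-thresholding)."""
    magnitude = max(abs(b) - lam, 0.0)
    return magnitude if b >= 0 else -magnitude

def constrained_solution(b, tau):
    """argmin over x of 0.5*(x - b)**2 subject to |x| <= tau
    (projection of b onto the interval [-tau, tau])."""
    return max(-tau, min(tau, b))

def tau_of_lambda(b, lam):
    """In this toy case the two solutions agree exactly when tau = |b| - lam."""
    return abs(b) - lam

b = 3.0
pairs = [
    (penalized_solution(b, lam), constrained_solution(b, tau_of_lambda(b, lam)))
    for lam in (0.5, 1.0, 2.0, 2.5)
]
```

Here the map from \(\lambda\) to \(\tau=|b|-\lambda\) is a bijection of the interval \((0,|b|)\) onto itself, which is the simplest instance of the graph condition stated in the abstract.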
Recently convex optimization models were successfully applied
for solving various problems in image analysis and restoration.
In this paper, we are interested in relations between
convex constrained optimization problems
of the form
\({\rm argmin} \{ \Phi(x)\) subject to \(\Psi(x) \le \tau \}\)
and their penalized counterparts
\({\rm argmin} \{\Phi(x) + \lambda \Psi(x)\}\).
We recall general results on the topic with the help of an epigraphical projection.
Then we deal with the special setting \(\Psi := \| L \cdot\|\) with \(L \in \mathbb{R}^{m,n}\)
and \(\Phi := \varphi(H \cdot)\),
where \(H \in \mathbb{R}^{n,n}\) and \(\varphi: \mathbb R^n \rightarrow \mathbb{R} \cup \{+\infty\} \)
meet certain requirements which are often fulfilled in image processing models.
In this case we prove by incorporating the dual problems
that there exists a bijective function
such that
the solutions of the constrained problem coincide with those of the
penalized problem if and only if \(\tau\) and \(\lambda\) are in the graph
of this function.
We illustrate the relation between \(\tau\) and \(\lambda\) for various problems
arising in image processing.
In particular, we point out the relation to the Pareto frontier for joint sparsity problems.
We demonstrate the performance of the
constrained model in restoration tasks of images corrupted by Poisson noise
with the \(I\)-divergence as data fitting term \(\varphi\)
and in inpainting models with the constrained nuclear norm.
Such models can be useful if we have a priori knowledge on the image rather than on the noise level.
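The \(I\)-divergence used as a data fitting term can be written down in a few lines. The sketch below uses the common generalized Kullback-Leibler form, the sum over i of f_i log(f_i / y_i) - f_i + y_i, with the convention 0 log 0 = 0 and the value +infinity outside the domain. Whether the paper uses exactly this normalization is an assumption, and the function name is hypothetical.

```python
import math

def i_divergence(f, y):
    """Generalized Kullback-Leibler divergence between observed data f
    (nonnegative) and a candidate y; returns +inf outside the domain."""
    total = 0.0
    for fi, yi in zip(f, y):
        if yi < 0 or (yi == 0 and fi > 0):
            return math.inf
        if fi > 0:
            total += fi * math.log(fi / yi)
        total += yi - fi
    return total

exact_fit = i_divergence([1.0, 2.0], [1.0, 2.0])
poor_fit = i_divergence([1.0, 2.0], [2.0, 1.0])
outside_domain = i_divergence([1.0, 2.0], [0.0, 2.0])
```

The term vanishes only for a perfect fit and takes values in the extended reals, which matches the requirement that \(\varphi\) maps into \(\mathbb{R} \cup \{+\infty\}\).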
The research for this thesis was conducted to develop a framework which supports the automatic configuration of project-specific software development processes by selecting and combining different technologies: the Process Configuration Framework. The research draws attention to the problem that, while the research community develops new technologies, industrial companies continue to use only the ones they already know well. Because of this, technology transfer takes decades. In addition, no single solution solves all problems in a software development project. This leads to a number of technologies which need to be combined for one project.
The framework developed and explained in this research mainly addresses those problems by building a bridge between research and industry as well as by supporting software companies during the selection of the most appropriate technologies combined in a software process. The technology transformation gap is filled by a repository of (new) technologies which are used as a foundation of the Process Configuration Framework. The process is configured by providing a SPEM process pattern for each technology, so that companies can build their process by plugging these patterns into each other.
The technologies of the repository were specified in a schema including a technology model, a context model, and an impact model. With context and impact it is possible to provide information about a technology, for example, its benefits to quality, cost or schedule. The process patterns are provided as output of the Process Configuration Framework in several stages:
I Technology Ranking:
1 Ranking based on Application Domain, Project & Impact
2 Ranking based on Environment
3 Ranking based on Static Context
II Technology Combination:
4 Creation of all possible Technology Chains
5 Restriction of the Technology Chains
6 Ranking based on Static and Dynamic Context
7 Extension of the Chains by Quality Assurance
III Process Configuration:
8 Process Component Diagram
9 Extension of the Process Component Diagram
10 Instantiation of the Components by Technologies of the Technology Chain
11 Providing process patterns
12 Creation of the process based on Patterns
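The ranking and combination stages above can be pictured with a small sketch. Everything in it is hypothetical: the technologies, the single `impact` score, the incompatibility rule, and the function names are invented for illustration and do not come from the framework's repository or schema.

```python
from itertools import product

PHASES = ["requirements", "design", "test"]

# Hypothetical repository entries with one aggregated impact score each.
TECHNOLOGIES = [
    {"name": "Interviews", "phase": "requirements", "impact": 0.6},
    {"name": "Use cases", "phase": "requirements", "impact": 0.8},
    {"name": "UML modelling", "phase": "design", "impact": 0.7},
    {"name": "Unit testing", "phase": "test", "impact": 0.9},
    {"name": "Model-based testing", "phase": "test", "impact": 0.5},
]

# Hypothetical restriction: pairs that must not appear in the same chain.
INCOMPATIBLE = {("Interviews", "Model-based testing")}

def rank(technologies):
    """Stage I: order technologies by their impact score."""
    return sorted(technologies, key=lambda t: t["impact"], reverse=True)

def all_chains(technologies):
    """Step 4: one technology per phase, in every possible combination."""
    per_phase = [[t for t in technologies if t["phase"] == p] for p in PHASES]
    return [list(chain) for chain in product(*per_phase)]

def allowed(chain):
    """Step 5: drop chains that contain an incompatible pair."""
    names = {t["name"] for t in chain}
    return not any(a in names and b in names for a, b in INCOMPATIBLE)

def chain_score(chain):
    """Step 6: rank chains, here simply by summed impact."""
    return sum(t["impact"] for t in chain)

chains = [c for c in all_chains(rank(TECHNOLOGIES)) if allowed(c)]
best = max(chains, key=chain_score)
best_names = [t["name"] for t in best]
```

In this toy data set four chains are possible, one is removed by the restriction, and the highest-scoring remaining chain would be the one handed on to process configuration.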
The effectiveness and quality of the Process Configuration Framework have additionally been evaluated in a case study. Here, the Technology Chains manually created by experts were compared to the chains automatically created by the framework after it was configured by those experts. This comparison showed that the framework's results are similar to those of the experts and can therefore be used as a recommendation.
We conclude from our research that support during the configuration of a process for software projects is important, especially for non-experts. This support is provided by the Process Configuration Framework developed in this research. In addition, our research has shown that this framework offers a possibility to help close the technology transformation gap between the research community and industrial companies.