The Super-Peer Selection Problem is an optimization problem in network topology construction. It may be cast as a special case of a Hub Location Problem, more exactly an Uncapacitated Single Allocation p-Hub Median Problem with equal weights. We show that this problem is still NP-hard by reduction from Max Clique.
In this article we present a method to generate random objects from a large variety of combinatorial classes according to a given distribution. Given a description of the combinatorial class and a set of sample data our method will provide an algorithm that generates objects of size n in worst-case runtime O(n^2) (O(n log(n)) can be achieved at the cost of a higher average-case runtime), with the generated objects following a distribution that closely matches the distribution of the sample data.
There is a well known relationship between alternating automata on finite words and symbolically represented nondeterministic automata on finite words. This relationship is of practical relevance because it allows to combine the advantages of alternating and symbolically represented nondeterministic automata on finite words. However, for infinite words the situation is unclear. Therefore, this work investigates the relationship between alternating omega-automata and symbolically represented nondeterministic omega-automata. Thereby, we identify classes of alternating omega-automata that are as expressive as safety, liveness and deterministic prefix automata, respectively. Moreover, some very simple symbolic nondeterminisation procedures are developed for the classes corresponding to safety and liveness properties.
Ownership Domains generalize ownership types. They support programming patterns like iterators that are not possible with ordinary ownership types. However, they are still too restrictive for cases in which an object X wants to access the public domains of an arbitrary number of other objects, which often happens in observer scenarios. To overcome this restriction, we developed so-called loose domains which abstract over several precise domains. That is, similar to the relation between supertypes and subtypes we have a relation between loose and precise domains. In addition, we simplified ownership domains by reducing the number of domains per object to two and hard-wiring the access permissions between domains. We formalized the resulting type system for an OO core language and proved type soundness and a fundamental accessibility property.
As any RNA sequence can be folded in many different ways, there are lots of different possible secondary structures for a given sequence. Most computational prediction methods based on free energy minimization compute a number of suboptimal foldings and we have to identify the native structures among all these possible secondary structures. For this reason, much effort has been made to develop approaches for identifying good predictions of RNA secondary structure. Using the abstract shapes approach as introduced by Giegerich et al., each class of similar secondary structures is represented by one shape and the native structures can be found among the top shape representatives. In this article, we derive some interesting results answering enumeration problems for abstract shapes and secondary structures of RNA. We start by computing symptotical representations for the number of shape representations of length n. Our main goal is to find out how much the search space can be reduced by using the concept of abstract shapes. To reach this goal, we analyze the number of secondary structures and shapes compatible with an RNA sequence of length n under the assumption that base pairing is allowed between arbitrary pairs of bases analytically and compare their exponential growths. Additionally, we analyze the number of secondary structures compatible with an RNA sequence of length n under the assumptions that base pairing is allowed only between certain pairs of bases and that the structures meet some appropriate conditions. The exponential growth factors of the resulting asymptotics are compared to the corresponding experimentally obtained value as given by Giegerich et al.
This article focuses on the analytical analysis of the free energy in a realistic model for RNA secondary structures. In fact, the free energy in a stochastic model derived from a database of small and large subunit ribosomal RNA (SSU and LSU rRNA) data is studied. A common thermody-namic model for computing the free energy of a given RNA secondary structure, as well as stochastic context-free grammars and generating functions are used to derive the desired results. These results include asymptotics for the expected free energy and for the corresponding variance of a random RNA secondary structure. The quality of our model is judged by comparing the derived results to the used database of SSU and LSU rRNA data. At the end of this article, it is discussed how our results could be used to help on identifying good predictions of RNA secondary structure.
One of the fundamental problems in computational structural biology is the prediction of RNA secondary structures from a single sequence. To solve this problem, mainly two different approaches have been used over the past decades: the free energy minimization (MFE) approach which is still considered the most popular and successful method and the competing stochastic context-free grammar (SCFG) approach. While the accuracy of the MFE based algorithms is limited by the quality of underlying thermodynamic models, the SCFG method abstracts from free energies and instead tries to learn about the structural behavior of the molecules by training the grammars on known real RNA structures, making it highly dependent on the availability of a rich high quality training set. However, due to the respective problems associated with both methods, new statistics based approaches towards RNA structure prediction have become increasingly appreciated. For instance, over the last years, several statistical sampling methods and clustering techniques have been invented that are based on the computation of partition functions (PFs) and base pair probabilities according to thermodynamic models. A corresponding SCFG based statistical sampling algorithm for RNA secondary structures has been studied just recently. Notably, this probabilistic method is capable of producing accurate (prediction) results, where its worst-case time and space requirements are equal to those of common RNA folding algorithms for single sequences.
The aim of this work is to present a comprehensive study on how enriching the underlying SCFG by additional information on the lengths of generated substructures (i.e. by incorporating length-dependencies into the SCFG based sampling algorithm, which is actually possible without significant losses in performance) affects the reliability of the induced RNA model and the accuracy of sampled secondary structures. As we will see, significant differences with respect to the overall quality of generated sample sets and the resulting predictive accuracy are typically implied. In principle, when considering the more specialized length-dependent SCFG model as basis for statistical sampling, a higher accuracy of predicted foldings can be reached at the price of a lower diversity of generated candidate structures (compared to the more general traditional SCFG variant or sampling based on PFs that rely on free energies).
This document introduces the extension of Katja to support position structures and explains the subtleties of their application as well as the design decisions made and problems solved with respect to their implementation. The Katja system was first introduced by Jan Schäfer in the context of his project work and is based on the MAX system developed by Arnd Poetzsch-Heffter.