On Abstract Shapes of RNA
- As any RNA sequence can be folded in many different ways, there are lots of different possible secondary structures for a given sequence. Most computational prediction methods based on free energy minimization compute a number of suboptimal foldings and we have to identify the native structures among all these possible secondary structures. For this reason, much effort has been made to develop approaches for identifying good predictions of RNA secondary structure. Using the abstract shapes approach as introduced by Giegerich et al., each class of similar secondary structures is represented by one shape and the native structures can be found among the top shape representatives. In this article, we derive some interesting results answering enumeration problems for abstract shapes and secondary structures of RNA. We start by computing symptotical representations for the number of shape representations of length n. Our main goal is to find out how much the search space can be reduced by using the concept of abstract shapes. To reach this goal, we analyze the number of secondary structures and shapes compatible with an RNA sequence of length n under the assumption that base pairing is allowed between arbitrary pairs of bases analytically and compare their exponential growths. Additionally, we analyze the number of secondary structures compatible with an RNA sequence of length n under the assumptions that base pairing is allowed only between certain pairs of bases and that the structures meet some appropriate conditions. The exponential growth factors of the resulting asymptotics are compared to the corresponding experimentally obtained value as given by Giegerich et al.