List of RNA structure prediction software

From Wikipedia, the free encyclopedia

This list of RNA structure prediction software is a compilation of software tools and web portals used for nucleic acid structure prediction.

Single sequence tertiary structure prediction

Comparative methods

RNA solvent accessibility prediction

Name (Year) | Description | Link | References
RNAsnap2 (2020) | RNAsnap2 uses a dilated convolutional neural network with evolutionary features generated from BLAST + INFERNAL (same as RNAsol) and predicted base-pairing probabilities from LinearPartition as an input for the prediction of RNA solvent accessibility. Also, the single-sequence version of RNAsnap2 can predict the solvent accessibility of a given input RNA sequence without using evolutionary information. | sourcecode, webserver | [101]
RNAsol (2019) | RNAsol predictor uses a unidirectional LSTM deep learning algorithm with evolutionary information generated from BLASTN + INFERNAL and predicted secondary structure from RNAfold as an input for the prediction of RNA solvent accessibility. | sourcecode, webserver | [102]
RNAsnap (2017) | RNAsnap predictor uses an SVM machine learning algorithm and evolutionary information generated from BLASTN as an input for the prediction of RNA solvent accessibility. | sourcecode | [103]

Intermolecular interactions: RNA-RNA

Many ncRNAs function by binding to other RNAs. For example, miRNAs regulate protein coding gene expression by binding to 3' UTRs; small nucleolar RNAs guide post-transcriptional modifications by binding to rRNA; U4 spliceosomal RNA and U6 spliceosomal RNA bind to each other, forming part of the spliceosome; and many small bacterial RNAs, such as GcvB, OxyS and RyhB, regulate gene expression by antisense interactions.

Name | Description | Intra-molecular structure | Comparative | Link | References
SQUARNA | SQUARNA predicts RNA secondary structure formed by several RNA sequences using a greedy stem formation model. | Yes | Yes | sourcecode | [1]
RNApredator | RNApredator uses a dynamic programming approach to compute RNA-RNA interaction sites. | Yes | No | webserver (archived 2015-01-10 at the Wayback Machine) | [104]
GUUGle | A utility for fast determination of RNA-RNA matches with perfect hybridization via A-U, C-G, and G-U base pairing. | No | No | webserver | [105]
IntaRNA | Efficient target prediction incorporating the accessibility of target sites. | Yes | No | sourcecode, webserver | [106][107][108][109][110]
CopraRNA | Tool for sRNA target prediction. It computes whole-genome predictions by combining distinct whole-genome IntaRNA predictions. | Yes | Yes | sourcecode, webserver | [111][107]
MINT | Automatic tool to analyze three-dimensional structures of RNA and DNA molecules, their full-atom molecular dynamics trajectories or other conformation sets (e.g. X-ray or NMR-derived structures). For each RNA or DNA conformation, MINT determines the hydrogen bonding network resolving the base pairing patterns, and identifies secondary structure motifs (helices, junctions, loops, etc.) and pseudoknots. It also estimates the energy of stacking and phosphate anion-base interactions. | Yes | No | sourcecode, webserver | [112]
NUPACK | Computes the full unpseudoknotted partition function of interacting strands in dilute solution. Calculates the concentrations, MFEs, and base-pairing probabilities of the ordered complexes below a certain complexity. Also computes the partition function and base pairing of single strands, including a class of pseudoknotted structures, and enables design of ordered complexes. | Yes | No | NUPACK | [113]
OligoWalk/RNAstructure | Predicts bimolecular secondary structures with and without intramolecular structure. Also predicts the hybridization affinity of a short nucleic acid to an RNA target. | Yes | No | | [114]
piRNA | Calculates the partition function and thermodynamics of RNA-RNA interactions. It considers all possible joint secondary structures of two interacting nucleic acids that do not contain pseudoknots, interaction pseudoknots, or zigzags. | Yes | No | linuxbinary | [115]
piRNAPred | An integrated framework for piRNA prediction employing hybrid features such as k-mer nucleotide composition, secondary structure, and thermodynamic and physicochemical properties. | Yes | No | | [116]
RNAripalign | Calculates the partition function and thermodynamics of RNA-RNA interactions based on structural alignments. Also supports RNA-RNA interaction prediction for single sequences. It outputs suboptimal structures based on the Boltzmann distribution. It considers all possible joint secondary structures of two interacting nucleic acids that do not contain pseudoknots, interaction pseudoknots, or zigzags. | Yes | No | | [117]
RactIP | Fast and accurate prediction of RNA-RNA interaction using integer programming. | Yes | No | sourcecode, webserver | [118]
RNAaliduplex | Based on RNAduplex, with bonuses for covarying sites. | No | Yes | sourcecode | [21]
RNAcofold | Works much like RNAfold, but allows specifying two RNA sequences which are then allowed to form a dimer structure. | Yes | No | sourcecode | [21][119]
RNAduplex | Computes optimal and suboptimal secondary structures for hybridization. The calculation is simplified by allowing only inter-molecular base pairs. | No | No | sourcecode | [21]
RNAhybrid | Tool to find the minimum free energy hybridisation of a long and a short RNA (≤ 30 nt). | No | No | sourcecode, webserver | [120][121]
RNAup | Calculates the thermodynamics of RNA-RNA interactions. RNA-RNA binding is decomposed into two stages: (1) the probability that a sequence interval (e.g. a binding site) remains unpaired is computed; (2) the binding energy, given that the binding site is unpaired, is calculated as the optimum over all possible types of bindings. | Yes | No | sourcecode | [21][122]
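
The kind of search described in the GUUGle entry above (perfect hybridization via A-U, C-G, and G-U pairs) can be illustrated with a short sketch. This is not GUUGle's implementation, which is built for speed on large inputs; it is a naive diagonal scan, and the function name `perfect_matches` and the `min_len` parameter are hypothetical names chosen for this example.

```python
# Allowed base pairs: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def perfect_matches(target, query, min_len=4):
    """Find maximal runs of consecutive base pairs between two RNAs.

    Both sequences are given 5'->3'. The query is reversed so that the
    strands pair in antiparallel orientation. Returns a list of
    (target_start, query_start, length) tuples with 0-based 5'->3' starts.
    """
    rev_query = query[::-1]
    n, m = len(target), len(rev_query)
    hits = []
    # Each diagonal of the n x m pairing matrix is one possible register.
    for offset in range(-(m - 1), n):
        i = max(offset, 0)
        j = i - offset
        run = 0
        while i < n and j < m:
            if (target[i], rev_query[j]) in PAIRS:
                run += 1
            else:
                if run >= min_len:
                    hits.append((i - run, m - j, run))
                run = 0
            i += 1
            j += 1
        # A run may extend to the end of the diagonal.
        if run >= min_len:
            hits.append((i - run, m - j, run))
    return hits


if __name__ == "__main__":
    print(perfect_matches("CCAAACC", "UUU", min_len=3))
```

The scan takes time proportional to the product of the two sequence lengths, which is adequate for short inputs; it ignores thermodynamics and intramolecular structure, which tools such as RNAup and IntaRNA take into account.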

Intermolecular interactions: MicroRNA:any RNA

The table below includes interactions that are not limited to UTRs.

Name | Description | Cross-species | Intra-molecular structure | Comparative | Link | References
comTAR | A web tool for the prediction of miRNA targets that is mainly based on the conservation of the potential regulation in plant species. | Yes | No | No | Web tool | [123]
RNA22 | The first link (precomputed predictions) provides RNA22 predictions for all protein coding transcripts in human, mouse, roundworm, and fruit fly. It allows visualizing the predictions within a cDNA map and finding transcripts that are targeted by multiple miRNAs of interest. The second link (interactive/custom sequences) first finds putative microRNA binding sites in the sequence of interest, then identifies the targeting microRNA. Both tools are provided by the Computational Medicine Center at Thomas Jefferson University. | Yes | No | No | precomputed predictions, interactive/custom sequences | [124]
RNAhybrid | Tool to find the minimum free energy hybridisation of a long and a short RNA (≤ 30 nt). | Yes | No | No | sourcecode, webserver | [120][121]
miRBooking | Simulates the stoichiometric mode of action of microRNAs using a derivative of the Gale-Shapley algorithm for finding a stable set of duplexes. It uses quantifications for traversing the set of mRNA and microRNA pairs, and seed complementarity for ranking and assigning sites. | Yes | No | No | sourcecode, webserver | [125]
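
For readers unfamiliar with the Gale-Shapley algorithm mentioned in the miRBooking entry, the following is a minimal sketch of the textbook version only. It is not miRBooking's derivative, which additionally uses expression quantifications and seed complementarity; the function name, the preference dictionaries, and the miRNA and site labels are all hypothetical, and complete preference lists are assumed.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Textbook Gale-Shapley stable matching.

    proposer_prefs and receiver_prefs map each name to a list of names on
    the other side, most preferred first. Every receiver is assumed to rank
    every proposer. Returns a dict mapping receiver -> matched proposer.
    """
    # Precompute each receiver's ranking for constant-time comparisons.
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # next index to propose to
    engaged = {}                                  # receiver -> proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p            # r was free: accept
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p            # r prefers p: swap partners
            free.append(current)
        else:
            free.append(p)            # r rejects p: p tries again later
    return engaged


if __name__ == "__main__":
    mirnas = {"miR-a": ["site1", "site2"], "miR-b": ["site1", "site2"]}
    sites = {"site1": ["miR-b", "miR-a"], "site2": ["miR-a", "miR-b"]}
    print(gale_shapley(mirnas, sites))
```

The result is stable in the sense that no miRNA and site would both prefer each other over their assigned partners.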

Intermolecular interactions: MicroRNA:UTR

MicroRNAs regulate protein coding gene expression by binding to 3' UTRs, and there are tools designed specifically for predicting these interactions. For an evaluation of target prediction methods on high-throughput experimental data, see Baek et al. (Nature, 2008),[126] Alexiou et al. (Bioinformatics, 2009),[127] or Ritchie et al. (Nature Methods, 2009).[128]

Name | Description | Cross-species | Intra-molecular structure | Comparative | Link | References
Cupid | Method for simultaneous prediction of miRNA-target interactions and their mediated competing endogenous RNA (ceRNA) interactions. This integrative approach significantly improves miRNA-target prediction accuracy, as assessed by both mRNA and protein level measurements in breast cancer cell lines. Cupid is implemented in 3 steps. Step 1: candidate miRNA binding sites in 3' UTRs are re-evaluated. Step 2: interactions are predicted by integrating information about selected sites and the statistical dependency between the expression profiles of miRNA and putative targets. Step 3: Cupid assesses whether inferred targets compete for predicted miRNA regulators. | human | No | Yes | software (MATLAB) | [129]
Diana-microT | Version 3.0 is an algorithm based on several parameters calculated individually for each microRNA. It combines conserved and non-conserved microRNA recognition elements into a final prediction score. | human, mouse | No | Yes | webserver | [130]
MicroTar | An animal miRNA target prediction tool based on miRNA-target complementarity and thermodynamic data. | Yes | No | No | sourcecode | [131]
miTarget | MicroRNA target gene prediction using a support vector machine. | Yes | No | No | webserver | [132]
miRror | Based on the notion of combinatorial regulation by an ensemble of miRNAs or genes. miRror integrates predictions from a dozen miRNA resources that are based on complementary algorithms into a unified statistical framework. | Yes | No | No | webserver (archived 2016-03-03 at the Wayback Machine) | [133][134]
PicTar | Combinatorial microRNA target predictions. | 8 vertebrates | No | Yes | predictions | [135]
PITA | Incorporates the role of target-site accessibility, as determined by base-pairing interactions within the mRNA, in microRNA target recognition. | Yes | Yes | No | executable, webserver, predictions | [136]
RNA22 | The first link (precomputed predictions) provides RNA22 predictions for all protein coding transcripts in human, mouse, roundworm, and fruit fly. It allows visualizing the predictions within a cDNA map and finding transcripts that are targeted by multiple miRNAs of interest. The second link (interactive/custom sequences) first finds putative microRNA binding sites in the sequence of interest, then identifies the targeting microRNA. Both tools are provided by the Computational Medicine Center at Thomas Jefferson University. | Yes | No | No | precomputed predictions, interactive/custom sequences | [124]
RNAhybrid | Tool to find the minimum free energy hybridisation of a long and a short RNA (≤ 30 nt). | Yes | No | No | sourcecode, webserver | [120][121]
Sylamer | Method to find significantly over- or under-represented words in sequences according to a sorted gene list. Usually used to find significant enrichment or depletion of microRNA or siRNA seed sequences from microarray expression data. | Yes | No | No | sourcecode, webserver | [137][138]
TAREF | TARget REFiner (TAREF) predicts microRNA targets on the basis of multiple feature information derived from the flanking regions of the predicted target sites, where traditional structure prediction approaches may not succeed in assessing openness. It also provides an option to use encoded patterns to refine filtering. | Yes | No | No | server/sourcecode | [139]
p-TAREF | plant TARget REFiner (p-TAREF) identifies plant microRNA targets on the basis of multiple feature information derived from the flanking regions of the predicted target sites, where traditional structure prediction approaches may not succeed in assessing openness. It also provides an option to use encoded patterns to refine filtering. It was the first to employ a machine learning approach with a scoring scheme based on support vector regression (SVR) while considering structural and alignment aspects of targeting in plants with plant-specific models. p-TAREF has been implemented with a concurrent architecture in server and standalone forms, making it one of the very few target identification tools able to run concurrently on simple desktops while performing large transcriptome-level analyses accurately and quickly. It also provides an option to validate the predicted targets on the spot using expression data integrated in its back-end, to draw confidence in the predictions along with the SVR score. p-TAREF has been benchmarked extensively through different tests and compared with other plant miRNA target identification tools, and was found to perform better. | Yes | No | No | server/standalone |
TargetScan | Predicts biological targets of miRNAs by searching for the presence of sites that match the seed region of each miRNA. In flies and nematodes, predictions are ranked based on the probability of their evolutionary conservation. In zebrafish, predictions are ranked based on site number, site type, and site context, which includes factors that influence target-site accessibility. In mammals, the user can choose whether the predictions should be ranked based on the probability of their conservation or on site number, type, and context. In mammals and nematodes, the user can choose to extend predictions beyond conserved sites and consider all sites. | vertebrates, flies, nematodes | evaluated indirectly | Yes | sourcecode, webserver | [140][141][142][143][144][145]
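
Several of the tools above start from sites that match the miRNA seed region. The sketch below shows only that first step: locating perfect seed matches in a UTR. It assumes the commonly used seed definition of miRNA nucleotides 2 to 8; the function name `seed_sites` and its parameters are hypothetical, and none of the conservation, context, or accessibility scoring performed by tools such as TargetScan or PITA is reproduced here.

```python
# Translation table for RNA complementation (Watson-Crick only).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_sites(mirna, utr, seed_start=2, seed_end=8):
    """Return 0-based UTR positions of perfect seed matches.

    mirna and utr are RNA strings written 5'->3'. The seed is taken as
    miRNA nucleotides seed_start..seed_end (1-based, inclusive); a site is
    an occurrence of the seed's reverse complement in the UTR.
    """
    seed = mirna[seed_start - 1:seed_end]
    site = seed.translate(COMPLEMENT)[::-1]
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"
    utr = "AAACUACCUCAAAGGGCUACCUCA"  # made-up UTR fragment for illustration
    print(seed_sites(let7a, utr))
```

Counting and locating such sites gives only candidate targets; ranking them is where the listed tools differ.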

ncRNA gene prediction software

Family specific gene prediction software

Name | Description | Family | Link | References
ARAGORN | ARAGORN detects tRNA and tmRNA in nucleotide sequences. | tRNA, tmRNA | webserver, source | [155]
miReader | miReader is the first tool of its type to detect mature miRNAs with no dependence on genomic or reference sequences. Previously, discovering miRNAs was possible only in species for which genomic or reference sequences were available, as most miRNA discovery tools relied on drawing pre-miRNA candidates. As a result, miRNA biology was mostly limited to model organisms. miReader allows mature miRNAs to be discerned directly from small RNA sequencing data, with no need for genomic reference sequences. It has been developed for many phyla and species, from vertebrate to plant models. Its accuracy has been found to be consistently >90% in extensive validation testing. | mature miRNA | webserver/source, webserver/source | [156]
miRNAminer | Given a search query, candidate homologs are identified using BLAST search and then tested for their known miRNA properties, such as secondary structure, energy, alignment and conservation, in order to assess their fidelity. | MicroRNA | webserver | [157]
RISCbinder | Predicts the guide strand of microRNAs. | Mature miRNA | webserver | [158]
RNAmicro | An SVM-based approach that, in conjunction with a non-stringent filter for consensus secondary structures, is capable of recognizing microRNA precursors in multiple sequence alignments. | MicroRNA | homepage (archived 2009-08-16 at the Wayback Machine) | [159]
RNAmmer | RNAmmer uses HMMER to annotate rRNA genes in genome sequences. Profiles were built using alignments from the European ribosomal RNA database[160] and the 5S Ribosomal RNA Database.[161] | rRNA | webserver, source (archived 2019-06-13 at the Wayback Machine) | [162]
SnoReport | Uses a mix of RNA secondary structure prediction and machine learning designed to recognize the two major classes of snoRNAs, box C/D and box H/ACA snoRNAs, among ncRNA candidate sequences. | snoRNA | sourcecode (archived 2009-07-06 at the Wayback Machine) | [163]
SnoScan | Searches for C/D box methylation guide snoRNA genes in a genomic sequence. | C/D box snoRNA | sourcecode, webserver | [164][165]
tRNAscan-SE | A program for the detection of transfer RNA genes in genomic sequence. | tRNA | sourcecode, webserver | [165][166]
miRNAFold | A fast ab initio software for searching for microRNA precursors in genomes. | microRNA | webserver | [167]

RNA homology search software

Name | Description | Link | References
DECIPHER (software) | FindNonCoding takes a pattern mining approach to capture the essential sequence motifs and hairpin loops representing a non-coding RNA family and quickly identify matches in genomes. FindNonCoding was designed for ease of use and accurately finds non-coding RNAs with a low false discovery rate. | sourcecode | [168]
ERPIN | "Easy RNA Profile IdentificatioN" is an RNA motif search program that reads a sequence alignment and secondary structure, and automatically infers a statistical "secondary structure profile" (SSP). An original dynamic programming algorithm then matches this SSP onto any target database, finding solutions and their associated scores. | sourcecode, webserver (archived 2011-09-29 at the Wayback Machine) | [169][170][171]
Infernal | "INFERence of RNA ALignment" is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs). | sourcecode | [172][173][174]
GraphClust | Fast RNA structural clustering method to identify common (local) RNA secondary structures. Predicted structural clusters are presented as alignments. Due to the linear time complexity of the clustering, it is possible to analyse large RNA datasets. | sourcecode | [64]
PHMMTS | "Pair hidden Markov models on tree structures" is an extension of pair hidden Markov models defined on alignments of trees. | sourcecode, webserver | [175]
RaveNnA | A slow and rigorous or fast and heuristic sequence-based filter for covariance models. | sourcecode (archived 2008-05-28 at the Wayback Machine) | [176][177]
RSEARCH | Takes one RNA sequence with its secondary structure and uses a local alignment algorithm to search a database for homologous RNAs. | sourcecode [dead link] | [178]
Structator | Ultra fast software for searching for RNA structural motifs, employing an innovative index-based bidirectional matching algorithm combined with a new fast fragment chaining strategy. | sourcecode | [179]
RaligNAtor | Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns. | sourcecode | [180]

Benchmarks

Alignment viewers, editors

Inverse folding, RNA design

Name | Description | Link | References
Single state design
EteRNA/EteRNABot | An RNA folding game that challenges players to make sequences that fold into a target RNA structure. The best sequences for a given puzzle are synthesized and their structures are probed through chemical mapping. The sequences are then scored by the data's agreement with the target structure, and feedback is provided to the players. EteRNABot is a software implementation based on design rules submitted by EteRNA players. | EteRNA Game, EteRNABot web server | [194]
RNAinverse | The ViennaRNA Package provides RNAinverse, an algorithm for designing sequences with a desired structure. | Web Server | [21]
RNAiFold | A complete RNA inverse folding approach based on constraint programming and implemented using OR Tools, which allows for the specification of a wide range of design constraints. The RNAiFold software provides two algorithms to solve the inverse folding problem: i) RNA-CPdesign explores the complete search space, and ii) RNA-LNSdesign, based on the large neighborhood search metaheuristic, is suitable for designing large structures. The software can also design interacting RNA molecules using RNAcofold of the ViennaRNA Package. A fully functional, earlier implementation using COMET is available. | Web Server, Source Code | [195][196][197]
RNA-SSD/RNA Designer | The RNA-SSD (RNA Secondary Structure Designer) approach first assigns bases probabilistically to each position based on probabilistic models. Subsequently, a stochastic local search is used to optimize this sequence. RNA-SSD is publicly available under the name RNA Designer at the RNASoft web page. | Web Server | [198]
INFO-RNA | INFO-RNA uses a dynamic programming approach to generate an energy-optimized starting sequence that is subsequently improved by a stochastic local search that uses an effective neighbor selection method. | Web Server, Source Code | [199][200]
RNAexinv | RNAexinv is an extension of RNAinverse that generates sequences that not only fold into a desired structure but also exhibit selected attributes such as thermodynamic stability and mutational robustness. This approach does not necessarily output a sequence that perfectly fits the input structure, but rather a shape abstraction of it, i.e. it keeps the adjacency and nesting of structural elements, but disregards helix lengths and the exact number of unpaired positions. | Source Code | [201]
RNA-ensign | This approach applies an efficient global sampling algorithm to examine the mutational landscape under structural and thermodynamic constraints. The authors show that the global sampling approach is more robust, succeeds more often and generates more thermodynamically stable sequences than local approaches do. | Source Code | [202]
IncaRNAtion | Successor of RNA-ensign that can specifically design sequences with a specified GC content using a GC-weighted Boltzmann ensemble and stochastic backtracking. | Source Code | [203]
DSS-Opt | Dynamics in Sequence Space Optimization (DSS-Opt) uses Newtonian dynamics in the sequence space, with a negative design term and simulated annealing, to optimize a sequence such that it folds into the desired secondary structure. | Source Code | [204]
MODENA | This approach interprets RNA inverse folding as a multi-objective optimization problem and solves it using a genetic algorithm. In its extended version, MODENA is able to design pseudoknotted RNA structures with the aid of IPknot. | Source Code | [205][206]
ERD | Evolutionary RNA Design (ERD) can be used to design RNA sequences that fold into a given target structure. Any RNA secondary structure contains different structural components, each having a different length. Therefore, in the first step, the RNA subsequences (pools) corresponding to different components with different lengths are reconstructed. Using these pools, ERD reconstructs an initial RNA sequence which is compatible with the given target structure. Then ERD uses an evolutionary algorithm to improve the quality of the subsequences corresponding to the components. The major contributions of ERD are the use of natural RNA sequences, a different method to evaluate the sequences in each population, and a different hierarchical decomposition of the target structure into smaller substructures. | Web Server, Source Code (archived 2014-10-19 at the Wayback Machine) | [207]
antaRNA | Uses ant colony foraging heuristic terrain modeling to solve the inverse folding problem. The designed RNA sequences show high compliance with input structural and sequence constraints. Most prominently, the GC value of the designed sequence can also be regulated with high precision. GC value distribution sampling of solution sets is possible, as is sequence-domain-specific definition of multiple GC values within one entity. Due to the flexible evaluation of the intermediate sequences using underlying programs such as RNAfold, pKiss, HotKnots or IPKnot, both nested RNA secondary structures and pseudoknot structures of H- and K-type can be solved with this approach. | Web Server, Source Code | [208][209]
Dual state design
switch.pl | The ViennaRNA Package provides a Perl script to design RNA sequences that can adopt two states. For instance, RNA thermometers, which change their structural state depending on the environmental temperature, have been successfully designed using this program. | Man Page, Source Code | [210]
RiboMaker | Intended to design small RNAs (sRNA) and their target mRNA's 5' UTR. The sRNA is designed to activate or repress protein expression of the mRNA. It is also possible to design just one of the two RNA components, provided the other sequence is fixed. | Web Server, Source Code | [211]
Multi state design
RNAblueprint | This C++ library is based on the RNAdesign multiple target sampling algorithm. It provides a SWIG interface for Perl and Python, which allows for integration into various tools. Multiple target sequence sampling can therefore be combined with many optimization techniques and objective functions. | Source Code | [212]
RNAdesign | The underlying algorithm is based on a mix of graph coloring and heuristic local optimization to find sequences that can adopt multiple prescribed conformations. The software can also use RNAcofold to design interacting RNA sequence pairs. | Source Code [permanent dead link] | [213]
Frnakenstein | Frnakenstein applies a genetic algorithm to solve the inverse RNA folding problem. | Source Code | [214]
ARDesigner | The Allosteric RNA Designer (ARDesigner) is a web-based tool that solves the inverse folding problem by incorporating mutational robustness. Besides a local search, the software has been equipped with a simulated annealing approach to effectively search for good solutions. The tool has been used to design RNA thermometers. | [dead link] | [215]
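
Many of the design tools above begin from a sequence that is compatible with the target structure, meaning that every prescribed base pair can form, and then optimize it. The sketch below illustrates only that starting point for a pseudoknot-free dot-bracket structure. It is not taken from any listed tool; the function names are hypothetical, and compatibility alone does not mean the sequence folds into the target, which requires an energy model such as the one in the ViennaRNA Package.

```python
import random

# Pairs assumed allowed: Watson-Crick plus G-U wobble.
PAIRABLE = {"AU", "UA", "GC", "CG", "GU", "UG"}


def pair_table(structure):
    """Map each paired position to its partner for a dot-bracket string."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def is_compatible(sequence, structure):
    """True if every base pair in the structure can form in the sequence."""
    if len(sequence) != len(structure):
        return False
    pairs = pair_table(structure)
    return all(
        sequence[i] + sequence[j] in PAIRABLE
        for i, j in pairs.items()
        if i < j
    )


def random_compatible(structure, rng=None):
    """Draw a random sequence that is compatible with the structure."""
    rng = rng or random.Random()
    pairs = pair_table(structure)
    seq = [None] * len(structure)
    for i in range(len(structure)):
        if seq[i] is not None:
            continue  # already filled as the partner of an earlier position
        if i in pairs:
            first, second = rng.choice(sorted(PAIRABLE))
            seq[i], seq[pairs[i]] = first, second
        else:
            seq[i] = rng.choice("ACGU")
    return "".join(seq)


if __name__ == "__main__":
    target = "((((...))))"
    candidate = random_compatible(target, random.Random(0))
    print(candidate, is_compatible(candidate, target))
```

A local search such as the one described for RNAinverse would repeatedly mutate such a candidate and keep changes that bring its predicted fold closer to the target.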
Notes

Secondary structure viewers, editors

See also

References
