List of software to detect low complexity regions in proteins


Computational methods can analyze protein sequences to identify low-complexity regions (LCRs), which can have particular functional and structural properties.

| Name | Last update | Usage | Description | Open source? | Reference |
|---|---|---|---|---|---|
| SAPS | 1992 | downloadable / web | Computes several protein sequence statistics for evaluating distinctive characteristics of residue content and arrangement in primary structures. | yes | [1] |
| SEG | 1993 | downloadable | A two-pass algorithm: it first identifies the LCRs, then performs local optimization by masking the LCRs with Xs. | yes | [2] |
| fLPS | 2017 | downloadable / web | Readily handles very large protein data sets, such as those from metagenomics projects. Useful for searching for proteins with similar compositionally biased regions (CBRs) and for making functional inferences about the CBRs of a protein of interest. | yes | [3] |
| CAST | 2000 | downloadable (as geneCAST) / web | Identifies LCRs using iterative steps of local dynamic programming search and masking against a database of homopolymeric protein sequences. | yes | [4] |
| SIMPLE | 2002 | downloadable / web | Quantifies the amount of simple sequence in proteins and determines the type of short motifs that show clustering above a certain threshold. | yes | [5] |
| Oj.py | 2001 | on request | A tool for demarcating low-complexity protein domains. | no | [6] |
| DSR | 2003 | on request | Calculates complexity using reciprocal complexity. | no | [7] |
| ScanCom | 2003 | on request | Calculates compositional complexity using the linguistic complexity measure. | no | [8] |
| CARD | 2005 | on request | Based on the complexity analysis of subsequences delimited by pairs of identical, repeating subsequences. | no | [9] |
| BIAS | 2006 | downloadable / web | Uses discrete scan statistics, which provide a highly accurate multiple-test correction, to compute analytical estimates of the significance of each compositionally biased segment. | yes | [10] |
| GBA | 2006 | on request | A graph-based algorithm that constructs a graph of the sequence. | no | [11] |
| SubSeqer | 2008 | web | A graph-based approach for the detection and identification of repetitive elements in low-complexity sequences. | no | [12] |
| ANNIE | 2009 | web | Automates the sequence-analysis process. | no | [13] |
| LPS-annotate | 2011 | on request | Defines compositional bias through a thorough search for lowest-probability subsequences (LPSs) and serves as a workbench of tools for molecular biologists to generate hypotheses and inferences about the proteins they are investigating. | no | [14] |
| LCR-eXXXplorer | 2015 | web | A web platform to search, visualize and share data for low-complexity regions in protein sequences. It offers tools for displaying LCRs from the UniProt/SwissProt knowledgebase in combination with other relevant protein features, predicted or experimentally verified. Users may also run queries against a custom-designed sequence/LCR-centric database. | no | [15] |
| XNU | 1993 | downloadable | Uses the PAM120 scoring matrix for the calculation of complexity. | yes | [16] |
| AlcoR | 2022 | downloadable | A compression-based, alignment-free tool for detecting low-complexity regions in biological data (see: Kolmogorov complexity § Compression). | yes | [17] |
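Many of the listed tools share a common idea: score local segments of a sequence with a complexity measure, flag the low-scoring segments, and optionally mask them with Xs. The following is a minimal sketch of that idea using sliding-window Shannon entropy. It is not the algorithm of SEG or of any other tool above; the function names, the window length of 12, and the threshold of 2.2 bits are illustrative assumptions only.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (bits per residue) of the residue composition of a window."""
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in Counter(window).values())


def find_low_complexity(seq: str, window: int = 12, threshold: float = 2.2):
    """Return merged half-open (start, end) intervals covered by low-entropy windows.

    window and threshold are hypothetical defaults chosen for illustration.
    """
    regions = []
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) <= threshold:
            start, end = i, i + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], max(regions[-1][1], end))
            else:
                regions.append((start, end))
    return regions


def mask(seq: str, regions) -> str:
    """Replace every residue inside the given regions with 'X'."""
    chars = list(seq)
    for start, end in regions:
        chars[start:end] = "X" * (end - start)
    return "".join(chars)
```

For example, `mask(seq, find_low_complexity(seq))` returns a sequence of the same length in which residues inside low-entropy windows are replaced by X. Real tools differ in the complexity measure they use (for instance reciprocal complexity, linguistic complexity, or compression) and in how they refine region boundaries.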

For a comprehensive review of the various methods and tools, see [18].

In addition, a web meta-server named PLAtform of TOols for LOw COmplexity (PlaToLoCo) has been developed for the visualization and annotation of low-complexity regions in proteins.[19] PlaToLoCo collects and integrates the output of five different state-of-the-art tools for discovering LCRs and provides functional annotations such as domain detection, transmembrane segment prediction, and calculation of amino acid frequencies. Furthermore, the union or intersection of the tools' results for a query sequence can be obtained.
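Combining the results of several tools by union or intersection can be illustrated with plain interval arithmetic. The sketch below assumes each tool reports LCRs as half-open (start, end) residue intervals; that representation and the function names are assumptions made for illustration and do not describe PlaToLoCo's actual implementation.

```python
def merge(intervals):
    """Sort half-open (start, end) intervals and merge overlapping or touching ones."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def union(*results):
    """Positions reported by at least one tool."""
    return merge([iv for result in results for iv in result])


def intersection(a, b):
    """Positions reported by both tools (two-pointer sweep over merged lists)."""
    a, b = merge(a), merge(b)
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


# Hypothetical outputs of two tools for the same query sequence.
tool_a = [(5, 20), (40, 55)]
tool_b = [(15, 30), (50, 60)]
print(union(tool_a, tool_b))         # [(5, 30), (40, 60)]
print(intersection(tool_a, tool_b))  # [(15, 20), (50, 55)]
```

The union is the more permissive choice, while the intersection keeps only the positions on which the tools agree.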

A neural network web server named LCR-hound has been developed to predict the function of prokaryotic and eukaryotic LCRs based on their amino acid or di-amino acid (bigram) content.[20]
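The amino acid and bigram content mentioned above can be computed as relative frequencies of single residues and of overlapping residue pairs. The following sketch shows only that feature computation; the function name is hypothetical, and how LCR-hound actually encodes its inputs or structures its network is not described here.

```python
from collections import Counter


def composition_features(region: str):
    """Return (amino-acid frequencies, bigram frequencies) for one LCR.

    Bigrams are overlapping pairs of adjacent residues.
    """
    n = len(region)
    aa_freq = {aa: count / n for aa, count in Counter(region).items()}
    pairs = [region[i:i + 2] for i in range(n - 1)]
    bigram_freq = {bg: count / len(pairs) for bg, count in Counter(pairs).items()}
    return aa_freq, bigram_freq


aa_freq, bigram_freq = composition_features("QQQAQ")
print(aa_freq)      # {'Q': 0.8, 'A': 0.2}
print(bigram_freq)  # {'QQ': 0.5, 'QA': 0.25, 'AQ': 0.25}
```

Frequency vectors of this kind have a fixed length (20 values for amino acids, 400 for bigrams over the standard alphabet) regardless of the length of the region, which makes them convenient inputs for a classifier.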

References
