List of homing endonuclease cutting sites
| Legend of nucleobases | |
|---|---|
| Code | Nucleotide represented |
| A | Adenine (A) |
| C | Cytosine (C) |
| G | Guanine (G) |
| T | Thymine (T) |
| N | A, C, G or T |
| M | A or C |
| R | A or G |
| W | A or T |
| Y | C or T |
| S | C or G |
| K | G or T |
| H | A, C or T |
| B | C, G or T |
| V | A, C or G |
| D | A, G or T |
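The legend above can be turned into a small lookup table for checking whether a concrete DNA sequence fits a degenerate recognition sequence. The following is a minimal sketch, not part of the original list: the names `IUPAC` and `matches` are hypothetical, and the concrete site in the example is simply one of the many sequences that the PI-TliI pattern allows.

```python
# Ambiguity codes from the legend, mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT", "M": "AC", "R": "AG", "W": "AT", "Y": "CT",
    "S": "CG", "K": "GT", "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT",
}


def matches(pattern: str, sequence: str) -> bool:
    """Return True if a concrete A/C/G/T sequence fits a degenerate pattern."""
    if len(pattern) != len(sequence):
        return False
    return all(
        base in IUPAC[code]
        for code, base in zip(pattern.upper(), sequence.upper())
    )


# 5' strand of the PI-TliI recognition sequence, as listed in the table.
pattern = "TAYGCNGAYACNGACGGYTTYT"
print(matches(pattern, "TACGCAGATACTGACGGCTTCT"))  # True
print(matches("TAY", "TAG"))                       # False: Y allows only C or T
```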
Homing endonucleases are a special type of restriction enzyme encoded by introns or inteins. They act on the DNA of the cell that synthesizes them; more precisely, on the opposite allele of the gene that encodes them.[1]
The list includes some of the most studied examples. The following details are given for each enzyme:
- Enzyme: Accepted name of the molecule, according to the internationally adopted nomenclature. Bibliographical references. (Further reading: Homing endonuclease § Nomenclature.)
- SF (structural family): Any of the established families for this kind of protein, based on their shared structural motifs: H1: LAGLIDADG family – H2: GIY-YIG family – H3: H-N-H family – H4: His-Cys box family – H5: PD-(D/E)xK – H6: EDxHD. (Further reading: Homing endonuclease § Structural families.)
- PDB code: Code used to identify the structure of a protein in the PDB database. If no structure is available, a UniProt identifier is given instead.
- Source: Organism that naturally produces the enzyme.
- D: Biological domain of the source: A: archaea – B: bacteria – E: eukarya.
- SCL: Subcellular genome: chloro: chloroplast – chrm: chromosomal – mito: mitochondrial – plasmid: other extrachromosomal – phage: bacteriophage.
- Recognition sequence: Sequence of DNA recognized by the enzyme. The enzyme binds specifically to this sequence.
- Cut: Cutting site and products of the cut. The cutting site usually falls within the recognition sequence, but sometimes it lies dozens of nucleotides away from it.
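The "Cut" column writes each strand as two fragments separated by a space, so the overhang left by a double-strand cut can be read off from the lengths of the left-hand fragments. The sketch below assumes the plain-text layout `5' ---LEFT RIGHT--- 3' 3' ---LEFT RIGHT--- 5'` used in this list; `parse_cut` and `overhang` are hypothetical helper names, and nicking or unknown entries are deliberately rejected.

```python
import re

# One strand of a "Cut" cell, e.g. "5' ---LEFT RIGHT--- 3'" (assumed layout).
STRAND = re.compile(r"[35]' ---([A-Z]+) ([A-Z]+)--- [35]'")


def parse_cut(cell: str):
    """Return the lengths of the left fragments of the top and bottom strands."""
    strands = STRAND.findall(cell)
    if len(strands) != 2:
        raise ValueError("expected a double-strand cut with two strands")
    (top_left, _), (bottom_left, _) = strands
    return len(top_left), len(bottom_left)


def overhang(cell: str):
    """Return (length, kind) of the overhang: "3'", "5'" or "blunt"."""
    top, bottom = parse_cut(cell)
    diff = top - bottom
    if diff > 0:
        return diff, "3'"   # top strand is cut further to the right
    if diff < 0:
        return -diff, "5'"  # bottom strand is cut further to the right
    return 0, "blunt"


# I-SceI entry, copied from the "Cut" column of the table.
i_scei = ("5' ---AGTTACGCTAGGGATAA CAGGGTAATATAG--- 3' "
          "3' ---TCAATGCGATCCC TATTGTCCCATTATATC--- 5'")
print(overhang(i_scei))  # (4, "3'")
```

The comparison relies on both strands being written left to right over the same positions, top strand 5'→3' and bottom strand 3'→5', as they are in the table.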
| Enzyme | SF | PDB code | Source | D | SCL | Recognition sequence | Cut |
|---|---|---|---|---|---|---|---|
| I-AniI[2] | H1 | 1P8K | Aspergillus nidulans | E | mito | 5' TTGAGGAGGTTTCTCTGTAAATAA 3' AACTCCTCCAAAGAGACATTTATT | 5' ---TTGAGGAGGTTTC TCTGTAAATAA--- 3' 3' ---AACTCCTCC AAAGAGACATTTATT--- 5' |
| I-CeuI[3][4][5][6] | H1 | 2EX5 | Chlamydomonas eugametos | E | chloro | 5' TAACTATAACGGTCCTAAGGTAGCGA 3' ATTGATATTGCCAGGATTCCATCGCT | 5' ---TAACTATAACGGTCCTAA GGTAGCGA--- 3' 3' ---ATTGATATTGCCAG GATTCCATCGCT--- 5' |
| I-ChuI[7][8] | H1 | Q32001 | Chlamydomonas humicola | E | chloro | 5' GAAGGTTTGGCACCTCGATGTCGGCTCATC 3' CTTCCAAACCGTGGAGCTACAGCCGAGTAG | 5' ---GAAGGTTTGGCACCTCG ATGTCGGCTCATC--- 3' 3' ---CTTCCAAACCGTG GAGCTACAGCCGAGTAG--- 5' |
| I-CpaI[8][9] | H1 | Q39562 | Chlamydomonas pallidostigmata | E | chloro | 5' CGATCCTAAGGTAGCGAAATTCA 3' GCTAGGATTCCATCGCTTTAAGT | 5' ---CGATCCTAAGGTAGCGAA ATTCA--- 3' 3' ---GCTAGGATTCCATC GCTTTAAGT--- 5' |
| I-CpaII[10] | H1 | Q39559 | Chlamydomonas pallidostigmata | E | chloro | 5' CCCGGCTAACTCTGTGCCAG 3' GGGCCGATTGAGACACGGTC | 5' ---CCCGGCTAACTC TGTGCCAG--- 3' 3' ---GGGCCGAT TGAGACACGGTC--- 5' |
| I-CreI[11] | H1 | 1BP7 | Chlamydomonas reinhardtii | E | chloro | 5' CTGGGTTCAAAACGTCGTGAGACAGTTTGG 3' GACCCAAGTTTTGCAGCACTCTGTCAAACC | 5' ---CTGGGTTCAAAACGTCGTGA GACAGTTTGG--- 3' 3' ---GACCCAAGTTTTGCAG CACTCTGTCAAACC--- 5' |
| I-DmoI | H1 | 1B24 | Desulfurococcus mobilis | A | chrm | 5' ATGCCTTGCCGGGTAAGTTCCGGCGCGCAT 3' TACGGAACGGCCCATTCAAGGCCGCGCGTA | 5' ---ATGCCTTGCCGGGTAA GTTCCGGCGCGCAT--- 3' 3' ---TACGGAACGGCC CATTCAAGGCCGCGCGTA--- 5' |
| H-DreI[12] | H1 | 1MOW | Hybrid: I-DmoI and I-CreI | AE | | 5' CAAAACGTCGTAAGTTCCGGCGCG 3' GTTTTGCAGCATTCAAGGCCGCGC | 5' ---CAAAACGTCGTAA GTTCCGGCGCG--- 3' 3' ---GTTTTGCAG CATTCAAGGCCGCGC--- 5' |
| I-HmuI[13][14] | H3 | 1U3E | Bacillus subtilis phage SP01 | B | phage | 5' AGTAATGAGCCTAACGCTCAGCAA 3' TCATTACTCGGATTGCGAGTCGTT | Nicking endonuclease: * 3' ---TCATTACTCGGATTGC GAGTCGTT--- 5' |
| I-HmuII[14][15] | H3 | Q38137 | Bacillus subtilis phage SP82 | B | phage | 5' AGTAATGAGCCTAACGCTCAACAA 3' TCATTACTCGGATTGCGAGTTGTT | Nicking endonuclease: * 3' ---TCATTACTCGGATTGCGAGTTGTTN35 NNNN--- 5' |
| I-LlaI[16][17] | H3 | P0A3U1 | Lactococcus lactis | B | chrm | 5' CACATCCATAACCATATCATTTTT 3' GTGTAGGTATTGGTATAGTAAAAA | 5' ---CACATCCATAA CCATATCATTTTT--- 3' 3' ---GTGTAGGTATTGGTATAGTAA AAA--- 5' |
| I-MsoI | H1 | 1M5X | Monomastix sp. | E | | 5' CTGGGTTCAAAACGTCGTGAGACAGTTTGG 3' GACCCAAGTTTTGCAGCACTCTGTCAAACC | 5' ---CTGGGTTCAAAACGTCGTGA GACAGTTTGG--- 3' 3' ---GACCCAAGTTTTGCAG CACTCTGTCAAACC--- 5' |
| PI-PfuI | H1 | 1DQ3 | Pyrococcus furiosus Vc1 | A | | 5' GAAGATGGGAGGAGGGACCGGACTCAACTT 3' CTTCTACCCTCCTCCCTGGCCTGAGTTGAA | 5' ---GAAGATGGGAGGAGGG ACCGGACTCAACTT--- 3' 3' ---CTTCTACCCTCC TCCCTGGCCTGAGTTGAA--- 5' |
| PI-PkoII | H1 | 2CW7 | Pyrococcus kodakarensis BAA-918 | A | | 5' CAGTACTACGGTTAC 3' GTCATGATGCCAATG | 5' ---CAGTACTACG GTTAC--- 3' 3' ---GTCATG ATGCCAATG--- 5' |
| I-PorI[18][19] | H3 | | Pyrobaculum organotrophum | A | chrm | 5' GCGAGCCCGTAAGGGTGTGTACGGG 3' CGCTCGGGCATTCCCACACATGCCC | 5' ---GCGAGCCCGTAAGGGT GTGTACGGG--- 3' 3' ---CGCTCGGGCATT CCCACACATGCCC--- 5' |
| I-PpoI | H4 | 1EVX | Physarum polycephalum | E | plasmid | 5' TAACTATGACTCTCTTAAGGTAGCCAAAT 3' ATTGATACTGAGAGAATTCCATCGGTTTA | 5' ---TAACTATGACTCTCTTAA GGTAGCCAAAT--- 3' 3' ---ATTGATACTGAGAG AATTCCATCGGTTTA--- 5' |
| PI-PspI | H1 | Q51334 | Pyrococcus sp. | A | chrm | 5' TGGCAAACAGCTATTATGGGTATTATGGGT 3' ACCGTTTGTCGATAATACCCATAATACCCA | 5' ---TGGCAAACAGCTATTAT GGGTATTATGGGT--- 3' 3' ---ACCGTTTGTCGAT AATACCCATAATACCCA--- 5' |
| I-ScaI[20][21] | H1 | P03873 | Saccharomyces capensis | E | mito | 5' TGTCACATTGAGGTGCACTAGTTATTAC 3' ACAGTGTAACTCCACGTGATCAATAATG | 5' ---TGTCACATTGAGGTGCACT AGTTATTAC--- 3' 3' ---ACAGTGTAACTCCAC GTGATCAATAATG--- 5' |
| I-SceI[4][5] | H1 | 1R7M | Saccharomyces cerevisiae | E | mito | 5' AGTTACGCTAGGGATAACAGGGTAATATAG 3' TCAATGCGATCCCTATTGTCCCATTATATC | 5' ---AGTTACGCTAGGGATAA CAGGGTAATATAG--- 3' 3' ---TCAATGCGATCCC TATTGTCCCATTATATC--- 5' |
| PI-SceI[22][23] | H1 | 1VDE | Saccharomyces cerevisiae | E | | 5' ATCTATGTCGGGTGCGGAGAAAGAGGTAATGAAATGGCA 3' TAGATACAGCCCACGCCTCTTTCTCCATTACTTTACCGT | 5' ---ATCTATGTCGGGTGC GGAGAAAGAGGTAATGAAATGGCA--- 3' 3' ---TAGATACAGCC CACGCCTCTTTCTCCATTACTTTACCGT--- 5' |
| I-SceII[24][25][26] | H1 | | Saccharomyces cerevisiae | E | mito | 5' TTTTGATTCTTTGGTCACCCTGAAGTATA 3' AAAACTAAGAAACCAGTGGGACTTCATAT | 5' ---TTTTGATTCTTTGGTCACCC TGAAGTATA--- 3' 3' ---AAAACTAAGAAACCAG TGGGACTTCATAT--- 5' |
| I-SceIII[24][27][28] | H1 | | Saccharomyces cerevisiae | E | mito | 5' ATTGGAGGTTTTGGTAACTATTTATTACC 3' TAACCTCCAAAACCATTGATAAATAATGG | 5' ---ATTGGAGGTTTTGGTAAC TATTTATTACC--- 3' 3' ---TAACCTCCAAAACC ATTGATAAATAATGG--- 5' |
| I-SceIV[24][29][30] | H1 | | Saccharomyces cerevisiae | E | mito | 5' TCTTTTCTCTTGATTAGCCCTAATCTACG 3' AGAAAAGAGAACTAATCGGGATTAGATGC | 5' ---TCTTTTCTCTTGATTA GCCCTAATCTACG--- 3' 3' ---AGAAAAGAGAAC TAATCGGGATTAGATGC--- 5' |
| I-SceV[24][31] | H3 | | Saccharomyces cerevisiae | E | mito | 5' AATAATTTTCTTCTTAGTAATGCC 3' TTATTAAAAGAAGAATCATTACGG | 5' ---AATAATTTTCT TCTTAGTAATGCC--- 3' 3' ---TTATTAAAAGAAGAATCATTA CGG--- 5' |
| I-SceVI[24][32] | H3 | | Saccharomyces cerevisiae | E | mito | 5' GTTATTTAATGTTTTAGTAGTTGG 3' CAATAAATTACAAAATCATCAACC | 5' ---GTTATTTAATG TTTTAGTAGTTGG--- 3' 3' ---CAATAAATTACAAAATCATCA ACC--- 5' |
| I-SceVII[20] | H1 | | Saccharomyces cerevisiae | E | mito | 5' TGTCACATTGAGGTGCACTAGTTATTAC 3' ACAGTGTAACTCCACGTGATCAATAATG | Unknown ** |
| I-Ssp6803I | H5 | 2OST | Synechocystis sp. PCC 6803 | B | | 5' GTCGGGCTCATAACCCGAA 3' CAGCCCGAGTATTGGGCTT | 5' ---GTCGGGCT CATAACCCGAA--- 3' 3' ---CAGCCCGAGTA TTGGGCTT--- 5' |
| I-TevI[33][34][35] | H2 | 1I3J | Escherichia coli phage T4 | B | phage | 5' AGTGGTATCAACGCTCAGTAGATG 3' TCACCATAGTTGCGAGTCATCTAC | 5' ---AGTGGTATCAAC GCTCAGTAGATG--- 3' 3' ---TCACCATAGT TGCGAGTCATCTAC--- 5' |
| I-TevII[33][36] | H2 | | Escherichia coli phage T4 | B | phage | 5' GCTTATGAGTATGAAGTGAACACGTTATTC 3' CGAATACTCATACTTCACTTGTGCAATAAG | 5' ---GCTTATGAGTATGAAGTGAACACGT TATTC--- 3' 3' ---CGAATACTCATACTTCACTTGTG CAATAAG--- 5' |
| I-TevIII[37] | H3 | | Escherichia coli phage RB3 | B | phage | 5' TATGTATCTTTTGCGTGTACCTTTAACTTC 3' ATACATAGAAAACGCACATGGAAATTGAAG | 5' ---T ATGTATCTTTTGCGTGTACCTTTAACTTC--- 3' 3' ---AT ACATAGAAAACGCACATGGAAATTGAAG--- 5' |
| PI-TliI[38][39] | H1 | | Thermococcus litoralis | A | chrm | 5' TAYGCNGAYACNGACGGYTTYT 3' ATRCGNCTRTGNCTGCCTAARA | 5' ---TAYGCNGAYACNGACGG YTTYT--- 3' 3' ---ATRCGNCTRTGNC TGCCTAARA--- 5' |
| PI-TliII[22][39][40] | H1 | | Thermococcus litoralis | A | chrm | 5' AAATTGCTTGCAAACAGCTATTACGGCTAT 3' TTTAACGAACGTTTGTCGATAATGCCGATA | Unknown ** |
| I-Tsp061I | H1 | 2DCH | Thermoproteus sp. IC-061 | A | | 5' CTTCAGTATGCCCCGAAAC 3' GAAGTCATACGGGGCTTTG | 5' ---CTTCAGTAT GCCCCGAAAC--- 3' 3' ---GAAGT CATACGGGGCTTTG--- 5' |
| I-Vdi141I | H1 | 3E54 | Vulcanisaeta distributa IC-141 | A | | 5' CCTGACTCTCTTAAGGTAGCCAAA 3' GGACTGAGAGAATTCCATCGGTTT | 5' ---CCTGACTCTCTTAA GGTAGCCAAA--- 3' 3' ---GGACTGAG AGAATTCCATCGGTTT--- 5' |
*: Nicking endonuclease: These enzymes cut only one DNA strand, leaving the other strand untouched.
**: Unknown cutting site: Researchers have not been able to determine the exact cutting site of these enzymes yet.
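Because the recognition-sequence column lists the 5' strand followed by its partner written 3'→5', each pair can be sanity-checked with a plain (not reversed) base-by-base complement. The sketch below is one way to do that, including the ambiguity codes from the legend; `COMPLEMENT`, `complement` and `strands_pair` are hypothetical names introduced here for illustration.

```python
# Complement of each nucleotide code, including the ambiguity codes
# (for example, Y = C or T pairs with R = G or A).
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
    "S": "S", "W": "W", "H": "D", "D": "H",
    "B": "V", "V": "B", "N": "N",
}


def complement(strand: str) -> str:
    """Complement a strand base by base, without reversing it."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def strands_pair(top: str, bottom: str) -> bool:
    """Check that a 3'->5' bottom strand pairs with a 5'->3' top strand."""
    return complement(top) == bottom.upper()


# I-AniI recognition sequence, both strands as listed in the table.
print(strands_pair("TTGAGGAGGTTTCTCTGTAAATAA",
                   "AACTCCTCCAAAGAGACATTTATT"))  # True
```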