Talk:Gene
This is the talk page for discussing improvements to the Gene article. This is not a forum for general discussion of the subject of the article.
Gene is a former featured article. Please see the links under Article milestones below for its original nomination page (for older articles, check the nomination archive) and why it was removed.
Gene has been listed as one of the Natural sciences good articles under the good article criteria. If you can improve it further, please do so. If it no longer meets these criteria, you can reassess it.
This article appeared on Wikipedia's Main Page as Today's featured article on October 5, 2004.
Suggesting 2015 GA Review
Transcluded from Talk:Gene/Review
To WP:MCB, WP:GEN, WP:BIOL and WP:EB
The gene article gets 50,000 views per month but has been de-listed as a featured article since 2006. Given the success of the recent blitz on the enzyme article, I thought I'd suggest spending a couple of weeks seeing if we can get it up to a higher standard. I'm going to start with updating some of the images. If you'd like to help out on the article, it'd be great to see you there. T.Shafee(Evo﹠Evo)talk 09:49, 31 March 2015 (UTC)
- It appears the main reason gene was delisted as a GA was sourcing (see Talk:Gene/GA1). The following free textbook is probably sufficient to document most basic facts about genes:
- Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). Molecular Biology of the Cell (Fourth ed.). New York: Garland Science. ISBN 978-0-8153-3218-3.
- A second one is even more relevant, but unfortunately not freely accessible:
- Watson JD, Baker TA, Bell SP, Gann A, Levine M, Losick R (2013). Molecular Biology of the Gene (7th ed.). Benjamin Cummings. ISBN 978-0-321-90537-6.
- I will start working on this as I find time. Boghog (talk) 17:58, 31 March 2015 (UTC)
- Thanks for the prompt on this! I see I did do some work here back in the day, but not enough. Looks like a typical large-but-untended wiki article - bloated up with random factoids with no attention to the flow of the article. I'm pretty busy for this week and out of town next week, but I'll try to give it some attention. Opabinia regalis (talk) 19:19, 31 March 2015 (UTC)
- I'll probably go through and make all the necessary MOS tweaks for FA status to the article within the next week. Too preoccupied with other articles at the moment to make any substantive content/reference changes though. Seppi333 (Insert 2¢ | Maintained) 03:24, 1 April 2015 (UTC)
Glossary
Snooping around, I encountered Template:Genetics glossary. I don't know its backstory, but it is a rather clever idea for a template in my opinion. I partially reckon it might go well under the first image, in place of the second image depicting DNA, which conceptually is a tangent. I am not sure, hence my asking. --Squidonius (talk) 21:47, 1 April 2015 (UTC)
- Including a glossary could be useful, but I think it should be concise and tailored specifically for this article. Currently {{Genetics glossary}} contains 22 entries and some of the definitions are quite lengthy. A shorter glossary, closer to the size of {{Transcription factor glossary}} or {{Restriction enzyme glossary}}, IMHO would be more effective. Another option is to transclude the {{Genetics sidebar}} which in turn links to {{Genetics glossary}}. Boghog (talk) 06:38, 2 April 2015 (UTC)
- ...could also just transclude a collapsed version - provides the full set of terms and takes up little space. If people need a glossary, they can expand it. Glossaries probably shouldn't be expanded by default unless there's a lot of free space along the right side of the page between level 2 sections (i.e., horizontal line breaks), since images and tables should take precedence. Seppi333 (Insert 2¢ | Maintained) 07:25, 2 April 2015 (UTC)
- Collapsed or not collapsed, {{Genetics glossary}} is still way too long. Glossaries should be restricted to key terms with short definitions that can quickly be scanned while reading the rest of the article. IMHO, a long glossary defeats its purpose. Furthermore, an uncollapsed glossary is more likely to be read and, if it is kept short, there is no need to collapse it. Boghog (talk) 08:30, 2 April 2015 (UTC)
- Fair enough. Might as well make a new one since it's not referenced anyway; imo, glossaries should cite sources, preferably another glossary, because it's article content. Seppi333 (Insert 2¢ | Maintained) 08:39, 2 April 2015 (UTC)
- Hmm, apparently I added a bunch of stuff to that template awhile back, but don't remember it at all. It appears to be a subset of the article genetics glossary. (I'm not really sure we need both.) I agree that the template is way too long, and as constructed is hard to ctrl-F for a term.
- I suggest just linking to the MBC glossary as a "reference". I would consider this kind of thing as a summary analogous to the lead paragraphs; no need for a clutter of little blue numbers. Opabinia regalis (talk) 21:47, 2 April 2015 (UTC)
References
I'm planning on adding some more Molecular Biology of the Cell references to the article using {{rp}} to specify chapter sections. I went to the MBOC 4th ed. online page but I can find no way of searching by page number, chapter, section or anything else. Any ideas on how to specify specific sections as is possible for Biochemistry 5th ed. online? Alternatively, maybe there's a more easily referenced online textbook for general citations. T.Shafee(Evo﹠Evo)talk 11:30, 20 April 2015 (UTC)
- I had the same train of thought here on the regular talk page. How about something like this? Uses {{sfn}} to include links to individual sections as notes. Of course, now they're separate from the rest of the references, but maybe it's not a bad idea to distinguish 'basic stuff you can find in a textbook' from 'specific results you need to consult the literature for'. Opabinia regalis (talk) 06:09, 21 April 2015 (UTC)
- You're right, I missed that. I agree that it's actually a good way to format it. Having a separate list that indicates the significance of the references is useful. T.Shafee(Evo﹠Evo)talk 08:06, 21 April 2015 (UTC)
- I am not a big fan of {{sfn}} templates. They are more complicated and harder to maintain. Plus they don't directly address the problem of searching Molecular Biology of the Cell. What seems to work is to search for the chapter or subchapter titles in quotes. For example, searching for "DNA and Chromosomes" provides a link to the introduction of chapter 4. Then one can reference the chapter or subchapter number with {{rp}}. I am busy this week but should have more time this weekend to work on this. Boghog (talk) 12:21, 21 April 2015 (UTC)
- I mis-described my own suggestion; it's actually {{efn}} (not that that's better). I like your method better from an aesthetic and maintenance point of view, but the problem is that giving a reader a reference to "chapter 4" is less useful if there's no obvious way to get to chapter 4 from the book's table of contents page. I don't see a way to provide separate links for each chapter/section without splitting up the references in the reference list. We could use {{rp}} like this, but I think the links police won't like that. Opabinia regalis (talk) 18:03, 21 April 2015 (UTC)
- OK, I now see what you mean. The choice is between {{efn}} and in-line external links and {{efn}} is the lesser of two evils. One other possibility is to append the chapter external links to the citation:
- Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). Molecular Biology of the Cell (Fourth ed.). New York: Garland Science. ISBN 978-0-8153-3218-3. Chapter 4: DNA and Chromosomes; Chapter 7: Control of Gene Expression; Chapter 7.1: An Overview of Gene Control; Chapter 7.2: DNA-Binding Motifs in Gene Regulatory Proteins; Chapter 7.3: How Genetic Switches Work
- or have separate citations for each chapter where only the |chapter= and |chapterurl= parameters differ:
- Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). "Chapter 7: Control of Gene Expression". Molecular Biology of the Cell (Fourth ed.). New York: Garland Science. ISBN 978-0-8153-3218-3.
- Boghog (talk) 18:47, 21 April 2015 (UTC)
- My first reaction to your 'appended links' idea was that we shouldn't create our own linked pseudo-TOC given the publisher's apparent desire not to have a linked TOC hosted by the organization they actually licensed the content to. But all the other ideas do essentially the same thing, so that's a bit silly. I think I like that idea in combination with {{rp}} chapter labels best, as it's least intrusive in the text, makes clear how many citations go to a general reference, and doesn't require a separate list or potentially fragile formatting. Opabinia regalis (talk) 20:49, 21 April 2015 (UTC)
I've not done much non-standard reference citation so I'll wait until you've done a couple so that I can see the format in context before doing any more. The ones I added yesterday shouldn't be too difficult to reformat. T.Shafee(Evo﹠Evo)talk 12:24, 22 April 2015 (UTC)
- You're the one currently doing the work, so I think that means you get to decide :) Opabinia regalis (talk) 19:01, 22 April 2015 (UTC)
MBOC references
Article
Genes[1]: 2 are numerous[1]: 4 and useful[1]: 4.1
References
- Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). Molecular Biology of the Cell (Fourth ed.). New York: Garland Science. ISBN 978-0-8153-3218-3.
So {{rp}} labels the chapter number but does not provide any easy link to the actual information. Therefore it's combined with a list of chapter links. The benefit is that the {{rp}} template is relatively easy to maintain, and the list of chapter links doesn't require maintenance and places all the MBOC links together. As stated above, there's basically no way to avoid linking individually to chapters if we want to cite MBOC. I'll finish building the chapter list over the next couple of days. T.Shafee(Evo﹠Evo)talk 01:29, 27 April 2015 (UTC)
- I've finished adding MBOC references up to section 3 (gene expression). Also, whoever originally wrote the gene expression section of the article really liked semicolons! T.Shafee(Evo﹠Evo)talk 10:51, 27 April 2015 (UTC)
- Looks great, I like the collapsible box! I can't find it at the moment, though - IIRC there is somewhere an agreement not to use collapsed boxes for references for accessibility reasons. I don't see it in WP:ACCESSIBILITY so I could be misremembering, and since the box contains links and not the reference note itself, it's probably fine. Just wanted to mention it in case someone recognized the issue. Opabinia regalis (talk) 07:50, 28 April 2015 (UTC)
- @Opabinia regalis and Evolution and evolvability: The guideline is MOS:COLLAPSE, which states "...boxes that toggle text display between hide and show, should not conceal article content, including reference lists ... When scrolling lists or collapsible content are used, take care that the content will still be accessible on devices that do not support JavaScript or CSS." I checked this article on my phone, a mid-2011 model, and that entire box just doesn't appear at all using the default mobile view. I tried setting the template parameter expand=true so the box is expanded by default but that made no difference. Maybe better to change to a bulleted or indented list? Adrian J. Hunter(talk•contribs) 10:50, 30 June 2015 (UTC)
- @Adrian J. Hunter: Well spotted - It's really irritating when templates don't work properly on mobiles! I've changed the MBOC list to be wrapped in {{Hidden begin}}+{{Hidden end}}, which renders properly on phones (default expanded). T.Shafee(Evo﹠Evo)talk 12:31, 30 June 2015 (UTC)
- Yep, that works – thanks! Adrian J. Hunter(talk•contribs) 13:23, 30 June 2015 (UTC)
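For anyone reformatting citations later, the {{rp}} approach discussed in this section could be written roughly as in the sketch below. It assumes a named reference (the name "MBOC" is purely illustrative) defined once with {{cite book}} and then reused, with {{rp}} carrying the chapter or section number after each use. It is only a minimal illustration of the pattern, not necessarily the exact markup used in the article, and it leaves out the separate list of chapter links.

```wikitext
<!-- First use defines the named reference; "MBOC" is an illustrative name, not taken from the article -->
Genes<ref name="MBOC">{{cite book |vauthors=Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P |year=2002 |title=Molecular Biology of the Cell |edition=Fourth |location=New York |publisher=Garland Science |isbn=978-0-8153-3218-3}}</ref>{{rp|2}}
<!-- Later uses reuse the same reference; {{rp}} supplies the chapter or section label -->
are numerous<ref name="MBOC" />{{rp|4}} and useful<ref name="MBOC" />{{rp|4.1}}

== References ==
{{reflist}}
```

Because every use points at the same named reference, the reference list shows a single MBOC entry, and the superscript after each footnote marker indicates which chapter or section supports that particular statement.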
Mendelian gene and molecular gene
The introduction does not do a good job of distinguishing between the Mendelian gene and the molecular gene. For example, different traits (phenotypes) may be due to alleles within a molecular gene or in regulatory sequences that lie outside of the gene. They may even be due to mutations in other functional regions such as centromeres, telomeres, or origins of replication.
It also tends to be biased toward complex diploid organisms. Bacteria also have genes.
In genetics, any allele that causes an observable trait is often called a "gene" - this is the way Richard Dawkins uses the word "gene" and so do many biologists. The molecular gene is very different.
We have the same problem when describing an allele. The Wikipedia article clearly states that alleles are variants at any locus in the genome, not just in molecular genes or Mendelian genes. Classical genetics restricted the word "allele" to the Mendelian gene because they didn't know about neutral alleles with no observable effect and they also didn't know whether two variants were located within a molecular gene or in a regulatory sequence.
Unless someone objects, I'm going to try and clean up the introduction so that it agrees with the correct definition of allele and with the main body of the article. Genome42 (talk) 17:39, 6 December 2025 (UTC)
OED
I added a link to the Oxford English Dictionary definition of gene. You will find none finer: "The basic unit of heredity in living organisms, originally recognized as a discrete physical factor associated with the inheritance of a particular morphological or physiological trait, and later shown to be located at a specific site on a chromosome and to consist of a sequence of DNA". It's similar (if more elegant) to the definition on Wikipedia: "The Mendelian gene is a basic unit of heredity. The molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and non-coding genes." True, introns are not included, but even if the definition is incomplete, it is not incorrect.
The OED is invaluable for many reasons. The page for gene is a case in point. We get examples of the word in literature, from 1909 (American Naturalist: "The difference between the two kinds of homozygotes with respect to any unit-character is, that one..has one pair of allelomorphs or ‘genes’ in addition to those possessed by the other kind of homozygote.") to 1911 (Wilhelm Johannsen: "I have proposed the terms ‘gene’ and ‘genotype’..to be used in the science of genetics. The ‘gene’ is nothing but a very applicable little word, easily combined with others, and hence it may be useful as an expression for the ‘unit-factors’, ‘elements’ or ‘allelomorphs’ in the gametes, demonstrated by modern Mendelian researches.") to 1919 (T. H. Morgan: "We should expect that the genes in different chromosome pairs will ‘assort’ independently, and this, in fact, is what Mendel's second law postulates.") to 2003 (Ruth Ozeki: "Biotechnology... Fish genes spliced into tomatoes. Bacterial DNA into potatoes.") A history of genetics in its own words.
The OED is the gold standard of dictionaries. It is not a scientific dictionary, but you can be sure they consulted with scientists in editing it. It has earned a place on this page. Charlie Faust (talk) 22:07, 8 December 2025 (UTC)
- Dictionaries are not reliable sources of information on scientific definitions. I can see that the first OED definition of gene is better than the definitions in other English dictionaries but that doesn't mean we should start incorporating dictionary definitions into science articles.
- I also don't agree that the OED deserves an exemption because it is "the gold standard of [English?] dictionaries."
- Let's break down the OED definition. Here's what it says ...
- "[A gene is] the basic unit of heredity in living organisms, originally recognized as a discrete physical factor associated with the inheritance of a particular morphological or physiological trait ..."
- This is an acceptable description of what old geneticists may have thought before the word "gene" entered the scientific literature and it corresponds roughly to the modern version of the Mendelian gene that is still in widespread use today (e.g. Dawkins).
- The OED goes on to say ....
- "[The gene was] later shown to be located at a specific site on a chromosome and to consist of a sequence of DNA (or RNA in certain viruses) containing a code for a protein or RNA molecule, together with any associated sequences necessary for transcription and translation."
- Saying that the Mendelian gene is now known as a specific DNA sequence is a correct description of what scientists now think of when referring to the Mendelian gene. That's why Dawkins calls it the selfish 'gene.'
- However, the OED definition implies that the Mendelian gene has now been replaced by the molecular gene and this is not correct. There are still two incompatible definitions of 'gene' in the scientific literature as discussed in reference #1.
- The OED definition doesn't help us distinguish between the modern molecular gene and other heritable sequences such as regulatory sequences. For that we need a more precise scientific definition such as "a gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA." We then go on to explain that this definition does not include regulatory sequences but it does include introns. The OED definition actually implies that regulatory sequences are part of the modern molecular definition of a gene and this is incorrect.
- The OED definition then goes on to mention regulatory genes ("Regulatory genes control the expression of other genes.") This is confusing because there's no reason to highlight those genes coding for transcription factors or regulatory RNAs except to sow confusion between those genes and regulatory sequences, which aren't genes.
- The OED then lists a number of historical examples using the word 'gene.' Many of them use 'gene' as a synonym for allele, just as Dawkins does. Some of them imply incorrectly that genes are the only heritable material. One of them makes an incorrect statement about the similarity of human and ape genes.
- The OED then goes on to define gene in two other ways, which is appropriate for an English dictionary but not a scientific one.
Chargaff
I added that Watson and Crick made use of Chargaff's rules. From their 1953 paper: "it is found that only specific pairs of bases can bond together. These pairs are: adenine (purine) with thymine (pyrimidine), and guanine (purine) with cytosine (pyrimidine). In other words, if an adenine forms one member of a pair, on either chain, then on these assumptions the other member must be thymine; similarly for guanine and cytosine. ... It has been found experimentally that the ratio of the amounts of adenine to thymine, and the ratio of guanine to cytosine, are always very close to unity for deoxyribonucleic acid." They include a citation to Chargaff.
This leads to one of the great lines in science: "It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material."[1]
In their 1953 paper, they make explicit reference to Chargaff's rules and cite Chargaff. Per Nature: "Chargaff's realization that A = T and C = G, combined with some crucially important X-ray crystallography work by English researchers Rosalind Franklin and Maurice Wilkins, contributed to Watson and Crick's derivation of the three-dimensional, double-helical model for the structure of DNA."[2] Charlie Faust (talk) 18:45, 9 December 2025 (UTC)
- The history has been covered in many books and articles. Watson discovered the complementary base pairs using cardboard cutouts of the various bases and the presumed hydrogen bonds that could form. After he had come up with a model of the base pairs he realized that they conformed to part of Chargaff's data.
- Here's how it's described in Judson's book on page 148.
- "[Watson] told me 'Francis and I had a rule that we wanted to use as little data, few assumptions, as possible to solve the structure, and we never knew whether Chargaff's rules had some complementary extraneous functional reason, and so we didn't put that in. The Chargaff ratio just fell out in the end.'"
- And here's how Matthew Cobb describes it on page 109 of "Life's Greatest Secret" ...
- "One of the main facts that in retrospect seems so obvious, but which was not at the time, is the role of what are sometimes called the 'Chargaff rules' - the fact that the amounts of A and T and of G and C are equivalent. These ratios were not known to be so precise at the time and they were certainly not 'rules.' As Jerry Donohue, who shared Watson and Crick's room at the Cavendish laboratory, later recalled in somewhat exaggerated fashion: 'When the final model of DNA was discovered - more or less by accident - it wasn't Chargaff's rules that made the model, but the model made the rules.'"
- Watson and Crick put the reference in the paper as confirmation of their model but they didn't use any of the ratios in building the base pair model in the first place. It's important to note that Chargaff published lots of papers on the various base compositions of DNA and the purine to pyrimidine ratio was just one of the observations he noted. It was only after the DNA model was published that Chargaff announced that he had predicted the A/T and G/C base pairs. You don't see that emphasis in any of his published work.
- Chargaff did an excellent job of promoting his own work and taking partial credit for the discovery of the structure of DNA. It's time to stop the spread of that misinformation.
- But the important point is that this is an article about the meaning of the word 'gene' and not an article about the structure of DNA. The structure of DNA is covered in DNA and some of the history of the discovery is covered in History of molecular biology. Both of those articles contain a significant amount of misinformation and the DNA article is a classic example of a Wikipedia article that has become bloated with a huge amount of irrelevant information. Let's not allow that to happen in this article. Genome42 (talk) 9 December 2025 (UTC)
Here is Watson, in The Double Helix: "The moment was thus appropriate to think seriously about some curious regularities in DNA chemistry first observed at Columbia by the Austrian-born biochemist Erwin Chargaff. Since the war, Chargaff and his students had been painstakingly analyzing various DNA samples for the relative proportions of their purine and pyrimidine bases. In all their DNA preparations the number of adenine (A) molecules was very similar to the number of thymine (T) molecules, while the number of guanine (G) molecules was very close to the number of cytosine (C) molecules. Moreover, the proportion of adenine and thymine groups varied with their biological origin. The DNA of some organisms had an excess of A and T, while in other forms of life there was an excess of G and C."[3] Watson did not always give due credit, but he does to Chargaff. And he and Crick cite Chargaff in their 1953 paper.
Here is The Scientist: "Erwin Chargaff's groundbreaking research, which showed that DNA base pairs had a complementary relationship, laid the foundation for James Watson's and Francis Crick's DNA model."[4]
Here is the American Association for the Advancement of Science: "The work of Erwin Chargaff (1905-2002), an Austrian-born biochemist who emigrated to the United States, was instrumental in the discovery of the double helix. His research led him to propose two rules, which became known as 'Chargaff's Rules.' The first was that the number of guanine units equaled the number of cytosine units, just as the number of adenine units equaled the number of thymine units, but that these two numbers differed. This discovery discredited Levene's tetranucleotide hypothesis and eventually led to the base pairing model. He informed Watson and Crick of his findings in 1952."[5]
Here is Nature: "Chargaff's realization that A = T and C = G, combined with some crucially important X-ray crystallography work by English researchers Rosalind Franklin and Maurice Wilkins, contributed to Watson and Crick's derivation of the three-dimensional, double-helical model for the structure of DNA."[6]
Scientists stand on the shoulders of giants. Chargaff was one. Noting that does not diminish the brilliance of Watson and Crick.
- It's interesting that you should mention giants. Chargaff spent decades trying to convince people that he had discovered base complementarity when, in fact, he did no such thing. It was only after Watson & Crick that he realized the significance of some of the ratios he had published.
- Chargaff was a traditional biochemist and he was very much opposed to the new molecular biology, which he thought was too speculative. Here's what he said about molecular biologists (e.g. Watson and Crick): "That in our day such pygmies throw such giant shadows only shows how late in the day it has become."
- This quotation and a detailed discussion of Chargaff's actual and imaginary contribution can be found in a lengthy article by Horace Judson "Reflections on the Historiography of Molecular Biology".
Here's the Journal of Biological Chemistry: "Chargaff's research also helped lay the groundwork for James Watson and Francis Crick's discovery of the double-helix structure of DNA."
Here's the Genomics Education Program: "[Chargaff] noted that the numbers of adenine and thymine bases were always the same, as were the numbers of cytosine and guanine. This consistent base ratio informed Watson and Crick’s idea of base pairing.
Chargaff also demonstrated that adenine/thymine and cytosine/guanine proportions were different in different species. This suggested a molecular basis for hereditary differences."
Watson, not known for giving credit where it's due, explicitly cites Chargaff: "The moment was thus appropriate to think seriously about some curious regularities in DNA chemistry first observed at Columbia by the Austrian-born biochemist Erwin Chargaff." (From The Double Helix.) More significant, he and Crick cite Chargaff in the 1953 paper.
Whether or not Chargaff understood the significance of his findings, Watson and Crick did.
- It's perfectly acceptable to say that once Watson figured out the base pairs by building models he was pleased to discover that Chargaff's results confirmed his model. Genome42 (talk) 16:23, 17 December 2025 (UTC)
- Watson, James; Crick, Francis (April 1953). "Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid". Nature. 171 (4356): 737–8. Bibcode:1953Natur.171..737W. doi:10.1038/171737a0. PMID 13054692.
- Pray, Leslie A. (2008). "Discovery of DNA Structure and Function: Watson and Crick". Nature.
- "DNA Base Pairs and Erwin Chargaff". The Scientist. April 6, 2003.
- Borowski, Susan (April 25, 2013). "The other discoverers of DNA". American Association for the Advancement of Science.
- Pray, Leslie A. (2008). "Discovery of DNA Double Helix: Watson and Crick". Nature.

