Cosegregation

From Wikipedia, the free encyclopedia

Nuclear profile of genome. (A) Nucleus, (B) nuclear profile, (C) loci (green dots) where parts of target gene found.

Cosegregation, in genealogy, refers to the tendency of two or more genes located close together on the same chromosome to be passed on together during cell division. Because of their physical proximity, these genes are considered genetically linked.[1]

In genetics, the term may also refer to the estimated probability of interaction between multiple loci or specific regions within a target gene. This probability is assessed using data derived from nuclear profiles (NPs), which are thin slices taken from a cell nucleus. Within each NP, the presence or absence of particular loci is evaluated.[2]

These interaction probabilities, referred to as cosegregation values, are used in mathematical models such as SLICE[3] and normalized linkage disequilibrium. These models contribute to the generation of 3D genome architecture maps as part of genome architecture mapping (GAM) techniques. The resulting 3D renderings provide insights into genomic density and the radial positioning of loci within the nucleus.

Articles using co-segregation methodologies
Title | Description
Complex multi-enhancer contacts captured by Genome Architecture Mapping (GAM).[3] | In this study, co-segregation between pairs of loci was used to quantify normalized linkage disequilibrium.
A simple method for cosegregation analysis to evaluate the pathogenicity of unclassified variants; BRCA1 and BRCA2 as an example.[4] | Combining co-segregation analysis with a multifactorial approach produced highly conclusive results when classifying previously unclassified variants.
Considerations in assessing germline variant pathogenicity using co-segregation analysis.[5] | Found that Bayes factor co-segregation analysis, combined with a strong penetrance model, yields higher accuracy than meiosis counting.
A Comparison of Cosegregation Analysis Methods for the Clinical Setting.[6] | Compares the utility of full-likelihood Bayes factors, cosegregation likelihood ratios, and meiosis counting for evaluating the pathogenicity of genetic variants.
Dissecting the co-segregation probability from genome architecture mapping.[7] | Assesses the utility of cosegregation in Genome Architecture Mapping, finding that normalized probability calculations are a reasonable representation of inter-locus distance.

Some of the earliest known studies to use cosegregation in genealogy date back to the early 1980s. Around this time, scientists were conducting experiments on vegetative organisms to see whether there are unique sequences of chloroplast DNA. The experiment tracked the chloroplast gene in each generation by clustering the genes in nucleoids to reduce the number of segregating units. The study was carried out in the Zoology Department at Duke University,[8] where Karen P. VanWinkle-Swift used pedigree diagrams to show how traits and sequences were passed down from parent to child.

In genetics, cosegregation is also used in genome architecture mapping (GAM) to identify the compaction and adjacency of genomic windows. In a 2017 study, cosegregation was used as part of the larger GAM process to understand how gene-expression-specific contacts organize the genome in mammalian nuclei.[3] The study produced complex 3D structures that displayed interactions within certain regions of chromatin contacts, and it showed that GAM is a useful tool in the genome biologist's skill set that expands the ability to finely dissect 3D chromatin structures, cell types and valuable human samples. A study in 2021 "discovered extensive 'melting' of long genes when they are highly expressed and/or have high chromatin accessibility. The contacts most specific of neuron subtypes contain genes associated with specialized processes, such as addiction and synaptic plasticity, which harbour putative binding sites for neuronal transcription factors within accessible chromatin regions."[9] Both of these studies used mice as models because of their anatomical, physiological, and genetic similarity to humans.[10]

Usage

Overview

In genetics, cosegregation analysis is used to examine how multiple genetic factors are inherited together and how their interactions contribute to biological traits or conditions. It is particularly useful when a single gene does not completely explain the presence of a specific trait. By detecting patterns in which genetic variants occur together, researchers can identify relationships between genes and analyze how combinations of factors influence outcomes. For example, when a disorder is associated with a particular gene but is not consistently observed in those who carry that gene, cosegregation analysis can identify additional interacting genes that may contribute to the condition.

Cancer Research

Cosegregation is actively used in medical fields such as cancer research.[11] Many forms of cancer are caused not by a single mutation or gene but by a combination of multiple changes that disrupt normal processes. With cosegregation analysis, researchers can highlight the strongest connections between genes in cases where cancer develops. This approach helps reveal complex relationships and interactions between genes, supporting further research into the diagnosis and treatment of specific cancer types.

Heatmap displaying the cosegregation and linkage of 81 windows within a mouse genome.

Computational Biology

Cosegregation analysis is used widely in computational biology to study relationships between regions of a genome and to quantify patterns of genomic association. In this area of study, genomic loci or windows are analyzed to observe how frequently they are detected together in a sample. Detection frequencies are used to calculate cosegregation values which can then be normalized to show the strength of each connection.[12]
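The counting step described above can be sketched in a few lines of Python. This is a minimal illustration rather than any published pipeline: the toy table, the window names (`w1` to `w3`) and the helper functions `detection_frequency` and `cosegregation` are all hypothetical, and the normalization step is left out.

```python
from itertools import combinations

# Hypothetical toy table (an assumption for illustration only):
# one entry per genomic window, one value per nuclear profile (NP);
# 1 means the window was detected in that NP, 0 means it was not.
table = {
    "w1": [1, 1, 0, 1, 0, 0],
    "w2": [1, 1, 0, 0, 0, 0],
    "w3": [0, 0, 1, 0, 1, 1],
}


def detection_frequency(row):
    """Fraction of NPs in which a single window is detected."""
    return sum(row) / len(row)


def cosegregation(row_a, row_b):
    """Fraction of NPs in which both windows are detected together."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    return both / len(row_a)


# Detection frequency per window and cosegregation per pair of windows.
freqs = {name: detection_frequency(row) for name, row in table.items()}
coseg = {
    (a, b): cosegregation(table[a], table[b])
    for a, b in combinations(table, 2)
}

for pair, value in coseg.items():
    print(pair, round(value, 3))
```

The raw values in `coseg` depend on how often each window is detected at all, which is why a normalization step usually follows.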


Examples of using cosegregation in genetics

An example application of cosegregation is finding the normalized linkage disequilibrium (NLD) between two loci. The input is a 2D dataset (row = genomic window slice, column = nuclear profile (NP)) in which a "1" indicates that the window was detected in that NP and a "0" that it was not. From this data, the NLD can be found using the base disequilibrium ($D$) and its theoretical maximum ($D_{max}$). The number of NPs in which loci (genomic windows) $A$ and $B$ are detected is used to find the detection frequencies $f_A$ and $f_B$ and the co-segregation $f_{AB}$, which is the fraction of NPs in which $A$ and $B$ are detected together. Once the NLD has been found for each pair of loci, the values are placed into another dataset to be visualized and analyzed to determine how interconnected each locus is. In this example, Python was used both to compute the NLD and to visualize the data and results. Using the NLD, further analysis can place the windows into "communities". To showcase this, the graph to the right shows the community of one of the windows with the highest centrality, where centrality is the average of the window's NLD values.

Displays the communities for a specific locus using centrality

An alternative to normalized linkage disequilibrium is normalized pointwise mutual information (NPMI). NPMI measures how closely two loci are associated by taking the log of their joint cosegregation probability, $f_{AB}$, divided by the product of their independent probabilities, $f_A f_B$. This log is then divided by the negative log of their joint probability, $-\log f_{AB}$, to normalize the result.
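As a rough sketch of that calculation, the function below computes NPMI from three frequencies. The name `npmi` and its arguments are hypothetical, and the two special cases (loci that are never detected together, and loci detected together in every NP) are conventions assumed here, not something the cited work specifies.

```python
import math


def npmi(f_a, f_b, f_ab):
    """Normalized pointwise mutual information of two loci.

    f_a, f_b: detection frequencies of the two loci.
    f_ab:     frequency with which both are detected together.
    """
    if f_ab == 0:
        # Assumed convention: the limiting value when the loci never co-occur.
        return -1.0
    if f_ab == 1:
        # Assumed convention: both loci are in every NP, so the ratio is 0/0.
        return 0.0
    pmi = math.log(f_ab / (f_a * f_b))
    return pmi / -math.log(f_ab)


print(npmi(0.5, 0.5, 0.25))  # joint frequency equals the independent expectation
print(npmi(0.5, 0.5, 0.5))   # the two loci are always detected together
print(npmi(0.5, 0.5, 0.0))   # the two loci are never detected together
```

The three calls cover the independent case, perfect co-occurrence and complete avoidance, which correspond to the middle and the two ends of the NPMI range.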

Both NLD and NPMI range between -1 and 1 and reflect how the joint cosegregation probability deviates from what would be expected if the two loci were independent. However, they differ in scope as NLD measures linear relationships, while NPMI can capture more complex, non-linear relationships between the loci.[13]

A sample of the 2D dataset that was used for the application of the cosegregation example.
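Formulas for the example above
Calculation | Formula[3]
Detection frequency | $f_A = \frac{N_A}{N}$ or $f_B = \frac{N_B}{N}$
Linkage | $D = f_{AB} - f_A f_B$
Linkage maximum ($D_{max}$) | $D_{max} = \min\left(f_A f_B,\ (1 - f_A)(1 - f_B)\right)$ if $D < 0$, or $D_{max} = \min\left(f_B (1 - f_A),\ f_A (1 - f_B)\right)$ if $D > 0$
Normalized Linkage Disequilibrium (NLD) | $NLD = \frac{D}{D_{max}}$

Normalized Pointwise Mutual Information (NPMI)
Formula
$NPMI = \frac{\log \frac{f_{AB}}{f_A f_B}}{-\log f_{AB}}$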

Pseudo-code showcasing the implementation of co-segregation in data science.
Formula for finding co-segregation given a GAM table showing whether a locus is present in a slice of a genomic region
Formula[3] | Variables
$f_A = \frac{N_A}{N}$ or $f_B = \frac{N_B}{N}$ | $N_A$ and $N_B$ are the numbers of nuclear profiles (NPs) in which the genomic windows $A$ and $B$ are detected, $N$ is the total number of NPs, and $f_{AB}$ is the frequency with which $A$ and $B$ are detected together.

This formula can be easily programmed into code as seen in the pseudo-code in the figure to the right. The code was written to satisfy the Example described above.
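For readers without access to the figure, a minimal Python sketch of the NLD calculation could look like the following. It is not a transcription of the pseudo-code in the figure: the function name `normalized_linkage` and the toy rows are hypothetical, and returning 0 when the maximum linkage is 0 is a convention assumed here.

```python
def normalized_linkage(row_a, row_b):
    """Normalized linkage disequilibrium (NLD) of two windows.

    row_a, row_b: equal-length 0/1 lists, one value per nuclear profile (NP).
    """
    n = len(row_a)
    f_a = sum(row_a) / n                    # detection frequency of A
    f_b = sum(row_b) / n                    # detection frequency of B
    f_ab = sum(1 for a, b in zip(row_a, row_b) if a and b) / n  # cosegregation

    d = f_ab - f_a * f_b                    # linkage
    if d < 0:
        d_max = min(f_a * f_b, (1 - f_a) * (1 - f_b))
    else:
        d_max = min(f_b * (1 - f_a), f_a * (1 - f_b))

    # Assumed convention: a window detected in every NP or in none gives
    # d_max == 0, so report no linkage instead of dividing by zero.
    return d / d_max if d_max > 0 else 0.0


# Hypothetical toy rows over six NPs.
w1 = [1, 1, 0, 1, 0, 0]
w2 = [1, 1, 0, 0, 0, 0]   # only ever detected where w1 is detected
w3 = [0, 0, 1, 0, 1, 1]   # never detected together with w1

print(normalized_linkage(w1, w2))
print(normalized_linkage(w1, w3))
```

Applying the function to every pair of windows fills a symmetric matrix of NLD values, which is the input to the heatmaps and community graphs described in this article.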

Advantages

Given a large dataset of nuclear profiles, cosegregation is highly scalable because of its relatively simple mathematical formulation. The larger the dataset, the more accurate the resulting estimates will be. As depicted in the figure below, adding more NPs to the dataset increases the cost of evaluating the equation only linearly.

How adding more NPs to dataset affects cosegregation equation.

In addition, cosegregation analysis scales efficiently with dataset size and can incorporate multiple loci of interest to determine the interaction probability. Since each additional locus introduces only a single additional computation, the method exhibits linear time complexity. The figure below shows how the number of loci affects the detection frequency equation.

Adding loci increases the cost of the cosegregation equation linearly.

Cosegregation analysis is also valuable in computational biology because it enables genomic data to be represented as matrices and networks. These representations allow for the application of graph-based methods, such as community detection and centrality analysis, to identify clusters of interacting genomic regions and highly connected loci.
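One way to picture the matrix-to-network step is the sketch below. It rests on several assumptions: the matrix values, the 0.5 threshold and the helper names are hypothetical, degree centrality is used as one simple centrality measure, and connected components stand in for a real community detection algorithm.

```python
from collections import deque

# Hypothetical symmetric matrix of normalized cosegregation values.
windows = ["w1", "w2", "w3", "w4"]
matrix = [
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.0, 0.1, 0.7, 1.0],
]
THRESHOLD = 0.5  # assumed cut-off for drawing an edge between two windows


def build_graph(values, threshold):
    """Adjacency sets: connect windows whose value reaches the threshold."""
    n = len(values)
    return {
        i: {j for j in range(n) if j != i and values[i][j] >= threshold}
        for i in range(n)
    }


def degree_centrality(graph):
    """Share of the other windows that each window is connected to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def connected_components(graph):
    """Group windows that are reachable from one another (breadth-first)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in sorted(graph[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components


graph = build_graph(matrix, THRESHOLD)
centrality = degree_centrality(graph)
communities = [[windows[i] for i in comp] for comp in connected_components(graph)]
print(communities)
```

With real data, a dedicated graph library and a proper community detection method would normally replace the connected-components shortcut.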

Community detected in a genomic interaction network derived from cosegregation values. Nodes represent genomic windows and edges represent strong inferred interactions, illustrating how graph-based methods can identify interacting clusters.

In addition, cosegregation values can be visualized using heatmaps and network diagrams, which improve the interpretability of complex genomic interaction patterns. These visualizations help reveal structural features such as chromatin domains and interaction hubs.

Normalized linkage heatmap for genomic windows in the Hist1 region. Brighter values indicate stronger pairwise cosegregation, helping reveal blocks of coordinated chromatin interaction.

The resulting numerical values can be used to infer properties such as radial positioning, chromatin compaction, and the strength of interactions between genomic regions.[14]

Limitations

Visualizations

References
