Ancestral graph
From Wikipedia, the free encyclopedia

In statistics and Markov modeling, an ancestral graph is a type of mixed graph used to provide a graphical representation for the result of marginalizing one or more vertices in a graphical model that takes the form of a directed acyclic graph. Ancestral graphs can encode conditional independence relations that arise in directed acyclic graph (DAG) models with latent variables and selection bias.[1]
Ancestral graphs are mixed graphs with three kinds of edges: directed edges (→), drawn as an arrow from one vertex to another, bidirected edges (↔), which have an arrowhead at both ends, and undirected edges (—), which have no arrowheads. An ancestral graph must satisfy the following constraints:[1]
- There are no directed cycles.
- If there is an edge from a vertex to another vertex , with an arrowhead at (that is, either a directed edge from to or a bidirected edge), then there does not exist a path from to consisting of undirected edges and/or directed edges oriented consistently with the path.
- If a vertex is an endpoint of an undirected edge, then it is not also the endpoint of an edge with an arrowhead at .
A maximal ancestral graph (MAG) is an ancestral graph in which no additional edge may be added without violating the ancestral graph properties or changing the encoded independence model.[2]
In the context of causal inference, ancestral graphs assume the Causal Markov Condition (CMC) and Causal Faithfulness Condition (CFC). The CMC requires that every variable is independent of its non-descendants conditional on its parents in the causal graph, while the CFC stipulates that the only independence relationships are those implied by the CMC.[3]
Markov equivalence
Two ancestral graphs are Markov equivalent if they encode the same set of conditional independence relations. The independence relations in an ancestral graph are determined using m-separation, a graphical criterion that generalizes d-separation from DAGs to ancestral graphs.[2]
A collider on a path is a vertex where two arrowheads meet. A non-collider on a path is any vertex on the path that is not a collider. A collider may have an order, which indicates the minimum number of vertices on certain discriminating paths associated with that collider. A discriminating path for a vertex in an ancestral graph is a path where:[1]
- is not adjacent to
- Every vertex is a collider on and a parent of
For maximal ancestral graphs, Markov equivalence can be characterized graphically: two MAGs are Markov equivalent if and only if they have:[1]
- The same adjacencies (pairs of vertices connected by some edge)
- The same unshielded colliders (colliders on unshielded paths where the two vertices adjacent to the collider are not themselves adjacent)
- The same colliders with order on discriminating paths
This characterization leads to a polynomial-time algorithm for determining whether two maximal ancestral graphs are Markov equivalent, running in time where is the number of vertices and is the number of edges.[2]
Applications
Ancestral graphs are used to depict conditional independence relations between variables in Markov models.[4] They are particularly useful for representing the independence structure over observed variables when some variables in an underlying DAG model are unobserved (latent) or when data are sampled from a specific sub-population determined by selection variables.[2]
Ancestral graphs are advantageous because they can represent causal relationships without explicitly including latent variables in the model, which makes the search space tractable for causal discovery algorithms. Unlike DAG models with latent variables, which create an infinite search space unless the number of latent variables or network topology is constrained, ancestral graphs provide a finite representation.[1]
Ancestral graphs are closely related to inducing path graphs (IPGs), another graphical representation for causal systems with latent variables. While syntactically MAGs form a subclass of IPGs (as MAGs do not contain almost directed cycles while IPGs may), PAGs generally reveal more qualitative causal information than partially oriented inducing path graphs (POIPGs).[5] This makes ancestral graphs particularly advantageous for causal discovery and inference tasks.
Ancestral graphs enable the estimation of causal effects from observational data even when the system is causally insufficient (containing unmeasured common causes). The LV-IDA algorithm (Latent Variable Intervention-calculus when the DAG is Absent) extends the IDA approach to estimate bounds on causal effects when latent variables may be present.[3] Unlike methods that assume causal sufficiency (no unmeasured confounders), this approach can identify when causal effects are non-identifiable and provide informative bounds instead of point estimates.
Causal discovery
The FCI algorithm (Fast Causal Inference) is a procedure for learning causal structure from observational data in the presence of latent confounders and selection bias. The algorithm outputs a partial ancestral graph (PAG), which represents a Markov equivalence class of MAGs. Zhang (2008) established that with additional orientation rules beyond those in the original FCI algorithm, the procedure is complete. That is, it can discover all aspects of causal structure that are uniquely determined by conditional independence relations, under standard causal inference assumptions.[1]
The completeness of orientation rules for PAGs has important implications for identifying invariant relationships under interventions. A conditional probability is invariant under an intervention on if the probability remains unchanged after the intervention. PAGs provide complete graphical criteria for determining when such invariance holds, meaning they can identify all invariant relationships that are uniquely determined by the conditional independence structure.[5]