fci {pcalg} | R Documentation |
Estimate a Partial Ancestral Graph (PAG) from observational data, using the FCI (Fast Causal Inference) algorithm.
fci(suffStat, indepTest, alpha, labels, p, skel.method = c("stable", "original", "stable.fast"), type = c("normal", "anytime", "adaptive"), fixedGaps = NULL, fixedEdges = NULL, NAdelete = TRUE, m.max = Inf, pdsep.max = Inf, rules = rep(TRUE, 10), doPdsep = TRUE, biCC = FALSE, conservative = FALSE, maj.rule = FALSE, numCores = 1, verbose = FALSE)
suffStat |
sufficient statistics: A named |
indepTest |
a |
alpha |
numeric significance level (in (0, 1)) for the individual conditional independence tests. |
labels |
(optional) |
p |
(optional) number of variables (or nodes). May be specified
if |
skel.method |
character string specifying method; the default,
|
type |
character string specifying the version of the FCI
algorithm to be used. By default, it is |
fixedGaps |
|
fixedEdges |
logical matrix of dimension p*p. If entry
|
NAdelete |
If indepTest returns |
m.max |
Maximum size of the conditioning sets that are considered in the conditional independence tests. |
pdsep.max |
Maximum size of Possible-D-SEP for which subsets are
considered as conditioning sets in the conditional independence
tests. If the nodes |
rules |
Logical vector of length 10 indicating which rules should be used when directing edges. The order of the rules is taken from Zhang (2008). |
doPdsep |
If |
biCC |
If |
conservative |
Logical indicating if the unshielded triples should be checked for ambiguity the second time when v-structures are determined. For more information, see details. |
maj.rule |
Logical indicating if the unshielded triples should be checked for ambiguity the second time when v-structures are determined using a majority rule idea, which is less strict than the standard conservative. For more information, see details. |
numCores |
Specifies the number of cores to be used for parallel
estimation of |
verbose |
If true, more detailed output is provided. |
This function is a generalization of the PC algorithm (see
pc
), in the sense that it allows arbitrarily many
latent and selection variables. Under the assumption that the data
are faithful to a DAG that includes all latent and selection variables,
the FCI algorithm (Fast Causal Inference algorithm) (Spirtes, Glymour
and Scheines, 2000) estimates the Markov equivalence class of MAGs
that describe the conditional independence relationships between the
observed variables.
We estimate an equivalence class of maximal ancestral graphs (MAGs) instead of DAGs, since DAGs are not closed under marginalization and conditioning (Richardson and Spirtes, 2002).
An equivalence class of a MAG can be uniquely represented by a partial
ancestral graph (PAG). A PAG contains the following types of edges:
o-o, o-, o->, ->, <->, -. The bidirected edges come from hidden variables,
and the undirected edges come from selection variables. The edges have
the following interpretation: (i) there is an edge between x
and y
if and only if variables x and y are conditionally dependent
given S for all sets S consisting of all selection variables
and a subset of the observed variables; (ii) a tail on an edge means
that this tail is present in all MAGs in the Markov equivalence class;
(iii) an arrowhead on an edge means that this arrowhead is present in
all MAGs in the Markov equivalence class; (iv) a o-edgemark means that
there is a at least one MAG in the Markov equivalence class where the
edgemark is a tail, and at least one where the edgemark is an
arrowhead. Information on the interpretation of edges in a MAG can be
found in the references given below.
The first part of the FCI algorithm is analogous to the PC algorithm. It
starts with a complete undirected graph and estimates an initial skeleton
using skeleton(*, method="stable")
which produces an
initial order-independent skeleton, see skeleton
for
more details. All edges of this skeleton are of
the form o-o. Due to the presence of hidden variables, it is no longer
sufficient to consider only subsets of the neighborhoods of nodes x
and y
to decide whether the edge x o-o y
should be removed.
Therefore, the initial skeleton may contain some superfluous edges.
These edges are removed in the next step of the algorithm which
requires some orientations. Therefore, the v-structures
are determined using the conservative method (see discussion on
conservative
below).
After the v-structures have been oriented, Possible-D-SEP sets for each
node in the graph are computed at once. To decide whether edge
x o-o y
should be removed, one performs conditional indepedence
tests of x and y given all possible subsets of Possible-D-SEP(x) and
of Possible-D-SEP(y). The edge is removed if a conditional
independence is found. This produces a fully order-independent final
skeleton as explained in Colombo and Maathuis (2014). Subsequently,
the v-structures are newly determined on the final skeleton (using
information in sepset). Finally, as many as possible undetermined edge
marks (o) are determined using (a subset of) the 10 orientation rules
given by Zhang (2008).
The “Anytime FCI” algorithm was introduced by Spirtes (2001). It
can be viewed as a modification of the FCI algorithm that only performs
conditional independence tests up to and including order m.max when
finding the initial skeleton, using the function skeleton
, and
the final skeleton, using the function pdsep
. Thus, Anytime FCI
performs fewer conditional independence tests than FCI. To use the
Anytime algorithm, one sets type = "anytime"
and needs to
specify m.max
, the maximum size of the conditioning sets.
The “Adaptive Anytime FCI” algorithm was introduced by Colombo
et. al (2012). The first part of the algorithm is identical to the normal
FCI described above. But in the second part when the final skeleton is
estimated using the function pdsep
, the Adaptive Anytime
FCI algorithm only performs conditional independence tests up to and
including order m.max
, where m.max is the maximum size of the
conditioning sets that were considered to determine the initial
skeleton using the function skeleton
. Thus, m.max is chosen
adaptively and does not have to be specified by the user.
Conservative versions of FCI, Anytime FCI, and Adaptive Anytime FCI
are computed if conservative = TRUE
is specified. After the
final skeleton is computed, all potential
v-structures a-b-c are checked in the following way. We test whether a
and c are independent conditioning on any subset of the neighbors of a
or any subset of the neighbors of c. When a subset makes a and c
conditionally independent, we call it a separating set. If b is in no
such separating set or in all such separating sets, no further action
is taken and the normal version of the FCI, Anytime FCI, or Adaptive
Anytime FCI algorithm is continued. If, however, b is in only some
separating sets, the triple a-b-c is marked ‘ambiguous’. If a is
independent of c given some S in the skeleton (i.e., the edge a-c
dropped out), but a and c remain dependent given all subsets of
neighbors of either a or c, we will call all triples a-b-c
‘unambiguous’. This is because in the FCI algorithm, the true separating set
might be outside the neighborhood of either a or c. An ambiguous
triple is not oriented as a v-structure. Furthermore, no further
orientation rule that needs to know whether a-b-c is a v-structure or
not is applied. Instead of using the conservative version, which is
quite strict towards the v-structures, Colombo and Maathuis (2014)
introduced a less strict version for the v-structures called majority
rule. This adaptation can be called using maj.rule = TRUE
. In
this case, the triple a-b-c is marked as ‘ambiguous’ if and only if b
is in exactly 50 percent of such separating sets or no separating set
was found. If b is in less than 50 percent of the separating sets it
is set as a v-structure, and if in more than 50 percent it is set as a
non v-structure (for more details see Colombo and Maathuis,
2014). Colombo and Maathuis (2014) showed that with both these
modifications, the final skeleton and the decisions about the
v-structures of the FCI algorithm are fully order-independent.
Note that the order-dependence issues on the 10 orientation rules are still present, see Colombo and Maathuis (2014) for more details.
An object of class
fciAlgo
(see
fciAlgo
) containing the estimated graph
(in the form of an adjacency matrix with various possible edge marks),
the conditioning sets that lead to edge removals (sepset) and several other
parameters.
Diego Colombo and Markus Kalisch (kalisch@stat.math.ethz.ch).
D. Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15 3741-3782.
D. Colombo, M. H. Maathuis, M. Kalisch, T. S. Richardson (2012). Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann. Statist. 40, 294-321.
M.Kalisch, M. Maechler, D. Colombo, M. H. Maathuis, P. Buehlmann (2012). Causal Inference Using Graphical Models with the R Package pcalg. Journal of Statistical Software 47(11) 1–26, http://www.jstatsoft.org/v47/i11/.
P. Spirtes (2001). An anytime algorithm for causal inference. In Proc. of the Eighth International Workshop on Artifiial Intelligence and Statistics 213-221. Morgan Kaufmann, San Francisco.
P. Spirtes, C. Glymour and R. Scheines (2000). Causation, Prediction, and Search, 2nd edition, MIT Press, Cambridge (MA).
P. Spirtes, C. Meek, T.S. Richardson (1999). In: Computation, Causation and Discovery. An algorithm for causal inference in the presence of latent variables and selection bias. Pages 211-252. MIT Press.
T.S. Richardson and P. Spirtes (2002). Ancestral graph Markov models. Annals of Statistics 30 962-1030.
J. Zhang (2008). On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence 172 1873-1896.
fciPlus
for a more efficient variation of FCI;
skeleton
for estimating a skeleton
using the PC algorithm; pc
for estimating a CPDAG using
the PC algorithm; pdsep
for computing
Possible-D-SEP for each node and testing and adapting the graph
accordingly; qreach
for a fast way of finding Possible-D-SEP
for a given node.
gaussCItest
, disCItest
,
binCItest
and dsepTest
as examples for
indepTest
.
################################################## ## Example without latent variables ################################################## set.seed(42) p <- 7 ## generate and draw random DAG : myDAG <- randomDAG(p, prob = 0.4) ## find skeleton and PAG using the FCI algorithm suffStat <- list(C = cov2cor(trueCov(myDAG)), n = 10^9) res <- fci(suffStat, indepTest=gaussCItest, alpha = 0.9999, p=p, doPdsep = FALSE) ################################################## ## Example with hidden variables ## Zhang (2008), Fig. 6, p.1882 ################################################## ## create the graph g p <- 4 L <- 1 # '1' is latent V <- c("Ghost", "Max","Urs","Anna","Eva") edL <- setNames(vector("list", length=length(V)), V) edL[[1]] <- list(edges=c(2,4),weights=c(1,1)) edL[[2]] <- list(edges=3,weights=c(1)) edL[[3]] <- list(edges=5,weights=c(1)) edL[[4]] <- list(edges=5,weights=c(1)) g <- new("graphNEL", nodes=V, edgeL=edL, edgemode="directed") ## compute the true covariance matrix of g cov.mat <- trueCov(g) ## delete rows and columns belonging to latent variable L true.cov <- cov.mat[-L,-L] ## transform covariance matrix into a correlation matrix true.corr <- cov2cor(true.cov) ## The same, for the following three examples indepTest <- gaussCItest suffStat <- list(C = true.corr, n = 10^9) ## find PAG with FCI algorithm. ## As dependence "oracle", we use the true correlation matrix in ## gaussCItest() with a large "virtual sample size" and a large alpha: normal.pag <- fci(suffStat, indepTest, alpha = 0.9999, labels = V[-L], verbose=TRUE) ## find PAG with Anytime FCI algorithm with m.max = 1 ## This means that only conditioning sets of size 0 and 1 are considered. ## As dependence "oracle", we use the true correlation matrix in the ## function gaussCItest with a large "virtual sample size" and a large ## alpha anytime.pag <- fci(suffStat, indepTest, alpha = 0.9999, labels = V[-L], type = "anytime", m.max = 1, verbose=TRUE) ## find PAG with Adaptive Anytime FCI algorithm. ## This means that only conditining sets up to size K are considered ## in estimating the final skeleton, where K is the maximal size of a ## conditioning set found while estimating the initial skeleton. ## As dependence "oracle", we use the true correlation matrix in the ## function gaussCItest with a large "virtual sample size" and a large ## alpha adaptive.pag <- fci(suffStat, indepTest, alpha = 0.9999, labels = V[-L], type = "adaptive", verbose=TRUE) ## define PAG given in Zhang (2008), Fig. 6, p.1882 corr.pag <- rbind(c(0,1,1,0), c(1,0,0,2), c(1,0,0,2), c(0,3,3,0)) ## check if estimated and correct PAG are in agreement all(corr.pag == normal.pag @ amat) # TRUE all(corr.pag == anytime.pag @ amat) # FALSE all(corr.pag == adaptive.pag@ amat) # TRUE ij <- rbind(cbind(1:4,1:4), c(2,3), c(3,2)) all(corr.pag[ij] == anytime.pag @ amat[ij]) # TRUE