glmertree {glmertree} | R Documentation |
Model-based recursive partitioning based on (generalized) linear mixed models.
lmertree(formula, data, weights = NULL, offset = NULL, ranefstart = NULL,
  cluster = NULL, abstol = 0.001, maxit = 100, joint = TRUE, dfsplit = TRUE,
  verbose = FALSE, plot = FALSE, lmer.control = lmerControl(), ...)

glmertree(formula, data, family = "binomial", weights = NULL, offset = NULL,
  ranefstart = NULL, cluster = NULL, abstol = 0.001, maxit = 100, joint = TRUE,
  dfsplit = TRUE, verbose = FALSE, plot = FALSE, glmer.control = glmerControl(), ...)
formula |
formula specifying the response variable and a three-part right-hand-side describing the regressors, random effects, and partitioning variables, respectively. For details see below. |
data |
data.frame to be used for estimating the model tree. |
family |
family specification for |
weights |
numeric. An optional numeric vector of weights. (Note that
this is passed with standard evaluation, i.e., it is not enough to pass
the name of a column in |
offset |
numeric. An optional numeric vector of values to be included in
the linear predictor with a coefficient of one. (Note that this is passed with
standard evaluation, i.e., it is not enough to pass the name of a column in
|
ranefstart |
numeric vector, |
cluster |
optional vector (numeric, character or factor) with a cluster ID
to be employed for clustered covariances in the parameter stability tests. If
|
abstol |
numeric. The convergence criterion used for estimation of the model.
When the difference in log-likelihoods of the random-effects model from two
consecutive iterations is smaller than |
maxit |
numeric. The maximum number of iterations to be performed in estimation of the model tree. |
joint |
logical. Should the fixed effects from the tree be (re-)estimated jointly along with the random effects? |
dfsplit |
logical or numeric. |
verbose |
Should the log-likelihood value of the estimated random-effects model be printed for every iteration of the estimation? |
plot |
Should the tree be plotted at every iteration of the estimation? Note that selecting this option slows down execution of the function. |
lmer.control, glmer.control |
list. An optional list with control
parameters to be passed to |
... |
Additional arguments to be passed to |
(G)LMM trees learn a tree where each terminal node is associated with different regression coefficients while adjusting for global random effects (such as a random intercept). This allows for detection of subgroup-specific fixed effects, keeping the random effects constant throughout the tree. The estimation algorithm iterates between (1) estimation of the tree given an offset of random effects, and (2) estimation of a random-effects model given the tree structure. See Fokkema et al. (2015) for a detailed introduction.
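The alternation between steps (1) and (2) can be sketched by hand for the linear case. This is only an illustration under stated assumptions, not the package's actual implementation: it assumes that `lmtree()` from partykit and `lmer()` from lme4 are the underlying fitters, uses maximum likelihood (`REML = FALSE`) so that the log-likelihoods are comparable across iterations, and introduces the helper columns `re_offset` and `node`, which are hypothetical names chosen for this sketch.

```r
library(lme4)      # lmer()
library(partykit)  # lmtree()

data("DepressionDemo", package = "glmertree")
d <- DepressionDemo

abstol <- 0.001      # same default as in the usage above
maxit <- 100
d$re_offset <- 0     # hypothetical column: start with random effects of zero
old_loglik <- -Inf

for (iter in seq_len(maxit)) {
  ## (1) estimate the tree, treating the random effects as a known offset
  tree <- lmtree(depression ~ treatment | age + anxiety + duration,
                 data = d, offset = re_offset)
  d$node <- factor(predict(tree, newdata = d, type = "node"))

  ## (2) estimate the mixed model given the tree; node-specific fixed
  ## effects are obtained by nesting the regressor within the node factor
  mm_formula <- if (nlevels(d$node) > 1) {
    depression ~ node / treatment - 1 + (1 | cluster)
  } else {
    depression ~ treatment + (1 | cluster)
  }
  lme <- lmer(mm_formula, data = d, REML = FALSE)

  ## random-effects contribution: prediction with minus without random effects
  d$re_offset <- predict(lme, newdata = d) -
    predict(lme, newdata = d, re.form = NA)

  ## stop once the log-likelihood changes by less than abstol
  new_loglik <- as.numeric(logLik(lme))
  if (abs(new_loglik - old_loglik) < abstol) break
  old_loglik <- new_loglik
}

iter        # number of iterations used
new_loglik  # log-likelihood of the last mixed model
```

Step (2) in this sketch corresponds to the joint strategy, in which the fixed effects are re-estimated together with the random effects.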
To specify all variables in the model, a formula such as y ~ x1 + x2 | random | z1 + z2 + z3 is used, where y is the response, x1 and x2 are the regressors in every node of the tree, random is the random effect, and z1 to z3 are the partitioning variables considered for growing the tree. If random is only a single variable such as id, a random intercept with respect to id is used. Alternatively, it may be an explicit random-effects formula such as (1 | id) or a more complicated formula. (Note that in the latter case, the brackets are necessary to protect the pipes in the random-effects formulation.)
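As a brief illustration of the two ways to write the random-effects part, the following sketch fits the same random-intercept model twice on the DepressionDemo data shipped with the package. The object names `lt_short` and `lt_explicit` are chosen for this example only.

```r
library(glmertree)
data("DepressionDemo", package = "glmertree")

## single variable in the middle part: random intercept with respect to cluster
lt_short <- lmertree(depression ~ treatment | cluster | age + anxiety + duration,
                     data = DepressionDemo)

## explicit random-effects formula; the brackets protect the inner pipe
lt_explicit <- lmertree(depression ~ treatment | (1 | cluster) | age + anxiety + duration,
                        data = DepressionDemo)

## both specifications should describe the same model
all.equal(as.numeric(lt_short$loglik), as.numeric(lt_explicit$loglik))
```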
In the random-effects model from step (2), two strategies are available: Either the fitted values from the tree can be supplied as an offset (joint = FALSE) so that only the random effects are estimated. Or the fixed effects are (re-)estimated along with the random effects using a nesting factor with nodes from the tree (joint = TRUE). In the former case, the estimation of each random-effects model is typically faster but more iterations are required.
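To compare the two strategies on a given dataset, one can fit the tree both ways and inspect the `iterations` and `loglik` components of the returned lists. This is a minimal sketch; the object names are illustrative, and the actual iteration counts depend on the data.

```r
library(glmertree)
data("DepressionDemo", package = "glmertree")

## fixed effects re-estimated jointly with the random effects (default)
lt_joint <- lmertree(depression ~ treatment | cluster | age + anxiety + duration,
                     data = DepressionDemo, joint = TRUE)

## fitted values from the tree supplied as an offset
lt_offset <- lmertree(depression ~ treatment | cluster | age + anxiety + duration,
                      data = DepressionDemo, joint = FALSE)

## number of iterations each strategy needed
c(joint = lt_joint$iterations, offset = lt_offset$iterations)

## log-likelihood of the last iteration under each strategy
c(joint = as.numeric(lt_joint$loglik), offset = as.numeric(lt_offset$loglik))
```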
The code is still under development and might change in future versions.
The function returns a list with the following objects:
tree |
The final |
lmer |
The final |
ranef |
The corresponding random effects of |
varcorr |
The corresponding |
variance |
The corresponding |
data |
The dataset specified with the |
loglik |
The log-likelihood value of the last iteration. |
iterations |
The number of iterations used to estimate the |
maxit |
The maximum number of iterations specified with the |
ranefstart |
The random effects used as an offset, as specified with
the |
formula |
The formula as specified with the |
randomformula |
The formula as specified with the |
abstol |
The prespecified value for the change in log-likelihood to evaluate
convergence, as specified with the |
mob.control |
A list containing control parameters passed to
|
lmer.control |
A list containing control parameters passed to
|
joint |
Whether the fixed effects from the tree were (re-)estimated jointly along
with the random effects, specified with the |
Fokkema M, Smits N, Zeileis A, Hothorn T, Kelderman H (2015). “Detecting Treatment-Subgroup Interactions in Clustered Data with Generalized Linear Mixed-Effects Model Trees”. Working Paper 2015-10. Working Papers in Economics and Statistics, Research Platform Empirical and Experimental Economics, Universität Innsbruck. http://EconPapers.RePEc.org/RePEc:inn:wpaper:2015-10
## artificial example data
data("DepressionDemo", package = "glmertree")

## fit normal linear regression LMM tree for continuous outcome
lt <- lmertree(depression ~ treatment | cluster | age + anxiety + duration,
  data = DepressionDemo)
print(lt)
plot(lt, which = "all") # default behavior, which may also be "tree" or "ranef"
coef(lt)
ranef(lt)
predict(lt, type = "response") # default behavior, type may also be "node"
residuals(lt)

## fit logistic regression GLMM tree for binary outcome
gt <- glmertree(depression_bin ~ treatment | cluster | age + anxiety + duration,
  data = DepressionDemo)
print(gt)
plot(gt, which = "all") # default behavior, which may also be "tree" or "ranef"
coef(gt)
ranef(gt)
predict(gt, type = "response") # default behavior, type may also be "node" or "link"
residuals(gt)