Title: | Quantile Regression for Linear Mixed-Effects Models |
---|---|
Description: | Quantile regression (QR) for Linear Mixed-Effects Models via the asymmetric Laplace distribution (ALD). It uses the Stochastic Approximation of the EM (SAEM) algorithm for deriving exact maximum likelihood estimates and full inference results for the fixed-effects and variance components. It also provides graphical summaries for assessing the algorithm convergence and fitting results. |
Authors: | Christian E. Galarza <[email protected]> and Victor H. Lachos <[email protected]> |
Maintainer: | Christian E. Galarza <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.3 |
Built: | 2024-11-10 05:00:26 UTC |
Source: | https://github.com/cran/qrLMM |
This package contains a principal function that performs a quantile regression for a Linear Mixed-Effects Model using the Stochastic-Approximation of the EM Algorithm (SAEM) for an unique or a set of quantiles.
Exploiting the nice hierarchical representation of the ALD, our classical approach follows the Stochastic Approximation of the EM(SAEM) algorithm for deriving exact maximum likelihood estimates of the fixed-effects and variance components.
Package: | qrLMM |
Type: | Package |
Version: | 1.0 |
Date: | 2014-12-19 |
License: | What license is it under? |
Christian E. Galarza <[email protected]> and Victor H. Lachos <[email protected]>
Maintainer: Christian E. Galarza <[email protected]>
Delyon, B., Lavielle, M. & Moulines, E. (1999). Convergence of a stochastic approximation version of the EM algorithm. Annals of Statistics, pages 94-128.
Koenker, R., Machado, J. (1999). Goodness of fit and related inference processes for quantile regression. J. Amer. Statist. Assoc. 94(3):1296-1309.
Yu, K. & Moyeed, R. (2001). Bayesian quantile regression. Statistics & Probability Letters, 54(4), 437-447.
Yu, K., & Zhang, J. (2005). A three-parameter asymmetric Laplace distribution and its extension. Communications in Statistics-Theory and Methods, 34(9-10), 1867-1879.
Orthodont
, Cholesterol
, QRLMM
,
QRNLMM
#See examples for the QRLMM function linked above.
#See examples for the QRLMM function linked above.
The Framingham cholesterol study generated a benchmark dataset (Zhang and Davidian, 2001) for longitudinal analysis to examine the role of serum cholesterol as a risk factor for the evolution of cardiovascular disease for 200 randomly selected subjects.
data(Cholesterol)
data(Cholesterol)
This data frame contains the following columns:
newid
a numeric vector indicating the subject on which the measurement was made. It represents the subject number in the sample.
ID
a numeric vector indicating the subject on which the measurement was made. It represents the subject number in the population.
cholst
cholesterol level for patient newid
.
sex
a dichotomous gender (0=female, 1=male).
age
age of the patient in years.
year
years elapsed since the start of the study to the current measurement.
Zhang, D., & Davidian, M. (2001). Linear mixed models with flexible distributions of random effects for longitudinal data. Biometrics, 57(3), 795-802.
https://www.framinghamheartstudy.org/about-fhs/background.php
## Not run: data(Cholesterol) attach(Cholesterol) y = cholst #response x = cbind(1,sex,age) #design matrix for fixed effects z = cbind(1,year) #design matrix for random effects #A median regression median_reg = QRLMM(y,x,newid,nj,MaxIter = 500) ## End(Not run)
## Not run: data(Cholesterol) attach(Cholesterol) y = cholst #response x = cbind(1,sex,age) #design matrix for fixed effects z = cbind(1,year) #design matrix for random effects #A median regression median_reg = QRLMM(y,x,newid,nj,MaxIter = 500) ## End(Not run)
Functions for plotting a profiles plot for grouped data.
group.plot(x,y,groups,...) group.lines(x,y,groups,...) group.points(x,y,groups,...)
group.plot(x,y,groups,...) group.lines(x,y,groups,...) group.points(x,y,groups,...)
y |
the response vector of dimension |
x |
vector of longitudinal (repeated measures) covariate of dimension |
groups |
factor of dimension |
... |
Christian E. Galarza <[email protected]> and Victor H. Lachos <[email protected]>
## Not run: #A full profile plot for Soybean data data(Soybean,package = "qrNLMM") attach(Soybean) group.plot(x = Time,y = weight,groups = Plot,type="b", main="Soybean profiles",xlab="time (days)", ylab="mean leaf weight (gr)") #Profile plot by genotype group.plot(x = Time[Variety=="P"],y = weight[Variety=="P"], groups = Plot[Variety=="P"],type="l",col="blue", main="Soybean profiles by genotype",xlab="time (days)", ylab="mean leaf weight (gr)") group.lines(x = Time[Variety=="F"],y = weight[Variety=="F"], groups = Plot[Variety=="F"],col="black") ## End(Not run)
## Not run: #A full profile plot for Soybean data data(Soybean,package = "qrNLMM") attach(Soybean) group.plot(x = Time,y = weight,groups = Plot,type="b", main="Soybean profiles",xlab="time (days)", ylab="mean leaf weight (gr)") #Profile plot by genotype group.plot(x = Time[Variety=="P"],y = weight[Variety=="P"], groups = Plot[Variety=="P"],type="l",col="blue", main="Soybean profiles by genotype",xlab="time (days)", ylab="mean leaf weight (gr)") group.lines(x = Time[Variety=="F"],y = weight[Variety=="F"], groups = Plot[Variety=="F"],col="black") ## End(Not run)
The Orthodont
data frame has 108 rows and 4 columns of the
change in an orthdontic measurement over time for several young subjects.
This data frame contains the following columns:
a numeric vector of distances from the pituitary to the pterygomaxillary fissure (mm). These distances are measured on x-ray images of the skull.
a numeric vector of ages of the subject (yr).
an ordered factor indicating the subject on which the
measurement was made. The levels are labelled M01
to M16
for the males and F01
to F13
for
the females. The ordering is by increasing average distance
within sex.
a factor with levels
Male
and
Female
Investigators at the University of North Carolina Dental School followed the growth of 27 children (16 males, 11 females) from age 8 until age 14. Every two years they measured the distance between the pituitary and the pterygomaxillary fissure, two points that are easily identified on x-ray exposures of the side of the head.
Pinheiro, J. C. and Bates, D. M. (2000), Mixed-Effects Models in S and S-PLUS, Springer, New York. (Appendix A.17)
Potthoff, R. F. and Roy, S. N. (1964), “A generalized multivariate analysis of variance model useful especially for growth curve problems”, Biometrika, 51, 313–326.
## Not run: data(Orthodont) attach(Orthodont) sex = c() sex[Sex=="Male"]=0 sex[Sex=="Female"]=1 y = distance #response x = cbind(1,sex,age) #design matrix for fixed effects z = cbind(1,age) #design matrix for random effects #A median regression median_reg = QRLMM(y,x,z,Subject,MaxIter = 500) ## End(Not run)
## Not run: data(Orthodont) attach(Orthodont) sex = c() sex[Sex=="Male"]=0 sex[Sex=="Female"]=1 y = distance #response x = cbind(1,sex,age) #design matrix for fixed effects z = cbind(1,age) #design matrix for random effects #A median regression median_reg = QRLMM(y,x,z,Subject,MaxIter = 500) ## End(Not run)
Performs a quantile regression for a LMEM using the Stochastic-Approximation of the EM Algorithm (SAEM) for an unique or a set of quantiles.
QRLMM(y,x,z,groups,p=0.5,precision=0.0001,MaxIter=300,M=10,cp=0.25, beta=NA,sigma=NA,Psi=NA,show.convergence=TRUE,CI=95)
QRLMM(y,x,z,groups,p=0.5,precision=0.0001,MaxIter=300,M=10,cp=0.25, beta=NA,sigma=NA,Psi=NA,show.convergence=TRUE,CI=95)
y |
the response vector of dimension |
x |
design matrix for the fixed effects of dimension |
z |
design matrix for the random effects of dimension |
groups |
factor of dimension |
p |
unique quantile or a set of quantiles related to the quantile regression. |
precision |
the convergence maximum error. |
MaxIter |
the maximum number of iterations of the SAEM algorithm. Default = 300. |
M |
Number of Monte Carlo simulations used by the SAEM Algorithm. Default = 10. For more accuracy we suggest to use |
cp |
cut point |
beta |
fixed effects vector of initial parameters, if desired. |
sigma |
dispersion initial parameter for the error term, if desired. |
Psi |
Variance-covariance random effects matrix of initial parameters, if desired. |
show.convergence |
if |
CI |
Confidence to be used for the Confidence Interval when a grid of quantiles is provided. Default=95. |
This function considers a linear mixed-effects model defined as:
where, and
are the design matrices for the fixed and random effects respectively,
are the fixed effects (associated to the
-th quantile),
are the random (normal) effects and
is a random error (considered to be asymmetric Laplace).
This algorithm performs the SAEM algorithm proposed by Delyon et al. (1999), a stochastic version of the usual EM Algorithm deriving exact maximum likelihood estimates of the fixed-effects and variance components.
If the initial parameters are not provided, by default, the fixed effects parameter and dispersion parameter
will be the maximum Likelihood Estimates for an Asymmetric Laplace Distribution (obviating the random term). See Yu & Zhang
(2005).
When a grid of quantiles is provided, a graphical summary with point estimates and Confidence Intervals for model parameters is shown and also a graphical summary for the convergence of these estimates (for each quantile), if show.convergence=TRUE
.
If the convergence graphical summary shows that convergence has not be attained, it's suggested to increase M
to 20, to increase the total number of iterations MaxIter
to 500 or both.
About the cut point parameter cp
, a number between 0 and 1 will assure an initial convergence in distribution to a solution neighborhood for the first
cp
*MaxIter
iterations and an almost sure convergence for the rest of the iterations. If you do not know how SAEM algorithm works, this parameter SHOULD NOT be changed.
This program uses progress bars that will close when the algorithm ends. They must not be closed before if not the algorithm will stop.
The function returns a list with two objects
conv |
A two elements list with the matrices |
The second element of the list is res
, a list of 12 elements detailed as
iter |
number of iterations. |
criteria |
attained criteria value. |
beta |
fixed effects estimates. |
weights |
random effects weights ( |
sigma |
scale parameter estimate for the error term. |
Psi |
Random effects variance-covariance estimate matrix. |
SE |
Standard Error estimates. |
table |
Table containing the inference for the fixed effects parameters. |
loglik |
Log-likelihood value. |
AIC |
Akaike information criterion. |
BIC |
Bayesian information criterion. |
HQ |
Hannan-Quinn information criterion. |
time |
processing time. |
If a grid of quantiles was provided, the result is a list of the same dimension where each element corresponds to each quantile as detailed above.
Christian E. Galarza <[email protected]> and Victor H. Lachos <[email protected]>
Delyon, B., Lavielle, M. & Moulines, E. (1999). Convergence of a stochastic approximation version of the EM algorithm. Annals of Statistics, pages 94-128.
Yu, K., & Zhang, J. (2005). A three-parameter asymmetric Laplace distribution and its extension. Communications in Statistics-Theory and Methods, 34(9-10), 1867-1879.
## Not run: #Using the Orthodontic distance growth data data(Orthodont) attach(Orthodont) y = distance #response x = cbind(1,c(rep(0,64),rep(1,44)),age) #design matrix for fixed effects z = cbind(1,age) #design matrix for random effects groups = Subject model = QRLMM(y,x,z,groups,MaxIter=200) beta = model$res$beta #fixed effects weights = model$res$weight #random weights nj = c(as.data.frame(table(groups))[,2]) #obs per subject fixed = tcrossprod(x,t(beta)) random = rep(0,dim(x)[1]) #initializing random shift for (j in 1:length(nj)){ z1=matrix(z[(sum(nj[1:j-1])+1):(sum(nj[1:j])),],ncol=dim(z)[2]) random[(sum(nj[1:j-1])+1):(sum(nj[1:j]))] = tcrossprod(z1,t(weights[j,])) } pred = fixed + random #predictions group.plot(age,pred,groups,type = "l") group.points(age,distance,groups) ########## #Fit a very quick regression for the three quartiles (Just for having an idea!) QRLMM(y,x,z,groups,p = c(0.25,0.50,0.75),MaxIter=50,M=10) #A full profile quantile regression (This might take some time) QRLMM(y,x,z,groups,p = seq(0.05,0.95,0.05),MaxIter=300,M=10) #A simple output example ------------------------------------------------- Quantile Regression for Linear Mixed Model ------------------------------------------------- Quantile = 0.75 Subjects = 27 ; Observations = 108 ; Balanced = 4 ----------- Estimates ----------- - Fixed effects Estimate Std. Error z value Pr(>|z|) beta 1 17.08405 0.53524 31.91831 0 19 beta 2 2.15393 0.36929 5.83265 0 beta 3 0.61882 0.05807 10.65643 0 sigma = 0.38439 Random effects i) weights ... ii) Varcov matrix z1 z2 z1 0.16106 -0.00887 z2 -0.00887 0.02839 ------------------------ Model selection criteria ------------------------ Loglik AIC BIC HQ Value -216.454 446.907 465.682 454.52 ------- Details ------- Convergence reached? = FALSE Iterations = 300 / 300 Criteria = 0.00381 MC sample = 10 Cut point = 0.25 Processing time = 7.590584 mins ## End(Not run)
## Not run: #Using the Orthodontic distance growth data data(Orthodont) attach(Orthodont) y = distance #response x = cbind(1,c(rep(0,64),rep(1,44)),age) #design matrix for fixed effects z = cbind(1,age) #design matrix for random effects groups = Subject model = QRLMM(y,x,z,groups,MaxIter=200) beta = model$res$beta #fixed effects weights = model$res$weight #random weights nj = c(as.data.frame(table(groups))[,2]) #obs per subject fixed = tcrossprod(x,t(beta)) random = rep(0,dim(x)[1]) #initializing random shift for (j in 1:length(nj)){ z1=matrix(z[(sum(nj[1:j-1])+1):(sum(nj[1:j])),],ncol=dim(z)[2]) random[(sum(nj[1:j-1])+1):(sum(nj[1:j]))] = tcrossprod(z1,t(weights[j,])) } pred = fixed + random #predictions group.plot(age,pred,groups,type = "l") group.points(age,distance,groups) ########## #Fit a very quick regression for the three quartiles (Just for having an idea!) QRLMM(y,x,z,groups,p = c(0.25,0.50,0.75),MaxIter=50,M=10) #A full profile quantile regression (This might take some time) QRLMM(y,x,z,groups,p = seq(0.05,0.95,0.05),MaxIter=300,M=10) #A simple output example ------------------------------------------------- Quantile Regression for Linear Mixed Model ------------------------------------------------- Quantile = 0.75 Subjects = 27 ; Observations = 108 ; Balanced = 4 ----------- Estimates ----------- - Fixed effects Estimate Std. Error z value Pr(>|z|) beta 1 17.08405 0.53524 31.91831 0 19 beta 2 2.15393 0.36929 5.83265 0 beta 3 0.61882 0.05807 10.65643 0 sigma = 0.38439 Random effects i) weights ... ii) Varcov matrix z1 z2 z1 0.16106 -0.00887 z2 -0.00887 0.02839 ------------------------ Model selection criteria ------------------------ Loglik AIC BIC HQ Value -216.454 446.907 465.682 454.52 ------- Details ------- Convergence reached? = FALSE Iterations = 300 / 300 Criteria = 0.00381 MC sample = 10 Cut point = 0.25 Processing time = 7.590584 mins ## End(Not run)