Package 'qrLMM' reference manual

Title:	Quantile Regression for Linear Mixed-Effects Models
Description:	Quantile regression (QR) for Linear Mixed-Effects Models via the asymmetric Laplace distribution (ALD). It uses the Stochastic Approximation of the EM (SAEM) algorithm for deriving exact maximum likelihood estimates and full inference results for the fixed-effects and variance components. It also provides graphical summaries for assessing the algorithm convergence and fitting results.
Authors:	Christian E. Galarza <[email protected]> and Victor H. Lachos <[email protected]>
Maintainer:	Christian E. Galarza <[email protected]>
License:	GPL (>= 2)
Version:	2.3
Built:	2025-01-09 05:11:03 UTC
Source:	https://github.com/cran/qrLMM

Package for Quantile Regression for Linear Mixed-Effects Model

Description

This package contains a principal function that performs a quantile regression for a Linear Mixed-Effects Model using the Stochastic-Approximation of the EM Algorithm (SAEM) for an unique or a set of quantiles.

Exploiting the nice hierarchical representation of the ALD, our classical approach follows the Stochastic Approximation of the EM(SAEM) algorithm for deriving exact maximum likelihood estimates of the fixed-effects and variance components.

Details

Package:	qrLMM
Type:	Package
Version:	1.0
Date:	2014-12-19
License:	What license is it under?

Author(s)

Christian E. Galarza <[email protected]> and Victor H. Lachos <[email protected]>

Maintainer: Christian E. Galarza <[email protected]>

References

Delyon, B., Lavielle, M. & Moulines, E. (1999). Convergence of a stochastic approximation version of the EM algorithm. Annals of Statistics, pages 94-128.

Koenker, R., Machado, J. (1999). Goodness of fit and related inference processes for quantile regression. J. Amer. Statist. Assoc. 94(3):1296-1309.

Yu, K. & Moyeed, R. (2001). Bayesian quantile regression. Statistics & Probability Letters, 54(4), 437-447.

Yu, K., & Zhang, J. (2005). A three-parameter asymmetric Laplace distribution and its extension. Communications in Statistics-Theory and Methods, 34(9-10), 1867-1879.

Examples

#See examples for the QRLMM function linked above.
#See examples for the QRLMM function linked above.

Framingham cholesterol study

Description

The Framingham cholesterol study generated a benchmark dataset (Zhang and Davidian, 2001) for longitudinal analysis to examine the role of serum cholesterol as a risk factor for the evolution of cardiovascular disease for 200 randomly selected subjects.

Usage

data(Cholesterol)data(Cholesterol)

Format

This data frame contains the following columns:

newid: a numeric vector indicating the subject on which the measurement was made. It represents the subject number in the sample.
ID: a numeric vector indicating the subject on which the measurement was made. It represents the subject number in the population.
cholst: cholesterol level for patient newid.
sex: a dichotomous gender (0=female, 1=male).
age: age of the patient in years.
year: years elapsed since the start of the study to the current measurement.

Source

Zhang, D., & Davidian, M. (2001). Linear mixed models with flexible distributions of random effects for longitudinal data. Biometrics, 57(3), 795-802.

References

https://www.framinghamheartstudy.org/about-fhs/background.php

Examples

## Not run: 
data(Cholesterol)
attach(Cholesterol)

y  = cholst            #response
x  = cbind(1,sex,age)  #design matrix for fixed effects
z  = cbind(1,year)     #design matrix for random effects

#A median regression
median_reg = QRLMM(y,x,newid,nj,MaxIter = 500)

## End(Not run)
## Not run: 
data(Cholesterol)
attach(Cholesterol)

y  = cholst            #response
x  = cbind(1,sex,age)  #design matrix for fixed effects
z  = cbind(1,year)     #design matrix for random effects

#A median regression
median_reg = QRLMM(y,x,newid,nj,MaxIter = 500)

## End(Not run)

Plot function for grouped data

Description

Functions for plotting a profiles plot for grouped data.

Usage

group.plot(x,y,groups,...)
group.lines(x,y,groups,...)
group.points(x,y,groups,...)
group.plot(x,y,groups,...)
group.lines(x,y,groups,...)
group.points(x,y,groups,...)

Arguments

`y`	the response vector of dimension $N$ where $N$ is the total of observations.
`x`	vector of longitudinal (repeated measures) covariate of dimension $N$ . For example: Time, location, etc.
`groups`	factor of dimension $N$ specifying the partitions of the data over which the random effects vary.
`...`	additional graphical arguments passed to `matplot`. See `par`.

Author(s)

Christian E. Galarza <[email protected]> and Victor H. Lachos <[email protected]>

Examples

## Not run: 
#A full profile plot for Soybean data

data(Soybean,package = "qrNLMM")
attach(Soybean)

group.plot(x = Time,y = weight,groups = Plot,type="b",
             main="Soybean profiles",xlab="time (days)",
             ylab="mean leaf weight (gr)")

#Profile plot by genotype

group.plot(x = Time[Variety=="P"],y = weight[Variety=="P"],
             groups = Plot[Variety=="P"],type="l",col="blue",
             main="Soybean profiles by genotype",xlab="time (days)",
             ylab="mean leaf weight (gr)")

group.lines(x = Time[Variety=="F"],y = weight[Variety=="F"],
              groups = Plot[Variety=="F"],col="black")

## End(Not run)
## Not run: 
#A full profile plot for Soybean data

data(Soybean,package = "qrNLMM")
attach(Soybean)

group.plot(x = Time,y = weight,groups = Plot,type="b",
             main="Soybean profiles",xlab="time (days)",
             ylab="mean leaf weight (gr)")

#Profile plot by genotype

group.plot(x = Time[Variety=="P"],y = weight[Variety=="P"],
             groups = Plot[Variety=="P"],type="l",col="blue",
             main="Soybean profiles by genotype",xlab="time (days)",
             ylab="mean leaf weight (gr)")

group.lines(x = Time[Variety=="F"],y = weight[Variety=="F"],
              groups = Plot[Variety=="F"],col="black")

## End(Not run)

Growth curve data on an orthdontic measurement

Description

The Orthodont data frame has 108 rows and 4 columns of the change in an orthdontic measurement over time for several young subjects.

Format

This data frame contains the following columns:

distance: a numeric vector of distances from the pituitary to the pterygomaxillary fissure (mm). These distances are measured on x-ray images of the skull.
age: a numeric vector of ages of the subject (yr).
Subject: an ordered factor indicating the subject on which the measurement was made. The levels are labelled M01 to M16 for the males and F01 to F13 for the females. The ordering is by increasing average distance within sex.
Sex: a factor with levels Male and Female

Details

Investigators at the University of North Carolina Dental School followed the growth of 27 children (16 males, 11 females) from age 8 until age 14. Every two years they measured the distance between the pituitary and the pterygomaxillary fissure, two points that are easily identified on x-ray exposures of the side of the head.

Source

Pinheiro, J. C. and Bates, D. M. (2000), Mixed-Effects Models in S and S-PLUS, Springer, New York. (Appendix A.17)

Potthoff, R. F. and Roy, S. N. (1964), “A generalized multivariate analysis of variance model useful especially for growth curve problems”, Biometrika, 51, 313–326.

Examples

## Not run: 
data(Orthodont)
attach(Orthodont)

sex = c()
sex[Sex=="Male"]=0
sex[Sex=="Female"]=1

y  = distance         #response
x  = cbind(1,sex,age) #design matrix for fixed effects
z  = cbind(1,age)     #design matrix for random effects


#A median regression
median_reg = QRLMM(y,x,z,Subject,MaxIter = 500)

## End(Not run)
## Not run: 
data(Orthodont)
attach(Orthodont)

sex = c()
sex[Sex=="Male"]=0
sex[Sex=="Female"]=1

y  = distance         #response
x  = cbind(1,sex,age) #design matrix for fixed effects
z  = cbind(1,age)     #design matrix for random effects


#A median regression
median_reg = QRLMM(y,x,z,Subject,MaxIter = 500)

## End(Not run)

Quantile Regression for Linear Mixed-Effects Models

Description

Performs a quantile regression for a LMEM using the Stochastic-Approximation of the EM Algorithm (SAEM) for an unique or a set of quantiles.

Usage

QRLMM(y,x,z,groups,p=0.5,precision=0.0001,MaxIter=300,M=10,cp=0.25,
      beta=NA,sigma=NA,Psi=NA,show.convergence=TRUE,CI=95)
QRLMM(y,x,z,groups,p=0.5,precision=0.0001,MaxIter=300,M=10,cp=0.25,
      beta=NA,sigma=NA,Psi=NA,show.convergence=TRUE,CI=95)

Arguments

`y`	the response vector of dimension $N$ where $N$ is the total of observations.
`x`	design matrix for the fixed effects of dimension $N x d$ where $d$ represents the number of fixed effects including the intercept, if considered.
`z`	design matrix for the random effects of dimension $N x q$ where $q$ represents the number of random effects.
`groups`	factor of dimension $N$ specifying the partitions of the data over which the random effects vary.
`p`	unique quantile or a set of quantiles related to the quantile regression.
`precision`	the convergence maximum error.
`MaxIter`	the maximum number of iterations of the SAEM algorithm. Default = 300.
`M`	Number of Monte Carlo simulations used by the SAEM Algorithm. Default = 10. For more accuracy we suggest to use `M=20`.
`cp`	cut point $(0 \le cp \le 1)$ which determines the percentage of initial iterations with no memory.
`beta`	fixed effects vector of initial parameters, if desired.
`sigma`	dispersion initial parameter for the error term, if desired.
`Psi`	Variance-covariance random effects matrix of initial parameters, if desired.
`show.convergence`	if `TRUE`, it will show a graphical summary for the convergence of the estimates of all parameters for each quantile in order to assess the convergence.
`CI`	Confidence to be used for the Confidence Interval when a grid of quantiles is provided. Default=95.

Details

This function considers a linear mixed-effects model defined as:

$y_i = x_i*\beta_p + z_i*b_i + \epsilon_i;$

where, $x_i$ and $z_i$ are the design matrices for the fixed and random effects respectively, $\beta_p$ are the fixed effects (associated to the $p$ -th quantile), $b_i$ are the random (normal) effects and $\epsilon_i$ is a random error (considered to be asymmetric Laplace).

This algorithm performs the SAEM algorithm proposed by Delyon et al. (1999), a stochastic version of the usual EM Algorithm deriving exact maximum likelihood estimates of the fixed-effects and variance components.

If the initial parameters are not provided, by default, the fixed effects parameter $\beta$ and dispersion parameter $\sigma$ will be the maximum Likelihood Estimates for an Asymmetric Laplace Distribution (obviating the random term). See Yu & Zhang (2005).

When a grid of quantiles is provided, a graphical summary with point estimates and Confidence Intervals for model parameters is shown and also a graphical summary for the convergence of these estimates (for each quantile), if show.convergence=TRUE.

If the convergence graphical summary shows that convergence has not be attained, it's suggested to increase M to 20, to increase the total number of iterations MaxIter to 500 or both.

About the cut point parameter cp, a number between 0 and 1 $(0 \le cp \le 1)$ will assure an initial convergence in distribution to a solution neighborhood for the first cp*MaxIter iterations and an almost sure convergence for the rest of the iterations. If you do not know how SAEM algorithm works, this parameter SHOULD NOT be changed.

This program uses progress bars that will close when the algorithm ends. They must not be closed before if not the algorithm will stop.

Value

The function returns a list with two objects

conv

A two elements list with the matrices teta and se containing the point estimates and standard error estimate for all parameters along all iterations.

The second element of the list is res, a list of 12 elements detailed as

`iter`	number of iterations.
`criteria`	attained criteria value.
`beta`	fixed effects estimates.
`weights`	random effects weights ( $b_i$ ).
`sigma`	scale parameter estimate for the error term.
`Psi`	Random effects variance-covariance estimate matrix.
`SE`	Standard Error estimates.
`table`	Table containing the inference for the fixed effects parameters.
`loglik`	Log-likelihood value.
`AIC`	Akaike information criterion.
`BIC`	Bayesian information criterion.
`HQ`	Hannan-Quinn information criterion.
`time`	processing time.

Note

If a grid of quantiles was provided, the result is a list of the same dimension where each element corresponds to each quantile as detailed above.

Author(s)

Christian E. Galarza <[email protected]> and Victor H. Lachos <[email protected]>

References

Delyon, B., Lavielle, M. & Moulines, E. (1999). Convergence of a stochastic approximation version of the EM algorithm. Annals of Statistics, pages 94-128.

Yu, K., & Zhang, J. (2005). A three-parameter asymmetric Laplace distribution and its extension. Communications in Statistics-Theory and Methods, 34(9-10), 1867-1879.

Examples


## Not run: 

#Using the Orthodontic distance growth data

data(Orthodont)
attach(Orthodont)

y  = distance #response
x  = cbind(1,c(rep(0,64),rep(1,44)),age) #design matrix for fixed effects
z  = cbind(1,age) #design matrix for random effects
groups = Subject

model = QRLMM(y,x,z,groups,MaxIter=200)

beta = model$res$beta  #fixed effects
weights = model$res$weight  #random weights
nj = c(as.data.frame(table(groups))[,2]) #obs per subject
fixed = tcrossprod(x,t(beta))
random = rep(0,dim(x)[1])  #initializing random shift

for (j in 1:length(nj)){ 
  z1=matrix(z[(sum(nj[1:j-1])+1):(sum(nj[1:j])),],ncol=dim(z)[2])
  random[(sum(nj[1:j-1])+1):(sum(nj[1:j]))] = tcrossprod(z1,t(weights[j,]))
}

pred = fixed + random  #predictions
group.plot(age,pred,groups,type = "l")
group.points(age,distance,groups)


##########
#Fit a very quick regression for the three quartiles (Just for having an idea!)

QRLMM(y,x,z,groups,p = c(0.25,0.50,0.75),MaxIter=50,M=10)

#A full profile quantile regression (This might take some time)
QRLMM(y,x,z,groups,p = seq(0.05,0.95,0.05),MaxIter=300,M=10)

#A simple output example
-------------------------------------------------
Quantile Regression for Linear Mixed Model
-------------------------------------------------
Quantile = 0.75
Subjects = 27 ; Observations = 108 ; Balanced = 4
-----------
Estimates
-----------
- Fixed effects
Estimate Std. Error z value Pr(>|z|)
beta 1 17.08405 0.53524 31.91831 0
19
beta 2 2.15393 0.36929 5.83265 0
beta 3 0.61882 0.05807 10.65643 0
sigma = 0.38439
Random effects

i)  weights
   ...
   
ii) Varcov matrix
z1 z2
z1 0.16106 -0.00887
z2 -0.00887 0.02839
------------------------
Model selection criteria
------------------------
Loglik AIC BIC HQ
Value -216.454 446.907 465.682 454.52
-------
Details
-------
Convergence reached? = FALSE
Iterations = 300 / 300
Criteria = 0.00381
MC sample = 10
Cut point = 0.25
Processing time = 7.590584 mins

## End(Not run)
## Not run: 

#Using the Orthodontic distance growth data

data(Orthodont)
attach(Orthodont)

y  = distance #response
x  = cbind(1,c(rep(0,64),rep(1,44)),age) #design matrix for fixed effects
z  = cbind(1,age) #design matrix for random effects
groups = Subject

model = QRLMM(y,x,z,groups,MaxIter=200)

beta = model$res$beta  #fixed effects
weights = model$res$weight  #random weights
nj = c(as.data.frame(table(groups))[,2]) #obs per subject
fixed = tcrossprod(x,t(beta))
random = rep(0,dim(x)[1])  #initializing random shift

for (j in 1:length(nj)){ 
  z1=matrix(z[(sum(nj[1:j-1])+1):(sum(nj[1:j])),],ncol=dim(z)[2])
  random[(sum(nj[1:j-1])+1):(sum(nj[1:j]))] = tcrossprod(z1,t(weights[j,]))
}

pred = fixed + random  #predictions
group.plot(age,pred,groups,type = "l")
group.points(age,distance,groups)


##########
#Fit a very quick regression for the three quartiles (Just for having an idea!)

QRLMM(y,x,z,groups,p = c(0.25,0.50,0.75),MaxIter=50,M=10)

#A full profile quantile regression (This might take some time)
QRLMM(y,x,z,groups,p = seq(0.05,0.95,0.05),MaxIter=300,M=10)

#A simple output example
-------------------------------------------------
Quantile Regression for Linear Mixed Model
-------------------------------------------------
Quantile = 0.75
Subjects = 27 ; Observations = 108 ; Balanced = 4
-----------
Estimates
-----------
- Fixed effects
Estimate Std. Error z value Pr(>|z|)
beta 1 17.08405 0.53524 31.91831 0
19
beta 2 2.15393 0.36929 5.83265 0
beta 3 0.61882 0.05807 10.65643 0
sigma = 0.38439
Random effects

i)  weights
   ...
   
ii) Varcov matrix
z1 z2
z1 0.16106 -0.00887
z2 -0.00887 0.02839
------------------------
Model selection criteria
------------------------
Loglik AIC BIC HQ
Value -216.454 446.907 465.682 454.52
-------
Details
-------
Convergence reached? = FALSE
Iterations = 300 / 300
Criteria = 0.00381
MC sample = 10
Cut point = 0.25
Processing time = 7.590584 mins

## End(Not run)

Package 'qrLMM'

Help Index

Package for Quantile Regression for Linear Mixed-Effects Model

Description

Details

Author(s)

References

See Also

Examples

Framingham cholesterol study

Description

Usage

Format

Source

References

Examples

Plot function for grouped data

Description

Usage

Arguments

Author(s)

See Also

Examples

Growth curve data on an orthdontic measurement

Description

Format

Details

Source

Examples

Quantile Regression for Linear Mixed-Effects Models

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

See Also

Examples