Finite mixture models with fixed weights applied to growth data

Marek Molas; Emmanuel Lesaffre

Biometrical Letters (2012)

  • Volume: 49, Issue: 2, pages 103-119
  • ISSN: 1896-3811

Abstract

The LMS method is widely used to model cross-sectional growth data. In this method the distribution is summarized by three parameters: the Box-Cox power that converts the outcome to normality (L), the median (M), and the coefficient of variation (S). Here, we propose an alternative approach based on fitting finite mixture models with several components, which may perform better than the LMS method when the data show an unusual distribution. Further, we explore fixing the weights of the mixture components, in contrast to the standard approach in which the weights are estimated. Fixing the weights improves the speed of computation and the stability of the solution, and it provides almost as good a fit as estimating them. Our methodology combines Gaussian mixture modelling and spline smoothing. The estimation of the parameters is based on the joint modelling of mean and dispersion. We illustrate the methodology on the Fourth Dutch Growth Study, a cross-sectional study that contains information on the growth of 7303 boys as a function of age. This information is used to construct centile curves, so-called growth curves, which describe the distribution of height as a smooth function of age. Further, we analyse simulated data showing a bimodal structure at some time point. In its full generality, this approach permits the replacement of the Gaussian components by any parametric density. Further, different components of the mixture can have a different probabilistic (multivariate) structure, allowing for censoring and truncation.
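
The idea of holding the mixture weights fixed can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: it treats a single age group only (no spline smoothing over age), and it uses a plain EM update for the component means and standard deviations instead of the joint mean and dispersion modelling the abstract describes. The function name `fit_fixed_weight_mixture` and the simulated heights are hypothetical.

```python
import numpy as np

def fit_fixed_weight_mixture(y, weights, n_iter=200, tol=1e-8):
    """EM for a 1-D Gaussian mixture whose weights are held fixed.

    Only the component means and standard deviations are updated;
    the weights passed in are never re-estimated.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    k = w.size
    # Assumed starting values: evenly spaced quantiles and a common spread.
    mu = np.quantile(y, (np.arange(k) + 0.5) / k)
    sigma = np.full(k, y.std())
    prev = -np.inf
    for _ in range(n_iter):
        # E-step: log(weight * normal density) per observation and component.
        log_dens = (np.log(w) - np.log(sigma) - 0.5 * np.log(2 * np.pi)
                    - 0.5 * ((y[:, None] - mu) / sigma) ** 2)
        log_mix = np.logaddexp.reduce(log_dens, axis=1)
        resp = np.exp(log_dens - log_mix[:, None])  # responsibilities
        loglik = log_mix.sum()  # log-likelihood at the current parameters
        # M-step: weighted means and standard deviations; weights untouched.
        nk = resp.sum(axis=0)
        mu = (resp * y[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (y[:, None] - mu) ** 2).sum(axis=0) / nk)
        sigma = np.maximum(sigma, 1e-6)  # guard against collapsing components
        if loglik - prev < tol:
            break
        prev = loglik
    return mu, sigma, loglik

# Hypothetical bimodal "heights" (cm) at one age, fitted with equal fixed weights.
rng = np.random.default_rng(0)
heights = np.concatenate([rng.normal(150, 5, 2000), rng.normal(170, 5, 2000)])
mu, sigma, loglik = fit_fixed_weight_mixture(heights, weights=[0.5, 0.5])
print(mu, sigma, loglik)
```

Because the weights are not updated, each iteration estimates fewer parameters than a standard mixture fit, which is one plausible reading of the speed and stability gains claimed above. Whether the fit remains adequate depends on how well the chosen weights match the data.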

How to cite

Marek Molas and Emmanuel Lesaffre. "Finite mixture models with fixed weights applied to growth data." Biometrical Letters 49.2 (2012): 103-119. <http://eudml.org/doc/268778>.

@article{MarekMolas2012,
author = {Marek Molas and Emmanuel Lesaffre},
journal = {Biometrical Letters},
keywords = {mixture models; growth curves; splines; IWLS algorithm; flexible distributions},
language = {eng},
number = {2},
pages = {103-119},
title = {Finite mixture models with fixed weights applied to growth data},
url = {http://eudml.org/doc/268778},
volume = {49},
year = {2012},
}

TY - JOUR
AU - Marek Molas
AU - Emmanuel Lesaffre
TI - Finite mixture models with fixed weights applied to growth data
JO - Biometrical Letters
PY - 2012
VL - 49
IS - 2
SP - 103
EP - 119
LA - eng
KW - mixture models; growth curves; splines; IWLS algorithm; flexible distributions
UR - http://eudml.org/doc/268778
ER -
