gamlss.com


Latest News

A three-day short course on GAMLSS will be held in 2009. The short course title is:

"Introduction To Modern Smoothing Methods: GAMLSS and P-Splines In Action".

It will be given by Paul Eilers and Mikis Stasinopoulos at the Postgraduate Statistics Centre at Lancaster University from the 30th of November to the 2nd of December.

The dependencies within the gamlss packages have changed radically in Version 3.0-0.

All gamlss.family distributions have now been moved to the gamlss.dist package. All data files have been moved to the new package gamlss.data. Both packages can now be used in R independently of the main gamlss package. The main package gamlss loads gamlss.dist and gamlss.data automatically.
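As a minimal sketch of what independent use can look like, the two packages can be loaded without the main gamlss package. The normal family NO and the abdom data set are chosen here only for illustration.

```r
# gamlss.dist and gamlss.data loaded on their own, without library(gamlss)
library(gamlss.dist)   # distribution families and their d/p/q/r functions
library(gamlss.data)   # example data sets

# Density, distribution function and quantile function of the normal family NO
dens <- dNO(0, mu = 0, sigma = 1)
prob <- pNO(1.96, mu = 0, sigma = 1)
quan <- qNO(0.5, mu = 2, sigma = 3)

# One of the data sets shipped in gamlss.data
data(abdom)
str(abdom)
```

Calling library(gamlss) instead attaches both packages in one step.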

The function pb() (penalised beta splines) automatically chooses the amount of smoothing needed using one of several local estimation methods, i.e. ML, EM, GAIC and GCV. It is now the recommended smoothing method.
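The sketch below shows one way pb() might be used, assuming the abdom data set from gamlss.data and a normal response. The object names are illustrative. The estimation method is switched through pb.control(), and the fitted degrees of freedom show how much smoothing each method selected.

```r
library(gamlss)
data(abdom)

# Default: the smoothing parameter is estimated by local ML
m_ml <- gamlss(y ~ pb(x), data = abdom, family = NO, trace = FALSE)

# Alternative: choose the smoothing parameter by local GAIC with penalty k = 2
m_gaic <- gamlss(y ~ pb(x, control = pb.control(method = "GAIC", k = 2)),
                 data = abdom, family = NO, trace = FALSE)

# Total degrees of freedom used for mu under each method
edf <- c(ML = m_ml$mu.df, GAIC = m_gaic$mu.df)
print(edf)
```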

You can find the gamlss packages reference card here.

You can find the booklet for the Utrecht short course here.

What is GAMLSS

Generalized Additive Models for Location, Scale and Shape (GAMLSS) are (semi) parametric regression type models. They are parametric, in that they require a parametric distribution assumption for the response variable, and "semi" in the sense that the modelling of the parameters of the distribution, as functions of explanatory variables, may involve using non-parametric smoothing functions.

GAMLSS were introduced by Rigby and Stasinopoulos (2001, 2005) and Akantziliotou et al. (2002) as a way of overcoming some of the limitations associated with the popular Generalized Linear Models (GLM; Nelder and Wedderburn, 1972) and Generalized Additive Models (GAM; Hastie and Tibshirani, 1990).

In GAMLSS the exponential family distribution assumption for the response variable, y, is relaxed and replaced by a general distribution family, including highly skew and/or kurtotic continuous and discrete distributions. The systematic part of the model is expanded to allow modelling not only the mean (or location) but all the parameters of the distribution of y as linear and/or nonlinear parametric and/or additive non-parametric functions of explanatory variables and/or random effects.
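The structure described above can be written compactly. The notation below is not defined on this page. It follows the four-parameter form commonly used in the GAMLSS literature (Rigby and Stasinopoulos, 2005) and should be read as a sketch of that formulation.

```latex
% Response distribution with up to four parameters
y_i \sim \mathcal{D}(\mu_i, \sigma_i, \nu_i, \tau_i), \qquad i = 1, \dots, n

% One predictor per distribution parameter, \theta_1 = \mu, \theta_2 = \sigma,
% \theta_3 = \nu, \theta_4 = \tau, each with its own link function g_k
g_k(\theta_k) = \eta_k = X_k \beta_k + \sum_{j=1}^{J_k} Z_{jk} \gamma_{jk},
\qquad k = 1, \dots, 4
```

Here X_k β_k is the parametric part of the k-th predictor, and each Z_jk γ_jk is an additive smoothing or random-effects term.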

Hence GAMLSS is especially suited to modelling a response variable which does not follow an exponential family distribution (e.g. leptokurtic or platykurtic and/or positively or negatively skewed response data, or overdispersed counts) or which exhibits heterogeneity (e.g. where the scale or shape of the distribution of the response variable changes with explanatory variable(s)).

How to use GAMLSS

The GAMLSS framework of statistical modelling is implemented in a series of packages in R (R Development Core Team, 2007), which is free software; see http://www.R-project.org. The packages can be downloaded from the R package archive, CRAN, or from here (especially for the newer and possibly untested versions).

For new users of GAMLSS we recommend the second edition of the manual. For GAMLSS in action, look at the paper published in the Journal of Statistical Software. For more examples and other topics, look at the booklet for the Utrecht short course.

What distributions can be used

The form of the distribution assumed for the response variable y is very general. There are around 50 different distributions available in the current implementation of GAMLSS. This table displays the names of most of them and their abbreviations in R. New distributions can be added easily. Truncated versions of these distributions are available through the package gamlss.tr. Censored (or interval) response variables can be modelled with the package gamlss.cens.

What additive terms can be used

There are several additive terms available in the current GAMLSS implementation. These include cubic splines, cs(), varying coefficient terms, vc(), penalized splines, ps(), loess, lo(), fractional polynomials, fp(), power polynomials, pp(), random effects, random() and ra(), and non-linear terms, nl().
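As an illustration of how an additive term enters a model formula, the sketch below fits a cubic smoothing spline, cs(), in both the mean and the scale predictors. It assumes the abdom data set from gamlss.data. The degrees of freedom are arbitrary choices for the example, not recommendations.

```r
library(gamlss)
data(abdom)

# Purely linear model for the mean, constant scale
m_lin <- gamlss(y ~ x, data = abdom, family = NO, trace = FALSE)

# Cubic smoothing splines in the mean (mu) and in the scale (sigma)
m_cs <- gamlss(y ~ cs(x, df = 3), sigma.formula = ~ cs(x, df = 2),
               data = abdom, family = NO, trace = FALSE)

# Compare the two fits by their generalised AIC
GAIC(m_lin, m_cs)
```

Other additive terms are used in the same way, by wrapping the explanatory variable in the corresponding function inside the formula.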

Why should I use GAMLSS

If your response variable is count (discrete) data, it is very likely that the Poisson distribution will not fit well. GAMLSS provides a variety of discrete distributions (including the negative binomial) that you can try out. The dispersion parameter can also be modelled as a function of explanatory variables.
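A small simulated example makes the point concrete. The data below are artificial and the coefficients are arbitrary assumptions. The counts are drawn from a negative binomial (NBI) distribution whose dispersion grows with x, and a Poisson fit is then compared with an NBI fit that also models the dispersion.

```r
library(gamlss)

# Simulated overdispersed counts: both mean and dispersion depend on x
set.seed(1)
n <- 500
x <- runif(n)
y <- rNBI(n, mu = exp(1 + x), sigma = exp(-1 + 2 * x))
dat <- data.frame(y = y, x = x)

# Poisson model: only the mean is modelled
m_po <- gamlss(y ~ x, family = PO, data = dat, trace = FALSE)

# Negative binomial model: mean and dispersion both depend on x
m_nb <- gamlss(y ~ x, sigma.formula = ~ x, family = NBI,
               data = dat, trace = FALSE)

# A smaller GAIC indicates the better-supported model
GAIC(m_po, m_nb)
```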

For continuous response variables, GAMLSS provides a variety of distributions, some of which can deal with skewness, some with kurtosis, and some with both. For situations where extreme outliers exist in the response variable, some of the distributions possess robust properties. For a reasonably large number of observations, say more than 1000, it is likely that the exponential family distributions available within the generalized linear model framework will not adequately fit the data.

For centile estimation, the WHO Multicentre Growth Reference Study Group have recommended GAMLSS and the BCPE distribution for the construction of the WHO Child Growth Standards.
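A centile plot can be sketched as follows. This example uses the abdom data set and the Box-Cox t family, BCT, rather than BCPE, so it illustrates the general approach and not the WHO analysis. BCPE can be substituted by setting family = BCPE, and the centile values are arbitrary choices.

```r
library(gamlss)
data(abdom)

# Smooth curves for the median (mu) and the scale (sigma); the skewness and
# kurtosis parameters of BCT are kept constant
m_bct <- gamlss(y ~ pb(x), sigma.formula = ~ pb(x),
                family = BCT, data = abdom, trace = FALSE)

# Plot fitted centile curves against x and print the percentage of
# observations falling below each curve
centiles(m_bct, xvar = abdom$x, cent = c(3, 10, 25, 50, 75, 90, 97))
```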

How to learn more about GAMLSS

The original manual, now in its second edition, provides information on how to use the R package gamlss. For examples using the package gamlss, the recent Journal of Statistical Software paper is suitable. For statisticians wanting to know more about the theory and the algorithms, we recommend the Royal Statistical Society read paper.

References

Akantziliotou, K., Rigby, R. A. and Stasinopoulos, D. M. (2002). The R implementation of Generalized Additive Models for Location, Scale and Shape, in: Statistical Modelling in Society: Proceedings of the 17th International Workshop on Statistical Modelling, ed: Stasinopoulos, M. and Touloumi, G., 75-83, Chania, Greece.

Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models. Chapman and Hall, London.

Nelder, J. A. and Wedderburn, R. W. M. (1972). Generalized linear models. J. R. Statist. Soc. A, 135, 370-384.

Rigby, R. A. and Stasinopoulos, D. M. (2001). The GAMLSS project: a flexible approach to statistical modelling, in: New Trends in Statistical Modelling: Proceedings of the 16th International Workshop on Statistical Modelling, ed: Klein, B. and Korsholm, L., 249-256, Odense, Denmark.

Rigby, R. A. and Stasinopoulos, D. M. (2005). Generalized Additive Models for Location, Scale and Shape (with discussion). Appl. Statist., 54, 507-554.

R Development Core Team (2007). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
