Half-Normal Plots and Overdispersed Models in R: The hnp Package
Abstract
Count and proportion data may exhibit overdispersion, i.e., greater variability than expected under the Poisson and binomial models, respectively. Several extensions of generalized linear models accommodate overdispersion, including models with a generalized variance function, random-effects models, zero-inflated models, and compound distribution models. Assessing goodness-of-fit and verifying the assumptions of these models is not easy, and half-normal plots with a simulated envelope offer one possible solution. These plots are a useful indicator of goodness-of-fit and may be used with any generalized linear model and its extensions. Functions that generate these plots were widely used by GLIM users; in the open-source software R, however, such functions were not yet available on the Comprehensive R Archive Network (CRAN). We describe a new R package, hnp, that generates half-normal plots with a simulated envelope for residuals from different types of models. The function hnp() can be used together with a range of model-fitting packages in R that extend the basic generalized linear model fitting in glm(), and it is written so that it is relatively easy to extend to new model classes and different diagnostics. We illustrate its use on a range of examples, including continuous and discrete responses, and show how it can be used to inform model selection and diagnose overdispersion.
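To make the idea of a half-normal plot with a simulated envelope concrete, the following is a minimal base-R sketch of the general technique, not the package's own implementation. The data are simulated and purely hypothetical, and the choices of deviance residuals, 99 simulations, and a 95% pointwise envelope are assumptions made for illustration. A Poisson model is fitted with glm() to counts generated with extra-Poisson variation, so the plot can be inspected for points falling outside the envelope.

```r
# Illustrative sketch only: hypothetical data, base R, no hnp dependency.
set.seed(1)

# Hypothetical overdispersed counts (negative binomial responses)
n <- 50
x <- runif(n)
y <- rnbinom(n, mu = exp(1 + x), size = 1.5)

# Working model: Poisson GLM, which does not allow for overdispersion
fit <- glm(y ~ x, family = poisson)

# Diagnostic (assumed choice): ordered absolute deviance residuals
diag_fun <- function(m) sort(abs(residuals(m, type = "deviance")))
obs <- diag_fun(fit)

# Simulate responses from the fitted model, refit, recompute the diagnostic
n_sim <- 99
sims <- simulate(fit, nsim = n_sim)
sim_res <- sapply(seq_len(n_sim), function(i) {
  dat <- data.frame(y = sims[[i]], x = x)
  diag_fun(glm(y ~ x, family = poisson, data = dat))
})

# Pointwise envelope: 2.5%, 50% and 97.5% quantiles at each order statistic
env <- apply(sim_res, 1, quantile, probs = c(0.025, 0.5, 0.975))

# Expected half-normal order statistics (a commonly used approximation)
q <- qnorm((seq_len(n) + n - 1/8) / (2 * n + 1/2))

# Draw the observed diagnostic against the envelope
plot(q, obs, ylim = range(obs, env),
     xlab = "Half-normal scores", ylab = "Absolute deviance residuals")
lines(q, env[1, ])
lines(q, env[2, ], lty = 2)
lines(q, env[3, ])

# Proportion of observed points falling outside the envelope
outside <- mean(obs < env[1, ] | obs > env[3, ])
```

If the fitted model were adequate, most points would be expected to lie inside the envelope; a large value of `outside` suggests a poor fit, which for count data is often a sign of overdispersion. With the hnp package installed, a call such as `hnp(fit)` on the same fitted object would be the packaged route to a comparable plot.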