November 26, 2025
In a Generalized Linear Model (GLM) we have
\[ g(E(y_i)) = g(\mu_i) = \beta_0 + \beta_1 x_{1i} + ... + \beta_p x_{pi} \]
and
\[ \text{Var}(y_i) = \phi a_i V(\mu_i) \]
A GNM extends this to \[ g(\mu_i) = \eta(\boldsymbol{x}_i; \boldsymbol{\beta}) \] where \(\eta(\boldsymbol{x}_i; \boldsymbol{\beta})\) is nonlinear in the parameters \(\boldsymbol{\beta}\).
GNMs may be thought of as…
… an extension of Nonlinear Least Squares
… an extension of GLMs
{gnm} focuses on the latter
Armstrong et al (2014, BMC Medical Research Methodology) provide the following example data original from Bhaskaran et al, 2013, International Journal of Epidemiology:
date daily from 2002-01-01 to 2006-12-31ozone daily average, µg/m\(^3\)temperature \(^\circ\)CnumdeathsDefine month-weekday strata:
| record | date | weekday | stratum | numdeaths | ozone | temperature |
|---|---|---|---|---|---|---|
| 1 | 2002-01-01 | Tue | 2002-Jan-Tue | 199 | 4.59 | -0.23 |
Expand each record to make case-control sets within the strata:
| record | date | weekday | case | weight | ozone | temperature |
|---|---|---|---|---|---|---|
| 1 | 2002-01-01 | Tue | 1 | 199 | 4.59 | -0.23 |
| 1 | 2002-01-08 | Tue | 0 | 199 | 3.13 | 3.51 |
| 1 | 2002-01-15 | Tue | 0 | 199 | 15.95 | 7.24 |
| 1 | 2002-01-22 | Tue | 0 | 199 | 43.09 | 9.04 |
| 1 | 2002-01-29 | Tue | 0 | 199 | 38.11 | 10.91 |
Given that one case occurs in each stratum, we can model the event \(D_{i,s}\) that the death in stratum \(s\) occurs on day \(i\) as follows:
\[D_{i,s} \sim \text{Bernoulli} \left(\pi_i = \frac{\exp \left\{\boldsymbol{\beta}^T\boldsymbol{x}_i\right\}}{\sum_{j \in s(i)} \exp \left\{ \boldsymbol{\beta}^T \boldsymbol{x}_j\right\}}\right)\] In our London example, we have:
ozone and one for temperatureArmstrong et al (2014) propose to fit the equivalent Poisson model to \(y_{i,s}\), the number of deaths on date \(i\) in stratum \(s\), where
\[ E(y_i) = \exp(\alpha_s + \boldsymbol{\beta}^T\boldsymbol{x}_i) \] and \(\alpha_s\) are nuisance parameters that ensure the multinomial denominators are matched.
In this case we have
GLMs are estimated with an IWLS algorithm, where
\[ \hat{\boldsymbol{\beta}}^{(r + 1)} = \left(\boldsymbol{X}^{T}\boldsymbol{W}^{(r)}\boldsymbol{X}\right)^{-1}\boldsymbol{X}^{T}\boldsymbol{W}^{(r)}\boldsymbol{z}^{(r)} \]
For GNMs, {gnm}:
Uses a generalized inverse to avoid specifying identifiability constraints for nonlinear terms.
Can eliminate parameters of a nuisance factor from \(\boldsymbol{X}\), for efficient estimation.
For an ordinary least squares model \[ \begin{align*} \left[(\boldsymbol{y}|\boldsymbol{X})^T(\boldsymbol{y}|\boldsymbol{X})\right]^{-1} &= \begin{pmatrix} \boldsymbol{y}^T\boldsymbol{y} & \boldsymbol{y}^T\boldsymbol{X} \\ \boldsymbol{X}^T\boldsymbol{y} & \boldsymbol{X}^T\boldsymbol{X} \\ \end{pmatrix}^{-1} = \begin{pmatrix} \boldsymbol{A}_{11} & \boldsymbol{A}_{12} \\ \boldsymbol{A}_{21} & \boldsymbol{A}_{22} \\ \end{pmatrix} \end{align*} \] where \(\boldsymbol{A}_{11}, \boldsymbol{A}_{12}\) and \(\boldsymbol{A}_{22}\) are functions of \(\boldsymbol{y}^T\boldsymbol{y}\), \(\boldsymbol{X}^T\boldsymbol{y}\) and \(\boldsymbol{X}^T\boldsymbol{X}\).
It can be shown that \[ \begin{align*} \hat{\boldsymbol{\beta}} &= (\boldsymbol{X}^T\boldsymbol{X})^{-1}\boldsymbol{X}^T\boldsymbol{y} = - \frac{\boldsymbol{A}_{21}}{\boldsymbol{A}_{11}} \end{align*} \] requiring only the first row (column) of the inverse to be found.
Let \[ \tilde{\boldsymbol{X}} = \boldsymbol{W}^{\frac{1}{2}}(\boldsymbol{z}|\boldsymbol{X}) = (\boldsymbol{U} | \boldsymbol{V}) \]
Typically \(\boldsymbol{n} \boldsymbol{>>} \boldsymbol{p}\)
Then \[ \begin{align*} (\tilde{\boldsymbol{X}}^T\tilde{\boldsymbol{X}})^{-} &= \begin{pmatrix} \boldsymbol{U}^T\boldsymbol{U} & \boldsymbol{U}^T\boldsymbol{V} \\ \boldsymbol{V}^T\boldsymbol{U} & \boldsymbol{V}^T\boldsymbol{V} \\ \end{pmatrix}^{-} = \begin{pmatrix} \boldsymbol{B}_{11} & \boldsymbol{B}_{12} \\ \boldsymbol{B}_{21} & \boldsymbol{B}_{22} \\ \end{pmatrix} \end{align*} \] Again, only the first row (column) of this generalised inverse is required to estimate \(\hat{\boldsymbol{\beta}}\), so we are only interested in \(\boldsymbol{B}_{11}\) and \(\boldsymbol{B}_{12}\).
\[ \begin{align*} \boldsymbol{B}_{11} &= (\boldsymbol{U}^T\boldsymbol{U} - \boldsymbol{U}^T\boldsymbol{V}(\boldsymbol{V}^T\boldsymbol{V})^{-1}\boldsymbol{V}^T\boldsymbol{U})^{-} \\ \boldsymbol{B}_{12} &= - (\boldsymbol{V}^T\boldsymbol{V})^{-1}\boldsymbol{V}^T\boldsymbol{U}\boldsymbol{B}_{11} \end{align*} \]
The structure of \(\tilde{\boldsymbol{X}}\) makes things simpler
So we do not need to construct the large \(nk \times n\) matrix \(\boldsymbol{V}\).
Fit with standard GLM function
1.198 sec elapsed
Methodology and London example:
Some recent Australasian applications:
Chen et al (2022) Ambient air pollution and epileptic seizures: A panel study in Australia, Epilepsia
Campbell et al (2024) Assessing mortality associated with heatwaves in the cool climate region of Tasmania, Australia, The Journal of Climate Change and Health
We have repeated measures of a categorical outcome for multiple subjects, e.g.
| Subject | Time 1 | Time 2 | Time 3 |
|---|---|---|---|
| 1 | None | Mild | Moderate |
| 2 | None | None | Severe |
| 3 | Mild | Moderate | Moderate |
Let \(y_{it} \in \left\{1, ..., J\right\}\) be the response for subject \(i\) at time \(t\).
We want to model the marginal probabilities
\[P(y_{it} = j | \boldsymbol{x}_{it})\] regarding the correlation between repeated measures as a nuisance.
Represent the categorical response for subject \(i\) at time \(t\) as a vector of integers:
\[\boldsymbol{y}_{it} = (y_{i1}, ..., y_{iJ})^\top\]
where
\[y_{ij} = \begin{cases} 1, & \text{if } y_i = j, \\ 0, & \text{otherwise.} \end{cases}\] Then the expected value is the vector of category probabilities we want to model \[E[\boldsymbol{y}_{it} | \boldsymbol{x}_{it}] = \left( P(y_{it} = 1 | \boldsymbol{x}_{it}),\, P(y_{it} = 2 | \boldsymbol{x}_{it}),\, \dots,\, P(y_{it} = J | \boldsymbol{x}_{it}) \right)^\top = \boldsymbol{\pi}_{it}\]
A univariate multinomial model can provide the required model for \(\boldsymbol{\pi}_{it}\), e.g.
Baseline-category multinomial logit model for nominal outcomes
\[\log \left(\frac{\pi_{itj}}{\pi_{itJ}}\right) = \beta_{j0} + \boldsymbol{\beta}_j^\top \mathbf{x}_{it}, \quad j = 1, \dots, J-1\]
Cumulative link model for ordinal outcomes
\[F^{-1}\bigl[P(y_{it} \le j \mid \mathbf{x}_{it})\bigr] = \beta_{j0} + \boldsymbol{\beta}^\top \mathbf{x}_{it}, \quad j = 1, \dots, J-1\] from which \(\pi_{itj}\) can be derived as differences between cumulative probabilities.
Any model for \(V_i\) will result in consistent estimates for \(\boldsymbol{\beta}\), however, if the model is
Want a parsimonious representation that captures the correlation patterns
Touloumis et al (2013, Biometrics) propose to use association models for the marginalized contingency tables, implemented in R package {multgee}.
Different models for \(\log(\theta_{jj'g})\) can be obtained through different models for \(\phi^g\mu^g_j\mu^g_{j'}\):
| Structure | Model for Log Local Odds Ratio | Interaction Formula | Response |
|---|---|---|---|
| Independent | 1 | N/A | Any |
| Uniform | \(\phi\) | ~ r:c |
Ordinal |
| Category exchangeable | \(\phi^g\) | ~ G:r:c |
Ordinal |
| Time exchangeable | \(\phi(\mu_j-\mu_{j+1})(\mu_{j'}-\mu_{j'+1})\) | ~ MultHomog(R,C) |
Any |
| Time-specific RC | \(\phi^g (\mu^g_j-\mu^g_{j+1})(\mu^g_{j'}-\mu^g_{j'+1})\) | ~ MultHomog(G:R, G:C) |
Any |
Where MultHomog() is a function in {gnm} for specifying homogeneous multiplicative interactions (\(\mu_r\mu_c\)).
The housing data from {multgee} is an example of nominal longitudinal data:
y: housing in streets or shelters (0), community housing (1), or independent housing (2)time: 0, 6, 12, and 24 monthssec: binary indicator of access to Section 8 Rental Certificate (financial support)Estimate time-specific intrinsic parameters \(\phi^g\) (in a RC-G(1) model):
[1] 1.0644358 0.8748243 0.6171560 2.5821515 1.6521935 3.6752211
There is a wide range in the strength of association, so should use time-specific RC model
QIC CIC RJC
fit_ind 1533.624 1533.624 16.0000
fit_exch 1517.530 1518.330 15.6001
fit_rc 1509.793 1510.677 15.5579
The housing data is analysed in:
The model is described in:
A recent Australasian example:
Manufacturers of IVD assays need to provide information the precision of their product.
This is based on a precision experiment. CA19_9 from {VCA} is an example from the Clinical and Laboratory Standards Institute guidelines.
6 samples were assayed with 5 replicates on 5 days, at 3 sites
result: measured concentration of CA19-9 (tumour marker)day: day 1, 2, 3, 4, or 5site: laboratory 1, 2 or 3sample: P1, P2, P5, Q3, Q4, Q6 (“P” patient sample pool, “Q” control, number indicates concentration)anovaVCA() from {VCA} computes variance components based on ANOVA:
Result Variance Component Analysis:
-----------------------------------
Name DF SS MS VC %Total SD CV[%]
1 total 11.318142 1.086864 100 1.042528 8.629244
2 site 2 22.041867 11.020933 0.384291 35.357751 0.619912 5.131154
3 site:day 12 16.964 1.413667 0.177773 16.356539 0.421632 3.489944
4 error 60 31.488 0.5248 0.5248 48.28571 0.724431 5.996282
Mean: 12.08133 (N = 75)
Experimental Design: balanced | Method: ANOVA
Doing this for all samples, we can obtain a “Precision Performance Table”
| Mean | Reproducibility %CV | Between-Site %CV | Between-Day %CV | Repeatability %CV | |
|---|---|---|---|---|---|
| sample.P1 | 12.08 | 8.6 | 5.1 | 3.5 | 6.0 |
| sample.P2 | 41.58 | 4.4 | 3.1 | 0.8 | 3.1 |
| sample.Q3 | 55.75 | 4.1 | 3.2 | 1.3 | 2.2 |
| sample.Q4 | 165.70 | 3.8 | 3.3 | 0.8 | 1.7 |
| sample.P5 | 379.10 | 2.4 | 1.3 | 0.5 | 2.0 |
| sample.Q6 | 414.30 | 3.7 | 3.1 | 0.4 | 2.1 |
where Reproducibility is the total variation and Repeatability is the pure assay imprecision.
Precision profiles model the relationship between the (component of) variance and the mean response.
{VFP} implements several models as in the Variance Function Program software:
| Number | Model | Type |
|---|---|---|
| 1 | \(\sigma^2 = 1\) | linear |
| 6 | \(\sigma^2 = \beta_1 + \beta_2 \mu + \beta_3 \mu^J\) | nonlinear |
| 7 | \(\sigma^2 = \beta_1 + \beta_2 \mu^J\) | nonlinear |
| 8 | \(\sigma^2 = (\beta_1 + \beta_2 \mu)^J\) | nonlinear |
| 9 | \(\sigma^2 = \beta_1 \mu^J\) | nonlinear |
plus special cases Models 2-5 with \(J\) set to 2 or another specified value.
For balanced designs we have
\[E(\hat{\sigma}_c^2) = \sigma_c^2 \qquad \text{Var}(\hat{\sigma}_c^2) \approx \frac{2}{\nu_c}\sigma_c^4\]
where \(\nu_c\) is the degrees of freedom for \(\hat{\sigma}_c^2\).
Therefore we can fit precision profiles using a Gamma GNM with an identity link
\[E(\hat{\sigma}_c^2) = \eta(\mu, \boldsymbol{\beta}) \qquad \text{Var}(\hat{\sigma}_c^2) = \frac{\phi}{\alpha_c} E(\hat{\sigma}_c^2)^2\]
with weights \(\alpha_c = \nu_c/2\).
First use anovaVCA() to estimate variance components for each sample:
Then use get_mat() to get the mean, a total variance (VC), corresponding degrees of freedom (DF) and the coefficient of variation (as %, CV) for all samples:
Mean DF VC SD CV
sample.P1 12.08133 11.318142 1.086864 1.042528 8.629244
sample.P2 41.58400 7.604586 3.376848 1.837620 4.419056
sample.Q3 55.74667 4.896189 5.257296 2.292879 4.113034
sample.Q4 165.65600 3.331477 39.752635 6.304969 3.806061
sample.P5 379.09067 16.709246 85.059893 9.222792 2.432872
sample.Q6 414.28667 4.112871 241.089499 15.527057 3.747902
powfun7 <- function (x) {
list(predictors = list(beta1 = 1, beta2 = 1, J = 1),
variables = list(substitute(x)),
term = function(predictors, variables) {
paste0(predictors[1], "+",
predictors[2], "*", variables[1], "^", predictors[3])
})
}
class(powfun7) <- "nonlin"
mod7 <- gnm(VC ~ powfun7(Mean) - 1, family = Gamma(link = "identity"), weights = DF/2,
data = total, start = c(1, 1, 2), verbose = FALSE)
coef(mod7)Coefficients:
beta1 beta2 J
0.71739105 0.00563237 1.66887107
[1] 1.758392
fit_vfp() will fit all 9 models and select best fitting by AIC
(VFP) Variance-Function
-----------------------
Model 4*: sigma^2=(beta1+beta2*u)^2
Coefficients:
beta1 beta2
0.7329 0.02671
AIC = 140.6 RSS = 11637 Deviance = 1.875 GoF P-value= 1.000
*Best-fitting model of 8 VFP-models overall
Determine concentrations at which a specified CV is not exceeded.
The VFP vignette gives further detail on the use of precision profiles.
{VFP} aims to implement the functionality of the Variance Function Program distributed by the Australasian Association for Clinical Biochemistry and Laboratory Medicine
{gnm} provides a unified framework for a wide range of models
Applications “in the wild” have used several of its features:
For more on {gnm} see the vignette or the BIBC 2025 tutorial
Turner, H and D Firth (2007) gnm: A Package for Generalized Nonlinear Models R News
Anderson, J. A. (1984). Regression and Ordered Categorical Variables JRSS B
Goodman, L. A. (1979). Simple models for the analysis of association in cross-classifications having ordered categories JASA
Lee, R. D. and L. Carter (1992). Modelling and forecasting the time series of {US} mortality JASA