Derived Quantities

Overview

This document provides a brief overview of the new “derived quantities” functionality available in the nimble MCMC.

The operation of every derived quantity function is goverened by an interval parameter. Each derived quantity function is only called to execute (by the overarching MCMC algorithm) every interval MCMC iterations. The default value for interval is given by the thin interval of the MCMC. This means that, by default, each derived quantity function will execute on the exact MCMC iterations when samples are recorded. If the value of 1 is given as the interval for a derived quantity function, then that derived quantity will execute at the end of every MCMC sampling iteration. When using a value of of interval greater than one, derived quantity functions will execute less often.

Operation of derived quantity functions is entirely post-burnin. Derived quantity functions are not executed whatsoever during any specified burnin period. Following the burnin period, their operation is governed by interval using the post-burnin MCMC iteration number. This “life begins post-burnin” mindset extends even to the niter argument of the before_chain method (see final section of this vignette), the value of which gives the total number of post-burnin MCMC iterations.

Derived quantity functions can be added to an MCMC configuration object using the addDerivedQuantity method. Alternatively, several of the derived quantity functions have support provided directly from configureMCMC, buildMCMC, and nimbleMCMC.

Installation

At present, use of derived quantities requires installing nimble from the derived_quantities GitHub branch.

This can be done by execuing the code below:

remove.packages('nimble')
library(remotes)
remotes::install_github('nimble-dev/nimble', ref = 'derived_quantities', subdir = 'packages/nimble')
library(nimble)

Derived quantity functions

Derived quantity functions are available for:

Mean
Variance
Log-densities for single or (summed) groups of nodes
Posterior predictive nodes and posterior derived quantities
Any nimbleFunction following the correct specification

The mean and variance derived quantities also have options for the calculation and recording and intervals on which they operate, which are discussed below.

Example model: `pump`

We revisit an old friend, the pump model, for the following examples.

This example is taken from Chapter 2 of nimble User Manual.

library(nimble)

pumpCode <- nimbleCode({ 
    for(i in 1:N) {
        theta[i] ~ dgamma(alpha, beta)
        lambda[i] <- theta[i]*t[i]
        x[i] ~ dpois(lambda[i])
    }
    alpha ~ dexp(1.0)
    beta ~ dgamma(0.1, 1.0)
})

pumpConsts <- list(N = 10,
                   t = c(94.3, 15.7, 62.9, 126, 5.24,
                         31.4, 1.05, 1.05, 2.1, 10.5))

pumpData <- list(x = c(5, 1, 5, 14, 3, 19, 1, 1, 4, 22))

pumpInits <- list(alpha = 1, beta = 1,
                  theta = rep(0.1, pumpConsts$N))

pumpModel <- nimbleModel(pumpCode, pumpConsts, pumpData, pumpInits)

Interface using `MCMCconf` class methods

Derived quantities can be added to an MCMC configuration using the addDerivedQuantity method.

Derived Quantities: `mean` and `variance`

First, we demonstrate how to calculate the derived quantities for the mean and variance.

We don’t provide any value for the interval for the mean derived quantity function. In this case, the thinning interval of the MCMC will be used as interval. The default MCMC thinning interval is 1, and therefore samples are recorded on every MCMC iteration, and every MCMC iteration is used for updating the mean statistic.

By default, the value of the mean will only be recorded a single time: at the end of the MCMC chain. But as shown later, the frequency of recording this running value can be controled using the recordingFrequency control argument.

For example, here we add the mean for nodes theta[1:5]:

conf <- configureMCMC(pumpModel, monitors = 'theta')

conf$addDerivedQuantity('mean', nodes = 'theta[1:5]')

runMCMC(buildMCMC(conf), niter = 10)

## $samples
##         theta[1]   theta[2]   theta[3]   theta[4]  theta[5]  theta[6]
##  [1,] 0.03737694 0.09865025 0.03649662 0.12522414 0.4857874 0.6063645
##  [2,] 0.07874100 0.13153689 0.12135310 0.13791837 0.5350740 0.3549349
##  [3,] 0.04033155 0.01038802 0.12112408 0.09962342 0.8897602 0.7023247
##  [4,] 0.07620929 0.05737905 0.09593115 0.09423096 0.2936890 0.7931137
##  [5,] 0.02522779 0.19619094 0.08603864 0.18530519 0.6953881 0.5053920
##  [6,] 0.05134725 0.05824679 0.10415039 0.12207369 0.9109975 0.5577273
##  [7,] 0.04794195 0.03883467 0.12092339 0.13631598 0.3258862 0.4792542
##  [8,] 0.04405165 0.07408440 0.03244485 0.10060799 0.6509626 0.6688481
##  [9,] 0.09570877 0.08721566 0.04289181 0.10201171 0.3665027 0.6603961
## [10,] 0.04661217 0.04245859 0.05577296 0.18237111 0.5580305 0.4373089
##         theta[7]  theta[8]  theta[9] theta[10]
##  [1,] 0.18722041 0.1109388 0.4560808  1.353741
##  [2,] 0.84069611 1.2098658 0.8706886  1.690188
##  [3,] 0.52577549 0.3879329 1.7916276  1.669334
##  [4,] 1.78958953 0.3457725 1.4794440  1.851470
##  [5,] 0.62984123 0.1151414 1.6864721  1.957804
##  [6,] 1.17495589 1.7823799 0.9433304  2.511873
##  [7,] 0.15237913 2.1837687 2.0427324  2.376009
##  [8,] 0.80552804 0.6362633 2.7269010  1.907291
##  [9,] 0.04581018 0.1589184 1.0634597  1.552146
## [10,] 2.20708280 0.1114166 1.3957377  2.430146
## 
## $derived
## $derived$mean
##        theta[1]   theta[2]  theta[3]  theta[4]  theta[5]
## [1,] 0.05435484 0.07949853 0.0817127 0.1285683 0.5712078

We could also, for example, calculate the variance for nodes using the variance function. This time, we:

Change the interval for the mean calculation to 2. This means that every other MCMC iteration will be used to calculate the mean.
Also track the variance for three nodes. For the variance, we’ll use use an interval of 3, which means the variance will be calculated using every third MCMC iteration.

Once again by default, only the final value of each statistic (at the end of the MCMC chain) will be recorded and returned:

conf <- configureMCMC(pumpModel)

conf$addDerivedQuantity('mean',     nodes = 'theta[1:5]',                   interval = 2)
conf$addDerivedQuantity('variance', nodes = c('alpha', 'beta', 'theta[2]'), interval = 3)

conf$printDerivedQuantities()

## [1] derived quantity: mean,  execution interval: 2,  nodes: theta[1:5]
## [2] derived quantity: variance,  execution interval: 3,  nodes: c(alpha, beta, theta[2])

runMCMC(buildMCMC(conf), niter = 10)

## $samples
##           alpha      beta
##  [1,] 0.3687959 0.2250495
##  [2,] 0.3687959 0.3911731
##  [3,] 0.3687959 0.4953044
##  [4,] 0.3687959 0.6899367
##  [5,] 0.3687959 0.6937727
##  [6,] 0.3687959 0.5232010
##  [7,] 0.5050178 0.8479132
##  [8,] 0.5050178 0.5182927
##  [9,] 0.5050178 0.3144140
## [10,] 0.5050178 1.2964387
## 
## $derived
## $derived$mean
##        theta[1]   theta[2]  theta[3]  theta[4]  theta[5]
## [1,] 0.04952034 0.07872361 0.0755526 0.1521774 0.6026896
## 
## $derived$variance
##            alpha       beta      theta[2]
## [1,] 0.006185468 0.01284859 0.00009529348

Alternatively, we could provide a thinning interval thin for the MCMC configuation. In the next example, we don’t provide any value of interval to the derived quantity functions. In this case, the value of the MCMC thinning interval will be the default value for the interval.

Thus, in the example below, MCMC samples are only recoded ever other MCMC iteration, and these very same iterations (and therefore, the same values of model nodes) will be used to calculate the mean and variance:

conf <- configureMCMC(pumpModel, thin = 2)

conf$addDerivedQuantity('mean',     nodes = c('alpha', 'beta'))
conf$addDerivedQuantity('variance', nodes = c('alpha', 'beta'))

conf$printDerivedQuantities()

## [1] derived quantity: mean,  execution interval: thin,  nodes: c(alpha, beta)
## [2] derived quantity: variance,  execution interval: thin,  nodes: c(alpha, beta)

runMCMC(buildMCMC(conf), niter = 10)

## $samples
##          alpha      beta
## [1,] 0.5050178 0.5149494
## [2,] 0.5050178 0.7389678
## [3,] 0.5050178 0.6459727
## [4,] 0.5050178 1.5614222
## [5,] 0.5050178 0.5467156
## 
## $derived
## $derived$mean
##          alpha      beta
## [1,] 0.5050178 0.8016055
## 
## $derived$variance
##      alpha      beta
## [1,]     0 0.1881517

`recordingFrequency`

Another important option exists for the mean and variance dreived quantities. This is the recordingFrequency. This is subtly different from the interval option:

interval controls after how many MCMC iterations the derived quantity function is called, and therefore the current parameter values are used to update the statistic value.
recordingFrequency controls after how many updates to the statisitc is the (current) value of the statistic recorded. The default value of recordingFrequency is 0, which corresponds to the special case of only recording the statistic once - at the end of the MCMC chain.

By way of example:

interval = 1 and recordingFrequency = 0 (which is the default value of recordingFrequency) means that an updated value of the statistic is updated after every MCMC iteration, and only the final value of this statistic (at the end of the MCMC chain) is recorded.
interval = 4 and recordingFrequency = 0 (which is the default value of recordingFrequency) means that only every fourth MCMC iteration will be used to update the value of the statistic, and only the final value of this statistic (at the end of the MCMC chain) is recorded.
interval = 1 and recordingFrequency = 1 means that an updated value of the statistic is calculated after every MCMC iteration, and every such value is recorded.
interval = 4 and recordingFrequency = 1 means that only every fourth MCMC iteration will be used to update the value of the statistic, and all of these updated values will be recorded. So, if an MCMC executes for 100 iterations, only 25 samples will be used to update the statistic, and all 25 updated values of the statistics will be recorded.
interval = 1 and recordingFrequency = 5 means that an updated value of the statistic is calculated after every MCMC iteration, but the value of the statistic is only recorded after every 5 such calculations. So, if an MCMC executes for 100 iterations, all 100 samples are used to update the statistic, but only 20 values of the statistic will be recorded.
interval = 10 and recordingFrequency = 2 means that every tenth MCMC iteration will be used to update the value of the statistic, and the value of the statistic is only recorded after every other such calculation. So, if an MCMC executes for 100 iterations, only 10 samples will be used for calculations, and of these 10, only 5 values will be recorded.

We provide a single example:

conf <- configureMCMC(pumpModel)

## uses default interval = thinning interval (1)
## uses default recordingFrequency = 0 (only record value once, at end of the chain)
conf$addDerivedQuantity('mean', nodes = 'alpha')

## uses default interval = thinning interval (1)
## records the value of the mean after every 10 updates (every 10 iterations)
conf$addDerivedQuantity('mean', nodes = 'alpha', recordingFrequency = 10)

## specifies both interval and recordingFrequency
## records the value of the mean after every 10 updates (every 20 iterations)
conf$addDerivedQuantity('mean', nodes = 'alpha', interval = 2, recordingFrequency = 10)

conf$printDerivedQuantities()

## [1] derived quantity: mean,  execution interval: thin,  nodes: alpha
## [2] derived quantity: mean,  execution interval: thin,  nodes: alpha,  recordingFrequency: 10
## [3] derived quantity: mean,  execution interval: 2,  nodes: alpha,  recordingFrequency: 10

## do not return the 100 samples, only the derived quantities
runMCMC(buildMCMC(conf), niter = 100, samples = FALSE)

## $mean
##          alpha
## [1,] 0.6574795
## 
## $mean
##           alpha
##  [1,] 0.5013775
##  [2,] 0.4849963
##  [3,] 0.5280343
##  [4,] 0.6080534
##  [5,] 0.6104068
##  [6,] 0.6135897
##  [7,] 0.6199118
##  [8,] 0.6458410
##  [9,] 0.6583921
## [10,] 0.6574795
## 
## $mean
##          alpha
## [1,] 0.4831762
## [2,] 0.6160992
## [3,] 0.6258352
## [4,] 0.6592835
## [5,] 0.6735258

Derived Quantity: `logProb`

The logProb derived quantity function calculates and stores the log-density for:

Any single nodes
Any groups of nodes (summed)
All stochastic model nodes

When the nodes argument is a character vector of node names, the log-density of each node is tracked individually:

conf <- configureMCMC(pumpModel)

conf$addDerivedQuantity('logProb', nodes = c('alpha', 'beta'))

runMCMC(buildMCMC(conf), niter = 3)

## $samples
##          alpha      beta
## [1,] 0.9876866 1.5719823
## [2,] 0.7582832 0.8186222
## [3,] 0.6033949 0.2334301
## 
## $derived
## $derived$logProb
##           alpha      beta
## [1,] -0.9876866 -4.231799
## [2,] -0.7582832 -2.891215
## [3,] -0.6033949 -1.176757

Alternatively, if we set the thinning interval of the MCMC to be 3, then samples will only be recorded after the third MCMC iteration, and the logProb derived quantity will only operate after every third MCMC iteration. This is the default behavior of the interval argument, which when omitted, assumes the value of the MCMC thinning interval:

conf <- configureMCMC(pumpModel, thin = 3)

conf$addDerivedQuantity('logProb', nodes = c('alpha', 'beta'))

runMCMC(buildMCMC(conf), niter = 3)

## $samples
##          alpha    beta
## [1,] 0.5927843 1.15702
## 
## $derived
## $derived$logProb
##           alpha      beta
## [1,] -0.5927843 -3.540996

When the nodes argument is a list, each list element is treated as a group of nodes, and the sum-log-density of each group is tracked.

The distinction between the character vector and list style arguments for the nodes parameter of the logProb derived quantity is highlighted by the following two illustrious examples.

These also demonstrate how the keyword ".all" may be provided, corresponding to all stochastic model nodes (including data).

conf <- configureMCMC(pumpModel)

## using a character vector:
conf$addDerivedQuantity('logProb', nodes = c('theta[1:3]', '.all'))

runMCMC(buildMCMC(conf), niter = 1)

## $samples
##          alpha      beta
## [1,] 0.5927843 0.3850937
## 
## $derived
## $derived$logProb
##       theta[1]  theta[2]  theta[3] _all_nodes_
## [1,] 0.1855171 -0.311408 0.2108473   -33.64375

conf <- configureMCMC(pumpModel)

## using a list:
conf$addDerivedQuantity('logProb', nodes = list('theta[1:3]', '.all'))

runMCMC(buildMCMC(conf), niter = 1)

## $samples
##          alpha     beta
## [1,] 0.5927843 0.871052
## 
## $derived
## $derived$logProb
##      theta[1:3] _all_nodes_
## [1,]   2.214037   -27.92216

Two final examples demonstrate using the list syntax.

If the nodes argument is provided as an unnamed list, then some processing is done to give reasonable output column names, in the case of sums of node groupings:

conf <- configureMCMC(pumpModel)

conf$addDerivedQuantity(
         'logProb',
         nodes = list('alpha',
                      'beta',
                      c('alpha', 'beta'),
                      'theta[1:5]',
                      c('x[2]', 'x[5]'),
                      '.all'))

runMCMC(buildMCMC(conf), niter = 3)

## $samples
##          alpha      beta
## [1,] 0.5927843 0.9821222
## [2,] 0.5523950 0.6000796
## [3,] 0.5523950 1.3751547
## 
## $derived
## $derived$logProb
##           alpha      beta      sum1 theta[1:5]      sum2 _all_nodes_
## [1,] -0.5927843 -3.218599 -3.811384   1.330215 -2.501608   -29.68891
## [2,] -0.5523950 -2.393169 -2.945564   1.962431 -4.216364   -28.29746
## [3,] -0.5523950 -3.914577 -4.466972   1.956055 -3.047514   -31.54404

Alternatively, if the nodes argument is provided as a named list, then the list names are used as the output column names:

conf <- configureMCMC(pumpModel)

conf$addDerivedQuantity(
         'logProb',
         nodes = list(alpha_node = 'alpha',
                      beta_node  = 'beta',
                      alpha_beta = c('alpha', 'beta'),
                      `theta[1:5]` = 'theta[1:5]',
                      x2_x5 = c('x[2]', 'x[5]'),
                      joint_density = '.all'))

runMCMC(buildMCMC(conf), niter = 3)

## $samples
##          alpha      beta
## [1,] 0.5804051 0.5902455
## [2,] 0.5804051 1.1146760
## [3,] 0.5804051 0.7305319
## 
## $derived
## $derived$logProb
##      alpha_node beta_node alpha_beta theta[1:5]     x2_x5 joint_density
## [1,] -0.5804051 -2.368463  -2.948868  0.3014313 -2.574392     -29.41299
## [2,] -0.5804051 -3.465096  -4.045501  2.7948099 -4.024362     -30.76317
## [3,] -0.5804051 -2.700660  -3.281065 -0.2986487 -3.215800     -35.76760

Derived Quantity: `predictive`

The predictive derived quantity simulates the values of posterior predictive nodes and posterior derived quantities in a model, and stores these simulated values. This may be useful when a model structure includes posterior predictive nodes (or deterministically defined posterior derived quantities), but for reasons of efficiency, these nodes may not undergo MCMC sampling.

In such cases, the predictive derived quantity function may be assigned to these nodes, and when executed it will simulate new values for these nodes and record the simulated values.

The predictive function accepts a nodes argument, which are the names of the posterior predictive nodes to save. nodes should generally only specify posterior predictive model nodes, and a warning will be issued if nodes includes non-predictive nodes.

If nodes is omitted, the default set of nodes will be all (non-data) terminal nodes. If nodes is specified as nodes = '.all', then all predictive model nodes (both stochastic and deterministic) will be simulated and saved.

The predictive function will automatically determine the set of nodes which need to be simulated, prior to saving the values of the nodes. Alternatively, one can specify which nodes should be simulated using the simNodes argument. The set of nodes specified in simNodes will be simulated, in topological order, prior to saving the values of nodes. If simulation in the exact order specified in the simNodes argument is required, one may also provide the argument sort = FALSE.

The automatic determination of the set of simulation nodes simNodes assumes a valid MCMC assignment of samplers to unobserved model nodes. If one removes samplers from the default MCMC configuration, or otherwise does non-standard modifications to the set of MCMC samplers, then the automatic determination of simNodes may be incorrect. In this case, both the nodes and simNodes arguments should be precisely specified for the predictive derived quantity function, to ensure operation as desired.

We introduce a toy model to demonstrate the functionality of the predictive derived quantity:

toyCode <- nimbleCode({
    mu ~ dunif(0, 10)    ## parameter
    y ~ dnorm(mu, 1)     ## data
    muSq <- mu^2         ## posterior derived quantity
    z ~ dnorm(muSq, 1)   ## posterior predictive node
})

toyData <- list(y = 3)

toyInits <- list(mu = 1, z = 3)

toyModel <- nimbleModel(toyCode, data = toyData, inits = toyInits)

For the following examples, we use the argument samplePredictiveNodes = FALSE to configureMCMC, which prevents the assignment of a posterior_predictive sampler which would otherwise be assigned by default to muSq and z. This is representative of “turning off” posterior predictive networks in a model during a long MCMC run, so as to mitigate the computational burden.

First, we apply the predictive derived quantity function to both muSq and z. New values of both muSq and z will be simulated and saved on each MCMC iteration.

For this model, this is the same as if nodes were specified as nodes = '.all'.

conf <- configureMCMC(toyModel, samplePredictiveNodes = FALSE)

conf$addDerivedQuantity('predictive', nodes = c('muSq', 'z'))

runMCMC(buildMCMC(conf), 10)

##   [Warning] No samplers assigned for 1 node, use conf$getUnsampledNodes() for node name.

## $samples
##             mu
##  [1,] 0.970757
##  [2,] 1.160991
##  [3,] 2.372388
##  [4,] 3.187159
##  [5,] 3.187159
##  [6,] 4.375424
##  [7,] 2.929319
##  [8,] 2.133176
##  [9,] 3.629798
## [10,] 3.543656
## 
## $derived
## $derived$predictive
##             muSq          z
##  [1,]  0.9423692 -0.3550189
##  [2,]  1.3478990 -2.1916870
##  [3,]  5.6282254  5.9191093
##  [4,] 10.1579825  9.9356419
##  [5,] 10.1579825  9.7981729
##  [6,] 19.1443361 20.0450778
##  [7,]  8.5809117  8.5094346
##  [8,]  4.5504386  3.9768134
##  [9,] 13.1754363 11.9167916
## [10,] 12.5575010 14.1125281

In the event that we don’t want to save the intermediate value of muSq, we can instead specify nodes = 'z'. Now, the predictive function will still simulate both muSq and z, but only the values of z will be saved.

For this model, this is the same as if nodes were omitted, in which case the default value of nodes will be all (non-data) terminal nodes, which is only z.

We also change the value of interval, so this predictive process only takes place every fifth MCMC iteration:

conf <- configureMCMC(toyModel, samplePredictiveNodes = FALSE)

conf$addDerivedQuantity('predictive',
                        interval = 5,
                        nodes = 'z')

runMCMC(buildMCMC(conf), 10)

##   [Warning] No samplers assigned for 1 node, use conf$getUnsampledNodes() for node name.

## $samples
##             mu
##  [1,] 3.068817
##  [2,] 3.068817
##  [3,] 4.444217
##  [4,] 4.252674
##  [5,] 3.388714
##  [6,] 3.388714
##  [7,] 3.388714
##  [8,] 3.058565
##  [9,] 4.333355
## [10,] 4.409686
## 
## $derived
## $derived$predictive
##             z
## [1,] 11.01953
## [2,] 18.65907

The warnings above about unsampled nodes may also be suppressed by setting the nimble option: nimbleOptions(MCMCwarnUnsampledStochasticNodes = FALSE).

The default value of the samplePredictiveNodes argument to configureMCMC is given by the current value of the nimble package option MCMCassignSamplersToPosteriorPredictiveNodes. This nimble package option itself has default value of TRUE.

Other methods

The following methods of the MCMC confgiuration MCMCconf class are user-facing, for managing derived quantities:

addDerivedQuantity (as shown herein)
removeDerivedQuantity
removeDerivedQuantities
printDerivedQuantities (as shown herein)
getDerivedQuantities
getDerivedQuantityDefinition

Interface using `configureMCMC` and `nimbleMCMC`

Access to these derived quantity functions is also available directly from configureMCMC, and equivalently using nimbleMCMC.

However, when accessed directly from either configureMCMC or nimbleMCMC, the only value avaialble for the interval is 1, and the only value for the recordingFrequency of the mean and variance functions is 0. This means that every MCMC iteration is used for making these calculations, and the mean and variance statistics are each recorded only once at the conclusion of the MCMC chain. For more fine-grained control, users are directed towards the addDerivedQuantity method.

conf <- configureMCMC(pumpModel,
                      mean = c('alpha','beta'),
                      variance = c('alpha', 'beta'),
                      logProb = 'x')

## ===== Monitors =====
## thin = 1: alpha, beta
## ===== Samplers =====
## RW sampler (1)
##   - alpha
## conjugate sampler (11)
##   - beta
##   - theta[]  (10 elements)
## ===== DerivedQ =====
## - logProb (1)
## - mean (1)
## - variance (1)

conf$printDerivedQuantities()

## [1] derived quantity: mean,  execution interval: thin,  nodes: c(alpha, beta)
## [2] derived quantity: variance,  execution interval: thin,  nodes: c(alpha, beta)
## [3] derived quantity: logProb,  execution interval: thin,  nodes: x

runMCMC(buildMCMC(conf), niter = 3)

## $samples
##          alpha      beta
## [1,] 0.5804051 0.9460430
## [2,] 0.5804051 0.7260083
## [3,] 0.5804051 0.1281396
## 
## $derived
## $derived$mean
##          alpha      beta
## [1,] 0.5804051 0.6000636
## 
## $derived$variance
##      alpha     beta
## [1,]     0 0.179138
## 
## $derived$logProb
##           x[1]      x[2]      x[3]      x[4]      x[5]      x[6]      x[7]
## [1,] -1.752872 -1.036063 -1.923268 -2.349904 -2.165948 -2.396313 -1.131762
## [2,] -2.440114 -1.064980 -1.831571 -3.153007 -1.497743 -2.395676 -1.278002
## [3,] -5.002542 -1.214148 -1.838098 -2.459566 -1.608043 -3.157465 -1.011257
##           x[8]      x[9]     x[10]
## [1,] -1.399032 -1.647850 -2.608035
## [2,] -1.269966 -1.648320 -2.594059
## [3,] -1.599726 -1.637207 -2.846393

In addition to either a character vector or a list, the logProb option may also simply be provided as TRUE, which corresponds to all stochastic model nodes.

Here, we demonstrate this usage from the highest level using nimbleMCMC:

nimbleMCMC(pumpModel, niter = 10, logProb = TRUE)

## Compiling
##   [Note] This may take a minute.
##   [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
## running chain 1...

$samples
          alpha      beta
 [1,] 1.0000000 2.6403304
 [2,] 1.0000000 1.5159802
 [3,] 0.6179094 0.5215840
 [4,] 0.6179094 1.2062297
 [5,] 0.6179094 2.4043075
 [6,] 0.8440706 1.9108299
 [7,] 0.8440706 2.2120011
 [8,] 0.8440706 0.5119748
 [9,] 0.8440706 0.8164949
[10,] 0.8440706 0.5188594

$derived
$derived$logProb
      _all_nodes_
 [1,]   -32.24134
 [2,]   -33.29453
 [3,]   -29.77974
 [4,]   -33.41835
 [5,]   -33.23479
 [6,]   -31.59724
 [7,]   -34.02351
 [8,]   -31.37778
 [9,]   -28.40736
[10,]   -31.21062

Specification for general derived quantity `nimbleFunction`

Any artibrary derived quantity can be implemented as a nimbleFunction, provided it contains the virtual base class shown below, and also follows the nimbleFunction format below.

Derived quantity nimbleFunctions must have:

A run method, which is called when the MCMC executes this derived quantity function. The run method accepts a single argument called timesRan, which reflects how many times this function has been called (including the present call) during the current MCMC chain.
A set_interval method, which matches the definition shown below. This method sets the internal interval variable within the derived quantity nimbleFunction.
A get_results method, to return a numeric matrix of results.
A get_names method to return a character vector of column names (for the results matrix).

There are several other optional methods of derived quantity functions, as well.

The before_chain method, if present, executes at the onset of each MCMC chain. Note that its niter argument is calculated as: (total number of MCMC iterations) - (number of burnin iterations). That said, the before_chain method also accepts the number of burnin iterations nburnin as an argument, so it at least has knowledge of how many MCMC iterations have taken place preceeding its operational regieme.

The after_chain method, if present, executes at the conclusion of each MCMC chain, and accepts no arguments.

Some additional comments are provided within the skeletal derived quantity nimbleFunction code below:

derived_name <- nimbleFunction(
    name = 'derived_name',
    contains = derived_BASE,     ## all derived functions must contain (inherit from) the derived_BASE class
    setup = function(model, mcmc, interval, control) {
        ## 
        ## setup function must have these arguments, and has unrestricted access to:
        ## - model
        ## - mcmc
        ## - interval (the MCMC sampling interval on which the run function will be executed)
        ## - control (list of control arguments provided to this function)
        ## 
    },
    run = function(timesRan = double()) {
        ## 
        ## run fuction has one argument:
        ## timesRan, which is the number of times this method has executed during the present MCMC chain,
        ## including the present time.  This can be used as an index for a matrix of results.
        ## 
    },
    methods = list(
        set_interval = function(newInterval = double()) {
            ##
            ## COMPULSORY set_interval method, which matches the definition below exactly.
            ## this method is necessary, to allow the default value of 'interval' to
            ## match the thinning interval of the MCMC.
            ## 
            interval <<- newInterval
            ##
        },
        before_chain = function(niter = double(), nburnin = double(), thin = double(1), chain = double()) {
            ## 
            ## (optional) before_chain method,
            ## must have must have these arguments specifications:
            ## - niter (total number of *post-burnin* MCMC iterations)
            ## - nburnin (number of burnin iterations for this MCMC chain)
            ## - thin (a length=2 vector, containing the thinning intervals for mvSamples and mvSamples2)
            ## - chain (the chain number of the current MCMC chain being run)
            ## 
        },
        after_chain = function() {
            ## 
            ## (optional) after_chain method,
            ## no arguments and no return value
            ## 
        },
        get_results = function() {
            ## 
            ## COMPULSORY get_results method,
            ## which must return a dim=2 numeric array
            ## 
            returnType(double(2))
        },
        get_names = function() {
            ## 
            ## COMPULSORY get_names method,
            ## which must return a dim=1 character vector
            ## 
            returnType(character(1))
        },
        reset = function() {
            ## 
            ## (optional) reset method
            ## no arguments and no return value
            ## 
        }
    )
)

Derived Quantities

Daniel Turek

April 2025

Overview

Installation

Derived quantity functions

Example model: `pump`

Interface using `MCMCconf` class methods

Derived Quantities: `mean` and `variance`

`recordingFrequency`

Derived Quantity: `logProb`

Derived Quantity: `predictive`

Other methods

Interface using `configureMCMC` and `nimbleMCMC`

Specification for general derived quantity `nimbleFunction`

Derived Quantities

Daniel Turek

April 2025

Overview

Installation

Derived quantity functions

Example model: pump

Interface using MCMCconf class methods

Derived Quantities: mean and variance

recordingFrequency

Derived Quantity: logProb

Derived Quantity: predictive

Other methods

Interface using configureMCMC and nimbleMCMC

Specification for general derived quantity nimbleFunction

Example model: `pump`

Interface using `MCMCconf` class methods

Derived Quantities: `mean` and `variance`

`recordingFrequency`

Derived Quantity: `logProb`

Derived Quantity: `predictive`

Interface using `configureMCMC` and `nimbleMCMC`

Specification for general derived quantity `nimbleFunction`