library(nimble)
library(coda)
nimbleOptions(verbose = FALSE)
source('../bayes_nsgp/source_nimble.R')
source('../code/nngp4.R')
We’ll use the same precipitation data and model that you used in the FNS_M2_implementation
example.
COprecip <- read.csv('../fullGP_nonstationary/Final_data.csv')
coords <- as.matrix(COprecip[,c('Longitude', 'Latitude')])
dist_list <- ns_dist(coords)   # componentwise squared distances and cross-distances for the anisotropic covariance
Xmat <- unname(lm(logPrecip ~ Zelevation*Zslope10, x = TRUE, data = COprecip)$x)   # design matrix: intercept, elevation, slope, and their interaction
p <- dim(Xmat)[2]
z <- COprecip$logPrecip
constants <- list(X_tau = Xmat, X_sigma = Xmat, X_Sigma = Xmat, X_mu = Xmat,
p_tau = p, p_sigma = p, p_Sigma = p, p_mu = p,
dist1_sq = dist_list$dist1_sq, dist2_sq = dist_list$dist2_sq, dist12 = dist_list$dist12)
niter <- 1000
First, we’ll use the Full GP likelihood.
Rmodel <- nsgp_model(likelihood = 'fullGP', constants = constants, z = z)
system.time(samples <- nimbleMCMC(model = Rmodel, niter = niter, progressBar = FALSE, setSeed = TRUE))
## user system elapsed
## 193.528 2.206 197.386
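As a quick sanity check on the output (a small sketch; with these arguments, nimbleMCMC returns a matrix of posterior draws), the samples can be summarized via coda:
summary(as.mcmc(samples))   # posterior means, SDs, and quantiles per parameter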
samplesList <- list(fullGP = samples)
Next, we’ll use the NNGP likelihood approximation, using \(k=15\) nearest neighbors.
k <- 15
dst <- unname(as.matrix(dist(coords)))   # dense matrix of pairwise Euclidean distances between sites
nID <- determineNeighbors(dst, k)        # neighbor index sets defining the NNGP conditioning structure
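# Optional sanity check (assuming determineNeighbors() returns one row of
# neighbor indices per location):
dim(nID)
head(nID)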
Rmodel <- nsgp_model(likelihood = 'NNGP', constants = constants, z = z, k = k, nID = nID)
system.time(samples <- nimbleMCMC(model = Rmodel, niter = niter, progressBar = FALSE, setSeed = TRUE))
## user system elapsed
## 215.647 1.169 217.800
samplesList$NNGP <- samples
The posterior distributions show essentially perfect agreement between the fullGP and NNGP likelihoods.
chainsPlot(samplesList)
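Beyond the visual check, here is a quick numeric comparison of the posterior means (a small sketch; both elements of samplesList are matrices of draws with one column per parameter):
round(cbind(fullGP = colMeans(samplesList$fullGP),
NNGP = colMeans(samplesList$NNGP)), 3)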
Effective sample sizes, however, are noticeably lower under the NNGP likelihood than under the full GP for this run.
do.call('cbind', lapply(samplesList, function(x) apply(x, 2, effectiveSize)))
## fullGP NNGP
## alpha[1] 76.76478 16.39946
## alpha[2] 67.27624 35.43334
## alpha[3] 108.17518 60.20091
## alpha[4] 151.36318 52.22122
## beta[1] 46.15630 22.44975
## beta[2] 104.86872 83.12268
## beta[3] 160.78625 26.28199
## beta[4] 124.51320 61.09206
## delta[1] 137.87157 36.64646
## delta[2] 138.38503 68.34985
## delta[3] 179.75589 81.25906
## delta[4] 143.96016 34.97568
## gamma1[1] 196.47077 82.01591
## gamma1[2] 209.14354 29.60924
## gamma1[3] 211.17410 77.49607
## gamma1[4] 207.97081 44.28948
## gamma2[1] 279.74353 18.61498
## gamma2[2] 151.25245 99.87818
## gamma2[3] 216.82321 47.50818
## gamma2[4] 225.74655 62.42767
## psi11 108.17428 25.63554
## psi22 159.83727 56.50624
## rho 212.84515 62.13015
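To quantify that difference, here is the per-parameter ratio of effective sample sizes (a small sketch; ess is just the table above, saved to a name of my choosing):
ess <- do.call('cbind', lapply(samplesList, function(x) apply(x, 2, effectiveSize)))
summary(ess[, 'NNGP'] / ess[, 'fullGP'])   # ESS ratio, NNGP relative to fullGP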
The calculateAD_ns() function now uses only sub-blocks of the full \(N \times N\) covariance matrix, and gives the same results as before.
For a model of this size (\(N=217\)), we don’t see any computational savings from using the NNGP likelihood.
However, I’m very confident that for larger values of \(N\) (say, in the 1000s or larger) we’ll observe huge computational savings from the NNGP likelihood relative to the full GP: the full GP likelihood requires an \(O(N^3)\) Cholesky factorization of the dense covariance matrix, whereas the NNGP likelihood only requires \(N\) factorizations of \((k+1) \times (k+1)\) matrices, i.e., \(O(Nk^3)\) work.
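A toy illustration of that scaling argument (not the package code; the matrices and sizes here are invented purely to show the cost difference):
N <- 2000; k <- 15
A <- diag(N) + 0.1                      # dense SPD stand-in for the full covariance matrix
system.time(chol(A))                    # full GP: one O(N^3) Cholesky factorization
B <- diag(k + 1) + 0.1                  # small stand-in for a conditioning-set covariance
system.time(for (i in 1:N) chol(B))     # NNGP: N Choleskys of (k+1) x (k+1) matrices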
This serves as a proof of concept for how this could all work.
A few changes you might want to look at:
- mu, the GP process mean, ... or as a constants list argument.
- Also, we can talk about incorporating the Gibbs sampler for the beta terms in the process mean.