Conversation
That's correct for cmdstanr. Turns out rstan supports getting draws from an optimization fit (using the Hessian to build a multivariate normal approximation at the posterior mode and draw samples from it), but this is implemented in rstan's R code. There was some discussion on putting this code in CmdStan - and thus make it available to cmdstanr, but we're not there yet. |
I was just writing this!
As I commented in this issue, what do you think about |
|
Do you think you could add a short vignette that uses the backend to evaluate how well is ADVI calibrated for some simple but non-trivial model? (Basically any example model from Stan's user guide will be great, even better if it is not similar to other examples we have in other vignettes...) |
Okay, then how can decide which samples we need to discard? Because for now I think what we can only do is "run optimization 10k times, save last 4k iterations as samples". |
|
We would need an additional steps to calibrate the draws taken from the draws, like the VSBC methods or some other VI method. |
This would be a pain to add since it's not in the offcial Stan releases. |
martinmodrak
left a comment
There was a problem hiding this comment.
I didn't run the code itself, but generally looks usable.
One additional comment:
variational doesn't return any meaningful diagnostics
CmdstanVB does not currently expose it, but a useful diagnostic would be a binary indicator whether ELBO converged - the idea would be to just check that the output contains the string "MEDIAN ELBO CONVERGED" (not sure about the exact string). Or mabye there is a warning when it didn't converge that we could check?
R/backends.R
Outdated
| stop("The model has to be already compiled, call $compile() first.") | ||
| } | ||
| args <- list(...) | ||
| unacceptable_params <- c("data", "parallel_chains ", "cores", "num_cores") |
There was a problem hiding this comment.
Since variational is always run on single core, I guess data is the only argument we really want to forbid here.
There was a problem hiding this comment.
There is a thread argument (and thread_per_chain argument for sample). Is this argument not relative to SBC? What is the difference using four parallel chains each with single threads vs one four threads for a single chain? This reduce_sum doc introduces thread as similar to parallel.
There was a problem hiding this comment.
Within-chain paralellization is another big can of worms here. I think it is sufficiently rare to let us expect people to just figure out the correct configuration (number of workers, cores_per_fit) themselves and pass the correct threading-related arguments to the backend - the configuration would likely would be very much use-case dependent, although most often NOT using any within-chain paralellization would be the best choice. We however definitely need to document this. I've started #49 to make sure we don't forget.
R/backends.R
Outdated
| #' package. | ||
| #' @export | ||
| SBC_backend_cmdstan_optimize <- function(model, ...) { | ||
| stop("The optimize method is currently not supported.") |
There was a problem hiding this comment.
I think it is better to just remove the code for optimization now.
I agree with @Dashadower that we probably don't want to implement specific sampling/resampling/fitting strategies in the SBC package - there's substantial work on making sure this stuff works as expected. I would just support what the underlying packages (e.g. |
Agreed! I asked Steve last week but haven't got the response yet. He first expressed the possibility (if needed) in this parallel design doc here: stan-dev/design-docs#40 (comment). Seems like it is not too difficult? |
I've skimmed through the issue thread and looks like there's comments on current ADVI not needing multichain optimization? This might mean other algorithms like RVI or Pathfinder will be needed and that's where the problem arouses - not being in official stan. |
|
In short, anything that needs manipulation to the stan cpp codebase isn't possible. |
I disagree. But, happy with variational and merging after addressing Martin's comments. |
|
Just checking: users can provide ADVI-specific arguments such as |
Yes |
|
Looking good! Thanks for the effort! |
I've made a minor tweak to
compute_resultssincevariationaldoesn't return any meaningful diagnostics and the current implementation was throwing an error ifbackend_diagnostics_listis empty.For now, I don't think
optimizecan be added since it will return a single point estimate instead of samples.