Sampling-free Uncertainty Estimation in Gated Recurrent Units with Exponential Families
Abstract. There has recently been a concerted effort to derive mechanisms in vision and machine
learning systems to offer uncertainty estimates of the predictions they make. Clearly, there are enormous benefits
to a system that is not only accurate but also has a sense for when it is not sure. Existing proposals center
around Bayesian interpretations of modern deep architectures; these are effective but can often be computationally
demanding. We show how classical ideas from the literature on exponential families in probabilistic networks provide
an excellent starting point to derive uncertainty estimates in Gated Recurrent Units (GRUs). Our proposal directly
quantifies uncertainty deterministically, without the need for costly sampling-based estimation. We demonstrate
how our model can be used to quantitatively and qualitatively measure uncertainty in unsupervised image sequence
prediction. To our knowledge, this is the first result describing sampling-free uncertainty estimation for powerful sequential models such as GRUs.
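To illustrate what "sampling-free" means in general, the sketch below pushes a mean and a variance through a single linear layer in closed form instead of drawing samples. It is a simplified illustration, not the SP-GRU itself: the function name `linear_moments`, the deterministic weights, and the assumption of independent inputs are all assumptions made for this example.

```python
def linear_moments(weights, bias, mean_in, var_in):
    """Propagate a mean and variance through y = W x + b without sampling.

    Assumptions (for illustration only): the inputs are independent with
    known means and variances, and the weights and bias are deterministic.
    Under these assumptions E[y] = W E[x] + b and Var[y] = (W * W) Var[x].
    """
    mean_out, var_out = [], []
    for row, b in zip(weights, bias):
        mean_out.append(sum(w * m for w, m in zip(row, mean_in)) + b)
        var_out.append(sum(w * w * v for w, v in zip(row, var_in)))
    return mean_out, var_out


# Hypothetical 2x2 layer and input moments.
W = [[1.0, 2.0], [0.5, -1.0]]
b = [0.0, 1.0]
mean, var = linear_moments(W, b, mean_in=[1.0, 2.0], var_in=[0.25, 1.0])
print(mean, var)
```

A single deterministic pass yields both the prediction and its variance, which is the property that avoids repeated stochastic forward passes.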
Figure: The internal structure of the Sampling-free Probabilistic GRU (SP-GRU).
Figure: When a model makes a sequential prediction, the output may look "reasonable" (black background images).
Does this mean we can consider it a good prediction? Can we quantify the pixel-level uncertainty of the output (blue background images)?
The above pipeline shows how we can quantify uncertainty across the predicted image sequences. The plot on the right shows that the degree of uncertainty
correlates with the degree of deviation imposed on the input (+0, +5, +10, or +15 degrees of deviation added to the input trajectory): we expect
higher uncertainty from more strongly deviated inputs (uncertainty at +15 degrees > uncertainty at +5 degrees).
Figure: Heterogeneity detection using normative probability mapping. Using the predicted mean and variance
that SP-GRU offers, a normative probability map (Marquand et al., 2016) can be computed to derive extreme statistics for
heterogeneity (outlier) detection with respect to the given group. Since we are interested in sequential samples, we can
derive extreme value statistics (EVS) at each time point. These can be used to further derive confidence intervals,
which can then be used to statistically identify how "outlier-ish" a new subject is.
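The idea in this caption can be sketched as follows: given a predicted mean and variance per pixel, compute a z-score map for a new observation, then summarize each time point with one extreme statistic. This is a minimal sketch under stated assumptions. The function names, the use of the maximum absolute z-score as the extreme statistic, and the small `eps` stabilizer are choices made for this illustration and are not taken from the paper.

```python
import math


def normative_z_map(observed, pred_mean, pred_var, eps=1e-8):
    """Per-pixel z-scores of an observation under the predicted mean/variance.

    eps is a hypothetical stabilizer that avoids division by zero.
    """
    return [
        (y - m) / math.sqrt(v + eps)
        for y, m, v in zip(observed, pred_mean, pred_var)
    ]


def extreme_statistic_per_step(observed_seq, mean_seq, var_seq):
    """One extreme statistic per time point (here: the largest |z|)."""
    stats = []
    for obs, mean, var in zip(observed_seq, mean_seq, var_seq):
        z = normative_z_map(obs, mean, var)
        stats.append(max(abs(value) for value in z))
    return stats


# Hypothetical sequence: two time points, two "pixels" each.
observed_seq = [[0.0, 1.0], [3.0, 1.0]]
mean_seq = [[0.0, 0.0], [0.0, 0.0]]
var_seq = [[1.0, 1.0], [1.0, 1.0]]
stats = extreme_statistic_per_step(observed_seq, mean_seq, var_seq)
print(stats)
```

Larger values mark time points at which the new sequence deviates most from the predicted distribution. Turning these statistics into confidence intervals would additionally require fitting a distribution to them across a reference group, which is not shown here.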
Seong Jae Hwang, Ronak Mehta, Hyunwoo J. Kim, Vikas Singh,
"Sampling-free Uncertainty Estimation in Gated Recurrent Units with Exponential Families", arXiv preprint arXiv:1804.07351, 2018.