Sampling-free Uncertainty Estimation in Gated Recurrent Units with Exponential Families
[pdf] [arXiv]

Abstract. There has recently been a concerted effort to derive mechanisms in vision and machine learning systems to offer uncertainty estimates of the predictions they make. Clearly, there are enormous benefits to a system that is not only accurate but also has a sense for when it is not sure. Existing proposals center around Bayesian interpretations of modern deep architectures -- these are effective but can often be computationally demanding. We show how classical ideas in the literature on exponential families on probabilistic networks provide an excellent starting point to derive uncertainty estimates in Gated Recurrent Units (GRU). Our proposal directly quantifies uncertainty deterministically, without the need for costly sampling-based estimation. We demonstrate how our model can be used to quantitatively and qualitatively measure uncertainty in unsupervised image sequence prediction. To our knowledge, this is the first result describing sampling-free uncertainty estimation for powerful sequential models such as GRUs.
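To illustrate what "sampling-free" means in the simplest possible setting, the sketch below pushes a mean and a variance through a single linear map in closed form and compares the result against a Monte Carlo estimate. This is not the SP-GRU itself: it assumes fixed (deterministic) weights and independent input components, and the function name `linear_moments` is hypothetical, introduced only for this example.

```python
import numpy as np

def linear_moments(mean, var, W, b):
    """Propagate a mean and a per-component variance through y = W x + b.

    Assumption (not from the paper): W and b are fixed, and the input
    components are independent, so Var[y_i] = sum_j W_ij^2 * Var[x_j].
    """
    out_mean = W @ mean + b
    out_var = (W ** 2) @ var
    return out_mean, out_var

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)
mean = rng.normal(size=4)
var = rng.uniform(0.1, 1.0, size=4)

# Deterministic estimate: a single pass, no sampling.
m_det, v_det = linear_moments(mean, var, W, b)

# Sampling-based baseline: draw many inputs and measure the output moments.
x = mean + np.sqrt(var) * rng.normal(size=(200_000, 4))
y = x @ W.T + b
m_mc, v_mc = y.mean(axis=0), y.var(axis=0)
```

Under these assumptions the two estimates agree up to Monte Carlo noise, while the closed-form version needs only one pass over the input.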

Figure: The internal structure of the Sampling-free Probabilistic GRU (SP-GRU).

Figure: When a model makes a sequential prediction, the output may look "reasonable" (images with black backgrounds). Does that make it a good prediction? Can we quantify the pixel-level uncertainty of the output (images with blue backgrounds)? The pipeline above shows how uncertainty can be quantified across the predicted image sequence. The plot on the right shows that the degree of uncertainty correlates with the degree of deviation imposed on the input (+0, +5, +10, and +15 degrees of deviation added to the input trajectory): a larger deviation is expected to yield higher uncertainty (for example, the uncertainty at +15 degrees exceeds that at +5 degrees).

Figure: Heterogeneity detection using normative probability mapping. Using the predicted mean and variance that SP-GRU provides, a normative probability map (Marquand et al., 2016) can be computed to derive extreme value statistics (EVS) for heterogeneity (outlier) detection with respect to a given group. Since we are interested in sequential samples, we can derive the EVS at each time point. These can then be used to derive confidence intervals for statistically identifying how much of an outlier a new subject is.
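The computation described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the normative probability map is a per-pixel z-score of the observation under the predicted mean and variance, and that the extreme value statistic at each time point is the mean of the largest absolute z-scores. The function names, the `top_fraction` parameter, and the toy data are all hypothetical.

```python
import numpy as np

def normative_probability_map(observed, pred_mean, pred_var, eps=1e-8):
    """Per-pixel z-score of the observation under the predicted distribution.

    eps is a small constant (an assumption of this sketch) that guards
    against division by zero when the predicted variance is tiny.
    """
    return (observed - pred_mean) / np.sqrt(pred_var + eps)

def extreme_value_statistic(z_map, top_fraction=0.01):
    """Mean of the largest |z| values at each time point.

    z_map has shape (T, ...), with time on the first axis.
    """
    num_steps = z_map.shape[0]
    flat = np.abs(z_map).reshape(num_steps, -1)
    k = max(1, int(round(top_fraction * flat.shape[1])))
    top = np.sort(flat, axis=1)[:, -k:]
    return top.mean(axis=1)

# Toy sequence: 2 frames of 10x10 pixels, with one deviating pixel in frame 1.
pred_mean = np.zeros((2, 10, 10))
pred_var = np.ones((2, 10, 10))
observed = np.zeros((2, 10, 10))
observed[1, 3, 4] = 5.0

z = normative_probability_map(observed, pred_mean, pred_var)
evs = extreme_value_statistic(z)  # one value per time point
```

In this toy case the second frame receives a much larger statistic than the first. In practice the per-time-point values for a new subject would be compared against those of the reference group to decide whether the subject is an outlier.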

[1] Seong Jae Hwang, Ronak Mehta, Hyunwoo J. Kim, Vikas Singh, "Sampling-free Uncertainty Estimation in Gated Recurrent Units with Exponential Families", arXiv preprint arXiv:1804.07351, 2018. [pdf] [arXiv]