Bayes by Backprop

Blundell, C., Cornebise, J., Kavukcuoglu, K., & Wierstra, D.
(2015, June).
Weight uncertainty in neural network.
In International conference on machine learning (pp. 1613-1622).
PMLR.

Posted Jul 31, 2024

By jayarnim

2 min read

Bayes by Backprop

문제 의식

인공신경망 알고리즘은 파라미터 수가 매우 많아 과적합 문제에서 자유롭지 못하나 단일 최적값만 학습하여 지나치게 확신하는 예측을 수행함. 학습 조기 종료, 배치 정규화, 레이어 정규화, 드롭아웃 등 과적합 방지 기법들은 모형이 산출하는 결과에 대한 정규화 기법으로서 모형(의사결정 과정) 자체에 대한 정규화 기법이 아님. 즉, 이 기법들은 과적합 문제의 핵심인 과신을 해결하지 못함.
- 과적합(Overfitting): 모형의 복잡도가 데이터 셋이 제공하는 정보량보다 심화되어 발생하는 현상
- 과신(Overconfidence): 정보가 부족한 상황에서 단일 가설을 확신하여 예측하는 현상
BBB(Bayes by Backprop): 인공신경망의 가중치에 불확실성을 반영하여 데이터에 대한 설명력을 확보하는 동시에(Likelihood) 사전 정보(Prior)를 활용하여 모형의 복잡도를 제어함으로써(Kullback–Leibler divergence) 일반화 성능을 향상시키는 학습 방법론
- How to Approximate: Variational Inference
- Objective Function: Evidence Lower Bound
- How to Sample from Approx.: Reparameterization Trick

How to Modeling

Approx.
- Multivariate Gaussian Dist.(or Mean-Field Guassian Dist.):
  \[\begin{aligned} q(w) &= \mathcal{N}(\mu, \text{diag}(\sigma^{2})) \end{aligned}\]
- Trainable Parameters:
  \[\begin{aligned} \Theta_{\text{approx.}} &= (\mu, \rho)\\ \sigma &= \log{\left[1 + \exp(\rho)\right]} \end{aligned}\]
Prior
- Gaussian: L2 Normalization
  \[\begin{aligned} Q(w) &=\mathcal{N}(0,\sigma^{2}) \end{aligned}\]
- Scale Mixture of Gaussian: Approx. Sparse, Heavy-tail Dist.
  \[\begin{aligned} P(w) &=\pi \cdot \mathcal{N}(0,\sigma_{1}^{2}) + (1-\pi) \cdot \mathcal{N}(0,\sigma_{2}^{2}) \end{aligned}\]
Reparameterization Trick is a technique that transforms the sampling process from an approximate distribution into a differentiable function:
\[w \sim \mathcal{N}(\mu,\sigma^{2}) \quad \rightarrow \quad w = \mu + \sigma \cdot \epsilon\]
- $\epsilon \sim \mathcal{N}(0,1)$
Variational Inference:
\[\begin{aligned} \text{ELBO} &= \mathbb{E}_{W \sim Q}\left[\log{P(\mathcal{D} \mid W)}\right] - KL[Q(W) \parallel P(W)] \end{aligned}\]

BAYES, 2.posterior approx.

Bayesian Objective Function Variational Inference

This post is licensed under CC BY 4.0 by the author.

Bayes by Backprop

How to Modeling

Trending Tags