
Beta and Extension

Based on the following lectures
(1) “Statistics (2018-1)” by Prof. Sang Ah Lee, Dept. of Economics, College of Economics & Commerce, Kookmin Univ.
(2) “Statistical Models and Application (2024-1)” by Prof. Yeo Jin Chung, Dept. of Data Science, The Grad. School, Kookmin Univ.
(3) “Bayesian Modeling (2024-1)” by Prof. Yeo Jin Chung, Dept. of AI, Big Data & Management, College of Business Administration, Kookmin Univ.

Beta



  • Beta distribution: a distribution over a success probability

    \[X\sim\mathrm{Beta}(\alpha,\beta),\quad 0<X<1\]
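    • $\alpha$: shape parameter governing the left side of the distribution (the behavior near $x=0$); it represents the amount of success data
    • $\beta$: shape parameter governing the right side of the distribution (the behavior near $x=1$); it represents the amount of failure data
  • Consider an event with unit quantity $1$, and suppose successes and failures have been observed cumulatively in amounts $\alpha$ and $\beta$, respectively. Let $Y_{1}$ be the cumulative quantity of success data and $Y_{2}$ that of failure data, with $Y_{1}$ and $Y_{2}$ independent. Define the random variable $X$ as the ratio of the success quantity ($Y_{1}$) to the cumulative quantity of all observed data ($Y_{1}+Y_{2}$). Then $X$ follows a beta distribution.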

    \[\begin{gathered} X:=\frac{Y_{1}}{Y_{1}+Y_{2}}\quad\mathrm{for}\quad\begin{cases}Y_{1}\sim\mathrm{Gamma}(\alpha,1)\\Y_{2}\sim\mathrm{Gamma}(\beta,1)\end{cases}\\ \Downarrow\\ X\sim\mathrm{Beta}(\alpha,\beta) \end{gathered}\]
  • probability density function:

    \[p(x\mid\alpha,\beta) =\frac{1}{B(\alpha,\beta)}x^{\alpha-1}(1-x)^{\beta-1}\]
  • beta function:

    \[B(\alpha,\beta) =\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}\]
  • $k$-th moment:

    \[\begin{aligned} \mathbb{E}\left[X^{k}\right] &=\int_{0}^{1}{x^{k}p(x)\mathrm{d}x}\\ &=\int_{0}^{1}{x^{k}\cdot\frac{1}{B(\alpha,\beta)}x^{\alpha-1}(1-x)^{\beta-1}\mathrm{d}x}\\ &=\frac{1}{B(\alpha,\beta)}\underbrace{\int_{0}^{1}{x^{(\alpha+k)-1}(1-x)^{\beta-1}\mathrm{d}x}}_{=B(\alpha+k,\beta)}\\ &=\frac{B(\alpha+k,\beta)}{B(\alpha,\beta)} \end{aligned}\]
    • $\mathbb{E}\left[X\right]=\alpha/(\alpha+\beta)$
    • $\mathbb{E}\left[X^{2}\right]=\frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}$
    • $\mathrm{Var}\left[X\right]=\mathbb{E}\left[X^{2}\right]-\mathbb{E}\left[X\right]^{2}=\frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}$
  • canonical form:

    \[\begin{aligned} p(x) &=\frac{1}{B(\alpha,\beta)}x^{\alpha-1}(1-x)^{\beta-1}\\ &=\exp{\left[(\alpha-1)\log{x}+(\beta-1)\log{(1-x)}-\log{B(\alpha,\beta)}\right]}\\ &=1\cdot\exp{\left[\begin{pmatrix}\alpha-1\\\beta-1\end{pmatrix}^{T}\begin{pmatrix}\log{x}\\\log{(1-x)}\end{pmatrix}-\log{B(\alpha,\beta)}\right]} \end{aligned}\]
    • $T(x)=\left(\log{x},\log{(1-x)}\right)^{T}$
    • $\eta(\theta)=\left(\alpha-1,\beta-1\right)^{T}$
    • $A(\eta)=\log{B(\alpha,\beta)}=\log{B(\eta_{1}+1,\eta_{2}+1)}$
    • $h(x)=1$
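
The gamma-ratio construction and the moment formula above can be checked numerically. The following is only a minimal sketch using the Python standard library; the helper names (`log_beta`, `beta_moment`, `sample_beta_via_gamma`) and the parameter values $\alpha=2,\beta=5$ are illustrative assumptions, not part of the lectures.

```python
import math
import random


def log_beta(a, b):
    """log B(a, b) = log Gamma(a) + log Gamma(b) - log Gamma(a + b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_moment(k, alpha, beta):
    """k-th moment E[X^k] = B(alpha + k, beta) / B(alpha, beta)."""
    return math.exp(log_beta(alpha + k, beta) - log_beta(alpha, beta))


def sample_beta_via_gamma(alpha, beta, n, seed=0):
    """Draw X = Y1 / (Y1 + Y2) with independent Y1 ~ Gamma(alpha, 1), Y2 ~ Gamma(beta, 1)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        y1 = rng.gammavariate(alpha, 1.0)  # shape alpha, scale 1
        y2 = rng.gammavariate(beta, 1.0)   # shape beta, scale 1
        samples.append(y1 / (y1 + y2))
    return samples


# Illustrative parameter values (assumed for this sketch)
alpha, beta = 2.0, 5.0

# Exact moments from the beta-function ratio
exact_mean = beta_moment(1, alpha, beta)
exact_var = beta_moment(2, alpha, beta) - exact_mean ** 2

# Monte Carlo estimates from the gamma-ratio construction
xs = sample_beta_via_gamma(alpha, beta, n=50_000)
sample_mean = sum(xs) / len(xs)
sample_var = sum((x - sample_mean) ** 2 for x in xs) / len(xs)

print("mean:", exact_mean, "vs sample", sample_mean)
print("var :", exact_var, "vs sample", sample_var)
```

The moments are computed through `math.lgamma` rather than the gamma function itself, so the ratio stays numerically stable for large $\alpha,\beta$.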

Dirichlet


  • Dirichlet distribution: a distribution over the realization probabilities of $K$ given categories

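    \[\Pi=(\pi_{1},\cdots,\pi_{K})\sim\mathrm{Dirichlet}(\Theta),\quad\Theta=(\theta_{1},\cdots,\theta_{K}),\quad\pi_{k}\ge0,\;\sum_{k=1}^{K}{\pi_{k}}=1\]
  • Consider an event with unit quantity $1$ whose outcome is realized in one of $K$ categories, and suppose the categories have been observed cumulatively in amounts $\theta_{1},\cdots,\theta_{K}$. Let $Y_{k}$ be the cumulative quantity of data realized in category $k$, with $Y_{1},\cdots,Y_{K}$ mutually independent. Define the random variable $\pi_{k}$ as the ratio of $Y_{k}$ to the cumulative quantity of all observed data ($Y_{1}+\cdots+Y_{K}$). Then the vector $\Pi$ follows a Dirichlet distribution.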

    \[\begin{gathered} \pi_{k}:=\frac{Y_{k}}{Y_{1}+\cdots+Y_{K}}\quad\mathrm{for}\quad Y_{k}\sim\mathrm{Gamma}(\theta_{k},1)\\ \Downarrow\\ \Pi\sim\mathrm{Dirichlet}(\Theta) \end{gathered}\]
  • probability density function:

    \[p(\Pi\mid\Theta) =\frac{1}{B(\Theta)}\prod_{k=1}^{K}{\pi_{k}^{\theta_{k}-1}}\]
  • multi-variate beta function:

    \[B(\Theta) =\frac{\prod_{k=1}^{K}{\Gamma\left(\theta_{k}\right)}}{\Gamma\left(\sum_{k=1}^{K}{\theta_{k}}\right)}\]
  • mixed moment of order $\Psi=(\psi_{1},\cdots,\psi_{K})$, where each $\psi_{k}$ is a non-negative integer ($\psi_{k}\in\mathbb{Z}\setminus\mathbb{Z}^{-}$):

    \[\begin{aligned} \mathbb{E}\left[\prod_{k=1}^{K}{\pi_{k}^{\psi_{k}}}\right] &=\int{\prod_{k=1}^{K}{\pi_{k}^{\psi_{k}}}p(\Pi)\mathrm{d}\Pi}\\ &=\int{\prod_{k=1}^{K}{\pi_{k}^{\psi_{k}}}\frac{1}{B(\Theta)}\prod_{k=1}^{K}{\pi_{k}^{\theta_{k}-1}}\mathrm{d}\Pi}\\ &=\frac{1}{B(\Theta)}\underbrace{\int{\prod_{k=1}^{K}{\pi_{k}^{(\theta_{k}+\psi_{k})-1}}\mathrm{d}\Pi}}_{=B(\Theta+\Psi)}\\ &=\frac{B(\Theta+\Psi)}{B(\Theta)} \end{aligned}\]
    • $\mathbb{E}\left[\pi_{k}\right]=\theta_{k}/\sum_{i=1}^{K}{\theta_{i}}$
    • $\mathbb{E}\left[\pi_{i}\pi_{j}\right]=\frac{\theta_{i}\theta_{j}}{\left(\sum_{m=1}^{K}{\theta_{m}}\right)\left(\sum_{m=1}^{K}{\theta_{m}}+1\right)}$ for $i\ne j$
    • $\mathrm{Cov}\left[\pi_{i},\pi_{j}\right]=-\frac{\theta_{i}\theta_{j}}{\left(\sum_{m=1}^{K}{\theta_{m}}\right)^{2}\left(\sum_{m=1}^{K}{\theta_{m}}+1\right)}$ for $i\ne j$
  • canonical form:

    \[\begin{aligned} p(\Pi) &=\frac{1}{B(\Theta)}\prod_{k=1}^{K}{\pi_{k}^{\theta_{k}-1}}\\ &=\exp{\left[\log{\prod_{k=1}^{K}{\pi_{k}^{\theta_{k}-1}}}-\log{B(\Theta)}\right]}\\ &=\exp{\left[\sum_{k=1}^{K}{\log{\pi_{k}^{\theta_{k}-1}}}-\log{B(\Theta)}\right]}\\ &=1\cdot\exp{\left[\sum_{k=1}^{K}{(\theta_{k}-1)\log{\pi_{k}}}-\log{B(\Theta)}\right]} \end{aligned}\]
    • $T(\Pi)=\log{\Pi}$
    • $\eta(\theta)=\Theta-\mathbf{1}$
    • $A(\eta)=\log{B(\Theta)}$
    • $h(\Pi)=1$
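
The same kind of numerical check works for the Dirichlet case: normalize independent gamma draws and compare sample moments with the ratio $B(\Theta+\Psi)/B(\Theta)$. This is a rough sketch using only the Python standard library; the helper names (`log_multi_beta`, `dirichlet_moment`, `sample_dirichlet_via_gamma`) and the choice $\Theta=(2,3,5)$ are assumptions made for illustration.

```python
import math
import random


def log_multi_beta(theta):
    """log B(Theta) = sum_k log Gamma(theta_k) - log Gamma(sum_k theta_k)."""
    return sum(math.lgamma(t) for t in theta) - math.lgamma(sum(theta))


def dirichlet_moment(psi, theta):
    """Mixed moment E[prod_k pi_k^psi_k] = B(Theta + Psi) / B(Theta)."""
    shifted = [t + p for t, p in zip(theta, psi)]
    return math.exp(log_multi_beta(shifted) - log_multi_beta(theta))


def sample_dirichlet_via_gamma(theta, n, seed=0):
    """Draw pi_k = Y_k / sum_j Y_j with independent Y_k ~ Gamma(theta_k, 1)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        ys = [rng.gammavariate(t, 1.0) for t in theta]  # shape theta_k, scale 1
        total = sum(ys)
        samples.append([y / total for y in ys])
    return samples


# Illustrative parameter vector (assumed for this sketch)
theta = [2.0, 3.0, 5.0]

# Exact quantities from the multivariate beta-function ratio
exact_mean_0 = dirichlet_moment([1, 0, 0], theta)
exact_mean_1 = dirichlet_moment([0, 1, 0], theta)
exact_cross = dirichlet_moment([1, 1, 0], theta)
exact_cov = exact_cross - exact_mean_0 * exact_mean_1

# Monte Carlo estimates from the gamma-ratio construction
samples = sample_dirichlet_via_gamma(theta, n=50_000)
n = len(samples)
sample_mean_0 = sum(s[0] for s in samples) / n
sample_mean_1 = sum(s[1] for s in samples) / n
sample_cov = sum((s[0] - sample_mean_0) * (s[1] - sample_mean_1) for s in samples) / n

print("E[pi_1]        :", exact_mean_0, "vs sample", sample_mean_0)
print("Cov[pi_1, pi_2]:", exact_cov, "vs sample", sample_cov)
```

The covariance comes out negative, which matches the closed form for $i\ne j$: because the components sum to $1$, a larger share for one category leaves less for the others.
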
This post is licensed under CC BY 4.0 by the author.