
Latent Factor Model with Attention Mechanism

Based on the following lectures:
(1) “Recommendation System Design (2024-1)” by Prof. Ha Myung Park, Dept. of Artificial Intelligence, College of SW, Kookmin Univ.
(2) “Recommender System (2024-2)” by Prof. Hyun Sil Moon, Dept. of Data Science, The Grad. School, Kookmin Univ.

DACR: History Embedding with ATTN


  • Motivation: the Implicit Feedback Problem
    • DeepCF
      • Representation Learning: efficiently builds a low-dimensional latent factor space from the linear relationships between users and items
      • Matching Function Learning: approximates a variety of matching functions to capture non-linear user-item interactions
    • Implicit Feedback Problem
      • Observation Incompleteness: observed and unobserved entries do not necessarily indicate preference and non-preference
      • Hidden Signal: because the observations are incomplete, the degree and intent of preference are hard to capture
      • In short, implicit feedback is behavior-matching data; to use it for preference matching, a procedure is needed that highlights the latent preference signal and filters out the noise
  • DACR (Deep Collaborative Recommendation Algorithm Based on Attention Mechanism): an ensemble model that applies an attention mechanism to the user representation, the item representation, and their combined representation, explicitly designing dimension-wise weights so that the model selects and emphasizes the information in the input to focus on
    • Cui, C., Qin, J., & Ren, Q. (2022). Deep collaborative recommendation algorithm based on attention mechanism. Applied Sciences, 12(20), 10594.
  • Components
    • ARL: Attention Representation Learning
    • AML: Attention Matching Function Learning
    • DACR: ARL & AML Ensemble

Notation


  • $u=1,2,\cdots,M$: user idx
  • $i=1,2,\cdots,N$: item idx
  • $\mathbf{Y} \in \mathbb{R}^{M \times N}$: user-item interaction matrix (a toy shape example follows this list)
  • $\overrightarrow{\mathbf{u}}_{u} \in \mathbb{R}^{K}$: user latent factor vector
  • $\overrightarrow{\mathbf{v}}_{i} \in \mathbb{R}^{K}$: item latent factor vector
  • $\overrightarrow{\mathbf{z}}_{u,i}$: predictive vector of user $u$ and item $i$
  • $\hat{y}_{u,i}$: interaction probability of user $u$ and item $i$
  • $\delta$: softmax function
  • $\sigma$: sigmoid function
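
The notation above also fixes the tensor shapes used in the model. A minimal sketch, assuming a toy binary interaction matrix in PyTorch; the sizes $M=4$, $N=5$ and the sparsity level are made up for illustration:

```python
import torch

# Toy implicit-feedback setup (sizes and sparsity are illustrative assumptions):
# M users, N items, binary matrix Y with 1 = observed interaction, 0 = unobserved.
M, N = 4, 5
torch.manual_seed(0)
Y = (torch.rand(M, N) < 0.3).float()   # Y in R^{M x N}

u, i = 1, 3
Y_u = Y[u, :]    # Y_{u*}: interaction history of user u, shape (N,)
Y_i = Y[:, i]    # Y_{*i}: interaction history of item i, shape (M,)

print(Y.shape, Y_u.shape, Y_i.shape)   # torch.Size([4, 5]) torch.Size([5]) torch.Size([4])
```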

How to Model


DACR

  • DACR is the ensemble of ARL and AML (a code sketch of this head follows the formula):

    \[\begin{aligned} \hat{y}_{u,i} &= \sigma(\overrightarrow{\mathbf{w}} \cdot [\overrightarrow{\mathbf{z}}_{u,i}^{\text{(ARL)}} \oplus \overrightarrow{\mathbf{z}}_{u,i}^{\text{(AML)}}]) \end{aligned}\]
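
A minimal sketch of this ensemble head, assuming the two predictive vectors from ARL and AML are already computed; the dimensions `D_ARL` and `D_AML` are placeholders rather than values from the paper:

```python
import torch
import torch.nn as nn

# Ensemble head only: y_hat = sigmoid(w · [z_ARL ⊕ z_AML]).
D_ARL, D_AML = 32, 64                        # illustrative dimensions
w = nn.Linear(D_ARL + D_AML, 1, bias=False)  # the weight vector w (no bias, matching w · z)

z_arl = torch.randn(1, D_ARL)                # z_{u,i}^{(ARL)} from the ARL module
z_aml = torch.randn(1, D_AML)                # z_{u,i}^{(AML)} from the AML module

z = torch.cat([z_arl, z_aml], dim=-1)        # concatenation ⊕
y_hat = torch.sigmoid(w(z))                  # interaction probability ŷ_{u,i}
```

In the full model, `z_arl` and `z_aml` come from the ARL and AML modules sketched in the next two sections.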

ARL

  • Linear Transformation:

    \[\begin{aligned} \overrightarrow{\mathbf{p}}_{u} &= \mathbf{W} \cdot \mathbf{Y}_{u*}\\ \overrightarrow{\mathbf{q}}_{i} &= \mathbf{W} \cdot \mathbf{Y}_{*i} \end{aligned}\]
    • $\overrightarrow{\mathbf{p}}_{u} \in \mathbb{R}^{D}$
    • $\overrightarrow{\mathbf{q}}_{i} \in \mathbb{R}^{D}$
  • Attention Weight:

    \[\begin{aligned} \alpha_{u} &= \delta(\mathbf{W} \cdot \overrightarrow{\mathbf{p}}_{u} + \overrightarrow{\mathbf{b}})\\ \alpha_{i} &= \delta(\mathbf{W} \cdot \overrightarrow{\mathbf{q}}_{i} + \overrightarrow{\mathbf{b}}) \end{aligned}\]
  • Representation Learning:

    \[\begin{aligned} \overrightarrow{\mathbf{u}}_{u} &= \text{MLP}_{\text{ReLU}}\left(\overrightarrow{\mathbf{p}}_{u} \oplus [\alpha_{u} \odot \overrightarrow{\mathbf{p}}_{u}]\right)\\ \overrightarrow{\mathbf{v}}_{i} &= \text{MLP}_{\text{ReLU}}\left(\overrightarrow{\mathbf{q}}_{i} \oplus [\alpha_{i} \odot \overrightarrow{\mathbf{q}}_{i}]\right) \end{aligned}\]
  • Predictive Vector of user $u$ and item $i$:

    \[\begin{aligned} \overrightarrow{\mathbf{z}}_{u,i} &= \overrightarrow{\mathbf{u}}_{u} \odot \overrightarrow{\mathbf{v}}_{i} \end{aligned}\]
  • If ARL is used as a standalone prediction module (a full module sketch follows this list):

    \[\begin{aligned} \hat{y}_{u,i} &= \sigma(\overrightarrow{\mathbf{w}} \cdot \overrightarrow{\mathbf{z}}_{u,i}) \end{aligned}\]
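
The four ARL steps collected into one PyTorch module. This is a sketch under stated assumptions: separate user/item weight matrices for the linear transformation, a single hidden layer per MLP, and arbitrary layer widths; the paper's exact configuration may differ.

```python
import torch
import torch.nn as nn

class ARL(nn.Module):
    """Attention Representation Learning (sketch; layer sizes are assumptions)."""

    def __init__(self, num_users, num_items, embed_dim=64, latent_dim=32):
        super().__init__()
        # Linear transformation of the interaction histories Y_{u*}, Y_{*i}
        self.user_embed = nn.Linear(num_items, embed_dim, bias=False)  # p_u = W · Y_{u*}
        self.item_embed = nn.Linear(num_users, embed_dim, bias=False)  # q_i = W · Y_{*i}
        # Attention weights: softmax over the embedding dimensions
        self.user_attn = nn.Linear(embed_dim, embed_dim)               # α_u = δ(W · p_u + b)
        self.item_attn = nn.Linear(embed_dim, embed_dim)               # α_i = δ(W · q_i + b)
        # Representation learning on [p ⊕ (α ⊙ p)] (one ReLU layer assumed)
        self.user_mlp = nn.Sequential(nn.Linear(2 * embed_dim, latent_dim), nn.ReLU())
        self.item_mlp = nn.Sequential(nn.Linear(2 * embed_dim, latent_dim), nn.ReLU())
        # Prediction head for standalone use
        self.out = nn.Linear(latent_dim, 1, bias=False)

    def forward(self, y_u, y_i):
        p_u = self.user_embed(y_u)                                # (B, D)
        q_i = self.item_embed(y_i)                                # (B, D)
        a_u = torch.softmax(self.user_attn(p_u), dim=-1)          # α_u
        a_i = torch.softmax(self.item_attn(q_i), dim=-1)          # α_i
        u_u = self.user_mlp(torch.cat([p_u, a_u * p_u], dim=-1))  # user latent factors
        v_i = self.item_mlp(torch.cat([q_i, a_i * q_i], dim=-1))  # item latent factors
        z = u_u * v_i                                             # element-wise product ⊙
        return z, torch.sigmoid(self.out(z))                      # (z_{u,i}, standalone ŷ_{u,i})
```

With the toy matrix from the Notation section, `ARL(M, N)(Y[u:u+1, :], Y[:, i].unsqueeze(0))` returns the predictive vector and the standalone prediction for the pair $(u, i)$.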

AML

  • History Embedding:

    \[\begin{aligned} \overrightarrow{\mathbf{p}}_{u} &= \mathbf{W} \cdot \mathbf{Y}_{u*}\\ \overrightarrow{\mathbf{q}}_{i} &= \mathbf{W} \cdot \mathbf{Y}_{*i} \end{aligned}\]
  • Vector Concatenation:

    \[\begin{aligned} \overrightarrow{\mathbf{x}}_{u,i} &= \overrightarrow{\mathbf{p}}_{u} \oplus \overrightarrow{\mathbf{q}}_{i} \end{aligned}\]
  • Attention Weight:

    \[\begin{aligned} \alpha_{u,i} &= \delta(\mathbf{W} \cdot \overrightarrow{\mathbf{x}}_{u,i} + \overrightarrow{\mathbf{b}}) \end{aligned}\]
  • Matching Function Learning:

    \[\begin{aligned} \overrightarrow{\mathbf{z}}_{u,i} &= \text{MLP}_{\text{ReLU}}\left(\overrightarrow{\mathbf{x}}_{u,i} \oplus [\alpha_{u,i} \odot \overrightarrow{\mathbf{x}}_{u,i}]\right) \end{aligned}\]
  • If AML is used as a standalone prediction module (a full module sketch follows this list):

    \[\begin{aligned} \hat{y}_{u,i} &= \sigma(\overrightarrow{\mathbf{w}} \cdot \overrightarrow{\mathbf{z}}_{u,i}) \end{aligned}\]
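
The AML steps as a matching PyTorch module, under the same assumptions as the ARL sketch (separate history-embedding matrices, one ReLU layer, illustrative widths):

```python
import torch
import torch.nn as nn

class AML(nn.Module):
    """Attention Matching Function Learning (sketch; layer sizes are assumptions)."""

    def __init__(self, num_users, num_items, embed_dim=64, out_dim=32):
        super().__init__()
        # History embeddings of Y_{u*} and Y_{*i}
        self.user_embed = nn.Linear(num_items, embed_dim, bias=False)  # p_u
        self.item_embed = nn.Linear(num_users, embed_dim, bias=False)  # q_i
        # Attention over the concatenated vector x_{u,i}
        self.attn = nn.Linear(2 * embed_dim, 2 * embed_dim)            # α_{u,i} = δ(W · x + b)
        # Matching function learning on [x ⊕ (α ⊙ x)] (one ReLU layer assumed)
        self.mlp = nn.Sequential(nn.Linear(4 * embed_dim, out_dim), nn.ReLU())
        # Prediction head for standalone use
        self.out = nn.Linear(out_dim, 1, bias=False)

    def forward(self, y_u, y_i):
        p_u = self.user_embed(y_u)                       # history embedding of user u
        q_i = self.item_embed(y_i)                       # history embedding of item i
        x = torch.cat([p_u, q_i], dim=-1)                # x_{u,i} = p_u ⊕ q_i
        a = torch.softmax(self.attn(x), dim=-1)          # α_{u,i}
        z = self.mlp(torch.cat([x, a * x], dim=-1))      # predictive vector z_{u,i}
        return z, torch.sigmoid(self.out(z))             # (z_{u,i}, standalone ŷ_{u,i})
```

For the full DACR prediction, the `z` vectors returned by the two modules are concatenated and passed through the ensemble head shown at the start of the modeling section.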