Dual Embedding based Latent Factor Models
Based on the following lectures
(1) "Recommendation System Design (2024-1)" by Prof. Ha Myung Park, Dept. of Artificial Intelligence, College of SW, Kookmin Univ.
(2) "Recommender System (2024-2)" by Prof. Hyun Sil Moon, Dept. of Data Science, The Grad. School, Kookmin Univ.
Embedding Type
- ID Embedding
  - Embeds user and item identifiers into a low-dimensional vector space
  - Produces representations that capture a user's intrinsic preferences or an item's intrinsic characteristics
  - Lacks contextual information about users and items, so behavioral or purchase patterns are hard to capture
- History Embedding
  - Generates each user and item representation from its past interaction history
  - Produces representations that capture a user's behavioral patterns or an item's purchase patterns
  - Represents users and items in terms of each other, so intrinsic information is hard to preserve
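The contrast between the two embedding types can be made concrete with a small sketch (illustrative only, not from the lectures): an ID embedding is a lookup table indexed by the identifier, while a history embedding is built from the embeddings of previously consumed items. All sizes and tensor names below are assumptions.

```python
import torch
import torch.nn as nn

# Toy sizes (assumptions, not from the lectures)
M, N, K = 5, 7, 4                       # number of users, items, embedding dim

# ID embedding: one learnable vector per user / item identifier
user_id_emb = nn.Embedding(M, K)
item_id_emb = nn.Embedding(N, K)
p_u = user_id_emb(torch.tensor(0))      # intrinsic representation of user 0

# History embedding: represent user 0 through the items they interacted with
history = torch.tensor([1, 3, 6])       # item indices consumed by user 0
m_u = item_id_emb(history).mean(dim=0)  # simple mean pooling over the history

print(p_u.shape, m_u.shape)             # both are K-dimensional vectors
```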
DELF
- Motivation: ID embedding and history embedding are complementary
  - ID embedding is strong at producing representations that preserve intrinsic information
  - History embedding is strong at producing representations that reflect contextual information
- DELF (Dual Embedding based Deep Latent Factor Model): A model that combines the ID embeddings and history embeddings of users and items to learn multiple matching functions in parallel
  - Cheng, W., Shen, Y., Zhu, Y., & Huang, L. (2018, July). DELF: A dual-embedding based deep latent factor model for recommendation. In IJCAI (Vol. 18, pp. 3329-3335).
Notation
- $u=1,2,\cdots,M$: user idx
- $i=1,2,\cdots,N$: item idx
- $\mathbf{R} \in \mathbb{R}^{M \times N}$: user-item interaction matrix
- $\overrightarrow{\mathbf{p}}_{u} \in \mathbb{R}^{K}$: user ID embedding vector
- $\overrightarrow{\mathbf{q}}_{i} \in \mathbb{R}^{K}$: item ID embedding vector
- $\overrightarrow{\mathbf{m}}_{u} \in \mathbb{R}^{K}$: user history embedding vector
- $\overrightarrow{\mathbf{n}}_{i} \in \mathbb{R}^{K}$: item history embedding vector
- $\overrightarrow{\mathbf{z}}_{u,i}$: predictive vector of user $u$ and item $i$
- $\hat{y}_{u,i}$: interaction probability of user $u$ and item $i$
How to Model
- ID Embedding:
\[\begin{aligned} \overrightarrow{\mathbf{p}}_{u} &=\text{Emb}(u)\\ \overrightarrow{\mathbf{q}}_{i} &=\text{Emb}(i) \end{aligned}\]
- History Embedding:
\[\begin{aligned} \overrightarrow{\mathbf{m}}_{u} &=\text{ATTN}(\overrightarrow{\mathbf{h}}^{\text{(user)}}, \mathbf{H}[\forall j \in \mathcal{R}_{u}^{+} \setminus \{i\},:], \mathbf{Y}[\forall j \in \mathcal{R}_{u}^{+} \setminus \{i\},:])\\ \overrightarrow{\mathbf{n}}_{i} &=\text{ATTN}(\overrightarrow{\mathbf{h}}^{\text{(item)}}, \mathbf{H}[\forall v \in \mathcal{R}_{i}^{+} \setminus \{u\},:], \mathbf{X}[\forall v \in \mathcal{R}_{i}^{+} \setminus \{u\},:]) \end{aligned}\]
- Pairwise Neural Interaction Layers:
\[\begin{aligned} \overrightarrow{\mathbf{z}}_{u,i}^{(1)} &= \text{MLP}_{\text{ReLU}}(\overrightarrow{\mathbf{p}}_{u} \oplus \overrightarrow{\mathbf{q}}_{i})\\ \overrightarrow{\mathbf{z}}_{u,i}^{(2)} &= \text{MLP}_{\text{ReLU}}(\overrightarrow{\mathbf{m}}_{u} \oplus \overrightarrow{\mathbf{n}}_{i})\\ \overrightarrow{\mathbf{z}}_{u,i}^{(3)} &= \text{MLP}_{\text{ReLU}}(\overrightarrow{\mathbf{p}}_{u} \oplus \overrightarrow{\mathbf{n}}_{i})\\ \overrightarrow{\mathbf{z}}_{u,i}^{(4)} &= \text{MLP}_{\text{ReLU}}(\overrightarrow{\mathbf{m}}_{u} \oplus \overrightarrow{\mathbf{q}}_{i}) \end{aligned}\]
- Predict the interaction probability of user $u$ and item $i$:
\[\begin{aligned} \hat{y}_{u,i} &= \sigma(\overrightarrow{\mathbf{w}} \cdot [\overrightarrow{\mathbf{z}}_{u,i}^{(1)} \oplus \overrightarrow{\mathbf{z}}_{u,i}^{(2)} \oplus \overrightarrow{\mathbf{z}}_{u,i}^{(3)} \oplus \overrightarrow{\mathbf{z}}_{u,i}^{(4)}] + \overrightarrow{\mathbf{b}}) \end{aligned}\]
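A minimal PyTorch sketch of the pairwise interaction layers and the prediction step, assuming the four embedding vectors are already available (the attention-based history embeddings are covered in the next section). Layer widths, depth, and variable names are assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

K = 16  # assumed embedding size

def mlp(in_dim, out_dim):
    # One ReLU MLP per pairwise interaction branch
    return nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU(),
                         nn.Linear(out_dim, out_dim), nn.ReLU())

class DELFHead(nn.Module):
    def __init__(self, k=K):
        super().__init__()
        # Four parallel matching functions over (ID, history) embedding pairs
        self.branches = nn.ModuleList([mlp(2 * k, k) for _ in range(4)])
        self.out = nn.Linear(4 * k, 1)  # fusion layer (w and b in the formula)

    def forward(self, p_u, q_i, m_u, n_i):
        pairs = [(p_u, q_i), (m_u, n_i), (p_u, n_i), (m_u, q_i)]
        z = [branch(torch.cat(pair, dim=-1))            # z^(1..4)
             for branch, pair in zip(self.branches, pairs)]
        return torch.sigmoid(self.out(torch.cat(z, dim=-1))).squeeze(-1)

# Toy usage with random tensors standing in for p_u, q_i, m_u, n_i
head = DELFHead()
vecs = [torch.randn(2, K) for _ in range(4)]  # batch of 2 (user, item) pairs
print(head(*vecs).shape)                      # -> torch.Size([2])
```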
How to Compute Attention
- Another ID Embedding:
\[\begin{aligned} \overrightarrow{\mathbf{x}}_{v} &=\text{Emb}(v)\\ \overrightarrow{\mathbf{y}}_{j} &=\text{Emb}(j) \end{aligned}\]
- Query Vector is a Global Context Vector:
\[\begin{aligned} \overrightarrow{\mathbf{h}}^{\text{(user)}}, \quad \overrightarrow{\mathbf{h}}^{\text{(item)}} \end{aligned}\]
- Key Vector is Generated by:
\[\begin{aligned} \overrightarrow{\mathbf{h}}_{v} &= \text{tanh}(\mathbf{W} \cdot \overrightarrow{\mathbf{x}}_{v} + \overrightarrow{\mathbf{b}})\\ \overrightarrow{\mathbf{h}}_{j} &= \text{tanh}(\mathbf{W} \cdot \overrightarrow{\mathbf{y}}_{j} + \overrightarrow{\mathbf{b}}) \end{aligned}\]
- History Embedding Vector is Generated by:
\[\begin{aligned} \overrightarrow{\mathbf{m}}_{u} &= \sum_{j \in \mathcal{R}_{u}^{+} \setminus \{i\}}{\alpha_{j} \cdot \overrightarrow{\mathbf{y}}_{j}}\\ \overrightarrow{\mathbf{n}}_{i} &= \sum_{v \in \mathcal{R}_{i}^{+} \setminus \{u\}}{\alpha_{v} \cdot \overrightarrow{\mathbf{x}}_{v}} \end{aligned}\]
- Attention Weight is Calculated by Softmax:
\[\begin{aligned} \alpha_{j} &= \frac{\exp{f(\overrightarrow{\mathbf{h}}^{\text{(user)}},\overrightarrow{\mathbf{h}}_{j})}}{\sum_{j \in \mathcal{R}_{u}^{+} \setminus \{i\}}{\exp{f(\overrightarrow{\mathbf{h}}^{\text{(user)}},\overrightarrow{\mathbf{h}}_{j})}}}\\ \alpha_{v} &= \frac{\exp{f(\overrightarrow{\mathbf{h}}^{\text{(item)}},\overrightarrow{\mathbf{h}}_{v})}}{\sum_{v \in \mathcal{R}_{i}^{+} \setminus \{u\}}{\exp{f(\overrightarrow{\mathbf{h}}^{\text{(item)}},\overrightarrow{\mathbf{h}}_{v})}}} \end{aligned}\]
- Attention Score Function is the Dot Product:
\[\begin{aligned} f(q,k) &= q \cdot k \end{aligned}\]
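A minimal sketch of the user-side attention pooling described above, assuming a learnable global context vector as the query and dot-product scoring. Names and dimensions are illustrative; the item-side history embedding is symmetric.

```python
import torch
import torch.nn as nn

K = 16  # assumed embedding size

class HistoryAttention(nn.Module):
    """Attention-pooled history embedding m_u over the items a user consumed."""
    def __init__(self, k=K):
        super().__init__()
        self.h_ctx = nn.Parameter(torch.randn(k))   # global context query h^(user)
        self.key = nn.Linear(k, k)                  # key projection (W, b)

    def forward(self, y_hist):
        # y_hist: (H, K) ID embeddings y_j of items in R_u^+ \ {i}
        h_j = torch.tanh(self.key(y_hist))          # keys h_j = tanh(W y_j + b)
        scores = h_j @ self.h_ctx                   # dot-product score f(q, k)
        alpha = torch.softmax(scores, dim=0)        # attention weights alpha_j
        return (alpha.unsqueeze(-1) * y_hist).sum(dim=0)  # m_u = sum_j alpha_j y_j

# Toy usage: a user with 5 historical items
attn = HistoryAttention()
y_hist = torch.randn(5, K)      # stand-in for the item ID embeddings y_j
m_u = attn(y_hist)
print(m_u.shape)                # -> torch.Size([16])
```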
DNCF
- Motivation: keeping ID embedding and history embedding separate constrains representational power
  - DELF learns its matching functions with the ID embeddings and history embeddings kept separate
  - As a result, the two representations cannot complement or reinforce each other's expressiveness
- DNMF (Deep Neural Matrix Factorization): An ensemble model that strengthens the expressiveness of NeuMF by building a single representation that combines the ID embedding and the history embedding
  - He, G., Zhao, D., & Ding, L. (2021). Dual-embedding based neural collaborative filtering for recommender systems. arXiv preprint arXiv:2102.02549.
- Components
  - DGMF: Dual-Embedding based Generalized Matrix Factorization
  - DMLP: Dual-Embedding based Multi-Layer Perceptron
  - DNMF: DGMF & DMLP Ensemble
Notation
- $u=1,2,\cdots,M$: user idx
- $i=1,2,\cdots,N$: item idx
- $\mathbf{Y} \in \mathbb{R}^{M \times N}$: user-item interaction matrix
- $\overrightarrow{\mathbf{p}}_{u} \in \mathbb{R}^{K}$: user ID embedding vector
- $\overrightarrow{\mathbf{q}}_{i} \in \mathbb{R}^{K}$: item ID embedding vector
- $\overrightarrow{\mathbf{m}}_{u} \in \mathbb{R}^{K}$: user history embedding vector
- $\overrightarrow{\mathbf{n}}_{i} \in \mathbb{R}^{K}$: item history embedding vector
- $\overrightarrow{\mathbf{u}}_{u}$: user embedding combination vector
- $\overrightarrow{\mathbf{v}}_{i}$: item embedding combination vector
- $\overrightarrow{\mathbf{z}}_{u,i}$: predictive vector of user $u$ and item $i$
- $\hat{y}_{u,i}$: interaction probability of user $u$ and item $i$
How to Model
- DNMF is the DGMF & DMLP Ensemble:
\[\begin{aligned} \hat{y}_{u,i} &= \sigma(\overrightarrow{\mathbf{w}} \cdot [\overrightarrow{\mathbf{z}}_{u,i}^{\text{(DGMF)}} \oplus \overrightarrow{\mathbf{z}}_{u,i}^{\text{(DMLP)}}] + \overrightarrow{\mathbf{b}}) \end{aligned}\]
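A minimal sketch of this fusion step, assuming the DGMF and DMLP predictive vectors are already computed (both modules are detailed below); the vector sizes are assumptions.

```python
import torch
import torch.nn as nn

# Assumed sizes of the two predictive vectors (not from the paper)
K_GMF, K_MLP = 16, 32

fuse = nn.Linear(K_GMF + K_MLP, 1)       # weights w and bias b in the formula

z_dgmf = torch.randn(2, K_GMF)           # stand-in for z^(DGMF), batch of 2
z_dmlp = torch.randn(2, K_MLP)           # stand-in for z^(DMLP)
y_hat = torch.sigmoid(fuse(torch.cat([z_dgmf, z_dmlp], dim=-1))).squeeze(-1)
print(y_hat.shape)                       # -> torch.Size([2])
```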
DGMF
- ID Embedding:
\[\begin{aligned} \overrightarrow{\mathbf{p}}_{u} &=\text{Emb}(u)\\ \overrightarrow{\mathbf{q}}_{i} &=\text{Emb}(i) \end{aligned}\]
- History Embedding:
\[\begin{aligned} \overrightarrow{\mathbf{m}}_{u} &=\frac{1}{\sqrt{\vert \mathcal{R}_{u}^{+} \setminus \{i\} \vert}}\mathbf{W} \cdot \mathbf{Y}_{u*}\\ \overrightarrow{\mathbf{n}}_{i} &=\frac{1}{\sqrt{\vert \mathcal{R}_{i}^{+} \setminus \{u\} \vert}}\mathbf{W} \cdot \mathbf{Y}_{*i} \end{aligned}\]
- Embedding Combination:
\[\begin{aligned} \overrightarrow{\mathbf{u}}_{u} &= \text{Agg}(\overrightarrow{\mathbf{p}}_{u}, \overrightarrow{\mathbf{m}}_{u})\\ \overrightarrow{\mathbf{v}}_{i} &= \text{Agg}(\overrightarrow{\mathbf{q}}_{i}, \overrightarrow{\mathbf{n}}_{i}) \end{aligned}\]
  - element-wise sum
  - element-wise mean
  - concatenation
  - attention
- Predictive Vector of user $u$ and item $i$:
\[\begin{aligned} \overrightarrow{\mathbf{z}}_{u,i} &= \overrightarrow{\mathbf{u}}_{u} \odot \overrightarrow{\mathbf{v}}_{i} \end{aligned}\]
- If DGMF is used as a single prediction module:
\[\begin{aligned} \hat{y}_{u,i} &= \sigma(\overrightarrow{\mathbf{w}} \cdot \overrightarrow{\mathbf{z}}_{u,i} + \overrightarrow{\mathbf{b}}) \end{aligned}\]
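A minimal sketch of DGMF under stated assumptions: the history embedding is a linear map of the user's interaction row (item's column) scaled by the history size, the combination Agg is an element-wise sum, and for brevity the target item/user is not excluded from the history. Sizes and names are illustrative.

```python
import torch
import torch.nn as nn

M, N, K = 5, 7, 16  # assumed numbers of users, items, and embedding size

class DGMF(nn.Module):
    def __init__(self, m=M, n=N, k=K):
        super().__init__()
        self.p = nn.Embedding(m, k)                # user ID embeddings
        self.q = nn.Embedding(n, k)                # item ID embeddings
        self.w_user = nn.Linear(n, k, bias=False)  # maps Y_{u*} to m_u
        self.w_item = nn.Linear(m, k, bias=False)  # maps Y_{*i} to n_i
        self.out = nn.Linear(k, 1)

    def forward(self, u, i, Y):
        # History embeddings, scaled by 1/sqrt(|history|)
        m_u = self.w_user(Y[u]) / Y[u].sum(-1, keepdim=True).clamp(min=1).sqrt()
        n_i = self.w_item(Y[:, i].T) / Y[:, i].sum(0, keepdim=True).T.clamp(min=1).sqrt()
        # Embedding combination by element-wise sum, then GMF-style interaction
        u_vec = self.p(u) + m_u
        v_vec = self.q(i) + n_i
        z = u_vec * v_vec                          # predictive vector z_{u,i}
        return torch.sigmoid(self.out(z)).squeeze(-1)

# Toy usage with a random implicit-feedback matrix
Y = (torch.rand(M, N) > 0.5).float()
model = DGMF()
print(model(torch.tensor([0, 1]), torch.tensor([2, 3]), Y).shape)  # -> torch.Size([2])
```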
DMLP
- ID Embedding:
\[\begin{aligned} \overrightarrow{\mathbf{p}}_{u} &=\text{Emb}(u)\\ \overrightarrow{\mathbf{q}}_{i} &=\text{Emb}(i) \end{aligned}\]
- History Embedding:
\[\begin{aligned} \overrightarrow{\mathbf{m}}_{u} &=\frac{1}{\sqrt{\vert \mathcal{R}_{u}^{+} \setminus \{i\} \vert}}\mathbf{W} \cdot \mathbf{Y}_{u*}\\ \overrightarrow{\mathbf{n}}_{i} &=\frac{1}{\sqrt{\vert \mathcal{R}_{i}^{+} \setminus \{u\} \vert}}\mathbf{W} \cdot \mathbf{Y}_{*i} \end{aligned}\]
- Embedding Combination:
\[\begin{aligned} \overrightarrow{\mathbf{u}}_{u} &= \overrightarrow{\mathbf{p}}_{u} \oplus \overrightarrow{\mathbf{m}}_{u}\\ \overrightarrow{\mathbf{v}}_{i} &= \overrightarrow{\mathbf{q}}_{i} \oplus \overrightarrow{\mathbf{n}}_{i} \end{aligned}\]
- Predictive Vector of user $u$ and item $i$:
\[\begin{aligned} \overrightarrow{\mathbf{z}}_{u,i} &= \text{MLP}_{\text{ReLU}}(\overrightarrow{\mathbf{u}}_{u} \oplus \overrightarrow{\mathbf{v}}_{i}) \end{aligned}\]
- If DMLP is used as a single prediction module:
\[\begin{aligned} \hat{y}_{u,i} &= \sigma(\overrightarrow{\mathbf{w}} \cdot \overrightarrow{\mathbf{z}}_{u,i} + \overrightarrow{\mathbf{b}}) \end{aligned}\]
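A minimal sketch of DMLP under the same assumptions as the DGMF sketch, using concatenation for the embedding combination and a ReLU MLP for the interaction; layer widths are illustrative.

```python
import torch
import torch.nn as nn

M, N, K = 5, 7, 16  # assumed numbers of users, items, and embedding size

class DMLP(nn.Module):
    def __init__(self, m=M, n=N, k=K):
        super().__init__()
        self.p = nn.Embedding(m, k)                # user ID embeddings
        self.q = nn.Embedding(n, k)                # item ID embeddings
        self.w_user = nn.Linear(n, k, bias=False)  # maps Y_{u*} to m_u
        self.w_item = nn.Linear(m, k, bias=False)  # maps Y_{*i} to n_i
        # Concatenated [u_vec ; v_vec] is 4K wide; widths below are assumptions
        self.mlp = nn.Sequential(nn.Linear(4 * k, 2 * k), nn.ReLU(),
                                 nn.Linear(2 * k, k), nn.ReLU())
        self.out = nn.Linear(k, 1)

    def forward(self, u, i, Y):
        m_u = self.w_user(Y[u]) / Y[u].sum(-1, keepdim=True).clamp(min=1).sqrt()
        n_i = self.w_item(Y[:, i].T) / Y[:, i].sum(0, keepdim=True).T.clamp(min=1).sqrt()
        u_vec = torch.cat([self.p(u), m_u], dim=-1)   # combination by concatenation
        v_vec = torch.cat([self.q(i), n_i], dim=-1)
        z = self.mlp(torch.cat([u_vec, v_vec], dim=-1))  # predictive vector z_{u,i}
        return torch.sigmoid(self.out(z)).squeeze(-1)

# Toy usage with a random implicit-feedback matrix
Y = (torch.rand(M, N) > 0.5).float()
model = DMLP()
print(model(torch.tensor([0, 1]), torch.tensor([2, 3]), Y).shape)  # -> torch.Size([2])
```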