t_wの輪郭

Differentiation, loss function
Derivative of a Loss Function Built from Cosine Similarity

I confirmed that the result derived here agrees with numerical differentiation.


Cosine similarity: \( \cos(X, Y) = \frac{\sum_{i=1}^{N}(X_i Y_i)}{\sqrt{\sum_{i=1}^{N}(X_i^2)} \sqrt{\sum_{i=1}^{N}(Y_i^2)}} \)

Target similarity: \(t\)

Loss function built from cosine similarity: \( L(X, Y, t) = (t - \cos(X, Y))^2 \)
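
The two definitions above can be sketched in plain Python. This is a minimal illustration, not the author's code: the function names `cosine_similarity` and `cosine_loss` and the example vectors are assumptions made here.

```python
import math

def cosine_similarity(x, y):
    """cos(X, Y) = sum(X_i * Y_i) / (sqrt(sum(X_i^2)) * sqrt(sum(Y_i^2)))."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    # Undefined (division by zero) if either vector is all zeros.
    return dot / (norm_x * norm_y)

def cosine_loss(x, y, t):
    """L(X, Y, t) = (t - cos(X, Y))^2."""
    return (t - cosine_similarity(x, y)) ** 2

# Example values chosen only for illustration.
x = [3.0, 4.0]
y = [4.0, 3.0]
print(cosine_similarity(x, y))  # 24 / (5 * 5) = 0.96
print(cosine_loss(x, y, 1.0))   # (1 - 0.96)^2, roughly 0.0016
```

The loss is zero exactly when the cosine similarity equals the target \(t\), and it grows quadratically with the gap.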

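From these definitions, we derive the partial derivative of the loss function with respect to \(X_i\) using the chain rule. In every sum below, the index \(i\) under \(\sum\) is a dummy summation index, distinct from the free index \(i\) in \(\partial X_i\).

$$ \frac{\partial L(X, Y, t)}{\partial X_i} = \frac{\partial(t-\cos(X,Y))^2}{\partial X_i} = \frac{\partial(t-\cos(X,Y))^2}{\partial (t-\cos(X,Y))} \frac{\partial(t-\cos(X,Y))}{\partial \cos(X,Y)} \frac{\partial \cos(X,Y)}{\partial X_i} $$

Each of the three factors is evaluated in turn.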


$$ \frac{\partial(t-\cos(X,Y))^2}{\partial (t-\cos(X,Y))} = 2(t-\cos(X,Y))$$
$$ \frac{\partial(t-\cos(X,Y))}{\partial \cos(X,Y)} = -1 $$
$$ \frac{\partial \cos(X,Y)}{\partial X_i} = \frac{\partial \frac{\sum_{i=1}^{N}(X_i Y_i)}{\sqrt{\sum_{i=1}^{N}(X_i^2)} \sqrt{\sum_{i=1}^{N}(Y_i^2)}}}{\partial X_i} $$
$$ = \frac{\partial (\sum_{i=1}^{N}(X_i Y_i) (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}})}{\partial X_i} $$
$$ = \frac{\partial (\sum_{i=1}^{N}(X_i Y_i))}{\partial X_i} (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} + \sum_{i=1}^{N}(X_i Y_i) \frac{\partial ((\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}})}{\partial X_i} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} $$
$$ = Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} + \sum_{i=1}^{N}(X_i Y_i) \frac{\partial (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}}}{\partial (\sum_{i=1}^{N}(X_i^2))} \frac{\partial (\sum_{i=1}^{N}(X_i^2))}{\partial X_i} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} $$
$$ = Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} + \sum_{i=1}^{N}(X_i Y_i) (-\frac{1}{2} (\sum_{i=1}^{N}(X_i^2))^{-\frac{3}{2}} 2X_i) (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} $$
$$ = Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} - X_i \sum_{i=1}^{N}(X_i Y_i) (\sum_{i=1}^{N}(X_i^2))^{-\frac{3}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} $$
$$ = (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} (Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} - X_i \sum_{i=1}^{N}(X_i Y_i) (\sum_{i=1}^{N}(X_i^2))^{-\frac{3}{2}}) $$
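
As a reading aid, the last line can be rewritten in vector notation. The shorthand \(X \cdot Y = \sum_{i=1}^{N}(X_i Y_i)\) and \(\|X\| = \sqrt{\sum_{i=1}^{N}(X_i^2)}\) is introduced here for convenience and does not appear in the original derivation.

$$ \frac{\partial \cos(X,Y)}{\partial X_i} = \frac{Y_i}{\|X\| \|Y\|} - \frac{(X \cdot Y) X_i}{\|X\|^3 \|Y\|} = \frac{1}{\|X\|} \left( \frac{Y_i}{\|Y\|} - \cos(X,Y) \frac{X_i}{\|X\|} \right) $$

Multiplying this by \(X_i\) and summing over \(i\) gives \(\cos(X,Y) - \cos(X,Y) = 0\), so the gradient is orthogonal to \(X\). This matches the fact that rescaling \(X\) does not change the cosine similarity.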

Substituting the three factors into the chain rule gives

$$ \frac{\partial L(X, Y, t)}{\partial X_i} = 2(t-\cos(X,Y)) (-1) (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} (Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} - X_i \sum_{i=1}^{N}(X_i Y_i) (\sum_{i=1}^{N}(X_i^2))^{-\frac{3}{2}}) $$
$$ = -2(t-\cos(X,Y)) (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} (Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} - X_i \sum_{i=1}^{N}(X_i Y_i) (\sum_{i=1}^{N}(X_i^2))^{-\frac{3}{2}}) $$
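
The original check against numerical differentiation is not shown, so the following is only one possible way to reproduce it. It is a sketch in plain Python that compares the closed-form gradient with a central-difference approximation. The function names, the step size `h`, and the test vectors are all assumptions made for illustration.

```python
import math

def cosine_similarity(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def loss(x, y, t):
    # L(X, Y, t) = (t - cos(X, Y))^2
    return (t - cosine_similarity(x, y)) ** 2

def analytic_grad(x, y, t):
    """dL/dX_i for every i, using the closed-form expression derived above."""
    dot = sum(a * b for a, b in zip(x, y))  # sum of X_i * Y_i
    sx = sum(a * a for a in x)              # sum of X_i^2
    sy = sum(b * b for b in y)              # sum of Y_i^2
    c = dot / math.sqrt(sx * sy)            # cos(X, Y)
    return [
        -2.0 * (t - c) * sy ** -0.5 * (yi * sx ** -0.5 - xi * dot * sx ** -1.5)
        for xi, yi in zip(x, y)
    ]

def numerical_grad(x, y, t, h=1e-6):
    """Central difference (L(x + h*e_i) - L(x - h*e_i)) / (2h) for every i."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((loss(x_plus, y, t) - loss(x_minus, y, t)) / (2.0 * h))
    return grad

# Arbitrary test values, chosen only for illustration.
x = [1.0, 2.0, 3.0]
y = [-1.0, 0.5, 2.0]
t = 0.3

ga = analytic_grad(x, y, t)
gn = numerical_grad(x, y, t)
max_diff = max(abs(a - n) for a, n in zip(ga, gn))
print("analytic :", ga)
print("numerical:", gn)
print("max abs difference:", max_diff)
```

With a step of `1e-6`, the two gradients should differ only by floating-point and truncation error, which is far below the size of the gradient components themselves.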