https://math.stackexchange.com/questions/1923613/partial-derivative-of-cosine-similarity
コサイン類似度からなる損失関数の微分
2023/9/6 21:16:00
数値微分と結果が一致することを確認できた
コサイン類似度: \( cos(X, Y) = \frac{\sum_{i=1}^{N}(X_i Y_i)}{\sqrt{\sum_{i=1}^{N}(X_i^2)} \sqrt{\sum_{i=1}^{N}(Y_i^2)}} \)
目標とする類似度: \(t\)
コサイン類似度からなる損失関数: \( L(X, Y, t) = (t- cos(X, Y))^2 \)
より
コサイン類似度からなる損失関数を\(X_i\)について偏微分した式
$$ \frac{\partial L(X, Y, t)}{\partial X_i} = \frac{\partial(t-cos(X,Y))^2}{\partial X_i} = \frac{\partial(t-cos(X,Y))^2}{\partial (t-cos(X,Y))} \frac{\partial(t-cos(X,Y))}{\partial cos(X,Y)} \frac{\partial cos(X,Y)}{\partial X_i} $$
を求める。
$$ \frac{\partial(t-cos(X,Y))^2}{\partial (t-cos(X,Y))} = 2(t-cos(X,Y))$$
$$ \frac{\partial(t-cos(X,Y))}{\partial cos(X,Y)} = -1 $$
$$ \frac{\partial cos(X,Y)}{\partial X_i} = \frac{\partial \frac{\sum_{i=1}^{N}(X_i Y_i)}{\sqrt{\sum_{i=1}^{N}(X_i^2)} \sqrt{\sum_{i=1}^{N}(Y_i^2)}}}{\partial X_i} $$
$$ = \frac{\partial (\sum_{i=1}^{N}(X_i Y_i) ({\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}}})}{\partial X_i} $$
$$ = \frac{\partial (\sum_{i=1}^{N}(X_i Y_i))}{\partial X_i} (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} + \sum_{i=1}^{N}(X_i Y_i) \frac{\partial ((\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}})}{\partial X_i} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} $$
$$ = Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} + \sum_{i=1}^{N}(X_i Y_i) \frac{\partial (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}}}{\partial (\sum_{i=1}^{N}(X_i^2))} \frac{\partial (\sum_{i=1}^{N}(X_i^2))}{\partial X_i} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} $$
$$ = Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} + \sum_{i=1}^{N}(X_i Y_i) (-\frac{1}{2} (\sum_{i=1}^{N}(X_i^2))^{-\frac{3}{2}} 2X_i) (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} $$
$$ = Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} - X_i \sum_{i=1}^{N}(X_i Y_i) (\sum_{i=1}^{N}(X_i^2))^{-\frac{3}{2}} (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} $$
$$ = (\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} (Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} - X_i \sum_{i=1}^{N}(X_i Y_i) (\sum_{i=1}^{N}(X_i^2))^{-\frac{3}{2}}) $$
より
$$ \frac{\partial L(X, Y, t)}{\partial X_i} = 2(t-cos(X,Y)) (-1) ((\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} (Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} - X_i \sum_{i=1}^{N}(X_i Y_i) (\sum_{i=1}^{N}(X_i^2))^{-\frac{3}{2}}) $$
$$ = -2(t-cos(X,Y)) ((\sum_{i=1}^{N}(Y_i^2))^{-\frac{1}{2}} (Y_i (\sum_{i=1}^{N}(X_i^2))^{-\frac{1}{2}} - X_i \sum_{i=1}^{N}(X_i Y_i) (\sum_{i=1}^{N}(X_i^2))^{-\frac{3}{2}}) $$