Objective function/Loss function

jinyeah 2022. 5. 10. 13:14

To improve the performance of a deep learning model, the goal is to minimize or maximize an objective function. For regression, classification, and segmentation problems, the objective is to minimize the difference between the predictions and the ground truths, so the objective function is also called a loss function.


Regression Loss Functions

  • Squared Error Loss
  • Absolute Error Loss
  • Huber Loss

Classification Loss Functions

  • Cross-Entropy Loss (based on KL divergence)
  • Hinge Loss

Segmentation Loss Functions

  • [Region-based loss] Dice Loss
  • [Distribution-based loss] Pixel-wise Cross-Entropy Loss

Regression

1. Mean Absolute Error (L1-norm, L1-loss)

  • the average of the absolute differences between predictions and actual observations
  • more robust to outliers since it does not square the errors
  • not differentiable at zero
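
As a quick illustration, here is a minimal NumPy sketch of MAE (the toy values are made up for this example):

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

# One outlier (100 vs. 8) increases the loss only linearly.
print(mean_absolute_error([3.0, -0.5, 2.0, 8.0], [2.5, 0.0, 2.0, 100.0]))  # 23.25
```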

2. Mean Squared Error (L2-norm, L2-loss)

  • the average of the squared differences between predictions and actual observations
  • concerned only with the average magnitude of the errors, irrespective of their direction
  • also called the L2 loss because it is based on the Euclidean (L2) distance between the outputs and the targets
  • due to squaring, outliers are penalized much more heavily than small deviations
  • differentiable everywhere, so the gradient is simple to compute
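
Using the same toy values as above, a minimal NumPy sketch of MSE shows how squaring lets the outlier dominate:

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# The same outlier as in the MAE example now dominates the loss because it is squared.
print(mean_squared_error([3.0, -0.5, 2.0, 8.0], [2.5, 0.0, 2.0, 100.0]))  # 2116.125
```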

3. Huber Loss

  • combines the advantages of the L1 and L2 losses: differentiable and robust to outliers
  • requires tuning the hyperparameter Delta
    • if Delta is close to 0, it behaves like MAE
    • if Delta is large, it behaves like MSE
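
A sketch of the Huber loss under the definition above; the choice of Delta and the toy data are only illustrative:

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for small errors (|e| <= delta), linear for large ones."""
    error = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    small = np.abs(error) <= delta
    quadratic = 0.5 * error ** 2                     # MSE-like region
    linear = delta * (np.abs(error) - 0.5 * delta)   # MAE-like region
    return np.mean(np.where(small, quadratic, linear))

y_true, y_pred = [3.0, -0.5, 2.0, 8.0], [2.5, 0.0, 2.0, 100.0]
print(huber_loss(y_true, y_pred, delta=1.0))    # small delta: the outlier is penalized linearly, MAE-like
print(huber_loss(y_true, y_pred, delta=100.0))  # large delta: quadratic everywhere, MSE-like (up to the 0.5 factor)
```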

Classification

1. Hinge Loss/Multi-class SVM Loss

  • s(j) : score of each incorrect category
  • s(yi) : score of the correct category
  • 1 : safety margin
  • if s(j) > s(yi) - 1, the margin is violated and the loss is positive
  • if s(j) ≤ s(yi) - 1, the prediction is safely correct and the loss is zero
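
A minimal sketch of the multi-class SVM loss for a single example, assuming raw class scores and a margin of 1 (function and variable names are my own):

```python
import numpy as np

def multiclass_svm_loss(scores, correct_class, margin=1.0):
    """L_i = sum over j != y_i of max(0, s_j - s_{y_i} + margin)."""
    scores = np.asarray(scores, dtype=float)
    margins = np.maximum(0.0, scores - scores[correct_class] + margin)
    margins[correct_class] = 0.0  # the correct class does not contribute to the loss
    return margins.sum()

# Correct class (index 0) beats every other score by more than the margin -> zero loss.
print(multiclass_svm_loss([5.0, 1.0, 2.0], correct_class=0))  # 0.0
# Class 2 violates the margin: max(0, 4 - 3 + 1) = 2 -> positive loss.
print(multiclass_svm_loss([3.0, 1.0, 4.0], correct_class=0))  # 2.0
```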

2. Cross-Entropy Loss/Negative Log-Likelihood

Note) Kullback-Leibler Divergence Loss

  • a measure of how one probability distribution differs from a baseline distribution
  • p is the actual (baseline) probability distribution, q is the estimated distribution
  • equal to the expectation, under p, of the log ratio between the true likelihood p(x) and the estimated likelihood q(x)
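
A small NumPy sketch relating cross-entropy, negative log-likelihood, and KL divergence for discrete distributions (the eps term is only there to avoid log(0); the one-hot example is illustrative):

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_x p(x) * log q(x)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q + eps))

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_x p(x) * log(p(x) / q(x)) = H(p, q) - H(p)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum(p * np.log((p + eps) / (q + eps)))

p = [1.0, 0.0, 0.0]         # one-hot ground truth
q = [0.7, 0.2, 0.1]         # predicted class probabilities (e.g. a softmax output)
print(cross_entropy(p, q))  # -log(0.7), the negative log-likelihood of the correct class
print(kl_divergence(p, q))  # the same value here, since the entropy of a one-hot p is 0
```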

Segmentation

1. [Region-based loss] Dice Loss

  • Region-based loss functions aim to minimize the mismatch, or maximize the overlap, between the ground truth and the predicted segmentation
  • based on the Dice coefficient, which is a measure of overlap between two samples
  • P(true) is a binary mask, so the overlap between predictions and targets can be calculated with an element-wise multiplication
  • considers only the foreground class
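
A minimal NumPy sketch of the Dice loss for a single binary mask; the smoothing term is a common stabilization trick and not something specified in this post:

```python
import numpy as np

def dice_loss(pred, target, smooth=1.0):
    """1 - Dice coefficient; overlap is computed with an element-wise multiplication."""
    pred, target = np.asarray(pred, dtype=float), np.asarray(target, dtype=float)
    intersection = np.sum(pred * target)  # works because the target is a binary mask
    dice = (2.0 * intersection + smooth) / (np.sum(pred) + np.sum(target) + smooth)
    return 1.0 - dice

target = np.array([[1, 1, 0],
                   [0, 1, 0],
                   [0, 0, 0]])
pred = np.array([[0.9, 0.8, 0.1],
                 [0.2, 0.7, 0.1],
                 [0.0, 0.1, 0.0]])
print(dice_loss(pred, target))  # small loss: the predicted foreground overlaps the target well
```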

2. [Distribution-based loss] Pixel-wise Cross-Entropy Loss

  • applies the standard cross-entropy loss to every pixel individually and averages over all pixels
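
A small sketch of pixel-wise cross-entropy, assuming channel-last per-pixel class probabilities and integer class labels (this layout is an assumption for illustration):

```python
import numpy as np

def pixelwise_cross_entropy(probs, target, eps=1e-12):
    """Mean over all pixels of -log p(true class at that pixel).

    probs:  (H, W, C) predicted class probabilities per pixel
    target: (H, W) integer class index per pixel
    """
    h, w, _ = probs.shape
    rows, cols = np.indices((h, w))
    true_class_probs = probs[rows, cols, target]  # probability of the true class at each pixel
    return -np.mean(np.log(true_class_probs + eps))

probs = np.array([[[0.9, 0.1], [0.6, 0.4]],
                  [[0.3, 0.7], [0.2, 0.8]]])  # 2x2 image, 2 classes
target = np.array([[0, 0],
                   [1, 1]])                   # ground-truth class per pixel
print(pixelwise_cross_entropy(probs, target))  # ~0.30
```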

 

