Distilling the Knowledge in Neural Network
jinyeah 2022. 5. 10. 13:06
- One of the earliest papers on Knowledge Distillation (2014 NIPS workshop)
- A model compression technique for bringing computation to edge devices: the goal is to train a small, compact model that mimics the performance of a large, cumbersome model
- Supervised learning (ground-truth labels are available)
- Response-based knowledge (the teacher's soft targets) + offline distillation (a pre-trained teacher model)
Knowledge
Conventional approach
- Uses hard targets obtained by one-hot encoding: the class with the highest softmax probability is set to 1 and all other classes to 0
- Problem: the standard softmax function maps the largest probability close to 1 and the rest close to 0, so the information about how similar the incorrect classes are to each other is largely lost
Proposed method
- Soft targets: add a temperature hyperparameter (T) to the softmax function (a small sketch follows this list)
- When T = 1, it is identical to the standard softmax function
- A larger T yields a softer (more uniform) probability distribution
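A minimal sketch of the temperature-scaled softmax, q_i = exp(z_i / T) / Σ_j exp(z_j / T). The function name and the example logits below are my own illustration, not values from the paper:

```python
import numpy as np

def softmax_with_temperature(logits, T=1.0):
    """Temperature-scaled softmax: q_i = exp(z_i / T) / sum_j exp(z_j / T)."""
    scaled = np.asarray(logits, dtype=float) / T
    scaled -= scaled.max()          # subtract max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

logits = [8.0, 3.0, 1.0]
print(softmax_with_temperature(logits, T=1))  # near one-hot: ~[0.99, 0.007, 0.001]
print(softmax_with_temperature(logits, T=5))  # softer:       ~[0.62, 0.23, 0.15]
```

With a higher temperature the relative ordering of the classes is preserved, but the smaller probabilities carry visibly more information for the student to learn from.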
Distillation
The knowledge learned by the teacher model is transferred to the student model through two loss terms; the overall objective is a weighted sum of both (a combined sketch follows this list).
- distillation loss: the Kullback-Leibler divergence between the teacher's soft labels and the student's soft predictions (both computed with temperature T)
- student loss: the cross-entropy between the student's hard predictions (T = 1) and the ground-truth hard labels
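A minimal PyTorch sketch of the combined objective, assuming raw `teacher_logits` and `student_logits` and a weighting factor `alpha`; the function name, `alpha`, and the default T are illustrative choices, not values prescribed by the paper:

```python
import torch
import torch.nn.functional as F

def knowledge_distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Distillation loss: KL divergence between the teacher's and the student's
    # temperature-softened distributions. The T**2 factor keeps gradient
    # magnitudes comparable across different temperatures, as noted in the paper.
    soft_labels = F.softmax(teacher_logits / T, dim=1)
    soft_predictions = F.log_softmax(student_logits / T, dim=1)
    distillation_loss = F.kl_div(soft_predictions, soft_labels,
                                 reduction="batchmean") * (T ** 2)

    # Student loss: ordinary cross-entropy against the ground-truth hard labels (T = 1).
    student_loss = F.cross_entropy(student_logits, labels)

    # Weighted sum of the two terms.
    return alpha * distillation_loss + (1.0 - alpha) * student_loss
```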
What is Kullback-Leibler Divergence?
2022.05.10 - [Deep Learning/Basic] - Objective function
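As a quick reminder before following the link above, a minimal sketch of the discrete KL divergence, KL(P‖Q) = Σ_i P(i) log(P(i) / Q(i)); the example distributions are made up:

```python
import numpy as np

def kl_divergence(p, q):
    """KL(P || Q) = sum_i P(i) * log(P(i) / Q(i)); p and q are probability vectors."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

teacher = [0.62, 0.23, 0.15]   # softened teacher distribution (hypothetical)
student = [0.80, 0.15, 0.05]   # softened student distribution (hypothetical)
print(kl_divergence(teacher, student))  # 0 only when the two distributions match
```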
References
- https://towardsdatascience.com/distilling-knowledge-in-neural-network-d8991faa2cdc
- https://intellabs.github.io/distiller/knowledge_distillation.html
- https://dsbook.tistory.com/324