Distilling the Knowledge in a Neural Network
jinyeah · 2022. 5. 10. 13:06
- One of the earliest Knowledge Distillation papers (2014 NIPS workshop)
- A model compression technique for bringing computation to edge devices: the goal is to train a small, compact model that mimics the performance of a cumbersome model
- Supervised learning (ground-truth labels are available)
- Response-based knowledge (the teacher's soft targets) + offline distillation (a pre-trained teacher model)
Knowledge
Conventional approach
- Uses hard targets: one-hot encoding in which the class with the highest softmax probability becomes 1 and all other classes become 0
- Problem: the standard softmax function maps the largest probability close to 1 and the rest close to 0, so the relative similarities between classes are mostly lost
Proposed method
- Soft targets: add a temperature hyperparameter (T) to the softmax function (see the sketch after this list)
- When T = 1, this is identical to the standard softmax function
- The larger the temperature, the softer (closer to uniform) the resulting probability distribution
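
A minimal sketch of the temperature-scaled softmax described above (NumPy; the logit values are made up for illustration):

```python
import numpy as np

def softmax_with_temperature(logits, T=1.0):
    # q_i = exp(z_i / T) / sum_j exp(z_j / T)
    z = np.asarray(logits, dtype=np.float64) / T
    z -= z.max()            # subtract max for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

logits = [8.0, 3.0, 1.0]
print(softmax_with_temperature(logits, T=1))  # ≈ [0.99, 0.007, 0.001] -> near one-hot
print(softmax_with_temperature(logits, T=5))  # ≈ [0.62, 0.23, 0.15]  -> softer distribution
```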
Distillation
Transfer the knowledge learned by the teacher model to the student model (see the sketch after this list).
- Distillation loss: the difference between the teacher's soft labels and the student's soft predictions, measured with the Kullback-Leibler divergence
- Student loss: the cross-entropy between the student's hard predictions and the ground-truth hard labels
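
A minimal PyTorch sketch of how the two losses are commonly combined; the weighting factor alpha and the temperature T are illustrative assumptions, and the T² scaling of the distillation term follows the usual formulation:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Distillation loss: KL divergence between the teacher's softened outputs
    # and the student's softened outputs, scaled by T^2 so gradient magnitudes
    # stay comparable across temperatures.
    soft_student = F.log_softmax(student_logits / T, dim=1)
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    distill = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T ** 2)

    # Student loss: ordinary cross-entropy against the ground-truth hard labels.
    student = F.cross_entropy(student_logits, labels)

    # alpha and T are hypothetical values; tune them per task.
    return alpha * distill + (1 - alpha) * student

# Example with random logits: batch of 8 samples, 10 classes
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(kd_loss(student_logits, teacher_logits, labels))
```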
What is Kullback-Leibler Divergence?
2022.05.10 - [Deep Learning/Basic] - Objective function
Reference
https://towardsdatascience.com/distilling-knowledge-in-neural-network-d8991faa2cdc
https://intellabs.github.io/distiller/knowledge_distillation.html
https://dsbook.tistory.com/324