Notable Deep Learning Models Worth Knowing

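Deep learning models are said to have a very long history, but rather than retelling that history in detail, this post starts with LeNet-5, a landmark paper with a CNN architecture, and AlexNet, the winner of the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), and then summarizes notable models up to 2023.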

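LLMs deserve separate treatment, so I leave them out and introduce only GPT, the starting point of ChatGPT, the best-known of them. The selection reflects my personal interests, such as Out-of-Distribution (OOD) detection, generation (Generative Adversarial Network, GAN), and Document Image Classification, along with material I collected from various lectures, so not everyone will agree with it. Still, I think most of these models are worth the time to study.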

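The fields where deep learning models are used most heavily are computer vision and natural language processing. Models shown in blue boxes are computer vision models, those in red are natural language models, and those in purple are multimodal models.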

Each model targets a particular problem, and that task is given in parentheses after the model name in the lists below.

I hope this helps in choosing a direction or topic for future study.

Computer Vision Models

1998

LeNet-5 (Image Classification)

2012

AlexNet (Image Classification)

2014

VGGNet (Image Classification)
GoogLeNet = Inception v1 (Image Classification)
R-CNN (Object Detection)
SPPNet (Object Detection)
Show and Tell (Image Captioning)

2015

ResNet (Image Classification)
Fast R-CNN (Object Detection)
Faster R-CNN (Object Detection)
Show, Attend and Tell (Image Captioning)
FCN (Semantic Segmentation)
DeepLab (Semantic Segmentation)
DeconvNet (Semantic Segmentation)
U-Net (Image Segmentation)

2016

YOLO (Object Detection)
SSD (Single Shot MultiBox Detector) (Object Detection)
CAM (Class Activation Map) (Interpretable / Visualized Classification)
Occupancy Flow (IROS 2016) (Object Movement Prediction)

2017

ASPP = DeepLab V2 (Semantic Segmentation)
DenseNet (Image Classification)
PSPNet (Semantic Segmentation)
CycleGAN (GAN)
MobileNet (Image Classification)
ODIN (Out-of-Distribution (OOD) Detection)

2018

Neural Architecture Search (NAS) (Optimizing the Architecture of Neural Networks)
DCGAN (GAN; the paper itself dates from 2015 and appeared at ICLR 2016)
StarGAN (GAN)
StyleGAN (GAN)

2019

EfficientNet (Image Classification)
Grad-CAM (Interpretable / Visualized Classification)
HRNet (Human Pose Estimation)

2020

ViT (Image Classification)
DETR (Object Detection)
DINO (Self-Supervised Learning (SSL) and Knowledge Distillation; paper released in 2021)

2021

SETR (Semantic Segmentation)
SegFormer (Semantic Segmentation)
CoAtNet (Image Classification)
ConvNeXt (Image Classification; paper released in 2022)
Swin Transformer (Image Classification)

2023

Segment Anything Model (SAM) (Promptable Segmentation)

NLP Models

2014

Seq2Seq (Machine Translation)

2017

Transformer (Machine Translation)

2018

GPT (GLUE)
BERT (GLUE)

*GLUE = General Language Understanding Evaluation

It is a collection of diverse NLP tasks, such as QA, sentence similarity, and sentiment analysis, brought together to evaluate general-purpose language models.
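As a rough illustration of how a multi-task benchmark of this kind can be reduced to a single number, the sketch below averages the metrics within each task and then averages across tasks. The task names, the scores, and the `benchmark_score` helper are all hypothetical and chosen only for illustration. They are not real results, and the official GLUE leaderboard defines its own task set and metrics.

```python
from statistics import mean

# Hypothetical per-task scores (illustrative numbers only, not real results).
# A task may report more than one metric, so each value is a list.
task_scores = {
    "question_answering": [88.0],
    "sentence_similarity": [86.0, 84.0],
    "sentiment_analysis": [93.0],
}


def benchmark_score(scores_by_task):
    """Average the metrics within each task, then average across tasks."""
    per_task = [mean(metrics) for metrics in scores_by_task.values()]
    return mean(per_task)


overall = benchmark_score(task_scores)
print(f"overall score: {overall:.2f}")
```

Averaging per task first keeps a task that reports several metrics from counting more than a task that reports only one.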

Multimodal Models

2019

LayoutLM (Document Image Understanding)

2021

CLIP

2022

BLIP

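References:

Korea University XAI506: Deep Learning

Korea University XAI511: Neural Network

Korea University XAI601: Deep Learning Applications

[Upstage] Advanced AI Learning - Computer Vision lecture, Fast Campus & Upstage AI Lab, 1st cohort

Fast Campus: Computer Vision Paper Implementation and Algorithm Performance Optimization with SOTA Models, Learned Through Practical Cases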

https://en.wikipedia.org/wiki/Deep_learning
