Floating-point Numbers and Mixed Precision

I had never written up floating-point numbers (one of the ways a computer stores numbers) or mixed precision on this blog, so here is a brief summary.

I remembered the term mantissa, but for a moment I blanked on the term exponent.

As the figure above (taken from NVIDIA) shows, the float data type consists of a Sign, a Range (Exponent), and a Precision (Mantissa) field.

The sign is literally the sign, positive or negative. The range (exponent) field sets the magnitude of the number, and the precision (mantissa) field sets how accurately the number is represented.

A number is represented in the form $(-1)^{\text{sign}} \times 2^{E} \times M$.

The more bits are allocated to M, the more precise the representation becomes.
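
To make the formula concrete, here is a minimal sketch that splits an FP32 value into its three fields and rebuilds it. It assumes the standard IEEE 754 FP32 layout (1 sign bit, 8 exponent bits with a bias of 127, 23 stored mantissa bits); the helper names `decompose_fp32` and `reconstruct_fp32` are made up for this post.

```python
import struct

def decompose_fp32(x):
    """Split a value, encoded as IEEE 754 FP32, into its three bit fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits, the stored part of the mantissa
    return sign, exponent, fraction

def reconstruct_fp32(sign, exponent, fraction):
    """Apply (-1)^sign * 2^E * M. Normal numbers only (no zero, subnormal, inf, NaN)."""
    E = exponent - 127                # remove the bias
    M = 1 + fraction / 2**23          # implicit leading 1 plus the stored fraction
    return (-1) ** sign * 2.0**E * M

fields = decompose_fp32(-6.5)
print(fields)                         # (1, 129, 5242880)
print(reconstruct_fp32(*fields))      # -6.5
```

Here -6.5 is stored as sign 1, E = 129 - 127 = 2, and M = 1.625, so the formula gives -1 × 4 × 1.625.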

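Taking FP32 as the baseline, switching the unit of computation to TF32, FP16, BFLOAT16, and so on reduces the size of the model and of the gradients. (TF32 is the exception on size: it only shortens the mantissa used inside matrix operations, while the tensors themselves stay in 32-bit storage.)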

In other words, it is a trade-off between accuracy on one side and a smaller model and less computation on the other.
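
The trade-off can be read directly off the bit layouts. The sketch below assumes the usual exponent/mantissa splits for each format (FP32 8/23, TF32 8/10, FP16 5/10, BF16 8/7) and derives the largest normal value and the machine epsilon from them; `FORMATS` and `describe` are names invented for this example.

```python
# (exponent bits, mantissa bits) per format; one more bit is always the sign.
FORMATS = {
    "FP32": (8, 23),
    "TF32": (8, 10),
    "FP16": (5, 10),
    "BF16": (8, 7),
}

def describe(exp_bits, man_bits):
    """Return (largest normal value, machine epsilon) for a given layout."""
    bias = 2 ** (exp_bits - 1) - 1
    max_normal = (2 - 2.0 ** -man_bits) * 2.0 ** bias   # range comes from the exponent
    epsilon = 2.0 ** -man_bits                          # precision comes from the mantissa
    return max_normal, epsilon

summary = {name: describe(*layout) for name, layout in FORMATS.items()}
for name, (max_normal, epsilon) in summary.items():
    print(f"{name}: max ~ {max_normal:.3e}, epsilon = {epsilon:.3e}")
```

FP16 and TF32 share the same epsilon because they share the 10-bit mantissa, while TF32 and BF16 keep the FP32 range because they keep the 8-bit exponent.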

Mixed precision uses FP32 and FP16 together, out of these data types, to get as much of the benefit of both as possible.
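
One reason to keep FP32 around rather than going all-in on FP16 shows up in weight updates. The sketch below is only an illustration with made-up numbers (a weight of 1.0 and a per-step update of 1e-4), and it emulates FP16 and FP32 rounding with Python's `struct` module instead of using a real training framework.

```python
import struct

def round_to(fmt, x):
    """Round x to the format named by a struct code by packing and unpacking it."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]

def to_fp16(x):
    return round_to("<e", x)   # IEEE 754 half precision

def to_fp32(x):
    return round_to("<f", x)   # IEEE 754 single precision

update = 1e-4                  # hypothetical per-step weight update
w16 = to_fp16(1.0)             # weight kept only in FP16
w32 = to_fp32(1.0)             # weight kept in FP32

for _ in range(100):
    w16 = to_fp16(w16 + update)    # update is smaller than half an FP16 step near 1.0
    w32 = to_fp32(w32 + update)

print(w16)   # still 1.0: every update was rounded away
print(w32)   # close to 1.01
```

Near 1.0 the spacing between FP16 values is 2^-10 (about 0.001), so an update of 1e-4 disappears on every step, while the FP32 copy accumulates it.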

In PyTorch it is available through autocast, and in Hugging Face it can be configured through TrainingArguments.

Torch's autocast is used as follows.

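import torch
from torch import autocast, optim
from torch.cuda.amp import GradScaler  # newer PyTorch releases also expose this as torch.amp.GradScaler

# Net, loss_fn, data, and epochs are assumed to be defined elsewhere.
# Creates model and optimizer in default precision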
model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)

# Creates a GradScaler once at the beginning of training.
scaler = GradScaler()

for epoch in epochs:
    for input, target in data:
        optimizer.zero_grad()

        # Runs the forward pass with autocasting.
        with autocast(device_type='cuda', dtype=torch.float16):
            output = model(input)
            loss = loss_fn(output, target)

        # Scales loss.  Calls backward() on scaled loss to create scaled gradients.
        # Backward passes under autocast are not recommended.
        # Backward ops run in the same dtype autocast chose for corresponding forward ops.
        scaler.scale(loss).backward()

        # scaler.step() first unscales the gradients of the optimizer's assigned params.
        # If these gradients do not contain infs or NaNs, optimizer.step() is then called,
        # otherwise, optimizer.step() is skipped.
        scaler.step(optimizer)

        # Updates the scale for next iteration.
        scaler.update()
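
Why the loss is scaled before `backward()` can be seen with a single number. The sketch below is not PyTorch code: it emulates FP16 rounding with Python's `struct` module, and both the gradient value `1e-8` and the scale `1024.0` are hypothetical values picked for illustration.

```python
import struct

def to_fp16(x):
    """Round x to IEEE 754 half precision by packing and unpacking it."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

grad = 1e-8        # hypothetical true gradient, below the smallest FP16 subnormal
scale = 1024.0     # hypothetical loss scale (a power of two)

unscaled = to_fp16(grad)           # underflows to zero in FP16
scaled = to_fp16(grad * scale)     # the scaled gradient is representable
recovered = scaled / scale         # unscaling happens in higher precision

print(unscaled)    # 0.0
print(recovered)   # close to 1e-8
```

Scaling the loss scales every gradient by the same factor, so small gradients that would flush to zero in FP16 survive the backward pass and are divided back out before the optimizer step.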

Hugging Face's TrainingArguments is used as follows.

FP16 is enabled as follows.

from transformers import TrainingArguments

args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=16,
    gradient_checkpointing=True,
    fp16=True,
)

TF32 is enabled as follows (this example also turns on BF16).

import torch
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

from transformers import TrainingArguments

args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=16,
    gradient_checkpointing=True,
    bf16=True,
    tf32=True,
)
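
One practical difference between `fp16` and `bf16` is range. The sketch below emulates BF16 by keeping the top 16 bits of the FP32 encoding with round-to-nearest-even, which is an assumption about how a conversion might round, not a claim about what any particular GPU does. The helpers `to_bf16` and `fits_fp16` are made up for this post and only handle finite inputs.

```python
import struct

def to_bf16(x):
    """Round a finite x to BF16: keep the top 16 bits of its FP32 encoding."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    bits += 0x7FFF + ((bits >> 16) & 1)   # round to nearest, ties to even
    bits &= 0xFFFF0000                    # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def fits_fp16(x):
    """Return True if x can be packed as IEEE 754 half precision without overflow."""
    try:
        struct.pack("<e", x)
        return True
    except OverflowError:
        return False

value = 70000.0
print(fits_fp16(value))   # False: FP16 tops out at 65504
print(to_bf16(value))     # 70144.0: in range, but only about 3 significant digits
```

BF16 keeps the 8-bit exponent of FP32, so it covers the same range at lower precision, whereas FP16 overflows just above 65504.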

References:

https://www.nvidia.com/ko-kr/data-center/a100/

https://pytorch.org/blog/what-every-user-should-know-about-mixed-precision-training-in-pytorch/

https://docs.nvidia.com/deeplearning/performance/mixed-precision-training/index.html

https://pytorch.org/docs/stable/notes/amp_examples.html

https://www.youtube.com/watch?v=l8pRSuU81PU

https://blogs.nvidia.co.kr/blog/tensorfloat-32-precision-format/

https://introduce-ai.tistory.com/entry/FP32-TF32-FP16-BFLOAT16-Mixed-Precision%EC%97%90-%EB%8C%80%ED%95%9C-%EC%9D%B4%ED%95%B4

https://blogs.nvidia.com/blog/tensorfloat-32-precision-format/

https://inyongs.tistory.com/40

https://huggingface.co/docs/transformers/main/perf_train_gpu_one?mixed-precision=fp16
