CPU Monitoring on Ubuntu

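I used CPU multiprocessing to preprocess a large amount of NLP data.

These are my notes from looking into how to monitor the CPU on the server while that was running.

As far as I remember, htop was the most convenient.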

1. The top command

The most basic and most commonly used real-time CPU monitoring tool.

It shows CPU usage percentages, memory usage, running processes, and more, all updated in real time.

top

Output >

top - 12:35:11 up 1 min,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  24 total,   1 running,  23 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31975.2 total,  30918.7 free,    746.9 used,    622.3 buff/cache
MiB Swap:   8192.0 total,   8192.0 free,      0.0 used.  31228.2 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
      1 root      20   0   21712  12456   9416 S   0.0   0.0   0:00.21 systemd
      2 root      20   0    3060   1760   1760 S   0.0   0.0   0:00.00 init-systemd(Ub
     10 root      20   0    3076   1820   1760 S   0.0   0.0   0:00.00 init
     56 root      19  -1   42160  15080  14120 S   0.0   0.0   0:00.05 systemd-journal
    101 root      20   0   24872   5760   4800 S   0.0   0.0   0:00.05 systemd-udevd
    124 systemd+  20   0   21452  12800  10560 S   0.0   0.0   0:00.04 systemd-resolve
    127 systemd+  20   0   91020   7520   6720 S   0.0   0.0   0:00.02 systemd-timesyn
    164 root      20   0    4236   2560   2400 S   0.0   0.0   0:00.00 cron
    165 message+  20   0    9636   4960   4480 S   0.0   0.0   0:00.00 dbus-daemon
    169 ollama    20   0   66.0g  45120  24000 S   0.0   0.1   0:00.20 ollama
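
If these numbers are needed inside a script, the %Cpu(s) summary line can be parsed directly. The sketch below is a minimal Python example that assumes the English procps-ng layout shown above, with a period as the decimal separator. parse_top_cpu_line is a name made up for illustration and is not part of any library.

```python
import re


def parse_top_cpu_line(line: str) -> dict[str, float]:
    """Parse a top '%Cpu(s):' summary line into {label: percent}."""
    # Keep only the part after the first colon, e.g. "  0.0 us,  0.0 sy, ..."
    _, _, rest = line.partition(":")
    fields = {}
    # Each entry is a number followed by a two-letter label (us, sy, ni, id, ...)
    for value, label in re.findall(r"([\d.]+)\s+([a-z]{2})", rest):
        fields[label] = float(value)
    return fields


# Summary line copied from the top output above
sample = "%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st"
cpu = parse_top_cpu_line(sample)
busy = 100.0 - cpu["id"]  # everything that is not idle time
print(cpu)
print(f"busy: {busy:.1f}%")
```

To feed it live data, top can be run once in batch mode (top -b -n 1) and the line starting with %Cpu(s) passed to the function.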

Pressing h brings up the help screen below.

Help for Interactive Commands - procps-ng 4.0.4
Window 1:Def: Cumulative mode Off.  System: Delay 3.0 secs; Secure mode Off.

  Z,B,E,e   Global: 'Z' colors; 'B' bold; 'E'/'e' summary/task memory scale
  l,t,m,I,0 Toggle: 'l' load avg; 't' task/cpu; 'm' memory; 'I' Irix; '0' zeros
  1,2,3,4,5 Toggle: '1/2/3' cpu/numa views; '4' cpus abreast; '5' P/E-cores
  f,X       Fields: 'f' add/remove/order/sort; 'X' increase fixed-width fields

  L,&,<,> . Locate: 'L'/'&' find/again; Move sort column: '<'/'>' left/right
  R,H,J,C . Toggle: 'R' Sort; 'H' Threads; 'J' Num justify; 'C' Coordinates
  c,i,S,j . Toggle: 'c' Cmd name/line; 'i' Idle; 'S' Time; 'j' Str justify
  x,y     . Toggle highlights: 'x' sort field; 'y' running tasks
  z,b     . Toggle: 'z' color/mono; 'b' bold/reverse (only if 'x' or 'y')
  u,U,o,O . Filter by: 'u'/'U' effective/any user; 'o'/'O' other criteria
  n,#,^O  . Set: 'n'/'#' max tasks displayed; Show: Ctrl+'O' other filter(s)
  V,v,F   . Toggle: 'V' forest view; 'v' hide/show children; 'F' keep focused

  d,k,r,^R 'd' set delay; 'k' kill; 'r' renice; Ctrl+'R' renice autogroup
  ^G,K,N,U  View: ctl groups ^G; cmdline ^K; environment ^N; supp groups ^U
  Y,!,^E,P  Inspect 'Y'; Combine Cpus '!'; Scale time ^E; View namespaces ^P
  W,q       Write config file 'W'; Quit 'q'
          ( commands shown with '.' require a visible task display window )
Press 'h' or '?' for help with Windows,
Type 'q' or <Esc> to continue

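2. The htop command

An enhanced alternative to top. The display is color-coded, and clicking a row with the mouse highlights it in light blue.

The function keys give access to features such as search and filtering.

In particular, pressing F5 (Tree) arranges the processes into a parent-child tree, as shown in Figure 2, which I found convenient.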

htop

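Output >

The bars numbered 0 through 19 at the top each represent one logical CPU.

My CPU is an i5-13600, which has 14 physical cores (6 P-cores and 8 E-cores) and 20 threads, so htop shows 20 logical CPUs in total.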
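
The number of CPU bars htop draws can also be checked from Python, which is useful when deciding how many multiprocessing workers to start. This is a minimal sketch assuming Linux and Python 3. Leaving one CPU free for the rest of the system is only a common rule of thumb, not a requirement, and the variable names are illustrative.

```python
import os

# Logical CPUs (hardware threads) visible to the OS; may be None if unknown
logical = os.cpu_count()

# CPUs this process is actually allowed to run on (Linux-only API);
# this can be smaller than `logical` inside containers or under taskset
if hasattr(os, "sched_getaffinity"):
    usable = len(os.sched_getaffinity(0))
else:
    usable = logical or 1

# Leave one CPU free, but never go below one worker
workers = max(1, usable - 1)

print(f"logical CPUs: {logical}, usable: {usable}, suggested workers: {workers}")
```

The resulting value could then be passed as the processes argument of multiprocessing.Pool.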

F5 Tree output >

3. The vmstat command

Shows CPU, memory, and I/O statistics at 1-second intervals.

The first line of the output shows averages since the system booted.

vmstat 1

Output >

procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st gu
0  0      0 32108444   2924 176456    0    0   421    38  124    0  0  0 100  0  0  0
0  0      0 32108444   2924 176768    0    0     0     0   48   54  0  0 100  0  0  0
0  0      0 32108444   2924 176768    0    0     0     0   35   40  0  0 100  0  0  0
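
Because the first data row is an average since boot, it usually needs to be separated from the live samples when the output is logged or post-processed. The sketch below parses the vmstat output shown above into dictionaries keyed by column name. It assumes the two-line header layout from this procps-ng version, and parse_vmstat is a hypothetical helper name.

```python
def parse_vmstat(text: str) -> list[dict[str, int]]:
    """Turn vmstat output into one dict per data row, keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # lines[0] is the group header (procs/memory/...), lines[1] the column names
    header = lines[1].split()
    return [dict(zip(header, map(int, ln.split()))) for ln in lines[2:]]


# Output copied from the vmstat run above
sample = """
procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st gu
0  0      0 32108444   2924 176456    0    0   421    38  124    0  0  0 100  0  0  0
0  0      0 32108444   2924 176768    0    0     0     0   48   54  0  0 100  0  0  0
0  0      0 32108444   2924 176768    0    0     0     0   35   40  0  0 100  0  0  0
"""

rows = parse_vmstat(sample)
since_boot, live = rows[0], rows[1:]  # first row is the since-boot average

print("since boot:", since_boot["us"], since_boot["sy"], since_boot["id"])
for row in live:
    print("live:", row["us"], row["sy"], row["id"], "ctx switches:", row["cs"])
```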

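4. The mpstat command

Reports the state of each CPU core. On Ubuntu it is provided by the sysstat package.

The command below refreshes the per-core statistics every second.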

mpstat -P ALL 1

Output >

5. The sar command

Monitors CPU, memory, and I/O and prints the results as a report.

sar -u 1 5  # sample 5 times at 1-second intervals

Output >

6. Checking the /proc/stat file

Shows the raw CPU-related counters.

cat /proc/stat

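On each cpu line, the first column is the label (cpu for the total, cpuN for a single logical CPU). The values that follow are, in order:

user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice

All values are cumulative since boot and are counted in units of USER_HZ (typically 1/100 of a second).

user : time spent running user-space code
nice : time spent running user-space code at lower-than-default priority
system : time spent running kernel code
idle : time spent idle, excluding waits for I/O
iowait : time spent waiting for I/O to complete
irq : time spent servicing hard interrupts (hi in top)
softirq : time spent servicing soft interrupts (si in top)
steal, guest, guest_nice : virtualization-related time; all three are 0 in the output below

Output >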

cpu 245 0 457 1601721 192 0 76 0 0 0
cpu0 4 0 32 80079 3 0 48 0 0 0
cpu1 40 0 22 80035 32 0 3 0 0 0
cpu2 32 0 46 80020 21 0 6 0 0 0
cpu3 10 0 31 80051 35 0 8 0 0 0
cpu4 5 0 18 80100 12 0 6 0 0 0
cpu5 4 0 10 80116 1 0 0 0 0 0
cpu6 35 0 87 79991 10 0 0 0 0 0
cpu7 14 0 5 80114 0 0 1 0 0 0
cpu8 5 0 13 80103 8 0 0 0 0 0
cpu9 4 0 4 80125 1 0 0 0 0 0
cpu10 23 0 13 80092 4 0 0 0 0 0
cpu11 10 0 16 80092 9 0 1 0 0 0
cpu12 6 0 16 80091 15 0 0 0 0 0
cpu13 2 0 2 80127 2 0 0 0 0 0
cpu14 4 0 75 80042 2 0 0 0 0 0
cpu15 6 0 12 80102 10 0 0 0 0 0
cpu16 5 0 18 80104 6 0 0 0 0 0
cpu17 0 0 1 80130 0 0 0 0 0 0
cpu18 17 0 23 80083 7 0 0 0 0 0
cpu19 10 0 5 80116 3 0 0 0 0 0
intr 117943 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 865 1 1 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 157825
btime 1748575989
processes 883
procs_running 1
procs_blocked 0
softirq 113559 0 13614 2 366 0 0 53628 23766 0 22183
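
Since the cpu lines are cumulative counters, a usage percentage only makes sense as the difference between two snapshots. The sketch below shows that calculation. It assumes the field order documented in proc(5): user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice. The first snapshot uses values from the output above. The second snapshot is made up for illustration, and the function names are hypothetical.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_lines(text: str) -> dict[str, dict[str, int]]:
    """Map each 'cpu'/'cpuN' label in /proc/stat text to its named counters."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0].startswith("cpu"):
            stats[parts[0]] = dict(zip(FIELDS, map(int, parts[1:])))
    return stats


def busy_percent(before: dict[str, int], after: dict[str, int]) -> float:
    """Share of non-idle time between two snapshots of the same CPU."""
    # guest and guest_nice are already included in user and nice, so skip them
    keys = FIELDS[:8]
    total = sum(after[k] - before[k] for k in keys)
    idle = (after["idle"] - before["idle"]) + (after["iowait"] - before["iowait"])
    return 100.0 * (total - idle) / total if total else 0.0


# First snapshot: taken from the /proc/stat output above
first = parse_cpu_lines("cpu 245 0 457 1601721 192 0 76 0 0 0\n"
                        "cpu0 4 0 32 80079 3 0 48 0 0 0")
# Second snapshot: hypothetical values, invented for this example
second = parse_cpu_lines("cpu 345 0 507 1601771 192 0 76 0 0 0\n"
                         "cpu0 14 0 42 80099 3 0 48 0 0 0")

for label in first:
    print(label, f"{busy_percent(first[label], second[label]):.1f}% busy")
```

On a live system, the two snapshots would come from reading /proc/stat twice with a short time.sleep() in between, which is roughly what tools like mpstat report per interval.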

