Saturday, 31 January 2026

Mapping a Process ID to an OpenStack Instance ID

Hello guys, Ridwan here! Back again with a new notes update!

This time I want to share a few troubleshooting tips for OpenStack. As a sysadmin, you have probably experienced that moment when the monitoring system triggers an alert on one of the compute nodes (hypervisors) because of high resource usage.


The alert above only tells us that CPU usage has exceeded the normal threshold; it gives no further detail, such as what the cause is or which application is consuming the resources. To dig up those details, we need to go "back to basics" with the Red Hat RH124 course material "Monitoring and Managing Linux Processes".


1. Analyzing the Resource Usage Anomaly on the Compute Node

In the output of the "top" command you will see many qemu-kvm processes, because this server is a hypervisor where the VMs run.

top - 13:31:48 up 66 days, 23:50,  1 user,  load average: 126.82, 126.23, 128.78
Tasks: 3357 total,   4 running, 3353 sleeping,   0 stopped,   0 zombie
%Cpu(s): 52.4 us,  8.6 sy,  0.0 ni, 35.2 id,  0.0 wa,  1.9 hi,  1.8 si,  0.0 st
MiB Mem : 3094507.+total, 614593.1 free, 2479829.+used,  12083.9 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used. 614678.2 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 691730 qemu      20   0  518.2g 510.5g  22936 R  2646  16.9 565461:25 qemu-kvm
  75529 qemu      20   0   35.0g  31.8g  22584 S  1165   1.1 725669:37 qemu-kvm
 213227 qemu      20   0  132.2g 124.6g  22696 S 775.4   4.1 529504:26 qemu-kvm

Look at the first row, in the PID and %CPU columns. There is clearly a qemu-kvm process with PID 691730 using an abnormal amount of CPU (up to 2646%, since the value is summed across cores). This is what is driving the server's load average so high.
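If you prefer a non-interactive snapshot instead of running top, a one-shot ps can capture the busiest processes. A minimal sketch (works on any Linux host; on the hypervisor the top rows would be the qemu-kvm processes):

```shell
# Snapshot the five busiest processes, sorted by CPU usage descending.
# Columns: PID, owning user, %CPU, %MEM, and the command name.
top_procs=$(ps -eo pid,user,%cpu,%mem,comm --sort=-%cpu | head -n 5)
printf '%s\n' "$top_procs"
```

This is handy when collecting evidence from a script or over a flaky SSH session, since the output is plain text rather than a full-screen display.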


2. Finding the OpenStack Instance ID from the Process ID (PID)

[tripleo-admin@Openstack-Compute-2 ~]$ ps aux | grep 691730

tripleo+  578303  0.0  0.0   6408  2316 pts/27   S+   13:31   0:00 grep --color=auto 691730
qemu      691730 1201 16.8 543362420 535312312 ? Sl    2025 565465:21 /usr/libexec/qemu-kvm -name guest=instance-00003161,debug-threads=on -S -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-324-instance-00003161/master-key.aes"} -machine pc-q35-rhel9.0.0,usb=off,dump-guest-core=off,memory-backend=pc.ram -accel kvm -cpu Cascadelake-Server-noTSX -m 524288 -object {"qom-type":"memory-backend-ram","id":"pc.ram","size":549755813888} -overcommit mem-lock=off -smp 64,sockets=64,dies=1,cores=1,threads=1 -uuid 457990e4-8d26-4f18-9940-72f652b99572 -smbios type=1,manufacturer=Red Hat,product=OpenStack Compute,version=23.2.3-17.1.20231018130828.el9ost,serial=457990e4-8d26-4f18-9940-72f652b99572,uuid=457990e4-8d26-4f18-9940-72f652b99572,family=Virtual Machine -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=59,server=on,wait=off -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device {"driver":"pcie-root-port","port":16,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x2"} -device {"driver":"pcie-root-port","port":17,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x2.0x1"} -device {"driver":"pcie-root-port","port":18,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x2.0x2"} -device {"driver":"pcie-root-port","port":19,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x2.0x3"} -device {"driver":"pcie-root-port","port":20,"chassis":5,"id":"pci.5","bus":"pcie.0","addr":"0x2.0x4"} -device {"driver":"pcie-root-port","port":21,"chassis":6,"id":"pci.6","bus":"pcie.0","addr":"0x2.0x5"} -device {"driver":"pcie-root-port","port":22,"chassis":7,"id":"pci.7","bus":"pcie.0","addr":"0x2.0x6"} -device {"driver":"pcie-root-port","port":23,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x2.0x7"} -device 
{"driver":"pcie-root-port","port":24,"chassis":9,"id":"pci.9","bus":"pcie.0","multifunction":true,"addr":"0x3"} -device {"driver":"pcie-root-port","port":25,"chassis":10,"id":"pci.10","bus":"pcie.0","addr":"0x3.0x1"} -device {"driver":"pcie-root-port","port":26,"chassis":11,"id":"pci.11","bus":"pcie.0","addr":"0x3.0x2"} -device {"driver":"pcie-root-port","port":27,"chassis":12,"id":"pci.12","bus":"pcie.0","addr":"0x3.0x3"} -device {"driver":"pcie-root-port","port":28,"chassis":13,"id":"pci.13","bus":"pcie.0","addr":"0x3.0x4"} -device {"driver":"pcie-root-port","port":29,"chassis":14,"id":"pci.14","bus":"pcie.0","addr":"0x3.0x5"} -device {"driver":"pcie-root-port","port":30,"chassis":15,"id":"pci.15","bus":"pcie.0","addr":"0x3.0x6"} -device {"driver":"pcie-root-port","port":31,"chassis":16,"id":"pci.16","bus":"pcie.0","addr":"0x3.0x7"} -device {"driver":"pcie-root-port","port":32,"chassis":17,"id":"pci.17","bus":"pcie.0","addr":"0x4"} -device {"driver":"pcie-pci-bridge","id":"pci.18","bus":"pci.1","addr":"0x0"} -device {"driver":"piix3-usb-uhci","id":"usb","bus":"pci.18","addr":"0x1"} -blockdev {"driver":"host_device","filename":"/dev/dm-8","aio":"native","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"} -device {"driver":"virtio-blk-pci","bus":"pci.4","addr":"0x0","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1,"write-cache":"on","serial":"eb6afa3f-0f95-407e-a71f-94f013d98902"} -netdev {"type":"tap","fd":"62","vhost":true,"vhostfd":"64","id":"hostnet0"} -device {"driver":"virtio-net-pci","rx_queue_size":512,"host_mtu":8942,"netdev":"hostnet0","id":"net0","mac":"fa:16:3e:65:81:25","bus":"pci.2","addr":"0x0"} -netdev {"type":"tap","fd":"65","vhost":true,"vhostfd":"66","id":"hostnet1"} -device 
{"driver":"virtio-net-pci","rx_queue_size":512,"host_mtu":9000,"netdev":"hostnet1","id":"net1","mac":"fa:16:3e:63:c6:c3","bus":"pci.3","addr":"0x0"} -add-fd set=0,fd=61,opaque=serial0-log -chardev pty,id=charserial0,logfile=/dev/fdset/0,logappend=on -device {"driver":"isa-serial","chardev":"charserial0","id":"serial0","index":0} -device {"driver":"usb-tablet","id":"input0","bus":"usb.0","port":"1"} -device {"driver":"usb-kbd","id":"input1","bus":"usb.0","port":"2"} -audiodev {"id":"audio1","driver":"none"} -vnc 172.22.74.116:8,audiodev=audio1 -device {"driver":"virtio-vga","id":"video0","max_outputs":1,"bus":"pcie.0","addr":"0x1"} -device {"driver":"virtio-balloon-pci","id":"balloon0","bus":"pci.5","addr":"0x0"} -object {"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"} -device {"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.6","addr":"0x0"} -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on

The argument list of the qemu command is extremely long, yet all we actually need is the instance UUID.
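If you already know the PID, you can isolate just the -uuid argument from the process command line. A minimal sketch; the sample string below is a truncated, hypothetical /proc/&lt;pid&gt;/cmdline, which on the real hypervisor you would obtain with: tr '\0' ' ' < /proc/691730/cmdline

```shell
# Sample (truncated) qemu-kvm command line, as read from /proc/<pid>/cmdline.
cmdline='/usr/libexec/qemu-kvm -name guest=instance-00003161 -uuid 457990e4-8d26-4f18-9940-72f652b99572 -smp 64'

# Walk the arguments and print the value that follows "-uuid".
uuid=$(printf '%s\n' "$cmdline" | awk '{for (i = 1; i < NF; i++) if ($i == "-uuid") print $(i + 1)}')
echo "$uuid"   # 457990e4-8d26-4f18-9940-72f652b99572
```

The same awk idiom scales to all PIDs at once, which is exactly what the command below does.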

You can use the following command to find the instances with the highest CPU or memory usage (--sort=-%cpu or --sort=-%mem):

[tripleo-admin@Openstack-Compute-2 ~]$ ps aux --sort=-%cpu | awk 'NR==1 {print $1, $2, $3, $4, "UUID"} / -uuid / {for(i=1;i<=NF;i++) if($i=="-uuid") {print $1, $2, $3, $4, $(i+1)}}' | column -t
USER   PID      %CPU  %MEM  UUID
qemu   691730   2646  16.9  457990e4-8d26-4f18-9940-72f652b99572
qemu   75529    1165  1.1   e6175dac-d40f-4375-855c-b48632a28d68
qemu   213227   775.4 4.1   148f0a0b-b13d-4718-aff2-1bc2b11ec505
admin  1642873  0.0   0.0   /

The output of this command is much tidier and easier to read.

 

3. Checking the Instance ID via the OpenStack Client

Once we have the instance ID, the next step is to look it up via the OpenStack CLI client or the Horizon dashboard to get detailed information about that VM.

[stack@DIRECTOR ~]$ source overcloudrc
(overcloud) [stack@DIRECTOR ~]$ openstack server show 457990e4-8d26-4f18-9940-72f652b99572 --fit-width

+-------------------------------------+----------------------------------------------------------------------+
| Field                               | Value                                                                |
+-------------------------------------+----------------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                               |
| OS-EXT-AZ:availability_zone         | nova                                                                 |
| OS-EXT-SRV-ATTR:host                | compute-2                                                            |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-2                                                            |
| OS-EXT-SRV-ATTR:instance_name       | instance-00003161                                                    |
| OS-EXT-STS:power_state              | Running                                                              |
| OS-EXT-STS:task_state               | None                                                                 |
| OS-EXT-STS:vm_state                 | active                                                               |
| OS-SRV-USG:launched_at              | 2025-12-03T14:13:20.000000                                           |
| addresses                           | 2024399561-IT-Corp-DRC=10.24.216.154; VPC LAN CRM NG=192.168.100.253 |
| flavor                              | a1.xxxlarge.rc (df018f79-910f-4487-965b-2068585cb1ca)                |
| id                                  | 457990e4-8d26-4f18-9940-72f652b99572                                 |
| name                                | LAB-RDW001                                                           |
| project_id                          | 4af09b6255794f4bbc9b285fb5d7eb3d                                     |
| status                              | ACTIVE                                                               |
| user_id                             | 2ce9c0fc6163463b85c323ea07151e75                                     |
+-------------------------------------+----------------------------------------------------------------------+
Bingo! In the name field, we finally know that the VM causing the high load is named LAB-RDW001.
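As a follow-up check, you can also list every VM scheduled on the overloaded compute node. A sketch, assuming admin credentials are sourced and using the host name compute-2 from the output above:

```shell
# List all instances on the affected hypervisor, across all projects
# (both flags require admin privileges).
openstack server list --all-projects --host compute-2
```

This helps decide whether only one noisy neighbor needs attention or the node itself is oversubscribed.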

That's it for this short tutorial. Thank you for reading!

Hope it's useful!