Thanks for all your error reports, I didn't forget it. I'll cleanup my guide soon. Thanks again!

Exercise 28. Performance: getting performance stats, uptime, free, top

This exercise is pretty simple. First, what kind of performance stats do we need?

  1. CPU usage stats:
    1. How much is it loaded?
    2. Which processes are using it?
  2. Memory usage stats:
    1. How much memory is used?
    2. How much memory is free?
    3. How much memory is used for caching?
    4. Which processes consume it?
  3. Disk usage stats:
    1. How many input/output operations are performed?
    2. By which processes?
  4. Network usage stats:
    1. How many data is transferred?
    2. By which processes?
  5. Process stats:
    1. How many processes there are?
    2. What they are doing? Working, waiting for something?
    3. If waiting for something what is it? CPU, disks, network?

To get this stats we can use the following utilities:

  1. uptime — how long the system has been running.
  2. free — display amount of free and used memory in the system.
  3. vmstat — information about processes, memory, paging, block IO, traps, disks and cpu activity.
  4. top — display Linux tasks in real-time.

Let us examine this programs and their output.

uptime output:

user1@vm1:~$ uptime
#(1)      (2)                (3)                    (4)   (5)   (6)
 03:13:58 up 4 days, 22:45,  1 user,  load average: 0.00, 0.00, 0.00

Fields and descriptions:

Field Description
(1) Current time.
(2) Uptime (time passed since boot).
(3) How many users are currently logged in.
(4) CPU load in past 1 minute. This is not normalized, so a load average of 1 means a single CPU system is loaded all the time while on a 4 CPU system it means it was idle 75% of the time.
(5) CPU load in past 5 minutes.
(6) CPU load in past 15 minutes.

free output:

user1@vm1:~$ free -mt
#            (1)         (2)        (3)         (4)        (5)        (6)
             total       used       free     shared    buffers     cached
Mem:           496        267        229          0         27        196
#                         (7)        (8)
-/+ buffers/cache:         43        453
# 9
Swap:          461          0        461
# 10
Total:         958        267        691

Fields and descriptions:

Field Description
(1) Total amount of psysical memory.
(2) Total amount of used physical memory.
(3) Total mount of free physical memory.
(4) Shared memory column should be ignored; it is obsolete.
(5) Total RAM dedicated to cache disk block, and filesystem metadata.
(6) Total RAM dedicated to pages from file reading.
(7) Total amount of physical memory excluding buffers and cache, (2) - (5) - (6)
(8) Total amount of free physical memeory counting buffers and cache as free, (1) - (7)
(9) Swap file usage statistics.
(10) Total memory usage statistics, including swap

vmstat output:

user1@vm1:~$ vmstat -S M
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
#(1,2)  (3)    (4)    (5)   (6)    (7)   (8)   (9)  (10) (11) (12,13,14,15,16)
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0    229     27    196    0    0     0     0   11    6  0  0 100  0
 
user1@vm1:~$ vmstat -S M -a
#                    (17)    (18)
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa
 0  0      0     11    434     19    0    0    24     2   11    6  0  0 100  0
 
user1@vm1:~$ vmstat -d
#19    (20)    (21)    (22)    (23)  (24)    (25)    (26)    (27)   (28)    (29)
disk- ------------reads------------ ------------writes----------- -----IO------
       total merged sectors      ms  total merged sectors      ms    cur    sec
sda    11706    353  402980   17612   9303  40546  336358   46980      0     19
sr0        0      0       0       0      0      0       0       0      0      0
loop0      0      0       0       0      0      0       0       0      0      0
 
user1@vm1:~$ vmstat -m | head
#(30)                       (31)  (32)   (33)   (34)
Cache                       Num  Total   Size  Pages
ext3_inode_cache          13700  13700    808     10
ext3_xattr                    0      0     88     46
journal_handle              170    170     24    170
journal_head                 37     72    112     36
revoke_table                256    256     16    256
revoke_record               128    128     32    128
kmalloc_dma-512               8      8    512      8
ip6_dst_cache                16     24    320     12
UDPLITEv6                     0      0   1024      8

Fields and descriptions:

Mode Stats Field Description
Virtual memory Procs (1) r: The number of processes waiting for run time.
(2) b: The number of processes in uninterruptible sleep.
Memory (3) swpd: the amount of virtual memory used.
(4) free: the amount of idle memory.
(5) buff: the amount of memory used as buffers.
(6) cache: the amount of memory used as cache.
(17) inact: the amount of inactive memory.
(18) active: the amount of active memory.
Swap (7) si: Amount of memory swapped in from disk (/s).
(8) so: Amount of memory swapped to disk (/s).
I/O (9) bi: Blocks received from a block device (blocks/s).
(10) bo: Blocks sent to a block device (blocks/s).
System (11) in: The number of interrupts per second, including the clock.
(12) cs: The number of context switches per second.
CPU (13) us: Time spent running non-kernel code. (user time, including nice time)
(14) sy: Time spent running kernel code. (system time)
(15) id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
(16) wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.
Disk, -d Device (19) Device name
Reads (20) total: Total reads completed successfully
(21) merged: grouped reads (resulting in one I/O)
(22) sectors: Sectors read successfully
(23) ms: milliseconds spent reading
Writes (24) total: Total writes completed successfully
(25) merged: grouped writes (resulting in one I/O)
(26) sectors: Sectors written successfully
(27) ms: milliseconds spent writing
I/O (28) cur: I/O in progress
(29) s: seconds spent for I/O
Slab, -m Slab (30) cache: Cache name
(31) num: Number of currently active objects
(32) total: Total number of available objects
(33) size: Size of each object
(34) pages: Number of pages with at least one active object

top output:

#     (1)      (2)                (3)      (4)
top - 03:22:44 up 4 days, 22:54,  1 user,  load average: 0.00, 0.00, 0.00
#        (5)         (6)         (7)            (8)          (9)
Tasks:  63 total,   1 running,  62 sleeping,   0 stopped,   0 zombie
#        (10)     (11)     (12)    (13)      (14)     (15)     (16)     (17)
Cpu(s):  0.0%us,  1.1%sy,  0.0%ni, 98.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
#                (18)            (19)            (20)            (21)
Mem:    508820k total,   273792k used,   235028k free,    27844k buffers
#                (22)            (23)            (24)            (25)
Swap:   473080k total,        0k used,   473080k free,   201252k cached
 
#(26) (27)     (28)(29) (30) (31) (32,33) (34)(35)      (36) (37)
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      20   0  8356  804  676 S  0.0  0.2   0:05.99 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
    3 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0
    4 root      20   0     0    0    0 S  0.0  0.0   0:00.06 ksoftirqd/0
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
    6 root      20   0     0    0    0 S  0.0  0.0   0:03.25 events/0
    7 root      20   0     0    0    0 S  0.0  0.0   0:00.00 cpuset
<...>

Fields and descriptions:

Section Field Description
Uptime (1) Current time.
(2) Uptime (time passed since boot).
(3) How many users are currently logged in.
(4) CPU load in past 1, 5 and 15 minuts. This is not normalized, so a load average of 1 means a single CPU system is loaded all the time while on a 4 CPU system it means it was idle 75% of the time.
Tasks (5) Total number of running processes.
(6) Number of processes currently executing.
(7) Number of processes currently sleeping.
(8) Number of processes curretnly stopped (with CTRL+Z for example).
(9) Number of defunct (“zombie”) process, terminated but not reaped by its parent.
Cpu(s) (10) The time the CPU has spent running users' processes that are not niced.
(11) The time the CPU has spent running the kernel and its processes.
(12) The time the CPU has spent running users' proccess that have been niced.
(13) The time the CPU spent idle.
(14) Amount of time the CPU has been waiting for I/O to complete.
(15) The amount of time the CPU has been servicing hardware interrupts.
(16) The amount of time the CPU has been servicing software interrupts.
(17) The amount of CPU 'stolen' from this virtual machine by the hypervisor for other tasks (such as running another virtual machine).
Mem/swap (18) Total amount of psysical memory.
(19) Total amount of used physical memory.
(20) Total mount of free physical memory.
(21) Total RAM dedicated to cache disk block, and filesystem metadata.
(22,23,24) Total, used and free swap memory.
(25) Total RAM dedicated to pages from file reading.
Process (26) The task's unique process ID, which periodically wraps, though never restarting at zero.
(27) The effective user name of the task's owner.
(28) The priority of the task.
(29) The nice value of the task. A negative nice value means higher priority, whereas a positive nice value means lower priority. Zero in this field simply means priority will not be adjusted in determining a task's dispatchability.
(30) The total amount of virtual memory used by the task. It includes all code, data and shared libraries plus pages that have been swapped out and pages that have been mapped but not used.
(31) The non-swapped physical memory a task has used.
(32) The amount of shared memory used by a task. It simply reflects memory that could be potentially shared with other processes.
(33) The status of the task which can be one of: 'D' = uninterruptible sleep, 'R' = running, 'S' = sleeping, 'T' = traced or stopped, 'Z' = zombie.
(34) The task's share of the elapsed CPU time since the last screen update, expressed as a percentage of total CPU time.
(35) A task's currently used share of available physical memory.
(36) CPU Time, hundredths, The same as 'TIME', but reflecting more granularity through hundredths of a second.
(37) Command – Command line or Program name

There are lots of fields as you may see. Many of fields are present in several utilities, which duplicate functionality of each other somewhat. Normally you will need only small subset of this fields, but you need to know that there is this many (actually, a lot more) information about system performance available to you, because sometimes an obscure problem happens and to be able to solve it you need to know how to read this stats.

Now you will learn how to uses system performance utilities.

Do this

 1: uptime
 2: free
 3: vmstat
 4: ( sleep 5 && dd if=/dev/urandom of=/dev/null bs=1M count=30 && sleep 5 && killall vmstat )& vmstat 1
 5: uptime
 6: ( sleep 5 && dd if=/dev/zero of=test.img bs=32 count=$((32*1024*200)) && sleep 5 && killall vmstat )& vmstat -nd 1 | egrep -v 'loop|sr0'
 7: echo 3 | sudo tee /proc/sys/vm/drop_caches
 8: free -mt ; find / >/dev/null 2>&1 ; free -mt
 9: echo 3 | sudo tee /proc/sys/vm/drop_caches
10: cat test.img /dev/null ; free -mt

What you should see

user1@vm1:~$ uptime
 05:36:45 up 6 days,  1:08,  1 user,  load average: 0.00, 0.00, 0.00
user1@vm1:~$ free
             total       used       free     shared    buffers     cached
Mem:        508820     239992     268828          0        820     213720
-/+ buffers/cache:      25452     483368
Swap:       473080          0     473080
user1@vm1:~$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 268828    820 213720    0    0    21    10   14   11  0  0 100  0
user1@vm1:~$ ( sleep 5 && dd if=/dev/urandom of=/dev/null bs=1M count=30 && sleep 5 && killall vmstat )& vmstat 1
[1] 6078
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  1      0 268556    828 213736    0    0    21    10   14   11  0  0 100  0
 0  0      0 268556    828 213772    0    0    16     0   19   10  0  0 100  0
 0  0      0 268556    828 213772    0    0     0     0   13    8  0  0 100  0
 0  0      0 268556    828 213772    0    0     0     0   15   11  0  0 100  0
 0  0      0 268556    828 213772    0    0     0     0   14   10  0  0 100  0
 0  0      0 268556    828 213772    0    0     0     0   18   13  0  0 100  0
 1  0      0 267316    836 213844    0    0    74     0  267   26  0 99  1  0
 1  0      0 267316    836 213844    0    0     0     0  303    7  0 100  0  0
 1  0      0 267316    836 213844    0    0     0     0  271   11  0 100  0  0
 1  0      0 267316    836 213844    0    0     0     0  257   12  0 100  0  0
30+0 records in
30+0 records out
31457280 bytes (31 MB) copied, 4.95038 s, 6.4 MB/s
 0  0      0 267928    860 213860    0    0    27     0  265   29  1 97  2  0
 0  0      0 267936    860 213848    0    0     0     0   15    9  0  0 100  0
 0  0      0 267936    860 213848    0    0     0     0   14    7  0  0 100  0
 0  0      0 267936    860 213848    0    0     0     0   14    7  0  0 100  0
 0  0      0 267936    860 213848    0    0     0     0   13   11  0  0 100  0
Terminated
user1@vm1:~$ uptime
 05:22:15 up 6 days, 54 min,  1 user,  load average: 0.07, 0.02, 0.00
[1]+  Done                    ( sleep 5 && dd if=/dev/urandom of=/dev/null bs=1M count=30 && sleep 5 && killall vmstat )
user1@vm1:~$ uptime
 05:22:22 up 6 days, 54 min,  1 user,  load average: 0.06, 0.02, 0.00
user1@vm1:~$ ( sleep 5 && dd if=/dev/zero of=test.img bs=32 count=$((32*1024*200)) && sleep 5 && killall vmstat )& vmstat -nd 1 | egrep -v 'loop|sr0'
[1] 6086
disk- ------------reads------------ ------------writes----------- -----IO------
       total merged sectors      ms  total merged sectors      ms    cur    sec
sda   146985 2230744 21821320  105848  32190 1343154 10927338 1330144      0    105
sda   146995 2230744 21821648  105848  32190 1343154 10927338 1330144      0    105
sda   146995 2230744 21821648  105848  32190 1343154 10927338 1330144      0    105
sda   146995 2230744 21821648  105848  32190 1343154 10927338 1330144      0    105
sda   146995 2230744 21821648  105848  32190 1343154 10927338 1330144      0    105
sda   146995 2230744 21821648  105848  32190 1343154 10927338 1330144      0    105
sda   146999 2230744 21821680  105856  32190 1343154 10927338 1330144      0    105
sda   146999 2230744 21821680  105856  32190 1343154 10927338 1330144      0    105
sda   147000 2230744 21821688  105856  32208 1344160 10935530 1330288      0    105
sda   147000 2230744 21821688  105856  32274 1349214 10976490 1330748      0    105
sda   147000 2230744 21821688  105856  32325 1353259 11009258 1331236      0    105
sda   147000 2230744 21821688  105856  32450 1364657 11101442 1337176      0    105
sda   147000 2230744 21821688  105856  32450 1364657 11101442 1337176      0    105
sda   147001 2230744 21821696  105856  32471 1366301 11114762 1337348      0    105
sda   147001 2230744 21821696  105856  32525 1370529 11149018 1337732      0    105
sda   147001 2230744 21821696  105856  32573 1374577 11181786 1338064      0    105
sda   147001 2230744 21821696  105856  32698 1386562 11278666 1346244      0    105
6553600+0 records in
6553600+0 records out
209715200 bytes (210 MB) copied, 11.7088 s, 17.9 MB/s
sda   147001 2230744 21821696  105856  32698 1386562 11278666 1346244      0    105
sda   147001 2230744 21821696  105856  32698 1386562 11278666 1346244      0    105
sda   147001 2230744 21821696  105856  32698 1386562 11278666 1346244      0    105
sda   147001 2230744 21821696  105856  32698 1386562 11278666 1346244      0    105
sda   147001 2230744 21821696  105856  32762 1393910 11337962 1349192      0    105
user1@vm1:~$ echo 3 | sudo tee /proc/sys/vm/drop_caches
3
[1]+  Done                    ( sleep 5 && dd if=/dev/zero of=test.img bs=32 count=$((32*1024*200)) && sleep 5 && killall vmstat )
user1@vm1:~$ free -mt ; find / >/dev/null 2>&1 ; free -mt
             total       used       free     shared    buffers     cached
Mem:           496         30        466          0          0          5
-/+ buffers/cache:         24        472
Swap:          461          0        461
Total:         958         30        928
             total       used       free     shared    buffers     cached
Mem:           496         64        432          0         22          6
-/+ buffers/cache:         35        461
Swap:          461          0        461
Total:         958         64        894
user1@vm1:~$ echo 3 | sudo tee /proc/sys/vm/drop_caches
3
user1@vm1:~$ cat test.img /dev/null ; free -mt
             total       used       free     shared    buffers     cached
Mem:           496        230        265          0          0        205
-/+ buffers/cache:         24        471
Swap:          461          0        461
Total:         958        230        727
user1@vm1:~$

Explanation

  1. Prints out current uptime.
  2. Prints out free memory information.
  3. This one is interesting. It is better be thought like a sort of experiment. First, we launch ( sleep 5 && dd if=/dev/urandom of=/dev/null bs=1M count=30 && sleep 5 && killall vmstat )& in the background, and immediately after that we start vmstat in continuous mode so it will print out its information until interrupted. We are able to see that after 5 seconds since this command start CPU load increases dramatically for some time and then decreases, after another 5 seconds vmstat is killed.
  4. Prints out current uptime. Notice changes in load average.
  5. This is another experiment, almost the same as before, but this time with disk writes.
  6. Drops all caches and buffers.
  7. Another experiment. We want to see how reading all file and directory names in the system affects filesystem cache in memory, and we are able to see that it is cached in buffers, which is in accordance to theory.
  8. Drops all caches and buffers again.
  9. This time we want to see how reading a file affects filesystem cache in memory. We are able to see that files which are read are cached in cached section for increasing time of consequent access.

Extra credit

  1. Why in our first experiment not user, but system CPU usage goes up to 100?
  2. What this means? dd if=/dev/zero of=test.img bs=32 count=$( (32*1024*200) ).
  3. Launch top and press h. Now try sorting its output by CPU, memory and PID.

Discussion

Navigation

Learn Linux The Hard Way