Do Nagios NCPA memory stats match the output of the Linux utility free?
Many admins like sanity checks when investigating new tools. We sometimes hear the objection that NCPA memory stats don’t match the output of the Linux utility free, a statement that happens (wonderfully) to be both true and not true at the same time. The discrepancies mostly have to do with reporting units, conversions, and how everything other than total memory is determined.
On a Linux host we are monitoring, with the NCPA agent installed, we’ll run free. Then, from our XI box, we’ll run the NCPA memory check both as a regular check and as a manual check from the command line. Then we’ll compare the results.
Let’s start with the one place where NCPA memory metrics match the output of free: total memory. Running free on a test box, we get:
[root@localhost memory]# free
             total       used       free     shared    buffers     cached
Mem:       8193024    6984832    1208192          0     202888     974112
-/+ buffers/cache:    5807832    2385192
Swap:       262136          0     262136
It is important to know that free by default returns memory stats in kibibytes (1 KiB = 1,024 bytes). Yes, it is true that if you run man free on some distributions, it will say memory stats are given in kilobytes, but that is an old version of the man page.
NCPA by default returns memory stats in gibibytes, so in the XI interface after running the NCPA Wizard against the host, we are going to see a result like this:
So, how do we compare total memory between free and NCPA output in the XI interface?
The simplest thing to do is to search for a unit converter in your browser, BUT be sure to specify kibibytes and gibibytes as the units. I only point this out because I used incorrect units at least twice.
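The conversion is also easy to do by hand. The sketch below assumes nothing beyond the binary unit definitions (1 GiB = 1024 × 1024 KiB) and the total reported by free above; the helper name kib_to_gib is made up for illustration.

```python
# Binary (IEC) units: 1 GiB = 1024 MiB = 1024 * 1024 KiB.
KIB_PER_GIB = 1024 ** 2


def kib_to_gib(kib: float) -> float:
    """Convert kibibytes to gibibytes (hypothetical helper for this example)."""
    return kib / KIB_PER_GIB


total_kib = 8193024  # "total" column of the Mem: row printed by free
total_gib = kib_to_gib(total_kib)
print(f"{total_kib} KiB = {total_gib:.2f} GiB")  # 8193024 KiB = 7.81 GiB

# The mistake to avoid: dividing by powers of 1000 (kilobytes/gigabytes).
wrong_gb = total_kib / 1000 ** 2
print(f"Decimal-unit result, for contrast: {wrong_gb:.2f}")
```

The two results differ by almost 5 %, which is enough to make matching totals look like a mismatch.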
Alternately, you can run check_ncpa.py from the command line and specify output in kibibytes like this:
[root@centos7x64 ~]# /usr/local/nagios/libexec/check_ncpa.py -H 192.168.3.33 -t 'a' -P 5693 -M memory/virtual -u Ki -w 80 -c 90
The -u flag specifies the units; Ki stands for kibibytes. The check returns:
OK: Used memory was 67.40 % (Available: 2670920.00 KiB, Total: 8193024.00 KiB, Free: 1202984.00 KiB, Used: 5148448.00 KiB) |
But what about the free/used/available metrics?
I will concede the point that on these metrics, free and NCPA do not entirely agree, but there are simple reasons. The “free” memory value differs only a little between the two (1208192 KiB from free versus 1202984 KiB from NCPA, a gap of 5208 KiB, or about 5 MiB), and the difference is at least partially attributable to the small amount of memory load from NCPA checking memory.
free and NCPA calculate memory metrics differently. Why? That’s an interesting rabbit hole to go down, but suffice it to say that NCPA uses the Python library psutil, which defines available memory as “the memory that can be given instantly to processes without the system going into swap.”
That’s a handy metric. The NCPA percentage memory used calculation is ((total - available) / total) * 100, which gives admins a solid idea of how host memory is performing.
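As a quick check, the percentage can be recomputed from the figures in the check_ncpa.py output. This is a minimal sketch that assumes psutil's documented definition of percent, (total - available) / total * 100; the function name percent_used is made up for illustration.

```python
def percent_used(total: float, available: float) -> float:
    """Percentage of memory in use, following psutil's documented definition."""
    return (total - available) / total * 100


# Values (in KiB) taken from the check_ncpa.py output.
total_kib = 8193024.00
available_kib = 2670920.00

pct = percent_used(total_kib, available_kib)
print(f"Used memory was {pct:.2f} %")  # Used memory was 67.40 %
```

Note that the result depends on available, not on the free or used values, which is why it tracks how much memory processes can still claim.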
The very clever will notice that for NCPA none of (used + free), (used + available), or (used + free + available) sums to total memory in our example. Again, the psutil documentation is helpful here. Basically, they are not meant to sum.
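The arithmetic is easy to confirm with the numbers from this article. The sketch below only re-adds the figures printed by check_ncpa.py and by free; the dictionary and variable names are made up for illustration. For contrast, it also shows that in the older free layout used here, the Mem: row does sum to total and the "-/+ buffers/cache" row is derived from it.

```python
# Figures (KiB) from the check_ncpa.py output.
ncpa = {"total": 8193024, "available": 2670920, "free": 1202984, "used": 5148448}

ncpa_sums = {
    "used + free": ncpa["used"] + ncpa["free"],
    "used + available": ncpa["used"] + ncpa["available"],
    "used + free + available": ncpa["used"] + ncpa["free"] + ncpa["available"],
}
for label, value in ncpa_sums.items():
    matches = value == ncpa["total"]
    print(f"{label:<24} = {value:>8} KiB, equals total: {matches}")

# Figures (KiB) from the Mem: row of the free output.
mem = {"total": 8193024, "used": 6984832, "free": 1208192,
       "buffers": 202888, "cached": 974112}

# In this free layout, used + free on the Mem: row does equal total.
mem_row_sums = mem["used"] + mem["free"] == mem["total"]
print(f"free: used + free equals total: {mem_row_sums}")

# The "-/+ buffers/cache" row moves buffers and cache from used to free.
used_minus_cache = mem["used"] - mem["buffers"] - mem["cached"]
free_plus_cache = mem["free"] + mem["buffers"] + mem["cached"]
print(used_minus_cache, free_plus_cache)  # 5807832 2385192
```

None of the three NCPA combinations equals 8193024 KiB, while the free figures are internally consistent under their own accounting.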
Admins applying sanity checks to their NCPA results may indeed initially question the sanity of NCPA output. With a solid understanding of the units of measure in question, as well as what is actually being measured, admins can see that NCPA memory stats check out.