How to Monitor Hyper-V Performance using PNP4Nagios

At a high level, you need three things to run a trouble-free datacenter (even if your datacenter consists of two mini-tower systems stuffed in a closet): intelligent architecture, monitoring, and trend analysis. Intelligent architecture consists of making good purchase decisions and designing virtual machines that can appropriately handle their load. Monitoring allows you to prevent or respond quickly to emergent situations. Trend analysis helps you to determine how well your reality matches your projections and greatly assists in future architectural decisions. In this article, we’re going to focus on trend analysis. We will set up a data collection and graphing system called “PNP4Nagios” that will allow you to track anything that you can measure. It will hold that data for four years. You can display it in graphs on demand.

What You Get

I know that intro was a little heavy. So, to put it more simply, I’m giving you graphs. Want to know how much CPU that VM has been using? Trying to figure out how quickly your VMs are filling up your Cluster Shared Volumes? Curious about a VM’s memory usage? We have all of that.

Where I find it most useful: Getting rid of vendor excuses. We all have at least one of those vendors that claim that we’re not providing enough CPU or memory or disk or a combination. Now, you can visually determine the reasonableness of their demands.

First, the host and service screens in Nagios will get a new graph icon next to every host and service that track performance data. Also, hovering over one of those graph icons will show a preview of the most recent chart:


Second, clicking any of those icons will open a new tab with the performance data graph for the selected item.


Just as the Nagios pages periodically refresh, the PNP4Nagios page will update itself.

Additionally, you can do the following:

  • Click-dragging a section on a graph will cause it to zoom. If you’ve ever used the zoom feature in Performance Monitor, this is similar.
  • In the Actions bar, you can:
    • Set a custom time/date range to graph
    • Generate a PDF of the visible charts
    • Generate XML summary data
  • Create a “basket” of the graphs that you view most. The basket persists between sessions, so you can build a dashboard of your favorite charts

What You Need

Fortunately, you don’t need much to get going with PNP4Nagios.

Fiscal Cost

Let’s answer the most important question: what does it cost? PNP4Nagios does not require you to purchase anything. Their site does include a Donate button. If your organization finds PNP4Nagios useful, it would be good to throw a few dollars their way.

You’ll need an infrastructure to install PNP4Nagios on, of course. We’ll wrap that up into the later segments.


As its name implies, PNP4Nagios needs Nagios. PNP4Nagios installs alongside Nagios on the same system. We have a couple of walkthroughs for installing Nagios as a Hyper-V guest, divided by distribution.

The installation really doesn’t change much between distributions. The differences lie in how you install the prerequisites and in how you configure Apache. If you know those things about your distribution, then you should be able to use either of the two linked walkthroughs to great effect. If you’d rather see something on your exact distribution, the official Nagios project has stepped up its game on documentation. If we haven’t got instructions for your distribution, maybe they do. There are still things that I do differently, but nothing of critical importance. Also, being a Hyper-V blog, I have included special items just for monitoring Hyper-V, so definitely look at the post-installation steps of my articles.

Also, if you want to use SSL and Active Directory to secure your Nagios installation, we’ve got an article for that.

Disk Space

According to the PNP4Nagios documentation, each item that you monitor will require about 400 kilobytes once it has reached maximum data retention. That assumes that you will leave the default historical interval and retention lengths. More information can be found on the PNP4Nagios site. So, 20 systems with 12 monitors apiece will use about 96 megabytes.
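The arithmetic behind that estimate is easy to sketch in shell. The function name and its arguments are only illustrative, and the 400 KB figure is the approximate per-item size quoted above for default retention, so treat the result as a rough planning number rather than a guarantee.

```shell
#!/bin/sh
# Rough disk estimate for PNP4Nagios data at full retention.
# Assumes ~400 KB per monitored item, the approximate figure quoted above.
KB_PER_ITEM=400

# estimate_perfdata_kb HOSTS ITEMS_PER_HOST -> prints total kilobytes
estimate_perfdata_kb() {
    echo $(( $1 * $2 * KB_PER_ITEM ))
}

# Example: 20 systems with 12 monitors apiece.
total_kb=$(estimate_perfdata_kb 20 12)
echo "About $(( total_kb / 1000 )) MB (${total_kb} KB)"
```

Swap in your own host and service counts to size a volume before you start collecting.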

PNP4Nagios itself appears to use around 7 megabytes once installed and extracted.

Downloading PNP4Nagios

PNP4Nagios is distributed on Sourceforge:

As always, I recommend that you download to a standard workstation and then transfer the files to the Nagios server. Since I operate using a Windows PC and run Nagios on a Linux system, WinSCP is my choice of transfer tool.

On my Linux systems, I create a “Downloads” directory in my home folder and place everything there. The install portion of my instructions will be written using the file’s location as a starting point. So, for me, I begin with cd ~/Downloads.

Installing PNP4Nagios

PNP4Nagios installs quite easily.

PNP4Nagios Prerequisites

Most of the prerequisites for PNP4Nagios automatically exist in most Linux distributions. Most of the remainder will have been satisfied when you installed Nagios. The documentation lists them:

  • Perl, at least version 5. To check your installed Perl version: perl -v
  • RRDTool: This one will not be installed automatically or during a regular Nagios build. Most distributions include it in their mainstream repositories. Install with your distribution’s package manager.
    • CentOS and most other RedHat-based distributions: sudo yum install perl-rrdtool
    • SUSE-based systems: sudo zypper install rrdtool
    • Ubuntu and most other Debian-based distributions: sudo apt install rrdtool librrds-perl
  • PHP, at least version 5. This would have been installed with Nagios. Check with: php -v
  • GD extension for PHP. You might have installed this with Nagios. Easiest way to check is to just install it; it will tell you if you’ve already got it.
    • CentOS and most other RedHat-based distributions: sudo yum install php-gd
    • SUSE-based systems: sudo zypper install php-gd
    • Ubuntu and most other Debian-based distributions: sudo apt install php-gd
  • mod_rewrite extension for Apache. This should have been installed along with Nagios. How you check depends on whether your distribution uses “apache2” or “httpd” as the name of the Apache executable:
    • CentOS and most other RedHat-based distributions: sudo httpd -M | grep rewrite
    • Ubuntu, openSUSE, and most Debian and SUSE distributions: sudo apache2ctl -M | grep rewrite
  • There will be a bit more on this in the troubleshooting section near the end of the article, but if you’re running a more current version of PHP (like 7), then you may not have the XML extension built-in. I only ran into this problem on my Ubuntu installation. I solved it with this: sudo apt install php-xml
  • openSUSE was missing a couple of PHP modules on my system: sudo zypper install php-sockets php-zlib
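If you would rather see everything at once than check item by item, a small script can report what is present. This is only a sketch: the function name is made up for illustration, it checks the command-line PHP (which can load a different set of modules than the PHP that Apache uses), and it does not check Apache's mod_rewrite.

```shell
#!/bin/sh
# Report which PNP4Nagios prerequisites from the list above are present.
# check_pnp_prereqs is a hypothetical helper name, not part of PNP4Nagios.
check_pnp_prereqs() {
    # Command-line tools: Perl, PHP, and RRDTool.
    for tool in perl php rrdtool; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
        fi
    done
    # PHP extensions; "php -m" prints the modules loaded by the CLI binary.
    if command -v php >/dev/null 2>&1; then
        for ext in gd xml sockets zlib; do
            if php -m 2>/dev/null | grep -qix "$ext"; then
                echo "found:   php-$ext"
            else
                echo "missing: php-$ext"
            fi
        done
    fi
}
```

Run check_pnp_prereqs after defining it, then install anything reported as missing with the package commands listed above.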

If you are missing anything that I did not include instructions for, you can visit one of my articles on installing Nagios. If I haven’t got one for your distribution, then you’ll need to search for instructions elsewhere.

Unpacking and Installing PNP4Nagios

As I mentioned in the download section, I place my downloaded files in ~/Downloads. I start from there (with cd ~/Downloads). Start these directions in the folder where you placed your downloaded PNP4Nagios tarball.

  1. Unpack the tarball. I wrote these directions with version 0.6.26. Modify your command as necessary (don’t forget about tab completion!): tar xzf pnp4nagios-0.6.26.tar.gz
  2. Move to the unpacked folder: cd ./pnp4nagios-0.6.26/
  3. Next, you will need to configure the installer. Most of us can just use it as-is. Some of us will need to override some things, such as the Nagios user groups. To determine if that applies to you, open /usr/local/nagios/etc/nagios.cfg. Look for the following section:

    If both nagios_user and nagios_group are “nagios”, then you don’t need to do anything special.
    Regular configuration: ./configure
    Configuration with overrides: ./configure --with-nagios-user=naguser --with-nagios-group=nagcmd
    Other overrides are available. You can view them all with ./configure --help. One useful override would be to change the location of the emitted perfdata files to an auxiliary volume to control space usage. On my Ubuntu system, I needed to override the location of the Apache conf files: ./configure --with-httpd-conf=/etc/apache2/sites-available
  4. When configure completes, check its output. Verify that everything looks OK. Especially pay attention to “Apache Config File” — note the value because you will access it later. If anything looks off, install any missing prerequisites and/or use the appropriate configure options. You can continue running ./configure until everything suits your needs.
  5. Compile the program: make all. If you have an “oh no!” moment in which you realize that you missed something, you can still re-run ./configure and then compile again.
  6. Because we’re doing a new installation, we will have it install everything: sudo make fullinstall. Be aware that we are now using sudo. That’s because it will need to copy files into locations that your regular account won’t have access to. For an upgrade, you’d likely only want sudo make install. Please check the documentation for additional notes about upgrading. If you didn’t pay attention to the output file locations during configure, they’ll be displayed to you again.
  7. We’re going to be adding a bit of flair to our Nagios links. Enable the pop-up extension with: sudo cp ./contrib/ssi/status-header.ssi /usr/local/nagios/share/ssi/
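The decision in step 3 (plain ./configure versus the override form) can be sketched as a pair of shell functions that read nagios_user and nagios_group from nagios.cfg. The function names are hypothetical, and the sketch assumes the simple key=value layout that nagios.cfg uses; it only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Print the value of a "key=value" directive from a Nagios main config file.
# $1 = path to nagios.cfg, $2 = directive name (for example nagios_user)
nagios_cfg_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" \
        | sed 's/[[:space:]]*$//' \
        | tail -n 1
}

# Build the ./configure command line, adding overrides only when the
# configured user or group is something other than "nagios".
pnp_configure_command() {
    cfg=${1:-/usr/local/nagios/etc/nagios.cfg}
    user=$(nagios_cfg_value "$cfg" nagios_user)
    group=$(nagios_cfg_value "$cfg" nagios_group)
    cmd="./configure"
    if [ -n "$user" ] && [ "$user" != "nagios" ]; then
        cmd="$cmd --with-nagios-user=$user"
    fi
    if [ -n "$group" ] && [ "$group" != "nagios" ]; then
        cmd="$cmd --with-nagios-group=$group"
    fi
    printf '%s\n' "$cmd"
}
```

Calling pnp_configure_command with no arguments reads the default nagios.cfg path used throughout this walkthrough. Add any other overrides, such as --with-httpd-conf, by hand.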

Installation is complete. We haven’t wired it into Nagios yet, so don’t expect any fireworks.

Configure Apache Security for PNP4Nagios

If you just use the default Apache security for Nagios, then you can skip this whole section. As outlined in my previous article, I use Active Directory authentication. Really, all that you need to do is duplicate your existing security configuration to the new site. Remember how I told you to pay attention to the output of configure, specifically “Apache Config File”? That’s the file to look in.

My “fixed” file looks like this:

Only a single line needed to be changed to match my Nagios virtual directories.

Initial Verification of PNP4Nagios Installation

Before we go any further, let’s ensure that our work to this point has done what we expected.

  1. If you are using a distribution whose Apache enables and disables sites by symlinking from sites-available into sites-enabled, and you instructed PNP4Nagios to place its files in sites-available (ex: Ubuntu), enable the site: sudo a2ensite pnp4nagios.conf
  2. Restart Apache.
    1. CentOS and most other RedHat-based distributions: sudo service httpd restart
    2. Almost everyone else: sudo service apache2 restart
  3. If necessary, address any issues with Apache starting. For instance, Apache on my openSUSE box really did not like the “Order” and “Allow” directives.
  4. Once Apache starts correctly, access http://yournagiosserveraddress/pnp4nagios. Remember that you copied over your Nagios security configuration, so you will log in using the same credentials that you use on a normal Nagios site.
  5. Fix any problems indicated by the web page. Continue reloading the Apache server and the page as necessary until you get the green light:
  6. Remove the file that validates the installation: sudo rm /usr/local/pnp4nagios/share/install.php

Installation was painless on my CentOS and Ubuntu systems. openSUSE gave me more drama. In particular, it complained about “PHP zlib extension not available” and “PHP socket extension not available”. Very easy to fix: sudo zypper install php-sockets php-zlib. Don’t forget to restart Apache after making these changes.

Initial Configuration of Nagios for PNP4Nagios

At this point, you have PNP4Nagios mostly prepared to do its job. However, if you try to access the URL, you’ll get a message that says that it doesn’t have any data: “perfdata directory “/usr/local/pnp4nagios/var/perfdata/” is empty. Please check your Nagios config.” Nagios needs to start feeding it data.

We start by making several global changes. If you are comparing my walkthrough to the official PNP4Nagios documentation, be aware that I am guiding you to a Bulk + NPCD configuration. I’ll talk about why after the how-to.

Global Nagios Configuration File Changes

In the text editor of your choice, open /usr/local/nagios/etc/nagios.cfg. Find each of the entries that I show in the following block and change them accordingly. Some don’t need anything other than to be uncommented:


Next, open /usr/local/nagios/etc/objects/templates.cfg. At the end, you’ll find some existing commands that mention “perfdata”. After those, add the commands from the following block. If you don’t use the initial Nagios sample files, then just place these commands in any active cfg file that makes sense to you.
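After editing nagios.cfg, it can help to list which performance-data directives are actually active, since a line that is still commented out is easy to miss. The filter below is a sketch with a made-up function name; it assumes the standard Nagios directive names (process_performance_data and the host_perfdata_* / service_perfdata_* family) and does not judge whether the values are right for your setup.

```shell
#!/bin/sh
# Print the uncommented performance-data directives from a Nagios main
# config file. Commented-out lines (starting with #) are not matched.
show_perfdata_settings() {
    cfg=${1:-/usr/local/nagios/etc/nagios.cfg}
    grep -E '^[[:space:]]*(process_performance_data|(host|service)_perfdata_[a-z_]+)[[:space:]]*=' "$cfg"
}
```

If process_performance_data does not appear in the output, or appears with a value of 0, Nagios is not handing anything to PNP4Nagios yet.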

Configuring NPCD

The performance collection method that we’re employing involves the Nagios Perfdata C Daemon (NPCD). The default configuration will work perfectly for this walkthrough. If you need something more from it, you can edit /usr/local/pnp4nagios/etc/npcd.cfg. We just want it to run as a daemon:

Enable it to run automatically at startup.

  • Most Red Hat and SUSE based distributions: sudo chkconfig --add npcd
  • Ubuntu and most other Debian-based distributions: sudo update-rc.d npcd defaults

Configuring Hosts in Nagios for PNP4Nagios Graphing

If you made it here, you’ve successfully completed all the hard work! Now you just need to tell Nagios to start collecting performance data so that PNP4Nagios can graph it.

Note: I deviate substantially from the PNP4Nagios official documentation. If you follow those directions, you will quickly and easily set up every single host and every single service to gather data. I didn’t want that because I don’t find such a heavy hand to be particularly useful. You’ll need to do more work to exert finer control. In my opinion, that extra bit of work is worth it. I’ll explain why after the how-to.

If you followed the path of least resistance, every single host in your Nagios environment inherits from a single root source. Open /usr/local/nagios/etc/objects/templates.cfg. Find the define host object with a name of generic-host. Most likely, this is your master host object. Look at its configuration:

Now that you’ve enabled performance data processing in nagios.cfg, this means that Nagios and PNP4Nagios will now start graphing for every single host in your Nagios configuration. Sound good? Well, wait a second. What it really means is that it will graph the output of the check_command for every single host in your Nagios configuration. What is check_command in this case? Probably check_ping or check_icmp. The performance data that those output are the round-trip average and packets lost during pings from the Nagios server to the host in question. Is that really useful information? To track for four years?

I don’t really need that information. Certainly not for every host. So, I modified mine to look like this:

What we have:

  • Our existing hosts are untouched. They’ll continue not recording performance data just as they always have.
  • A new, small host definition called “perf-host”. It also does not set up the recording of host performance data. However, its “action_url” setting will cause it to display a link to any graphs that belong to this host. You can use this with hosts that have graphed services but you don’t want the ping statistics tracked. To use it, you would set up/modify hosts and host templates to inherit from this template in addition to whatever host templates they already inherit from. For example: use perf-host,generic-host.
  • A new, small host definition called “perf-host-pingdata”. It works exactly like “perf-host” except that it will capture the ping data as well. The extra bit on the end of the “action_url” will cause it to draw a little preview when you mouseover the link. To use it, you will set up/modify hosts and host templates to inherit from this template in addition to whatever host templates they already inherit from. For example: use perf-host-pingdata,generic-host.

Note: When setting the inheritance:

  • perf-host or perf-host-pingdata must come before any other host templates in a use line.
  • In some instances, including a space after the comma in a use line causes Nagios to panic if the name of the host does not also have a space (ex: you are using tabs instead of spaces on the name generic-host line). Make sure that none of your use directives have spaces after any commas and you will never have a problem. Ex: use perf-host,generic-host.
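To catch the space-after-comma problem before Nagios does, you can scan your object files for it. This is a minimal sketch; the function name is invented, and the pattern only looks at the part of a use line before any ; or # comment.

```shell
#!/bin/sh
# List "use" lines that have whitespace after a comma.
# Arguments: one or more Nagios object config files.
# Output: file:line-number:matching-line (nothing when all lines are clean).
find_spaced_use_lines() {
    # /dev/null is added so grep always prints the file name.
    grep -nE '^[[:space:]]*use[[:space:]]+[^;#]*,[[:space:]]' "$@" /dev/null
}
```

For example, find_spaced_use_lines /usr/local/nagios/etc/objects/*.cfg prints every offending line along with where to find it.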

Remember to check the configuration and restart Nagios after any changes to the .cfg files:
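One way to combine the two steps is a small wrapper that restarts Nagios only when the configuration check passes. This is a sketch under assumptions: the paths match the default /usr/local/nagios layout used in this walkthrough, the restart uses the same service command style shown for Apache, and the NAGIOS_BIN, NAGIOS_CFG, and NAGIOS_RESTART variables are hypothetical knobs for overriding those defaults.

```shell
#!/bin/sh
# Verify the Nagios configuration, then restart only if the check passes.
nagios_check_and_restart() {
    nagios_bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
    nagios_cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
    # "nagios -v <config>" runs the built-in configuration verification.
    if "$nagios_bin" -v "$nagios_cfg"; then
        ${NAGIOS_RESTART:-sudo service nagios restart}
    else
        echo "Configuration check failed; Nagios was not restarted." >&2
        return 1
    fi
}
```

Because a broken .cfg file stops the function before the restart, a typo leaves the running Nagios instance untouched.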

Couldn’t You Just Set a Single Root Host for Inheritance?

An alternative to the above would be:

In this configuration, perf-host inherits directly from generic-host. You could then have all of your other systems inherit from perf-host instead of generic-host. The problem is that even in a fairly new Nagios installation, a fair number of hosts already inherit from generic-host. You’d need to determine which of those you wanted to edit and carefully consider how inheritance works. If you’re going to all of that trouble, it seems to me that maybe you should just directly edit the generic-host template and be done with it.

Truthfully, I’m only telling you what I do. Do whatever makes sense to you.

Configuring Services in Nagios for PNP4Nagios Graphing

You’ll get much more use out of service graphing than host graphing. Just as with hosts, the default configuration enables performance graphing for all services. Not all services emit performance data, and you may not want data from all services that do produce data. So, let’s fine-tune that configuration as well.

Still in /usr/local/nagios/etc/objects/templates.cfg, find the define service object with a name of generic-service. Disable performance data collection on it and add a stub service that enables performance graphing:

When you want to capture performance data from a service, prepend the new stub service to its use line. Ex: use perf-service,generic-service. The warnings from the host section about the order of items and the lack of a space after the comma in the use line transfer to the service definition.

Remember to check the configuration and restart Nagios after any changes to the .cfg files:

Example Configurations

In case the above doesn’t make sense, I’ll show you what I’m doing.

Most of the check_nt services emit performance data. I’m especially interested in CPU, disk, and memory. The uptime service also emits data, but for some reason, it doesn’t use the defined “counter” mode. Instead, it’s just a graph that steadily increases at each interval until you reboot, then it starts over again at zero. I don’t find that terribly useful, especially since Nagios has its own perfectly capable host uptime graphs. So, I first configure the “windows-server” host to show the performance action_url. Then I configure the desired default Windows services to capture performance data.

My /usr/local/nagios/etc/objects/windows.cfg:

Now, my hosts that inherit from the default Windows template have the extra action icon, but my other hosts do not:

The same story on the services page; services that track performance data have an icon, but the others do not:


Troubleshooting your PNP4Nagios Deployment

Not getting any data? First of all, be patient, especially when you’re just getting started. I have shown you how to set up the bulk mode with NPCD which means that data captures and graphing are delayed. I’ll explain why later, but for now, just be aware that it will take some time before you get anything at all.
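While you wait, a quick way to see whether anything has arrived is to count the RRD files under the perfdata directory. The sketch below assumes the default perfdata location mentioned earlier and that processed data lands there as .rrd files; the function name is made up.

```shell
#!/bin/sh
# Count the RRD files that PNP4Nagios has written so far.
# $1 = perfdata directory (defaults to the standard install location)
count_rrd_files() {
    dir=${1:-/usr/local/pnp4nagios/var/perfdata}
    find "$dir" -type f -name '*.rrd' 2>/dev/null | wc -l | tr -d ' '
}
```

A result of 0 after a long wait suggests the data is not making it from Nagios to PNP4Nagios, which is the cue to move on to the checks below.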

If it’s been some time, say, 15 minutes, and you’re still not getting any data, download the verify_pnp_config file. Transfer it to your Nagios host. I just plop it into my Downloads folder as usual. Navigate to the folder where you placed yours, then run:

That should give you the clues that you need to fix most any problems.

I did have one leftover problem, but only on my Ubuntu system where I had updated to PHP 7. The verify script passed everything, but trying to load any PNP4Nagios page gave me this error: “Call to undefined function simplexml_load_file()”. I only needed to install the PHP XML package to fix that: sudo apt install php-xml. I didn’t look up the equivalent on the other distributions.

Plugin Output for Performance Graphing

To determine if a plugin can be graphed, you could just look at its documentation. Otherwise, you’ll need to manually execute it from /usr/local/nagios/libexec. For instance, we’ll just use the first one that shows up on an Ubuntu system, check_apt:


See the pipe character (|) there after the available updates report? Then the jumble of characters after that? That’s all in the standard format for Nagios performance charting. That format is:

  1. A pipe character after the standard Nagios service monitoring result.
  2. A human-readable label. If the label includes any special characters, the entire label should be enclosed in single quotes.
  3. An equal sign (=)
  4. The reported value.
  5. Optionally, a unit of measure.
  6. A semi-colon, optionally followed by a value for the warning level. If the warning level is visible on the produced chart, it will be indicated by a horizontal yellow line.
  7. A semi-colon, optionally followed by a value for the critical level. If the critical level is visible on the produced chart, it will be indicated by a horizontal red line.
  8. A semicolon, optionally followed by the minimum value for the chart’s y-axis. Must be the same unit of measure as the value in #4. If not specified, PNP4Nagios will automatically set the minimum value. If this value would make the current value invisible, PNP4Nagios will set its own minimum.
  9. A semicolon, optionally followed by the maximum value for the chart’s y-axis. Must be the same unit of measure as the value in #4. If not specified, PNP4Nagios will automatically set the maximum value. If this value would make the current value invisible, PNP4Nagios will set its own maximum.
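To make the field order concrete, here is a sketch of a shell function that splits a single performance-data item into its parts. The function name and the sample item are invented for illustration. It handles one label=value item at a time and assumes the label itself contains no equal sign, so it is not a full parser for everything a plugin can emit.

```shell
#!/bin/sh
# Split one performance-data item (label=value[UOM];warn;crit;min;max)
# into pipe-separated fields: label|value|uom|warn|crit|min|max
# The body runs in a subshell, so the IFS and globbing changes stay local.
parse_perfdata_token() (
    token=$1
    label=${token%%=*}    # everything before the first '='
    rest=${token#*=}      # everything after the first '='
    # Drop the single quotes that wrap labels containing special characters.
    label=$(printf '%s' "$label" | sed "s/^'//; s/'\$//")
    # Break the remainder apart on semicolons; skipped fields stay empty.
    set -f
    IFS=';'
    set -- $rest
    value_uom=$1; warn=$2; crit=$3; min=$4; max=$5
    # The value is the leading numeric part; whatever follows is the unit.
    value=$(printf '%s' "$value_uom" | sed 's/[^0-9.+-].*$//')
    uom=${value_uom#"$value"}
    printf '%s|%s|%s|%s|%s|%s|%s\n' \
        "$label" "$value" "$uom" "$warn" "$crit" "$min" "$max"
)

# Hypothetical sample item, not taken from a real plugin run:
parse_perfdata_token "'disk C'=42.5GB;80;90;0;100"
# prints: disk C|42.5|GB|80|90|0|100
```

Feeding it the text after the pipe character from one of your own plugins is a quick way to confirm that the plugin's output follows the format.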

This format is defined by Nagios and PNP4Nagios conforms to it. You can read more about the format at:

My plugins did not originally emit any performance data. I have been working on that and should hopefully have all of that work completed before you read this article.

My PNP4Nagios Configuration Philosophy

I had several decision points when setting up my system. You may choose to diverge as it meets your needs. I’ll use this section to explain why I made the choices that I did.

Why “Bulk with NPCD” Mode?

Initially, I tried to set up PNP4Nagios in “synchronous” mode. That would cause Nagios to instantly call on PNP4Nagios to generate performance data immediately after every check’s results were returned. I chose that initially because it seemed like the path of least resistance.

It didn’t work for me. I’m betting that I did something wrong. But, I didn’t get my problem sorted out. I found a lot more information on the NPCD mode. So, I switched. Then I researched the differences. I feel like I made the correct choice.

You can read up on the available modes yourself:

In synchronous mode, Nagios can’t do anything while PNP4Nagios processes the return information. That’s because it all occurs in the same thread; we call that behavior “blocking”. According to the PNP4Nagios documentation, that method “will work very good up to about 1,000 services in a 5-minute interval”. I assume that’s CPU-driven, but I don’t know. I also don’t know how to quantify or qualify “will work very good”. I also don’t know what sort of environments any of my readers are using.

Bulk mode moves the processing of data from per-return-of-results to gathering results for a while and then processing them all at once. The documentation says that testing showed that 2,000 services were processed in .06 seconds. That’s easier to translate to real-world systems, although I still don’t know the overall conditions that generated that benchmark.

When we add NPCD onto bulk mode, then we don’t block Nagios at all. Nagios still does the bulk gathering, but NPCD processes the data, not Nagios. I chose this method as it means that as long as your Nagios system is multi-core and not already overloaded, you should not encounter any meaningful interruption to your Nagios service by adding PNP4Nagios. It should also work well with most installation sizes. For really big Nagios/PNP4Nagios installations (also not qualified or quantified), you can follow their instructions on configuring “Gearman Mode”.

One drawback to this method: Your “4 Hour” charts will frequently show an empty space at the right of their charts. That’s because they will be drawn in-between collection/processing periods. All of the data will be filled in after a few minutes. You just may not have instant gratification.

Why Not Just Allow Every Host and Service to be Monitored?

The default configuration of PNP4Nagios results in every single host and every single service being enabled for monitoring. From an “ease-of-configuration” standpoint, that’s tempting. Once you’ve set the globals, you literally don’t have to do anything else.

However, we are also integrating directly with Nagios’ generated HTML pages. Whereas PNP4Nagios can determine that a service doesn’t have performance data because Nagios won’t have generated anything, the front-end just has an instruction to add a linked icon to every single service. So, if you just globally enable it, then you’ll get a lot of links that don’t work.

If you’re the only person using your environment, maybe that’s OK. But, if you share the environment, then you’ll start getting calls wanting you to “fix” all those broken links. It won’t take long before you’re spending more time than you’d like explaining (and re-explaining) that not all of the links have anything to show.

Why Not Just Change the Inheritance Tree?

If you want, you could have your performance-enabled hosts and services inherit from the generic-host/generic-service templates, then have later templates, hosts, and services inherit from those. If that works for you, then take that approach.

I chose to employ multiple inheritance as a way of overriding the default templates because it seemed like less effort to me. When I went to modify the services, I simply copied “perf-service,” to the clipboard and then selectively pasted it into the use line of every service that I wanted. It worked easier for me than a selective find-replace operation or manual replacement. It also seems to me that it would be easier to revert that decision if I make a mistake somewhere.

I can envision very solid arguments for handling this differently. I won’t argue. I just think that this approach was best for my situation.

How to Securely Monitor Hyper-V with Nagios and NSClient


I’ve provided some articles on monitoring Hyper-V using Nagios. In all of them, I’ve specifically avoided the topic of securing the communications chain. On the one hand, I figure that we’re only working with monitoring data; we’re not passing credit card numbers or medical history.

On the other hand, several of my sensors use parameterized scripts. If I didn’t design my scripts well, then perhaps someone could use them as an attack vector. Rather than pretend that I can ever be certain that I’ll never make a mistake like that, I can bundle the communications into an encrypted channel. Even if you’re not worried about the scripts, you can still enjoy some positive side effects.

What’s in this Article

The end effects of following this article through to completion:

  • You will access your Nagios web interface at an https:// address.
  • You can access your Nagios web interface using your Active Directory domain credentials. You can also allow others to use their credentials. You can control what any given account can access.
  • Nagios will perform NRPE checks against your Hyper-V and Windows Server systems using a secured channel

As you get started, be advised that this is a very long article. I did test every single line, usually multiple times. You will almost always need to use sudo. I tried to add it everywhere it was appropriate, but each time I proofread this article, I find another that I missed. You might want to just sudo -s right in the beginning and be done with it.

General Philosophy

When I started working on this article, I fully intended to utilize a Microsoft-based certificate authority. Conceptually, PKI (public key infrastructure) is extremely simple. But, from that simplicity, implementers run off and make things as complicated as possible. I have not encountered any worse offender than Microsoft. After several days struggling against it, I ran up against problems that I simply couldn’t sort out. After trying to decipher one too many of Microsoft’s cryptic and non-actionable errors (“Error 0x450a05a1: masking tape will not eat pearl soufflé from the file that cannot be named”), I finally gave up. So, while it should be possible to use a Microsoft CA for everything that you see here, I cannot demonstrate it. Be aware that Microsoft’s tools tend to output DER (binary) certificates. Choose the Base64 option when you can. You can convert DER to Base64 (PEM).
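For the DER-to-PEM conversion mentioned above, openssl's x509 subcommand can do the job. The wrapper below is a sketch; the function name and the file names in the usage note are placeholders, and it applies to certificates only, not to private keys.

```shell
#!/bin/sh
# Convert a DER (binary) certificate to Base64 (PEM).
# $1 = input DER certificate, $2 = output PEM file
der_to_pem() {
    openssl x509 -inform DER -in "$1" -outform PEM -out "$2"
}
```

Usage would look like der_to_pem mycert.cer mycert.pem, where both file names are stand-ins for your own.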

Rather than giving up entirely, I re-centered myself and adopted the following positions:

  • Most of my readers probably don’t want to go to the hassle of configuring a Microsoft CA anyway; many of you don’t have the extra Windows Server licenses for that sort of thing, either
  • We’re securing monitoring traffic, not state secrets. We can shift the bulk of the security responsibility to the endpoints
  • To make all of this more secure, one simply needs to use a more secure CA. The remainder of the directions stay the same

Some things could be done differently. In a few places, I’m fairly certain that I worked harder than necessary (i.e., could have used fewer arguments to openssl). Functionality and outcome were most important.

In general, certificates are used to guarantee the identity of hosts. Anything else (host names, IP addresses, MAC addresses, etc.) can be easily spoofed. In this case, we’re locking down a monitoring system. If someone manages to fool Nagios… uh… OK then. I am more concerned that, if you use the scripts that I provide, we are transmitting PowerShell commands to a service running with administrative privileges, and the transmission is sent in clear text. There are many safeguards to prevent that from being a security risk, but I want to add layers on top of that. So, while host authentication is always a good thing, my primary goal in this case is to encrypt the traffic. It’s on you to take precautions to lock down your endpoints. Maintain a good root password on your Linux boxes, maintain solid password protection policies, etc.

I borrowed heavily from many sources, but probably none quite so strongly as Nagios’ own documentation:


You need a Linux machine running Nagios. I wrote one guide for doing that on Ubuntu. I wrote another guide for doing that on CentOS. I have a third article forthcoming for doing the same with OpenSUSE. It’s totally acceptable for you to bring your own. The distributions aren’t so radically different that you won’t be able to figure out any differences that survive this article.

Also, for any of this to make sense, you need at least one Windows Server/Hyper-V Server system to monitor.

Have Patience! I can go on and on all day about how Microsoft makes a point of avoiding actionable error messages. In this case, they are far from alone. I lost many hours trying to decipher useless messages from openssl and NRPE. Solutions are usually simple, but the problems are almost always frustrating because the authors of these tools couldn’t be bothered to employ useful error handling. NSClient++ treated me much better, but even that let me down a few times. Take your time and remember that, even though there are an absurd number of configuration points, certificate exchange is fundamentally simple. Whatever problem you encounter is probably a small one.

Step 1. Acquire and Enable openssl

Every distribution that I used already had openssl installed. Just to be sure, use your distribution’s package manager. Examples:

You’ll probably get a message that you already have the package. Good!

Next, we need a basic configuration. You should automatically get one along with the default installation of openssl. Look for a file named “openssl.cnf”. It will probably be in /etc/ssl or /usr/lib/ssl. Linux can help you:
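A minimal sketch of that search, assuming a typical layout. `openssl version -d` reports the directory that openssl was built to read its configuration from. The `find` line scans the two locations named above plus /etc/pki, which is an added guess for CentOS-style systems.

```shell
# Ask openssl where it expects its configuration directory (OPENSSLDIR).
openssl version -d

# Search the usual locations; errors from missing or unreadable paths are discarded.
cnf_files="$(find /etc/ssl /usr/lib/ssl /etc/pki -name openssl.cnf 2>/dev/null || true)"
echo "${cnf_files:-no openssl.cnf found in the usual places}"
```

If the search comes back empty, widen the `find` starting points before concluding that the file is missing.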

If you haven’t got one, then maybe removing and reinstalling openssl will create it… I never tried that. You could also look for a sample file online. I’ll also provide the one that I used. Different sections of the file are used for different purposes. I’ll show each portion in context.

Set Up Your Directories and Environment

You will need to place your certificate files in a common place. First, look around the location where you found the openssl.cnf file. Specifically, check for “private” and “certs” directories. If they don’t exist, you can make some.

To keep things simple, I just dump everything there on systems that need a directory created. I will write the remainder of this document using that directory. If your system already has the split directories, use “private” to hold key files and “certs” to hold certificate files. Note that if you find these files underneath a “ca” path, that is for the certificate authority, not the client certificates that I’m talking about. I’ll specifically cover the certificate authority in the next section.

Step 2. Set Up a Certificate Authority

In this implementation, the Linux system that runs Nagios will also host a certificate authority. We’ll use that CA to generate certificates that Nagios and NRPE can use. Some people erroneously refer to those as “self-signed” because they aren’t issued by an “official” CA. However, that’s not the definition of “self-signed”. A self-signed certificate doesn’t have an authority chain. In our case, that term will apply only to the CA’s own certificate, which will then be used to sign other certificates. All of those will be authentic, not-self-signed certificates. As I describe it, you’ll use the same system for both your CA and Nagios, but you could just as easily spin up another Linux system to be the CA. You would only need to copy CSR, certificate, and key files across the separate systems as necessary to implement that.

Set Up Your Directories and Environment

You need places to put your CA’s files and certificates. openssl will require its own particular files. If you found some CA folders near your openssl.cnf, use those. Otherwise, you can create your own.

Configure your default openssl.cnf (sometimes openssl.conf). Note the file locations that I mentioned in the previous section. Mine looks like this:

Make certain that you walk through it and enter your localized settings in place of mine. Also note that I’ve removed the comment mark in front of “req_extensions = v3_req”.

Create the CA certificate:

You will first be asked to answer a series of questions. If you filled out the fields correctly, then you can just press [Enter] all the way through them. You will then be asked to provide a password for the private key. Even though we aren’t securing anything of earth-shattering importance, take this seriously.

Your CA’s private key is the most vital file out of all that you’ll be creating. We’re going to lock it down so that it can only be accessed by root:

The public key is included in the public cert file (ca_cert.pem). That can safely be read by anyone, anywhere, any time.
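The CA steps above can be sketched end to end in a scratch directory. Treat this as an illustration under assumptions, not the exact configuration used here: the key file name ca_key.pem, the subject “Demo Nagios CA”, and the tiny config file are all hypothetical stand-ins, and the real CA in this article lives in /var/ca. The `-passout` option supplies the key password non-interactively so that the sketch runs unattended; leave it off and openssl prompts for a password instead.

```shell
set -eu

# Scratch location so the sketch can run unprivileged (the article uses /var/ca).
CA_DIR="$(mktemp -d)/ca"
mkdir -p "$CA_DIR/newcerts"
: > "$CA_DIR/index.txt"        # empty certificate database
echo 01 > "$CA_DIR/serial"     # first serial number that the CA will issue

# Minimal stand-in for the [ req ] portion of openssl.cnf (hypothetical values).
cat > "$CA_DIR/ca_req.cnf" <<'EOF'
[ req ]
prompt             = no
distinguished_name = ca_dn
x509_extensions    = v3_ca

[ ca_dn ]
CN = Demo Nagios CA

[ v3_ca ]
basicConstraints = critical, CA:true
keyUsage         = critical, keyCertSign, cRLSign
EOF

# Create the CA key and its self-signed certificate in one step.
# A password on the command line is visible to other users: demo only.
openssl req -new -x509 -newkey rsa:2048 -days 3650 \
    -config "$CA_DIR/ca_req.cnf" \
    -keyout "$CA_DIR/ca_key.pem" -out "$CA_DIR/ca_cert.pem" \
    -passout pass:demo-only-change-me

# Private key: owner only (run as root so that the owner is root).
# Public certificate: readable by anyone.
chmod 600 "$CA_DIR/ca_key.pem"
chmod 644 "$CA_DIR/ca_cert.pem"

# A self-signed certificate shows the same name as subject and issuer.
openssl x509 -in "$CA_DIR/ca_cert.pem" -noout -subject -issuer
```

The last command is a quick sanity check: only the CA certificate should ever show identical subject and issuer lines.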

For bonus points, research setting up access to your new CA’s certificate revocation list (CRL). I did not set that up for mine.

Step 3. Set Your Managing Computer to Trust the CA

Your management computer will access the Nagios site that will be secured by your new CA. Therefore, your management computer needs to trust the certificates issued by that CA, or you’ll get warnings in every major browser.

For a Linux machine (client, not the Nagios server), check to see if /etc/ssl/certs contains several files. If it does (Ubuntu, openSUSE), just copy the CA cert there. You can rename the file so that it stands out better, if you like. Not every app on Linux will read that folder; you’ll need to find directions for those apps specifically.

If your Linux distribution doesn’t have that folder (CentOS), then look for /etc/pki/ca-trust/source/anchors. If that exists (CentOS), copy the certificate file there. Then, run:
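On CentOS-style systems, the trust store is rebuilt with `update-ca-trust`. The sketch below is a dry run by default, so it only prints what it would do. The target file name nagios_ca.pem is a hypothetical choice, and the `run` helper exists only for this illustration.

```shell
# Dry run by default: each command is printed rather than executed.
# Set DRY_RUN=0 and run as root to apply the changes for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

ANCHORS=/etc/pki/ca-trust/source/anchors

# Copy the CA certificate into the trust anchors, then rebuild the trust store.
run cp ca_cert.pem "$ANCHORS/nagios_ca.pem"
run update-ca-trust extract
```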

For a Windows machine:

  1. Use WinSCP to transfer the ca_cert.pem file to your Windows system (not the key; the key never needs to leave the CA).
  2. Run MMC.EXE as administrator.
  3. Click File->Add/Remove Snap-in.
  4. Choose Computer Account and click Next.
  5. Leave Local Computer selected and click Finish.
  6. Click OK back on the Add/Remove Snap-ins dialog.
  7. Back in the main screen, right-click Trusted Root Certification Authorities. Hover over All Tasks, then click Import.
  8. On the Welcome screen, you should not be allowed to change the selection from Local Machine.
  9. Browse to the file that you copied over using WinSCP. You’ll either need to change the selection to allow all files or you’ll need to have renamed the certificate to have a .cer extension.
  10. Choose Trusted Root Certification Authorities.
  11. Click Finish on the final screen.
  12. Find your new CA in the list and double-click it to verify.

The above steps can be duplicated for other computers that need to access the Nagios site. For something a bit more widespread, you can deploy the certificate using Group Policy Management Console. In the GPO’s properties, drill down to Computer Configuration\Windows Settings\Security Settings\Public Key Policies\Trusted Root Certification Authorities. You can right-click on that node and click Import to start the same wizard that you used above.

Note: Internet Explorer, Edge, and Chrome will use trusted root certificates from the Windows store. The authors of Firefox have decided that reinventing the wheel and maintaining a completely separate certificate store makes sense somehow. You’ll have to configure its trusted root certificate store within the program.

Step 4. Secure the Nagios Web Site

If you followed any of my earlier guides, you’re accessing your Nagios site over port 80 with Basic authentication. That means that any moderately enterprising soul can snag your Nagios site’s clear-text password(s) right out of the Ethernet. You have several options to fix that. I chose to use an SSL site while retaining Basic authentication. Your password still travels, but it travels encrypted. As long as you protect the site’s private key, an attacker should find cracking your password prohibitively difficult.

You could also use Kerberos authentication to have the Nagios site check your credentials against Active Directory. When that works, it appears that your password is protected, even using unencrypted HTTP. However, I could not find an elegant way to combine that with the existing file-based authentication. So, if you’re one of my readers at a smaller site with only one or two domain controllers and you lose your domain for some reason, you’d also lose your ability to log in to your monitoring environment. Also, managing Kerberos users in Nagios is kind of ugly. I didn’t find that a palatable option.

So, we’re going to keep the file-based authentication model and add LDAP authentication on top of it. You’ll be able to use your Active Directory account to log in to the Nagios site, but you’ll also be able to fall back to the existing “nagiosadmin” account when necessary.

One thing that I don’t demonstrate is updating the firewall to allow for port 443. Whatever directions you used to open up port 80, follow those for 443.

Create the Certificate for Apache

If you only use the one site address, then you can continue using the same openssl.cnf file from earlier steps. So, if I were using “” to access my site, then I would just proceed with what I have. However, I access my site with “”. I also have a handful of other sites on the same system. I (and you) could certainly create multiple certificates to handle them all. I chose to use Subject Alternate[sic] Names instead. That means that I create a single certificate with all of the names that I want. It means less overhead and micromanagement for me. Again, we’re not hosting a stock exchange, so we don’t need to complicate things.

You have two choices:

  1. Edit your existing openssl.cnf file with the differences for the new certificate(s).
  2. Copy your existing openssl.cnf file, make the changes to the copy, and override the openssl command to use the copied file.

I suppose a third option would be to hack at the openssl command line to manually insert what you want. That requires more gumption than I can muster, and I don’t see any benefits. I’m going with option 2.

Of course, it’s not a requirement to use nano. Use the editor you prefer.

The following shows sample additions to the file. They are not sufficient on their own!

The req_extensions line already exists in the default sample config, but has a hash mark in front of it to comment it out. Remove that (or type a new line, whatever suits you). The [ v3_req ] section probably exists; whatever it’s got, leave it. Just add the subjectAltName line. The [ alt_names ] segment won’t exist (probably). Add it, along with the DNS and IP entries that you want.

Note: The certificates we create now are not part of the CA. I sudo mkdir /var/certs to hold non-CA cert files. That’s a convenience, not a requirement. Follow the guidance from earlier.

If you’re copy/pasting, note that I used a .cnf file from /etc/ssl. Your structure may be different.

You will be asked to answer a series of questions like you did for the CA, with an additional query regarding a password and a company name. I would just [Enter] through both of those.

Verify that your CSR has the necessary Subject Alternate Names:

Secure the private key:
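As a rough sketch of the request, the SAN check, and the key lockdown, assuming a stripped-down config rather than a full copy of openssl.cnf. Every name here is a placeholder: nagios.example.test, the 192.0.2.10 address, and the nagios_site file names are hypothetical. `prompt = no` answers the questions from the file, so nothing is asked interactively.

```shell
set -eu
WORK="$(mktemp -d)"   # scratch directory; the article keeps these files in /var/certs

# Only the pieces that matter for Subject Alternative Names.
cat > "$WORK/nagios_site.cnf" <<'EOF'
[ req ]
prompt             = no
distinguished_name = req_dn
req_extensions     = v3_req

[ req_dn ]
CN = nagios.example.test

[ v3_req ]
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = nagios.example.test
DNS.2 = nagios
IP.1  = 192.0.2.10
EOF

# New unencrypted key plus a certificate signing request.
openssl req -new -newkey rsa:2048 -nodes \
    -config "$WORK/nagios_site.cnf" \
    -keyout "$WORK/nagios_site_key.pem" -out "$WORK/nagios_site.csr"

# Show the names that the CSR asks for.
SAN_TEXT="$(openssl req -in "$WORK/nagios_site.csr" -noout -text \
    | grep -A1 'Subject Alternative Name')"
echo "$SAN_TEXT"

# Only the owner may read the private key.
chmod 600 "$WORK/nagios_site_key.pem"
```

If the SAN line is missing from the output, the usual cause is that req_extensions is still commented out in the config file that openssl actually read.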

Submit the request to your new CA:

First, openssl will ask you to supply the password for the CA’s private key. Next, you’ll be shown a preview of the certificate and asked twice to confirm its creation.

The generated file will appear in your CA’s configured output directory (the one used in these directions is /var/ca/newcerts). It will use the next serial from your /var/ca/serial file as the name. So, if you’re following straight through, that will be /var/ca/newcerts/01.pem. You can ls /var/ca/newcerts to see them all. The highest number is the one that was just generated. Verify that it’s yours:

Transfer the certificate to whatever location that you’ll have Apache call it from, and, for convenience, rename it:
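The signing, verification, and copy steps can be rehearsed in a throwaway directory before touching the real CA. This is a sketch under stated assumptions: the scratch CA below stands in for /var/ca, its key is left unencrypted purely to keep the run unattended (so no password prompt appears), `-batch` skips the two confirmations, and `copy_extensions = copy` is one way to carry the requested alternative names into the issued certificate. All file and host names are hypothetical.

```shell
set -eu
WORK="$(mktemp -d)"; cd "$WORK"

# --- Scratch CA, condensed; stands in for /var/ca ---
mkdir -p ca/newcerts certs
: > ca/index.txt
echo 01 > ca/serial
cat > ca.cnf <<'EOF'
[ ca ]
default_ca = CA_default

[ CA_default ]
dir             = ./ca
new_certs_dir   = $dir/newcerts
database        = $dir/index.txt
serial          = $dir/serial
certificate     = $dir/ca_cert.pem
private_key     = $dir/ca_key.pem
default_md      = sha256
default_days    = 365
policy          = policy_loose
copy_extensions = copy

[ policy_loose ]
commonName = supplied

[ req ]
prompt             = no
distinguished_name = ca_dn
x509_extensions    = v3_ca

[ ca_dn ]
CN = Demo Nagios CA

[ v3_ca ]
basicConstraints = critical, CA:true
EOF
openssl req -new -x509 -newkey rsa:2048 -nodes -days 3650 \
    -config ca.cnf -keyout ca/ca_key.pem -out ca/ca_cert.pem

# --- Scratch CSR with one alternative name ---
cat > site.cnf <<'EOF'
[ req ]
prompt             = no
distinguished_name = req_dn
req_extensions     = v3_req

[ req_dn ]
CN = nagios.example.test

[ v3_req ]
subjectAltName = DNS:nagios.example.test
EOF
openssl req -new -newkey rsa:2048 -nodes -config site.cnf \
    -keyout certs/nagios_site_key.pem -out nagios_site.csr

# --- Submit the request to the CA ---
openssl ca -batch -notext -config ca.cnf -in nagios_site.csr

# The issued certificate is named after the serial that was used.
openssl x509 -in ca/newcerts/01.pem -noout -subject -issuer
openssl verify -CAfile ca/ca_cert.pem ca/newcerts/01.pem

# Copy it to where the web server will read it, under a friendlier name.
cp ca/newcerts/01.pem certs/nagios_site_cert.pem
```

After the run, ca/serial holds 02, which is why the next certificate issued lands in 02.pem.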

Tell Apache to Use SSL with the Certificate

Apache allows so much latitude in configuration that it appears to be complicated. Every distribution that installs Apache from repositories follows its own conventions, making things even more challenging. I’ll help guide you where possible. If you feel lost, just remember these things:

  • The last time that Apache finds a configuration setting overrides all previous configurations of that setting
  • Apache reads files in alphabetical order
  • Apache doesn’t care about file names, only extensions

So, any time that a configuration doesn’t work, that means that a later setting overrides. It might be further down in the same file or it might be in another file, but it’s out there somewhere. It might be in a file with a seemingly related name, but it might not be.

Start by locating the master Apache configuration file.

  • Ubuntu and OpenSUSE: /etc/apache2/apache2.conf
  • CentOS: /etc/httpd/conf/httpd.conf

This file will help you to figure out what extensions qualify a configuration file and which directories Apache searches for those configuration files.

We will take these basic steps:

  1. Enable SSL
  2. Instruct Apache to listen on ports 80 and 443
  3. Instruct Apache to redirect all port 80 traffic to port 443
  4. Secure all 443 traffic with the certificate that we created in the preceding section

Enable SSL in Apache

Your distribution probably enabled SSL already. Verify on Ubuntu with apache2 -M | grep ssl. Verify on CentOS/OpenSUSE with httpd -M | grep ssl. If you are rewarded with a return of ssl_module, then you don’t need to do anything else.

To enable Apache SSL on Ubuntu/OpenSUSE: sudo a2enmod ssl.

To enable Apache SSL on CentOS: sudo yum install mod_ssl.

Configure SSL in Apache

We could do all of steps 2-4 in a single file or across multiple files. I tend to do step 2 in a place that makes sense for the distribution, then steps 3 and 4 in the primary site configuration file. We could also spread out certificates across multiple virtual hosts. I’m not hosting tenants, so I tend to use one virtual host per site, but each uses the same certificate.

Remember, it doesn’t really matter where any of these things are set. The only thing that matters is that they are processed by Apache after any conflicting settings. Do your best to simply eliminate any conflicts. For instance, CentOS puts a lot of SSL settings in /etc/httpd/conf.d/ssl.conf. For that distribution, I left all of the settings it creates for defaults but commented out the entire VirtualHost item. I strongly encourage you to create a backup copy of any file before you modify it. Ex: cp /etc/httpd/conf.d/ssl.conf /etc/httpd/conf.ssl.conf.original

Somewhere, you need a Listen 443 directive. Most distributions will automatically set it when you enable SSL (look in ports.conf or a conf file with “ssl” in the name). However, I’ve had a few times when that only worked for IPv6. If you can’t get 443 to work on IPv4, try a Listen directive that names an IPv4 address explicitly along with port 443. This resolves step 2.

Next, we need a port 80 to 443 redirect. Apache has an “easy” Redirect command, but it’s too restrictive unless you’re only hosting a single site. In my primary site file, I create an empty port 80 site that redirects all inbound requests to an otherwise identical location using https:
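One common way to build such a redirect uses mod_rewrite. The sketch below assumes that module is enabled and may not match the exact lines used on this site. It writes the virtual host to a scratch file so that you can review it before pasting it into your real site file.

```shell
REDIRECT_CONF="$(mktemp)"   # scratch file; copy its contents into your site file

cat > "$REDIRECT_CONF" <<'EOF'
<VirtualHost *:80>
    # Send every plain-HTTP request to the same host and path over HTTPS.
    RewriteEngine On
    RewriteCond %{HTTPS} off
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
</VirtualHost>
EOF

cat "$REDIRECT_CONF"
```

Because the rule reuses the requested host name and path, the same block covers every site hosted on the server.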

This sequence sends a 301 code back to the browser along with the “corrected” URL. As long as the browser understands what to do with 301s (every modern browser does), then the URL will be rewritten right in the address bar. If you’re stuck for where to place this, I recommend:

  • On Ubuntu: /etc/apache2/sites-available/000-default.conf (symlinked from /etc/apache2/sites-enabled/)
  • On CentOS: /etc/httpd/conf.d/sites.conf
  • On OpenSUSE: /etc/apache2/default-server.conf

Wherever you put it, you need to verify that there are no other virtual hosts set to port 80. If there are, comment them out. You could also replace the 80 with 443, provided that you also add in the certificate settings that I’m about to show you.

After setting up the 80->443 redirect, you next need to configure a secured virtual host. It must do two things: listen on port 443 and use a certificate to encrypt traffic. Mine looks like this:
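A bare-bones secured virtual host, as a sketch. The server name and the certificate and key file names are hypothetical placeholders; point them at wherever you stored the files created in the preceding section. The block is written to a scratch file for review.

```shell
SSL_CONF="$(mktemp)"   # scratch file; copy its contents into your site file

cat > "$SSL_CONF" <<'EOF'
<VirtualHost *:443>
    ServerName nagios.example.test

    # Encrypt traffic with the certificate issued by the new CA.
    SSLEngine on
    SSLCertificateFile    /var/certs/nagios_site_cert.pem
    SSLCertificateKeyFile /var/certs/nagios_site_key.pem
</VirtualHost>
EOF

cat "$SSL_CONF"
```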

If you have other sites on the same host, create essentially the same thing but use the ServerName/ServerAlias fields to differentiate. For instance, my MRTG site is on the same server:

If you want, you can certainly use the instructions from the preceding section to create as many additional certificates as necessary for your other sites.

You’ve finished the hard work! Now just restart Apache. service apache2 restart on systems that name the service “apache2” (Ubuntu, OpenSUSE) or service httpd restart (CentOS). Test by accessing the site using an https prefix, then again with an http prefix to ensure that redirection works.

Step 5: Configure Nagios for Active Directory Authentication

Now that we’re securing the Nagios web site with SSL, we can feel a little bit better about entering Active Directory domain credentials into its challenge dialog. We have five phases for that process.

  1. Create (or designate) an account to use for directory reads.
  2. Select an OU to scan for valid accounts.
  3. Enable Apache to use LDAP authentication.
  4. Configure Apache directory security.
  5. Set Nagios to recognize Active Directory accounts.

Create an Account for Directory Access

Use PowerShell or Active Directory Users and Computers to create a user account. It does not matter what you call it. It does not matter where you put it. It does not need to have any group membership other than the default Domain Users group. It only requires enough powers to read the directory, which all Domain Users have by default. I recommend that you set its password to never expire or be prepared to periodically update your Nagios configuration files.

Once you’ve created it, you need its distinguished name. You can find that on the Attribute Editor tab in ADUC. You can also find it with Get-ADUser:


Keep the DN and the password on hand. You’ll need them in a bit.

Selecting OUs for Apache LDAP

When an account logs in to the web site, Apache’s mod_authnz_ldap will search for it within locations that you specify. You need to know the distinguished name of at least one organizational unit. Apache’s mod_ldap queries cannot run against the entire directory. I found many, many, many articles claiming that it’s possible, including Apache’s official document, but they are all lies (thanks for wasting hours of my time on searches and tests, though, guys, I always appreciate that).

It will, however, search multiple locations, and it can search downward into the child OUs of whatever OU you specify. Luckily for me, I have a special Accounts OU that I’ve created to organize user accounts. Hopefully, you have something similar. If not, you can use the default Users folder. You can do both.

I’ll show you how to connect to an OU and the default Users folder.


It is not necessary for the directory read account that you created in the first part of this section to exist in the selected location(s). The target location(s), or a sub-OU, only needs to contain the accounts that will log in to Nagios.

Once you’ve made your selection(s), you need to know the distinguished name(s). You can use the Attribute Editor tab like you did for the user, or Get-ADOrganizationalUnit:


Enabling LDAP Authentication in Apache

Apache requires two modules for LDAP authentication: authnz_ldap_module and ldap_module. You will probably need to enable them, but you can check in advance. On Ubuntu, use apache2 -M | grep ldap. On CentOS/OpenSUSE, use httpd -M | grep ldap. If you see both of these modules, then you don’t need to do anything else.

To enable Apache LDAP authentication on Ubuntu/OpenSUSE: sudo a2enmod authnz_ldap. You might also need to: sudo a2enmod ldap.

To enable Apache LDAP authentication on CentOS: yum install mod_ldap.

Make certain to perform the apachectl -M verification afterward to ensure that both modules are available.

Configuring Apache Directories to Use LDAP Authentication

Collect your OU DN(s), your user DN, and that user’s password. Now, we’re going to configure LDAP authentication sections in Apache. Again, you can put these in any conf file that pleases you. I usually find the distribution’s LDAP configuration file:

  • Ubuntu: /etc/apache2/mods-available/ldap.conf
  • CentOS: /etc/httpd/conf.modules.d/01-ldap.conf
  • OpenSUSE: no default file is created for the ldap module on OpenSUSE; you can create your own or add it to another, like /etc/apache2/global.conf

Warning: On Ubuntu, the files always exist in mods-available; when you run a2enmod, it symlinks them from mods-enabled. I highly recommend that you avoid the mods-enabled directory. Eventually, something bad will happen if you touch anything there manually (yes, that’s experience talking). Edit the files in mods-available.

My ldap.conf, for comparison:

The initial lines are not required, but can make the overall experience a bit smoother. I am just using defaults; I didn’t tune any of those lines. The only thing to be aware of is that Apache will be oblivious to changes that occur during cache timeouts — including lockouts and disables.

A breakdown of the AuthnProviderAlias sections:

  • In the opening brackets, AuthnProviderAlias and ldap must be the first two parts. We are triggering the authn provider framework and telling it that we’re specifically working with the ldap provider. Where I used ldap-accounts and ldap-users, you can use anything you like. I named them after the specific OUs that I selected. Whatever you enter here will be used as reference tags in directories.
  • For AuthLDAPBindDN, use the distinguished name of the read-only Active Directory user that you created at the beginning of this section. You can omit the quotes if you have no spaces in the DN, but I would recommend keeping them just in case.
  • For AuthLDAPBindPassword, use the password of the read-only Active Directory account. Do not use quotes unless there are quotes in the password. If your password contains a quote, I recommend changing it.
  • For AuthLDAPURL, use the distinguished name of the OU to search. Use one? instead of sub? if you don’t want it to search sub-OUs.

Note that the Users folder uses CN, not OU.
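Putting that breakdown together, an ldap.conf along these lines would define the two providers. This is a sketch with placeholder values: the domain example.test, the domain controller dc1.example.test, the nagiosread account, and the password are all hypothetical. The cache lines restate Apache’s documented defaults. The file is written to a scratch location for review.

```shell
LDAP_CONF="$(mktemp)"   # scratch file; copy its contents into your LDAP conf file

cat > "$LDAP_CONF" <<'EOF'
# Cache tuning (optional; these are the module defaults).
LDAPSharedCacheSize 500000
LDAPCacheEntries 1024
LDAPCacheTTL 600
LDAPOpCacheEntries 1024
LDAPOpCacheTTL 600

# Provider for a custom Accounts OU, searched including its child OUs.
<AuthnProviderAlias ldap ldap-accounts>
    AuthLDAPBindDN "CN=nagiosread,OU=Accounts,DC=example,DC=test"
    AuthLDAPBindPassword change-this-password
    AuthLDAPURL "ldap://dc1.example.test/OU=Accounts,DC=example,DC=test?sAMAccountName?sub?(objectClass=user)"
</AuthnProviderAlias>

# Provider for the default Users folder (a CN, not an OU).
<AuthnProviderAlias ldap ldap-users>
    AuthLDAPBindDN "CN=nagiosread,OU=Accounts,DC=example,DC=test"
    AuthLDAPBindPassword change-this-password
    AuthLDAPURL "ldap://dc1.example.test/CN=Users,DC=example,DC=test?sAMAccountName?sub?(objectClass=user)"
</AuthnProviderAlias>
EOF

cat "$LDAP_CONF"
```

The names ldap-accounts and ldap-users are the reference tags that the directory security settings use later.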

TLS/SSL/LDAPS Configuration for Apache and LDAP

You should be able to authenticate with TLS or LDAPS if configured in your domain. I couldn’t get that to work because of the state of my domain. I have made it work elsewhere, so I can confirm that it does work. If you want to try on your own, start by finding the “LogLevel” line in your Apache configs and bumping it up to “Debug” until you have it working, or you’ll have no idea why things don’t work. The logs are output to somewhere in /var/log/httpd or /var/log/apache2, depending on your configuration/distribution (the file is usually ssl_error_log, but it can be overridden, so you might need to dig a tiny bit). You can go through Apache’s documentation on this mod for some hints. You need at least:

  • LDAPTrustedGlobalCert CA_BASE64 /path/to/your/domain/CA.pem in some Apache file. I use the built-in ldap.conf or 01-ldap.conf for Apache. If you download the certificate chain from your domain CA’s web enrollment server, you can extract the subordinate’s certificate and convert it from P7B to PEM.
  • LDAPTrustedMode SSL in some Apache file if you will be using LDAPS on port 636. I normally keep it near the previous entry. Note: you can also just append SSL to any of the AuthLDAPURL entries for local configuration instead of global. In your AuthLDAPURL lines, you must change ldap: to ldaps: and append :636 to the hostname portion. Ex: AuthLDAPURL ldaps://,DC=siron,DC=int?sAMAccountName?sub?(objectClass=user)
  • LDAPTrustedMode TLS in some Apache file if you will be using TLS on port 389. I normally keep it near the previous entry. Note: you can also just append TLS to any of the AuthLDAPURL entries for local configuration instead of global.
  • In the <Directory> fields that attach to AD, you need: LDAPTrustedClientCert CERT_BASE64 /var/certs/your-local-system.pem. It might also work in the AuthnProviderAlias blocks; I haven’t yet been able to try.
  • You might need to add LDAPVerifyServerCert off to an Apache configuration. I don’t like that, because it eliminates the domain controller authentication benefit of using TLS or LDAPS. Essentially, if you can get openssl s_client -connect your.domain.address:636 -CAfile ca-cert-file-from-first-bullet.pem to work, then you will be fine.

The hardest part is usually keeping LDAPVerifyServerCert On. First, use openssl s_client -connect your.domain.controller:636 -CAfile your.addomain.cafile.pem. It will display a certificate. Paste that into a file and save it. Then, use openssl verify -CAfile your.addomain.cafile.pem followed by the name of the file that you just saved. If that says OK, then you should be able to get SSL/TLS to work.

Because security is our goal here and I couldn’t get TLS or LDAPS to work, I did run a Wireshark trace on the communication between the Nagios system and my domain controller. It does pass the user name in clear text, but it does not transmit the password in clear text. I don’t love it that the user names are clear, but I also know that there are much easier ways to determine domain user accounts than by sniffing Apache LDAP authentication packets. There are also easier ways to crack your domain than by spoofing a domain controller to your Nagios system. If you can’t get TLS or LDAPS to work, it won’t be the weakest link in your authentication system.

Note 1: Be very, very, very careful about typing. You’re handling your directory in read-only mode, so I wouldn’t worry about breaking the directory. What you need to worry about is the very poor error reporting in this module. I lost an enormous amount of time over a hyphen where an equal sign should have been. It was right on the side-scroll break of nano so I didn’t see it for a very long time. The only error that I got was AH01618: user esadmin not found: /., or whatever account I was trying to authenticate. If things don’t work, slow down, check for typos, check for overrides from other conf files.

Note 2: I will happily welcome verifiable assistance on improving this section. If you just throw URLs at me, they’d better contain something that I didn’t find on any of the 20+ pages that made big promises without delivering, and the directions had better work. For example, using port 3268 to authenticate against the global catalog vs. 389 LDAP or 636 LDAPS does not do anything special for trying to authenticate the entire directory.

Configure Apache Directory Security for LDAP Authentication

From here, the Apache portion is relatively simple. Assuming that you already have a Nagios directory configured, just compare with mine:

The default Nagios site created by the Nagios installer contains a lot of fluff, which I’ve removed. For instance, I don’t check the Apache version because I know what version it is. There’s only one major change, though: look at the AuthBasicProvider line. Yours, if you’re using the default, just says file. Mine also says ldap-users ldap-accounts. Those are the tags that I applied to the providers in the previous sub-section. By leaving file in there, I can still use the original “nagiosadmin” account, as well as any others that I might have created. If you create additional providers for other OUs, just keep tacking them onto the AuthBasicProvider lines.

On the AuthBasicProvider line, order is important. I placed file first because I want accounts to be verified there first. The majority of my accounts will be found in Active Directory, but the file is only a couple of lines and can be searched in a few milliseconds. If I need to reach out to the directory for an uncached account, that will cause a noticeable delay. For the same reason, order your LDAP locations wisely.
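A trimmed directory block along those lines, as a sketch. The sbin path and the htpasswd.users file are assumptions based on the usual Nagios source-install layout, so compare them against your existing site file rather than pasting blindly. The provider tags are the ones discussed above, with file listed first.

```shell
NAGIOS_CONF="$(mktemp)"   # scratch file; compare against your existing Nagios site file

cat > "$NAGIOS_CONF" <<'EOF'
<Directory "/usr/local/nagios/sbin">
    Options ExecCGI
    AllowOverride None
    AuthName "Nagios Access"
    AuthType Basic
    # Checked left to right: the local password file first, then each LDAP provider.
    AuthBasicProvider file ldap-users ldap-accounts
    AuthUserFile /usr/local/nagios/etc/htpasswd.users
    Require valid-user
</Directory>
EOF

cat "$NAGIOS_CONF"
```

Any other Nagios directory block in the same file that carries authentication settings needs the same AuthBasicProvider line.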


We’re not quite done; Nagios still doesn’t know what to do with these accounts. However, stop right now and go make sure that AD authentication is working.

Run sudo service apache2 restart or sudo service httpd restart, depending on your distribution. If Apache doesn’t restart successfully, use sudo journalctl -xe to find out why. Fix that, and move on. Once Apache successfully restarts, access your site at https://yournagiosite.yourdomain.yourtld. Log in using an Active Directory account inside a selected OU. You do not need to prefix it with the domain name.

If all is well, you should be greeted with the Nagios home page. Click any of the navigation links on the left. The pages should load, but you should not be able to see anything — no hosts, no services, nothing. If so, that means that Apache has figured out who you are, but Nagios hasn’t. You can double-check that at the top left of most any of the pages. For instance, on the Tactical Overview:


Do not move past this point until AD authentication works.

Configure Nagios to Recognize Active Directory Accounts

Truthfully, Nagios doesn’t know an AD account from a file account. All it knows is that Apache is delivering an account to it. It will then look through its registered contacts for a match. So, in /usr/local/nagios/etc/objects/contacts.cfg, I have:
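A contact definition of roughly this shape is what Nagios matches against. This is a sketch: the account name esadmin is borrowed from the error message quoted earlier, the e-mail address is a placeholder, and generic-contact is the template that ships in the sample contacts.cfg. The contact_name must match the logon name that Apache passes along.

```shell
CONTACT_CFG="$(mktemp)"   # scratch file; merge the definition into contacts.cfg

cat > "$CONTACT_CFG" <<'EOF'
define contact {
    contact_name    esadmin                 ; must match the AD logon name
    use             generic-contact         ; template from the sample config
    alias           Directory Admin Account
    email           esadmin@example.test
}
EOF

cat "$CONTACT_CFG"
```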

From there, add that account to groups, services, hosts, etc. as necessary. So, if your CFO wants a dashboard to show him that the accounting server is working, add his AD account accordingly. An account will only be shown its assigned items.

Remember, after any change to Nagios files, you must:

Note on cgi access: by default, only the “nagiosadmin” account can access the CGIs (most of the headings underneath the System menu item at the bottom left). That access is controlled by several “authorized_” lines in /usr/local/nagios/etc/cgi.cfg. As you become accustomed to using multiple accounts in Nagios, you’ll begin plopping them into groups for easier file maintenance. In this particular .cfg file, groups don’t mean anything. I found some Nagios documentation that insists that you can use groups in cgi.cfg, but I couldn’t make that work. You’ll have to enter each account name that you want to access any CGI.

Step 6: Configure check_nrpe and NSClient++ for SSL

After all that you’ve been through in this article, I hope that this serves as comfort: the rest is easy.

We’re going to take three major actions. First, we’ll create a “client” certificate for the check_nrpe utility, and then we’ll create a “server” certificate to be used with all of your NSClient++ systems. After that, we deploy the certificate to monitored systems and configure NSClient++ to use it.

Configure a Certificate for check_nrpe

This part is almost identical to the creation of the SSL certificate for the Apache site. You need to set up a config file to feed into openssl (or modify the default, but I don’t recommend that).

You have two choices:

  1. Edit your existing openssl.cnf file with the differences for the new certificate(s).
  2. Copy your existing openssl.cnf file, make the changes to the copy, and override the openssl command to use the copied file.

I suppose a third option would, again, be to hack at the openssl command line to manually insert what you want. I’m going with option 2 this time, as well.

Of course, it’s not a requirement to use nano. Use the editor you prefer.

The following shows sample replacements and additions to the file. They are not sufficient on their own!

The req_extensions line already exists in the default sample config, but has a hash mark in front of it to comment it out. Remove that (or type a new line, whatever suits you). The [ v3_req ] section probably exists; whatever it’s got, leave it. Just add the subjectAltName line. The [ alt_names ] segment won’t exist (probably). Add it, along with the DNS and IP entries that you want. I don’t know precisely how the check_nrpe tool presents itself to remote systems, or even if the monitored systems care beyond receiving a valid certificate, so set the DNS and IP entries to cover all of your bases.

Now, create the certificate request:

[Enter] through the questions (unless you need to change a default). Do not use a password when asked.

Lock the private key down:

Submit the CSR to your CA. I’m going to use the CA that I created on my Nagios system:

First, openssl will ask you to supply the password for the CA’s private key. Next, you’ll be shown a preview of the certificate and asked twice to confirm its creation.

The generated file will appear in your CA’s configured output directory (the one used in these directions is /var/ca/newcerts). It will use the next serial from your /var/ca/serial file as the name. So, if you’re following straight through, that will be /var/ca/newcerts/02.pem. You can ls /var/ca/newcerts to see them all. The highest number is the one that was just generated. Verify that it’s yours:

Transfer the certificate to the location that you’ll have check_nrpe load it from, and, for convenience, rename it:

Now “all” you need to do is go around and change every instance of check_nrpe in your object files to use the new certificate and never forget to use it on all new check_nrpe commands and change all those instances if you ever change something about the certificate. Who wants to do that? Oh, right, no one wants to do that.

So, let’s do this instead:

In the nano screen, paste this:


Note: In the original run of this document, I did not include the -L 'ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH' portion, and both encrypted and unencrypted NSClient++ installations worked just fine. Two weeks later, all checks began failing over cipher mismatches. There was no indication as to why they worked up until that moment. So, you may need to test your unencrypted systems WITHOUT the -L bit, then add it back in later. The $* after the -L is required.
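For illustration, a wrapper along these lines could look like the sketch below. It builds everything in a scratch directory and uses a stand-in for check_nrpe that only echoes its arguments, so you can try it without touching a live system. The certificate paths and file names are assumptions, and the -A, -C, and -K switches are the CA certificate, client certificate, and key options as I understand them from check_nrpe 3.x. I have written "$@" where the note says $*; both forward the caller's arguments, and the quoted form keeps arguments that contain spaces intact.

```shell
# Work in a scratch directory so nothing on the real system is touched.
workdir="$(mktemp -d)"

# Stand-in for the real check_nrpe. It only echoes what it receives.
cat > "$workdir/check_nrpe" <<'EOF'
#!/bin/sh
echo "check_nrpe $*"
EOF

# The wrapper. In production, the first line of the command would be the
# full path to the real plugin, e.g. /usr/local/nagios/libexec/check_nrpe.
# The /var/certs file names are hypothetical.
cat > "$workdir/check_nrpe_secure" <<'EOF'
#!/bin/sh
"$(dirname "$0")/check_nrpe" \
  -A /var/certs/ca.pem \
  -C /var/certs/nagios_client.pem \
  -K /var/certs/nagios_client.key \
  -L 'ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH' \
  "$@"
EOF

chmod +x "$workdir/check_nrpe" "$workdir/check_nrpe_secure"

# Show the command line that the wrapper would hand to check_nrpe.
output="$("$workdir/check_nrpe_secure" -H 192.0.2.20)"
echo "$output"
```

Keeping the certificate details in one wrapper means that a certificate change touches one file instead of every command definition.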

Assuming that you used a valid IP for a system running NSClient++ on the default port, you should receive something similar to:

It doesn’t matter if the NSClient++ system is still configured to use insecure mode (mind the note above). If it says that you used incorrect parameters, that means that you didn’t compile check_nrpe with SSL support. Go back and do that (I gave instructions in all of my Nagios articles; if you update to the most recent version of check_nrpe then SSL is implicitly configured). For any other problems, you may need additional parameters, ex: to override the port. Use whatever is working in your existing commands.

Once you have this working, you can go into all of your command files and swap out all instances of “check_nrpe” for “check_nrpe_secure” with a simple Find/Replace operation. That’s all that you’ll need to remember to do going forward. If you need to change anything about the certificate(s), then you only need to modify check_nrpe_secure itself.

Configure a Certificate for NSClient++

You’ve seen this game a couple of times now. The primary difference between this and previous certificate generation is that I will not be using any subject alternative names for the NSClient++ certificate. Working from the assumption that you don’t want to sit around generating CSRs and delivering certificates to every single monitored system, I’m going to have you create one certificate to use on all of your monitored hosts. I suppose that’s a slight security risk, because anyone with the certificate set could pretend to be one of your monitored hosts. I think if they can spoof your system that well, being able to fool Nagios will be the least of your concerns. This exercise satisfies our primary goal of encrypting the Nagios<->monitored system traffic with a nice touch of allowing the monitored systems to authenticate the Nagios system. Anything more is mostly extra effort.

Generate the CSR first. You can do this from pretty much anywhere, but I’ll continue to use the pattern that we’ve established in this article:

Of course, it’s not a requirement to use nano. Use the editor you prefer.

The following shows sample replacements to the file. They are not sufficient on their own!

All of the changes that we made for the other certificates are unnecessary here. In fact, you could probably feed in the default openssl.cnf and just change the commonName when prompted. To each their own.

Create the certificate request:

[Enter] through the questions (unless you need to change a default). Do not use a password when asked.

Locking down this particular private key file won’t do much because you’re going to be copying it to all of your monitored hosts anyway.

Submit the CSR to your CA. Again, I’m using my Nagios system’s CA:

First, openssl will ask you to supply the password for the CA’s private key. Next, you’ll be shown a preview of the certificate and asked twice to confirm its creation.

The generated file will appear in your CA’s configured output directory (the one used in these directions is /var/ca/newcerts). It will use the next serial from your /var/ca/serial file as the name. So, if you’re following straight through, that will be /var/ca/newcerts/03.pem. You can ls /var/ca/newcerts to see them all. The highest number is the one that was just generated. Verify that it’s yours:

Transfer the certificate and private key to some distribution point. While I’m not overly concerned about securing the key this time, do perform some basic due diligence. No need to just open the door and invite attackers in. I do generally copy it over to /var/certs just for consistency (avoids the “now, where did I put that” problem).

Configure NSClient++ for Certificate Security

Start by making sure that all checks that target this host use the new check_nrpe_secure. If they use the original check_nrpe without certificate information, all checks to the host will fail once you complete these steps.

You need three things to proceed: the certificate that you created for NSClient, the private key that you created for NSClient, and the certificate for the CA that signed the NSClient certificate. You do not need the CA’s private key. Just leave that where it is.

Copy the three files to C:\Program Files\NSClient++\security.

Edit your nsclient.ini file. Here’s mine:
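For orientation, a minimal sketch of the NRPE server section might resemble the following. It stages the fragment on the Linux side so that it can travel with the certificate files. The file names are placeholders, and the setting names and the ${certificate-path} variable (which, as I understand it, resolves to the security folder) reflect my reading of the NSClient++ NRPE server options; check them against your version before relying on them.

```shell
# Stage a fragment next to the certificate files for later distribution.
# The .pem/.key file names below are placeholders.
staging="$(mktemp -d)"

cat > "$staging/nsclient-nrpe-fragment.ini" <<'EOF'
[/settings/NRPE/server]
; Turn off the insecure legacy mode chosen during installation
insecure = false
use ssl = true
; Require the Nagios system to present a certificate signed by our CA
verify mode = peer-cert
ca = ${certificate-path}/ca.pem
certificate = ${certificate-path}/nsclient.pem
certificate key = ${certificate-path}/nsclient.key
allowed ciphers = ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH
EOF

cat "$staging/nsclient-nrpe-fragment.ini"
```

Merge lines like these into the existing section of the real file instead of adding a second section with the same name.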

Restart the service ( Restart-Service nscp or net stop nscp && net start nscp).

From here, you just need to come up with a deployment technique that transfers the certificates, key, and ini file to all other systems, then restarts the nscp service.


How to Monitor Hyper-V with Nagios: CentOS Edition



Are you monitoring your systems yet? I’m fairly certain that I make a big deal about that every few articles, don’t I? Last year, I wrote an article showing how to install and configure Nagios on Ubuntu Server. I’ve learned a few things in the interim. I also realize that not everyone wants to use Ubuntu Server. So, I set out on a mission to deploy Nagios on another popular distribution: CentOS. I’m here to share the procedure with you. Follow along to learn how to create your own monitoring system with no out-of-pocket cost. If you’re new to CentOS Linux on Hyper-V we suggest you check out the post.

Included in this Article

This article is very long because I believe in detailed instructions that help the reader understand why they’re typing something. It looks much worse than it is. I’ll cover:

  • A brief description of Nagios Core, the monitoring system
  • A brief description of NSClient++, the agent that operates on Windows/Hyper-V Servers to enable monitoring
  • Configuring a CentOS system to operate Nagios
  • Acquiring and installing the Linux packages
  • A discussion on the security of the NRPE plugin. Take time to skip down to this section and read it. The NSClient++/NRPE configuration that I will demonstrate in this article presents a real security concern. I believe that the risk is worthwhile and manageable, but you and/or your security team may disagree. Decide before you start this project, not halfway through.
  • Acquiring and installing the Windows packages
  • Configuring basic Nagios monitoring
  • Nagios usage

Not Included in this Article

While comprehensive, this article isn’t entirely all-encompassing. I am going to get you started on configuring monitoring, but you’re going to need to do some tinkering and investigation on your own. Building an optimal Nagios system requires practice more than rote instruction and memorization.

I have designed several monitoring scripts specifically for Hyper-V and failover clusters. These can be found in our subscriber’s area. Currently, the list includes:

  • Checking the free space of a Cluster Shared Volume
  • Checking the status of a Cluster Shared Volume (Redirected Access, etc.)
  • Checking the age of checkpoints
  • Checking the expansion percentage of dynamically-expanding VHD/Xs
  • Checking the health of a quorum witness

Since Nagios will alert you if any of these resources get into trouble, you can begin using those features without fear that they’ll break something while you’re not paying attention. Some of those monitors are shown in this grab of Nagios’ web interface:

Nagios Sample Data


What is Nagios Core?

Nagios Core is an open source software tool that can be used to monitor network-connected systems and devices. It processes data from sensors and separates the results into categories. By name, these categories are OK, Warning, and Critical. By default, Nagios Core sends a repeating e-mail when a sensor is in a persistent Warning or Critical state and a single “Recovery” e-mail when it has returned to the OK state.

Sensors collect data by “active” and/or “passive” checks. Nagios Core initiates active checks by periodically triggering plug-ins. Passive checks are when remote processes “call home” to the Nagios Core system to report status to a plug-in. The plug-in then delivers the sensor data to Nagios Core.

These plug-ins give Nagios Core its flexibility. Several plugins ship alongside Nagios Core. The Nagios community makes others available separately. Some are only included with the paid Nagios XI, which I will not cover. A plug-in is simply a Linux executable that collects information in accordance with its programming and returns data in a format that Nagios can parse.

Nagios Core provides multiple configurable options. One that we will be using is its web interface — a tiny snippet is shown in the screenshot above. This interface is not required, but grants you the ability to visually scan your environment from an overview level down to the individual sensor level. It also gives you other abilities, such as “Acknowledging” a Warning or Critical state and re-scheduling pending checks to make the next one occur very quickly (for testing) or much later (for repairs).

Isn’t Nagios Core Difficult?

Nagios Core has a reputation for being difficult to use, which I don’t think is appropriate. I believe that it got that reputation because you configure it with text files instead of in some flowery GUI. Nagios XI adds simpler configuration, but many will find that the cost jump from Core to XI makes editing text files more attractive.

Fortunately, the default installation includes templates that not only show you exactly what you need to do, but also give you the ability to set things up via copy/paste and only a bit of typing. Personally, I found the learning curve to be very steep but also very short. Overall, I find Nagios Core much easier to use than the monitoring component of Microsoft’s full-blown System Center Operations Manager.

From here on out, I’m only going to use “Nagios” to mean “Nagios Core”.

What is NSClient++?

NSClient++ is a small service application that resides on Windows systems and interacts with a remote Nagios system. Since Nagios runs on Linux, it cannot perform a number of common Windows tasks. NSClient++ bridges the gap. Of its many features, we will be using it as a target for the “check_nt” and “check_nrpe” Nagios plug-ins. Upon receiving active check queries from these two plug-ins, it performs the requested checks and returns the data to those plug-ins.

Prerequisites for Installing Nagios and NSClient++

I’ve done my best to make this a one-stop experience. You’ll need to bring these things for an optimal experience:

  • One installation of CentOS. You can use a virtual machine, with these guidelines:
    • 2 vCPU, 512MB startup RAM, 256MB minimum RAM, 1GB maximum RAM, and a 40GB disk. Mine uses around 600MB of RAM in production, a negligible amount of CPU, and the VHDX has remained under 2GB.
    • Assign a static IP or use a DHCP reservation. You will be configuring NSClient++ to restrict queries to that IP.
    • If you only have one Hyper-V host, find some piece of hardware to use for Nagios. If you don’t have anything handy, check Goodwill or garage sales. You don’t want your monitoring system to be dependent on your only Hyper-V system.
    • I recommend against clustering the virtual machine that holds your Nagios installation. The less it depends upon, the better. In a 2-node Hyper-V cluster, I configure one Nagios system on internal storage on one node and a second Nagios system on the second node that does nothing but monitor the first.
    • Refer to my prior article if you need assistance installing CentOS. It includes instructions for running it in Hyper-V.
  • Download NSClient++ for the Windows/Hyper-V Servers to monitor. If you only have a few systems, the MSI will be the easiest to work with. If you have many, you might want to get the ZIP for Robocopy distribution.
    • Note: If using the ZIP, install the latest VC++ redistributable on target systems. Without the necessary DLLs, the NSClient service will not run and does not have the ability to throw any errors to explain why it won’t run.
  • NRPE (Nagios Remote Plugin Executor). This will run on the Nagios system.
  • WinSCP (optional). You can get by without WinSCP, but it makes Nagios administration much easier. See my previously linked article on CentOS for a WinSCP primer.
  • PuTTY (optional). You could also get by without PuTTY, if you absolutely had to. I wouldn’t try it. The linked CentOS article includes a primer for PuTTY as well.
  • Download Nagios Core and the Nagios plugins to your management computer. More detailed instructions follow.

Software Versions in this Document

This article was written using the following software versions:

  • Nagios Core 4.3.1
  • Nagios Plugins 2.2.1
  • NSClient++
  • NRPE 3.1.0. For our purposes, a 2.x version would be fine as well because v3.x needs to downgrade its packets to talk to NSClient++ anyway.

Downloading Nagios Core and Transferring it to the Target System

Start on the Nagios home page. I’ll give step-by-step instructions that worked for me. Seeing as how this is the Internet, things might be different by the time you read these words. Your goal is to download Nagios Core and the Nagios Plugins.

  1. From the Nagios home page, hover over Downloads at the top right of the menu. Click Nagios Core.
  2. You’ll be taken to the editions page. Under the Core column, click Download.
  3. If you want to fill in your information, go ahead. Otherwise, there’s a Skip to download link.
  4. You should now be looking at a table with the latest release and the release immediately prior. At the far right of the table are the download links. For reference, the version that I downloaded said nagios-4.3.1.tar.gz. Click the link to begin the download. Don’t close this window.
  5. After, or while, the main package is downloading, you can download the plugins. You can hover over Downloads and click Nagios Plugins, or you can scroll down on the main package download screen to Step 2 where you’ll find a link that takes you to the same page.
  6. You should now be looking at a similar table that has a single entry with the latest version of Nagios tools. The link is at the far right of this table; the one that I acquired was nagios-plugins-2.2.1.tar.gz. Download the current version.
  7. If you didn’t already download NRPE, do so now.
  8. Connect to your target system in WinSCP (or whatever other tool that you like) and transfer the files to your user’s home folder. I tend to create a Downloads folder (keep in mind that Linux is case-sensitive), but it doesn’t really matter if you create a folder or what you call it as long as you can navigate the system well enough to find the files.

Note: You could use the wget application to download directly to your CentOS system. I never download anything from the Internet directly to a server.

Prepare CentOS for Nagios

Nagios depends on a number of other packages for its installation and operation. These steps were tested on a standard deployment of CentOS, but should also work on a minimal build.

Download and install prerequisite packages:

Then, we’ll create a user for operating Nagios.

Upon entering the passwd command, you’ll be asked to provide a password. I don’t want to tell you what to do, but you should probably keep note of it.

Next, we’ll create a security group responsible for managing Nagios and populate it with that new “nagios” user and the account that Apache runs under.

Install or Upgrade Nagios on CentOS

Now we’re ready to compile and install Nagios.

First, you need to extract the files.

Note: I am not using sudo for the extraction! If you run the extraction with sudo, then you will always need to use sudo to manipulate the extracted files.

Note: I am using the directory structure and versions from my WinSCP screenshot earlier. If you placed your files elsewhere or have newer versions, this is an example instead of something you can copy/paste.

Execute the following to build and install Nagios.

Note: If upgrading, STOP after sudo make install or your config files will be renamed and replaced with the new defaults!

Note: Do not copy/paste this entire block at once. Run each line individually and watch for errors in the output.
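To make the install-versus-upgrade difference concrete, here is a small sketch that only prints the steps in order, so that you can still run each one by hand and watch its output. The configure option and the make targets are the ones from the standard Nagios Core source build as I know it, nagcmd is an assumed group name, and the install/upgrade argument is a convention invented for this sketch.

```shell
# Print the build steps for a new install or for an upgrade.
# Nothing is executed; run the printed lines yourself, one at a time,
# from inside the extracted Nagios source folder.
nagios_build_steps() {
  echo "./configure --with-command-group=nagcmd"
  echo "make all"
  echo "sudo make install"
  # Upgrades stop here so that existing config files are not replaced.
  if [ "$1" = "upgrade" ]; then
    return 0
  fi
  echo "sudo make install-init"
  echo "sudo make install-config"
  echo "sudo make install-commandmode"
  echo "sudo make install-webconf"
}

nagios_build_steps upgrade
```

Calling it with install instead of upgrade lists the additional targets that a first-time installation needs.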

Only if new install. Set Nagios to start automatically when the system starts.

Only if upgrading. Instruct CentOS to refresh its daemons and restart the newly replaced Nagios executable.

Install or Upgrade Nagios Plugins on CentOS

The plugin installation process is similar to the Nagios installation process, but shorter and easier.

Unpack the files first. The same notes from the Nagios section are applicable.

Compile and install the plugins. The same notes from the Nagios section are applicable, especially the bit about taking this one line at a time.

You’ve just installed several plugins for Nagios, most of which I’m not going to show you how to use. If you’d like to take a look, navigate to /usr/local/nagios/libexec:


Most of them have built-in help that you can access with -h or --help:

You can also search on the Internet for assistance and examples.

Security and the NRPE Plugin

The next step is to compile and install the NRPE plugin. Before that, we need to stop and have a serious chat about it.

You might read in some places that the NRPE plugin is a security risk. That is correct. It allows one computer to tell another computer to run a script and return the results. Furthermore, we’re going to be sending arguments (essentially, parameters) to those scripts. Doing so opens the door to injection attacks. One method that has been used to combat the issue is NRPE traffic encryption. I am not going to be exploring how to encrypt NRPE communications at this time.

I have several reasons for this:

  • The simplest reason is that it’s difficult to do and I’m not certain of how much value is in the effort.
  • Encryption is often mistaken for data security when it is, in fact, more about data privacy. For example, if you transmit your password in encrypted format and the packet is intercepted, the attacker still has your password. The fact that it’s encrypted might be enough to put the attacker off, but any encryption can be broken with sufficient time and effort. Therefore, your password is only private in the sense that no casual observer will be able to see it. To keep it secure, you should not transmit it at all. We don’t really have that option. Because we are not encrypting, what an attacker could see is the command string and the result string. You’ll have full knowledge of what those are, so you can decide how serious that is to you. Our best approach is to ensure that the Nagios<->host communications chain only occurs on secured networks, even if we later enable SSL.
  • The author of NSClient++ had the good sense to ensure that you can’t operate just any old free-form script via NRPE. Scripts must be specifically defined and can be tightly controlled. If the script itself is sufficiently well-designed, a script injection attack should be prohibitively difficult. That still leaves the door open to data snooping, so take care in what data your checks return.
  • The author of NSClient++ also coded in the ability to restrict NRPE activities to specific source IP addresses. IP spoofing is possible, of course.
  • Windows, Linux, and/or hardware firewalls can help enforce the source and destination IP communications. Spoofing is still a risk, of course.
  • I ran a Wireshark trace on a Nagios-to-NSClient++ communications channel. Nothing was transmitted in clear text. There were changes made in NRPE 3.x that lead me to believe that it might be performing some encryption. Then again, it might just be Base64 encoding. Either way, no casual observer will be able to snoop it.

What I didn’t address in the above points is that NSClient++ could effectively authenticate the Nagios computer by only accepting traffic that was encrypted with its private key. So, yes, NRPE is a security risk and it is a higher risk without SSL. I won’t try to convince you otherwise.

I believe that, for internal systems, the risk is very manageable. If you’re going to be connecting to remote client sites, I would put the entire Nagios communications chain inside an encrypted VPN tunnel anyway because even if you encrypt NRPE, the other traffic is clear-text. The only people that I think should worry much about this are those that will be connecting Nagios to Hyper-V hosts using unsecured networks. Personally, I’m uncertain how a case could be made to do that even with SSL configured.

I’m not saying that I’ll never look into encrypting NRPE. Just not now, not in this article.

Install or Upgrade the NRPE Plugin on CentOS

For our purposes, the NRPE plugin requires little effort to install.

If you’re not already in the folder that contains the NRPE gzip file, return there. Unpack the file just like you did with the others.

Switch to the extracted folder. Compile and install the plugin. Part of the configure process includes creating a new SSL key. It requires several minutes to complete.

Verify that the plugin was created:

If CentOS responded by showing you a file, then all is well.

Connecting Apache to Nagios on CentOS

At this point, Nagios works and can begin monitoring systems. You’ll need to do some extra work to get the web interface going.

Start by creating an administrative web user account. This account belongs to Apache, not CentOS. As created in this article, it will have full access to everything in the web interface.

You will be prompted to enter a password for the “nagiosadmin” account. Be aware that the -c parameter causes the file to be created anew. If it already exists, it will be overwritten.

You also need to add the CentOS account that Apache uses to the security group that can access Nagios’ files:

Next, set Apache to start when the system boots:

Allow port 80 through the firewall:

By default, the Security-Enhanced Linux module will prevent several of Nagios’ CGI modules from operating. If you want to quickly get around that, simply disable SELinux. Since we’re not hosting an online banking website or anything, I feel that this is an appropriate solution:

When nano opens the config file, change the line that reads SELINUX=enforcing to SELINUX=disabled. Press [CTRL]+[X] when finished, then [Y] and [Enter] to exit. I tinkered with some of the other options and mostly managed to break my system. If you’re concerned, then this article might help you.
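If you would rather script that one-line change, the following sketch shows the substitution against a scratch copy. The sample file contents are an assumption about what a typical /etc/selinux/config contains; on a real system you would run the same sed expression against that file with sudo.

```shell
# Scratch copy standing in for /etc/selinux/config.
cfg="$(mktemp)"
printf '%s\n' '# sample contents' 'SELINUX=enforcing' 'SELINUXTYPE=targeted' > "$cfg"

# Change only the SELINUX= line; SELINUXTYPE= is left untouched.
sed 's/^SELINUX=enforcing$/SELINUX=disabled/' "$cfg" > "$cfg.new"
mv "$cfg.new" "$cfg"

# Show the result.
grep '^SELINUX=' "$cfg"
```

The file is only read at boot, so the change does not take effect until the system restarts.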

Apache on CentOS ships with a default page that we need to disable:

Start Apache:

Your Nagios installation can now be viewed at http://yourserver/nagios. If you want a quick test from the local command line:

If you get a 301 or a lot of HTML with an embedded message that your browser doesn’t work, that’s a good sign.

Configuring Apache to Use Nagios as the Default Site

If you don’t want to hang /nagios off of URL requests to your system, follow these directions.

Open /etc/httpd/conf.d/nagios.conf in any text editor. It’s a fairly large file, so WinSCP or Notepad++ will make the chore simpler. From nano:

Add the following lines, either at the beginning or the end of the file. I’ve highlighted two lines where you’ll want to substitute your system details for mine:
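As a hedged sketch, a virtual host block of roughly this shape would serve the Nagios pages from the site root. The host names are placeholders, the document root assumes the default /usr/local/nagios install location, and the ServerName and ServerAlias lines are the two that you would change. The block is written to a scratch file here so that you can review it before pasting it into nagios.conf.

```shell
# Write the candidate block to a scratch file for review.
vhost="$(mktemp)"

cat > "$vhost" <<'EOF'
<VirtualHost *:80>
    # Placeholder names: substitute your own
    ServerName nagios.example.internal
    ServerAlias nagios
    # Assumes Nagios was installed under /usr/local/nagios
    DocumentRoot /usr/local/nagios/share
</VirtualHost>
EOF

cat "$vhost"
```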

Restart Apache for the settings to take effect:

Your Nagios installation will now appear at http://yourserver, without the need to add /nagios. If you changed those first two lines and you add matching records to your internal DNS server, the system will also respond at the specified URL.

Configure Nagios to Send E-mail on CentOS

By default, Nagios will use the /usr/bin/mail executable to send e-mail. You need to configure Postfix for that to work. There are many ways that can be done, and I have neither the time nor the systems to test them all. Fortunately, a document already exists that can help you with the most common configurations. You can also find several how-tos from the postfix documentation page. I will show you how to get started and I’ll demonstrate the two methods that I know. For anything else, you’ll need to research on your own.

Initial Postfix Configuration on CentOS

It’s easy to get started:

The basic e-mail infrastructure is now on your system.

Relaying Mail Through a Friendly Mail Server

If you’ve got a mail server that will allow anonymous e-mail via port 25 connections (like an Exchange server that allows local addresses), you have very little to do.

Open /etc/postfix/ in a text editor. This is a large file and you’re going to be doing a lot of navigating, so choose your editor wisely. For nano:

Make these changes:

  • Uncomment the #myhostname line (by removing the #). Change it to: myhostname = myserver.mydomain.mytld (Substituting your server and domain information). This is the host name that it will present to the mail server.
  • Uncomment the #myorigin line. Change it to myorigin = mydomain.mytld (Substituting your domain information). E-mails sent by this server will append that domain to the user name.
  • Uncomment one of the #inet_interfaces lines or add a new one. Change it to: inet_interfaces = loopback-only. This sets this server to not receive any inbound e-mail.
  • After the #mydestination lines, add this: mydestination = (with nothing after the equals sign). This will also prevent this server from accepting e-mail.
  • Uncomment one of the #relayhost lines or add a new one in this format: relayhost = myrealmailserver.mydomain.mytld. Substitute the real name or IP address of the host that will relay e-mail for this server.
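Taken together, the active lines from the list above would end up looking like the sketch below. It writes them to a scratch file and then counts how many uncommented lines exist for each setting, which is a quick way to catch a duplicate. The values are the same placeholders used in the list. The same loop can be pointed at your real Postfix configuration file.

```shell
# Scratch file holding only the lines changed in the list above.
maincf="$(mktemp)"

cat > "$maincf" <<'EOF'
myhostname = myserver.mydomain.mytld
myorigin = mydomain.mytld
inet_interfaces = loopback-only
mydestination =
relayhost = myrealmailserver.mydomain.mytld
EOF

# Each setting should appear exactly once without a leading '#'.
for key in myhostname myorigin inet_interfaces mydestination relayhost; do
  count="$(grep -c "^$key[[:space:]]*=" "$maincf")"
  echo "$key: $count active line(s)"
done
```

Anything other than a count of 1 means that a setting is either still commented out or has been defined twice.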

Restart postfix:

Relaying Mail Through Your ISP

Some of us don’t have our own mail server. If you’re paying for a static IP and have registered a domain name, then you could configure your new Postfix installation as a true mail server. But, most of us aren’t that lucky either. Instead, we can configure Postfix to log in to our ISP’s SMTP account and send e-mail as us. Credit to ProfitBricks.

Install the necessary security binaries:

Open /etc/postfix/ in a text editor. This is a large file and you’re going to be doing a lot of navigating, so choose your editor wisely. For nano:

Make these changes:

  • Uncomment the #myhostname line (by removing the #). Change it to: myhostname = myserver.mydomain.mytld (Substituting your server and domain information). This is the host name that it will present to the mail server.
  • Uncomment one of the #relayhost lines or add a new one in this format: relayhost = smtpserver.yourisp.tld. Substitute the real name or IP address of your ISP according to their instructions for SMTP connections. If your ISP requires a different port (ex: Gmail), use brackets around the host name, a colon, and the port: relayhost = []:587.
  • At the end of the file, add:
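A typical set of additions for authenticated relaying looks something like the following sketch. These are standard Postfix client-side SASL and TLS parameters, but the password file location is an assumption, and your ISP may need different values. The lines are appended to a scratch file here instead of the live configuration.

```shell
# Scratch file standing in for the end of the Postfix configuration file.
maincf="$(mktemp)"

cat >> "$maincf" <<'EOF'
# Log in to the relay host with the stored credentials
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
# Refuse to send credentials over an unencrypted connection
smtp_tls_security_level = encrypt
EOF

cat "$maincf"
```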

Create a file to hold the username and password to be used with your ISP’s mail server:

Inside that file, you’re going to enter your information in this format: username:password. Two examples:
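As a sketch of what one entry might look like: Postfix's lookup table format expects each line to start with the relay host, written exactly as it appears in relayhost (brackets and port included), followed by the username:password pair. The host, account, and password below are invented placeholders, and a scratch file stands in for the real one.

```shell
# Scratch file standing in for the credentials file.
# mktemp creates it readable by the owner only; keep the real one that way.
pwfile="$(mktemp)"
chmod 600 "$pwfile"

# Left column: must match relayhost exactly. Right column: username:password.
cat > "$pwfile" <<'EOF'
[smtp.example-isp.tld]:587 someuser@example-isp.tld:NotARealPassword
EOF

cat "$pwfile"
```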

Generate a lookup table for Postfix to retrieve the passwords:

Restart postfix:

Remove the clear-text file containing your password:

Nagios Is Installed!

You’ve completed all the functionality steps for the server! Walk through the web pages and check for any issues. I followed all of these same steps through and ended up with a fully functional system. If you’re having troubles, check that all of the prerequisite components installed successfully. If you have issues in one browser, try another.

You don’t have any sensors set up yet, so your displays will be very dull. We’ll rectify that in a bit. First, we need to talk about the Windows agent NSClient++.

Installing NSClient++

If you didn’t download NSClient++ before, do so now. NSClient++ has multiple deployment options. For your very first, I highly recommend one of the MSI installs. If you’ve got many systems, you might opt to grab a ZIP distribution as well. You can then mass-push pre-defined configurations via Robocopy, login scripts, or other means.

The installation screens:

  1. I didn’t show the initial screen. The first that you care about asks you to select the Monitoring Tool. Choose Generic.
  2. Next, choose your installation type. I ordinarily pick Custom so that I can deselect the op5 options, but any will work.
  3. Pay at least some attention on this screen. Everything here will be written to the NSClient++.ini file, so you can change it all later. These are the appropriate options to use with Nagios, but I’ll discuss each after the list.
  4. Finish the installation.

The configuration items that I instructed to choose:

  • Allowed hosts: This field is required. Any source IP not in the list will be rejected by NSClient++. You can use ranges (ex: 192.168.0.0/24).
  • Password: check_nt uses this password (-s switch). check_nrpe does not care. By default, Nagios has a single check_nt command item that you call from other sensors. If you wish to prevent password-sharing, you’ll need to duplicate the check_nt command for each separate password.
  • Common check plugins: These are built-in plugins that you can use with NSClient++. I don’t do much with them, but you might.
  • Enable nsclient server (check_nt): You will almost certainly use several check_nt sensors.
  • Enable NRPE server (check_nrpe): My Hyper-V test scripts depend upon NRPE.
  • Insecure legacy mode (required by old check_nrpe): Since we aren’t configuring certificates, the current check_nrpe requires this setting as well.
  • Enable NSCA client: I’m not using this client, so I didn’t enable it.
  • Enable Web server: I just configure by text file, so I didn’t enable this, either.

Configuring NSClient++ to Work with PowerShell

You’ll need to modify NSClient++ to work with PowerShell. The installer doesn’t do that.

The ini file can be found at C:\Program Files\NSClient++\nsclient.ini. If you installed the 32-bit NSClient++ on 64-bit Windows, look in C:\Program Files (x86). The ini file is quite a mess. The following is my nsclient.ini file, with all of the fluff stripped away and the necessary lines added for PowerShell to function. I’ve highlighted what you must add:

The lines afterward show how I set up the commands and parameters for my customized scripts. The script bodies themselves are not included in this article (subscriber’s area, remember?).

Change your file to include the necessary lines and save the file. At an elevated command prompt, run:

It’s a normal Windows service with the name “nscp”, so you can also use ‘services.msc’, sc, or the PowerShell Stop-Service, Start-Service, and Restart-Service commands.

After the above, run netstat -aon | findstr LISTENING. Verify that there is a line item for 5666 (check_nrpe) and a line for 12489 (check_nt).

If this host is not configured to run unsigned PowerShell commands, run this at an elevated PowerShell prompt:

Much has been written about the execution policy and I have nothing to add. You can do an Internet search to make your own decisions, of course. None of my scripts are signed, so you’ll need RemoteSigned or looser in order to use them.

That’s It!

Your deployment is complete! Now it’s time to learn how to manage Nagios and configure sensors.

Controlling Nagios

During Nagios sensor configuration, you’ll find that you spend a great deal of time managing the Nagios service. Nagios control from the Linux command line is very simple. You’ll soon memorize these commands. Activate them in a PuTTY session or a local console.

ALWAYS Check the Nagios Configuration

After making any changes to configuration files, verify that they are valid before attempting to apply them to the running configuration:

If there are any problems, you’ll be told what they are and where to find them in the files. As long as you don’t stop Nagios, it will continue running with the configuration that it was started with. That gives you plenty of time to fix any errors.

Restart Nagios

Restart the Nagios service (only after verifying configuration!) with sudo service nagios restart.

Stop and Start Nagios

If you need to take Nagios offline for a while and bring it up later (or if you forgot to checkconfig and have to recover from a broken setup), use sudo service nagios stop and sudo service nagios start.

Verify that Nagios is Running

Usually, the ability to access the web site is a good indication of whether or not Nagios is operational. If you want to check from within the Linux environment, run sudo service nagios status.

This will usually fill up the screen with information, so you’ll be given the ability to scroll up and down with the arrow keys to read all of the messages. Press [Q] when you’re finished.

An Introduction to Nagios Configuration Files

From here on out, I will be using WinSCP to manipulate the Nagios configuration files on the Linux host. Use PuTTY to issue the commands to check and restart the Nagios service after configuration file changes. You do not need to restart the Apache service.

Personally, I connect using the nagios account that we created in the beginning. WinSCP remembers the last folder per user, which makes navigation easier, and working as the nagios account means that you never run into file permission problems. Just make a separate entry to the host for that account:

WinSCP Nagios Site

Work your way to /usr/local/nagios/etc. This is the root configuration folder. It mostly contains settings that drive how Nagios processes other files.

Nagios Root etc

This location contains four files. I’m not going to dive into them in great detail, but I encourage you to open them up and give their contents a look-over to familiarize yourself.

  • cgi.cfg: This is the configuration file for the CGIs that make up the Nagios web interface. I have not changed anything in it.
  • htpasswd.users: This is the file that Apache checks to authenticate users of the Nagios web interface. Use the instructions at the top of this article to modify it.
  • nagios.cfg: This file contains a number of configuration elements for how Nagios interacts with the system. We are going to modify the OBJECT CONFIGURATION FILE(S) portion momentarily.
  • resource.cfg: This file holds customizable macros that you create, like the ones for e-mail.

Now, open up /usr/local/nagios/etc/objects. This is where the real work is done.

Nagios Configuration Folder in WinSCP

The file names are for your convenience only. Nagios reads them all the same way. So, don’t get agitated if you feel like a host template definition would be better in some file other than templates.cfg; Nagios doesn’t care as long as everything is formatted properly. This is what the files generally mean:

  • commands.cfg: This contains the commands that constitute the actual checks. For instance, check_ping is defined here.
  • contacts.cfg: When Nagios needs to tell somebody something, this is where those somebodies’ information is stored. It’s also where you connect users to time periods. For example, I have my administrative account in the business hours time period, since I don’t really want to be woken up in the middle of the night just because my test lab is unhappy.
  • localhost.cfg: Contains checks for the Linux system that runs Nagios.
  • printer.cfg: Define printer objects and checks here.
  • switch.cfg: Physical switches and their check definitions are in this file.
  • templates.cfg: Basic definitions that other definitions can inherit from are contained within.
  • timeperiods.cfg: You probably don’t want to be notified in the middle of the night when a switch misses a single ping, but you might want to know about it during normal work hours. Define what “normal work hours” and “leave me alone” time is in this file.
  • windows.cfg: Basic definitions for Windows hosts and checks.

Poke through these and get a feel for how Nagios is configured.
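As a rough illustration of how Nagios finds these files, the OBJECT CONFIGURATION FILE(S) portion of a stock nagios.cfg lists each one with a cfg_file directive. The sketch below assumes a default source install under /usr/local/nagios; the last entry is a hypothetical file name, shown only to illustrate adding a file of your own.

```cfg
# Object files that Nagios loads at startup (paths assume a default source install)
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

# Commented out in the stock file; remove the leading # to activate it
cfg_file=/usr/local/nagios/etc/objects/windows.cfg

# Hypothetical extra file for your own Hyper-V definitions
cfg_file=/usr/local/nagios/etc/objects/hyperv.cfg
```

Any object definition in any listed file is treated the same way, which is why the file names are only a convenience.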

Nagios Objects and Their Uses

Nagios uses a few species of objects. Getting these right is important. Use the template file to guide you. The most pertinent objects are listed below:

  • contact: A target for notifications — usually an individual.
  • host: A host is any endpoint that can be checked. A computer, a switch, a printer, and a network-enabled refrigerator all qualify as a host.
  • command: Nagios checks things by running commands. The plugin files that do the actual work live in its plugins folder. The command definitions explain to Nagios how to call those plugins.
  • service: A “service” in Nagios is anything that Nagios can check with a command, and is a much more vague term than it is in Windows. In Nagios, services belong to hosts. So, if you want to know if a switch is alive by pinging it, the switch is a “host” and the ping is a “service” that calls a “command” called check_ping. You might call these “sensors” to compare to other products.
  • host group: Multiple hosts that are logically lumped together constitute a host group. Use them to apply one service to lots of hosts at once.
  • time period: This object is fairly well-explained by its name. They’re probably best understood by looking in the timeperiods.cfg file.
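To make the host, service, and command relationship concrete, here is a minimal sketch of the switch example. The check_ping command definition and the generic-switch and generic-service templates mirror what ships in the stock sample files, as far as I know; the host name and address are made up for illustration.

```cfg
# The command: tells Nagios how to call the check_ping plugin
define command{
        command_name    check_ping
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }

# The host: the switch itself (hypothetical name and address)
define host{
        use             generic-switch
        host_name       core-switch-1
        alias           Core Switch 1
        address         192.168.1.253
        }

# The service: a ping check that belongs to the host and calls the command.
# The two values after each ! become $ARG1$ (warning) and $ARG2$ (critical).
define service{
        use                     generic-service
        host_name               core-switch-1
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        }
```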

Nagios Templates

I’d say that the best place to start looking at Nagios objects is in the templates file. This is a copy/paste of the Contact template:
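For reference, the generic-contact template in the stock templates.cfg looks roughly like the following. This is reproduced from the Nagios Core sample configuration as I know it, so treat it as an approximation; your copy may differ slightly depending on the version.

```cfg
define contact{
        name                            generic-contact         ; name of this template
        service_notification_period     24x7                    ; service notifications can be sent at any time
        host_notification_period        24x7                    ; host notifications can be sent at any time
        service_notification_options    w,u,c,r,f,s             ; warning, unknown, critical, recovery, flapping, scheduled downtime
        host_notification_options       d,u,r,f,s               ; down, unreachable, recovery, flapping, scheduled downtime
        service_notification_commands   notify-service-by-email ; how service notifications are sent
        host_notification_commands      notify-host-by-email    ; how host notifications are sent
        register                        0                       ; do not register: this is a template, not a real contact
        }
```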

Start with the define line that indicates what type of object this block is describing. Most importantly, it signals to Nagios which properties should exist. Within this particular block, all properties for a contact are present with specific settings for each. If you use this template with a new object, then these will be its default settings. Next, notice the register line. By setting it to 0, you make it unavailable for Nagios to use directly, which is what makes this definition a template. Now, look at an implementation of the above template:
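An implementation along these lines appears in the stock contacts.cfg. The e-mail address is the sample file's placeholder and should be replaced with a real one.

```cfg
define contact{
        contact_name    nagiosadmin         ; short name of the user
        use             generic-contact     ; inherit defaults from the template
        alias           Nagios Admin        ; full name of the user
        email           nagios@localhost    ; placeholder: change to a real address
        }
```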

It is also defined as a contact. First, notice the use line. Its name matches that of the template. That means that you don’t need to provide every setting for this contact, only the ones that you want to differ from the template. It is not necessary for an object to use a template. You can fill out all details for an object. A live object cannot use another live object, but one template can use another.

I often make backups of my configuration files before tinkering with them. WinSCP makes this simple with the Duplicate command. I also tend to copy my live configuration files to a safe place. Even though this whole thing seems easy to understand, you will make mistakes. Some of your mistakes are going to seem very stupid in retrospect. Always, always, always run sudo service nagios checkconfig before applying any new changes!

Nagios Hosts

A host in Nagios is an endpoint. It’s an easy definition in my case because I am going to specifically talk about Hyper-V hosts. The following are host definitions that I created for my environment:

These hosts use a template that I created:

You’ll notice that this template uses the base windows-server template, but really makes no changes. I’m not overriding much in the windows-server template, so I could have all of my hosts use that one directly. However, creating a template to set up an inheritance hierarchy now is an inexpensive step that gives me flexibility later.
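A minimal sketch of that arrangement follows. The template name, host name, alias, and address are all hypothetical; only the windows-server base template and the hyper-v-servers host group come from the text.

```cfg
# Template: inherits everything from the stock windows-server template
define host{
        name            hyper-v-server      ; hypothetical template name
        use             windows-server      ; base template from templates.cfg
        hostgroups      hyper-v-servers     ; every host using this template joins the group
        register        0                   ; template only, not a real host
        }

# A real host that uses the template (hypothetical name and address)
define host{
        use             hyper-v-server
        host_name       svhv1
        alias           Hyper-V Host 1
        address         192.168.25.10
        }
```

If you later need a setting that applies to every Hyper-V host, you can add it to the template once instead of touching each host.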

Nagios Groups

Most of the singular objects, like contacts and hosts, also have a corresponding group object. You might have noticed in my Hyper-V host template that it has a hostgroups property. Every host object that uses this template will be a member of the hyper-v-servers host group. Groups have very simple definitions:

I could also have used a members property within the host group definition or a hostgroups property within each of my Hyper-V host definitions to accomplish the same thing. Setting it once in the template is less typing.
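As a sketch, a host group needs little more than a name and an alias. The alias text here is an assumption, and the commented members line shows the alternative approach with hypothetical host names.

```cfg
define hostgroup{
        hostgroup_name  hyper-v-servers     ; name referenced by hosts and services
        alias           Hyper-V Servers     ; display name in the web interface
        ; members       svhv1,svhv2         ; alternative: list member hosts here instead
        }
```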

Host groups are very useful. First, they get their own organization on the Host Groups menu item in the Nagios web interface:

Nagios Host Groups Display

Second, you can define services at the host group level. That’s important, because otherwise, you’d have to define services for each and every host that you want to check, even if they’re all using the same check!

Nagios Services

Don’t confuse the term service with its meaning in a Windows environment. In Nagios, a service has a broader, although still perfectly correct, definition. Anything that we can check is a service, whether that’s a ping response, Apache returning valid information on port 80, or even the output of a customized script like I have created for Hyper-V items.

The following is a service that I have created to monitor a Windows service (the Hyper-V Virtual Machine Management service, to be exact):

Notice my use of hostgroup_name so that I only have to create this service one time. If I were creating a service for a specific host, I would use host_name instead.
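One way to write such a service is sketched below. It assumes the stock check_nt command from commands.cfg, the generic-service template, and the Windows service short name vmms; the service_description is arbitrary, and the original check may well have used a different command.

```cfg
define service{
        use                     generic-service
        hostgroup_name          hyper-v-servers     ; applies to every host in the group
        service_description     Hyper-V Virtual Machine Management
        ; SERVICESTATE asks NSClient++ for the state of the named Windows service
        check_command           check_nt!SERVICESTATE!-d SHOWALL -l vmms
        }
```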

I encourage you to look at the documentation for services. You may want to change the frequency of when checks occur. You may also want to redefine how long a service can be in a trouble state before you are notified.

Useful Nagios Objects Documentation

I’ve spent a little bit of time going over the objects within Nagios, but there is already a wealth of documentation on them. You will, no doubt, want to configure Nagios items on your own. NSClient++ also has a great deal more capability than what I’ve shown. These links helped me more than anything else:

Dealing with Problems Reported by Nagios

The web display is nice, and everyone enjoys seeing a screen-full of happy green monitor reports, but that’s not why we set up Nagios installations. Things break, and we want to know before users start calling. With the configuration that you have, you’ll have the ability to start getting notifications as soon as you set yourself up as a contact with valid information. When a problem occurs, Nagios will mark it as being in a SOFT warning or critical state, then it will wait to see if the problem persists for a total of three check periods (configurable). On the third check, it will mark the service as being in a HARD warning or critical state and send a notification.

If you fix a problem quickly, or if it resolves on its own, you’ll get a Recovery e-mail to let you know that all is well again. If the problem persists, you’ll continue getting an e-mail every few minutes (configurable). If one host has many services in a critical state, or if many separate hosts have issues, you’re going to be looking at a lot of e-mails.
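The "configurable" parts above map to a handful of service directives. The following is a sketch with illustrative values, not recommendations; the intervals are in minutes when interval_length is left at its default of 60 seconds, and the ping thresholds are borrowed from the stock sample files.

```cfg
define service{
        use                     generic-service
        hostgroup_name          hyper-v-servers
        service_description     PING
        check_command           check_ping!200.0,20%!600.0,60%
        check_interval          5       ; minutes between checks while the service is OK
        retry_interval          1       ; minutes between rechecks while in a SOFT problem state
        max_check_attempts      3       ; the third failed check makes the state HARD and notifies
        notification_interval   30      ; minutes between repeat notifications; 0 sends only the first
        }
```

Raising notification_interval is the simplest way to cut down on repeat e-mails for a problem that you already know about.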

The following screenshot shows what a service looks like in a critical state. You can see it on the Services menu item, the Services submenu under the Problems menu, and on the (Unhandled) link that is next to it.

Nagios Critical Service

If you click the link for the name of the service, in this case, Service: DNS, it will take you to the following details screen:

Nagios Service Detail

Take some time to familiarize yourself with this screen. I’m not going to discuss every option, but they are all useful. For now, I want you to look at Acknowledge this service problem.

Acknowledging Problems in Nagios

“Acknowledging” means that you are aware that the service is down. Once you acknowledge a problem, an acknowledgement notification is sent out, and then no further notifications are sent until the service recovers. Basically, you’re telling Nagios, “Yes, yes, I know it’s down, leave me alone!” Click the Acknowledge this service problem link as shown in the previous screen shot and you’ll be taken to the following screen:

Nagios Acknowledgement

You can read the Command Description for an extended explanation of what Acknowledgement does and what your options are. I tend to fill out the comments field, but it’s up to you. Upon pressing Commit, the notification message is sent out and Nagios stops alerting until the service recovers (sometimes you get one more problem notification first).

Rescheduling a Nagios Service Check

Nagios runs checks on its own clock. You might have a service that doesn’t need frequent checks, so you might set it to only be tested every hour. During testing, you certainly won’t want to wait that long to see if your check is going to work. You might also want that Recovery message to go out right away after fixing a problem. In the service detail screen as shown a couple of screen shots up, click the Re-schedule the next check of this service link:

Reschedule Nagios Service Check

Of course, the time in the screen shot doesn’t mean anything to you. It’s the exact moment that I clicked the link on my system. If you then click Commit, it will immediately run the check. It might still take a few moments for the results to be returned so you won’t necessarily see any differences immediately, but the check does occur on time.

Scheduling Downtime

Smaller shops might not find it important to schedule downtime. If your Hyper-V host can reboot in less than 15 minutes, then you might not even get a downtime notification using the default settings. However, Nagios will give you the ability to start providing availability reports. Wouldn’t it be nice to show your boss and/or the company owners that your host was only ever down during scheduled maintenance windows?

From the service detail screen shot earlier, you can see the Schedule downtime for this service link. I’m assuming that you’ll be more likely to want to set downtime on a host rather than an individual service. The granularity is there for you to do either (or both) as suits your needs. A host’s detail screen (not shown) has Schedule downtime for this host and Schedule downtime for all services on this host links. You can also schedule downtime for an entire host or service group. These screens all look like this:

Nagios Downtime Scheduler

During scheduled downtime, notifications aren’t sent. In all reports, any outages during downtime are in the Scheduled category rather than Unscheduled.

The default Nagios Core distribution does not have a way to automatically schedule recurring downtime. There are some community-supported options.

Nagios Availability Reports

You saw a link in the service details screen shot above to View Availability Report for This Service. Hosts and services have this link. There’s also an Availability menu in the Reports section on the left that allows you to build custom reports. The following is a simple host availability report:

Nagios Availability

This is only for a single day. Notice the report options in the top right.

User Management for Nagios

So far, I’ve had you use the “nagiosadmin” account. As you spread out your deployment, you’re going to also set up new contacts. If you like, you can restrict those contacts to only see their own systems.

First, add a user to the nagcmd group. This will allow them to configure Nagios’ files. Be careful! If you don’t trust someone, skip this part and handle configuration for them. Optionally, you can skip the usermod parts (adds them to a group) and give them targeted access to specific configuration files.

Due to a difference between the Linux security model and the Windows security model, there is no secure way for Apache on Linux to be able to read the system users. So, you need to create a completely separate account for the Nagios web interface:

Now, create a Nagios contact with a matching contact_name:
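A minimal sketch of such a contact follows. It assumes the web interface account is named andy, as in the text, and that the stock generic-contact template is available; the alias and e-mail address are placeholders.

```cfg
define contact{
        contact_name    andy                ; must match the web interface user name exactly
        use             generic-contact     ; inherit notification defaults from the template
        alias           Andy                ; placeholder display name
        email           andy@example.com    ; placeholder address
        }
```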

The hosts, services, contact groups, etc. that the “andy” account is attached to will determine what Andy sees when he logs in to the Nagios web interface.

Using Nagios to Monitor Hyper-V – The real fun stuff starts here!

You now have all the tools you need to build your Hyper-V monitoring framework with Nagios. I’ve also written a few scripts and services that will get you up and running: Required Base Scripts, Monitoring the Oldest Checkpoint Age, Monitoring Dynamically Expanding VHDX Size, and more.

Nagios for Hyper-V: Alert on Failed Quorum

The health of a Microsoft Failover Cluster’s quorum leans most heavily on the state of the nodes. If you’re already using Nagios to monitor individual node states, then you’ll find out very quickly if any of them are down. Sometimes, though, the witness goes offline. If you haven’t got a monitor on that, then you can run into other problems. For instance, you may opt to manually pause a node for maintenance. If the witness is already down, the combined loss might cause the entire cluster to go offline. This article presents a short Nagios detection script for the status of your quorum witness.

This script is useful for any cluster, not just Hyper-V clusters.

If you’re new to Nagios, then you should probably start with the How To: Monitor Hyper-V with Nagios article first. I also published a follow-up article containing a script with some base functions for a cluster, but that script is not required to use this one.

NSClient++ Configuration

These changes are to be made to the NSClient++ files on all Windows nodes that are part of the cluster to be monitored. These instructions do not include configuring NSClient++ to operate PowerShell scripts. Please refer to the aforementioned how-to article for that.

C:\Program Files\NSClient++\nsclient.ini

If the indicated INI section does not exist, create it. Otherwise, just add the second line to the existing section.

The NSClient++ service must be restarted after all changes to its ini file.

C:\Program Files\NSClient++\scripts\check_clusterquorumwitness.ps1

Create the file with the following contents:

Nagios Configuration

These changes are to be made on the Nagios host. I recommend using WinSCP as outlined in our main Nagios and Ubuntu Server articles.


The Hyper-V Host Commands section should already exist if you followed our main Nagios article. Add this command there. If you are not working with a Hyper-V system, then you can create any section heading that makes sense to you, or just insert the command wherever you like.
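A command definition for this check might look something like the following sketch. It assumes that check_nrpe is installed in the plugins folder and that NSClient++ exposes the script under the alias check_clusterquorumwitness. Both the command_name and that alias are placeholders, so match them to your own nsclient.ini.

```cfg
define command{
        command_name    check-cluster-quorum-witness    ; hypothetical name
        ; -H target host, -p NRPE port, -t timeout in seconds, -c command alias in nsclient.ini
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -t 30 -c check_clusterquorumwitness
        }
```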


This file and section were created in the Hyper-V base scripts article. As long as it appears somewhere in one of the activated .cfg files, it will work.

This is a sample! You must use your own cluster name object! If you have multiple clusters to monitor, remember that you can place them into a Nagios hostgroup. You can then apply this service to the group rather than the individual cluster name objects. Do not assign the service to the nodes! The monitor will still work, but it’s inefficient and failures will result in many duplicate notifications.
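As a sketch of what such a service could look like, the block below attaches the check to a single cluster name object. The host name clhv1, the service description, and the check_command name are all hypothetical; substitute your own cluster name object and whatever command definition you created for this check.

```cfg
define service{
        use                     generic-service
        host_name               clhv1                           ; hypothetical cluster name object
        service_description     Cluster Quorum Witness
        check_command           check-cluster-quorum-witness    ; placeholder command name
        }
```

To cover several clusters at once, replace host_name with hostgroup_name and point it at a host group that contains only the cluster name objects.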

Nagios must be restarted after these files are modified. Run sudo service nagios checkconfig first, then sudo service nagios restart. Remember to run these separately. Do not just copy/paste! If the first command indicates a validation failure, check your work and fix the problem before restarting the Nagios service!

How to Monitor Hyper-V Using Nagios

System monitoring is one of the more important things we can do for our datacenters, as it helps to spot problems in the onset phase where they can be dealt with gracefully instead of waiting for an obvious failure that requires downtime. There are a huge number of monitoring tools that work with Hyper-V on the market today. One that I’m fond of is Nagios Core. It is easily one of the most powerful systems you can get your hands on, and you just can’t beat the price (free). It does come in a commercial version which adds a lot of features, mostly in the user-friendliness department. Therein lies the rub with the free version: it’s not the easiest thing in the world to set up. This blog post will walk you through setting up Nagios Core and configuring basic monitoring to monitor Hyper-V.