Route Hops: How to Create a Nagios Plugin to Monitor Them

Tyler Larson
Junior Developer

An important metric in network diagnostics is the number of hops it takes to reach a target host. In this article, we will create a plugin for Nagios using Python to monitor route hops to a target host.

Step 1: Prerequisites

  • Python 3: The script uses Python, so make sure you have it installed so you can test your plugin as you develop it. It can be downloaded here.
  • A code editor: To create this plugin I used VS Code, which can be downloaded here.

Step 2: Set Up the Python Script

Create a new Python file named check_route_hops.py and begin with a shebang line and essential import statements.

#!/usr/bin/env python3

import sys
import traceback
import argparse
import subprocess

def main():
    # Placeholder so the file runs; the code from Steps 3 and 5 replaces this line
    pass

if __name__ == "__main__":
    main()

Define the Nagios status codes to make the code more readable.

OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3

Now, create a global variable for the command line arguments.

_args = None

Step 3: Parse Command-Line Arguments

The plugin should accept arguments to specify the target host, thresholds, and other options. Use Python’s argparse module to define and handle these arguments. This code should be placed within the main function.

    # Assign to the module-level _args instead of creating a local variable
    global _args

    parser = argparse.ArgumentParser(
        description=(
            'This script checks the number of hops to a target host using traceroute. '
            'You can define warning and critical thresholds for hop counts. '
            'If the host is unreachable with the default protocol (UDP), other protocols will be tried.'
        )
    )

    # Define the arguments
    parser.add_argument('-H', '--host', required=True, type=str, help='Target host IP to check hop count')

    parser.add_argument('-t', '--timeout', default=3, type=int, help='Timeout duration in seconds for the check (default is 3)')

    parser.add_argument('-w', '--warning', required=False, type=int, help='Warning threshold for hop count')

    parser.add_argument('-c', '--critical', required=False, type=int, help='Critical threshold for hop count')

    parser.add_argument('-v', '--verbose', required=False, type=int, default=0, help=(
        'The verbosity level of the output. '
        '(0: Single line summary, '
        '1: Single line with additional information, '
        '2: Multi line with configuration debug output)'
    ))

    parser.add_argument('-p', '--protocol', type=str, default='UDP', help='First protocol to attempt for traceroute (default: UDP)')

    parser.add_argument('--debug', action='store_true', help='Enable debug mode')

    _args = parser.parse_args(sys.argv[1:])
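
To see how these arguments behave before wiring them into the plugin, you can pass a list to parse_args instead of reading sys.argv. The sketch below is a trimmed-down illustration, not the full parser: build_parser is a hypothetical helper name, only some of the options are repeated, and 192.0.2.10 is a documentation-only example address.

```python
import argparse

def build_parser():
    # Hypothetical helper: a reduced version of the parser defined above
    parser = argparse.ArgumentParser(description='Check the hop count to a host.')
    parser.add_argument('-H', '--host', required=True, type=str)
    parser.add_argument('-w', '--warning', required=False, type=int)
    parser.add_argument('-c', '--critical', required=False, type=int)
    parser.add_argument('-p', '--protocol', type=str, default='UDP')
    parser.add_argument('--debug', action='store_true')
    return parser

# Passing a list lets you try argument combinations without a shell
args = build_parser().parse_args(['-H', '192.0.2.10', '-w', '10', '-c', '20'])
print(args.host, args.warning, args.critical, args.protocol, args.debug)
# prints: 192.0.2.10 10 20 UDP False
```

Thresholds that are not supplied come back as None, which is worth keeping in mind when comparing them against the hop count later.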

Step 4: Implement the Traceroute Logic

Create a helper function to execute traceroute and calculate the number of hops to the host. Use subprocess.run to call the traceroute command. The stdout result of the command is then used to count the number of hops to the host. Different protocols are used by adding additional arguments to the command.

def get_route_hops(protocol):
    # Build the command with options before the host, which every traceroute variant accepts
    command = ['traceroute']
    if protocol == "ICMP":
        # -I tells traceroute to send ICMP echo probes instead of UDP
        command.append('-I')
    elif protocol != "UDP":
        # Unsupported protocol: report failure so the caller can try another one
        return 0
    command.append(_args.host)

    try:
        result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True, timeout=_args.timeout)
    except subprocess.TimeoutExpired:
        # traceroute did not finish within the timeout
        return 0
    except Exception:
        # For example, traceroute is not installed
        return 0

    # Count the number of hops (lines returned by traceroute, minus the header line)
    hops = len(result.stdout.splitlines()) - 1

    return hops if hops > 0 else 0
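
Counting every output line except the first works when traceroute prints a single header followed by one line per hop. As a slightly stricter alternative, the sketch below counts only lines that begin with a hop number. The count_hops name is hypothetical, and the SAMPLE text is hand-written to resemble typical Linux traceroute output using documentation-only addresses; it is not captured from a real run.

```python
def count_hops(traceroute_output):
    """Count the lines whose first field is a hop number."""
    hops = 0
    for line in traceroute_output.splitlines():
        fields = line.split()
        # Hop lines start with an integer; the header line starts with "traceroute"
        if fields and fields[0].isdigit():
            hops += 1
    return hops

# Hand-written illustration of the output format (an assumption, not real data)
SAMPLE = """traceroute to 192.0.2.10 (192.0.2.10), 30 hops max, 60 byte packets
 1  198.51.100.1 (198.51.100.1)  0.512 ms  0.498 ms  0.471 ms
 2  203.0.113.5 (203.0.113.5)  4.210 ms  4.188 ms  4.301 ms
 3  192.0.2.10 (192.0.2.10)  9.870 ms  9.912 ms  9.845 ms
"""

print(count_hops(SAMPLE))  # prints: 3
```

Because the function takes a string rather than running a command, it can be tested without network access.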

Step 5: Combine Logic in the Main Function

The main function should do the following:

  1. Parse arguments.
  2. Attempt traceroute on the specified protocol.
  3. Fall back to alternative protocols if the specified protocol fails.
  4. Compare the hop counts against the thresholds.
  5. Build and print a Nagios status message.
  6. Exit with the proper status code.

For testing, you can add debug statements that print out additional information when the --debug argument is passed.

    # Find the number of hops to the host using different protocols
    protocols = ['UDP', 'ICMP']
    if _args.protocol in protocols:
        # Move the requested protocol to the front of the list
        protocols.remove(_args.protocol)
        protocols.insert(0, _args.protocol)

    hops = 0
    used_protocol = _args.protocol

    # Check the requested protocol first
    hops = get_route_hops(_args.protocol)

    # If the requested protocol fails, try the other protocols
    if hops == 0:
        for protocol in protocols[1:]:
            hops = get_route_hops(protocol)

            if hops > 0:
                used_protocol = protocol
                break

    if hops == 0:
        print("CRITICAL - Failed to determine hop count for all protocols")
        sys.exit(CRITICAL)

    # Set the status code based on the number of hops to the host
    # (compare against None so that a threshold of 0 is still honored)
    status_code = OK
    if _args.warning is not None and hops >= _args.warning:
        status_code = WARNING
    if _args.critical is not None and hops >= _args.critical:
        status_code = CRITICAL

    status_dict = {
        0: "OK",
        1: "WARNING",
        2: "CRITICAL",
    }

    # Build the status message
    message = ''
    if hops == 1:
        message = f"{status_dict[status_code]} - {hops} hop was counted to {_args.host}"
    else:
        message = f"{status_dict[status_code]} - {hops} hops were counted to {_args.host}"

    print(message)
    sys.exit(status_code)
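
If you would like to check the threshold logic without running traceroute, one option is to pull it into small pure functions. This is only a sketch of that idea: evaluate_hops and format_message are hypothetical names that do not appear in the plugin above, and the status constants are repeated so the snippet runs on its own.

```python
OK, WARNING, CRITICAL = 0, 1, 2

def evaluate_hops(hops, warning=None, critical=None):
    """Return the Nagios status code for a hop count; None disables a threshold."""
    status = OK
    if warning is not None and hops >= warning:
        status = WARNING
    # Checked last so that critical overrides warning
    if critical is not None and hops >= critical:
        status = CRITICAL
    return status

def format_message(status, hops, host):
    """Build the single-line status message in the same style as the plugin."""
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    noun = "hop was" if hops == 1 else "hops were"
    return f"{label} - {hops} {noun} counted to {host}"

status = evaluate_hops(12, warning=10, critical=20)
print(format_message(status, 12, "example.com"))
# prints: WARNING - 12 hops were counted to example.com
```

With the logic separated like this, the main function only has to gather the hop count, call the two helpers, print the message, and exit with the returned status.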

Step 6: Test the Plugin

Run your script manually to verify that the plugin behaves as expected. Try different combinations of arguments, and check the exit code after each run, since Nagios uses it to determine the resulting state.

Here is an example of what you could run in your terminal:

python3 check_route_hops.py -H example.com -w 10 -c 20 -p ICMP --debug
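
The --debug flag is parsed in Step 3, but the code shown in this article does not print anything extra when it is set. One possible way to use it is a small helper like the sketch below. The debug function is a hypothetical name, and writing to stderr is a design assumption made here so that the status line on stdout, which is what Nagios reads, stays unchanged.

```python
import sys

def debug(enabled, message):
    """Write a diagnostic line to stderr when debugging is enabled."""
    if enabled:
        print(f"DEBUG: {message}", file=sys.stderr)

# Inside the plugin, the first argument would be _args.debug
debug(True, "trying protocol ICMP")
debug(False, "this line is suppressed")
```

Calls such as debug(_args.debug, f"hops={hops} via {used_protocol}") could then be placed after each traceroute attempt in the main function.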

Step 7: Upload Your Plugin to Nagios XI

Follow the steps here to add your new plugin to Nagios XI!

Conclusion

You’ve successfully created a custom Nagios plugin to monitor route hops. This plugin is highly customizable, allowing you to modify thresholds and protocols as needed. It’s a solid tool for monitoring network paths and troubleshooting connectivity issues.