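Requirements:
After installing Nagios, local resources also need to be monitored, such as disk space usage, process count, and swap space. None of these are exposed over the network, so they count as local resources.
The NRPE plugin can collect this data on the client and pass it over the network to the monitoring server,
which then processes the results it receives.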
Environment:
Operating system: RedHat 6.6 x64
Installation:
----Client----
1. Disable SELinux
[root@testvm02 ~]# sed -i 's/SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
[root@testvm02 ~]# getenforce
Permissive
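The sed command only rewrites /etc/selinux/config, which is read at boot. The Permissive output from getenforce suggests the running mode was switched separately (for example with setenforce 0, which is an assumption here because the transcript does not show it). A minimal sketch that previews the same substitution on a scratch copy instead of the real file; the sample file contents and the helper name configured_selinux_mode are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the SELINUX= value configured in the given file.
configured_selinux_mode() {
    # Match only the uncommented SELINUX= key, not SELINUXTYPE= or comment lines.
    sed -n 's/^SELINUX=//p' "$1"
}

# Work on a scratch copy so the real /etc/selinux/config is left alone.
tmp_cfg="$(mktemp)"
cat > "$tmp_cfg" <<'EOF'
# SELINUX= can take one of these three values:
SELINUX=enforcing
SELINUXTYPE=targeted
EOF

echo "before: $(configured_selinux_mode "$tmp_cfg")"
# Same substitution as in the step above, applied to the scratch copy (GNU sed).
sed -i 's/SELINUX=.*/SELINUX=disabled/g' "$tmp_cfg"
echo "after:  $(configured_selinux_mode "$tmp_cfg")"
rm -f "$tmp_cfg"
```

The substitution also touches the comment line that mentions SELINUX=, which is harmless, and it leaves SELINUXTYPE= unchanged.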
2. Download the nagios-plugins and nrpe packages and upload them to the host
nagios-plugins download URL:
nrpe download URL:
[root@testvm03 software]# ls -ltr
total 3080
-rw-r--r--. 1 root root  419695 Jul 31 16:18 nrpe-2.15.tar.gz
-rw-r--r--. 1 root root 2728818 Jul 31 16:18 nagios-plugins-2.2.1.tar.gz
3. Install nagios-plugins before installing nrpe
3.1 Install the required packages
yum install -y gcc glibc glibc-common make gettext automake autoconf wget openssl-devel net-snmp net-snmp-utils
3.2 Extract, compile, and install
[root@testvm03 software]# tar -zxf nagios-plugins-2.2.1.tar.gz
[root@testvm03 software]# cd nagios-plugins-2.2.1
[root@testvm03 nagios-plugins-2.2.1]# ./configure
[root@testvm03 nagios-plugins-2.2.1]# make
[root@testvm03 nagios-plugins-2.2.1]# make install
4. Create the nagios group and user
[root@testvm03 nagios-plugins-2.2.1]# groupadd nagios
[root@testvm03 nagios-plugins-2.2.1]# useradd -r -g nagios nagios
5. Extract, compile, and install nrpe
[root@testvm03 software]# tar zxf nrpe-2.15.tar.gz
[root@testvm03 software]# cd nrpe-2.15
[root@testvm03 nrpe-2.15]# ./configure
[root@testvm03 nrpe-2.15]# make all
[root@testvm03 nrpe-2.15]# make install-daemon           # install the nrpe daemon
[root@testvm03 nrpe-2.15]# make install-daemon-config    # install the nrpe configuration file
/usr/bin/install -c -m 775 -o nagios -g nagios -d /usr/local/nagios/etc
/usr/bin/install -c -m 644 -o nagios -g nagios sample-config/nrpe.cfg /usr/local/nagios/etc
6. Edit the nrpe.cfg configuration file (/usr/local/nagios/etc/nrpe.cfg) and add the monitoring server's address
# Find allowed_hosts and append the monitoring server's address.
allowed_hosts=127.0.0.1,192.168.53.25
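If the monitoring server is missing from allowed_hosts, the client refuses its requests, so it can help to confirm the entry after editing. A minimal sketch, assuming a plain comma-separated list of addresses; the helper name host_allowed is hypothetical, and it does exact string matching only, so network ranges or hostnames in the list are not interpreted:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the IP ($2) is listed literally in the
# allowed_hosts line of the given nrpe.cfg ($1).
host_allowed() {
    cfg="$1"; ip="$2"
    # Take the last uncommented allowed_hosts line, drop spaces,
    # split the comma-separated list into one entry per line,
    # then look for an exact whole-line match.
    sed -n 's/^allowed_hosts=//p' "$cfg" | tail -n 1 | tr -d ' ' | tr ',' '\n' \
        | grep -Fxq "$ip"
}

# Example usage (path taken from the step above):
#   host_allowed /usr/local/nagios/etc/nrpe.cfg 192.168.53.25 && echo "allowed"
```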
7. Copy the nrpe init script into /etc/init.d and enable it at boot
[root@testvm03 etc]# cd /opt/software/nrpe-2.15
[root@testvm03 nrpe-2.15]# cp init-script /etc/init.d/nrpe
[root@testvm03 nrpe-2.15]# chmod +x /etc/init.d/nrpe
[root@testvm03 nrpe-2.15]# chkconfig --add nrpe
[root@testvm03 nrpe-2.15]# chkconfig --list | grep nrpe
nrpe 0:off 1:off 2:on 3:on 4:on 5:on 6:off
8. Start the nrpe service and check the port it listens on
[root@testvm03 nrpe-2.15]# service nrpe start
Starting nrpe: [ OK ]
[root@testvm03 nrpe-2.15]# ps -ef | grep nrpe
nagios 23979 1 0 16:43 ? 00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
root 23982 1050 0 16:43 pts/0 00:00:00 grep nrpe
[root@testvm03 nrpe-2.15]# netstat -ntlp | grep nrpe
# nrpe listens on port 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 23979/nrpe
tcp 0 0 :::5666 :::* LISTEN 23979/nrpe
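The port check above can be turned into a yes/no test for use in a script. A minimal sketch, assuming the column layout of the netstat -ntlp output shown in this step; the helper name listening_on_port is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -ntlp` output on stdin and succeed
# if some socket is in LISTEN state on the given TCP port ($1).
listening_on_port() {
    awk -v port="$1" '
        # Column 4 is the local address (0.0.0.0:5666 or :::5666),
        # column 6 is the socket state.
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Example usage on the client:
#   netstat -ntlp | listening_on_port 5666 && echo "nrpe port is open"
```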
----Monitoring server----
1. Disable SELinux
[root@testvm02 ~]# sed -i 's/SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
[root@testvm02 ~]# getenforce
Permissive
2. Upload the nrpe package, then extract, compile, and install it
[root@testvm02 softwares]# tar zxf nrpe-2.15.tar.gz
[root@testvm02 softwares]# cd nrpe-2.15
[root@testvm02 nrpe-2.15]# ./configure
[root@testvm02 nrpe-2.15]# make all
[root@testvm02 nrpe-2.15]# make install-plugin    # note: this installs the check_nrpe plugin, not the nrpe daemon
3. Verify that the check_nrpe plugin is installed
[root@testvm02 nrpe-2.15]# ls /usr/local/nagios/libexec/check_nrpe
/usr/local/nagios/libexec/check_nrpe
4. Add the check_nrpe command to Nagios's commands.cfg
# Take care that the hyphens are ASCII hyphens, not full-width (Chinese) dashes.
define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
5. Create the host configuration file and change the ownership of the cfg file
define host {
    use          linux-server
    host_name    192.168.53.26
    address      192.168.53.26
}
define service {
    use                    generic-service,graphed-service    ; Name of service template to use
    host_name              192.168.53.26
    service_description    System_Load
    check_command          check_nrpe!check_load
}
define service {
    use                    generic-service,graphed-service
    host_name              192.168.53.26
    service_description    disk_usage
    check_command          check_nrpe!check_disk
}
[root@testvm02 objects]# chown nagios:nagios 192.168.53.26.cfg
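To make the link between a service's check_command and the check_nrpe command definition concrete, the macro substitution can be imitated by hand. A minimal sketch, assuming $USER1$ points at /usr/local/nagios/libexec (the directory where check_nrpe was found earlier) and handling only the single-argument form used here; the helper name expand_check_nrpe is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: show the command line that results from the
# check_nrpe command definition for one service.
#   $1 = value of $USER1$ (plugin directory; an assumption, see lead-in)
#   $2 = host address ($HOSTADDRESS$)
#   $3 = check_command from the service, e.g. "check_nrpe!check_load"
expand_check_nrpe() {
    user1="$1"; hostaddress="$2"; check_command="$3"
    # For a single-argument command, the text after the "!" is $ARG1$.
    arg1="${check_command#*!}"
    printf '%s/check_nrpe -H %s -c %s\n' "$user1" "$hostaddress" "$arg1"
}

expand_check_nrpe /usr/local/nagios/libexec 192.168.53.26 'check_nrpe!check_load'
# prints: /usr/local/nagios/libexec/check_nrpe -H 192.168.53.26 -c check_load
```

Running the printed command by hand on the monitoring server is a quick way to test a check before Nagios schedules it.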
7. Create directories for the Nagios configuration files and sort the files into them
mkdir -p /usr/local/nagios/etc/objects/commands
mkdir -p /usr/local/nagios/etc/objects/timeperiods
mkdir -p /usr/local/nagios/etc/objects/contacts
mkdir -p /usr/local/nagios/etc/objects/contactgroups
mkdir -p /usr/local/nagios/etc/objects/hosts
mkdir -p /usr/local/nagios/etc/objects/hostgroups
mkdir -p /usr/local/nagios/etc/objects/services
mkdir -p /usr/local/nagios/etc/objects/servicegroups
mkdir -p /usr/local/nagios/etc/objects/templates
mkdir -p /usr/local/nagios/etc/objects/others
Move the files into their directories:
[root@testvm02 objects]# ls -ltr
total 96
-rw-rw-r--. 1 nagios nagios  1797 Jul 31 11:38 contacts.cfg
-rw-rw-r--. 1 nagios nagios  3512 Jul 31 11:38 timeperiods.cfg
-rw-rw-r--. 1 nagios nagios  4074 Jul 31 11:38 windows.cfg
-rw-rw-r--. 1 nagios nagios  3001 Jul 31 11:38 printer.cfg
-rw-rw-r--. 1 nagios nagios  3484 Jul 31 11:38 switch.cfg
-rw-rw-r--. 1 nagios nagios 12869 Jul 31 14:44 templates.cfg
-rw-rw-r--. 1 nagios nagios  4905 Jul 31 14:47 localhost.cfg
-rw-rw-r--. 1 nagios nagios  7120 Jul 31 16:54 commands.cfg
-rw-r--r--. 1 nagios nagios   446 Jul 31 17:03 192.168.53.26.cfg
drwxr-xr-x. 2 root   root    4096 Jul 31 17:09 commands
drwxr-xr-x. 2 root   root    4096 Jul 31 17:09 timeperiods
drwxr-xr-x. 2 root   root    4096 Jul 31 17:09 contacts
drwxr-xr-x. 2 root   root    4096 Jul 31 17:09 contactgroups
drwxr-xr-x. 2 root   root    4096 Jul 31 17:09 hosts
drwxr-xr-x. 2 root   root    4096 Jul 31 17:09 hostgroups
drwxr-xr-x. 2 root   root    4096 Jul 31 17:09 services
drwxr-xr-x. 2 root   root    4096 Jul 31 17:09 servicegroups
drwxr-xr-x. 2 root   root    4096 Jul 31 17:09 templates
drwxr-xr-x. 2 root   root    4096 Jul 31 17:09 others
[root@testvm02 objects]# mv 192.168.53.26.cfg hosts/
[root@testvm02 objects]# mv localhost.cfg hosts/
[root@testvm02 objects]# mv commands.cfg commands
[root@testvm02 objects]# mv windows.cfg switch.cfg printer.cfg others/
[root@testvm02 objects]# mv templates.cfg templates/
[root@testvm02 objects]# mv timeperiods.cfg timeperiods/
[root@testvm02 objects]# mv contacts.cfg contacts
8. Edit nagios.cfg: add the cfg_dir entries and comment out all cfg_file entries
cfg_dir=/usr/local/nagios/etc/objects/commands
cfg_dir=/usr/local/nagios/etc/objects/timeperiods
cfg_dir=/usr/local/nagios/etc/objects/contacts
cfg_dir=/usr/local/nagios/etc/objects/contactgroups
cfg_dir=/usr/local/nagios/etc/objects/hosts
cfg_dir=/usr/local/nagios/etc/objects/hostgroups
cfg_dir=/usr/local/nagios/etc/objects/services
cfg_dir=/usr/local/nagios/etc/objects/servicegroups
cfg_dir=/usr/local/nagios/etc/objects/templates
# The others directory is deliberately not listed, so the devices defined there are not monitored.
# You can specify individual object config files as shown below:
#cfg_file=/usr/local/nagios/etc/objects/commands.cfg
#cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
#cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
#cfg_file=/usr/local/nagios/etc/objects/templates.cfg
# Definitions for monitoring the local (Linux) host
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
9. Restart nagios
[root@testvm02 objects]# service nagios restart
Running configuration check...
Stopping nagios: .done.
Starting nagios: Running configuration check... done.
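With cfg_dir, Nagios reads every file ending in .cfg under the listed directories, including subdirectories, so a file moved into the wrong place is silently skipped or picked up. A minimal sketch for listing what the active cfg_dir entries would load; the helper name list_cfg_dir_files is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: list the *.cfg files that the active cfg_dir
# entries of a nagios.cfg ($1) would pull in.
list_cfg_dir_files() {
    # Keep only uncommented cfg_dir= lines and strip the key.
    sed -n 's/^cfg_dir=//p' "$1" | while IFS= read -r dir; do
        # Only files ending in .cfg count, including those in subdirectories.
        [ -d "$dir" ] && find "$dir" -type f -name '*.cfg'
    done | sort
}

# Example usage on the monitoring server:
#   list_cfg_dir_files /usr/local/nagios/etc/nagios.cfg
```

For a full syntax check without restarting, /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg can be run by hand; the binary path is assumed from the default source-install prefix used throughout this walkthrough.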
10. View the monitoring page
Note: the system load check now works, but the disk check reports the error NRPE: Command 'check_disk' not defined.
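This error means the server asked the client for a command name that the client's nrpe.cfg does not define. A minimal sketch for listing the names a given nrpe.cfg does define, so a missing one is easy to spot; the helper name nrpe_defined_commands is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the command names defined in an nrpe.cfg ($1).
nrpe_defined_commands() {
    # Definitions look like: command[check_load]=/path/to/plugin args
    # Commented-out definitions (starting with #) are skipped.
    sed -n 's/^command\[\([^]]*\)\]=.*/\1/p' "$1"
}

# Example usage on the client:
#   nrpe_defined_commands /usr/local/nagios/etc/nrpe.cfg | grep -x check_disk \
#       || echo "check_disk is not defined"
```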
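11. Fix NRPE: Command 'check_disk' not defined by defining the check_disk command and its thresholds in the client's nrpe.cfg
# Checks specific partitions and sets the warning and critical percentages for each.
# The name in square brackets after "command" is the NRPE command name. The server calls this name and the client runs it; it is effectively an alias for the long command line after the equals sign.
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p / -C -w 20% -c 10% -p /boot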
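With check_disk, -w 20% and -c 10% refer to free space: the check turns WARNING when less than 20% is free and CRITICAL when less than 10% is free, and the plugin exit code (0, 1, or 2) carries that state back to Nagios. A minimal sketch of the same decision using integer percentages; the helper name classify_free_percent is hypothetical and this is not how check_disk itself is implemented:

```shell
#!/bin/sh
# Hypothetical helper mirroring the -w 20% -c 10% thresholds.
#   $1 = free space in percent (integer)
#   $2 = warning limit, $3 = critical limit (both in percent free)
classify_free_percent() {
    free="$1"; warn="$2"; crit="$3"
    if [ "$free" -lt "$crit" ]; then
        echo "CRITICAL"; return 2    # plugin exit code 2
    elif [ "$free" -lt "$warn" ]; then
        echo "WARNING"; return 1     # plugin exit code 1
    fi
    echo "OK"; return 0              # plugin exit code 0
}

# Example: 35% free against the thresholds from the step above.
classify_free_percent 35 20 10
```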
12. Restart nrpe on the client
[root@testvm03 etc]# service nrpe restart
Shutting down nrpe: [ OK ]
Starting nrpe: [ OK ]
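13. View the Nagios page
Note: disk space usage on the remote host is now monitored, with alert thresholds set. Other checks can be added the same way. The plugin that each command calls must exist in the libexec directory; if it does not, install the plugin yourself.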
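Additional notes:
- The alert thresholds for disk space, and which directories are checked, are configured in the client's nrpe.cfg. Thresholds for other checks are configured on the client as well.
- The monitoring server does not need to run the nrpe daemon; it runs only on the client.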
Document created: 2018-07-31 18:05:31