dim_STAT User's Guide. by Dimitri |
STAT-service |
STAT-service was introduced in dim_STAT v.3.0 and provides a simple, stable and secure way to collect STATs on-line from Solaris/SPARC, Solaris/x86 and Linux/x86 servers. Since v.8.1 it is distributed under the GPL with source code, so you can now compile it yourself on other platforms and collect data from other UNIX systems. As a pilot example, a package for HP-UX is provided. Any newly ported kits are welcome!
Install STAT-service |
The STAT-service module is shipped within the dim_STAT distribution (dim_STAT-INSTALL/STAT-service directory) as normal Solaris packages or TAR archives for manual integration. STAT-service should be installed on every machine that is to be under live monitoring (you should be the "root" user, of course :))

Package install (".pkg" file):

    # pkgadd -d STATsrv.pkg

Manual install (".tar" file):

    # cd /etc
    # tar xvf /path_to/STATsrv.tar
    # ln -s /etc/STATsrv/STAT-service /etc/rc2.d/S99STATsrv
    # ln -s /etc/STATsrv/STAT-service /etc/rc1.d/K99STATsrv
    # ln -s /etc/STATsrv/STAT-service /etc/rc0.d/K99STATsrv
    # ln -s /etc/STATsrv/STAT-service /etc/rcS.d/K99STATsrv

The included software is installed into the special /etc/STATsrv directory (the home directory of STAT-service). The contents of this directory:

    /etc/STATsrv/
        STAT-service  -- script to start/stop the service daemon; also defines the port number to listen on (def:5000)
        access        -- access control file
        /bin          -- contains extended STAT programs/scripts
        /log          -- contains all logged information about service requests

Next step - start the service daemon:

    # /etc/STATsrv/STAT-service start

The schema of communication with STAT-service is very simple:
1.) dim_STAT connects to the STAT-service of the machine under monitoring...
2.) if the service is not available: wait for a time-out and go to 1.), or exit if the STAT collect is stopped during this time...
3.) dim_STAT asks for the stat command it needs...
4.) if there is no permission for this command or the command is not found: close the "command" connection with an error message...
5.) dim_STAT collects data, keeping track of any time shift due to previous time-outs...
6.) if the TCP connection is broken: go to 1.)
7.) if the STAT is stopped: close the connection and exit...
*.) if there was no activity during the "auto-eject" timeout: close the connection and go to 1.)
As you see, this schema is quite robust and keeps working after cluster switching, network failures, reboots, etc. A collect may be started once and run for a long period. If you need to collect only during specific time intervals, you may simply start and stop STAT-service via "cron" or any similar tool...
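As an illustration of the "cron" approach, here is a minimal sketch of a helper that prints two crontab entries, one to start and one to stop the service at given hours. The function name stat_service_cron_lines is hypothetical, and the "stop" argument is assumed by analogy with the documented "start" (STAT-service is described as a start/stop script).

```shell
#!/bin/sh
# Print two crontab entries that run STAT-service only between two hours
# of the day. The function name is hypothetical (not part of dim_STAT),
# and the "stop" argument is an assumption made by analogy with "start".
stat_service_cron_lines() {
    start_hour=$1
    stop_hour=$2
    printf '0 %s * * * /etc/STATsrv/STAT-service start\n' "$start_hour"
    printf '0 %s * * * /etc/STATsrv/STAT-service stop\n' "$stop_hour"
}
```

For example, "stat_service_cron_lines 8 18" prints entries for a daily 08:00 start and 18:00 stop, which could then be added to the crontab of "root".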
Note: it seems that when a system halts (e.g. power-off of a working machine), TCP/IP connections stay stuck and never receive an error code... In this case the collect has to be broken via the "auto-eject" timeout. However, auto-eject may also happen due to a mini-hang of the system or simply of the stat program; in this case you'll see holes in your collects, so take care during interpretation :))
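The communication schema (steps 1 to 7 plus the auto-eject rule) can be sketched as a small retry loop. This is only an illustration under stated assumptions: all function names below (collect_active, try_connect, request_stat, read_samples, collect_loop) and the RETRY_DELAY variable are hypothetical placeholders, not the real dim_STAT interface, and the sketch does not model the time-shift tracking of step 5. It also assumes that a refused command ends the loop rather than being retried.

```shell
#!/bin/sh
# Sketch of the collect loop from the numbered schema above.
# Every function name here is hypothetical; none is part of dim_STAT.

RETRY_DELAY=${RETRY_DELAY:-10}    # assumed pause (seconds) between attempts

# Placeholder hooks, to be replaced by real transport code.
collect_active() { return 0; }    # 0 while the STAT collect is not stopped
try_connect()    { return 1; }    # 0 once a connection to the service is open
request_stat()   { return 0; }    # 0 if the service accepts the stat command
read_samples()   { return 0; }    # 0 if the collect was stopped normally,
                                  # non-zero on broken connection or auto-eject

collect_loop() {
    host=$1
    stat_cmd=$2
    while collect_active; do                  # leave when the collect is stopped
        if ! try_connect "$host"; then        # step 1
            sleep "$RETRY_DELAY"              # step 2: wait, then try again
            continue
        fi
        if ! request_stat "$stat_cmd"; then   # steps 3 and 4
            echo "stat command refused: $stat_cmd" >&2
            return 2                          # assumption: give up, no retry
        fi
        if read_samples; then                 # step 5
            return 0                          # step 7
        fi
        # steps 6 and *: connection lost or auto-eject, so reconnect
    done
    return 0
}
```

Keeping the transport behind hooks makes the retry logic easy to read on its own: the only paths out of the loop are a stopped collect or a refused command.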
Here is an example of a STAT-service access control file. As you see, you may limit the stat commands accessible to each machine. This task may be done by the host administrator and may be completely independent.

Notes:
- the access file is checked all the time by the STAT-service daemon, so you never need to restart the service to activate your modifications.
- since v.8.0 only the stat commands sure to work on a given system are enabled by default. It's up to you to enable other commands that may need some additional configuration (like jvmSTAT) or simply the presence of some software (like VxVM for vxstat).
    #
    # STAT-service access file
    #
    # Format:
    #   ...
    #   command  name  fullpath
    #   ...
    #   access  IP-address
    #   ...
    #   command  name  fullpath
    #   ...
    #
    # By default all machines in the network may access to STAT-services
    #
    # Keyword "access" make access restriction by IP-adress for all following
    # commands till next "access" section.
    #
    # For example:
    #
    # ====================================================================
    # #
    # # Any host may access to vmstat and mpstat collections
    # #
    # command vmstat /usr/bin/vmstat
    # command mpstat /usr/bin/mpstat
    # #
    # # Only machines 129.157.1.[1-3] may access netLOAD collections
    # #
    # access 129.157.1.1
    # access 129.157.1.2
    # access 129.157.1.3
    # command netLOAD.sh /etc/STATsrv/bin/netLOAD.sh
    # #
    # # Only machine 129.157.1.1 may access psSTAT collections
    # #
    # access 129.157.1.1
    # command psSTAT /etc/STATsrv/bin/psSTAT
    # #
    # ====================================================================
    #
    #
    command vmstat         /usr/bin/vmstat
    command mpstat         /usr/bin/mpstat
    command netstat        /usr/bin/netstat
    command vxstat         /usr/sbin/vxstat
    command memstat        /etc/STATsrv/bin/memstat
    command tailX          /etc/STATsrv/bin/tailX
    command ioSTAT.sh      /etc/STATsrv/bin/ioSTAT.sh
    command netLOAD.sh     /etc/STATsrv/bin/netLOAD.sh
    command psSTAT         /etc/STATsrv/bin/psSTAT.sh
    command bsdlink        /etc/STATsrv/bin/bsdlink.sh
    command bsdlink.sh     /etc/STATsrv/bin/bsdlink.sh
    command harSTAT.sh     /etc/STATsrv/bin/harSTAT.sh
    command harSTAT        /etc/STATsrv/bin/harSTAT.sh
    command harSTATus3     /etc/STATsrv/bin/harSTATus3.sh
    command harSTATus3.sh  /etc/STATsrv/bin/harSTATus3.sh
    command T3stat         /etc/STATsrv/bin/T3stat.sh
    command T3stat.sh      /etc/STATsrv/bin/T3stat.sh
    command sysinfo        /etc/STATsrv/bin/sysinfo.sh
    command SysINFO        /etc/STATsrv/bin/sysinfo.sh
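To make the access rules easier to check by hand, here is a minimal sketch that evaluates an access file for a given client IP and command name. It is only one reading of the rules stated in the file's comments, not the daemon's actual code: commands listed before any "access" line are open to every host, and a run of consecutive "access" lines restricts all following commands until the next "access" section. The function name lookup_command is hypothetical.

```shell
#!/bin/sh
# lookup_command ACCESS_FILE CLIENT_IP COMMAND_NAME
# Prints the full path of COMMAND_NAME if CLIENT_IP may run it and returns 0;
# prints nothing and returns 1 otherwise.
# Hypothetical helper: an interpretation of the access file comments,
# not the logic actually used by the STAT-service daemon.
lookup_command() {
    awk -v ip="$2" -v want="$3" '
        /^[ \t]*#/ || NF == 0 { next }      # skip comments and blank lines
        $1 == "access" {
            # The first "access" line after a command opens a new section.
            if (prev != "access") { restricted = 1; allowed = 0 }
            if ($2 == ip) allowed = 1
            prev = "access"
            next
        }
        $1 == "command" {
            prev = "command"
            if ($2 == want && (!restricted || allowed)) {
                print $3
                found = 1
                exit
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For example, "lookup_command /etc/STATsrv/access 129.157.1.2 netLOAD.sh" would print the script path only if that host is allowed to run netLOAD.sh under this interpretation.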