(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and bastardization of classic Unix |
In HPOM, a message can be generated whenever specified threshold values are met or exceeded. Because you may not want to create a message for a single short-term peak, HPOM enables you to define a time period over which the monitored value must exceed the threshold before generating a message.
NOTE: A set duration (for example, three minutes) does not necessarily mean that the monitored value exceeds the threshold throughout the whole period. A message is generated when all samples collected during the polling interval have exceeded the threshold.
You can start corrective actions immediately by configuring automatic or operator-initiated actions as responses to the message. You can define monitors that respond to existing problems, as well as monitors that respond to developing problems. As a result, you can use monitoring as both a proactive and a reactive tool.
You can integrate new or existing monitoring programs or utilities, then specify minimum or maximum thresholds. You specify a polling interval that directs HPOM to start the monitor. The results of the monitor program are read by HPOM and compared with the threshold limits you have defined.
For example, you can integrate the UNIX utility who(1) to check how many users are logged on, or df(1M) to check the number of free disk blocks. The result of the script is compared to a threshold limit you define, and a message is generated if the threshold is exceeded. By setting a threshold beneath the maximum acceptable limit, you warn the operator before performance exceeds the absolute limit. In this way, you can manage thresholds proactively by starting corrective actions before problems affect users.
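As a rough illustration of the who(1) case, a monitor script might look like the sketch below. The opcmon path and the name=value call form are taken from the process-monitoring script later on this page; the monitor name users_logged_in, the helper function names and the OPCMON override variable are assumptions made for this example only.

```shell
#!/bin/sh
# Sketch only: count logged-on users and hand the value to the monitor agent.
# OPCMON can be overridden (for example OPCMON=echo) to try this without an agent.
OPCMON="${OPCMON:-/opt/OV/bin/OpC/opcmon}"

# Count the lines of who(1)-style output read from standard input.
count_users() {
    wc -l | tr -d ' '
}

# Send NAME=VALUE to the monitor agent.
report_value() {
    "$OPCMON" "$1=$2"
}

# A real monitor script would end with a call such as:
# report_value users_logged_in "$(who | count_users)"
```

The threshold itself is not part of the script. It stays in the policy, so the warning level can be changed without touching the script.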
The HPOM monitor agent supports the following types of monitors:
The first time the monitored value exceeds its threshold, the timer is started. Or, if the duration is not specified, a message is generated. If all subsequent values reported by opcmon within the specified interval exceed the specified threshold, a message is sent. The monitored value does not necessarily exceed the threshold throughout the whole period, but specifically when each sample is collected.
The monitor agent checks the success of the scripts and programs by reading the exit value. If the exit value is not equal to zero, the monitor agent sends a message about the execution failure of the monitor to the message agent. This is a different message than a regular message, which should be sent via opcmon.
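A monitor script can use this exit-value check deliberately: report a value only when collection worked, and return a non-zero status otherwise. The sketch below assumes the opcmon name=value call form used elsewhere on this page; the function name collect_and_report and the OPCMON override are hypothetical.

```shell
#!/bin/sh
# Sketch only: report a value when collection succeeds, fail loudly otherwise.
OPCMON="${OPCMON:-/opt/OV/bin/OpC/opcmon}"

# collect_and_report NAME COMMAND [ARGS...]
# Runs COMMAND to obtain the value. Returns non-zero if COMMAND fails or
# prints nothing, so the calling script can exit with that status.
collect_and_report() {
    name=$1
    shift
    value=$("$@") || return 1
    [ -n "$value" ] || return 1
    "$OPCMON" "$name=$value"
}

# A real monitor script would end with something like:
# collect_and_report users_logged_in sh -c 'who | wc -l' || exit 1
```

With this pattern a broken collector shows up as an execution-failure message instead of silently reporting nothing.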
The monitoring scripts or programs collect the current value of the monitored object. The value is sent to the monitor agent through the opcmon application program interface (API) or through the command interface provided by HPOM.
The monitor agent checks the value against the configured threshold. If the threshold is exceeded, the monitor agent sends a message. For details, see the HPOM Administrator’s Reference. In addition, the program monitor is used to integrate metrics collected by the embedded performance component.
In this instance, <community> is the community for which the snmpd is configured.
The first time the monitored value exceeds the threshold, the timer starts counting. Each time the value is rechecked and still exceeds the threshold, the counter is incremented and compared with the specified duration. When the duration is reached, a message is generated.
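The counting behaviour described above can be simulated with a few lines of shell. This is an illustration, not HPOM code. It assumes that the duration is expressed as a number of consecutive polls, that values are integers, that a sample at or above the threshold counts as exceeding it, and that a sample below the threshold resets the counter.

```shell
#!/bin/sh
# polls_until_message THRESHOLD POLLS VALUE...
# Prints the 1-based index of the sample at which a message would be
# generated. Returns 1 if the value never stays high for POLLS samples in a row.
polls_until_message() {
    threshold=$1
    needed=$2
    shift 2
    count=0
    i=0
    for v in "$@"; do
        i=$((i + 1))
        if [ "$v" -ge "$threshold" ]; then
            count=$((count + 1))
            if [ "$count" -ge "$needed" ]; then
                echo "$i"
                return 0
            fi
        else
            count=0    # value dropped back: start counting again
        fi
    done
    return 1
}

# Threshold 85, three polls required; the short peak at sample 1 is ignored.
polls_until_message 85 3 90 80 86 88 91
```

Run as written, the last line prints 5: the single peak in the first sample does not produce a message, but the run of three high samples ending at the fifth poll does.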
The following three message generation policies are available for use with threshold monitors:
Instrumentation data planned for deployment is placed in the instrumentation directory on the HP Operations management server, at the following location:
/var/opt/OV/share/databases/OpC/mgd_node/
NOTE If no categories are created, the data from the monitor directory is deployed anyway. Although the category-based distribution method is recommended, you can choose to distribute your monitor from this directory. If you do so, you must place the monitor on the management server in a directory specific to each managed node platform to which it will be distributed. For example, monitor programs or scripts for HP-UX 11i managed nodes are located on the management server at:
/var/opt/OV/share/databases/OpC/mgd_node/customer/hp/ipf32/hpux1100/monitor
All distribution methods, the administration tasks related to them (including category management), and the instrumentation data directory structure, are described in the HPOM Administrator’s Reference.
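For the directory-based method, placing a monitor script on the management server could be scripted roughly as follows. The base directory and the hp/ipf32/hpux1100 platform path come from the text above; the function name install_monitor and the INSTR_BASE override are assumptions for this sketch, and category-based placement is not covered.

```shell
#!/bin/sh
# Sketch only: copy a monitor script into the platform-specific monitor
# directory on the management server. Override INSTR_BASE for a dry run.
INSTR_BASE="${INSTR_BASE:-/var/opt/OV/share/databases/OpC/mgd_node}"

# install_monitor SCRIPT PLATFORM_PATH
install_monitor() {
    dest="$INSTR_BASE/customer/$2/monitor"
    mkdir -p "$dest" &&
    cp "$1" "$dest/" &&
    chmod 755 "$dest/$(basename "$1")"
}

# Example for HP-UX 11i managed nodes:
# install_monitor ./fs_util_mon.sh hp/ipf32/hpux1100
```

Copying the file only stages it on the server. It still has to be distributed to the managed nodes before a policy can call it.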
a. In the instrumentation directory, place the monitor program or script properly (for each managed node platform to which you want to distribute the data) within the category related to your data. If there is no such category, you can create and assign it to your policy, and/or managed node.
2. Distribute the threshold monitor to the managed nodes.
To do this, use the opcragt command-line utility (see the opcragt(1M) man page for usage information).
On HPOM managed nodes, all deployed instrumentation data (category-based instrumentation, as well as the monitor|actions|cmds files) is located in the following directory: /var/opt/OV/bin/instrumentation
3. Configure the threshold monitor policy.
Use the opcpolicy command-line tool to upload a threshold monitor policy.
Each policy defines a monitor, including automatic actions or operator-initiated actions to be started if the threshold is exceeded.
NOTE If you have chosen a category-based method for distributing your threshold monitor, make sure that the appropriate categories are assigned to the policy.
4. Configure conditions for the threshold monitor policy. The MSGCONDITIONS section of the policy body determines whether the matched condition produces a message that is sent to the Java GUI Message Browser. You can further filter the messages by using the SUPPRESSCONDITIONS sections.
NOTE If you have more than one condition for a monitor, the order of the conditions is important.
Order the conditions according to size of the threshold value:
You can re-configure a threshold monitor policy by editing the policy body of the ADVMONITOR policy.
In this example, a user is running a custom filesystem utilization calculation script fs_util_mon.sh, which calls the opcmon command-line tool to pass the calculated value back to the monitor agent, naming it extra_util (passed as a parameter to the script). The monitor agent will produce a message when the filesystem utilization is higher (MAXTHRESHOLD) than the configured one (THRESHOLD). However, messages will not be sent again until the utilization falls below the value configured with the RESET keyword. All messages will have severity set to Warning, application field set to “Filesystem”, object to “/extra” and message group to “Disks”. No other messages will be sent to the management server apart from ones matching the configured condition.
ADVMONITOR "extra_util"
    DESCRIPTION "Monitor /extra filesystem utilization"
    INTERVAL "5m"
    INSTANCEMODE SAME
    MAXTHRESHOLD
    SEVERITY Warning
    PROGRAM "Source"
        DESCRIPTION "Universal FS usage monitoring script"
        MONPROG "fs_util_mon.sh /extra extra_util"
    MSGCONDITIONS
        DESCRIPTION "Monitor /extra FS util"
        CONDITION
            THRESHOLD 85.00
            RESET 80.00
        SETSTART
            SEVERITY Warning
            APPLICATION "Filesystem"
            MSGGRP "Disks"
            OBJECT "/extra"
            TEXT "Filesystem /extra utilization <$VALUE> exceeds configured threshold <$THRESHOLD>"
            AUTOACTION "du -k /extra" ANNOTATE
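The contents of fs_util_mon.sh are not shown in the source. A hypothetical version, assuming POSIX df -P output (capacity in the fifth column) and the opcmon name=value call form, might look like this; the function names and the OPCMON override are inventions for the sketch.

```shell
#!/bin/sh
# Hypothetical sketch of fs_util_mon.sh FILESYSTEM MONITOR_NAME
OPCMON="${OPCMON:-/opt/OV/bin/OpC/opcmon}"

# Read "df -P" output on standard input and print the capacity percentage
# (without the % sign) from the first data line.
parse_df_capacity() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# fs_util_mon FILESYSTEM MONITOR_NAME
# Reports the utilization of FILESYSTEM under MONITOR_NAME.
# Returns non-zero if no value could be determined.
fs_util_mon() {
    util=$(df -P "$1" | parse_df_capacity)
    [ -n "$util" ] || return 1
    "$OPCMON" "$2=$util"
}

# The real script would end with:  fs_util_mon "$@"
```

Called as fs_util_mon.sh /extra extra_util, this would submit the utilization of /extra under the monitor name that the policy expects.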
HPOM provides a set of default threshold monitors. For details, see the HPOM HTTPS Agent Concepts and Configuration Guide.
You can set conditions for threshold monitor policies to monitor multiple instances of a single monitored object.
To set conditions for threshold monitors, follow these steps:
1. Use the opcmon(1) command with the option -object to submit the name of the monitored object to the monitor agent. The option -option passes additional information to the monitor agent. This information can be used in the message text or referenced in corrective actions.
HPOM compares the name against the pattern set with the OBJECT keyword in the advanced monitor policy body.
2. Use the HPOM pattern-matching language to match the incoming object pattern.
For more information, see the opcmon(1) manpage. For an example of how you can monitor disk utilization in different file systems, see “Examples of Threshold Monitor Conditions” on page 405.
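Submitting one value per monitored instance, as in step 1, could be sketched like this. The -object option is named in the text above; the monitor name disk_util is the one used in the examples that follow, while the function names, the input format and the OPCMON override are assumptions. The -option form is left out because its exact syntax should be checked in the opcmon(1) manpage.

```shell
#!/bin/sh
# Sketch only: submit one value per monitored instance.
OPCMON="${OPCMON:-/opt/OV/bin/OpC/opcmon}"

# report_instance MONITOR VALUE OBJECT
report_instance() {
    # stdin is redirected so the call cannot consume the caller's input
    "$OPCMON" "$1=$2" -object "$3" </dev/null
}

# report_all MONITOR
# Reads "OBJECT VALUE" pairs from standard input, one pair per line.
report_all() {
    while read -r object value; do
        report_instance "$1" "$value" "$object" || return 1
    done
}

# Example:
# printf '/var 91\n/ 40\n' | report_all disk_util
```

Each call carries the file system name as the object, so a single policy can hold separate conditions whose OBJECT patterns match /var, / and so on.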
When setting up multiple conditions with different threshold and reset values for a monitored object in one policy, you receive messages whenever the monitoring range of another condition is reached. Consider the example in Figure 4-19 on page 403. The figure shows three conditions, each with a maximum threshold and a reset.
At the fifth polling (five minutes), the value exceeds the threshold of condition my_mon 1 (threshold value = 99) and a message is sent (A). One minute later, the value drops below the reset value of condition my_mon 1 (reset value = 95). Since it exceeds the threshold value of condition my_mon 2 (threshold value = 90), another message is sent (B). This means that reaching a condition’s monitoring range from above also generates a message, although the value does not drop below the reset value of that condition.
After 11 minutes, the value drops below the threshold value of condition my_mon 2 (threshold value = 90) but still exceeds the reset value of 85. Another message is generated (C). The original message text of this message reports Reset value still exceeded because only the threshold value was crossed, not the reset value. This message is generated only when the monitored value drops below the threshold value. When the monitored value exceeds the threshold value, the reset value is also exceeded and the message is not generated. In this example, you receive many messages for the same monitored object. To reduce the number of messages in the browser, configure your conditions so that messages are acknowledged automatically. For more information, see “State-Based Browsers” on page 363.
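Why the order of conditions matters can be shown with a small first-match simulation. It is not HPOM code. It lists only the two conditions whose thresholds are given in the text (99 and 90), writes their names as my_mon_1 and my_mon_2, treats a value at or above the threshold as a match, and ignores reset handling entirely.

```shell
#!/bin/sh
# first_matching_condition VALUE
# Conditions are tried in the listed order, highest threshold first.
# Prints the first condition whose threshold is reached, or returns 1.
first_matching_condition() {
    value=$1
    while read -r name threshold; do
        if [ "$value" -ge "$threshold" ]; then
            echo "$name"
            return 0
        fi
    done <<EOF
my_mon_1 99
my_mon_2 90
EOF
    return 1
}

first_matching_condition 100
first_matching_condition 93
```

The two calls print my_mon_1 and my_mon_2. If the condition with threshold 90 were listed first, a value of 100 would already match it and the condition with threshold 99 would never be reached.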
The following examples show how you can use threshold monitor conditions to monitor the disk space in the /var and / file systems with the disk_util threshold monitor policy. These examples assume that you have written a shell script that determines and reports the disk utilization in each file system.
The HPOM event interceptor (opctrapi) is the message interface for feeding SNMP traps into HPOM.
Defaults for Intercepting Traps and Events
By default, HPOM intercepts SNMP traps and CMIP (Common Management Information Protocol) events as follows:
Figure 4-21 shows the relationship between opctrapi and the HP processes that forward SNMP traps and CMIP events to HPOM.
The ovtrapd background process is responsible for receiving SNMP traps and CMIP events on port 162. The process buffers the traps and events, and passes them to the Postmaster process (pmd). The pmd process routes the events it receives from ovtrapd to a subsystem (for example, opctrapi or the file trapd.conf). opctrapi then enters them into the HPOM message stream. The trapd.conf file contains definitions for the handling of SNMP traps (generated by SNMP agents) and events (generated by applications registered with pmd). These definitions can be converted to HPOM message or suppress conditions with the ovtrap2opc utility. For details, see the ovtrap2opc(1M) manpage.
On some managed node platforms, the HPOM event interceptor can also directly access port 162 and capture SNMP traps. For details, see the HPOM Administrator’s Reference.
Avoiding Duplicate Messages
Although the HP discovery process configures the SNMP devices to send the traps to the management server, SNMP devices may broadcast traps to several systems. SNMP devices that do this may create duplicate messages if the traps are forwarded to one management server by several managed nodes.
To avoid this situation, follow these guidelines:
This sample catches the Cisco linkDown trap (.1.3.6.1.4.1.9.2.0) produced by Cisco routers. When the trap is caught, a message is produced with its severity set to Warning. Notice that the enterprise is specified separately from the generic trap. Variables <$1> and <$2> are part of the trap (link index and description, respectively).
SNMP "Sample trap interceptor template"
    DESCRIPTION "This catches the Cisco linkDown trap"
    CONDITION
        $G 2
        $e ".1.3.6.1.4.1.9"
    SET
        MSGTYPE "Cisco_Link_Down"
        SEVERITY "Warning"
        OBJECT "<$2>"
        TEXT "Interface <$1> down"

Example of an SNMP Trap Condition

HP Data Protector issues the following SNMP trap when a backup starts and a syntax error is detected in the worklist file:

snmptrap idriss1 1.3.6.1.4.11.2.3.2 15.232.117.22 58916871 6 \
    1.3.6.1.4.11.2.15.2.0 Integer 1 \
    1.3.2.1.4.11.2.15.3.0 OctetString doghouse.bbn.hp.com \
    1.3.2.1.4.11.2.15.4.0 OctetString "HP Data Protector:[Error](Worklist Syntax)Can't open worklist '/etc/omni/work' Status:Critical" \
    1.3.2.1.4.11.2.15.5.0 OctetString "Critical" \
    1.3.2.1.4.11.2.15.6.0 OctetString "dp"

The SNMP trap policy needs a condition with the following definition:

Node:               doghouse
Enterprise ID:      1.3.6.1.4.11.2.3.2
Generic Trap ID:    6
Specific Trap ID:   58916871 (SNMP status event)
Variable Bindings:
    Application Type:    1 (agent)
    Object ID:           mailhouse.bbn.hp.com.omniback
    Event Description:   HP Data Protector: [Error](Worklist Syntax)Can't open worklist '/etc/omniback/work' Status:Critical
    Trap-specific Data:  critical
Set Attribute:
    Severity:            critical
    Message Group:       print services
    Text:                Error in HP Data Protector: <text>
Internal HPOM error messages can be extracted from or filtered out of the internal Message Stream Interface (MSI) so that automatic and operator-initiated actions may be attached, and the message treated as if it were a normal, visible HPOM message. You can enable this functionality on the managed node and on the management server. Depending on where the functionality is enabled, all internal HPOM messages are sent back to the local message interceptor, either on the HP Operations management server or on the managed node. There they are read and handled in the same way as any other HPOM message.
Management Server
On the management server, use the ovconfchg command-line tool. Enter the following:
ovconfchg -ovrg <OV_resource_group> -ns opc -set OPC_INT_MSG_FLT TRUE
In this command, <OV_resource_group> is the name of the management server resource group.
Managed Nodes
On HTTPS-based managed nodes, use the ovconfchg command-line tool. Enter the following:
ovconfchg -ns eaagt -set OPC_INT_MSG_FLT TRUE
Set up at least one condition for internal HPOM error messages in the opcmsg (1/3) policy (using message group OpC). Then set the SUPP_DUPL_IDENT_OUTPUT_MSG keyword in the policy body.
For the full list, see HP Operations Manager Policy variables.
Variables for Threshold Monitor Policies Only. The variables listed below can be used in most threshold monitor policy text entry fields (exceptions are noted). The variables can be used within OVO, or passed to external programs.
Hi all,
I have a threshold template that triggers every 24 hours.
I deployed the template yesterday evening.
So here is my question: is it normal that I don't get anything from that template yet?
It triggers an automatic action that should fill a database, but nothing happens.
Any idea? Does the template trigger only after a full period?
Jean-Bernard
Yes, it triggers after 24 hours. :) It is not a crontab entry that executes at fixed times. The schedule is "every 24 hours after deployment"; likewise, "every 2 minutes" means two minutes after the previous run, not at minutes 2, 4, 6 and so on. If you need it to be triggered at specific times, use a schedule template to execute the same script you use in the monitor template, and set the monitor template to "external" so that it catches your script's output.
October 21, 2010
Use the following script to monitor a process on a managed node from HPOM.
To use the script, create a measurement threshold policy and enter the program name as: "script_name" "monitor_name" "process_name"
Set threshold 1 for the "not running" condition and threshold 0 for the "running" condition.
#!/bin/sh
### ENV ###
MON_NAME=$1                        # monitor name, as configured in the policy
OPCMON="/opt/OV/bin/OpC/opcmon"
OPTION="-option proc"
CMD="/usr/bin/ps"
PROCESS=$2                         # name of the process to look for
PROGNAME=`basename $0`
OBJECT=""
### ENV ###

# Find whether the process is running: count the matching ps entries
CMD_OUTPUT=`$CMD -ef | grep "$PROCESS" | grep -v grep | wc -l`
if [ ${CMD_OUTPUT} -eq 0 ]
then
    STATUS=1    # process not running
else
    STATUS=0    # process running
fi

# Report the status to the monitor agent
$OPCMON "$MON_NAME=$STATUS"
exit 0
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Last modified: March 12, 2019