“Help! My EMS is overloaded!”
Does this sound familiar? When your NonStop gets very busy, your EMS also
gets very busy. In fact, sometimes you may find that EMS processing consumes a
lot of your CPU resources just for processing the flood of error messages. Why
are there so many messages in EMS?
Does this remind you of your EMS?
Many users dump EVERYTHING into EMS. The original intention of EMS was to allow
all the different errors to be analyzed and filtered in one place. But when everything
goes into this one pipe, the result is an overloaded, clogged pipe. When you dump
too much into EMS:
- It becomes difficult to find the error messages
- It consumes a lot of CPU resources for EMS to file the messages
- Operations staff tend to start ignoring messages on the EMS console because there are too many of them
There is a better way – LogWatch
Instead of clogging up EMS, use LogWatch to monitor the different log files and work in conjunction with EMS.
LogWatch can monitor different files including:
- Guardian files
- OSS logs
- VHS logs
- Pathway logs
- Third party logs, etc.
Lighten up the EMS load
Here is a quick way to reduce EMS load: instead of routing your application errors to EMS, write them to disk logs.
- Use LogWatch to monitor these application log files for errors.
- LogWatch is scalable – you can have different instances of LogWatch monitoring different things.
- LogWatch is easy to set up – you can set one up in minutes, and it won’t interfere with other instances.
- Have LogWatch route only the errors to EMS.
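To make the idea concrete, here is a minimal sketch in Python of the filtering step: read application log lines and pass along only the ones that look like errors. This is not how LogWatch is implemented or configured; the pattern, the function name `select_errors`, and the sample log lines are all made up for illustration.

```python
import re
from typing import Iterable, Iterator

# Hypothetical pattern: what counts as an "error" depends on your application's log format.
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL)\b")

def select_errors(lines: Iterable[str], pattern=ERROR_PATTERN) -> Iterator[str]:
    """Yield only the log lines that match the error pattern."""
    for line in lines:
        if pattern.search(line):
            yield line.rstrip("\n")

# Made-up sample log content, standing in for an application disk log.
sample_log = [
    "2024-01-01 10:00:00 INFO  order 1001 accepted\n",
    "2024-01-01 10:00:01 ERROR database timeout on order 1002\n",
    "2024-01-01 10:00:02 INFO  order 1003 accepted\n",
    "2024-01-01 10:00:03 FATAL server class stopped\n",
]

# Only these lines would be candidates for forwarding to EMS.
forwarded = list(select_errors(sample_log))
print(f"{len(forwarded)} of {len(sample_log)} lines forwarded")
```

In this sample, the informational lines stay in the disk log and only the two error lines would move on, which is the load reduction described above.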
Perfect companion to Prognosis or MOMI
If you are using a performance monitoring tool like Prognosis or MOMI, you will find that LogWatch works with it very effectively.
- Use LogWatch to monitor disk log files for errors.
- Configure LogWatch to route a message to EMS with a specific message ID or text pattern.
- Enable Prognosis or MOMI to pick up these specific messages from EMS to take corrective actions.
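The hand-off in the steps above relies on each kind of error carrying a predictable message ID that the downstream tool can look for. The Python sketch below illustrates that mapping in the abstract. The rule table, the message IDs, the `to_event` function, and the output format are all hypothetical; they are not LogWatch configuration syntax or a real EMS event layout.

```python
import re
from typing import NamedTuple, Optional

class Rule(NamedTuple):
    message_id: int   # hypothetical ID the monitoring tool would be told to watch for
    pattern: str      # regular expression matched against each log line

# Made-up rules for illustration only.
RULES = [
    Rule(1001, r"database timeout"),
    Rule(1002, r"server class .* stopped"),
]

def to_event(line: str, rules=RULES) -> Optional[str]:
    """Return a tagged message for the first matching rule, or None if no rule matches."""
    for rule in rules:
        if re.search(rule.pattern, line):
            return f"{rule.message_id}: {line}"
    return None

print(to_event("ERROR database timeout on order 1002"))
print(to_event("INFO order 1003 accepted"))
```

Because each rule produces a fixed ID, a tool such as Prognosis or MOMI only needs to be told which IDs to act on, rather than parsing free-form log text itself.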
Take Away – “Prevention is better than cure”
More than many other IT folks, NonStop users understand and appreciate the importance of availability, the cornerstone of the platform. But applications do encounter errors, which can lead to outages. When that happens, it is important to recover from the failure as quickly as possible. Any extended downtime due to an unavailable application translates to loss of revenue and user confidence. With some advance planning and a good implementation plan for log monitoring, problems can be detected early and remedied promptly.
- Analyze your logs – Where are the logs? What is written to the application logs? Take a look at some of the old logs and see what is going on in the environment.
- Plan ahead – What are some of the log messages that require specific actions? What actions? Who should be responsible for actions?
- Execute the plan – Start implementing a plan to monitor the key log files, and automate the log monitoring process with a tool like LogWatch.
Feedback please
Do you find this tutorial blog helpful? Let us know what you think, and how we can make it even better. Don’t forget, you can subscribe to our blogs (top right-hand corner) to get automatic email notification when a new blog is available.
Phil Ly is the president and founder of TIC Software, a New York-based company specializing in software and services that integrate NonStop with the latest technologies, including Web Services, .NET and Java. Prior to founding TIC in 1983, Phil worked for Tandem Computer in technical support and software development.