The usual IT babble
Posts tagged MSCS
Nagios: Watching Clustered environments (the other way)
Mar 19th
Well, I recently started monitoring our cluster environments … Michael has a good howto on monitoring Windows cluster environments in the NSClient++ wiki.
Now, this has its own perks, which I stumbled upon when trying to write a Linux-HA OCF resource agent for the Nagios NRPE server. Combining Linux-HA with SLES10 is generally a good thing, but using startproc in that resource agent is not such a good idea.
Apparently Novell (or SuSE GmbH) thought it might be wise to include some additional logic in these wrappers: startproc, checkproc and killproc all check the name of the executable. So if you try to start an additional process with the same name, you need to dig a bit deeper.
For this to work, you need two additional options (quoted directly from man 8 startproc):
-p pid_file
(Former option -f changed due to the LSB specification.) Use an alternate pid file instead of the default (/var/run/<basename>.pid). The pid read from this file is being matched against the pid of running processes that have an executable with specified path of the program. In order to avoid confusion with stale pid files, a not up-to-date pid will be ignored.
Apparently, the pid file alone isn’t enough: startproc still refuses to start a second process. You also need this option:
-i ignore_file
The pid found in this file is used as session id of the same binary program which should be ignored by startproc.
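To make the two options a bit more concrete, here is a minimal sketch of how the start line for a second NRPE instance could be put together. The binary path, both pid files and the config file are assumptions for illustration (they are not from my actual setup), and it assumes the second instance's config points its pid_file setting at the file passed to -p. The function only prints the command, so nothing gets started by accident.

```shell
#!/bin/sh
# Sketch only: all paths are assumed, adjust them to your installation.
NRPE_BIN=/usr/sbin/nrpe

# Print the startproc command line for the clustered NRPE instance.
#   $1  pid file of the new (clustered) instance    -> startproc -p
#   $2  pid file of the already running instance    -> startproc -i
#   $3  config file of the clustered instance       -> nrpe -c
nrpe_start_cmd() {
    printf 'startproc -p %s -i %s %s -c %s -d\n' "$1" "$2" "$NRPE_BIN" "$3"
}

# Dry run: show what the resource agent's start action would execute.
nrpe_start_cmd /var/run/nrpe-cluster.pid /var/run/nrpe.pid /etc/nagios/nrpe-cluster.cfg
```

The point is the pairing: -p names the pid file of the instance you want to start, while -i names the pid file of the instance that is already running and should be ignored. Everything after the executable path is handed to nrpe itself.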
Tivoli Storage Manager Client and Microsoft Cluster Services
Jan 18th
Well, I just had another look at the client scheduler services on our Microsoft Cluster. A while back we noticed that those scheduler services were going nuts after some time, and as it turns out, I can now tell why. Microsoft Cluster Services have a feature called registry replication, which replicates a given registry key to all connected cluster nodes whenever it changes while the resource is online.
Now, we had added the obvious registry key (SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Nodes\<TSM NODE NAME>) to the settings of our cluster resources for the scheduler services, assuming the scheduler service would use that same key to store its passwords. But it seems we were far off with that assumption.
The scheduler service uses another registry key. It’s quite similar to the one the GUI is using, but it’s different enough (SOFTWARE\IBM\ADSM\CurrentVersion\Nodes\<TSM NODE NAME>).