IBM RDAC and Windows Cluster Service
Posted on Wednesday, 28th May, 2008 in Life
Okay, so we received a brand new x3650 the other day, meant to replace one (or better, two) of our NAS frontend servers. We installed Windows on it (had to create a custom Windows Server 2003 CD first, since the default one doesn’t recognize the integrated ServeRAID), and we prepped the box during the week with the usual things.
On Monday I started installing the “IBM StorageManager RDAC” multipath driver (since the box got two single-port PCIe FC HBAs), figuring it’d be nice to have. I asked an IBM Systems Engineer at one of our partners, who told me there generally wouldn’t be a problem with Microsoft Cluster Services (MSCS) and the IBM MPIO driver. The only requirement would be to install the new storport.sys driver (version 5.2.3790.4021) first (as in Microsoft KB932755).
Now, yesterday I finished the zoning, did the mappings on the storage arrays and then figured the box should see the hard disks. So I started adding another node to our existing Microsoft Cluster.
Result: Zip (as in MSCS telling me not all nodes could see the quorum disk)
Reason: a combination of two things. First, said IBM Storage Manager RDAC. The first time I installed it, I forgot about the storage mappings, thus the box seeing zero disks. After uninstalling it, I was seeing 121 (that’s right, one hundred and twenty one) new devices.
That is basically a result of the zoning I did for this particular device, which has *all* controllers present in a single SAN zone, so the HBAs see each device eight (or nine) times. Update: yes, I’m missing one controller …
Now, as I reinstalled the RDAC *after* the host discovered the volumes, it’s showing only a dozen drives.
Now, having figured this out, I told myself “Hey, adding the third node to the Windows Cluster should now work without a hitch …” … guess what?
It’s Microsoft and it doesn’t. Now why doesn’t it work ? ‘Cause the Cluster Setup Wizard is getting confused in Typical mode, as it’s creating a “local quorum disk” which naturally isn’t present in the cluster it’s joining. Now, switching the wizard to “Advanced (minimum) configuration” as suggested in Q331801, just works … *shrug*
SLES, ZendOptimizer and IBM PowerPC(4)+
Posted on Tuesday, 10th July, 2007 in Life
What would you figure from the above ? Hopefully the rather obvious, that it’s a *really* shitty combination.
So we figured it would be a nice thing to test our new setup before going into pre-production testing or production, but we don’t have a spare box. So we took one of the power4 boxes we have mounted in the rack, basically consuming energy all day (that’s about 38 kWh a day), and installed SLES10 onto it. Which wasn’t all that bad: at first the box kept booting back into AIX after the install from CD, until we convinced the SMS (System Management Services, basically the BIOS on the power* boxes) with a hammer to boot from the first hard disk.
The real bad part started later. First the box committed suicide sometime on the weekend (the last one that is), which is rather not so good.
So we installed the ocfs2-tools (which are obviously needed if you want to do writes on a SAN volume mounted on two separate boxes), configured the o2cb thing to start automatically on boot and added the entry to /etc/fstab.
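For the record, the fstab step could look roughly like the sketch below. It is not what we actually typed: the function name, the device and the mount point are all hypothetical, and the only thing taken from the OCFS2 documentation is the `_netdev` option, which keeps the mount from being attempted before the network is up.

```shell
# Append an OCFS2 entry to an fstab file unless the device is already listed.
# The fstab path, device and mount point are passed in as arguments; the
# values in the usage note are made up for illustration.
add_ocfs2_entry() {
    fstab="$1"; device="$2"; mountpoint="$3"
    # Do nothing if the first column of some line already names the device
    if awk -v d="$device" '$1 == d { found = 1 } END { exit !found }' \
        "$fstab" 2>/dev/null; then
        return 0
    fi
    # _netdev delays the mount until the network is available
    printf '%s\t%s\tocfs2\t_netdev\t0 0\n' "$device" "$mountpoint" >> "$fstab"
}
```

Something like `add_ocfs2_entry /etc/fstab /dev/sdb1 /srv/www` would then add the line once and leave the file alone on later runs.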
So far so good, but as we slowly activated the apache-vhosts, we finally came to what cost me about three damned hours of my life:
child pid ### exit signal Segmentation fault (11)
Now guess what … ZendOptimizer just went bye-bye … Damn and what now ? So I looked at the Knowledgebase on zend.com, even found an Article stating it’d do that from time to time …
And, attached to it, the usual crap: “Please update to the latest version”. The only problem is that the latest version is indeed available for x86_64 (meaning amd64 in Gentoo terms), but not for ppc (even though the product page states it should be).
So I went home, knowing what the problem is (it was already past 4pm), swearing a short “frack that”.
Now that I’m home, have eaten something (a rather good salad), am listening to some Korn/Kid Rock/Offspring and have done some undertaker’s work, I asked myself “Why exactly do we need that crappy application anyway?” (beyond the obvious point that the ZendOptimizer is like, or rather is, a PHP compiler cache).
It turns out, one of my co-workers wrote a TYPO3-plugin interfacing our local research database .. and the catchy thing is, guess what …
He “guarded” it with ZendGuard, thus we need to use the ZendOptimizer thingy; otherwise we couldn’t use it either …

O RLY ?
SLES10 on pSeries
Posted on Wednesday, 4th July, 2007 in Life
Okay, yet another day passed by blazing fast. I had a good day at work, spent nearly the whole day trying to get my bloody systems hooked up to our SAN (which was interrupted by a non-working SAN switch, disappearing WWNs, lunch and my trainees), messing around with our internal network, hacking our Blade Chassis switches to get me what I want and some random paperwork.
But first things first .. We installed SLES10 on a pSeries box the other day (I think on Monday), and now I’m trying to get the WWN of its Emulex HBA out of either sysfs or procfs. But whatcha’ thinking?
I can’t get the dreaded WWN out of anything. Emulex’s hbacmd (from their HBAnyware utility) tells me there is no HBA and/or I don’t have the lpfc driver loaded (which can’t be, since I see IBM Tape Drives and my DS4300/FAStT900 via the lpfc), which is like …
So if any Emulex/pSeries expert is reading this, *please* (I beg you) tell me how the frack I get the WWN squashed out of it without looking either at the back of the rack or into the BIOS.
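One place that might be worth a look, assuming the lpfc driver on this kernel registers its ports with the FC transport class: each port then shows up as /sys/class/fc_host/hostN with a port_name and a node_name attribute. The sketch below just walks that directory. The function name is made up, and whether the SLES10 kernel on this box actually exposes those files is exactly the part I can't promise.

```shell
# Print the WWPN and WWNN of every FC host found under a sysfs directory.
# Assumption: the HBA driver registers with the FC transport class, so
# <base>/hostN/port_name and node_name exist. The optional first argument
# overrides the base directory (defaults to /sys/class/fc_host).
list_wwns() {
    base="${1:-/sys/class/fc_host}"
    for host in "$base"/host*; do
        # Skips the unexpanded glob when no FC host is present
        [ -r "$host/port_name" ] || continue
        wwpn=$(cat "$host/port_name")
        wwnn=$(cat "$host/node_name" 2>/dev/null)
        printf '%s wwpn=%s wwnn=%s\n' "${host##*/}" "$wwpn" "${wwnn:-unknown}"
    done
}
```

Calling `list_wwns` without arguments reads the live sysfs tree; if it prints nothing, the driver isn't exposing the attributes there and it's back to the rack or the SMS.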
And here’s, just for the record (my own, so I don’t need to look it up again), how to reset the attention indicators (basically LEDs) on the front of a pSeries box running Linux, which get turned on when either resetting the box or killing it during startup:
# Make sure we have powerpc-utils installed ..
pSeries ~ [0] $ rpm -qa | grep powerpc-utils
powerpc-utils-1.0.0-5.4
# Tell us which LEDs have which address/status
pSeries ~ [0] $ usysattn
U0.1 [on]
# Turn off the given LED
pSeries ~ [0] $ usysattn -l U0.1 -s normal
That’s it, the LED is off.


