Monday, May 21, 2007

It's the People

It has been a sh-t week.

Thursday left at 10pm.

Friday left at 11pm.

Call outs on Sunday.

One power outage and my whole week + weekend ends up screwed.

IT equipment doesn't like power outages. One motherboard change and two interconnect HBA card changes later, everything seemed to be up and running. Hardware failures I can handle; it's the people that get to me.

After I got my downtime approved, some IT Continuity Manager called up and said that the outage was not to go ahead, and if we were to replace broken parts, then we were to do it without any downtime.


He may as well have asked a doctor to perform open heart surgery without putting the patient to sleep first.

I explained that the interconnect cards were down. That is the hardware that controls the cluster failover links, the ability to have redundancy. If there was another power outage, or any other hardware failure, and the system went down, there would be an extended period of downtime during the day that I could not account for.

After another hour of phone calls, it was still a 'No-Go'.

I wonder what part of "If system suffers another problem, and Interconnects not fixed, system will go down indefinitely" he did not understand.

He said that if I couldn't guarantee that the system would be up by 7am the next day, then I couldn't go ahead. I told him I couldn't guarantee that the system would be up in the next 5 minutes.

I was so tempted to just shut down everything and report a Severity 1.

Although if I did do that, I would probably be spending the rest of the month filling out incident reports.

In the end my manager took the phone and said, 'If we don't do anything tonight, and system goes down tomorrow, I will put it down that you were the one who didn't allow work to be done tonight.'

I didn't hear the answer, but once my manager put the phone down, he told me it would go ahead that night.

That was Thursday night. After 4 reboots, the system still hadn't come back up even with the interconnects replaced, and I realised the problem was much bigger than just the HBA cards. It looked like a new motherboard was needed, and it would take another 4 hours for parts to be dispatched onsite.

Normally, I wouldn't mind staying back. I could just go home, get something to eat and come back when the parts arrived. But not this time. I called my manager up and asked if he wanted me to stay. He said no, I could do it the next night.

So, Friday morning I called to let him know that it didn't go as planned and that further downtime would be required.

He didn't sound happy. But I really didn't care anymore.

1 comment:

- chi sin - said...

oh no... poor astrogirl...
yes... you are such a computer surgeon...