Re: osstest down, PDU failure
Ian Jackson writes ("osstest down, PDU failure"):
> Currently, osstest is not working. We have lost one of our PDUs,
> meaning that about half a rack is out of action, including one of the
> VM hosts.
>
> There has been quite a bit of outstanding maintenance which has been
> deferred due to the pandemic. I am trying to see if we can get
> someone on-site to the colo, in Massachusetts, soon. A complication
> is that the replacement PDU is still in New York. Again, due to the
> pandemic.
I managed to get an on-site look by the staff of the colo facility. A
breaker had tripped, depriving our PDU of power. They reset the
breaker. The VM host has come back fully operational. I have
verified that all the test boxes connected to that PDU (apart from one
known-dead box) are powered and responsive enough. Initial reports
from a smoke flight were encouraging, so I have re-enabled everything.
It may trip again of course.
A power trip in a colo is not a normal event, but we haven't
determined the root cause. The colo facility are going to ask their
electrical supply technicians to investigate the trip. I think the
breaker or associated equipment is probably "smart" and will have some
useful records.
Ian.