Re: [Wg-test-framework] baroque1 hardware problem
On Wed, 20 May 2015 11:54:51 -0400
Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> wrote:

> Ian Jackson writes ("Minutes All-Net synchronisation call, 20th May"):
> > ACTION: Ian to send technical details, so that All-net can raise
> > with supplier (Intel).
>
> So, the problem is as follows:
>
>
> Summary:
> --------
>
> Sometimes, when powered on, baroque1 does not come up.
>
> Symptoms are that the serial control lines do change (visible in
> sympathy log), but no text is printed on the serial console.  Waiting
> a long time (up to at least ten minutes) has no effect.  Sending
> "return" on the serial console elicits no response.
>
> After the problem has occurred, often more than one further attempt to
> power cycle the machine is required to get it to work again.  After
> that it works normally until the fault recurs.
>
>
> Repro method:
> -------------
>
>  * Write a pxeboot file which refers to a stock Wheezy amd64
>    debian-installer netboot image, and specifies a preseed file.
>
>  * Power off (via the PDU).
>
>  * Wait 30s.
>
>  * Power on (via the PDU).
>
>  * Monitor the preseed file http server access log waiting for the
>    preseed file to be downloaded.
>
>  * When the preseed file has been fetched, declare "success".
>
>    Then run round for the next repetition.  (Generally, this means
>    that the server is powered off in the middle of one of
>    debian-installer's software-fetching steps.)
>
>    Alternatively, after 350s, declare "failure" and stop.
>
>
> Statistical information:
> ------------------------
>
>  * My records show failures after the following number of repetitions:
>      96 (not quite sure about this - data collection was affected by an
>          unrelated network problem on my workstation)
>      29, 25, 26.
>
>  * My records show the following number of attempts needed to get the
>    machine to work at all, again:
>      1, 3, 2, 3
>
>  * I have run the same test on baroque0.  It has managed (at least) 400
>    consecutive power cycle restarts without problem.
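
For a rough sense of how strongly the baroque0 result separates the two
machines, here is a back-of-envelope sketch. It assumes (this is not stated
in the report) that each quoted figure is the length of a run of power
cycles that ended in exactly one failure, and that cycles fail
independently of each other. The function names are made up for
illustration.

```python
# Repetition counts before each observed failure on baroque1, as quoted above.
# The first figure (96) was flagged as uncertain in the report.
RUNS_TO_FAILURE = [96, 29, 25, 26]


def per_cycle_failure_rate(runs):
    """Crude estimate: one failure per run, divided by the total cycle count."""
    return len(runs) / sum(runs)


def prob_all_succeed(rate, cycles):
    """Probability of `cycles` clean boots in a row, assuming independence."""
    return (1.0 - rate) ** cycles


if __name__ == "__main__":
    rate = per_cycle_failure_rate(RUNS_TO_FAILURE)
    print("estimated failure rate per cycle:", rate)
    # Chance that a machine with the same fault rate survives 400 cycles.
    print("chance of 400 clean cycles:", prob_all_succeed(rate, 400))
```

With the quoted figures this works out to roughly one failure in 44
cycles, and a chance on the order of 1 in 10,000 that a machine with the
same fault rate would get through 400 cycles cleanly. The independence
assumption is shaky, since the report says a failed machine often needs
several attempts to recover, so treat this as an order-of-magnitude check
only.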
>
>
> Handover:
> ---------
>
> I hereby hand both baroque0 and baroque1 over to you.  (It seems most
> sensible to give you the working machine too, for comparison.)

Noted.

> The current setup in the colo is the PXE configuration as described
> above.
>
> So I think it should be possible to reproduce the problem as follows:
>
>  - power baroque1 off
>  - wait 30s
>  - power baroque1 on
>
>  - wait for it to show life on the serial console
>
>  - wait for it to show entry into debian-installer
>    (eg wait for "Setting up the clock" to be printed on the
>    serial console), then declare success and go round again

I should be able to modify the oseleta test script to do this.

> I have disconnected the serial consoles of both machines from
> sympathy, so you should be able to connect to them with picocom or
> expect on /dev/ttyRP5 and /dev/ttyRP6.

Thanks.

> NB that I am now going to be away until next Wednesday morning.

NB'ed.

-d

> Ian.
>

_______________________________________________
Wg-test-framework mailing list
Wg-test-framework@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/wg-test-framework
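
The serial-console variant of the repro loop described above could be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the actual test script: `power_off`, `power_on` and
`read_chunk` are hypothetical callables that the caller would supply (a
PDU wrapper, and something that returns whatever bytes are currently
available from whichever of /dev/ttyRP5 or /dev/ttyRP6 is wired to
baroque1). The 350s boot timeout is borrowed from the original
access-log method; whether it suits the serial check is a guess.

```python
import time

MARKER = b"Setting up the clock"  # debian-installer text quoted above


def wait_for_marker(read_chunk, marker, timeout_s,
                    clock=time.monotonic, sleep=time.sleep):
    """Return True once `marker` shows up in console output, False on timeout."""
    deadline = clock() + timeout_s
    buf = b""
    while clock() < deadline:
        chunk = read_chunk()  # hypothetical: bytes currently available, or b""
        if not chunk:
            sleep(0.5)  # nothing new yet; avoid a busy loop
            continue
        buf += chunk
        if marker in buf:
            return True
        # Keep only a short tail so a marker split across chunks still matches.
        buf = buf[-len(marker):]
    return False


def power_cycle_until_failure(power_off, power_on, read_chunk, max_cycles,
                              off_wait_s=30, boot_timeout_s=350,
                              marker=MARKER,
                              clock=time.monotonic, sleep=time.sleep):
    """Return the number of successful cycles before the first failed boot.

    Returns `max_cycles` if every cycle reached the marker.
    """
    for completed in range(max_cycles):
        power_off()        # hypothetical PDU wrapper
        sleep(off_wait_s)  # "wait 30s"
        power_on()         # hypothetical PDU wrapper
        if not wait_for_marker(read_chunk, marker, boot_timeout_s,
                               clock=clock, sleep=sleep):
            return completed  # declare failure and stop
    return max_cycles
```

The clock and sleep functions are parameters so the loop can be dry-run
against canned console output without touching the hardware. The sketch
stops at the first failure, which leaves the machine in the faulty state
for inspection.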