
Re: [Xen-devel] [BUG 1747]Guest could't find bootable device with memory more than 3600M



On Thu, 2013-06-13 at 17:55 +0100, Stefano Stabellini wrote:

> > > We could have a xenstore flag somewhere that enables the old behaviour
> > > so that people can revert back to qemu-xen-traditional and make the pci
> > > hole below 4G even bigger than 448MB, but I think that keeping the old
> > > behaviour around is going to make the code more difficult to maintain.
> > 
> > The downside of that is that things which worked with the old scheme may
> > not work with the new one though. Early in a release cycle when we have
> > time to discover what has broken then that might be OK, but is post rc4
> > really the time to be risking it?
> 
> Yes, you are right: there are some scenarios that would have worked
> before that wouldn't work anymore with the new scheme.
> Are they important enough to have a workaround, pretty difficult to
> identify for a user?

That question would be reasonable early in the development cycle. At rc4
the question should be: do we think this problem is so critical that we
want to risk breaking something else which currently works for people?

Remember that we are invalidating whatever passthrough testing people
have already done up to this point of the release.

It is also worth noting that the things which this change ends up
breaking may for all we know be equally difficult for a user to identify
(they are after all approximately the same class of issue).

The problem here is that the risk is difficult to evaluate: we just
don't know what will break with this change, and therefore we don't know
if the cure is worse than the disease. The conservative approach at this
point in the release would be to not change anything, or to change the
minimal possible number of things (which would preclude changes which
impact qemu-trad IMHO).

WRT "pretty difficult to identify" -- the root of this thread suggests the
guest entered a reboot loop with "No bootable device", which sounds
eminently release notable to me. I also note that it was changing the
size of the PCI hole which caused the issue -- which does somewhat
underscore the risks involved in this sort of change.

> > > Also it's difficult for people to realize that they need the workaround
> > > because hvmloader logs aren't enabled by default and only go to the Xen
> > > serial console. The value of this workaround is pretty low in my view.
> > > Finally it's worth noting that Windows XP is going EOL in less than a
> > > year.
> > 
> > That's been true for something like 5 years...
> > 
> > Also, apart from XP, doesn't Windows still pick a HAL at install time,
> > so even a modern guest installed under the old scheme may not get a PAE
> > capable HAL. If you increase the amount of RAM I think Windows will
> > "upgrade" the HAL, but is changing the MMIO layout enough to trigger
> > this? Or maybe modern Windows all use PAE (or even 64 bit) anyway?
> > 
> > There are also performance implications of enabling PAE over 2 level
> > paging. Not sure how significant they are with HAP though. Made a big
> > difference with shadow IIRC.
> > 
> > Maybe I'm worrying about nothing but while all of these unknowns might
> > be OK towards the start of a release cycle rc4 seems awfully late in the
> > day to be risking it.
> 
> Keep in mind that all these configurations are perfectly valid even with
> the code that we have out there today. We aren't doing anything new,
> just modifying the default.

I don't think that is true. We are changing the behaviour; calling it
"just" a default doesn't make it any less worrying or any less of a
change.

> One just needs to assign a PCI device with more than 190MB to trigger it.
> I am trusting the fact that given that we had this behaviour for many
> years now, and it's pretty common to assign a device only some of the
> times you are booting your guest, any problems would have already come
> up.

With qemu-trad perhaps, although that's not completely obvious TBH. In
any case should we really be crossing our fingers and "trusting" that
it'll be ok at rc4?

Ian.

