Re: [Xen-devel] HVM support for e820_host (Was: Bug: Limitation of <=2GB RAM in domU persists with 4.3.0)
On Fri, 6 Sep 2013 10:32:23 -0400, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:

>>On the face of it, that's actually fine - my PCI IOMEM mappings show
>>the lowest mapping (according to lspci -vvv) starts at a8000000,
>
><surprise>
>
>>Indeed - on the host, the hole is 1GB-4GB, but there is no IOMEM
>>mapped between 1024MB and 2688MB. Hence why I can get away with a
>>domU memory allocation up to 2688MB.
>
>When you say 'IOMEM' you mean /proc/iomem output?

I mean what lspci shows WRT where PCI device memory regions are mapped.

>>explain what is actually going wrong and why the crash is still
>>occurring - unless some other piece of hardware is having its domU
>>IOMEM mapped somewhere in the range f3df4000-fec8b000 and that is
>>causing a memory overwrite.
>>
>>I am just not seeing any obvious memory stomp at the moment...
>
>Neither am I.

I may have pasted the wrong domU e820. I have a sneaky suspicion that
the above map was from a domU with 2688MB of RAM assigned, hence why
there is no domU RAM in the map above a7800000. I'll re-check when I'm
in front of that machine again.

>>Are you OK with the plan to _only_ copy the holes from the host E820
>>to the hvmloader E820? I think this would be sufficient and not cause
>>any undue problems. The only things that would need to change are:
>>
>>1) Enlarge the domU hole
>>
>>2) Do something with the top reserved block, starting at
>>RESERVED_MEMBASE=0xFC000000. What is this actually for? It overlaps
>>with the host memory hole, which extends all the way up to
>>0xfee00000. If it must be where it is, this could be problematic.
>>What to do in this case?
>
>I would do a git log or git annotate to find it. I recall some patches
>to move that - but I can't recall the details.

Will do. But what could this possibly be for?

>>So would it perhaps be neater, easier, more consistent and more
>>debuggable to just make hvmloader put in a hole between
>>0x40000000-0xffffffff (the whole 3GB) by default? Or is that deemed
>>to be too crippling for 32-bit non-PAE domUs (and are there enough of
>>these around to matter?)?
>
>Correct. Also it would wreak havoc when migrating to other hvmloaders
>which have a different layout.

Two points that might be worth pointing out here:

1) domUs with e820_host set aren't migratable anyway (including the PV
ones for which e820_host is currently implemented).

2) All of this is conditional on e820_host=1 being set in the config.
Since legacy hosts won't have this set anyway (since it isn't
implemented, and won't be until this patch set is completed), surely
any notion of backward compatibility for HVMs with e820_host=1 set is
null and void.

Thus - as a first pass solution that would work in most cases where
this option is useful in the first place, setting the low RAM limit to
the beginning of the first memory hole above 0x100000 (1MB) should be
OK. Leave anything after that unmapped (that seems to be what shows up
as "HOLE" in the dumps) all the way up to RESERVED_MEMBASE. That would
only leave the question of what it is (if anything) that uses the
memory between RESERVED_MEMBASE and 0xffffffff (4GB) and under which
circumstances. This could be somewhat important because 0xfec8a000 ->
+4KB on my machine is actually the Intel I/O APIC. If it is reserved
and nothing uses it, no problem, it can stay as is. If SeaBIOS or
similar is known to write to it under some circumstances, that could
easily be quite crashtastic.

Caveat: this alone wouldn't cover any other weirdness such as the odd
memory hole 0x3f7e0000-0x3f7e7000 on my hardware.
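Just to make the "copy the holes" part concrete, something along these
lines is roughly the walk I have in mind. Rough sketch only, not a
patch: struct e820entry here just mirrors the usual { addr, size, type }
layout rather than pulling in the Xen headers, add_hole() is a made-up
callback rather than an existing hvmloader interface, and the host map
is assumed to be sorted by address. The map in main() is made up purely
to exercise the walk.

#include <stdint.h>
#include <stdio.h>

struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;      /* 1 = E820_RAM, 2 = E820_RESERVED, ... */
};

/* Walk a host E820 sorted by address and report every gap above 1MB
 * and below 4GB as a hole to be reproduced in the guest map. */
static void copy_host_holes(const struct e820entry *host, unsigned int nr,
                            void (*add_hole)(uint64_t start, uint64_t end))
{
    uint64_t last_end = 0x100000;           /* ignore the legacy <1MB area */
    const uint64_t limit = 0x100000000ULL;  /* only interested below 4GB */
    unsigned int i;

    for ( i = 0; i < nr && host[i].addr < limit; i++ )
    {
        if ( host[i].addr > last_end )
            add_hole(last_end, host[i].addr);   /* gap == hole to copy */
        if ( host[i].addr + host[i].size > last_end )
            last_end = host[i].addr + host[i].size;
    }

    /* Anything from the last sub-4GB entry up to 4GB is also a hole. */
    if ( last_end < limit )
        add_hole(last_end, limit);
}

static void print_hole(uint64_t start, uint64_t end)
{
    printf("hole: %#llx-%#llx\n",
           (unsigned long long)start, (unsigned long long)end);
}

int main(void)
{
    /* Made-up host map, just to exercise the walk. */
    struct e820entry host[] = {
        { 0x00000000ULL, 0x000a0000ULL, 1 /* E820_RAM */      },
        { 0x00100000ULL, 0x3f6e0000ULL, 1 /* E820_RAM */      },
        { 0x3f7e7000ULL, 0x00019000ULL, 2 /* E820_RESERVED */ },
    };

    copy_host_holes(host, sizeof(host) / sizeof(host[0]), print_hole);
    return 0;
}

With a map shaped like the above it would pick up both the big hole
below 4GB and an odd little gap like the 0x3f7e0000-0x3f7e7000 one on
my hardware, which is exactly the behaviour I'm after.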
>>Was this what you were thinking about when asking whether my domUs
>>work OK with 1GB of RAM? Since that is just under the 1GB limit.
>
>So there are some issues with i915 IGD having to have a 'flush page'.
>Mainly some non-RAM region that they can tell the IGD to flush its
>pages. And it had to be non-RAM and somehow via magic IGD registers
>you can program the physical address in the card - so the card has it
>remapped to itself. Usually it is some gap (aka hole) that ends up
>having to be faithfully reproduced in the guest. But you are using
>nvidia and are not playing those nasty tricks.

Merely a different set of nasty tricks instead. :)

But yes, on the whole, I agree. I will try to get the holes as similar
as possible for a "production" level patch. To clarify, I am not
suggesting just hard-coding a 3GB memory hole - I am suggesting
defaulting to at least that and then mapping in any additional memory
holes as well. My reasoning behind this suggestion is that it would
make things more consistent between different (possibly dissimilar)
hosts.

>Potentially. The other option when thinking about migration and PCI -
>is to interrogate _All_ of the hosts that will be involved in the
>migration and construct an E820 that covers all the right regions.
>Then use that for the guests and then you can unplug/plug the PCI
>devices without much trouble.

That's possibly a step too far at this point.

>That is where the e820_host=1 parameter can be used and also some
>extra code to slurp up an XML of the E820 could be implemented. The
>3GB HOLE could do it, but what if the host has some odd layout where
>the HOLE is above 4GB? Then we are back at remapping.

Such a host would also only work with devices that _only_ require
64-bit BARs. But they do exist (e.g. ATI GPUs).
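If it ever came to the multi-host case, the obvious way to "construct
an E820 that covers all the right regions" would be to take the union
of the sub-4GB holes collected from every host the guest might land on,
and punch holes in the guest E820 over the merged ranges. A rough
illustration of the interval merge follows; struct hole and
merge_holes() are made-up names for the sake of the sketch, not
anything that exists in libxl/xl today.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct hole {
    uint64_t start, end;    /* [start, end) */
};

static int cmp_hole(const void *a, const void *b)
{
    const struct hole *x = a, *y = b;
    return (x->start > y->start) - (x->start < y->start);
}

/* Sort and coalesce overlapping holes in place; returns the new count.
 * A guest hole map built from the result avoids clashing with any of
 * the hosts the individual hole lists came from. */
static unsigned int merge_holes(struct hole *h, unsigned int nr)
{
    unsigned int i, out = 0;

    qsort(h, nr, sizeof(*h), cmp_hole);

    for ( i = 0; i < nr; i++ )
    {
        if ( out && h[i].start <= h[out - 1].end )
        {
            if ( h[i].end > h[out - 1].end )
                h[out - 1].end = h[i].end;      /* extend previous hole */
        }
        else
            h[out++] = h[i];                    /* start a new hole */
    }

    return out;
}

int main(void)
{
    /* Made-up holes from two hypothetical hosts. */
    struct hole holes[] = {
        { 0x3f7e0000ULL, 0x3f7e7000ULL  },
        { 0x40000000ULL, 0xc0000000ULL  },
        { 0xa8000000ULL, 0x100000000ULL },
    };
    unsigned int i, nr = merge_holes(holes, sizeof(holes) / sizeof(holes[0]));

    for ( i = 0; i < nr; i++ )
        printf("hole: %#llx-%#llx\n",
               (unsigned long long)holes[i].start,
               (unsigned long long)holes[i].end);
    return 0;
}

The trade-off discussed above still applies, of course: the more hosts
get folded in, the bigger the combined hole and the less low RAM the
guest is left with.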
Gordan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel