Re: [Xen-devel] Load increase after memory upgrade (part2)
Friday, January 13, 2012, 4:13:07 PM, you wrote:

>> >> > I also have done some experiments with the patch. In domU i also get
>> >> > the 0% full for my usb controllers with video grabbers; in dom0 i
>> >> > get 12% full, and both my realtek 8169 ethernet controllers seem to
>> >> > use the bounce buffering ...
>> >> > And that with an iommu (amd)? It all seems kind of strange, although
>> >> > it is also working ...
>> >> > I'm not having much time now, hoping to get back with a full report
>> >> > soon.
>> >>
>> >> Hm, so domU nothing, but dom0 it reports. Maybe the patch is incorrect
>> >> when running as a PV guest .. Will look at it in more detail after the
>> >> holidays. Thanks for being willing to try it out.
>>
>> > Good news is I am able to reproduce this with my 32-bit NIC with a 3.2
>> > domU:
>>
>> > [  771.896140] SWIOTLB is 11% full
>> > [  776.896116] 0 [e1000 0000:00:00.0] bounce: from:222028(slow:0)to:2 map:222037 unmap:227220 sync:0
>> > [  776.896126] 1 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:5188 map:5188 unmap:0 sync:0
>> > [  776.896133] 3 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:1 map:1 unmap:0 sync:0
>>
>> > but interestingly enough, if I boot the guest as the first one I do not
>> > get these bounce requests. I will shortly boot up a Xen-O-Linux kernel
>> > and see if I get these same numbers.
>>
>> I started to experiment some more with what i encountered.
>>
>> On dom0 i was seeing that my r8169 ethernet controllers were using bounce
>> buffering, according to the dump-swiotlb module.
>> It was showing "12% full".
>> Checking in sysfs shows:
>> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat consistent_dma_mask_bits
>> 32
>> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat dma_mask_bits
>> 32
>>
>> If i remember correctly, wasn't the allocation for dom0 changed to be at
>> the top of memory instead of low .. somewhere between 2.6.32 and 3.0 ?

> ? We never actually had dom0 support in the upstream kernel until 2.6.37..
> The 2.6.32<->2.6.36 you are referring to must have been the trees that I
> spun up - but the implementation of SWIOTLB in them had not really changed.

>> Could that change cause all devices to need bounce buffering, and could it
>> therefore explain some people seeing more cpu usage in dom0 ?

> The issue I am seeing is not CPU usage in dom0, but rather the CPU usage in
> domU guests, and the fact that the older domU's (XenOLinux) do not have
> this. That I can't understand - the implementation in both cases _looks_ to
> do the same thing. There was one issue I found in the upstream one, but
> even with that fix I still get that "bounce" usage in domU.
> Interestingly enough, I get it only if I have launched, destroyed,
> launched, etc., the guest multiple times. Which leads me to believe this is
> not a kernel issue, but that we have simply fragmented the Xen memory so
> much that when it launches the guest, all of its memory is above 4GB. But
> that seems counter-intuitive, as by default Xen starts guests at the far
> end of memory (so on my 16GB box it would stick a 4GB guest at roughly
> 12GB->16GB). The SWIOTLB swizzles some memory under 4GB, and this is where
> we get the bounce-buffer effect (the memory under 4GB is copied to/from the
> memory at 12GB->16GB).
> But that does not explain why on the first couple of starts I did not see
> this with pvops. And it does not seem to happen with the XenOLinux kernel,
> so there must be something else in here.
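To make the mask numbers above a bit more concrete: the following is a
minimal sketch of how a driver ends up with the 64-bit or 32-bit DMA mask
that shows up in sysfs as dma_mask_bits. The helper name and the use_dac
parameter handling are made up for illustration - this is not the actual
r8169 probe code, just roughly what a use_dac-style option boils down to.
With the 32-bit fallback, any buffer that happens to live above 4GB has to
be bounced through the (xen-)swiotlb pool.

#include <linux/pci.h>
#include <linux/dma-mapping.h>
#include <linux/errno.h>

/* Hypothetical probe helper - illustration only, not real driver code. */
static int example_set_dma_mask(struct pci_dev *pdev, bool use_dac)
{
        if (use_dac && !dma_set_mask(&pdev->dev, DMA_BIT_MASK(64))) {
                /* Device can reach all of memory: no bouncing needed. */
                dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64));
                return 0;
        }

        /* 32-bit fallback: anything mapped above 4GB gets bounced. */
        if (dma_set_mask(&pdev->dev, DMA_BIT_MASK(32)) ||
            dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)))
                return -EIO;

        return 0;
}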
>> I have forced my r8169 to use a 64-bit DMA mask (using use_dac=1)

> Ah yes.

>> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat consistent_dma_mask_bits
>> 32
>> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat dma_mask_bits
>> 64
>>
>> This results in dump-swiotlb reporting:
>>
>> [ 1265.616106] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10
>> [ 1265.625043] SWIOTLB is 0% full
>> [ 1270.626085] 0 [r8169 0000:08:00.0] bounce: from:6(slow:0)to:0 map:0 unmap:0 sync:12
>> [ 1270.635024] SWIOTLB is 0% full
>> [ 1275.635091] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10
>> [ 1275.644261] SWIOTLB is 0% full
>> [ 1280.654097] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10

> Which is what we expect. No need to bounce since the PCI adapter can reach
> memory above the 4GB mark.

>> So it has changed from 12% to 0%, although it still reports something
>> about bouncing ? Or am i misinterpreting stuff ?

> The bouncing can happen in two cases:
>  - Memory is above 4GB
>  - Memory crosses a page boundary (rarely happens).

>> Another thing i was wondering about: couldn't the hypervisor offer a
>> small window in 32-bit addressable mem to all domU's (or only when pci
>> passthrough is used) to be used for DMA ?

> It does. That is what the Xen SWIOTLB does with "swizzling" the pages in
> its pool. But it can't do it for every part of memory. That is why there
> are DMA pools, which are used by graphics adapters, video capture devices,
> storage and network drivers. They are used for small packet sizes, so that
> the driver does not have to allocate DMA buffers when it gets a 100-byte
> ping response. But for large packets (say that ISO file you are
> downloading) it allocates memory on the fly and "maps" it into the PCI
> space using the DMA API. That "mapping" sets up a "physical memory" ->
> "guest memory" translation - and if that allocated memory is above 4GB,
> part of this mapping is to copy ("bounce") the memory to under the 4GB
> mark (where the Xen SWIOTLB has allocated a pool), so that the adapter can
> physically fetch/put the data. Once that is completed it is "sync"-ed
> back, which is bouncing that data to the "allocated memory".
> So having a DMA pool is very good - and most drivers use it. The things I
> can't figure out are:
>  - why the DVB drivers do not seem to use it, even though they look to use
>    the videobuf_dma driver.
>  - why XenOLinux does not seem to have this problem (and this might be
>    false - perhaps it does have this problem and it just takes a couple of
>    guest launches, destructions, starts, etc. to actually see it).
>  - are there any flags in the domain builder to say: "ok, this domain is
>    going to service 32-bit cards, hence build the memory from 0->4GB".
>    This seems like a good knob at first, but it is probably a bad idea
>    (imagine using it by mistake on every guest). And also, nowadays most
>    cards are PCIe and can do 64-bit DMA, so it would not be that important
>    in the future.

>> (oh yes, i haven't got a clue what i'm talking about ... so it probably
>> makes no sense at all :-) )

> Nonsense. You were on the correct path. Hopefully the level of detail
> hasn't scared you off now :-)

Well, it only raises some more questions :-)

The thing is, pci passthrough, and especially the DMA part of it, all works
behind the scenes without giving much output about the way it is actually
working.
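As a purely illustrative sketch of the two paths described above - a
dma_pool for small, reused buffers versus an on-the-fly streaming mapping
for a large payload - something like the following is what a driver would
do. The "exdev" name, the function and the buffer size are all invented
here, not taken from any real driver:

#include <linux/dmapool.h>
#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/errno.h>

#define EXDEV_SMALL_BUF 128

/* Hypothetical example - "exdev" does not exist as a real driver. */
static int exdev_dma_example(struct device *dev, void *payload, size_t len)
{
        struct dma_pool *pool;
        void *small;
        dma_addr_t small_dma, payload_dma;

        /* Pool of small buffers, allocated once within the DMA mask:
         * using them later never needs a bounce. */
        pool = dma_pool_create("exdev-small", dev, EXDEV_SMALL_BUF, 8, 0);
        if (!pool)
                return -ENOMEM;

        small = dma_pool_alloc(pool, GFP_KERNEL, &small_dma);
        if (!small) {
                dma_pool_destroy(pool);
                return -ENOMEM;
        }

        /* Streaming map of a large buffer allocated elsewhere; if it sits
         * above 4GB and the device mask is 32-bit, swiotlb bounces it. */
        payload_dma = dma_map_single(dev, payload, len, DMA_TO_DEVICE);
        if (dma_mapping_error(dev, payload_dma)) {
                dma_pool_free(pool, small, small_dma);
                dma_pool_destroy(pool);
                return -EIO;
        }

        /* ... hand small_dma / payload_dma to the hardware here ... */

        dma_unmap_single(dev, payload_dma, len, DMA_TO_DEVICE);
        dma_pool_free(pool, small, small_dma);
        dma_pool_destroy(pool);
        return 0;
}

The pool buffers come out of the coherent DMA allocator, so they land below
the device's mask once and stay there; only the streaming mapping of the
big buffer can end up in the swiotlb bounce path, which is presumably what
the "bounce"/"map" counters in the dump-swiotlb output are counting.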
The thing i was wondering about is if my AMD IOMMU is actually doing
something for PV guests. When booting with iommu=off (the machine has 8GB
mem, dom0 limited to 1024M) and just starting one domU with iommu=soft,
with pci passthrough of the USB pci cards that have the USB videograbbers
attached, i would expect to find some bounce buffering going on.

(HV_START_LOW 18446603336221196288)
(FEATURES '!writable_page_tables|pae_pgdir_above_4gb')
(VIRT_BASE 18446744071562067968)
(GUEST_VERSION 2.6)
(PADDR_OFFSET 0)
(GUEST_OS linux)
(HYPERCALL_PAGE 18446744071578849280)
(LOADER generic)
(SUSPEND_CANCEL 1)
(PAE_MODE yes)
(ENTRY 18446744071594476032)
(XEN_VERSION xen-3.0)

Still i only see:

[ 47.449072] Starting SWIOTLB debug thread.
[ 47.449090] swiotlb_start_thread: Go!
[ 47.449262] xen_swiotlb_start_thread: Go!
[ 52.449158] 0 [ehci_hcd 0000:0a:00.3] bounce: from:432(slow:0)to:1329 map:1756 unmap:1781 sync:0
[ 52.449180] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:16 map:23 unmap:0 sync:0
[ 52.449187] 2 [ohci_hcd 0000:0a:00.4] bounce: from:0(slow:0)to:4 map:5 unmap:0 sync:0
[ 52.449226] SWIOTLB is 0% full
[ 57.449180] 0 ehci_hcd 0000:0a:00.3 alloc coherent: 35, free: 0
[ 57.449219] 1 ohci_hcd 0000:0a:00.6 alloc coherent: 1, free: 0
[ 57.449265] SWIOTLB is 0% full
[ 62.449176] SWIOTLB is 0% full
[ 67.449336] SWIOTLB is 0% full
[ 72.449279] SWIOTLB is 0% full
[ 77.449121] SWIOTLB is 0% full
[ 82.449236] SWIOTLB is 0% full
[ 87.449242] SWIOTLB is 0% full
[ 92.449241] SWIOTLB is 0% full
[ 172.449102] 0 [ehci_hcd 0000:0a:00.7] bounce: from:3839(slow:0)to:664 map:4486 unmap:4617 sync:0
[ 172.449123] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:82 map:111 unmap:0 sync:0
[ 172.449130] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:32 map:36 unmap:0 sync:0
[ 172.449170] SWIOTLB is 0% full
[ 177.449109] 0 [ehci_hcd 0000:0a:00.7] bounce: from:5348(slow:0)to:524 map:5834 unmap:5952 sync:0
[ 177.449131] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:76 map:112 unmap:0 sync:0
[ 177.449138] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:4 map:6 unmap:0 sync:0
[ 177.449178] SWIOTLB is 0% full
[ 182.449143] 0 [ehci_hcd 0000:0a:00.7] bounce: from:5349(slow:0)to:563 map:5899 unmap:5949 sync:0
[ 182.449157] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:27 map:35 unmap:0 sync:0
[ 182.449164] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:10 map:15 unmap:0 sync:0
[ 182.449204] SWIOTLB is 0% full
[ 187.449112] 0 [ehci_hcd 0000:0a:00.7] bounce: from:5375(slow:0)to:592 map:5941 unmap:6022 sync:0
[ 187.449126] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:46 map:69 unmap:0 sync:0
[ 187.449133] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:9 map:12 unmap:0 sync:0
[ 187.449173] SWIOTLB is 0% full
[ 192.449183] 0 [ehci_hcd 0000:0a:00.7] bounce: from:5360(slow:0)to:556 map:5890 unmap:5978 sync:0
[ 192.449226] 1 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:52 map:74 unmap:0 sync:0
[ 192.449234] 2 [ehci_hcd 0000:0a:00.7] bounce: from:0(slow:0)to:10 map:14 unmap:0 sync:0
[ 192.449275] SWIOTLB is 0% full

And the devices do work ... so how does that work ?

Thx for your explanation so far !

-- 
Sander

>> -- 
>> Sander

-- 
Best regards,
 Sander                            mailto:linux@xxxxxxxxxxxxxx

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel