Re: [Xen-devel] Re: [Xci-devel] Porblem with disabling and then re-enabling a PT device in Windows
Well, as an update, I found the following: after I do a disable and then a
re-enable, the interrupt is "kind of" stuck for about 1:30 minutes; after
that, the interrupt is somehow released and everything works OK. I tested it
a few times, and each time the interrupt was released about 1:30 minutes
later.

Does that imply anything? Or is it probably just the Windows driver
recovering?

On Thu, Nov 26, 2009 at 9:55 AM, Jiang, Yunhong <yunhong.jiang@xxxxxxxxx> wrote:
>
> Tom Rotenberg wrote:
>> How do i know if i'm using 'ack_type_new', what does it mean?
>> Do u have any idea, on how i can check inside domU windows XP (using
>> WinDBG of-course) if the virtual local APIC/IOAPIC has EOI the
>> interrupt?
>
> I didn't try windbg to check local/io apic before, although I suppose you can
> do that since that is simply MMIO access.
> Also you can add a hotkey (i.e. the key pressed after the 3 "ctrl+a") to xen
> hypervisor to dump guest's virtual local apic/ioapic context.
>
> --jyh
>
>>
>> It happens every time... it's 100% reproducible on that Dell machine.
>>
>> On Thu, Nov 26, 2009 at 3:40 AM, Jiang, Yunhong
>> <yunhong.jiang@xxxxxxxxx> wrote:
>>>
>>>
>>> xen-devel-bounces@xxxxxxxxxxxxxxxxxxx wrote:
>>>> After digging more into this problem, i found out that the problem
>>>> is because the interrupt generated on the wlan device, isn't being
>>>> transferred to the domain, for some reason, after the device was
>>>> re-enabled in Windows. I saw that, by connecting to the xen
>>>> console, and then clicking 'i', and i got the following lines: ...
>>>> (XEN) Vec192 IRQ 17: type=IO-APIC-level status=00000010
>>>> in-flight=1 domain-list=0: 17(----),3: 17(---M),
>>>> ...
>>>> (XEN) Apic 0x00, Pin 17: vector=192, delivery_mode=1,
>>>> dest_mode=logical, delivery_status=1, polarity=1, irr=1,
>>>> trigger=level, mask=0 ....
>>>>
>>>> You can see, that the interrupt 17, which is in my Windows domU, was
>>>> generated, but still wasn't injected to the CPU (the 'irr' is 1).
>>>> So, i guess that this is what is causing the problem.
>>>> Now, the only issue left, is why the hell, the interrupt isn't being
>>>> injected to the domain?
>>>
>>> I assume you are using ack_type_new on your system, am I right?
>>> Usually it means guest has not EOI the interrupt, so that
>>> host has no chance to EOI the physical IOAPIC. Can you check
>>> the virtual Local APIC/IOAPIC for the guest to see if we have
>>> any finding?
>>> BTW, does it happen every time?
>>>
>>> --jyh
>>>
>>>>
>>>> Does anyone have any idea about it?
>>>>
>>>> On Wed, Nov 25, 2009 at 6:31 PM, Tom Rotenberg
>>>> <tom.rotenberg@xxxxxxxxx> wrote:
>>>>> Well, i just performed some tests, and it doesn't look like the
>>>>> disable_msi/enable_msi functions in pciback are being called at all
>>>>> (moreover, not in the disable-enable from domU Windows XP), so i
>>>>> don't think it's related. Also, since when, a config space write
>>>>> from a guest domU triggers code in the pciback?
>>>>>
>>>>> I think that it's not the problem here...
>>>>> Maybe someone from the XCI can shed some light here, and tell us
>>>>> how they solve it (or not)? since their code should run on the
>>>>> same Dell machines, no?
>>>>>
>>>>> On Wed, Nov 25, 2009 at 5:13 PM, Kamala Narasimhan
>>>>> <Kamala.Narasimhan@xxxxxxxxxx> wrote:
>>>>>> I shouldn't have suggested that you build without pciback;
>>>>>> I got carried away trying to make it simple for you :-);
>>>>>> Obviously you would need it and I should have stopped with
>>>>>> suggesting that you tweak it.
>>>>>>
>>>>>> Here is the thought process that led to my suggestion -
>>>>>>
>>>>>> Clearly, that bit is getting changed as indicated in your
>>>>>> log.
>>>>>> It is unlikely that the guest is triggering that change
>>>>>> which makes pciback a potential candidate to suspect as it
>>>>>> does change pci configuration space bits. I need to add some
>>>>>> tracing and look at the path of execution to answer some of
>>>>>> your specific questions accurately and I won't be able to do
>>>>>> that right now but I can give some context to help you based
>>>>>> on what I have experienced in comparable situation and based
>>>>>> on that I would say pciback is one place to suspect. To be a
>>>>>> bit more specific I would say look into
>>>>>> pciback_enable_msi/pciback_disable_msi code, add some tracing
>>>>>> there, observe whether or not that code path is taken when the
>>>>>> device is disabled/reenabled within guest etc. To reiterate,
>>>>>> these are mere suggestions but look plausible based on prior
>>>>>> observations.
>>>>>>
>>>>>> Kamala
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Tom Rotenberg [mailto:tom.rotenberg@xxxxxxxxx]
>>>>>>> Sent: Wednesday, November 25, 2009 9:22 AM
>>>>>>> To: Kamala Narasimhan
>>>>>>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; xci-devel@xxxxxxxxxxxxxxxxxxx
>>>>>>> Subject: Re: [Xci-devel] Porblem with disabling and then
>>>>>>> re-enabling a PT device in Windows
>>>>>>>
>>>>>>> I am not sure i understand how to test it...
>>>>>>> 1) Avoid doing FLR for the device - isn't that done only when
>>>>>>> building the domain? does that happen when i disable the device
>>>>>>> in domU?
>>>>>>> 2) Don't build pciback - and then, i won't bind the wlan
>>>>>>> device to pciback? and change the xend scripts which check for
>>>>>>> it?
>>>>>>> 3) Comment out the relevant code - which code??
>>>>>>>
>>>>>>> I also don't understand, how could it be that the pciback device
>>>>>>> is "messing" with it? isn't it supposed to be inactive when the
>>>>>>> device is being used in PT?
>>>>>>>
>>>>>>> Tom
>>>>>>>
>>>>>>> On Wed, Nov 25, 2009 at 4:12 PM, Kamala Narasimhan
>>>>>>> <Kamala.Narasimhan@xxxxxxxxxx> wrote:
>>>>>>>> There is a chance pciback is changing the bit you are referring
>>>>>>>> to. To confirm that, just for testing purpose you might want to
>>>>>>>> avoid FLR for that device or simply not build pciback or comment
>>>>>>>> out relevant code in that module whichever is easier and see if
>>>>>>>> that helps. If it does, you can then look into fixing the
>>>>>>>> problem the right way.
>>>>>>>>
>>>>>>>> Kamala
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: xci-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>>>>>>>> [mailto:xci-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Tom Rotenberg
>>>>>>>>> Sent: Wednesday, November 25, 2009 8:09 AM
>>>>>>>>> To: xen-devel@xxxxxxxxxxxxxxxxxxx;
>>>>>>>>> xci-devel@xxxxxxxxxxxxxxxxxxx
>>>>>>>>> Subject: [Xci-devel] Porblem
>>>>>>>>> with disabling and then re-enabling a PT device in Windows
>>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> (This is a continuation to my previous mail, but since it looks
>>>>>>>>> like a different problem - i decided to open a new thread for
>>>>>>>>> it)
>>>>>>>>>
>>>>>>>>> ----
>>>>>>>>> Problem Description:
>>>>>>>>> ----
>>>>>>>>> I am doing pass-through of an Intel wireless LAN device to a
>>>>>>>>> Windows XP domU (my machine is Dell e6400), and it looks like
>>>>>>>>> it's working ok. Then, i disable the device using Windows
>>>>>>>>> device manager, and the device is now disabled, after that i
>>>>>>>>> re-enable the device, and Windows re-enables the device
>>>>>>>>> correctly. However, the wlan device seems to malfunction (it
>>>>>>>>> can't turn on the WiFi of the computer), and can't connect to
>>>>>>>>> wireless networks. I tried it, both with MSI translation on,
>>>>>>>>> and with MSI translation off - it doesn't matter.
>>>>>>>>>
>>>>>>>>> ----
>>>>>>>>> My analysis:
>>>>>>>>> ----
>>>>>>>>> 1) Well, taking a look at the real PCI config space, before
>>>>>>>>> disable and after the (last) enable, shows that the difference
>>>>>>>>> is at the IntX bit (read-only bit 3 of the status register (offset
>>>>>>>>> 0x6) in the PCI config space). Before disable, that bit was 0,
>>>>>>>>> and after the last enable that bit was 1. This, according to my
>>>>>>>>> understanding, means that the device is asserting its IntX,
>>>>>>>>> and probably waiting for someone to handle it, no?
>>>>>>>>>
>>>>>>>>> 2) When i tried to track when this bit was changed - i
>>>>>>>>> added code which, on every PCI config read, checks if that
>>>>>>>>> bit was changed - and added a print when it changed. The
>>>>>>>>> relevant lines in the qemu log look like this:
>>>>>>>>> ...
>>>>>>>>> pt_pci_read_config: [00:01.0]: address=00f0 val=0x00000000 len=2
>>>>>>>>> ACPI PCI hotplug: read addr=0x10c6, val=0x0f.
>>>>>>>>> ACPI PCI hotplug: read addr=0x10c6, val=0x0f.
>>>>>>>>> pt_pci_read_config: TEST CODE: STATUS CHNAGED! OLD: 0x10, NEW: 0x18
>>>>>>>>> pt_pci_read_config: [00:01.0]: address=0000 val=0x00008086 len=2
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> This implies that the bit was changed, about the same time that
>>>>>>>>> Windows tried to start using it (because, i assume that it
>>>>>>>>> tried using it, just after questioning the ACPI for the
>>>>>>>>> existence of the device). No?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Can someone help me with this?
>>>>>>>>>
>>>>>>>>> (BTW - i am using Xen 3.4)
>>>>>>>>>
>>>>>>>>> Tom
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Xci-devel mailing list
>>>>>>>>> Xci-devel@xxxxxxxxxxxxxxxxxxx
>>>>>>>>> http://lists.xensource.com/mailman/listinfo/xci-devel
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>>> http://lists.xensource.com/xen-devel
_______________________________________________
Xci-devel mailing list
Xci-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/mailman/listinfo/xci-devel
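
For anyone reading the qemu log in the original report: the status change it shows (OLD: 0x10, NEW: 0x18) can be decoded bit by bit. The following is only a minimal sketch, not code from qemu or Xen. The mask values follow the standard PCI status/command register layout (bit 3 of status is the INTx interrupt status, bit 4 is the capabilities-list flag, bit 10 of command is the INTx disable bit). The function names are hypothetical and only a couple of bits are named, so other status bits are ignored.

```python
# Standard PCI config-space bit masks (status register at offset 0x06,
# command register at offset 0x04). Values follow the PCI specification.
PCI_COMMAND_INTX_DISABLE = 0x0400  # command bit 10: INTx assertion disabled
PCI_STATUS_INTERRUPT = 0x0008      # status bit 3: INTx interrupt status (read-only)
PCI_STATUS_CAP_LIST = 0x0010       # status bit 4: capabilities list present

# Only the bits relevant to this thread are named here (an assumption of
# this sketch); any other changed bit is silently ignored.
STATUS_BITS = {
    PCI_STATUS_INTERRUPT: "interrupt status (INTx)",
    PCI_STATUS_CAP_LIST: "capabilities list",
}


def describe_status_change(old, new):
    """Return (bit name, 'set' or 'cleared') for each known bit that differs."""
    changed = old ^ new
    result = []
    for mask, name in sorted(STATUS_BITS.items()):
        if changed & mask:
            result.append((name, "set" if new & mask else "cleared"))
    return result


def intx_line_asserted(command, status):
    """The device drives INTx only if the status bit is set AND the
    INTx disable bit in the command register is clear."""
    pending = bool(status & PCI_STATUS_INTERRUPT)
    disabled = bool(command & PCI_COMMAND_INTX_DISABLE)
    return pending and not disabled


if __name__ == "__main__":
    # Values taken from the "STATUS CHNAGED" log line in the report.
    for name, state in describe_status_change(0x10, 0x18):
        print(f"{name}: {state}")
```

With the values from the log, the only known bit that differs is the INTx interrupt status bit, which matches the analysis in the original mail. Whether the line is actually asserted also depends on the command register, which the log excerpt does not show.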