Re: [Xen-devel] Xen4.2 S3 regression?

>>> On 07.08.12 at 22:14, Ben Guthro <ben@xxxxxxxxxx> wrote:
> Any suggestions on how best to chase this down?
>
> The first S3 suspend/resume cycle works, but the second does not.
>
> On the second try, I never get any interrupts delivered to ahci.
> (at least according to /proc/interrupts)
>
>
> syslog traces from the first (good) and the second (bad) are attached,
> as well as the output from the "*" debug Ctrl+a handler in both cases.

You should have provided this also for the state before the first
suspend. The state after the first resume already looks corrupted
(presumably just not as badly):

(XEN) PCI-MSI interrupt information:
(XEN) MSI 26 vec=71 lowest edge   assert  log lowest dest=00000001 mask=0/1/-1
(XEN) MSI 27 vec=00  fixed edge deassert phys lowest dest=00000001 mask=0/1/-1
                 ^^
(XEN) MSI 28 vec=29 lowest edge   assert  log lowest dest=00000001 mask=0/1/-1
(XEN) MSI 29 vec=79 lowest edge   assert  log lowest dest=00000001 mask=0/1/-1
(XEN) MSI 30 vec=81 lowest edge   assert  log lowest dest=00000001 mask=0/1/-1
(XEN) MSI 31 vec=99 lowest edge   assert  log lowest dest=00000001 mask=0/1/-1

so this is likely the reason for things falling apart on the second
iteration:

(XEN) Interrupt Remapping: supported and enabled.
(XEN) Interrupt remapping table (nr_entry=0x10000. Only dump P=1 entries here):
(XEN)        SVT SQ  SID      DST  V AVL DLM TM RH DM FPD P
(XEN)  0000:   1  0 f0f8 00000001 38   0   1  0  1  1   0 1
...
(XEN)  0014:   1  0 00d8 00000001 a1   0   1  0  1  1   0 1
(XEN)  0015:   1  0 00fa 00000001 00   0   0  0  0  0   0 1
                                           ^     ^  ^
(XEN)  0016:   1  0 f0f8 00000001 31   0   1  1  1  1   0 1
(XEN)  0017:   1  0 00a0 00000001 a9   0   1  0  1  1   0 1
(XEN)  0018:   1  0 0200 00000001 b1   0   1  0  1  1   0 1
(XEN)  0019:   1  0 00c8 00000001 c9   0   1  0  1  1   0 1

Surprisingly in both cases we get (with the other vector fields
varying accordingly)

(XEN) IRQ: 26 affinity:0001 vec:71 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:279(-S--),
(XEN) IRQ: 27 affinity:0001 vec:21 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:278(-S--),
                                ^^
(XEN) IRQ: 28 affinity:0001 vec:29 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:277(-S--),
(XEN) IRQ: 29 affinity:0001 vec:79 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:276(-S--),
(XEN) IRQ: 30 affinity:0001 vec:81 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:275(PS--),
(XEN) IRQ: 31 affinity:0001 vec:99 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:274(PS--),

The interrupt in question belongs to 0000:00:1f.2, i.e. the AHCI
controller.

Unfortunately I can't make sense of the kernel side config space
restore messages - an offset of 1 gets reported for the device in
question (and various other odd offsets exist), yet 3.5's
drivers/pci/pci.c:pci_restore_config_space_range() calls
pci_restore_config_dword() with an offset that's always divisible
by 4. Could you clarify which kernel version you were using here?

We first need to determine whether the kernel corrupts something
(after all, config space isn't protected from Dom0 modifications) -
if that's the case, we may need to understand why older Xen was
immune to that. If that's not the case, adding some extra logging
to Xen's pci_restore_msi_state() would seem the best first step,
plus (maybe) logging of Dom0 post-resume config space accesses to
the device in question.

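For comparing further dumps (before the first suspend, after each resume), the check done by eye above could be scripted: pair each "MSI nn vec=.." line with the matching "IRQ: nn ... vec:.." line and report where the two disagree. What follows is only a rough sketch in Python, assuming the line shapes quoted in this mail; the regular expressions, the function names, and the embedded sample are illustrative and not part of Xen, and other Xen versions may print these lines differently.

```python
import re

# Assumed line shapes, taken from the dumps quoted in this mail.
MSI_RE = re.compile(r"\(XEN\)\s+MSI\s+(\d+)\s+vec=([0-9a-fA-F]{2})\b")
IRQ_RE = re.compile(r"\(XEN\)\s+IRQ:\s*(\d+)\s+affinity:\S+\s+vec:([0-9a-fA-F]{2})\b")


def parse_vectors(log_text):
    """Return ({irq: vec} from the MSI dump, {irq: vec} from the IRQ dump)."""
    msi, irq = {}, {}
    for line in log_text.splitlines():
        m = MSI_RE.search(line)
        if m:
            msi[int(m.group(1))] = int(m.group(2), 16)
            continue
        m = IRQ_RE.search(line)
        if m:
            irq[int(m.group(1))] = int(m.group(2), 16)
    return msi, irq


def find_mismatches(log_text):
    """List (irq, msi_vec, irq_vec) for IRQs where the two dumps disagree."""
    msi, irq = parse_vectors(log_text)
    common = sorted(msi.keys() & irq.keys())
    return [(n, msi[n], irq[n]) for n in common if msi[n] != irq[n]]


# Four lines copied from the dumps above, used as a small self-check.
SAMPLE = """\
(XEN) MSI 26 vec=71 lowest edge assert log lowest dest=00000001 mask=0/1/-1
(XEN) MSI 27 vec=00 fixed edge deassert phys lowest dest=00000001 mask=0/1/-1
(XEN) IRQ: 26 affinity:0001 vec:71 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:279(-S--),
(XEN) IRQ: 27 affinity:0001 vec:21 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:278(-S--),
"""

if __name__ == "__main__":
    for n, msi_vec, irq_vec in find_mismatches(SAMPLE):
        print(f"IRQ {n}: MSI dump has vec={msi_vec:02x}, IRQ dump has vec={irq_vec:02x}")
```

On the four sample lines this reports only IRQ 27 (vec=00 in the MSI dump versus vec=21 in the IRQ dump). It says nothing about who cleared the message, only where the two views diverge.
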
The most likely thing happening (though unclear where) is that the
corresponding struct msi_msg instance gets cleared in the course of
the first resume (but after the corresponding interrupt remapping
entry already got restored).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel