Re: [Xen-devel] Xen4.2 S3 regression?
>>> On 07.08.12 at 22:14, Ben Guthro <ben@xxxxxxxxxx> wrote:
> Any suggestions on how best to chase this down?
>
> The first S3 suspend/resume cycle works, but the second does not.
>
> On the second try, I never get any interrupts delivered to ahci.
> (at least according to /proc/interrupts)
>
>
> syslog traces from the first (good) and the second (bad) are attached,
> as well as the output from the "*" debug Ctrl+a handler in both cases.
You should have provided this also for the state before the
first suspend. The state after the first resume already looks
corrupted (presumably just not as badly):
(XEN) PCI-MSI interrupt information:
(XEN) MSI 26 vec=71 lowest edge assert log lowest dest=00000001 mask=0/1/-1
(XEN) MSI 27 vec=00 fixed edge deassert phys lowest dest=00000001 mask=0/1/-1
                 ^^
(XEN) MSI 28 vec=29 lowest edge assert log lowest dest=00000001 mask=0/1/-1
(XEN) MSI 29 vec=79 lowest edge assert log lowest dest=00000001 mask=0/1/-1
(XEN) MSI 30 vec=81 lowest edge assert log lowest dest=00000001 mask=0/1/-1
(XEN) MSI 31 vec=99 lowest edge assert log lowest dest=00000001 mask=0/1/-1
so this is likely the reason for things falling apart on the second
iteration:
(XEN) Interrupt Remapping: supported and enabled.
(XEN) Interrupt remapping table (nr_entry=0x10000. Only dump P=1 entries
here):
(XEN) SVT SQ SID DST V AVL DLM TM RH DM FPD P
(XEN) 0000: 1 0 f0f8 00000001 38 0 1 0 1 1 0 1
...
(XEN) 0014: 1 0 00d8 00000001 a1 0 1 0 1 1 0 1
(XEN) 0015: 1 0 00fa 00000001 00 0 0 0 0 0 0 1
^ ^ ^
(XEN) 0016: 1 0 f0f8 00000001 31 0 1 1 1 1 0 1
(XEN) 0017: 1 0 00a0 00000001 a9 0 1 0 1 1 0 1
(XEN) 0018: 1 0 0200 00000001 b1 0 1 0 1 1 0 1
(XEN) 0019: 1 0 00c8 00000001 c9 0 1 0 1 1 0 1
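To pick out entries like 0015 from a long dump without reading it by eye, a small filter along the following lines may help. This is only a sketch: the column names are copied from the header line quoted above, while the function names and the assumption that every row carries exactly twelve whitespace-separated fields are mine.

```python
import re

# Column names as printed in the header of the dump quoted above.
COLUMNS = ["SVT", "SQ", "SID", "DST", "V", "AVL",
           "DLM", "TM", "RH", "DM", "FPD", "P"]

# One table row, e.g. "(XEN) 0015: 1 0 00fa 00000001 00 0 0 0 0 0 0 1"
ROW_RE = re.compile(r"^\(XEN\)\s+([0-9a-fA-F]{4}):\s+(.*)$")


def parse_irte_rows(text):
    """Yield (index, fields) for each remapping table row found in text."""
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if not m:
            continue
        values = m.group(2).split()
        if len(values) != len(COLUMNS):
            continue  # not a complete table row; skip it
        yield int(m.group(1), 16), dict(zip(COLUMNS, values))


def zero_vector_entries(text):
    """Return indices of present entries (P=1) whose vector field V is 00."""
    return [idx for idx, fields in parse_irte_rows(text)
            if fields["P"] == "1" and int(fields["V"], 16) == 0]
```

Fed the table above, this would report only index 0x15.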
Surprisingly, in both cases we get (with the other vector fields varying
accordingly):
(XEN) IRQ: 26 affinity:0001 vec:71 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:279(-S--),
(XEN) IRQ: 27 affinity:0001 vec:21 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:278(-S--),
                                ^^
(XEN) IRQ: 28 affinity:0001 vec:29 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:277(-S--),
(XEN) IRQ: 29 affinity:0001 vec:79 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:276(-S--),
(XEN) IRQ: 30 affinity:0001 vec:81 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:275(PS--),
(XEN) IRQ: 31 affinity:0001 vec:99 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:274(PS--),
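The inconsistency is between the vector in the IRQ descriptor (vec:21) and the vector in the MSI dump (vec=00) for the same IRQ number. A rough way to check a whole dump for such pairs is sketched below. The regular expressions are modelled only on the lines quoted in this mail, and the function name is made up for illustration.

```python
import re

# "(XEN) MSI 27 vec=00 fixed edge ..." -> IRQ number and MSI message vector
MSI_RE = re.compile(r"\(XEN\)\s+MSI\s+(\d+)\s+vec=([0-9a-fA-F]{2})")
# "(XEN) IRQ: 27 affinity:0001 vec:21 ..." -> IRQ number and descriptor vector
IRQ_RE = re.compile(r"\(XEN\)\s+IRQ:\s*(\d+)\s+affinity:\S+\s+vec:([0-9a-fA-F]{2})")


def vector_mismatches(text):
    """Return sorted (irq, descriptor_vector, msi_vector) tuples that disagree."""
    msi = {int(m.group(1)): int(m.group(2), 16) for m in MSI_RE.finditer(text)}
    irq = {int(m.group(1)): int(m.group(2), 16) for m in IRQ_RE.finditer(text)}
    common = msi.keys() & irq.keys()
    return sorted((n, irq[n], msi[n]) for n in common if irq[n] != msi[n])
```

Since both patterns only need the start of each entry, it does not matter whether the mail client wrapped the tail of the line.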
The interrupt in question belongs to 0000:00:1f.2, i.e. the
AHCI controller.
Unfortunately I can't make sense of the kernel side config space
restore messages - an offset of 1 gets reported for the device in
question (and various other odd offsets exist), yet 3.5's
drivers/pci/pci.c:pci_restore_config_space_range() calls
pci_restore_config_dword() with an offset that's always divisible
by 4. Could you clarify which kernel version you were using here?
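For reference, the alignment check meant here can be sketched as follows. The message wording matched by the pattern is an assumption about what the kernel logs, not copied from the attached syslog, and the helper names are hypothetical.

```python
import re

# Assumed log wording; adjust the pattern to whatever the kernel actually prints.
OFFSET_RE = re.compile(r"restoring config space at offset (0x[0-9a-fA-F]+|\d+)")


def _to_int(token):
    """Parse a decimal or 0x-prefixed hexadecimal offset."""
    if token.lower().startswith("0x"):
        return int(token, 16)
    return int(token, 10)


def unaligned_offsets(log_text):
    """Return the reported offsets that are not multiples of 4 bytes."""
    offsets = [_to_int(m.group(1)) for m in OFFSET_RE.finditer(log_text)]
    return [off for off in offsets if off % 4 != 0]
```

Any offset this returns cannot have come from a dword-granular restore loop that passes byte offsets, which is why the kernel version matters.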
We first need to determine whether the kernel corrupts something
(after all, config space isn't protected from Dom0 modifications) -
if that's the case, we may need to understand why older Xen was
immune to that. If that's not the case, adding some extra
logging to Xen's pci_restore_msi_state() would seem the best
first step, plus (maybe) logging of Dom0 post-resume config space
accesses to the device in question.
The most likely thing happening (though unclear where) is that
the corresponding struct msi_msg instance gets cleared in the
course of the first resume (but after the corresponding interrupt
remapping entry already got restored).
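To illustrate why a cleared message would look like the MSI 27 line above, here is a sketch that decodes an MSI data word according to the x86 compatibility-format layout (vector in bits 7:0, delivery mode in bits 10:8, level in bit 14, trigger mode in bit 15). Whether Xen's dump derives its fields in exactly this way is an assumption, and with interrupt remapping enabled the message programmed into the device uses a different (remappable) format, so treat this purely as an illustration.

```python
# Delivery mode encodings from the x86 MSI data register layout.
DELIVERY_MODES = {
    0: "fixed",
    1: "lowest",
    2: "smi",
    4: "nmi",
    5: "init",
    7: "extint",
}


def decode_msi_data(data):
    """Decode the low 16 bits of a compatibility-format MSI data word."""
    mode = (data >> 8) & 0x7
    return {
        "vector": data & 0xFF,
        "delivery": DELIVERY_MODES.get(mode, "reserved"),
        "level": "assert" if data & (1 << 14) else "deassert",
        "trigger": "level" if data & (1 << 15) else "edge",
    }


# An all-zero data word decodes to vector 0, fixed, deassert, edge.
zeroed = decode_msi_data(0)
# 0x4171 decodes to vector 0x71, lowest priority, assert, edge.
healthy = decode_msi_data(0x4171)
```

A zeroed data word thus yields the same vector, delivery mode, level and trigger values as reported for MSI 27, while 0x4171 yields those reported for MSI 26.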
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel