[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC] x86/vMSI-X: avoid missing first unmask of vectors



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: 01 April 2016 15:40
> To: Paul Durrant
> Cc: Andrew Cooper; Anthony Perard; Stefano Stabellini; xen-devel; Keir
> (Xen.org)
> Subject: RE: [PATCH RFC] x86/vMSI-X: avoid missing first unmask of vectors
> 
> >>> On 01.04.16 at 16:27, <Paul.Durrant@xxxxxxxxxx> wrote:
> >>  -----Original Message-----
> >> From: Paul Durrant
> >> Sent: 01 April 2016 15:20
> >> To: 'Jan Beulich'
> >> Cc: Andrew Cooper; Anthony Perard; Stefano Stabellini; xen-devel; Keir
> >> (Xen.org)
> >> Subject: RE: [PATCH RFC] x86/vMSI-X: avoid missing first unmask of
> vectors
> >>
> >> > -----Original Message-----
> >> > From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> > Sent: 01 April 2016 15:18
> >> > To: Paul Durrant
> >> > Cc: Andrew Cooper; Anthony Perard; Stefano Stabellini; xen-devel; Keir
> >> > (Xen.org)
> >> > Subject: RE: [PATCH RFC] x86/vMSI-X: avoid missing first unmask of
> vectors
> >> >
> >> > >>> On 01.04.16 at 15:54, <Paul.Durrant@xxxxxxxxxx> wrote:
> >> > >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> > >> Sent: 01 April 2016 14:43
> >> > >> >>> On 01.04.16 at 15:01, <Paul.Durrant@xxxxxxxxxx> wrote:
> >> > >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> > >> >> Sent: 01 April 2016 12:21
> >> > >> >> >>> On 01.04.16 at 12:56, <Paul.Durrant@xxxxxxxxxx> wrote:
> >> > >> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> > >> >> >> Sent: 01 April 2016 10:59
> >> > >> >> > I guess it could be handled entirely in Xen if we are willing to
> snoop
> >> > on
> >> > >> >> > PCI configuration. It would not be too hard to snoop guest
> writes to
> >> > the
> >> > >> >> BARs
> >> > >> >> > in config space so that Xen can keep track of where they are.
> >> > Snooping
> >> > >> on
> >> > >> >> the
> >> > >> >> > MSI-X capability could then tell Xen when to start interposing on
> >> the
> >> > >> table,
> >> > >> >> > and allow it to discover the GPA at that point (via the BIR and
> offset
> >> > >> >> > values).
> >> > >> >>
> >> > >> >> Well, that's a possibility, but won't - afaict - work without 
> >> > >> >> qemu's
> >> > >> >> help at another point: So far we don't know the guest's PCI bus
> >> > >> >> topology, hence we can't correlate vBAR writes we might snoop
> >> > >> >> with the physical devices they correspond to.
> >> > >> >
> >> > >> > Does Xen need to know that correspondence just to track state? I
> >> > thought
> >> > >> the
> >> > >> > problem here was that Xen does not see every guest access to an
> >> MSI-X
> >> > >> table.
> >> > >> > If Xen always interposes on MSI-X tables then it can at least track
> the
> >> > > state
> >> > >> > of the emulated table, even if we end up just forwarding the
> access
> >> for
> >> > >> QEMU
> >> > >> > to handle at first. When the mapping is created to the actual h/w
> table
> >> > >> then,
> >> > >> > presumably, Xen's idea of state should correspond to QEMU's.
> >> > >>
> >> > >> But Xen doesn't see the guest view of config space,
> >> > >
> >> > > Well Xen interposes on every single config cycle so arguably it sees
> >> exactly
> >> > > what the guest sees.
> >> >
> >> > Ah, so you mean to snoop what qemu returns. Yes, that would be
> >> > an option.
> >> >
> >> > >> And additionally msixtbl_addr_to_desc() needs to know the physical
> >> > >> device.
> >> > >
> >> > > Yes, but msixtbl_range() could be trivially changed to accept any
> access
> >> > > where msixtbl_find_entry() returns non-NULL. That would allow
> >> > msixtbl_write()
> >> > > to manipulate entry->flags even if msixtbl_addr_to_desc() returns
> NULL.
> >> >
> >> > Are you looking at some old code base? There's no entry->flags
> >> > manipulation. We call guest_mask_msi_irq(), and for that we need
> >> > to know the IRQ descriptor, which in turn requires knowing the
> >> > pdev (for msixtbl_addr_to_desc() to return non-NULL).
> >>
> >> Ah, maybe I'm out of date. I haven't pulled for a day or so.
> >>
> >
> > I pulled staging and I still see (starting at line 300 in vmsi.c)
> >
> >     /* Exit to device model when unmasking and address/data got modified.
> */
> >     if ( !(val & PCI_MSIX_VECTOR_BITMASK) &&
> >          test_and_clear_bit(nr_entry, &entry->table_flags) )
> >     {
> >         v->arch.hvm_vcpu.hvm_io.msix_unmask_address = address;
> >         goto out;
> >     }
> >
> >     msi_desc = msixtbl_addr_to_desc(entry, address);
> >     if ( !msi_desc || msi_desc->irq < 0 )
> >         goto out;
> >
> > I was wrong about the name. I meant 'entry->table_flags', and that's clearly
> > manipulated before calling msixtbl_addr_to_desc() so even if that returns
> > NULL Xen still keeps in sync with QEMU AFAICT.
> 
> Staying in sync with qemu is not the problem. The problem is that
> the re-invocation from msix_write_completion() then would get
> past this, and find said NULL returned from msixtbl_addr_to_desc().
> 

True, but perhaps if we know this is a completion cycle then we should always 
set r to X86EMUL_OKAY? TBH I'm not sure what would happen if we didn't... I 
suspect the emulation state machine would get upset.

  Paul

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.