Xen project Mailing List

Re: [Xen-devel] [PATCH] passthrough: give XEN_DOMCTL_test_assign_device more sane semantics

To: George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>

From: Daniel De Graaf <dgdegra@xxxxxxxxxxxxx>

Date: Thu, 22 Jun 2017 16:30:07 -0400

Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Thu, 22 Jun 2017 20:32:03 +0000

Ironport-phdr: 9a23:qYgbNx+7TxwGGv9uRHKM819IXTAuvvDOBiVQ1KB+1+MQIJqq85mqBkHD//Il1AaPBtSEraMewLOK7OjJYi8p2d65qncMcZhBBVcuqP49uEgeOvODElDxN/XwbiY3T4xoXV5h+GynYwAOQJ6tL1LdrWev4jEMBx7xKRR6JvjvGo7Vks+7y/2+94fdbghMhjexe7d/IAu5oQnMucQbg5ZpJ7osxBfOvnZGYfldy3lyJVKUkRb858Ow84Bm/i9Npf8v9NNOXLvjcaggQrNWEDopM2Yu5M32rhbDVheA5mEdUmoNjBVFBRXO4QzgUZfwtiv6sfd92DWfMMbrQ704RSiu4qF2QxLzliwJKyA2/33WisxojaJUvhShpwBkw4XJZI2ZLedycr/Bcd8fQ2dOUNxRVyhcCY2iaYUBAfcKMeJBo4Xju1cCqB2zDhSuCuzy0D9FnmL407M00+ohFgHI3wIuENwBv3vWsNr7O7wfUfy3waTS0TnPc/1b1Df75YPVch4hu/aMXbdofMTf00YvEQLFgUuUqYf4MT2ayPkGvWmB7+V6T+2vhXMspgZsrTig28gjlIbJhpgPxV3f6SV4wJo6Jd2/SEJhZ96kC4FfuzuVN4txXMMvWmdlszs0xL0BvJ60ZikKyJI/yhLDdfCGfJKE4g/4VOuXPDx2h2pldaqiixu9/kWs0O3xWtSu3FpUoSdJjMPAum0L2hfO8MaIUOF98V2k2TuX0gDT7fxLLl4smKrALp4h3qYwlp0OsUTfBiP2mFv5jKuRdkg85uin8f7nYrT7pp+HLYN0lgH/Pbgumsy4G+g4NBQBX3OH9uim0b3j/En5TK1Ljv0wjKbZrIjXKdkUq6O2GQNY0psv5wyhAzqpztgUh2QLIEpAeB2djojpP1/OIOr/Dfe6m1mjjThryO3YMb3uGJXCNGPOkKvhfLlh605czxA/zdZE551OEL0BL/XzWlHpuNzCEhA5KxC0w/rgCNhl2YMRR2WPArWWMKzMq1OH+/8vI++IZIAPoDbwMOQq5//yjX8jmF8ccrOl0ocQaHC9Bv5mOVmWYWLwgtcdFmcHpg4wQfH0h12fVT5TZmq9X6In6zEgFYKmFpnMSpqxj7yG2SexBodWaXxeClCQDXfocJ2JW/URaCKWI89ujDoEWaKuS487zx6usAv6xqF9IerO/y0Ur47s1N9w5+fLjxE96SR0D9iB02GKV2x7hGUISCIs3K9hr01x0EuM0a9/g/xAC9NT/f1EXxwmOp7d0+x6EdHyWw3bctiVT1amR82sASstQdIp398Of0F9Fs2sjx/d3iqmGbsVl72WBJAq6a/Tw3nxJ9pny3bH26gtlUUpQsxKNWe+nK5w6xDTB5LVk0Wej6ukdLoT3CnX9GeM02WCpk9WUBN2UaXBR38fflDbosrk5k/YU7CuCKgnMhFAyc+NMKdFdtrpjVBeTvf5JNvee36xm3u3BRuQxb2Ddozqd38Z0irHFEcEkBsT/XGANQUlGCihvnjSAyBvFVLzeUPs8OZ/pGmnQU8zygGAd1dh2Kat+h4JmfycTOse3qkfuCc9sTp7B0iy39bSC9qBoQphfb5RYdAj71dd02LWqQh9MoanL6B4iV4Uax53sF/21xVrFoVAltAnrXw0wwp0MK6XzU1Ody2G0pD0IbDXLmjy/Auza67NwF3f38iZ+qEX6PQirFXjvh+mGVY+83l91NlVyXSc7I3QDAUOSZLxTlo39x9iqrHZZSk94ZnU2mdxPqWuvD7C2tYpBOg+xxanZddQKr+LFAvsHMEAG8euL+kqkUCzbh0YJOBS6LI0P8S+evua2a6rOf1tnT24gmRB+ox91ViM9yUvAtLPirkIxOuX00OrSjH4hU/p5s/6nppLaHceA2y7wDDMD49NfKxiO40MDDHqa+++wJ1UioPpWnVYvAqBLV4b3M6ieTKJckfwmwZX0BJEj2agnH6Uxjp1njVhgqfX8zbHyuqqIBYINmNEXmBKkUbnIY/yicsTGkevcV56x1Oe+U/myv0D9+xEJG7JTBINJnKuIg==

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 06/22/2017 05:40 AM, George Dunlap wrote:

On 22/06/17 08:05, Jan Beulich wrote:

On 21.06.17 at 18:36, <george.dunlap@xxxxxxxxxx> wrote:

On 21/06/17 16:59, Jan Beulich wrote:

On 21.06.17 at 16:38, <george.dunlap@xxxxxxxxxx> wrote:

On 21/06/17 11:08, Jan Beulich wrote:

So far callers of the libxc interface passed in a domain ID which was
then ignored in the hypervisor. Instead, make the hypervisor honor it
(accepting DOMID_INVALID to obtain original behavior), allowing to
query whether a device is assigned to a particular domain. Ignore the
passed in domain ID at the libxc layer instead, in order to not break
existing callers. New libxc functions would need to be added if callers
wanted to leverage the new functionality.


I don't think your modified description matches the name of the call at all.

It looks like the callers expect "test_assign_device" to answer the
question: "Can I assign a device to this domain"?


I don't think so - the question being answered by the original
operation is "Is this device assigned to any domain?" with the
implied inverse "Is this device available to be assigned to some
domain (i.e. it is currently unassigned or owned by Dom0)?"


If the question were "Is this device assigned to any domain?", then I
would expect:
1. The return value to be a boolean
2. It would always return, "No it's not assigned" in the case where
there is no IOMMU.

However, that's not what happens:
1. It returns "success" if there is an IOMMU and the device is *not*
assigned, and returns an error if the device is assigned
2. It returns an error if there is no IOMMU.

The only place in the code this is called 'for real' in the tree is in
libxl_pci.c:libxl__device_pci_add()

     if (libxl__domain_type(gc, domid) == LIBXL_DOMAIN_TYPE_HVM) {
         rc = xc_test_assign_device(ctx->xch, domid,
pcidev_encode_bdf(pcidev));
         if (rc) {
             LOGD(ERROR, domid,
                  "PCI device %04x:%02x:%02x.%u %s?",
                  pcidev->domain, pcidev->bus, pcidev->dev, pcidev->func,
                  errno == ENOSYS ? "cannot be assigned - no IOMMU"
                  : "already assigned to a different guest");
             goto out;
         }
     }

Here 'domid' is the domain to which libxl wants to assign the device.
So libxl is now asking Xen, "Am I allowed to assign device $bdf to
domain $domain?"

Your description provides the *algorithm* by which Xen normally provides
an answer: that is, normally the only thing Xen cares about is that it
hasn't already been assigned to a domain.  But it still remains the case
that what libxl is asking is, "Can I assign X to Y?"


Taking the log message into account that you quote, I do not
view the code's intention to be what you describe.


Well, I'm not sure what to say, because in my view the log message
supports my view. :-)  Note that there are two errors, both explaining
why the domain cannot be assigned -- one is "no IOMMU", one is "already
assigned to a different guest".

Yes, at the moment it doesn't have a separate message for -EPERM (which
is presumably what XSM would return if there were some other problem).
But it also doesn't correctly report other potential errors: -ENODEV if
you try to assign a DT device on a PCI-based system, or a PCI device on
a DT-based system.  (Apparently we also retirn -EINVAL if you included
inappropriate flags, *or* if the device didn't exist, *or* if the device
was already assigned somehwere else.  As long as we're re-painting
things we should probably change this as well.)

But to make test_assign_device answer the question, "Is this assigned to
a domU?", you'd have to have it return SUCCESS when there is no IOMMU
(since the device is not, in fact, assigned to a domU); and thus libxl
would have to make a separate call to find out if an IOMMU was present.

It looks like it's meant to be used in XSM environments, to allow a
policy to permit or forbid specific guests to have access to specific
devices.  On a default (non-XSM) system, the answer to that question
doesn't depend on the domain it's being assigned to, but only whether
the device is already assigned to another domain; but on XSM systems the
logic can presumably be more complicated.

That sounds like a perfectly sane semantic to me, and this patch removes
that ability.


And again I don't think so: Prior to the patch, do_domctl() at its
very top makes sure to entirely ignore the passed in domain ID.
This code sits ahead of the XSM check, so XSM has no way of
knowing which domain has been specified by the caller.


Right, I see that now.

Still, I assert that the original hypercall semantics is a very useful
one, and what you're doing is changing the hypercall such that the
question can no longer be asked.  It would be better to extend things so
that XSM can actually deny device assignment based on both the bdf and
the domain.

Do you have a particular use case in mind for your alternate hypercall?


No - I'm open to any change to it which makes the currently ignored
argument no longer ignored, without breaking existing (known and
unknown) callers of the libxc wrapper. I.e. I'm in no way opposed to
make it work the way you think it was originally meant to work; it is
just that given its current use I've come to a different conclusion as
to what the original intention may have been.


So the libxc library interface is not meant to be stable.  Before I
looked at how libxl was using it, I was going to say that we should just
remove the domid argument from that function entirely, rather than
labelling it "unused".

I suggest we ask the toolstack maintainers what kind of a function they
think would be most useful, and then we can implement that.

So, Wei and Ian (and Daniel if you're around):

At the moment, xc_test_assign_device() accepts both a domid and a device
identifier.  It will return -ENOSYS if there is no IOMMU, and

The domid, however, is ignored by Xen, and there is no possibility even
of XSM/Flask paying any attention to it because the domid is not passed
to the XSM hook.

Jan thinks the color of this shed is ugly and wants to repaint it.  I
agree that the color of this shed is ugly, but think a different color
would be more suitable.  Since the function is ultimately mainly for the
benefit of toolstacks like libxl, we'd like your input:

Option 1: Make xc_test_assign_device() explicitly take only a device
identifier.  Add another xc function which takes a domain argument,
which will return true if the BDF is assigned to that particular domain,
and false otherwise.
   1a: Leave the domid argument, but add a comment specifying it as ignored
   1b: Remove the domid argument.

Option 2: Pass the domain to the XSM callback, enabling XSM / Flask
policies that can forbid specific devices from being assigned to
specific guests.

NB that neither of us has a particular requirement for the proposed
additional functionality ("Device X is assigned to domid Y" in the case
of Option 1, or "Flask policy can allow or forbid specific devices to
specific domains" in the case of Option 2).

Any preferences?

  -George

One caveat to add to option 2: the XSM permission check done by test_assign would only cover the PCI device resource (same as assign_device). There are also XSM checks that occur when assigning the MMIO and IO port regions assigned to the device (defined by the BARs), and these ranges are not passed to the hypervisor during the test function. A proper test_assign function as described above would also need to check the XSM permissions for them, which would require adding test_io{port,mem,q}_permission functions too. Alternatively, you could assume that the PCI device and its associated resources all have the same label (which will be almost always be true in a properly configured system) and just use this as an early bail out to avoid user mistakes. -- Daniel De Graaf National Security Agency _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.