[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [v8][PATCH 02/17] introduce XEN_DOMCTL_set_rdm
On Mon, Dec 08, 2014 at 11:16:07AM +0800, Chen, Tiejun wrote: > > On 2014/12/3 3:39, Konrad Rzeszutek Wilk wrote: > >On Mon, Dec 01, 2014 at 05:24:20PM +0800, Tiejun Chen wrote: > >>This should be based on a new parameter globally, 'pci_rdmforce'. > >> > >>pci_rdmforce = 1 => Of course this should be 0 by default. > >> > >>'1' means we should force check to reserve all ranges. If failed > >>VM wouldn't be created successfully. This also can give user a > >>chance to work well with later hotplug, even if not a device > >>assignment while creating VM. > >> > >>But we can override that by one specific pci device: > >> > >>pci = ['AA:BB.CC,rdmforce=0/1] > >> > >>But this 'rdmforce' should be 1 by default since obviously any > >>passthrough device always need to do this. Actually no one really > >>want to set as '0' so it may be unnecessary but I'd like to leave > >>this as a potential approach. > >> > >>So this domctl provides an approach to control how to populate > >>reserved device memory by tools. > >> > >>Note we always post a message to user about this once we owns > >>RMRR. > >> > >>Signed-off-by: Tiejun Chen <tiejun.chen@xxxxxxxxx> > >>--- > >> docs/man/xl.cfg.pod.5 | 6 +++++ > >> docs/misc/vtd.txt | 15 ++++++++++++ > >> tools/libxc/include/xenctrl.h | 6 +++++ > >> tools/libxc/xc_domain.c | 28 +++++++++++++++++++++++ > >> tools/libxl/libxl_create.c | 3 +++ > >> tools/libxl/libxl_dm.c | 47 > >> ++++++++++++++++++++++++++++++++++++++ > >> tools/libxl/libxl_internal.h | 4 ++++ > >> tools/libxl/libxl_types.idl | 2 ++ > >> tools/libxl/libxlu_pci.c | 2 ++ > >> tools/libxl/xl_cmdimpl.c | 10 ++++++++ > > > >In the past we had split the hypervisor and the > >toolstack patches in two. So that one could focus > >on the hypervisor ones first, and then in another > >patch on the toolstack. > > > > Yes. > > >But perhaps this was intended to be in one patch? > > This change also involve docs so its little bit harder to understand the > whole page if we split this. > > > > >> xen/drivers/passthrough/pci.c | 39 +++++++++++++++++++++++++++++++ > >> xen/drivers/passthrough/vtd/dmar.c | 8 +++++++ > >> xen/include/asm-x86/hvm/domain.h | 4 ++++ > > > >I don't see ARM here? Should there be an ARM variant of this? If not > > ARM doesn't need this feature. > > >should the toolstack ones only run under x86? > > And I think this shouldn't broken current ARM path as well. I mean this > would return simply since ARM hasn't such a hypercall handler. > > > > >> xen/include/public/domctl.h | 21 +++++++++++++++++ > >> xen/xsm/flask/hooks.c | 1 + > >> 15 files changed, 196 insertions(+) > >> > >>diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 > >>index 622ea53..9adc41e 100644 > >>--- a/docs/man/xl.cfg.pod.5 > >>+++ b/docs/man/xl.cfg.pod.5 > >>@@ -645,6 +645,12 @@ dom0 without confirmation. Please use with care. > >> D0-D3hot power management states for the PCI device. False (0) by > >> default. > >> > >>+=item B<rdmforce=BOOLEAN> > >>+ > >>+(HVM/x86 only) Specifies that the VM would force to check and try to > > > >s/force/forced/ > > I guess you're saying 'be forced'. > > >>+reserve all reserved device memory, like RMRR, associated to the PCI > >>+device. False (0) by default. > > > >Not sure I understand. How would the VM be forced to do this? Or is > >it that the hvmloader would force to do this? And if it fails (as you > > Yes. > > >say 'try') ? What then? > > In most cases we can reserve these regions but if these RMRR regions overlap > with some fixed range, like guest BIOS, we can't succeed in this case. > > > > >>+ > >> =back > >> > >> =back > >>diff --git a/docs/misc/vtd.txt b/docs/misc/vtd.txt > >>index 9af0e99..23544d5 100644 > >>--- a/docs/misc/vtd.txt > >>+++ b/docs/misc/vtd.txt > >>@@ -111,6 +111,21 @@ in the config file: > >> To override for a specific device: > >> pci = [ '01:00.0,msitranslate=0', '03:00.0' ] > >> > >>+RDM, 'reserved device memory', for PCI Device Passthrough > >>+--------------------------------------------------------- > >>+ > >>+The BIOS controls some devices in terms of some reginos of memory used for > > > >Could you elaborate what 'some devices' are? Network cards? GPUs? What > >are the most commons ones. > > Some legacy USB device to perform PS2 emulation, and GPU has a stolen memory > as I remember. > > > > >s/reginos/regions/ > > Fixed. > > > > >And by regions you mean BAR regions? > > No. I guess you want to know some background about RMRR :) > > There's a good brief description in Linux: > > What is RMRR? > ------------- > > There are some devices the BIOS controls, for e.g USB devices to perform > PS2 emulation. The regions of memory used for these devices are marked > reserved in the e820 map. When we turn on DMA translation, DMA to those > regions will fail. Hence BIOS uses RMRR to specify these regions along with > devices that need to access these regions. OS is expected to setup > unity mappings for these regions for these devices to access these regions. > > > > >>+these devices. This kind of region should be reserved before creating a VM > >>+to make sure they are not occupied by RAM/MMIO to conflict, and also we can > > > >You said 'This' but here you are using the plural ' are'. IF you want it > >plural > >it needs to be 'These regions' > > Thanks for your correction. > > >>+create necessary IOMMU table successfully. > >>+ > >>+To enable this globally, add "pci_rdmforce" in the config file: > >>+ > >>+ pci_rdmforce = 1 (default is 0) > > > >The guest config file? Or /etc/xen/xl.conf ? > > The guest config file. Here I just follow something about 'pci_msitranslate' > since they have that usage in common. > > > > >>+ > >>+Or just enable for a specific device: > >>+ pci = [ '01:00.0,rdmforce=1', '03:00.0' ] > >>+ > >> > >> Caveat on Conventional PCI Device Passthrough > >> --------------------------------------------- > >>diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h > > [snip] > > >>--- a/xen/drivers/passthrough/pci.c > >>+++ b/xen/drivers/passthrough/pci.c > >>@@ -34,6 +34,7 @@ > >> #include <xen/tasklet.h> > >> #include <xsm/xsm.h> > >> #include <asm/msi.h> > >>+#include <xen/stdbool.h> > >> > >> struct pci_seg { > >> struct list_head alldevs_list; > >>@@ -1553,6 +1554,44 @@ int iommu_do_pci_domctl( > >> } > >> break; > >> > >>+ case XEN_DOMCTL_set_rdm: > >>+ { > >>+ struct xen_domctl_set_rdm *xdsr = &domctl->u.set_rdm; > >>+ struct xen_guest_pcidev_info *pcidevs = NULL; > >>+ struct domain *d = rcu_lock_domain_by_any_id(domctl->domain); > >>+ > >>+ if ( d == NULL ) > >>+ return -ESRCH; > >>+ > > > >What if this is called on an PV domain? > > Currently we just support this in HVM, so I'd like to add this, > > if ( d == NULL ) > return -ESRCH; > > + ASSERT( is_hvm_domain(d) ); > + No. Please don't crash the hypervisor. Just return -ENOSYS or such when done for PV guests. > > > > >You are also missing the XSM checks. > > Just see this below. > > > > >What if this is called multiple times. Is it OK to over-ride > >the 'pci_force' or should it stick once? > > It should be fine since just xc/hvmloader need such an information while > creating a VM. > > And especially, currently we just call this one time to set. So why we need > to call this again and again? I think if anyone want to extend such a case > you're worrying, he should know any effect before he take a action, right? Program defensively and also think about preemption. If this call end up being preempted you might need to call it again. Or if the third-party toolstack use this operation and call this with wacky values? > > > > > > >>+ d->arch.hvm_domain.pci_force = > >>+ xdsr->flags & PCI_DEV_RDM_CHECK ? true : false; > > > >Won't we crash here if this is called for PV guests? > > After the line, 'ASSERT( is_hvm_domain(d) );', is added, this problem should > be gone. No it won't be. You will just crash the hypervisor. Please please put yourself in the mind that the toolstack can (and will) have bugs. > > > > >>+ d->arch.hvm_domain.num_pcidevs = xdsr->num_pcidevs; > > > >What if the 'num_pcidevs' has some bogus value. You need to check for that. > > This value is grabbed from that existing interface, assign_device, so I mean > this is already checked. > > > > > > >>+ d->arch.hvm_domain.pcidevs = NULL; > > > >Please first free it. It might be that the toolstack > >is doing this a couple of times. You don't want to leak memory. > > > > Okay, > > + if ( d->arch.hvm_domain.pcidevs ) > + xfree(d->arch.hvm_domain.pcidevs); > > > > >>+ > >>+ if ( xdsr->num_pcidevs ) > >>+ { > >>+ pcidevs = xmalloc_array(xen_guest_pcidev_info_t, > >>+ xdsr->num_pcidevs); > >>+ if ( pcidevs == NULL ) > >>+ { > >>+ rcu_unlock_domain(d); > >>+ return -ENOMEM; > > > >But you already have set 'num_pcidevs' to some value. This copying/check > >should be done before you modify 'd->arch.hvm_domain'... > > This makes sense so I'll move down this fragment. > > >>+ } > >>+ > >>+ if ( copy_from_guest(pcidevs, xdsr->pcidevs, > >>+ xdsr->num_pcidevs*sizeof(*pcidevs)) ) > >>+ { > >>+ xfree(pcidevs); > >>+ rcu_unlock_domain(d); > > > >Ditto. You need to do these checks before you modify 'd->arch.hvm_domain'. > > > >>+ return -EFAULT; > >>+ } > >>+ } > >>+ > >>+ d->arch.hvm_domain.pcidevs = pcidevs; > >>+ rcu_unlock_domain(d); > >>+ } > >>+ break; > >>+ > >> case XEN_DOMCTL_assign_device: > >> if ( unlikely(d->is_dying) ) > >> { > >>diff --git a/xen/drivers/passthrough/vtd/dmar.c > >>b/xen/drivers/passthrough/vtd/dmar.c > >>index 1152c3a..5e41e7a 100644 > >>--- a/xen/drivers/passthrough/vtd/dmar.c > >>+++ b/xen/drivers/passthrough/vtd/dmar.c > >>@@ -674,6 +674,14 @@ acpi_parse_one_rmrr(struct acpi_dmar_header *header) > >> " RMRR region: base_addr %"PRIx64 > >> " end_address %"PRIx64"\n", > >> rmrru->base_address, rmrru->end_address); > >>+ /* > >>+ * TODO: we may provide a precise paramter just to reserve > > > >s/paramter/parameter/ > > Fixed. > > >>+ * RMRR range specific to one device. > >>+ */ > >>+ dprintk(XENLOG_WARNING VTDPREFIX, > >>+ "So please set pci_rdmforce to reserve these ranges" > >>+ " if you need such a device in hotplug case.\n"); > > s/hotplug/passthrough > > > > >'Please set rdmforce to reserve ranges %lx->%lx if you plan to hotplug this > >device.' > > > >But then this is going to be a bit verbose, so perhaps: > > > >'Ranges %lx-%lx need rdmforce to properly work.' ? > > Its unnecessary to output range again since we already have such a print > message here. > > > > >>+ > >> acpi_register_rmrr_unit(rmrru); > >> } > >> } > >>diff --git a/xen/include/asm-x86/hvm/domain.h > >>b/xen/include/asm-x86/hvm/domain.h > >>index 2757c7f..38530e5 100644 > >>--- a/xen/include/asm-x86/hvm/domain.h > >>+++ b/xen/include/asm-x86/hvm/domain.h > >>@@ -90,6 +90,10 @@ struct hvm_domain { > >> /* Cached CF8 for guest PCI config cycles */ > >> uint32_t pci_cf8; > >> > > > >Maybe a comment explaining its purpose? > > Okay. > > /* Force to check/reserve Reserved Device Memory. */ > bool_t pci_force; > > > > >>+ bool_t pci_force; > >>+ uint32_t num_pcidevs; > >>+ struct xen_guest_pcidev_info *pcidevs; > >>+ > > > >You are also missing freeing of this in the hypervisor when the guest > >is destroyed. Please fix that. > > You're right. I will go there next revision. > > > > >> struct pl_time pl_time; > >> > >> struct hvm_io_handler *io_handler; > >>diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h > >>index 57e2ed7..ba8970d 100644 > >>--- a/xen/include/public/domctl.h > >>+++ b/xen/include/public/domctl.h > >>@@ -508,6 +508,25 @@ struct xen_domctl_get_device_group { > >> typedef struct xen_domctl_get_device_group xen_domctl_get_device_group_t; > >> DEFINE_XEN_GUEST_HANDLE(xen_domctl_get_device_group_t); > >> > >>+/* Currently just one bit to indicate force to check Reserved Device > >>Memory. */ > > > >Not sure I understand. Did you mean: > > > >'Check Reserved Device Memory'. > > I can change this as '...force checking Reserved Device Memory.' > > > > >What happens if you do not have this flag? What are the semantics > >of this hypercall - as in what will it mean. > > If we have no this flag, these devices owned RMRR can't work in passthrough > case. > > > > >>+#define PCI_DEV_RDM_CHECK 0x1 > >>+struct xen_guest_pcidev_info { > >>+ uint16_t seg; > >>+ uint8_t bus; > >>+ uint8_t devfn; > >>+ uint32_t flags; > >>+}; > >>+typedef struct xen_guest_pcidev_info xen_guest_pcidev_info_t; > >>+DEFINE_XEN_GUEST_HANDLE(xen_guest_pcidev_info_t); > >>+/* Control whether/how we check and reserve device memory. */ > >>+struct xen_domctl_set_rdm { > >>+ uint32_t flags; > > > >What is this 'flags' purpose compared to the 'pcidevs.flags'? Please > >explain. > > I replied something to Kevin, and we just need a global flag so we can > remove pcidevs.flags. > > > > >>+ uint32_t num_pcidevs; > >>+ XEN_GUEST_HANDLE_64(xen_guest_pcidev_info_t) pcidevs; > >>+}; > >>+typedef struct xen_domctl_set_rdm xen_domctl_set_rdm_t; > >>+DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_rdm_t); > >>+ > >> /* Pass-through interrupts: bind real irq -> hvm devfn. */ > >> /* XEN_DOMCTL_bind_pt_irq */ > >> /* XEN_DOMCTL_unbind_pt_irq */ > >>@@ -1070,6 +1089,7 @@ struct xen_domctl { > >> #define XEN_DOMCTL_setvnumainfo 74 > >> #define XEN_DOMCTL_psr_cmt_op 75 > >> #define XEN_DOMCTL_arm_configure_domain 76 > >>+#define XEN_DOMCTL_set_rdm 77 > >> #define XEN_DOMCTL_gdbsx_guestmemio 1000 > >> #define XEN_DOMCTL_gdbsx_pausevcpu 1001 > >> #define XEN_DOMCTL_gdbsx_unpausevcpu 1002 > >>@@ -1135,6 +1155,7 @@ struct xen_domctl { > >> struct xen_domctl_gdbsx_domstatus gdbsx_domstatus; > >> struct xen_domctl_vnuma vnuma; > >> struct xen_domctl_psr_cmt_op psr_cmt_op; > >>+ struct xen_domctl_set_rdm set_rdm; > >> uint8_t pad[128]; > >> } u; > >> }; > >>diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c > >>index d48463f..5a760e2 100644 > >>--- a/xen/xsm/flask/hooks.c > >>+++ b/xen/xsm/flask/hooks.c > >>@@ -592,6 +592,7 @@ static int flask_domctl(struct domain *d, int cmd) > >> case XEN_DOMCTL_test_assign_device: > >> case XEN_DOMCTL_assign_device: > >> case XEN_DOMCTL_deassign_device: > >>+ case XEN_DOMCTL_set_rdm: > > > >There is more to XSM than just this file.. > > But I don't see more other stuff, like XEN_DOMCTL_assign_device. > > > > >Please compile with XSM enabled. > > Anyway, I add XSM_ENABLE = y and FLASK_ENABLE = y in Config.mk then > recompile, but looks good. > > Anything I'm missing? Ah, then it is fine! > > >> #endif > >> return 0; > > > > > >Also how does this work with 32-bit dom0s? Is there a need to use the > >compat layer? > > Are you saying in xsm case? Others? > > Actually this new DOMCTL is similar with XEN_DOMCTL_assign_device in some > senses but I don't see such an issue you're pointing. I was thinking about the compat layer and making sure it works properly. > > Thanks > Tiejun _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |