|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [RFC][PATCH 01/13] tools: introduce some new parameters to set rdm policy
On Mon, May 11, 2015 at 01:35:06PM +0800, Chen, Tiejun wrote:
> On 2015/5/8 21:04, Wei Liu wrote:
> >Sorry for the late review.
> >
>
> Really thanks for taking your time :)
>
> >On Fri, Apr 10, 2015 at 05:21:52PM +0800, Tiejun Chen wrote:
> >>This patch introduces user configurable parameters to specify RDM
> >>resource and according policies,
> >>
> >>Global RDM parameter:
> >> rdm = [ 'host, reserve=force/try' ]
> >>Per-device RDM parameter:
> >> pci = [ 'sbdf, rdm_reserve=force/try' ]
> >>
> >>Global RDM parameter allows user to specify reserved regions explicitly,
> >>e.g. using 'host' to include all reserved regions reported on this platform
> >>which is good to handle hotplug scenario. In the future this parameter
> >>may be further extended to allow specifying random regions, e.g. even
> >>those belonging to another platform as a preparation for live migration
> >>with passthrough devices.
> >>
> >>'force/try' policy decides how to handle conflict when reserving RDM
> >>regions in pfn space. If conflict exists, 'force' means an immediate error
> >>so VM will be killed, while 'try' allows moving forward with a warning
> >>message thrown out.
> >>
> >>Default per-device RDM policy is 'force', while default global RDM policy
> >>is 'try'. When both policies are specified on a given region, 'force' is
> >>always preferred.
> >>
> >>Signed-off-by: Tiejun Chen <tiejun.chen@xxxxxxxxx>
> >>---
> >> docs/man/xl.cfg.pod.5 | 44 +++++++++++++++++++++++++
> >> docs/misc/vtd.txt | 34 ++++++++++++++++++++
> >> tools/libxl/libxl_create.c | 5 +++
> >> tools/libxl/libxl_types.idl | 18 +++++++++++
> >> tools/libxl/libxlu_pci.c | 78
> >> +++++++++++++++++++++++++++++++++++++++++++++
> >> tools/libxl/libxlutil.h | 4 +++
> >> tools/libxl/xl_cmdimpl.c | 21 +++++++++++-
> >> 7 files changed, 203 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> >>index 408653f..9ed3055 100644
> >>--- a/docs/man/xl.cfg.pod.5
> >>+++ b/docs/man/xl.cfg.pod.5
> >>@@ -583,6 +583,36 @@ assigned slave device.
> >>
> >> =back
> >>
> >>+=item B<rdm=[ "TYPE", "RDM_RESERVE_STRING", ... ]>
> >>+
> >
> >Shouldn't this be "TYPE,RDM_RESERVE_STRIGN" according to your commit
> >message? If the only available config is just one string, you probably
> >don't need a list for this?
>
> Yes, based on that design we don't need a list. So
>
> =item B<rdm=[ "RDM_RESERVE_STRING" ]>
>
Note that this is still a list (enclosed by "[]"). Maybe you mean
rdm = "RDM_RESERVE_STRING"
?
> >
> >>+(HVM/x86 only) Specifies the information about Reserved Device Memory
> >>(RDM),
> >>+which is necessary to enable robust device passthrough usage. One example
> >>of
> >>+RDM is reported through ACPI Reserved Memory Region Reporting (RMRR)
> >>+structure on x86 platform.
> >>+Each B<RDM_CHECK_STRING> has the form C<["TYPE",KEY=VALUE,KEY=VALUE,...>
> >>where:
> >>+
> >
> >RDM_CHECK_STRING?
>
> And here should be corrected like this,
>
> B<RDM_RESERVE_STRING> has the form ...
>
> >
> >>+=over 4
> >>+
> >>+=item B<"TYPE">
> >>+
> >>+Currently we just have one type. 'host' means all reserved device memory on
> >>+this platform should be reserved in this VM's pfn space.
> >>+
> >
> >What are other possible types? If there is only one type then we can
>
> Currently we just have one type and looks that design doesn't make this
> clear.
>
> >simply ignore the type?
>
> I just think we may introduce something else specific to live migration in
> the future... But I'm really not sure right now.
>
Fair enough. I was just wondering if there would be any other types. If
so we do need provisioning.
In any case, the "type" argument you proposed is a positional argument
(you require it to be the first element of the spec string").
I think you can just make it a key-value pair to make parsing easier.
> >
> >>+=item B<KEY=VALUE>
> >>+
> >>+Possible B<KEY>s are:
> >>+
> >>+=over 4
> >>+
> >>+=item B<reserve="STRING">
> >>+
> >>+Conflict may be detected when reserving reserved device memory in gfn
> >>space.
> >>+'force' means an unsolved conflict leads to immediate VM destroy, while
> >
> >Do you mean "immediate VM crash"?
>
> Yes. So I guess I should replace this.
>
> >
> >>+'try' allows VM moving forward with a warning message thrown out. 'try'
> >>+is default.
> >
> >Can you please your double quotes for "force", "try" etc.
>
> Sure. Just note we'd like to use "strict"/"relaxed" to replace "force"/"try"
> from next revision according to Jan's suggestion.
>
No problem.
> >
> >>+
> >>+Note this may be overrided by another sub item, rdm_reserve, in pci device.
> >>+
> >> =item B<pci=[ "PCI_SPEC_STRING", "PCI_SPEC_STRING", ... ]>
> >>
> >> Specifies the host PCI devices to passthrough to this guest. Each
> >> B<PCI_SPEC_STRING>
> >>@@ -645,6 +675,20 @@ dom0 without confirmation. Please use with care.
> >> D0-D3hot power management states for the PCI device. False (0) by
> >> default.
> >>
> >>+=item B<rdm_check="STRING">
> >>+
> >>+(HVM/x86 only) Specifies the information about Reserved Device Memory
> >>(RDM),
> >>+which is necessary to enable robust device passthrough usage. One example
> >>of
> >>+RDM is reported through ACPI Reserved Memory Region Reporting (RMRR)
> >>+structure on x86 platform.
> >>+
> >>+Conflict may be detected when reserving reserved device memory in gfn
> >>space.
> >>+'force' means an unsolved conflict leads to immediate VM destroy, while
> >>+'try' allows VM moving forward with a warning message thrown out. 'force'
> >>+is default.
> >>+
> >>+Note this would override another global item, rdm = [''].
> >>+
> >
> >Note this would override global B<rdm> option.
>
> Fixed.
>
> >
> >> =back
> >>
> >> =back
> >>diff --git a/docs/misc/vtd.txt b/docs/misc/vtd.txt
> >>index 9af0e99..d7434d6 100644
> >>--- a/docs/misc/vtd.txt
> >>+++ b/docs/misc/vtd.txt
> >>@@ -111,6 +111,40 @@ in the config file:
> >> To override for a specific device:
> >> pci = [ '01:00.0,msitranslate=0', '03:00.0' ]
> >>
> >>+RDM, 'reserved device memory', for PCI Device Passthrough
> >>+---------------------------------------------------------
> >>+
> >>+There are some devices the BIOS controls, for e.g. USB devices to perform
> >>+PS2 emulation. The regions of memory used for these devices are marked
> >>+reserved in the e820 map. When we turn on DMA translation, DMA to those
> >>+regions will fail. Hence BIOS uses RMRR to specify these regions along with
> >>+devices that need to access these regions. OS is expected to setup
> >>+identity mappings for these regions for these devices to access these
> >>regions.
> >>+
> >>+While creating a VM we should reserve them in advance, and avoid any
> >>conflicts.
> >>+So we introduce user configurable parameters to specify RDM resource and
> >>+according policies,
> >>+
> >>+To enable this globally, add "rdm" in the config file:
> >>+
> >>+ rdm = [ 'host, reserve=force/try' ]
> >>+
> >
> >The "force/try" should be called "policy". And then you explain what
> >policies we have.
>
> Do you mean we should rename this?
>
> rdm = [ 'host, policy=force/try' ]
>
No, I didn't ask you to rename that.
The above line is an example which should reflect the correct syntax.
"force/try" is not the *actual syntax*, i.e. you won't write that in
your config file.
I meant to changes it to "reserve=POLICY". Then you explain the possible
values of POLICY.
> This is really a policy but 'reserve' may can reflect our action explicitly,
> right?
>
> >
> >>+Or just for a specific device:
> >>+
> >>+ pci = [ '01:00.0,rdm_reserve=force/try', '03:00.0' ]
>
> And you also can see this.
>
> But anyway, if you're really stick to rename this, I'm going to be fine as
> well but we should ping every one to check this point since this name is
> from our previous discussion.
>
> >>+
> >>+Global RDM parameter allows user to specify reserved regions explicitly.
> >>+Using 'host' to include all reserved regions reported on this platform
> >>+which is good to handle hotplug scenario. In the future this parameter
> >>+may be further extended to allow specifying random regions, e.g. even
> >>+those belonging to another platform as a preparation for live migration
> >>+with passthrough devices.
> >>+
> >>+'force/try' policy decides how to handle conflict when reserving RDM
> >>+regions in pfn space. If conflict exists, 'force' means an immediate error
> >>+so VM will be killed, while 'try' allows moving forward with a warning
> >
> >Be killed by whom? I think it's hvmloader crashes voluntarily, right?
>
> s/VM will be kille/hvmloader crashes voluntarily
>
> >
> >>+message thrown out.
> >>+
> >>
> >> Caveat on Conventional PCI Device Passthrough
> >> ---------------------------------------------
> >>diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> >>index 98687bd..9ed40d4 100644
> >>--- a/tools/libxl/libxl_create.c
> >>+++ b/tools/libxl/libxl_create.c
> >>@@ -1407,6 +1407,11 @@ static void domcreate_attach_pci(libxl__egc *egc,
> >>libxl__multidev *multidev,
> >> }
> >>
> >> for (i = 0; i < d_config->num_pcidevs; i++) {
> >>+ /*
> >>+ * If the rdm global policy is 'force' we should override each
> >>device.
> >>+ */
> >>+ if (d_config->b_info.rdm.reserve == LIBXL_RDM_RESERVE_FLAG_FORCE)
> >>+ d_config->pcidevs[i].rdm_reserve =
> >>LIBXL_RDM_RESERVE_FLAG_FORCE;
> >> ret = libxl__device_pci_add(gc, domid, &d_config->pcidevs[i], 1);
> >> if (ret < 0) {
> >> LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
> >>diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >>index 47af340..5786455 100644
> >>--- a/tools/libxl/libxl_types.idl
> >>+++ b/tools/libxl/libxl_types.idl
> >>@@ -71,6 +71,17 @@ libxl_domain_type = Enumeration("domain_type", [
> >> (2, "PV"),
> >> ], init_val = "LIBXL_DOMAIN_TYPE_INVALID")
> >>
> >>+libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [
> >>+ (0, "none"),
> >>+ (1, "host"),
> >>+ ])
> >>+
> >>+libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [
> >>+ (-1, "invalid"),
> >>+ (0, "force"),
> >>+ (1, "try"),
> >>+ ])
> >
> >If you don't set init_val, the default value would be "force" (0), is this
>
> Yes.
>
> >want you want?
>
> We have a little bit of complexity here,
>
> "Default per-device RDM policy is 'force', while default global RDM policy
> is 'try'. When both policies are specified on a given region, 'force' is
> always preferred."
>
This is going to be done in actual code anyway.
This type is used both in global and per-device setting, so I envisage
this to have an invalid value to start with. Appropriate default values
should be done in libxl_TYPE_setdefault functions. And the logic to
detect conflict and preferences done in your construct function.
What do you think?
> >
> >>+
> >> libxl_channel_connection = Enumeration("channel_connection", [
> >> (0, "UNKNOWN"),
> >> (1, "PTY"),
> >>@@ -356,6 +367,11 @@ libxl_domain_sched_params =
> >>Struct("domain_sched_params",[
> >> ("budget", integer, {'init_val':
> >> 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
> >> ])
> >>
> >>+libxl_rdm_reserve = Struct("rdm_reserve", [
> >>+ ("type", libxl_rdm_reserve_type),
> >>+ ("reserve", libxl_rdm_reserve_flag),
> >>+ ])
> >>+
> >> libxl_domain_build_info = Struct("domain_build_info",[
> >> ("max_vcpus", integer),
> >> ("avail_vcpus", libxl_bitmap),
> >>@@ -401,6 +417,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
> >> ("kernel", string),
> >> ("cmdline", string),
> >> ("ramdisk", string),
> >>+ ("rdm", libxl_rdm_reserve),
> >> ("u", KeyedUnion(None, libxl_domain_type, "type",
> >> [("hvm", Struct(None, [("firmware", string),
> >> ("bios",
> >> libxl_bios_type),
> >>@@ -521,6 +538,7 @@ libxl_device_pci = Struct("device_pci", [
> >> ("power_mgmt", bool),
> >> ("permissive", bool),
> >> ("seize", bool),
> >>+ ("rdm_reserve", libxl_rdm_reserve_flag),
> >> ])
> >>
> >> libxl_device_vtpm = Struct("device_vtpm", [
> >>diff --git a/tools/libxl/libxlu_pci.c b/tools/libxl/libxlu_pci.c
> >>index 26fb143..45be0d9 100644
> >>--- a/tools/libxl/libxlu_pci.c
> >>+++ b/tools/libxl/libxlu_pci.c
> >>@@ -42,6 +42,8 @@ static int pcidev_struct_fill(libxl_device_pci *pcidev,
> >>unsigned int domain,
> >> #define STATE_OPTIONS_K 6
> >> #define STATE_OPTIONS_V 7
> >> #define STATE_TERMINAL 8
> >>+#define STATE_TYPE 9
> >>+#define STATE_CHECK_FLAG 10
> >> int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pcidev, const
> >> char *str)
> >> {
> >> unsigned state = STATE_DOMAIN;
> >>@@ -143,6 +145,17 @@ int xlu_pci_parse_bdf(XLU_Config *cfg,
> >>libxl_device_pci *pcidev, const char *str
> >> pcidev->permissive = atoi(tok);
> >> }else if ( !strcmp(optkey, "seize") ) {
> >> pcidev->seize = atoi(tok);
> >>+ }else if ( !strcmp(optkey, "rdm_reserve") ) {
> >>+ if ( !strcmp(tok, "force") ) {
> >>+ pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_FORCE;
> >>+ } else if ( !strcmp(tok, "try") ) {
> >>+ pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_TRY;
> >>+ } else {
> >>+ pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_FORCE;
> >>+ XLU__PCI_ERR(cfg, "Unknown PCI RDM property flag
> >>value:"
> >>+ " %s, so goes 'force' by
> >>default.",
> >
> >If this is not an error, you don't need XLU__PCI_ERR.
> >
> >But I would say we should just treat this as an error and
> >abort/exit/report (whatever the parser should do in this case).
>
> In our case we just want to post a message to set a appropriate flag to
> recover this behavior like we write here,
>
> XLU__PCI_ERR(cfg, "Unknown PCI RDM property flag
> value:"
> " %s, so goes 'strict' by
> default.",
> tok);
I suggest we just abort in this case and not second guess what the admin
wants.
>
> This may just be a warning? But I don't we have this sort of definition,
> XLU__PCI_WARN, ...
>
> So what LOG format can be adopted here?
Feel free to introduce XLU__PCI_WARN if it turns out to be necessary.
>
> >
> >>+ tok);
> >>+ }
> >> }else{
> >> XLU__PCI_ERR(cfg, "Unknown PCI BDF option: %s",
> >> optkey);
> >> }
> >>@@ -167,6 +180,71 @@ parse_error:
> >> return ERROR_INVAL;
> >> }
> >>
> >>+int xlu_rdm_parse(XLU_Config *cfg, libxl_rdm_reserve *rdm, const char *str)
> >>+{
> >>+ unsigned state = STATE_TYPE;
> >>+ char *buf2, *tok, *ptr, *end;
> >>+
> >>+ if ( NULL == (buf2 = ptr = strdup(str)) )
> >>+ return ERROR_NOMEM;
> >>+
> >>+ for(tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++) {
> >>+ switch(state) {
Coding style. I haven't checked what actual style this file uses, but
there is inconsistency in this function by itself.
> >>+ case STATE_TYPE:
> >>+ if ( *ptr == '\0' || *ptr == ',' ) {
> >>+ state = STATE_CHECK_FLAG;
> >>+ *ptr = '\0';
> >>+ if ( !strcmp(tok, "host") ) {
> >>+ rdm->type = LIBXL_RDM_RESERVE_TYPE_HOST;
> >>+ } else {
> >>+ XLU__PCI_ERR(cfg, "Unknown RDM state option: %s", tok);
> >>+ goto parse_error;
> >>+ }
> >>+ tok = ptr + 1;
> >>+ }
> >>+ break;
> >>+ case STATE_CHECK_FLAG:
> >>+ if ( *ptr == '=' ) {
> >>+ state = STATE_OPTIONS_V;
> >>+ *ptr = '\0';
> >>+ if ( strcmp(tok, "reserve") ) {
> >>+ XLU__PCI_ERR(cfg, "Unknown RDM property value: %s",
> >>tok);
> >>+ goto parse_error;
> >>+ }
> >>+ tok = ptr + 1;
> >>+ }
> >>+ break;
> >>+ case STATE_OPTIONS_V:
> >>+ if ( *ptr == ',' || *ptr == '\0' ) {
> >>+ state = STATE_TERMINAL;
> >>+ *ptr = '\0';
> >>+ if ( !strcmp(tok, "force") ) {
> >>+ rdm->reserve = LIBXL_RDM_RESERVE_FLAG_FORCE;
> >>+ } else if ( !strcmp(tok, "try") ) {
> >>+ rdm->reserve = LIBXL_RDM_RESERVE_FLAG_TRY;
> >>+ } else {
> >>+ XLU__PCI_ERR(cfg, "Unknown RDM property flag value:
> >>%s",
> >>+ tok);
> >>+ goto parse_error;
> >>+ }
> >>+ tok = ptr + 1;
> >>+ }
> >>+ default:
> >>+ break;
> >>+ }
> >>+ }
> >>+
> >>+ free(buf2);
> >>+
> >>+ if ( tok != ptr || state != STATE_TERMINAL )
> >>+ goto parse_error;
> >>+
> >>+ return 0;
> >>+
> >>+parse_error:
> >>+ return ERROR_INVAL;
> >>+}
> >>+
> >> /*
> >> * Local variables:
> >> * mode: C
> >>diff --git a/tools/libxl/libxlutil.h b/tools/libxl/libxlutil.h
> >>index 0333e55..80497f8 100644
> >>--- a/tools/libxl/libxlutil.h
> >>+++ b/tools/libxl/libxlutil.h
> >>@@ -93,6 +93,10 @@ int xlu_disk_parse(XLU_Config *cfg, int nspecs, const
> >>char *const *specs,
> >> */
> >> int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pcidev, const
> >> char *str);
> >>
> >>+/*
> >>+ * RDM parsing
> >>+ */
> >>+int xlu_rdm_parse(XLU_Config *cfg, libxl_rdm_reserve *rdm, const char
> >>*str);
> >>
> >> /*
> >> * Vif rate parsing.
> >>diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> >>index 04faf98..9a58464 100644
> >>--- a/tools/libxl/xl_cmdimpl.c
> >>+++ b/tools/libxl/xl_cmdimpl.c
> >>@@ -988,7 +988,7 @@ static void parse_config_data(const char *config_source,
> >> const char *buf;
> >> long l;
> >> XLU_Config *config;
> >>- XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms;
> >>+ XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms,
> >>*rdms;
> >> XLU_ConfigList *channels, *ioports, *irqs, *iomem, *viridian;
> >> int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian;
> >> int pci_power_mgmt = 0;
> >>@@ -1727,6 +1727,23 @@ skip_vfb:
> >> xlu_cfg_get_defbool(config, "e820_host", &b_info->u.pv.e820_host,
> >> 0);
> >> }
> >>
> >>+ /*
> >>+ * By default our global policy is to query all rdm entries, and
> >>+ * force reserve them.
> >>+ */
> >>+ b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_HOST;
> >>+ b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_TRY;
> >
> >This should probably to into the _setdefault function of
> >libxl_domain_build_info.
>
> Sorry, I just see this
>
> libxl_domain_build_info_init()
> |
> + libxl_rdm_reserve_init(&p->rdm);
> |
> + memset(p, '\0', sizeof(*p));
>
> But this should be generated automatically, right? So how to implement your
> idea? Could you give me a show?
>
Check libxl_domain_build_info_setdefault.
To use libxl types. You normally do:
libxl_TYPE_init
libxl_TYPE_setdefault
DO STUFF
libxl_TYPE_dispose
_init and _dispose are auto-generated. _setdefault is not.
> >
> >>+ if (!xlu_cfg_get_list (config, "rdm", &rdms, 0, 0)) {
> >>+ if ((buf = xlu_cfg_get_listitem (rdms, 0)) != NULL ) {
> >>+ libxl_rdm_reserve rdm;
> >>+ if (!xlu_rdm_parse(config, &rdm, buf))
> >>+ {
> >>+ b_info->rdm.type = rdm.type;
> >>+ b_info->rdm.reserve = rdm.reserve;
> >>+ }
> >
> >You only have one rdm in b_info, so there is no need to use a list for
> >it in config file.
> >
>
> Is this fine?
>
> if (!xlu_cfg_get_string(config, "rdm", &buf, 0)) {
>
> libxl_rdm_reserve rdm;
>
> if (!xlu_rdm_parse(config, &rdm, buf)) {
> b_info->rdm.type = rdm.type;
>
> b_info->rdm.reserve = rdm.reserve;
>
> }
> }
>
I think it is fine. But you'd better wait a little bit for other people
to voice their opinions.
Wei.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |