|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [RFC][PATCH 01/13] tools: introduce some new parameters to set rdm policy
On Fri, May 15, 2015 at 09:52:23AM +0800, Chen, Tiejun wrote:
[...]
>
>
> ...
>
> B<RDM_RESERVE_STRING> has the form C<[KEY=VALUE,KEY=VALUE,...> where:
>
> =over 4
>
> =item B<KEY=VALUE>
>
> Possible B<KEY>s are:
>
> =over 4
>
> =item B<type="STRING">
>
> Currently we just have one type. "host" means all reserved device memory on
> this platform should be reserved in this VM's pfn space.
>
> =over 4
>
> =item B<reserve="STRING">
> ...
>
Yes, something like this.
>
> >
> >>>
[...]
> >>>>index 9af0e99..d7434d6 100644
> >>>>--- a/docs/misc/vtd.txt
> >>>>+++ b/docs/misc/vtd.txt
> >>>>@@ -111,6 +111,40 @@ in the config file:
> >>>> To override for a specific device:
> >>>> pci = [ '01:00.0,msitranslate=0', '03:00.0' ]
> >>>>
> >>>>+RDM, 'reserved device memory', for PCI Device Passthrough
> >>>>+---------------------------------------------------------
> >>>>+
> >>>>+There are some devices the BIOS controls, for e.g. USB devices to perform
> >>>>+PS2 emulation. The regions of memory used for these devices are marked
> >>>>+reserved in the e820 map. When we turn on DMA translation, DMA to those
> >>>>+regions will fail. Hence BIOS uses RMRR to specify these regions along
> >>>>with
> >>>>+devices that need to access these regions. OS is expected to setup
> >>>>+identity mappings for these regions for these devices to access these
> >>>>regions.
> >>>>+
> >>>>+While creating a VM we should reserve them in advance, and avoid any
> >>>>conflicts.
> >>>>+So we introduce user configurable parameters to specify RDM resource and
> >>>>+according policies,
> >>>>+
> >>>>+To enable this globally, add "rdm" in the config file:
> >>>>+
> >>>>+ rdm = [ 'host, reserve=force/try' ]
> >>>>+
> >>>
> >>>The "force/try" should be called "policy". And then you explain what
> >>>policies we have.
> >>
> >>Do you mean we should rename this?
> >>
> >>rdm = [ 'host, policy=force/try' ]
> >>
> >
> >No, I didn't ask you to rename that.
> >
> >The above line is an example which should reflect the correct syntax.
> >"force/try" is not the *actual syntax*, i.e. you won't write that in
> >your config file.
> >
> >I meant to changes it to "reserve=POLICY". Then you explain the possible
> >values of POLICY.
> >
>
> Understood so what about this,
>
> To enable this globally, add "rdm" in the config file:
>
> rdm = [ 'host, reserve=<POLICY>' ]
>
OK, so this is a specific example in vtd.txt. Last time I misread it as
part of the manpage.
I think you meant in this specific example (with other suggestions
incorporated):
rdm = "type=host, reserve=force"
Then you point user to xl.cfg manpage.
> Or just for a specific device:
>
> pci = [ '01:00.0,rdm_reserve=<POLICY>', '03:00.0' ]
>
Same here.
Just don't write "force/try" or "strcit/relax" because that's not the
exact syntax you would use in a real config file.
> Global RDM parameter allows user to specify reserved regions explicitly.
> Using "host" to include all reserved regions reported on this platform
> which is good to handle hotplug scenario. In the future this parameter
> may be further extended to allow specifying random regions, e.g. even
> those belonging to another platform as a preparation for live migration
> with passthrough devices.
>
> Currently "POLICY" includes two options, "strict" and "relaxed". It decides
> how to handle conflict when reserving RDM regions in pfn space. If conflict
> ...
>
> >>This is really a policy but 'reserve' may can reflect our action explicitly,
> >>right?
> >>
> >>>
> >>>>+Or just for a specific device:
> >>>>+
> >>>>+ pci = [ '01:00.0,rdm_reserve=force/try', '03:00.0' ]
> >>
> >>And you also can see this.
> >>
> >>But anyway, if you're really stick to rename this, I'm going to be fine as
> >>well but we should ping every one to check this point since this name is
> >>from our previous discussion.
> >>
> >>>>+
> >>>>+Global RDM parameter allows user to specify reserved regions explicitly.
> >>>>+Using 'host' to include all reserved regions reported on this platform
> >>>>+which is good to handle hotplug scenario. In the future this parameter
> >>>>+may be further extended to allow specifying random regions, e.g. even
> >>>>+those belonging to another platform as a preparation for live migration
> >>>>+with passthrough devices.
> >>>>+
> >>>>+'force/try' policy decides how to handle conflict when reserving RDM
> >>>>+regions in pfn space. If conflict exists, 'force' means an immediate
> >>>>error
> >>>>+so VM will be killed, while 'try' allows moving forward with a warning
> >>>
> >>>Be killed by whom? I think it's hvmloader crashes voluntarily, right?
> >>
> >>s/VM will be kille/hvmloader crashes voluntarily
> >>
> >>>
> >>>>+message thrown out.
> >>>>+
> >>>>
> >>>> Caveat on Conventional PCI Device Passthrough
> >>>> ---------------------------------------------
> >>>>diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> >>>>index 98687bd..9ed40d4 100644
> >>>>--- a/tools/libxl/libxl_create.c
> >>>>+++ b/tools/libxl/libxl_create.c
> >>>>@@ -1407,6 +1407,11 @@ static void domcreate_attach_pci(libxl__egc *egc,
> >>>>libxl__multidev *multidev,
> >>>> }
> >>>>
> >>>> for (i = 0; i < d_config->num_pcidevs; i++) {
> >>>>+ /*
> >>>>+ * If the rdm global policy is 'force' we should override each
> >>>>device.
> >>>>+ */
> >>>>+ if (d_config->b_info.rdm.reserve == LIBXL_RDM_RESERVE_FLAG_FORCE)
> >>>>+ d_config->pcidevs[i].rdm_reserve =
> >>>>LIBXL_RDM_RESERVE_FLAG_FORCE;
> >>>> ret = libxl__device_pci_add(gc, domid, &d_config->pcidevs[i],
> >>>> 1);
> >>>> if (ret < 0) {
> >>>> LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
> >>>>diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >>>>index 47af340..5786455 100644
> >>>>--- a/tools/libxl/libxl_types.idl
> >>>>+++ b/tools/libxl/libxl_types.idl
> >>>>@@ -71,6 +71,17 @@ libxl_domain_type = Enumeration("domain_type", [
> >>>> (2, "PV"),
> >>>> ], init_val = "LIBXL_DOMAIN_TYPE_INVALID")
> >>>>
> >>>>+libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [
> >>>>+ (0, "none"),
> >>>>+ (1, "host"),
> >>>>+ ])
> >>>>+
> >>>>+libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [
> >>>>+ (-1, "invalid"),
> >>>>+ (0, "force"),
> >>>>+ (1, "try"),
> >>>>+ ])
> >>>
> >>>If you don't set init_val, the default value would be "force" (0), is this
> >>
> >>Yes.
> >>
> >>>want you want?
> >>
> >>We have a little bit of complexity here,
> >>
> >>"Default per-device RDM policy is 'force', while default global RDM policy
> >>is 'try'. When both policies are specified on a given region, 'force' is
> >>always preferred."
> >>
> >
> >This is going to be done in actual code anyway.
> >
> >This type is used both in global and per-device setting, so I envisage
>
> Yes.
>
> >this to have an invalid value to start with. Appropriate default values
>
> Sounds I should set this,
>
> +libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [
> + (-1, "invalid"),
> + (0, "strict"),
> + (1, "relaxed"),
> + ], init_val = "LIBXL_RDM_RESERVE_FLAG_INVALID")
> +
>
Yes, and then don't forget to set the value to appropriate value in the
_setdefault functions for different types.
>
> >should be done in libxl_TYPE_setdefault functions. And the logic to
> >detect conflict and preferences done in your construct function.
> >
> >What do you think?
> >
> >>>
> >>>>+
[...]
> >>>> pcidev->permissive = atoi(tok);
> >>>> }else if ( !strcmp(optkey, "seize") ) {
> >>>> pcidev->seize = atoi(tok);
> >>>>+ }else if ( !strcmp(optkey, "rdm_reserve") ) {
> >>>>+ if ( !strcmp(tok, "force") ) {
> >>>>+ pcidev->rdm_reserve =
> >>>>LIBXL_RDM_RESERVE_FLAG_FORCE;
> >>>>+ } else if ( !strcmp(tok, "try") ) {
> >>>>+ pcidev->rdm_reserve = LIBXL_RDM_RESERVE_FLAG_TRY;
> >>>>+ } else {
> >>>>+ pcidev->rdm_reserve =
> >>>>LIBXL_RDM_RESERVE_FLAG_FORCE;
> >>>>+ XLU__PCI_ERR(cfg, "Unknown PCI RDM property flag
> >>>>value:"
> >>>>+ " %s, so goes 'force' by
> >>>>default.",
> >>>
> >>>If this is not an error, you don't need XLU__PCI_ERR.
> >>>
> >>>But I would say we should just treat this as an error and
> >>>abort/exit/report (whatever the parser should do in this case).
> >>
> >>In our case we just want to post a message to set a appropriate flag to
> >>recover this behavior like we write here,
> >>
> >> XLU__PCI_ERR(cfg, "Unknown PCI RDM property flag
> >>value:"
> >> " %s, so goes 'strict' by
> >>default.",
> >> tok);
> >
> >I suggest we just abort in this case and not second guess what the admin
> >wants.
>
> Okay,
> } else {
> XLU__PCI_ERR(cfg, "%s is not an valid PCI RDM
> property"
> " flag: 'strict' or 'relaxed'.",
> tok);
> abort();
>
No, not calling the "abort" function. I meant returning appropriate error
value and let the caller handles this situation.
>
> >
> >>
> >>This may just be a warning? But I don't we have this sort of definition,
> >>XLU__PCI_WARN, ...
> >>
> >>So what LOG format can be adopted here?
> >
> >Feel free to introduce XLU__PCI_WARN if it turns out to be necessary.
>
> If it goes to abort(), I think XLU__PCI_ERR() should be good.
>
> >
> >>
> >>>
> >>>>+ tok);
> >>>>+ }
> >>>> }else{
> >>>> XLU__PCI_ERR(cfg, "Unknown PCI BDF option: %s",
> >>>> optkey);
> >>>> }
> >>>>@@ -167,6 +180,71 @@ parse_error:
> >>>> return ERROR_INVAL;
> >>>> }
> >>>>
> >>>>+int xlu_rdm_parse(XLU_Config *cfg, libxl_rdm_reserve *rdm, const char
> >>>>*str)
> >>>>+{
> >>>>+ unsigned state = STATE_TYPE;
> >>>>+ char *buf2, *tok, *ptr, *end;
> >>>>+
> >>>>+ if ( NULL == (buf2 = ptr = strdup(str)) )
> >>>>+ return ERROR_NOMEM;
> >>>>+
> >>>>+ for(tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++) {
> >>>>+ switch(state) {
> >
> >Coding style. I haven't checked what actual style this file uses, but
> >there is inconsistency in this function by itself.
>
> I just refer to xlu_pci_parse_bdf() to generate xlu_rdm_parse(), and they
> are in the same file...
>
> Anyway, I should change this line,
>
> for ( tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++ ) {
>
for (tok = ptr, end...)
switch (state) {
> >
> >>>>+ case STATE_TYPE:
> >>>>+ if ( *ptr == '\0' || *ptr == ',' ) {
> >>>>+ state = STATE_CHECK_FLAG;
> >>>>+ *ptr = '\0';
[...]
> >>>>
> >>>>+ /*
> >>>>+ * By default our global policy is to query all rdm entries, and
> >>>>+ * force reserve them.
> >>>>+ */
> >>>>+ b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_HOST;
> >>>>+ b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_TRY;
> >>>
> >>>This should probably to into the _setdefault function of
> >>>libxl_domain_build_info.
> >>
> >>Sorry, I just see this
> >>
> >>libxl_domain_build_info_init()
> >> |
> >> + libxl_rdm_reserve_init(&p->rdm);
> >> |
> >> + memset(p, '\0', sizeof(*p));
> >>
> >>But this should be generated automatically, right? So how to implement your
> >>idea? Could you give me a show?
> >>
> >
> >Check libxl_domain_build_info_setdefault.
> >
> >To use libxl types. You normally do:
> >
> > libxl_TYPE_init
> > libxl_TYPE_setdefault
> >
> > DO STUFF
> >
> > libxl_TYPE_dispose
> >
> >_init and _dispose are auto-generated. _setdefault is not.
>
> So in our case, maybe we can do this,
>
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index f0da7dc..461606c 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -100,6 +100,17 @@ static int sched_params_valid(libxl__gc *gc,
> return 1;
> }
>
> +void libxl__device_rdm_setdefault(libxl__gc *gc,
> + libxl_domain_build_info *b_info)
It's not a device. Use libxl__rdm_setdefault.
> +{
> + /*
> + * By default our global policy is to query all rdm entries, and
> + * force reserve them.
> + */
> + b_info->rdm.type = LIBXL_RDM_RESERVE_TYPE_HOST;
> + b_info->rdm.reserve = LIBXL_RDM_RESERVE_FLAG_STRICT;
> +}
> +
Isn't the global policy "relaxed" (or "try")? At least that's what your
old code does. BTW your original code contradicts your original comment.
> int libxl__domain_build_info_setdefault(libxl__gc *gc,
> libxl_domain_build_info *b_info)
> {
> @@ -410,6 +421,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
> libxl_domain_type_to_string(b_info->type));
> return ERROR_INVAL;
> }
> +
> + libxl__device_rdm_setdefault(gc, b_info);
> return 0;
> }
>
And you also need to modify libxl__device_pci_setdefault.
I actually have another question on the interface design. To recap, in
your patch:
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 47af340..5786455 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -71,6 +71,17 @@ libxl_domain_type = Enumeration("domain_type", [
(2, "PV"),
], init_val = "LIBXL_DOMAIN_TYPE_INVALID")
+libxl_rdm_reserve_type = Enumeration("rdm_reserve_type", [
+ (0, "none"),
+ (1, "host"),
+ ])
+
+libxl_rdm_reserve_flag = Enumeration("rdm_reserve_flag", [
+ (-1, "invalid"),
+ (0, "force"),
+ (1, "try"),
+ ])
+
libxl_channel_connection = Enumeration("channel_connection", [
(0, "UNKNOWN"),
(1, "PTY"),
@@ -356,6 +367,11 @@ libxl_domain_sched_params = Struct("domain_sched_params",[
("budget", integer, {'init_val':
'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
])
+libxl_rdm_reserve = Struct("rdm_reserve", [
+ ("type", libxl_rdm_reserve_type),
+ ("reserve", libxl_rdm_reserve_flag),
+ ])
+
libxl_domain_build_info = Struct("domain_build_info",[
("max_vcpus", integer),
("avail_vcpus", libxl_bitmap),
@@ -401,6 +417,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
("kernel", string),
("cmdline", string),
("ramdisk", string),
+ ("rdm", libxl_rdm_reserve),
("u", KeyedUnion(None, libxl_domain_type, "type",
[("hvm", Struct(None, [("firmware", string),
("bios", libxl_bios_type),
@@ -521,6 +538,7 @@ libxl_device_pci = Struct("device_pci", [
("power_mgmt", bool),
("permissive", bool),
("seize", bool),
+ ("rdm_reserve", libxl_rdm_reserve_flag),
])
Do you actually need libxl_rdm_reserve type? I.e. do you envisage that
structure to change a lot? Can you not just use libxl_rdm_reserve_type
and libxl_rdm_reserve_flag in build_info.
Wei.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |