[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 05/23] xen/arm: Add capabilities to dom0less
On 10/03/2025 09:01, Bertrand Marquis wrote: > > > Hi, > >> On 8 Mar 2025, at 13:37, Julien Grall <julien@xxxxxxx> wrote: >> >> Hi Jason, >> >> On 08/03/2025 00:02, Jason Andryuk wrote: >>> On 2025-03-07 16:21, Julien Grall wrote: >>>> Hi Jason, >>>> >>>> On 07/03/2025 17:58, Jason Andryuk wrote: >>>>> On 2025-03-07 04:01, Julien Grall wrote: >>>>>> Hi, >>>>>> >>>>>> On 06/03/2025 22:03, Jason Andryuk wrote: >>>>>>> Add capabilities property to dom0less to allow building a >>>>>>> disaggregated system. >>>>>>> >>>>>>> Introduce bootfdt.h to contain these constants. >>>>>>> >>>>>>> When using the hardware or xenstore capabilities, adjust the grant and >>>>>>> event channel limits similar to dom0. >>>>>> > > Also for the hardware domain, set directmap and iommu. This brings >>>>>> its >>>>>>> configuration in line with a dom0. >>>>>> >>>>>> Looking the device tree bindings, a user would be allowed to disable >>>>>> "passthrough" or even "directmap". This means, we would never be able to >>>>>> disable "directmap" for the hardware domain in the future with the >>>>>> existing property (this is to avoid break backwards compatibility). >>>>>> >>>>>> Instead, I think we should check what the user provided and confirm this >>>>>> is matching our expectation for an hardware domain. >>>>> > >>>>>> That said, I am not entirely sure why we should force directmap for the >>>>>> HW domain. We are starting from a clean slate, so I think it would be >>>>>> better to have by default no directmap and imposing the presence of an >>>>>> IOMMU in the system. >>>>> >>>>> Ok, it seems like directmap is not necessary. It was helpful early on to >>>>> get things booting, but I think it's no longer necessary after factoring >>>>> out construct_hwdom(). >>>>> >>>>> What exactly do you mean by imposing with respect to the iommu? Require >>>>> one, or mirror the dom0 code and set it for the hwdom? >>>>> >>>>> if ( iommu_enabled ) >>>>> dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu; >>>> >>>> I mean requires one. Without it, you would need to set directmap and I >>>> don't think this should be allowed (at least for now) for the HW domain. >>>> >>>>> >>>>>> Lastly, can you provide an example of what the hardware domain DT node >>>>>> would looke like? >>>>> >>>>> I've attached a dump of /sys/firmware/fdt from hwdom. (This is direct >>>>> mapped). >>>> >>>> Sorry if this was not clear, I am asking for the configuration you wrote >>>> in the host DT for create the hardware domain. I am interested to know >>>> which properties you set... >>> I've attached the u-boot fdt commands which generate the DT. Hopefully >>> that works for you. >>>>> >>>>>>> --- a/xen/arch/arm/dom0less-build.c >>>>>>> +++ b/xen/arch/arm/dom0less-build.c >>>>>>> @@ -12,6 +12,7 @@ >>>>>>> #include <xen/sizes.h> >>>>>>> #include <xen/vmap.h> >>>>>>> +#include <public/bootfdt.h> >>>>>>> #include <public/io/xs_wire.h> >>>>>>> #include <asm/arm64/sve.h> >>>>>>> @@ -994,6 +995,34 @@ void __init create_domUs(void) >>>>>>> if ( (max_init_domid + 1) >= DOMID_FIRST_RESERVED ) >>>>>>> panic("No more domain IDs available\n"); >>>>>>> + if ( dt_property_read_u32(node, "capabilities", &val) ) >>>>>>> + { >>>>>>> + if ( val & ~DOMAIN_CAPS_MASK ) >>>>>>> + panic("Invalid capabilities (%"PRIx32")\n", val); >>>>>>> + >>>>>>> + if ( val & DOMAIN_CAPS_CONTROL ) >>>>>>> + flags |= CDF_privileged; >>>>>>> + >>>>>>> + if ( val & DOMAIN_CAPS_HARDWARE ) >>>>>>> + { >>>>>>> + if ( hardware_domain ) >>>>>>> + panic("Only 1 hardware domain can be specified! >>>>>>> (%pd)\n", >>>>>>> + hardware_domain); >>>>>>> + >>>>>>> + d_cfg.max_grant_frames = gnttab_dom0_frames(); >>>>>>> + d_cfg.max_evtchn_port = -1; >>>>>> >>>>>> What about d_cfg.arch.nr_spis? Are we expecting the user to pass >>>>>> "nr_spis"? >>>>> >>>>> Further down, when nr_spis isn't specified in the DT, it defaults to: >>>>> d_cfg.arch.nr_spis = gic_number_lines() - 32; >>>> >>>> Thanks. One further question, what if the user pass "nr_spis" for the HW >>>> domain. Wouldn't this result to more issue further down the line? >>> I'm not familiar with ARM, so I'll to whatever is best. I did put the >>> capabilities first, thinking it would set defaults, and then later options >>> could override them. >> >> I am not sure it is related to Arm. It is more that the HW domain is going >> to re-use the memory layout of the host (this is including the mapping for >> the GIC) and also have all the irqs with pirq == virq. >> >> I am a bit concerned that letting the users mistakenly tweaking the defaults >> could prevent attaching devices (for instance if the IRQ ID is above > >> nr_spis). >> >>>>> >>>>> Dom0 does: >>>>> /* >>>>> * Xen vGIC supports a maximum of 992 interrupt lines. >>>>> * 32 are substracted to cover local IRQs. >>>>> */ >>>>> dom0_cfg.arch.nr_spis = min(gic_number_lines(), (unsigned int) 992) >>>>> - 32; >>>>> if ( gic_number_lines() > 992 ) >>>>> printk(XENLOG_WARNING "Maximum number of vGIC IRQs exceeded. >>>>> \n"); >>>>> >>>>> So I think it's fine as-is? I could add the min() and warning if you >>>>> think it's necessary. >>>> >>>> Regardless this discussion, I wonder why we didn't add the min(...) here >>>> like for dom0 because we are using the same vGIC emulation. So the max >>>> SPIs supported is the same... >>>> >>>> What I am trying to understand is whether it is ok to allow the user to >>>> select "nr_spis", "vpl011" & co if they are either not honored (like for >>>> vpl011) or could introduce further issue (like for nr_spis as the HW >>>> domain is supposed to have the same configuration as dom0). >>>> >>>> I also don't think it is a good idea to silently ignore what the user >>>> requested. So far, on Arm, we mainly decided to reject/panic early if the >>>> setup is not correct. >>> Again, I'll do whatever is best. >> >> Bertrand, Michal, Stefano, any opinions? > > I definitely think that any user configuration mistake should end up in a > panic, a warning message is definitely not enough. > Here the user might discover or not at runtime that what he thought was > configured is not. > So a panic here would be better from my point of view. I think this code handling is a bit messy today. GIC supports max 1020 lines. That's 1020-32=988 SPIs. In our vGICs we round up SPIs to 32, so the true vGIC max SPI is 960. Issues: 1) We warn about current lines exceeding max vGIC lines only for hwdom. This should be made common: diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index d4570bc0b4e4..8d97bbcc86c7 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -2368,8 +2368,6 @@ void __init create_dom0(void) * 32 are substracted to cover local IRQs. */ dom0_cfg.arch.nr_spis = min(gic_number_lines(), (unsigned int) 992) - 32; - if ( gic_number_lines() > 992 ) - printk(XENLOG_WARNING "Maximum number of vGIC IRQs exceeded.\n"); dom0_cfg.arch.tee_type = tee_get_type(); dom0_cfg.max_vcpus = dom0_max_vcpus(); diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index c563ba93af22..494210a7337d 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -116,6 +116,9 @@ int domain_vgic_init(struct domain *d, unsigned int nr_spis) int i; int ret; + if ( gic_number_lines() > 992 ) + printk(XENLOG_WARNING "Maximum number of vGIC IRQs exceeded\n"); + d->arch.vgic.ctlr = 0; /* diff --git a/xen/arch/arm/vgic/vgic-init.c b/xen/arch/arm/vgic/vgic-init.c index f8d7d3a226d0..dd1734a61f57 100644 --- a/xen/arch/arm/vgic/vgic-init.c +++ b/xen/arch/arm/vgic/vgic-init.c @@ -132,6 +132,9 @@ int domain_vgic_init(struct domain *d, unsigned int nr_spis) unsigned int i; int ret; + if ( gic_number_lines() > 992 ) + printk(XENLOG_WARNING "Maximum number of vGIC IRQs exceeded\n"); + /* The number of SPIs must be a multiple of 32 per the GIC spec. */ nr_spis = ROUNDUP(nr_spis, 32); 2) If user does not specify nr_spis for dom0less domUs, we should take the max vGIC lines into account as we do for hwdom: diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c index 31f31c38da3f..019ae2173400 100644 --- a/xen/arch/arm/dom0less-build.c +++ b/xen/arch/arm/dom0less-build.c @@ -1018,7 +1018,8 @@ void __init create_domUs(void) { int vpl011_virq = GUEST_VPL011_SPI; - d_cfg.arch.nr_spis = gic_number_lines() - 32; + d_cfg.arch.nr_spis = min(gic_number_lines(), + (unsigned int)992) - 32; /* * The VPL011 virq is GUEST_VPL011_SPI, unless direct-map is 3) Doc for xl cfg nr_spis is utterly wrong. First it mentions 991 as the number of max SPIs where it should be 960. 4) Handling in libxl silently ignores user provided value if the calculated number of required SPIs (e.g. vpl011, virtio, etc.) is bigger than user provided value. We should bail out. I think we should first fix the above (I can prepare patches) and then continue adding logic if user specifies nr_spis for hwdom (not possible today). ~Michal
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |