[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 08/15] x86/hyperlaunch: locate dom0 kernel with hyperlaunch


  • To: "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 12 Dec 2024 12:25:50 +0100
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: jason.andryuk@xxxxxxx, christopher.w.clark@xxxxxxxxx, stefano.stabellini@xxxxxxx, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 12 Dec 2024 11:26:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 11.12.2024 16:41, Daniel P. Smith wrote:
> On 12/2/24 06:53, Jan Beulich wrote:
>> On 23.11.2024 19:20, Daniel P. Smith wrote:
>>> Look for a subnode of type `multiboot,kernel` within a domain node. If 
>>> found,
>>> process the reg property for the MB1 module index. If the bootargs property 
>>> is
>>> present and there was not an MB1 string, then use the command line from the
>>> device tree definition.
>>
>> Why specifically MB1?
> 
> Because Xen converts MB2 into an MB1 chain very early in the entry 
> points that take MB2. By the time HL code is executed, it will only ever 
> see a list of MB1 modules.

Yet that's all Xen's internal representation, which merely is kind of
originating from MB1. That origin is, afaict, irrelevant here, and is
instead, imo, confusing.

>>> --- a/xen/arch/x86/domain_builder/fdt.c
>>> +++ b/xen/arch/x86/domain_builder/fdt.c
>>> @@ -14,6 +14,122 @@
>>>   
>>>   #include "fdt.h"
>>>   
>>> +static inline int __init fdt_get_prop_as_reg(
>>
>> What does "reg" stand for here?
> 
> Device Tree defines a two field "prop-encoded=array" property type 
> called reg.

Okay, this explains where you took it from, yet I remain curious what it
actually stands for. Just from the letters I would derive "register", yet
that seems unlikely here.

>>> +    const void *fdt, int node, const char *name, unsigned int ssize,
>>> +    unsigned int asize, uint64_t *size, uint64_t *addr)
>>> +{
>>> +    int ret;
>>> +    const struct fdt_property *prop;
>>> +    fdt32_t *cell;
>>> +
>>> +    /* FDT spec max size is 4 (128bit int), but largest arch int size is 
>>> 64 */
>>> +    if ( ssize > 2 || asize > 2 )
>>> +        return -EINVAL;
>>> +
>>> +    prop = fdt_get_property(fdt, node, name, &ret);
>>> +    if ( !prop || ret < sizeof(u32) )
>>> +        return ret < 0 ? ret : -EINVAL;
>>> +
>>> +    /* read address field */
>>> +    cell = (fdt32_t *)prop->data;
>>> +
>>> +    if ( asize == 1 )
>>> +    {
>>> +        uint32_t val;
>>> +        fdt_cell_as_u32(cell, &val);
>>> +        *addr = (uint64_t)val;
>>
>> No need for a cast here nor ...
>>
>>> +    }
>>> +    else
>>> +        fdt_cell_as_u64(cell, addr);
>>> +
>>> +    /* read size field */
>>> +    cell += asize;
>>> +
>>> +    if ( ssize == 1 )
>>> +    {
>>> +        uint32_t val;
>>> +        fdt_cell_as_u32(cell, &val);
>>> +        *size = (uint64_t)val;
>>
>> ... here?
> 
> No the compiler does not need the cast, placed to remind readers what 
> was being done. Can/will drop.
> 
>>> +    }
>>> +    else
>>> +        fdt_cell_as_u64(cell, size);
>>> +
>>> +    return 0;
>>> +}
>>
>> This whole function reads very much like a library one. Does it really need
>> adding here, rather than to the FDT library code we already have? In any
>> event there's nothing x86-specific about it, afaics.
> 
> This is where it gets complicated. Most of the higher order functions 
> exposed by xen/device_tree.h are written to work with FDT indexing 
> structures, referred to as the unflattened tree. Deconflicting the mixed 
> use of FDT and FDT index in device_tree.h is beyond the scope of this 
> series.

Going that far also wasn't my request. Yet it's still library-like code,
which seems odd to introduce as x86-specific when later it may very well
want using by other architectures as well. If the FDT library code isn't
a good place, then put it somewhere under xen/lib/?

>>> +                                  &size, &addr);
>>> +    /* An FDT error value may have been returned, translate to -EINVAL */
>>> +    if ( ret < 0 )
>>> +        return -EINVAL;
>>> +
>>> +    if ( size != 0 )
>>> +        return -EOPNOTSUPP;
>>
>> Not knowing much about DT: What does 0 represent here?
> 
> The libfdt code treats 0 as a valid value, whether zero is a valid value 
> is driven by the contextual usage of the property.

And 0 is the _only_ valid value here? Another comment may be on order, to
at least briefly indicate why this is.

>>> +    if ( addr > MAX_NR_BOOTMODS )
>>> +        return -ERANGE;
>>> +
>>> +    /*
>>> +     * MAX_NR_BOOTMODS cannot exceed the max for MB1, represented by a u32,
>>> +     * thus the cast down to a u32 will be safe due to the prior check.
>>> +     */
>>> +    return (int)addr;
>>
>> Comment and cast contradict one another. DYM u32 (really: uint32_t), or plain
>> int? If you mean to return a plain int (for the sake of the -errno values
>> further up), MAX_NR_BOOTMODS needs to stay below 2**31.
> 
> Good point, we cannot artificially impose 2^31 limit when 2^32 is the 
> legitimate upper bound supported by the MB1 protocol. Even if that value 
> is impractical. At the same time, it is beneficial to be able to 
> communicate failures along with some delineation of the failure. Let me 
> think about this, and in the meantime suggestions are welcomed.

    BUILD_BUG_ON(MAX_NR_BOOTMODS > INT_MAX);

deferring the thinking about the "bigger than this" aspect until a (perhaps
much) later time?

>>> +static int __init process_domain_node(
>>> +    struct boot_info *bi, void *fdt, int dom_node)
>>> +{
>>> +    int node;
>>> +    struct boot_domain *bd = &bi->domains[bi->nr_domains];
>>> +    const char *name = fdt_get_name(fdt, dom_node, NULL);
>>> +    int address_size = fdt_address_cells(fdt, dom_node);
>>> +    int size_size = fdt_size_cells(fdt, dom_node);
>>> +
>>> +    if ( address_size < 0 || size_size < 0 )
>>> +    {
>>> +        printk("  failed processing #address or #size for domain %s)\n",
>>> +               name == NULL ? "unknown" : name);
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    fdt_for_each_subnode(node, fdt, dom_node)
>>> +    {
>>> +        if ( fdt_node_check_compatible(fdt, node, "multiboot,kernel") == 0 
>>> )
>>> +        {
>>> +            int idx = dom0less_module_node(fdt, node, size_size, 
>>> address_size);
>>> +            if ( idx < 0 )
>>> +            {
>>> +                printk("  failed processing kernel module for domain 
>>> %s)\n",
>>> +                       name == NULL ? "unknown" : name);
>>> +                return idx;
>>> +            }
>>> +
>>> +            if ( idx > bi->nr_modules )
>>> +            {
>>> +                printk("  invalid kernel module index for domain node 
>>> (%d)\n",
>>> +                       bi->nr_domains);
>>> +                return -EINVAL;
>>> +            }
>>> +
>>> +            printk("  kernel: boot module %d\n", idx);
>>> +            bi->mods[idx].type = BOOTMOD_KERNEL;
>>> +            bd->kernel = &bi->mods[idx];
>>> +        }
>>> +    }
>>
>> What if you find two?
> 
> No different than if someone accidentally duplicated the module line for 
> the kernel in grub.cfg. It's a violation of the boot convention with the 
> resulting behavior being indeterminate, which may or may not result in 
> failure/panic when the domain attempts to boot. It might be worth adding 
> a warning if a duplicate kernel entry is detected. It is possible that 
> such a configuration would boot if it was a duplicate paste situation. 
> So I would not feel right panicking, when there is a possibility that 
> the configuration could boot.

I agree with not panic()ing, and I didn't mean to ask that you do so. A
warning ought to be enough indeed.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.