[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 4/9] xen/x86: populate PVHv2 Dom0 physical memory map



>>> On 27.01.17 at 13:23, <roger.pau@xxxxxxxxxx> wrote:
> On Thu, Jan 26, 2017 at 05:41:58AM -0700, Jan Beulich wrote:
>> >>> On 19.01.17 at 18:29, <roger.pau@xxxxxxxxxx> wrote:
>> > @@ -43,6 +44,9 @@ static long __initdata dom0_nrpages;
>> >  static long __initdata dom0_min_nrpages;
>> >  static long __initdata dom0_max_nrpages = LONG_MAX;
>> >  
>> > +/* Size of the VM86 TSS for virtual 8086 mode to use. */
>> > +#define HVM_VM86_TSS_SIZE   128
>> 
>> I continue to be puzzled by this value. Why 128? I think this really
>> needs to be clarified in the comment.
> 
> Given the recent comments by Tim, and that this is starting to look like a can
> of worms, I would like to leave this as-is for the moment, on the grounds that
> it's what hvmloader does (I'm not saying it's right), and that this issue
> should be treated independently from this patch series.

Well, for the purpose of this patch it would be sufficient if the
comment referred to hvmloader. But then I think I saw you set the
TSS limit to 0x67, which is neither in line with the value above nor
- according to what Tim said (but I didn't check myself yet) - the
255 used in hvmloader. I.e. if you clone hvmloader code, all
aspects of it should match.

> Alternatively, I can just remove setting HVM_PARAM_VM86_TSS for a PVHv2 Dom0.
> IIRC I've tried that before (without unrestricted mode support) and it was
> working fine.

Now if that's the case, then why bother with the TSS?

>> > @@ -333,7 +338,9 @@ static unsigned long __init compute_dom0_nr_pages(
>> >              avail -= max_pdx >> s;
>> >      }
>> >  
>> > -    need_paging = opt_dom0_shadow || (is_pvh_domain(d) && 
>> > !iommu_hap_pt_share);
>> > +    need_paging = opt_dom0_shadow ||
>> > +                  (has_hvm_container_domain(d) && (!iommu_hap_pt_share ||
>> > +                                                   !paging_mode_hap(d)));
>> 
>> What is the !paging_mode_hap() part good for? It's being taken care
>> of by checking opt_dom0_shadow already, isn't it? Alternatively, to
>> make the distinction more obvious, I'd suggest
>> 
>>     need_paging = has_hvm_container_domain(d)
>>                   ? !iommu_hap_pt_share || !paging_mode_hap(d)
>>                   : opt_dom0_shadow;
> 
> AFAICT it *might* be possible to run a PVHv2 Dom0 on a box with no EPT, but
> with an IOMMU? Does that exist? In that case opt_dom0_shadow won't be set, but
> paging_mode_hap would be false. Maybe that's just an impossible combination in
> any case...

At least when running Xen itself virtualized, I wouldn't dare to assume
this is an impossible combination. However, I can't see how that case
would be handled any different by the original or the suggested
replacement expressions: need_paging would get set either way afaict.

>> > @@ -608,8 +617,22 @@ static __init void pvh_setup_e820(struct domain *d, 
>> > unsigned long nr_pages)
>> >              continue;
>> >          }
>> >  
>> > -        *entry_guest = *entry;
>> > -        pages = PFN_UP(entry_guest->size);
>> > +        /*
>> > +         * Make sure the start and length are aligned to PAGE_SIZE, 
>> > because
>> > +         * that's the minimum granularity of the 2nd stage translation. 
>> > Since
>> > +         * the p2m code uses PAGE_ORDER_4K internally, also use it here in
>> > +         * order to prevent this code from getting out of sync.
>> > +         */
>> > +        start = ROUNDUP(entry->addr, _AC(1,L) << PAGE_ORDER_4K << 
>> > PAGE_SHIFT);
>> 
>> You definitely don't need to use _AC() in C code. But the whole thing
>> can anyway simply be
>> 
>>         start = ROUNDUP(entry->addr, PAGE_SIZE << PAGE_ORDER_4K);
>> 
>> (albeit I'd like to note that if anything we'd have to be prepared
>> for page sizes > 4k, not smaller ones, and the whole idea of
>> PAGE_ORDER_4K breaks in that case).
> 
> Thanks, I will change as per your recommendation above, although I'm not sure
> what to do with the PAGE_ORDER_4K thing. Are you fine with leaving it like you
> suggest?

Yes, there's far more broken code in that case, and hence the remark
was in parentheses in an attempt to make clear it's really just a remark.

>> > +static int __init pvh_setup_vmx_realmode_helpers(struct domain *d)
>> > +{
>> > +    p2m_type_t p2mt;
>> > +    uint32_t rc, *ident_pt;
>> > +    uint8_t *tss;
>> > +    mfn_t mfn;
>> > +    paddr_t gaddr;
>> > +    unsigned int i;
>> > +
>> > +    /*
>> > +     * Steal some space from the last found RAM region. One page will be
>> > +     * used for the identity page tables, and the remaining space for the
>> > +     * VM86 TSS. Note that after this not all e820 regions will be aligned
>> > +     * to PAGE_SIZE.
>> > +     */
>> > +    if ( pvh_steal_ram(d, PAGE_SIZE + HVM_VM86_TSS_SIZE, ULONG_MAX, 
>> > &gaddr) )
>> > +    {
>> > +        printk("Unable to find memory to stash the identity map and 
>> > TSS\n");
>> > +        return -ENOMEM;
>> > +    }
>> > +
>> > +    /*
>> > +     * Identity-map page table is required for running with CR0.PG=0
>> > +     * when using Intel EPT. Create a 32-bit non-PAE page directory of
>> > +     * superpages.
>> > +     */
>> > +    ident_pt = map_domain_gfn(p2m_get_hostp2m(d), _gfn(PFN_DOWN(gaddr)),
>> > +                              &mfn, &p2mt, 0, &rc);
>> > +    if ( ident_pt == NULL )
>> > +    {
>> > +        printk("Unable to map identity page tables\n");
>> > +        return -ENOMEM;
>> > +    }
>> > +    for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ )
>> > +        ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
>> > +                       _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
>> > +    unmap_domain_page(ident_pt);
>> > +    put_page(mfn_to_page(mfn_x(mfn)));
>> > +    d->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] = gaddr;
>> > +    gaddr += PAGE_SIZE;
>> > +    ASSERT(IS_ALIGNED(gaddr, PAGE_SIZE));
>> 
>> This comes too late - the page table setup above also requires
>> page alignment (and with that, adding PAGE_SIZE would not break
>> the alignment requirement). Even more, the code below doesn't
>> strictly require page alignment, it only requires for the range to
>> not cross a page boundary.
> 
> Given the change that you requested in pvh_steal_ram, now the start of the
> memory area returned by it it's not going to be page-aligned, so I will have 
> to
> perform the TSS setup first, and then the identity page tables.

Or simply pass the required alignment.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.