Re: [Xen-devel] [PATCH] Xen: Fix retry calls into PRIVCMD_MMAPBATCH*.

On Aug 1, 2013, at 8:04 AM, David Vrabel <david.vrabel@xxxxxxxxxx> wrote:

> On 01/08/13 12:49, Andres Lagar-Cavilla wrote:
>> On Aug 1, 2013, at 7:23 AM, David Vrabel <david.vrabel@xxxxxxxxxx> wrote:
>>> On 01/08/13 04:30, Andres Lagar-Cavilla wrote:
>>>> -- Resend as I haven't seen this hit the lists. Maybe some smtp misconfig. 
>>>> Apologies. Also expanded cc --
>>>> When a foreign mapper attempts to map guest frames that are paged out,
>>>> the mapper receives an ENOENT response and will have to try again
>>>> while a helper process pages the target frame back in.
>>>> Gating checks on PRIVCMD_MMAPBATCH* ioctl args were preventing retries
>>>> of mapping calls.
>>> This breaks the auto_translated_physmap case, as it will allocate another
>>> set of empty pages and leak the previous set.
>> David,
>> not able to follow you here. Under what circumstances will another
>> set of empty pages be allocated? And where? Are we talking about page table pages?
>       ....
>       vma = find_vma(mm, m.addr);
>       if (!vma ||
>           vma->vm_ops != &privcmd_vm_ops ||
>           (m.addr != vma->vm_start) ||
>           ((m.addr + (nr_pages << PAGE_SHIFT)) != vma->vm_end) ||
>           !privcmd_enforce_singleshot_mapping(vma)) {
>               up_write(&mm->mmap_sem);
>               ret = -EINVAL;
>               goto out;
>       }
>       if (xen_feature(XENFEAT_auto_translated_physmap)) {
>               ret = alloc_empty_pages(vma, m.num);
> Here.

Right, right, right. Excellent observation, thanks. I forward-ported this from
3.4 and it slipped through the cracks. OK, V2 coming.
>               if (ret < 0) {
>                       up_write(&mm->mmap_sem);
>                       goto out;
>               }
>       }
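
To make the leak concrete, here is a minimal userspace sketch of the kind of guard a retry-safe path would need: allocate the empty pages only on the first call against a VMA and reuse them on retries. This is an illustration under stated assumptions, not the V2 patch. The names fake_vma, fake_alloc_empty_pages and prepare_vma_for_batch are hypothetical stand-ins, and tracking the allocation through a private_data pointer is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the VMA state relevant to this discussion. */
struct fake_vma {
    void *private_data;  /* assumed to hold the empty-page array once allocated */
    int allocations;     /* counts allocations so a leak becomes visible */
};

/* Hypothetical stand-in for alloc_empty_pages(): always allocates a new set. */
static int fake_alloc_empty_pages(struct fake_vma *vma, size_t num)
{
    void *pages;

    assert(num > 0);
    pages = calloc(num, sizeof(void *));
    if (!pages)
        return -1;
    /* Overwriting a non-NULL pointer here is exactly the leak on retry. */
    vma->private_data = pages;
    vma->allocations++;
    return 0;
}

/* Retry-safe wrapper: skip the allocation if a previous call already did it. */
static int prepare_vma_for_batch(struct fake_vma *vma, size_t num)
{
    if (vma->private_data != NULL)
        return 0;  /* retry on the same VMA: reuse the existing pages */
    return fake_alloc_empty_pages(vma, num);
}
```

Calling prepare_vma_for_batch() twice on the same fake_vma leaves allocations at 1, whereas calling fake_alloc_empty_pages() twice would drop the first array on the floor.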
>>> This privcmd_enforce_singleshot_mapping() stuff seems very odd anyway.
>>> Does anyone know what it was for originally?  It would be preferable if
>>> we could update the mappings with a new set of foreign MFNs without
>>> having to tear down the VMA and recreate a new VMA.
>> I believe it's mostly historical. I agree with you in principle, but
>> recreating VMAs is super-cheap.
> Tearing them down is not cheap as each page requires a trap-and-emulate
> to clear the PTE (see ptep_get_and_clear_full() in zap_pte_range()).

You need to tell the hypervisor to drop the ref on the mapped page. So you'd
need a hypercall (arguably a multicall) to do that, which is not free. Then
you'd need privcmd and libxc to agree on reusing the vma -- which has very
low value in itself, as it is just a piece of metadata. And you still need to
deal with cleaning up the mapped refs when the mapping process crashes.

So a whole lot of new complexity for small value, imho.

Probably that's the whole point of the singleshot: don't forget you have
something mapped in there, because if you do, you might leak the ref forever.
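
For reference, the retry behaviour the patch is meant to allow can be modelled roughly as below. This is only a userspace simulation, assuming per-frame error reporting where a paged-out frame yields -ENOENT until the pager has brought it back. fake_frame, fake_map_batch and map_with_retry are hypothetical names, not the privcmd or libxc interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_FRAMES 16

/* Hypothetical per-frame state, standing in for what the pager tracks. */
struct fake_frame {
    int paged_out_for;  /* calls remaining until the frame is paged back in */
    int mapped;         /* nonzero once the frame has been mapped */
};

/*
 * Hypothetical stand-in for one batch mapping call. Fills err[] per frame
 * and returns how many frames still reported -ENOENT.
 */
static int fake_map_batch(struct fake_frame *f, int *err, size_t n)
{
    int pending = 0;

    for (size_t i = 0; i < n; i++) {
        if (f[i].mapped) {
            err[i] = 0;            /* already mapped by an earlier call */
        } else if (f[i].paged_out_for > 0) {
            f[i].paged_out_for--;  /* pretend the helper process makes progress */
            err[i] = -ENOENT;
            pending++;
        } else {
            f[i].mapped = 1;
            err[i] = 0;
        }
    }
    return pending;
}

/*
 * Re-issue the batch call against the same frames until nothing reports
 * -ENOENT. Returns the number of calls made, or -1 if max_attempts ran out.
 */
static int map_with_retry(struct fake_frame *f, size_t n, int max_attempts)
{
    int err[MAX_FRAMES];

    assert(n <= MAX_FRAMES);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (fake_map_batch(f, err, n) == 0)
            return attempt;
    }
    return -1;
}
```

The point of the model is that every retry targets the same mapping, so any gating check that rejects a second call against it turns a transient -ENOENT into a hard failure.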

> David

