Re: [Xen-devel] [PATCH v2] xen/arm: p2m: Correctly flush TLB in create_p2m_entries
Hrm, our TLB flush discipline is horribly confused, isn't it...
On Thu, 2014-01-09 at 16:34 +0000, Julien Grall wrote:
> The p2m is shared between the VCPUs of each domain. Currently Xen only
> flushes the TLB on the local PCPU. This could result in a mismatch between
> the mappings in the p2m and the TLBs.
>
> Flush TLB entries used by this domain on every PCPU. The flush can also be
> moved out of the loop because:
> - ALLOCATE: only called for dom0 RAM allocation, so the flush is never
> called
OK.
An ASSERT(!third[third_table_offset(addr)].p2m.valid) might be
worthwhile if that is the case.
(I'm not sure why ALLOCATE can't be replaced by allocation followed by
an INSERT; it seems like a very special case.)
> - INSERT: if valid == 1, that would mean we have replaced a page that
> already belongs to the domain. A VCPU could write to the wrong page.
> This can append for dom0 with the 1:1 mapping because the mapping is not
> removed from the p2m.
"append"? Do you mean "happen"?
In the non-dom0 1:1 case eventually the page will be freed, I guess by a
subsequent put_page elsewhere -- do they all contain the correct
flushing? Or do we just leak?
What happens if the page being replaced is foreign? Do we leak a
reference to another domain's page? (This is probably a separate issue
though, I suspect the put_page needs pulling out of the switch(op)
statement).
> - REMOVE: except for grant-table (replace_grant_host_mapping),
What about this case? What makes it safe?
> each call to guest_physmap_remove_page is protected by the callers via
> get_page -> .... -> guest_physmap_remove_page -> ... -> put_page. So
> the page can't be allocated to another domain until the last put_page.
I have confirmed this is the case for guest_remove_page, memory_exchange
and XENMEM_remove_from_physmap.
There is one case I saw where this isn't true which is gnttab_transfer,
AFAICT that will fail because steal_page unconditionally returns an
error on arm. There is also a flush_tlb_mask there, FWIW.
> - RELINQUISH : the domain is not running anymore so we don't care...
At some point this page will be reused, as will the VMID, where/how is
it ensured that a flush will happen before that point?
So I think the main outstanding question is what makes
replace_grant_host_mapping safe.
The other big issue is the leak of a foreign mapping on INSERT, but I
think that is separate.
> Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
>
> ---
> Changes in v2:
> - Switch to the domain in order to flush only its TLB entries
> - Move the flush out of the loop
>
> This is a possible bug fix (found by reading the code) for Xen 4.4. I moved
> the flush out of the loop, which should be safe (see why in the commit
> message). Without this patch, the guest can have stale TLB entries when a
> VCPU is moved to another PCPU.
>
> Except for grant-table (I can't find {get,put}_page for the grant-table
> code???), all the callers are protected by a get_page before removing the
> page. So if another VCPU tries to access this page before the flush, it
> will just read/write the wrong page.
>
> The downside of this patch is that Xen flushes fewer TLB entries. Instead
> of flushing all TLBs on the current PCPU, Xen flushes the TLB entries for a
> specific VMID on every CPU. This should be safe because create_p2m_entries
> only deals with a specific domain.
>
> I don't think I have forgotten any case in this function. Let me know if I
> have.
> ---
> xen/arch/arm/p2m.c | 24 +++++++++++++++++++-----
> 1 file changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 11f4714..ad6f76e 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -238,7 +238,7 @@ static int create_p2m_entries(struct domain *d,
> int mattr,
> p2m_type_t t)
> {
> - int rc, flush;
> + int rc;
> struct p2m_domain *p2m = &d->arch.p2m;
> lpae_t *first = NULL, *second = NULL, *third = NULL;
> paddr_t addr;
> @@ -246,10 +246,14 @@ static int create_p2m_entries(struct domain *d,
> cur_first_offset = ~0,
> cur_second_offset = ~0;
> unsigned long count = 0;
> + unsigned int flush = 0;
> bool_t populate = (op == INSERT || op == ALLOCATE);
>
> spin_lock(&p2m->lock);
>
> + if ( d != current->domain )
> + p2m_load_VTTBR(d);
> +
> addr = start_gpaddr;
> while ( addr < end_gpaddr )
> {
> @@ -316,7 +320,7 @@ static int create_p2m_entries(struct domain *d,
> cur_second_offset = second_table_offset(addr);
> }
>
> - flush = third[third_table_offset(addr)].p2m.valid;
> + flush |= third[third_table_offset(addr)].p2m.valid;
>
> /* Allocate a new RAM page and attach */
> switch (op) {
> @@ -373,9 +377,6 @@ static int create_p2m_entries(struct domain *d,
> break;
> }
>
> - if ( flush )
> - flush_tlb_all_local();
> -
> /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
> if ( op == RELINQUISH && count >= 0x2000 )
> {
> @@ -392,6 +393,16 @@ static int create_p2m_entries(struct domain *d,
> addr += PAGE_SIZE;
> }
>
> + if ( flush )
> + {
> + /* At the beginning of the function, Xen is updating VTTBR
> + * with the domain where the mappings are created. In this
> + * case it's only necessary to flush TLBs on every CPUs with
> + * the current VMID (our domain).
> + */
> + flush_tlb();
> + }
> +
> if ( op == ALLOCATE || op == INSERT )
> {
> unsigned long sgfn = paddr_to_pfn(start_gpaddr);
> @@ -409,6 +420,9 @@ out:
> if (second) unmap_domain_page(second);
> if (first) unmap_domain_page(first);
>
> + if ( d != current->domain )
> + p2m_load_VTTBR(current->domain);
> +
> spin_unlock(&p2m->lock);
>
> return rc;
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel