
Re: [Xen-devel] [PATCH V13 4/7] xen/arm: Data abort exception (R/W) mem_events.





On Thu, Mar 12, 2015 at 2:35 PM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
On Fri, 2015-03-06 at 22:24 +0100, Tamas K Lengyel wrote:
> This patch enables storing, setting, checking and delivering LPAE R/W mem_events.
> As the LPAE PTEs lack enough available software-programmable bits,
> we store the permissions in a radix tree. The tree is only looked at if
> mem_access_enabled is turned on.

But it is maintained/updated regardless, is that deliberate?
> +static int p2m_mem_access_radix_set(struct p2m_domain *p2m, unsigned long pfn,
> +                                    p2m_access_t a)
> +{
> +    int rc;
> +
> +    if ( p2m_access_rwx == a )
> +    {
> +        if ( p2m->mem_access_enabled )

In particular this is gated, but the rest of the function appears not to
be, which seems inconsistent...

Ack, this is a bit awkward here. If mem_access is not enabled, the access type passed here is always p2m_access_rwx, so the tree was never actually updated or maintained in that case. I'll make it more explicit by moving the check for p2m->mem_access_enabled up front and returning 0 when it is not enabled.
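To make that early-return rework concrete, here is a small standalone model of the intended p2m_mem_access_radix_set() semantics. This is a sketch, not the Xen code: an array stands in for the radix tree, and the enum values and struct are simplified stand-ins. When mem_access is disabled the tree is never touched, and an rwx setting deletes the entry instead of storing it.

```c
#include <assert.h>

/* Simplified stand-ins for the Xen types; a sketch of the proposed
 * control flow, not the actual hypervisor code. */
enum { ACCESS_RWX, ACCESS_RX, ACCESS_N };
#define NPFN 16

struct model_p2m {
    int mem_access_enabled;
    int settings[NPFN];          /* -1 == no entry; models the radix tree */
};

static int model_radix_set(struct model_p2m *p2m, unsigned long pfn, int a)
{
    if ( !p2m->mem_access_enabled )
        return 0;                /* proposed up-front check: tree untouched */

    if ( a == ACCESS_RWX )
    {
        p2m->settings[pfn] = -1; /* models radix_tree_delete() */
        return 0;
    }

    p2m->settings[pfn] = a;      /* models insert, or replace on -EEXIST */
    return 0;
}
```

With this shape, the tree only ever contains entries for non-rwx settings made while mem_access was enabled, which matches the behaviour described above.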

> +            radix_tree_delete(&p2m->mem_access_settings, pfn);
> +
> +        return 0;
> +    }
> +
> +    rc = radix_tree_insert(&p2m->mem_access_settings, pfn,
> +                           radix_tree_int_to_ptr(a));
> +    if ( rc == -EEXIST )
> +    {
> +        /* If a setting existed already, change it to the new one */
> +        radix_tree_replace_slot(
> +            radix_tree_lookup_slot(
> +                &p2m->mem_access_settings, pfn),
> +            radix_tree_int_to_ptr(a));
> +        rc = 0;
> +    }
> +
> +    return rc;
> +}
> +
>  enum p2m_operation {
>      INSERT,
>      ALLOCATE,
>      REMOVE,
>      RELINQUISH,
>      CACHEFLUSH,
> +    MEMACCESS,
>  };
>
> /* Put any references on the single 4K page referenced by pte. TODO:
> @@ -560,13 +592,22 @@ static int apply_one_level(struct domain *d,
>          if ( p2m_valid(orig_pte) )
>              return P2M_ONE_DESCEND;
>
> -        if ( is_mapping_aligned(*addr, end_gpaddr, 0, level_size) )
> +        if ( is_mapping_aligned(*addr, end_gpaddr, 0, level_size) &&
> +            /* We only create superpages when mem_access is not in use. */
> +             (level == 3 || (level < 3 && !p2m->mem_access_enabled)) )

I don't think we can get away with adding this check to
is_mapping_aligned (it's used elsewhere), but perhaps you could wrap
this condition in a helper to use in both places.

mapping_allowed_at_level(p2m, level) or some such.

> -            /* We do not handle replacing an existing table with a superpage */
> -             (level == 3 || !p2m_table(orig_pte)) )
> +            /* We do not handle replacing an existing table with a superpage
> +             * or when mem_access is in use. */
> +             (level == 3 || (!p2m_table(orig_pte) && !p2m->mem_access_enabled)) )

Actually, this is very subtly different isn't it. Can it be unified? If
not then ignore the helper idea.

I don't think abstracting it would make it any easier to read; I'd rather keep it this way.
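For reference, Ian's suggested helper might look like the sketch below. The struct and helper body are assumptions based on the conditions quoted above, not code from the series. It covers the first site, but the second site additionally requires !p2m_table(orig_pte), which is the subtle difference that makes a shared helper awkward:

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for struct p2m_domain; only the field used here. */
struct p2m { bool mem_access_enabled; };

/* Hypothetical helper from the review: may a (super)page mapping be
 * created at this level?  Level 3 (4K pages) is always fine; superpages
 * (level < 3) only while mem_access is not in use. */
static bool mapping_allowed_at_level(const struct p2m *p2m,
                                     unsigned int level)
{
    return level == 3 || !p2m->mem_access_enabled;
}
```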

> @@ -760,6 +807,47 @@ static int apply_one_level(struct domain *d,
>              *addr += PAGE_SIZE;
>              return P2M_ONE_PROGRESS_NOP;
>          }
> +
> +    case MEMACCESS:
> +        if ( level < 3 )
> +        {
> +            if ( !p2m_valid(orig_pte) )
> +            {
> +                *addr += level_size;
> +                return P2M_ONE_PROGRESS_NOP;
> +            }
> +
> +            /* Shatter large pages as we descend */
> +            if ( p2m_mapping(orig_pte) )
> +            {
> +                rc = p2m_shatter_page(d, entry, level, flush_cache);
> +                if ( rc < 0 )
> +                    return rc;
> +            } /* else: an existing table mapping -> descend */
> +
> +            return P2M_ONE_DESCEND;
> +        }
> +        else
> +        {
> +            pte = orig_pte;
> +
> +            if ( !p2m_table(pte) )
> +                pte.bits = 0;

What is this about? Just clobbering an invalid PTE?

Hm, yeah, it's not actually required. I'll drop it.

> @@ -783,6 +871,8 @@ static int apply_p2m_changes(struct domain *d,
>      unsigned int cur_root_table = ~0;
>      unsigned int cur_offset[4] = { ~0, ~0, ~0, ~0 };
>      unsigned int count = 0;
> +    const unsigned long sgfn = paddr_to_pfn(start_gpaddr),
> +                        egfn = paddr_to_pfn(end_gpaddr);
>      bool_t flush = false;
>      bool_t flush_pt;
>
> @@ -912,6 +1006,12 @@ static int apply_p2m_changes(struct domain *d,
>      rc = 0;
>
>  out:
> +    if ( flush )
> +    {
> +        flush_tlb_domain(d);
> +        iommu_iotlb_flush(d, sgfn, egfn - sgfn);
> +    }

Is moving the flush out of the loop an independent bug fix? If so please
do in a separate commit with a rationale in the commit log. If it is
somehow related to the changes here then please mention it in this
commit log, since it's a bit subtle.

Right, it's not a bugfix, and it isn't required to be outside the loop. I moved it because it made sense to me to flush only once instead of at every iteration. I'll put it back.

> +
>      if ( rc < 0 && ( op == INSERT || op == ALLOCATE ) &&
>           addr != start_gpaddr )
>      {
> @@ -1281,6 +1381,254 @@ void __init setup_virt_paging(void)
>      smp_call_function(setup_virt_paging_one, (void *)val, 1);
>  }
>
> +bool_t p2m_mem_access_check(paddr_t gpa, vaddr_t gla, const struct npfec npfec)

This is different to the current x86 prototype, is that due to your
other cleanup series?

It's different because of how the x86 code is structured. On ARM it is a bit more straightforward, so the prototype is simpler. The function isn't called from common code, so arch-specific discrepancies are OK.

> +{
> +    int rc;
> +    bool_t violation;
> +    xenmem_access_t xma;
> +    mem_event_request_t *req;
> +    struct vcpu *v = current;
> +    struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
> +
> +    /* Mem_access is not in use. */
> +    if ( !p2m->mem_access_enabled )
> +        return true;
> +
> +    rc = p2m_get_mem_access(v->domain, paddr_to_pfn(gpa), &xma);
> +    if ( rc )
> +        return true;
> +
> +    /* Now check for mem_access violation. */
> +    switch ( xma )
> +    {
> +    case XENMEM_access_rwx:
> +        violation = false;
> +        break;
> +    case XENMEM_access_rw:
> +        violation = npfec.insn_fetch;
> +        break;
> +    case XENMEM_access_wx:
> +        violation = npfec.read_access;
> +        break;
> +    case XENMEM_access_rx:
> +    case XENMEM_access_rx2rw:
> +        violation = npfec.write_access;
> +        break;
> +    case XENMEM_access_x:
> +        violation = npfec.read_access || npfec.write_access;
> +        break;
> +    case XENMEM_access_w:
> +        violation = npfec.read_access || npfec.insn_fetch;
> +        break;
> +    case XENMEM_access_r:
> +        violation = npfec.write_access || npfec.insn_fetch;
> +        break;
> +    default:
> +    case XENMEM_access_n:
> +    case XENMEM_access_n2rwx:
> +        violation = true;
> +        break;
> +    }
> +
> +    if ( !violation )
> +        return true;


The preceding section looks pretty similar to the guts of x86's
p2m_mem_event_emulate_check, can they be combined?

Not without introducing another wrapper around them. The x86 mem_event_emulate_check takes a mem_event_response_t as input, while here we take an npfec. The logic applied afterwards is similar, but I'd rather do that kind of consolidation once this series and the other cleanup series are both applied.
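The violation check quoted above is a pure decision table, so it can be modelled standalone. In this sketch the local enum and struct are stand-ins for xenmem_access_t and struct npfec; only the decision logic mirrors the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the XENMEM_access_* values. */
enum xma { A_N, A_R, A_W, A_RW, A_X, A_RX, A_WX, A_RWX, A_RX2RW, A_N2RWX };

struct fault { bool read, write, fetch; };   /* models struct npfec */

/* Mirrors the switch in p2m_mem_access_check(): does this fault kind
 * violate the stored access setting? */
static bool is_violation(enum xma xma, struct fault f)
{
    switch ( xma )
    {
    case A_RWX:   return false;
    case A_RW:    return f.fetch;
    case A_WX:    return f.read;
    case A_RX:
    case A_RX2RW: return f.write;
    case A_X:     return f.read || f.write;
    case A_W:     return f.read || f.fetch;
    case A_R:     return f.write || f.fetch;
    default:      return true;               /* A_N, A_N2RWX */
    }
}
```

Note how rx2rw faults on a write just like plain rx; the difference between the two only shows up in the automatic conversion handled after the check.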

> +
> +    /* First, handle rx2rw and n2rwx conversion automatically. */
> +    if ( npfec.write_access && xma == XENMEM_access_rx2rw )
> +    {
> +        rc = p2m_set_mem_access(v->domain, paddr_to_pfn(gpa), 1,
> +                                0, ~0, XENMEM_access_rw);
> +        return false;
> +    }
> +    else if ( xma == XENMEM_access_n2rwx )
> +    {
> +        rc = p2m_set_mem_access(v->domain, paddr_to_pfn(gpa), 1,
> +                                0, ~0, XENMEM_access_rwx);
> +    }

This looks like a bit of p2m_mem_access_check, can it be made common?

It could be, but for now I'd rather keep them separate.

> +
> +    /* Otherwise, check if there is a memory event listener, and send the message along */
> +    if ( !mem_event_check_ring(&v->domain->mem_event->access) )
> +    {
> +        /* No listener */
> +        if ( p2m->access_required )
> +        {
> +            gdprintk(XENLOG_INFO, "Memory access permissions failure, "
> +                                  "no mem_event listener VCPU %d, dom %d\n",
> +                                  v->vcpu_id, v->domain->domain_id);
> +            domain_crash(v->domain);
> +        }
> +        else
> +        {
> +            /* n2rwx was already handled */
> +            if ( xma != XENMEM_access_n2rwx )
> +            {
> +                /* A listener is not required, so clear the access
> +                 * restrictions. */
> +                rc = p2m_set_mem_access(v->domain, paddr_to_pfn(gpa), 1,
> +                                        0, ~0, XENMEM_access_rwx);
> +            }
> +        }
> +
> +        /* No need to reinject */
> +        return false;
> +    }

And this

This one is actually a bit different. On x86 the request is only set up here and sent later, while on ARM we can send it right away. I made an attempt to consolidate the two before, but the x86 side needed some heavy cleanup first, so that got postponed until after the mem_event API itself is cleaned up.

> +    req = xzalloc(mem_event_request_t);
> +    if ( req )
> +    {
> +        req->reason = MEM_EVENT_REASON_VIOLATION;
> +        if ( xma != XENMEM_access_n2rwx )
> +            req->flags |= MEM_EVENT_FLAG_VCPU_PAUSED;
> +        req->gfn = gpa >> PAGE_SHIFT;
> +        req->offset = gpa & ((1 << PAGE_SHIFT) - 1);
> +        req->gla = gla;
> +        req->gla_valid = npfec.gla_valid;
> +        req->access_r = npfec.read_access;
> +        req->access_w = npfec.write_access;
> +        req->access_x = npfec.insn_fetch;
> +        if ( npfec_kind_in_gpt == npfec.kind )
> +            req->fault_in_gpt = 1;
> +        if ( npfec_kind_with_gla == npfec.kind )
> +            req->fault_with_gla = 1;
> +        req->vcpu_id = v->vcpu_id;
> +
> +        mem_access_send_req(v->domain, req);
> +        xfree(req);
> +    }
> +
> +    /* Pause the current VCPU */
> +    if ( xma != XENMEM_access_n2rwx )
> +        mem_event_vcpu_pause(v);
> +
> +    return false;
> +}
> +
> +/* Set access type for a region of pfns.
> + * If start_pfn == -1ul, sets the default access type */
> +long p2m_set_mem_access(struct domain *d, unsigned long pfn, uint32_t nr,
> +                        uint32_t start, uint32_t mask, xenmem_access_t access)

and this function is nearly identical to the x86 one too.


Nearly, but not completely. IMHO consolidation may be possible for some of these bits, but I'm not sure it would make the code any easier to follow once it jumps back and forth between common and arch-specific parts.
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    p2m_access_t a;
> +    long rc = 0;
> +
> +    static const p2m_access_t memaccess[] = {
> +#define ACCESS(ac) [XENMEM_access_##ac] = p2m_access_##ac
> +        ACCESS(n),
> +        ACCESS(r),
> +        ACCESS(w),
> +        ACCESS(rw),
> +        ACCESS(x),
> +        ACCESS(rx),
> +        ACCESS(wx),
> +        ACCESS(rwx),
> +        ACCESS(rx2rw),
> +        ACCESS(n2rwx),
> +#undef ACCESS
> +    };
> +
> +    switch ( access )
> +    {
> +    case 0 ... ARRAY_SIZE(memaccess) - 1:
> +        a = memaccess[access];
> +        break;
> +    case XENMEM_access_default:
> +        a = p2m->default_access;
> +        break;
> +    default:
> +        return -EINVAL;
> +    }
> +
> +    /*
> +     * Flip mem_access_enabled to true when a permission is set, as to prevent
> +     * allocating or inserting super-pages.
> +     */
> +    p2m->mem_access_enabled = true;
> +
> +    /* If request to set default access. */
> +    if ( pfn == ~0ul )
> +    {
> +        p2m->default_access = a;
> +        return 0;
> +    }
> +
> +    rc = apply_p2m_changes(d, MEMACCESS,
> +                           pfn_to_paddr(pfn+start), pfn_to_paddr(pfn+nr),
> +                           0, MATTR_MEM, mask, 0, a);
> +    if ( rc < 0 )
> +        return rc;
> +    else if ( rc > 0 )
> +        return start + rc;
> +
> +    return 0;
> +}
> +
> +int p2m_get_mem_access(struct domain *d, unsigned long gpfn,
> +                       xenmem_access_t *access)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    void *i;
> +    unsigned int index;
> +
> +    static const xenmem_access_t memaccess[] = {

Would be nice not to have this static const thing twice in .rodata.

What do you mean twice? One converts from p2m_access to XENMEM_access, the other from XENMEM_access to p2m_access.

> +#define ACCESS(ac) [p2m_access_##ac] = XENMEM_access_##ac
> +        ACCESS(n),
> +        ACCESS(r),
> +        ACCESS(w),
> +        ACCESS(rw),
> +        ACCESS(x),
> +        ACCESS(rx),
> +        ACCESS(wx),
> +        ACCESS(rwx),
> +        ACCESS(rx2rw),
> +        ACCESS(n2rwx),
> +#undef ACCESS
> +    };
> +
> +    /* If no setting was ever set, just return rwx. */
> +    if ( !p2m->mem_access_enabled )
> +    {
> +        *access = XENMEM_access_rwx;
> +        return 0;
> +    }
> +
> +    /* If request to get default access */
> +    if ( gpfn == ~0ull )

We should have a suitable constant for this, I think, INVALID_MFN looks
like the one.

~0ull is used specifically by the mem_access API for this purpose. If anywhere, a #define for it might make sense as part of the cleanup series.
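If such a #define were added, it might look like this minimal sketch; the macro name here is hypothetical, not from the series or the API headers:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the mem_access API's "default access" sentinel,
 * as discussed above; the API currently uses the bare ~0ull. */
#define MEM_ACCESS_DEFAULT_GFN (~0ull)

static int is_default_access_request(uint64_t gpfn)
{
    return gpfn == MEM_ACCESS_DEFAULT_GFN;
}
```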

> +    {
> +        *access = memaccess[p2m->default_access];
> +        return 0;
> +    }
> +
> +    spin_lock(&p2m->lock);
> +    i = radix_tree_lookup(&p2m->mem_access_settings, gpfn);
> +    spin_unlock(&p2m->lock);
> +
> +    if ( !i )
> +    {
> +        /*
> +         * No setting was found in the Radix tree. Check if the
> +         * entry exists in the page-tables.
> +         */
> +        paddr_t maddr = p2m_lookup(d, gpfn << PAGE_SHIFT, NULL);
> +        if ( INVALID_PADDR == maddr )
> +            return -ESRCH;
> +
> +        /* If entry exists then its rwx. */

it's, please

Ack.

> +        *access = XENMEM_access_rwx;
> +    }
> +    else
> +    {
> +        /* Setting was found in the Radix tree. */
> +        index = radix_tree_ptr_to_int(i);
> +        if ( index >= ARRAY_SIZE(memaccess) )
> +            return -ERANGE;
> +
> +        *access = memaccess[index];
> +    }
> +
> +    return 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C

Ian.

Thanks,
Tamas
