Xen project Mailing List

Re: [Xen-devel] [PATCH 2/2] Xen/mem_event: Prevent underflow of vcpu pause counts

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Andres Lagar Cavilla <andres@xxxxxxxxxxxxxxxx>

From: "Aravindh Puthiyaparambil (aravindp)" <aravindp@xxxxxxxxx>

Date: Thu, 17 Jul 2014 19:18:33 +0000

Accept-language: en-US

Cc: Tim Deegan <tim@xxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxx>

Delivery-date: Thu, 17 Jul 2014 19:18:39 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Thu, Jul 17, 2014 at 2:51 PM, Aravindh Puthiyaparambil (aravindp) <aravindp@xxxxxxxxx> wrote:

>> +void mem_event_vcpu_unpause(struct vcpu *v) {

>> + if ( test_and_clear_bool(v->paused_for_mem_event) )
>
>And now that we consider more than one mem event piling up to pause a
>vcpu, this has to become an atomic counter, which unpauses on zero, and
>takes care of underflow.

Very true. I have seen this event pile-up occur in practice in our product.

The problem becomes how to tell apart real event responses that should dec the pause count from spurious crap from the toolstack. IOW, how to not unpause the vcpu when count reaches zero due to bad responses. I think the answer is: you can't, if the toolstack is evil, behavior undefined and bigger fish to fry.

Andres

You really can't, but the important bit is to ensure that Xen is sufficiently insulated from buggy toolstack components that it doesn't fall over.

From my experimenting with the pausedomain refcounting, weird stuff happens when the domain pause count turns negative. I ended up with a domain which would never be scheduled again (even after returning the count to positive and back to 0), and a domain which couldn't be killed using `xl destroy`. Rebooting was the only option.

So long as Xen doesn't fall into these problems, a buggy toolstack (especially with mem_events) already has many ways to screw over a domain, so one more is not a problem.
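For illustration only, the counter discussed above (increment per pause, unpause on reaching zero, never go negative) could look roughly like the following. This is a standalone sketch using C11 atomics, not the code from the patch: `struct demo_vcpu`, `demo_vcpu_init()`, `demo_pause()` and `demo_unpause()` are hypothetical names, and the `running` flag merely stands in for whatever the real pause/unpause calls would do.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-vcpu state; this is NOT Xen's struct vcpu. */
struct demo_vcpu {
    atomic_int pause_count; /* outstanding mem_event pauses */
    bool running;           /* placeholder for the scheduler's view */
};

static void demo_vcpu_init(struct demo_vcpu *v)
{
    atomic_init(&v->pause_count, 0);
    v->running = true;
}

/* One more mem_event is holding the vcpu paused. */
static void demo_pause(struct demo_vcpu *v)
{
    atomic_fetch_add(&v->pause_count, 1);
    v->running = false;
}

/*
 * Handle one response. Returns false, leaving the count untouched, when
 * the count is already zero, so a spurious response cannot underflow it.
 */
static bool demo_unpause(struct demo_vcpu *v)
{
    int old = atomic_load(&v->pause_count);

    do {
        if (old <= 0)
            return false; /* spurious response: refuse to go negative */
        /* On failure the CAS reloads 'old' and the check is repeated. */
    } while (!atomic_compare_exchange_weak(&v->pause_count, &old, old - 1));

    if (old == 1)
        v->running = true; /* last outstanding pause released */

    return true;
}
```

Note that this only keeps the count from going negative. As said above, it cannot tell a genuine response from a spurious one while the count is still positive, so a misbehaving toolstack can still unpause the vcpu early.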

I misunderstood Andres's point. Your response made it clear what the concern was.

Thanks,

Aravindh

_______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel