[Xen-devel] [PATCH 2/2] evtchn/fifo: don't corrupt queues if an old tail is linked
From: David Vrabel <david.vrabel@xxxxxxxxxx>

An event may still be the tail of a queue even if the queue is now
empty (an 'old tail' event).  There is logic to handle the case when
this old tail event needs to be added to the now empty queue (by
checking for q->tail == port).

However, this does not cover all cases.

1. An old tail may be re-added simultaneously with another event.
   LINKED is set on the old tail, and the other CPU may misinterpret
   this as the old tail still being valid and set LINK instead of
   HEAD.  All events on this queue will then be lost.

2. If the old tail event on queue A is moved to a different queue B
   (by changing its VCPU or priority), the event may then be linked
   onto queue B.  When another event is linked onto queue A it will
   check the old tail, see that it is linked (but on queue B) and
   overwrite the LINK field, corrupting both queues.

When an event is linked, save the vcpu id and priority of the queue
it is being linked onto.  Use these when linking an event to check
whether it is an unlinked old tail event.  If it is an old tail
event, the old queue is empty and old_q->tail is invalidated to
ensure that adding another event to old_q will update HEAD.  The
tail is invalidated by setting it to 0, since event 0 is never
linked.

The old_q->lock is held while setting LINKED to avoid the race with
the test of LINKED in evtchn_fifo_set_link().

Since an event channel may move queues after old_q->lock is
acquired, we must check that we hold the correct lock and retry if
not.  Since changing VCPU or priority is expected to be a rare event
that is serialized in the guest, we try at most 3 times before
dropping the event.  This prevents a malicious guest from repeatedly
adjusting priority to prevent another domain from acquiring
old_q->lock.

Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
---
 xen/common/event_fifo.c | 80 ++++++++++++++++++++++++++++++++++++++++-------
 xen/include/xen/sched.h |  2 +
 2 files changed, 70 insertions(+), 12 deletions(-)

diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
index 2ab4c29..fc43e62 100644
--- a/xen/common/event_fifo.c
+++ b/xen/common/event_fifo.c
@@ -50,6 +50,36 @@ static void evtchn_fifo_init(struct domain *d, struct evtchn *evtchn)
                  d->domain_id, evtchn->port);
 }
 
+static struct evtchn_fifo_queue *lock_old_queue(const struct domain *d,
+                                                struct evtchn *evtchn,
+                                                unsigned long *flags)
+{
+    struct vcpu *v;
+    struct evtchn_fifo_queue *q, *old_q;
+    unsigned int try;
+
+    for ( try = 0; try < 3; try++ )
+    {
+        v = d->vcpu[evtchn->last_vcpu_id];
+        old_q = &v->evtchn_fifo->queue[evtchn->last_priority];
+
+        spin_lock_irqsave(&old_q->lock, *flags);
+
+        v = d->vcpu[evtchn->last_vcpu_id];
+        q = &v->evtchn_fifo->queue[evtchn->last_priority];
+
+        if ( old_q == q )
+            return old_q;
+
+        spin_unlock_irqrestore(&old_q->lock, *flags);
+    }
+
+    gdprintk(XENLOG_WARNING,
+             "domain %d, port %d lost event (too many queue changes)\n",
+             d->domain_id, evtchn->port);
+    return NULL;
+}
+
 static int try_set_link(event_word_t *word, event_word_t *w, uint32_t link)
 {
     event_word_t new, old;
@@ -119,7 +149,6 @@ static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
     struct domain *d = v->domain;
     unsigned int port;
     event_word_t *word;
-    struct evtchn_fifo_queue *q;
     unsigned long flags;
     bool_t was_pending;
 
@@ -136,25 +165,52 @@ static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
         return;
     }
 
-    /*
-     * No locking around getting the queue.  This may race with
-     * changing the priority but we are allowed to signal the event
-     * once on the old priority.
-     */
-    q = &v->evtchn_fifo->queue[evtchn->priority];
-
     was_pending = test_and_set_bit(EVTCHN_FIFO_PENDING, word);
 
     /*
      * Link the event if it unmasked and not already linked.
      */
     if ( !test_bit(EVTCHN_FIFO_MASKED, word)
-         && !test_and_set_bit(EVTCHN_FIFO_LINKED, word) )
+         && !test_bit(EVTCHN_FIFO_LINKED, word) )
     {
+        struct evtchn_fifo_queue *q, *old_q;
         event_word_t *tail_word;
         bool_t linked = 0;
 
-        spin_lock_irqsave(&q->lock, flags);
+        /*
+         * No locking around getting the queue.  This may race with
+         * changing the priority but we are allowed to signal the
+         * event once on the old priority.
+         */
+        q = &v->evtchn_fifo->queue[evtchn->priority];
+
+        old_q = lock_old_queue(d, evtchn, &flags);
+        if ( !old_q )
+            goto done;
+
+        if ( test_and_set_bit(EVTCHN_FIFO_LINKED, word) )
+        {
+            spin_unlock_irqrestore(&old_q->lock, flags);
+            goto done;
+        }
+
+        /*
+         * If this event was a tail, the old queue is now empty and
+         * its tail must be invalidated to prevent adding an event to
+         * the old queue from corrupting the new queue.
+         */
+        if ( old_q->tail == port )
+            old_q->tail = 0;
+
+        /* Moved to a different queue? */
+        if ( old_q != q )
+        {
+            evtchn->last_vcpu_id = evtchn->notify_vcpu_id;
+            evtchn->last_priority = evtchn->priority;
+
+            spin_unlock_irqrestore(&old_q->lock, flags);
+            spin_lock_irqsave(&q->lock, flags);
+        }
 
         /*
          * Atomically link the tail to port iff the tail is linked.
@@ -166,7 +222,7 @@ static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
          * If the queue is empty (i.e., we haven't linked to the new
          * event), head must be updated.
          */
-        if ( port != q->tail )
+        if ( q->tail )
         {
             tail_word = evtchn_fifo_word_from_port(d, q->tail);
             linked = evtchn_fifo_set_link(d, tail_word, port);
@@ -182,7 +238,7 @@ static void evtchn_fifo_set_pending(struct vcpu *v, struct evtchn *evtchn)
                  &v->evtchn_fifo->control_block->ready) )
             vcpu_mark_events_pending(v);
     }
-
+ done:
     if ( !was_pending )
         evtchn_check_pollers(d, port);
 }
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index cbdf377..5ab92dd 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -98,6 +98,8 @@ struct evtchn
     } u;
     u8 priority;
    u8 pending:1;
+    u16 last_vcpu_id;
+    u8 last_priority;
 #ifdef FLASK_ENABLE
     void *ssid;
 #endif
-- 
1.7.2.5
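
For readers following along: the locking discipline in lock_old_queue()
is a lock-then-revalidate (double-checked) pattern with a bounded retry
count.  Below is a minimal user-space sketch of the same idea, not Xen
code: the pthread mutex stands in for Xen's spin_lock_irqsave(), and the
struct, field, and function names (queue, event, last_q, lock_old_queue)
are invented for illustration.

    #include <pthread.h>
    #include <stdio.h>

    struct queue {
        pthread_mutex_t lock;
        unsigned int tail;      /* last port linked onto this queue */
    };

    struct event {
        /* Queue this event was last linked onto (updated under lock). */
        struct queue *last_q;
    };

    static struct queue *lock_old_queue(struct event *ev)
    {
        unsigned int attempt;

        for ( attempt = 0; attempt < 3; attempt++ )
        {
            /* Racy read: the event may move before we get the lock. */
            struct queue *old_q = ev->last_q;

            pthread_mutex_lock(&old_q->lock);

            /* Re-read under the lock to validate that we locked the
             * queue the event is still associated with. */
            if ( ev->last_q == old_q )
                return old_q;   /* correct lock held */

            pthread_mutex_unlock(&old_q->lock);
        }

        /* Bounded retries: a hostile peer repeatedly moving the event
         * cannot make us spin forever; we drop the event instead. */
        fprintf(stderr, "event dropped (too many queue changes)\n");
        return NULL;
    }

The retry bound is what turns a potential livelock (a malicious guest
bouncing the event between queues) into a bounded, lossy operation.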
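
The other half of the fix relies on the "atomically link the tail to
port iff the tail is linked" step.  Here is a simplified C11 sketch of
that idea, assuming the FIFO ABI's LINKED bit (29) and 17-bit LINK
field; the real evtchn_fifo_set_link()/try_set_link() in Xen also
bounds its retries against a guest racing the update, and the function
name set_link_if_linked below is invented.

    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdbool.h>

    #define EVTCHN_FIFO_LINKED     29
    #define EVTCHN_FIFO_LINK_MASK  ((1u << 17) - 1)

    typedef _Atomic uint32_t event_word_t;

    /* Returns true if 'link' was stored in the tail's LINK field,
     * false if the tail became unlinked (caller must update HEAD). */
    static bool set_link_if_linked(event_word_t *tail_word, uint32_t link)
    {
        uint32_t old = atomic_load(tail_word);

        for ( ;; )
        {
            uint32_t new;

            /* Never overwrite LINK once the word is unlinked. */
            if ( !(old & (1u << EVTCHN_FIFO_LINKED)) )
                return false;

            new = (old & ~EVTCHN_FIFO_LINK_MASK)
                  | (link & EVTCHN_FIFO_LINK_MASK);

            /* On failure, 'old' is refreshed with the current value
             * and we re-test LINKED before trying again. */
            if ( atomic_compare_exchange_weak(tail_word, &old, new) )
                return true;
        }
    }

Because the LINKED test and the LINK write are one compare-exchange,
a writer that saw a stale tail cannot clobber a queue it is no longer
on, which is exactly the race the patch closes by also holding
old_q->lock around setting LINKED.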