
[Xen-devel] Hypervisor memory leak/corruption because of guest irqs



Hello,

I appear to have opened a can of worms here.  This was discovered while
turning on ASSERT()s to hunt for another crash (in the process adding 98
new ASSERTs, along with some 'type checking' magic constants).

The issue was discovered against Xen 4.1.3, but a cursory inspection of
unstable shows no sign of it having been fixed.  The relevant assertion
(which I added) is attached.  (In the process of debugging, I also
developed an ASSERT_PRINTK(bool, fmt, ...) macro, which will be
upstreamed in due course.)

The root cause of the problem is the complete abuse of the
irq_desc->action pointer, which gets cast to an irq_guest_action_t* when
it is in fact declared as an irqaction*, but the (correct) solution is
not easy.
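
For reference, this is roughly what the two types look like (fields
trimmed; the real definitions live in xen/include/xen/irq.h and the
guest-irq half of xen/arch/x86/irq.c), plus a simplified sketch of how
pirq_guest_bind() stores one through a pointer to the other:

    struct irqaction {                 /* common, architecture-independent */
        void (*handler)(int, void *, struct cpu_user_regs *);
        const char *name;
        void *dev_id;
    };

    typedef struct {                   /* private to x86's irq.c */
        u8 nr_guests;
        u8 in_flight;
        u8 shareable;
        u8 ack_type;
        cpumask_t cpu_eoi_map;
        struct timer eoi_timer;        /* lives on a pcpu timer list */
        struct domain *guest[IRQ_MAX_GUESTS];
    } irq_guest_action_t;

    /* Simplified from pirq_guest_bind(): allocate the big structure,
     * then store it through the small structure's pointer type. */
    static void bind_sketch(struct irq_desc *desc)
    {
        irq_guest_action_t *action = xmalloc(irq_guest_action_t);

        /* ... fill in the guest fields, init_timer(&action->eoi_timer, ...) ... */

        desc->action = (struct irqaction *)action;   /* the abuse */
    }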

destroy_irq() calls dynamic_irq_cleanup(), which xfree()'s desc->action.
This would be all well and good if it were only an irqaction pointer.
However, in this case it is actually an irq_guest_action_t pointer,
meaning that we have free()'d an inactive timer which is still on a
pcpu's inactive timer linked list.  As soon as the free()'d memory is
reused for something new, the linked list gets trashed, at which point
all bets are off with regards to the validity of hypervisor memory.
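
To spell the path out (paraphrased from 4.1's xen/arch/x86/irq.c, with
locking and unrelated cleanup trimmed):

    static void dynamic_irq_cleanup(unsigned int irq)
    {
        struct irq_desc *desc = irq_to_desc(irq);
        struct irqaction *action = desc->action;

        desc->action = NULL;
        /* ... mask the irq, clear desc->status, drop desc->lock ... */

        /* If this was a guest irq, 'action' is really an
         * irq_guest_action_t whose eoi_timer is still linked on a pcpu
         * inactive timer list; freeing it leaves that list pointing
         * into free memory. */
        xfree(action);
    }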

As far as I can tell, this bug only manifests in combination with PCI
passthrough, as we only perform cleanup of guest irqs when a domain with
passthrough is shut down.  The issue was first caught by the ASSERT()s
in __list_del(), when something tried to use the pcpu inactive timer
list after the free()'d memory had been reused.

In this specific case, a quick and dirty hack would be to check, every
time we free an action, whether it belongs to a guest irq, and kill the
timer if so.
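
i.e. something along these lines before the xfree() in
dynamic_irq_cleanup() (untested, and it relies on the IRQ_GUEST status
bit still being set at that point, which wants double-checking):

    if ( action && (desc->status & IRQ_GUEST) )
    {
        irq_guest_action_t *guest_action = (irq_guest_action_t *)action;

        /* Unlink the eoi_timer from the pcpu inactive timer list
         * before its memory gets recycled. */
        kill_timer(&guest_action->eoi_timer);
    }
    xfree(action);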

Having said that, it is not a correct fix; the utter abuse of
irq_desc->action has been a ticking timebomb for a long time.
irq_guest_action_t is private to the second half of x86's irq.c, whereas
irqaction is common and architecture-independent.  The only acceptable
solution I can see is to re-architect a substantial proportion of the
irq code.

Am I missing something obvious, or is re-architecting the best way to
continue?  (In which case I have my work cut out, as this is currently
affecting XenServer customers.)

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

Attachment: hypervisor-panic.log
Description: Text Data


 

