|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] ITS emulation race conditions
On 11/04/17 10:24, Julien Grall wrote: Hi Stefano, On 04/11/2017 12:01 AM, Stefano Stabellini wrote:On Mon, 10 Apr 2017, Andre Przywara wrote:Hi, On 10/04/17 00:12, André Przywara wrote:Hi, I wanted to run some ideas on how to prevent the race conditions we are facing with the ITS emulation and removing devices and/or LPIs. I think Stefano's idea of tagging a discarded LPI is the key, but still some details are left to be solved. I think we are dealing with two issues: 1) A guest's DISCARD can race with in incoming LPI. Ideally DISCARD would remove the host LPI -> vLPI connection (in the host_lpis table), so any new incoming (host) LPI would simply be discarded very early in gicv3_do_LPI() without ever resolving into a virtual LPI. Now while this removal is atomic, we could have just missed an incoming LPI, so the old virtual LPI would still traverse down the VGIC with a "doomed" virtual LPI ID. I wonder if that could be solved by a "crosswise" check: - The DISCARD handler *first* tags the pending_irq as "UNMAPPED", *then* removes the host_lpis connectin. - do_LPI() reads the host_lpis() array, *then* checks the UNMAPPED tag in the corresponding pending_irq (or lets vgic_vcpu_inject_irq() do that). With this setup the DISCARD handler can assume that no new virtual LPI instances enter the VGIC afterwards. 2) An unmapped LPI might still be in use by the VGIC, foremost might still be in an LR. Tagging the pending_irq should solve this in general. Whenever a VGIC function finds the UNMAPPED tag, it does not process the vIRQ, but retires it. For simplicity we might limit this to handling when a VCPU exists and an LR gets cleaned up: If we hit a tagged pending_irq here, we clean the LR, remove the pending_irq from all lists and signal the ITS emulation that this pending_irq is now ready to be removed (by calling some kind of cleanup_lpi() function, for instance). The ITS code can then remove the struct pending_irq from the radix tree. MAPD(V=0) is now using this to tag all still mapped events as "UNMAPPED", the counting down the "still alive" pending_irqs in cleanup_lpi() until they reach zero. At this point it should be safe to free the pend_irq array. Does that sound like a plan? Do I miss something here? Probably yes, hence I am asking ;-)So there are two issues with this: - For doing the LPI cleanup, we would need to call a virtual ITS or virtual LPI specific function directly from gic.c. This looks like bad coding style, as it breaks the abstraction (calling a virtual LPI/ITS specific function from within gic.c)This is just code organization, I am not worried about it. We might have to register cleanup functions. The real problem to solve is below.- If we have a "DISCARD; MAPTI" sequence, targeting the same vLPI, while this vLPI is still in an LR (but not cleaned up until both commands have been handled), we end up with using the wrong pending_irq struct (the new one). (I assume the UNMAPPED tag would be cleared upon the new MAPTI).It looks like "DISCARD; MAPTI" would be a problem even if we go with "GIC: Add checks for NULL pointer pending_irq's", because instead of a NULL pending_irq, we could get the new pending_irq, the one backing the same vlpi but a different eventid.Can this create problems? I see two possibilities: a) the old vLPI is still pending in the LR: as the new LR would be pristine, the code in update_one_lr() would just clean the QUEUED bit, but not do anything further. The LR would be kept as used by this vLPI, with the state still set to GICH_LR_PENDING. Upon re-entering the VCPU this would make the new vLPI pending. b) the old vLPI has been EOIed: the new LR would be pristine, the code in update_one_lr() would try to clean it up anyway, which should not hurt, AFAICT.This cannot be allowed to happen. We have to keep a consistent internal state at all times: we cannot change the vlpi mapping in pend_lpi_tree before the old vlpi is completed discarded. We cannot have a vlpi marked as "UNMAPPED", but still alive in an LR, while vlpi_to_pending already returns the new vlpi. Well, in hardware LPI are handled by the re-distributor and ITS is only acting as a walker to find the mapping (DeviceID, EventID) -> LPI. In real hardware, a LPI may be received afterwards if the re-distributor already inject it. The OS will receive the LPI and have to deal with it, potentially re-directing to the new interrupt handler and not the old. The problem is the same here, an LPI may be in LR whilst the DISCARD happen. I think this is too late and the guest should receive it.
Thinking a bit more, I think there is another race with your suggestion between gic_update_one_lr and vgci_vcpu_inject_irq. If I understand correctly your suggestion, UNMAPPED will clear the lrs and return. However, you may have receive a new interrupt just before clearing the LRs. Do you expect vgic_vcpu_inject_irq to clear UNMAPPED? If so, how the function will know if this is the new pending_irq or the previous and should be ignored? Among these approaches, [b]+[c] would be OK for 4.9. Or [d]. [a] looks simple but it is actually difficult to do correctly.So I would like to know how to proceed here: I) stick to the tag, and fix that particular case above by checking for the "inflight && ENABLED" condition and clearing the LR if the pending_irq does not state it's pending anymore II) introduce a NO_LONGER_PENDING tag in addition to the UNMAPPED tag, where a new MAPTI just clears UNMAPPED_LPI, but keeps NO_LONGER_PENDING. NO_LONGER_PENDING would be checked and cleared by update_one_lr(), UNMAPPED by vgic_vcpu_inject_irq(). That would leave the issue how to call the pend_irq_cleanup() function from update_one_lr() without breaking abstractionIf MAPTI is called with a different eventid but the same vlpi, wouldn't vlpi_to_pending return the wrong pending_irq struct in gic_update_one_lr, causing problems to both I) and II)? The flags would not be set.III) Revert to the "check for NULL" solution.Wouldn't irq_to_pending potentially return the new pending_irq instead of NULL in gic_update_one_lr? This is a problem. For example, the new pending_irq is not in the inflight list, but gic_update_one_lr will try to remove it from it. I think this approach is error prone: we should always get the right pending_irq struct, or one that is appropriately marked with a flag, or NULL. In this case, we would still have to mark the new pending_irq as "UNMAPPED" to tell gic_update_one_lr to do nothing and return. It would end up very similar to [d]. We might as well do [d].I think you still have to handle the NULL case because you may receive a spurious interrupt from the host before whilst cleaning up the mapping. The mapping may pLPI <-> vLPI may still exist, but when you get into vgic_vcpu_inject_irq the irq_to_pending will return NULL as it does not exist anymore in the radix tree. Cheers, -- Julien Grall _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |