[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6 1/7] xen/events: fix race with set_global_virq_handler()
On 07.01.2025 16:56, Juergen Gross wrote: > On 07.01.25 16:49, Jan Beulich wrote: >> On 07.01.2025 16:37, Juergen Gross wrote: >>> On 07.01.25 16:23, Jan Beulich wrote: >>>> On 07.01.2025 11:17, Juergen Gross wrote: >>>>> --- a/xen/common/event_channel.c >>>>> +++ b/xen/common/event_channel.c >>>>> @@ -979,6 +979,7 @@ void send_global_virq(uint32_t virq) >>>>> int set_global_virq_handler(struct domain *d, uint32_t virq) >>>>> { >>>>> struct domain *old; >>>>> + int rc = 0; >>>>> >>>>> if (virq >= NR_VIRQS) >>>>> return -EINVAL; >>>>> @@ -992,14 +993,23 @@ int set_global_virq_handler(struct domain *d, >>>>> uint32_t virq) >>>>> return -EINVAL; >>>>> >>>>> spin_lock(&global_virq_handlers_lock); >>>>> - old = global_virq_handlers[virq]; >>>>> - global_virq_handlers[virq] = d; >>>>> + >>>>> + if ( d->is_dying != DOMDYING_alive ) >>>>> + { >>>>> + old = d; >>>>> + rc = -EINVAL; >>>>> + } >>>> >>>> While I can see how this eliminates the zombie domain aspect, this doesn't >>>> fully eliminate the race. Doing so would require (also) using the domain's >>>> event lock. Assuming we're okay with the remaining race, imo a code comment >>>> would be needed to state this (including the fact that it's then >>>> unpredictable whether this operation might still succeed for a domain >>>> already having d->is_dying != DOMDYING_alive). >>> >>> AFAIU you mean that it is still possible to set a domain to handle a virq >>> when it is in the process of going down, especially if is_dying is set just >>> after it has been tested to be DOMDYING_alive? >>> >>> I don't see this being a problem, as the same would happen if the domain >>> would go down just a millisecond later. This is something we will never be >>> able to handle. >> >> Right, but the sequence of events in the case you mention is different: The >> insertion into the array would still happen while the domain isn't marked >> dying. >> >>> And after all the call of clear_global_virq_handlers() will now reset the >>> handling domain to the hardware domain in all cases. >> >> Of course, but in the meantime an event may be sent to such a domain already >> marked dying. That likely isn't going to cause problems, but is unexpected >> with what description here says is being addressed. >> >>>> Plus the way you do it the early success path remains; ideally that case >>>> would also fail for an already dying domain. >>> >>> Same again: clear_global_virq_handlers() will reset the handling domain. >> >> Right. >> >> In summary: As indicated, we may be okay with the remaining race, but then >> we also should be making clear that we've decided to leave it at that. >> Hence my earlier request: If we accept this, say (and briefly justify) this >> in a code comment. > > Okay, would you be fine with: > > Note that this check won't guarantee that a domain just going down can't be > set as the handling domain of a virq, as the is_dying indicator might > change > just after testing it. > This isn't going to be a major problem, as clear_global_virq_handlers() is > guaranteed to run afterwards and it will reset the handling domain for the > virq to the hardware domain. Reads okay, thanks. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |