|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v5 2/5] xen: add bitmap to indicate per-domain state changes
On 17.12.2024 16:55, Jürgen Groß wrote:
> On 17.12.24 16:19, Jan Beulich wrote:
>> On 17.12.2024 15:22, Juergen Gross wrote:
>>> Add a bitmap with one bit per possible domid indicating the respective
>>> domain has changed its state (created, deleted, dying, crashed,
>>> shutdown).
>>>
>>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>>> all existing domains and resetting all other bits.
>>>
>>> As the usage of this bitmap is tightly coupled with the VIRQ_DOM_EXC
>>> event, it is meant to be used only by a single consumer in the system,
>>> just like the VIRQ_DOM_EXC event.
>>
>> I'm sorry, but I need to come back to this. I thought I had got convinced
>> that only a single entity in the system can bind this vIRQ. Yet upon
>> checking I can't seem to find what would guarantee this. In particular
>> binding a vIRQ doesn't involve any XSM check. Hence an unprivileged entity
>> could, on the assumption that the interested privileged entity (xenstore)
>> is already up and running, bind and unbind this vIRQ, just to have the
>> global map freed. What am I overlooking (which would likely want stating
>> here)?
>
> I think you are not overlooking anything.
>
> I guess this can easily be handled by checking that the VIRQ_DOM_EXC handling
> domain is the calling one in domain_[de]init_states(). Note that global virqs
> are only ever sent to vcpu[0] of the handling domain, so rebinding the event
> to another vcpu is possible, but doesn't make sense.
No, that's precluded by
if ( virq_is_global(virq) && (vcpu != 0) )
return -EINVAL;
afaict. That doesn't, however, preclude multiple vCPU-s from trying to bind
the vIRQ to vCPU 0.
>>> V5:
>>> - domain_init_states() may be called only if evtchn_bind_virq() has been
>>> called validly (Jan Beulich)
>>
>> I now recall why I had first suggested the placement later in the handling:
>> You're now doing the allocation with yet another lock held. It's likely not
>> the end of the world, but ...
>>
>>> @@ -138,6 +139,60 @@ bool __read_mostly vmtrace_available;
>>>
>>> bool __read_mostly vpmu_is_available;
>>>
>>> +static DEFINE_SPINLOCK(dom_state_changed_lock);
>>> +static unsigned long *__read_mostly dom_state_changed;
>>> +
>>> +int domain_init_states(void)
>>> +{
>>> + const struct domain *d;
>>> + int rc = -ENOMEM;
>>> +
>>> + spin_lock(&dom_state_changed_lock);
>>> +
>>> + if ( dom_state_changed )
>>> + bitmap_zero(dom_state_changed, DOMID_FIRST_RESERVED);
>>> + else
>>> + {
>>> + dom_state_changed = xvzalloc_array(unsigned long,
>>> +
>>> BITS_TO_LONGS(DOMID_FIRST_RESERVED));
>>
>> ... already this alone wasn't nice, and could be avoided (by doing the
>> allocation prior to acquiring the lock, which of course complicates the
>> logic some).
>>
>> What's perhaps less desirable is that ...
>>
>>> @@ -494,6 +495,15 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind,
>>> evtchn_port_t port)
>>> goto out;
>>> }
>>>
>>> + if ( virq == VIRQ_DOM_EXC )
>>> + {
>>> + rc = domain_init_states();
>>> + if ( rc )
>>> + goto out;
>>> +
>>> + deinit_if_err = true;
>>> + }
>>> +
>>> port = rc = evtchn_get_port(d, port);
>>> if ( rc < 0 )
>>> {
>>
>> ... the placement here additionally introduces lock nesting when really
>> the two locks shouldn't have any relationship.
>>
>> How about giving domain_init_states() a boolean parameter, such that it
>> can be called twice, with the first invocation moved back up where it
>> was, and the second one put ...
>>
>>> @@ -527,6 +537,9 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind,
>>> evtchn_port_t port)
>>> out:
>>> write_unlock(&d->event_lock);
>>>
>>> + if ( rc && deinit_if_err )
>>> + domain_deinit_states();
>>> +
>>> return rc;
>>> }
>>
>> ... down here, not doing any allocation at all (only the clearing), and
>> hence eliminating the need to deal with its failure? (Alternatively
>> there could of course be a split into an init and a reset function.)
>>
>> There of course is the chance of races with such an approach. I'd like
>> to note though that with the placement of the call in the hunk above
>> there's a minor race, too (against ...
>>
>>> @@ -730,6 +743,9 @@ int evtchn_close(struct domain *d1, int port1, bool
>>> guest)
>>> struct vcpu *v;
>>> unsigned long flags;
>>>
>>> + if ( chn1->u.virq == VIRQ_DOM_EXC )
>>> + domain_deinit_states();
>>
>> ... this and the same remote vCPU then immediately binding the vIRQ
>> again). Hence yet another alternative would appear to be to drop the
>> new global lock and use d->event_lock for synchronization instead
>> (provided - see above - that only a single entity can actually set up
>> all of this). That would pretty much want to have the allocation kept
>> with the lock already held (which isn't nice, but as said is perhaps
>> tolerable), but would at least eliminate the undesirable lock nesting.
>>
>> Re-use of the domain's event lock is at least somewhat justified by
>> the bit array being tied to VIRQ_DOM_EXEC.
>>
>> Thoughts?
>
> With my suggestion above I think there is no race, as only the domain handling
> VIRQ_DOM_EXC could alloc/dealloc dom_state_changed.
Yet still it could be multiple vCPU-s therein to try to in parallel.
> Using d->event_lock for synchronization is not a nice option IMO, as it would
> require to take the event_lock of the domain handling VIRQ_DOM_EXEC when
> trying
> to set a bit for another domain changing state.
Well, yes, it's that domain's data that's to be modified, after all.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |