[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] mce: Fix for another race condition



On Mon, 2014-01-20 at 14:51 +0000, Frediano Ziglio wrote:
> Hi,
>   this are actually two patches which both fixes the same race condition
> in mce code. The problem is that these lines (in mctelem_reserve)
> 
> 
>       newhead = oldhead->mcte_next;
>       if (cmpxchgptr(freelp, oldhead, newhead) == oldhead) {
> 
> are racy. After you read the newhead pointer it can happen that another
> flow (thread of recursive invocation) change all the list but set head
> with same value. So oldhead is the same as *freelp but you are setting a
> new head that could point to whatever element (even already used).
> 
> The base idea of both patches is to separate mcte_state in a separate
> state and setting it with cmpxchg to make sure we don't pick up an
> already allocated element.
> 
> The first patch try to use the list detaching entirely (setting head to
> NULL) to avoid the use of mcte_next falling to a slow_reserve which scan
> all the array trying to find an element in the state FREE. This is
> surely safe and easy but if list is mostly allocated you end up scanning
> the array entirely every time.
> 
> The second patch (which needs some cleanup) instead of using pointers
> use array indexes to allow to bound in an atomic way which head and next
> pointer. The head is attached to a counter which is incremented to avoid
> to have the list changed but the head with same value (it's like a list
> version). The state is attached with the next index (which replace
> mcte_next if state is FREE) to allow atomic read of state+next. To
> handle both thread safety and reentrancy mctelem_reserve got a bit more
> complicated and the updates are not so straight forward.
> 
> Now, the question is: Should I just send the first patch and ignore the
> computation problem in the corner case or should I try to put in shape
> the second patch?
> 

I reply to my own mail with another fix. This is quite easy and mainly
uses a bitmap. I used a unsigned long instead of a bitmap as is easier
to test if all bits are zero but this limit number of entries.

Beside changing the structure another idea would be to use a single
bitmap and start from 0 to search for urgent entries while starting from
MC_URGENT_NENT to get non urgent entries.

Frediano

Attachment: mce_fix2_v3.patch
Description: Text Data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.